Journal bearings, in which a shaft rotates freely inside a metallic sleeve, are among the most common bearing types. They are widely used in industry, especially where extremely high loads are involved. Proper analysis of the various bearing faults and early prediction of failure modes are essential to extend the working life of the bearing. In the current study, vibration data of a journal bearing are collected in the healthy condition and in five different fault conditions. A feature extraction method is employed to classify the different fault conditions. Automatic fault classification is performed using artificial neural networks (ANN). Because the probability of a correct prediction in an ANN decreases as the number of fault classes grows, the method is made more robust by incorporating deep neural networks (DNN) built with autoencoders. Training was done using the scaled conjugate gradient algorithm, and performance was evaluated with the cross-entropy loss. Owing to the increased number of hidden layers in the DNN, a classification accuracy of 100% is achieved with the feature extraction method.
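The autoencoder building block behind such a DNN can be sketched in a few lines of numpy. Everything here is an illustrative assumption (random stand-in features, one hidden layer, plain gradient descent rather than scaled conjugate gradient): the encoder is trained to reconstruct its input, and its hidden layer would then be stacked and reused as a DNN layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature matrix: 200 samples x 8 statistical features
# (hypothetical stand-ins for RMS, kurtosis, crest factor, etc.).
X = rng.normal(size=(200, 8))

n_hidden = 4
W1 = rng.normal(scale=0.1, size=(8, n_hidden))   # encoder weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, 8))   # decoder weights
b2 = np.zeros(8)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
losses = []
for _ in range(500):
    H = sigmoid(X @ W1 + b1)        # encoder: compressed representation
    X_hat = H @ W2 + b2             # linear decoder: reconstruction
    err = X_hat - X
    losses.append(np.mean(err ** 2))
    # Backpropagation of the mean-squared reconstruction error
    gW2 = H.T @ err / len(X)
    gb2 = err.mean(axis=0)
    dH = err @ W2.T * H * (1 - H)
    gW1 = X.T @ dH / len(X)
    gb1 = dH.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print(losses[0], losses[-1])  # reconstruction loss should drop
```

After pretraining, `H` (the encoder output) would feed the next autoencoder in the stack, with a softmax classifier on top for the fault classes.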
Speech enhancement is fundamental for many real-time speech applications, and it is particularly challenging in the single-channel case, where only one data channel is available in practice. In this paper, we propose a supervised single-channel speech enhancement algorithm based on a deep neural network (DNN) with less aggressive Wiener filtering as an additional DNN layer. During the training stage, the network learns to predict the magnitude spectra of the clean and noise signals from the acoustic features of the input noisy speech. Relative spectral transform-perceptual linear prediction (RASTA-PLP) is used to extract the acoustic features at the frame level, and an autoregressive moving-average (ARMA) filter is applied to smooth the temporal trajectories of the extracted features. The trained network predicts the coefficients used to construct a ratio mask under a mean squared error (MSE) cost function. The less aggressive Wiener filter is placed as an additional layer on top of the DNN to produce an enhanced magnitude spectrum. Finally, the noisy speech phase is used to reconstruct the enhanced speech. Experimental results demonstrate that the proposed DNN framework with less aggressive Wiener filtering outperforms competing speech enhancement methods in terms of speech quality and intelligibility.
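The masking step can be illustrated with a small numpy sketch. The exponent `beta` and the exact gain parameterization are assumptions chosen to illustrate "less aggressive" suppression, not the paper's precise formulation: a Wiener-style gain is computed from the predicted clean and noise magnitudes, softened, and applied to the noisy spectrum while the noisy phase is reused for resynthesis.

```python
import numpy as np

def wiener_layer(clean_mag, noise_mag, beta=0.5):
    """Wiener-style gain from DNN-predicted clean/noise magnitudes.
    beta < 1 flattens the gain, i.e. less aggressive suppression
    (beta and this parameterization are illustrative assumptions)."""
    snr = clean_mag ** 2 / (noise_mag ** 2 + 1e-10)   # a-priori SNR estimate
    gain = snr / (snr + 1.0)                          # classic Wiener gain
    return gain ** beta

# Toy frame: stand-in predicted magnitudes plus the noisy phase.
rng = np.random.default_rng(0)
clean_mag = np.abs(rng.normal(size=257))
noise_mag = np.abs(rng.normal(size=257))
noisy_mag = clean_mag + noise_mag
phase = rng.uniform(-np.pi, np.pi, size=257)

gain = wiener_layer(clean_mag, noise_mag)
# Enhanced complex spectrum: masked magnitude with the noisy phase reused.
enhanced = gain * noisy_mag * np.exp(1j * phase)
```

Since the Wiener gain lies in [0, 1], raising it to a power below one always yields an equal or larger gain, which is what makes the filtering "less aggressive".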
Self-aligning roller bearings are an integral part of industrial machinery. Proper analysis and early prediction of the faults that may develop in a bearing help extend its working life. This study develops a novel method for analyzing various faults in self-aligning bearings and for classifying them automatically using an artificial neural network (ANN) and a deep neural network (DNN). Vibration data are collected for six different fault conditions as well as for the healthy bearing. Empirical mode decomposition (EMD) followed by the Hilbert-Huang transform is used to extract instantaneous frequency peaks, which serve as the basis for fault analysis. Time-domain and time-frequency-domain features are then extracted and used to implement the neural networks through the pattern recognition tool in MATLAB. A comparative study of the outputs of the two networks is also performed. From the confusion matrix, the classification accuracy of the ANN is found to be 95.7%, while that of the DNN is 100%.
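The Hilbert step of the pipeline above can be sketched in numpy (EMD itself is omitted; a pure tone stands in for one intrinsic mode function, and the signal length is assumed even for the simple one-sided spectrum mask). The analytic signal is built via the FFT, and the instantaneous frequency is the scaled derivative of its unwrapped phase.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (the construction behind
    scipy.signal.hilbert); assumes an even-length input."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = h[N // 2] = 1.0   # keep DC and Nyquist
    h[1:N // 2] = 2.0        # double positive frequencies
    return np.fft.ifft(X * h)

fs = 1000.0
t = np.arange(1000) / fs
x = np.cos(2 * np.pi * 50 * t)   # 50 Hz tone standing in for one IMF

z = analytic_signal(x)
phase = np.unwrap(np.angle(z))
inst_freq = np.diff(phase) * fs / (2 * np.pi)   # instantaneous frequency, Hz
print(round(float(np.median(inst_freq)), 1))    # recovers the 50 Hz tone
```

For a real IMF from a faulty bearing, peaks in this instantaneous frequency track are what the study uses as fault indicators.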
Skin cancer is the most common form of cancer affecting humans, and melanoma is its most dangerous type, so early diagnosis is vital to curing the disease. Because expert diagnostic capacity in this field remains limited, a mechanism capable of identifying the disease early can save lives, reduce intervention, and cut unnecessary costs. In this paper, the researchers developed a new learning technique to classify skin lesions, with the purpose of detecting the presence of melanoma. The technique is based on a convolutional neural network evaluated in multiple configurations, trained on an International Skin Imaging Collaboration (ISIC) dataset. The best results are achieved with a convolutional neural network composed of 14 layers; this proposed system reliably predicts the correct classification of dermoscopic lesions with 97.78% accuracy.
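The basic building block repeated through such a network can be shown in a minimal numpy forward pass; the kernel, image size, and layer choice here are illustrative assumptions, not the paper's 14-layer architecture. One convolution, a ReLU, and a 2x2 max-pool together form the kind of stage a dermoscopic image would pass through repeatedly.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D convolution, single channel (the core CNN operation)."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

def maxpool2(x):
    """Non-overlapping 2x2 max-pooling (trailing odd row/column dropped)."""
    H, W = x.shape
    x = x[:H // 2 * 2, :W // 2 * 2]
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

# Toy grayscale "lesion" patch and an averaging kernel (both illustrative).
img = np.random.default_rng(1).normal(size=(28, 28))
feat = maxpool2(relu(conv2d(img, np.ones((3, 3)) / 9.0)))
print(feat.shape)  # 28x28 -> 26x26 after conv -> 13x13 after pooling
```

Stacking many such stages, followed by fully connected layers and a softmax, yields a classifier of the kind the paper tunes to 14 layers.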
This paper describes a hybrid of a Deep Belief Neural Network (DBNN) and a Bidirectional Long Short-Term Memory (LSTM) network used as an acoustic model for speech recognition. Many independent researchers have demonstrated that DBNNs, being deep learning networks, outperform other known machine learning frameworks in speech recognition accuracy. However, a trained DBNN is simply a feed-forward network with no internal memory, unlike Recurrent Neural Networks (RNNs), which are Turing complete and do possess internal memory, allowing them to exploit longer context. In this paper, an experiment is performed in which a DBNN is hybridized with an advanced bidirectional RNN that processes its output. Results show that using the new DBNN-BLSTM hybrid as the acoustic model for Large Vocabulary Continuous Speech Recognition (LVCSR) increases word recognition accuracy. However, the new model has many parameters, and in some cases it may suffer performance issues in real-time applications.
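The wiring of such a hybrid can be sketched in numpy. This is a deliberately simplified assumption-laden sketch: a single tanh layer stands in for the trained DBNN, and a plain tanh recurrent cell stands in for the LSTM, but the data flow (feed-forward features processed by a forward and a backward recurrent pass, then concatenated) is the hybrid's structure.

```python
import numpy as np

rng = np.random.default_rng(2)
T, d_in, d_ff, d_h = 10, 6, 8, 4   # frames, input, feed-forward, hidden dims

# Feed-forward "DBNN" stage (one tanh layer stands in for the deep stack).
W_ff = rng.normal(scale=0.3, size=(d_in, d_ff))

def feedforward(x):
    return np.tanh(x @ W_ff)

# Simplified recurrent cell (plain tanh RNN stands in for the LSTM).
W_x = rng.normal(scale=0.3, size=(d_ff, d_h))
W_h = rng.normal(scale=0.3, size=(d_h, d_h))

def rnn_pass(seq):
    h = np.zeros(d_h)
    out = []
    for x in seq:                       # carry hidden state across frames
        h = np.tanh(x @ W_x + h @ W_h)
        out.append(h)
    return np.array(out)

frames = rng.normal(size=(T, d_in))     # toy acoustic feature frames
ff_out = feedforward(frames)            # DBNN stage output, frame by frame
fwd = rnn_pass(ff_out)                  # forward-in-time pass
bwd = rnn_pass(ff_out[::-1])[::-1]      # backward pass, re-aligned in time
bi = np.concatenate([fwd, bwd], axis=1)  # bidirectional context per frame
print(bi.shape)
```

Each row of `bi` summarizes both past and future context for one frame, which is exactly the extra information the feed-forward DBNN alone cannot provide.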
Laughter is one of the most important paralinguistic events, and it plays specific roles in human conversation. Automatic detection of laughter occurrences in human speech can aid automatic speech recognition systems as well as paralinguistic tasks such as emotion detection. In this study we apply Deep Neural Networks (DNN) to laughter detection, as this technology is now considered state-of-the-art in similar tasks such as phoneme identification. We carry out our experiments on two corpora containing spontaneous speech in two languages (Hungarian and English). Also, since it is reasonable to expect that not all frequency regions are required for efficient laughter detection, we perform feature selection to find a sufficient feature subset.
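A simple filter-style feature selection of the kind described can be sketched on synthetic data. All specifics below are illustrative assumptions (the feature count, the planted informative bands, and the Fisher-style score are not the study's setup): features whose class means differ most, relative to their spread, are ranked and kept.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy frame-level data: 40 spectral features, only a few carrying laughter cues.
n, d = 300, 40
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)        # 1 = laughter frame (toy labels)
informative = [3, 17, 25]             # hypothetical informative frequency bands
for f in informative:
    X[:, f] += 2.0 * y                # shift those features on laughter frames

# Fisher-style score: between-class mean separation scaled by feature spread.
fisher = np.abs(X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)) / (X.std(axis=0) + 1e-9)
selected = np.argsort(fisher)[::-1][:3]   # keep the top-ranked features
print(sorted(selected.tolist()))          # recovers the planted bands
```

Only the retained columns of `X` would then be fed to the DNN, shrinking the input without hurting detection.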