The Gaussian mixture model (GMM) method is popular and efficient for voice conversion (VC), but it is often subject to overfitting. In this paper, the principal component regression (PCR) method is adopted for the spectral mapping between source speech and target speech, and the number of principal components is tuned to prevent overfitting. To better model the nonlinear relationships between source and target speech, a kernel principal component regression (KPCR) method is also proposed. Moreover, a method combining KPCR with GMM is proposed to further improve conversion accuracy. The discontinuity and oversmoothing problems of the traditional GMM method are also addressed. To solve the discontinuity problem, an adaptive median filter is adopted to smooth the posterior probabilities. To reduce oversmoothing, only the two mixture components with the highest posterior probabilities in each frame are used for conversion. Finally, objective and subjective experiments are carried out, and the results demonstrate that the proposed approach performs substantially better than the GMM method. In the objective tests, the proposed method shows lower cepstral distances and higher identification rates than the GMM method, while in the subjective tests it obtains higher preference and perceptual quality scores.
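The two posterior-processing steps described in this abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: a plain fixed-window median filter stands in for the adaptive median filter, the function names `median_smooth` and `keep_top_two` are hypothetical, and the posteriors are stored as a list of frames, each a list of per-component probabilities.

```python
from statistics import median


def median_smooth(posteriors, window=3):
    """Median-filter each component's posterior track along time, then renormalize.

    Assumption: a fixed window is used here; the paper uses an adaptive filter.
    """
    n_frames = len(posteriors)
    n_comp = len(posteriors[0])
    half = window // 2
    smoothed = []
    for t in range(n_frames):
        lo, hi = max(0, t - half), min(n_frames, t + half + 1)
        row = [median(posteriors[s][m] for s in range(lo, hi)) for m in range(n_comp)]
        total = sum(row)
        # Fall back to the raw frame if every median is zero.
        smoothed.append([p / total for p in row] if total > 0 else list(posteriors[t]))
    return smoothed


def keep_top_two(frame):
    """Zero all but the two largest posteriors of one frame and renormalize."""
    top = sorted(range(len(frame)), key=lambda m: frame[m], reverse=True)[:2]
    total = sum(frame[m] for m in top)
    return [frame[m] / total if m in top else 0.0 for m in range(len(frame))]


# Hypothetical posteriors for 5 frames and 3 mixture components;
# frame 2 contains an isolated jump to component 1.
posteriors = [
    [0.80, 0.15, 0.05],
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.75, 0.15, 0.10],
    [0.80, 0.10, 0.10],
]
smoothed = median_smooth(posteriors, window=3)
sparse = [keep_top_two(frame) for frame in smoothed]
```

In this toy input the median filter suppresses the isolated jump in frame 2, so the dominant component stays the same across neighbouring frames. Restricting each frame to two components then limits how many component means are averaged together in the converted spectrum.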
It is meaningful to study CO migration and its concentration distribution in a blind gallery to provide a basis for CO monitoring and for calculating the fume-drainage time, which is of great significance for preventing fume-poisoning accidents and improving the efficiency of an excavation cycle. Based on a theoretical analysis of the differential change of CO mass concentration and the CO dispersion model at a fixed site, this paper presents several blasting fume monitoring experiments, carried out with the distance from the test location to the heading, LP, in the range of 40-140 m. Multiple sensors were arranged in the arch cross-section of the blind gallery at the Guilaizhuang Gold Mine, Shandong Province, China. The findings indicate that, in the ascending stage of the CO time-concentration curve, CO concentrations in the axial directions are quadratic functions of the Y and Z coordinates of the gallery cross-section, with the maximum CO concentrations at Y = 150 cm and Z = 150 cm. Also, the gradients of CO concentration in the gallery are symmetrical about Y = 150 cm and Z = 150 cm. In the descending stage of the CO time-concentration curve, the gradients of CO concentration decrease at the lateral sides and increase in the middle, and then gradually decrease. The CO concentration distribution in the cross-section is governed by the airflow, which induces turbulent changes in the CO volume concentration distribution and gradually evens out the CO volume concentration at a fixed position in the gallery. Moreover, the CO volume concentrations decrease gradually, as do the volume concentration gradients in the cross-section. The uniformity coefficients of CO concentration at duct airflow velocities of 12.5 m/s, 17.7 m/s and 23.2 m/s reach nearly 0.9 at 100-140 m from the heading to the monitoring spot.
The theoretical model of the one-dimensional migration law of CO largely agrees with a negative exponential decay, which is verified by fitting. The average effective turbulent diffusion coefficient of CO in the blind gallery is approximately 0.108 m²/s. There are strong linear relationships between the CO initial concentration, the CO peak concentration and the mass of explosive agent, which indicates that the CO initial concentration and the CO peak concentration can be predicted for a given range of charging mass. These findings can provide a reliable reference for the selection and installation of CO sensors and for predicting the fume-drainage time after blasting.
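The kind of fit mentioned here can be illustrated with a minimal sketch, assuming the decay takes the simple form C(t) = C0·exp(−k·t). The abstract does not give the exact model, so the symbols C0 and k, the function names, the synthetic readings and the 24 ppm threshold are all hypothetical. Taking logarithms turns the fit into an ordinary least-squares line.

```python
import math


def fit_exponential_decay(times, concentrations):
    """Least-squares fit of ln C = ln C0 - k*t; returns (C0, k)."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    return math.exp(y_mean - slope * t_mean), -slope


def time_to_reach(c0, k, limit):
    """Time at which C0 * exp(-k*t) has fallen to `limit`."""
    return math.log(c0 / limit) / k


# Synthetic, noise-free readings (hypothetical): seconds after the peak, ppm.
times = [0, 100, 200, 300, 400]
readings = [800.0 * math.exp(-0.004 * t) for t in times]

c0, k = fit_exponential_decay(times, readings)
drain_time = time_to_reach(c0, k, limit=24.0)  # 24 ppm is an assumed threshold
```

Under these assumptions, `time_to_reach` gives a rough fume-drainage time estimate once C0 and k have been fitted from sensor data. With real measurements, the log transform gives low-concentration points more relative weight, so a weighted or nonlinear fit may be preferable.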
A novel voice conversion (VC) method based on a hybrid of support vector regression (SVR) and the Gaussian mixture model (GMM) is presented in this paper. The mapping abilities of SVR and GMM are exploited to map the spectral features of the source speaker to those of the target speaker. A new F0 transformation strategy is also presented: the F0s are modeled together with the spectral features in a joint GMM and predicted from the converted spectral features using the SVR method. Subjective and objective tests are carried out to evaluate the VC performance, and the experimental results show that speech converted with the proposed method has better quality than speech converted with the state-of-the-art GMM method. In addition, a VC method based on non-parallel data is proposed, in which the speaker-specific information is investigated using the SVR method. Preliminary subjective experiments demonstrate that this method is feasible when a parallel corpus is not available.
In this paper, three typical organic fluids were selected as working fluids for a sample slag washing water binary power plant. In this system, the working fluid obtains its thermal energy from the slag washing water source, so selecting a suitable working fluid plays a significant role in cycle performance. The energy and exergy efficiencies of the three fluids were calculated. The dry fluid (i.e., R227ea) showed higher energy and exergy efficiencies. Conversely, the wet fluids (i.e., R143a and R290) showed lower energy and exergy efficiencies.
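The abstract does not state how the two efficiencies were defined, so the following is only a sketch using one common textbook convention. Energy efficiency is taken as net work over heat input, and exergy efficiency as net work over the exergy released by the heat-source water, which is treated as a constant-cp liquid. All function names and numerical values are hypothetical and are not results from the paper.

```python
import math


def energy_efficiency(w_net, q_in):
    """First-law efficiency: net power output divided by heat input."""
    return w_net / q_in


def heat_source_exergy(m_dot, cp, t_in, t_out, t0):
    """Exergy released by a constant-cp liquid cooled from t_in to t_out.

    Temperatures are in kelvin; t0 is the dead-state (ambient) temperature.
    """
    return m_dot * cp * ((t_in - t_out) - t0 * math.log(t_in / t_out))


def exergy_efficiency(w_net, ex_in):
    """Second-law efficiency: net power output divided by exergy input."""
    return w_net / ex_in


# Hypothetical operating point (not from the paper).
m_dot = 10.0                   # kg/s of slag washing water
cp = 4.18                      # kJ/(kg K), liquid water
t_in, t_out = 353.15, 333.15   # K, water entering / leaving the evaporator
t0 = 293.15                    # K, assumed ambient temperature
w_net = 50.0                   # kW, assumed net cycle output

q_in = m_dot * cp * (t_in - t_out)                       # kW of heat delivered
ex_in = heat_source_exergy(m_dot, cp, t_in, t_out, t0)   # kW of exergy delivered

eta_energy = energy_efficiency(w_net, q_in)
eta_exergy = exergy_efficiency(w_net, ex_in)
```

For a low-temperature source such as this, the exergy input is much smaller than the heat input, so the exergy efficiency comes out well above the energy efficiency. Comparing real working fluids would additionally require their thermodynamic property data, which this sketch leaves out.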