When water quality samples are roughly balanced across classes, the Bayesian criterion model generally identifies water-inrush sources with relatively high accuracy. However, it often fails to achieve the desired classification results when the training samples are imbalanced, and sample imbalance is common in mine water-inrush source identification. Therefore, we propose a three-dimensional (3D) spatial resampling method based on rare water quality samples that balances the sample set. The method distributes virtual water sample points on a 3D grid and uses 3D Inverse Distance Weighting (IDW) to interpolate the groundwater ion concentrations of these virtual samples, thereby oversampling the rare water samples. A case study at the Gubei Coal Mine shows that the method improves the overall discriminant accuracy of the Bayesian criterion model by 5.26%, from 85.26% to 90.69%. In particular, the discriminative precision of the rare class improves from 0% to 83.33%, which indicates that the method can improve the discriminant accuracy of the rare class to a large extent. In addition, the method increases the Kappa coefficient of the model by 19.92%, from 52.26% to 72.19%, raising the degree of consistency from “general” to “significant”. Our research helps to enrich and improve the theory of prevention and treatment of mine water damage.
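The resampling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the grid is assumed to span the bounding box of the rare samples, the power parameter is assumed to be 2, and the function names (`grid_points`, `idw_3d`, `oversample`), the coordinates, and the two ion concentrations in the example are hypothetical.

```python
import itertools
import math


def grid_points(points, steps=3):
    """Return steps**3 virtual points on a regular 3D grid spanning the
    bounding box of the given (x, y, z) points. Assumes steps >= 2."""
    axes = []
    for k in range(3):
        lo = min(p[k] for p in points)
        hi = max(p[k] for p in points)
        axes.append([lo + (hi - lo) * i / (steps - 1) for i in range(steps)])
    return list(itertools.product(*axes))


def idw_3d(points, values, query, power=2.0):
    """Interpolate a vector of ion concentrations at `query` using
    inverse distance weighting over the known 3D sample points."""
    weights = []
    for p, v in zip(points, values):
        d = math.dist(p, query)
        if d == 0.0:
            # The query coincides with a real sample: return its values.
            return list(v)
        weights.append(d ** -power)
    total = sum(weights)
    return [
        sum(w * v[j] for w, v in zip(weights, values)) / total
        for j in range(len(values[0]))
    ]


def oversample(points, values, steps=3, power=2.0):
    """Create virtual rare-class samples at every grid point."""
    return [(g, idw_3d(points, values, g, power))
            for g in grid_points(points, steps)]


# Hypothetical rare-class samples: (x, y, z) in metres, two ion
# concentrations per sample in mg/L. These numbers are illustrative only.
rare_points = [(0.0, 0.0, 0.0), (10.0, 20.0, -30.0), (5.0, 5.0, -10.0)]
rare_values = [[100.0, 30.0], [200.0, 50.0], [150.0, 40.0]]
virtual_samples = oversample(rare_points, rare_values, steps=3)
```

Because each interpolated concentration is a weighted average of the observed ones, the virtual samples stay within the range of the real rare-class measurements, which is one reason IDW is a conservative choice for oversampling.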
The purpose of this paper was to test the suitability of time-series analysis for quality control of the continuous steel casting process under production conditions. The analysis was carried out on industrial data collected in a Polish steel plant. The production data concerned the defective fractions of billets obtained in the process. The procedure used to prepare the industrial data is presented. The time-series computations were carried out in two ways, both using the authors’ own software. The first approach, applied to real-valued data, has a wide range of capabilities, including not only prediction of future values but also detection of important periodicities in the data. In the second approach, the data were taken in binary (categorical) form, i.e. every heat (melt) was labeled as ‘Good’ or ‘Defective’, and a naïve Bayesian classifier was used to predict successive values. The most interesting results of the analysis include the good prediction accuracies obtained by both methodologies, the crucial influence of the last preceding point on the predicted result in the real-valued time-series analysis, and the information obtained about the type of misclassification for binary data. Engineering or operational staff with expert knowledge can use the predictions of future values to decrease the fraction of defective products by taking appropriate action when a forthcoming period is identified as critical.
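The second, binary approach can be sketched as follows. This is only an illustration of a naïve Bayesian classifier applied to a labeled sequence, not the authors' software: it assumes that the features are the labels of the `k` preceding heats and that Laplace smoothing is used, and the class name `NaiveBayesNext` and its parameters are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesNext:
    """Predict the next label of a categorical series from the k
    preceding labels, treating each lag position as an independent
    feature (the naive Bayes assumption)."""

    def __init__(self, k=3, alpha=1.0):
        self.k = k          # number of preceding heats used as features
        self.alpha = alpha  # Laplace smoothing constant

    def fit(self, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter()
        # (lag position, class) -> counts of the label seen at that lag
        self.feature_counts = defaultdict(Counter)
        for i in range(len(labels) - self.k):
            window, target = labels[i:i + self.k], labels[i + self.k]
            self.class_counts[target] += 1
            for pos, value in enumerate(window):
                self.feature_counts[(pos, target)][value] += 1
        return self

    def predict(self, recent):
        window = list(recent)[-self.k:]
        n = sum(self.class_counts.values())
        m = len(self.classes)
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of per-lag log likelihoods.
            score = math.log((self.class_counts[c] + self.alpha)
                             / (n + self.alpha * m))
            for pos, value in enumerate(window):
                count = self.feature_counts[(pos, c)][value]
                score += math.log((count + self.alpha)
                                  / (self.class_counts[c] + self.alpha * m))
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Hypothetical sequence of heats; real data would come from the plant.
history = ["Good", "Good", "Defective", "Good", "Good", "Defective"] * 5
model = NaiveBayesNext(k=2).fit(history)
next_label = model.predict(history)
```

Comparing predictions with the actual labels over a held-out part of the series gives the counts of each misclassification type, for example heats predicted ‘Good’ that turned out ‘Defective’ and the reverse.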