This article analyzes Russian words transcribed into the Polish alphabet, extracted from texts written between 2003 and 2015 by the Polish conservative-liberal author S. Michalkiewicz. Lists of both correctly and incorrectly transcribed units are presented, and the mistranscribed words are examined. Categories of transcription errors are provided, along with examples of words in which they occur. The results may serve as a point of reference for further studies of adherence to the rules of Russian transcription, conducted on a larger number of texts by a greater variety of authors.
This paper describes the research behind a Large-Vocabulary Continuous Speech Recognition (LVCSR) system for transcribing Senate speeches in Polish. The system comprises several components: a phonetic transcription system, language and acoustic model training systems, a Voice Activity Detector (VAD), an LVCSR decoder, and a subtitle generator and presentation system. Some of the modules relied on existing tools and others had to be built from scratch, but in each case the authors used the most advanced techniques available to them at the time. Finally, several experiments were performed to compare the performance of more modern and more conventional technologies.
This paper discusses various approaches to automatic music audio summarization. The project described in detail implements a method for extracting a music thumbnail: a continuous fragment of music of a given duration that is most similar to the entire piece. The results of a subjective assessment of the thumbnail choice are presented. Four parameters were taken into account: clarity (representation of the essence of the piece), conciseness (motifs are not repeated in the summary), coherence of the musical structure, and overall quality (usefulness of the summary).
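The thumbnail idea, a fixed-length fragment that is most similar to the whole piece, can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes the audio has already been converted into a sequence of per-frame feature vectors (for example chroma or MFCC frames), and it assumes that "most similar" means the highest cosine similarity between the mean feature vector of a window and the mean of the entire piece. The function name `find_thumbnail` and its parameters are hypothetical.

```python
import numpy as np


def find_thumbnail(features, window):
    """Pick the window of `window` consecutive frames whose mean feature
    vector has the highest cosine similarity to the mean of all frames.

    Illustrative assumption only: the similarity measure and the use of
    precomputed frame features are not taken from the paper.
    Returns (start_frame, similarity_score).
    """
    features = np.asarray(features, dtype=float)
    n_frames, n_dims = features.shape
    if not 1 <= window <= n_frames:
        raise ValueError("window must be between 1 and the number of frames")

    global_mean = features.mean(axis=0)

    # Prefix sums let us compute the mean of every window in one pass.
    prefix = np.vstack([np.zeros((1, n_dims)), np.cumsum(features, axis=0)])
    window_means = (prefix[window:] - prefix[:-window]) / window

    # Cosine similarity of each window mean to the global mean.
    numerator = window_means @ global_mean
    denominator = np.linalg.norm(window_means, axis=1) * np.linalg.norm(global_mean)
    scores = np.where(denominator > 0,
                      numerator / np.maximum(denominator, 1e-12),
                      0.0)

    start = int(np.argmax(scores))
    return start, float(scores[start])


# Toy example: six frames of two-dimensional features.
frames = [[1, 0], [1, 0], [1, 1], [1, 1], [0, 1], [0, 1]]
start, score = find_thumbnail(frames, window=2)
print(start, round(score, 3))
```

In the toy example the middle window (frames 2 and 3) is selected, because its mean points in the same direction as the mean of the whole sequence. A scheme of this kind only scores similarity; the subjective criteria listed in the abstract (clarity, conciseness, coherence, overall quality) would still need listener evaluation.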