Analysis of procedures for voice signal normalization and segmentation in information systems

Authors: Pastushenko, M. S.; Pastushenko, O. M.; Faizulaiev, T. A.
Authors (Ukrainian): Пастушенко, M. С.; Пастушенко, О. М.; Файзулаєв, T. А.
Dates: 2026-03-05; 2026-03-05; 2025
URI: https://eir.zp.edu.ua/handle/123456789/27153
Citation: Pastushenko M. S. Analysis of procedures for voice signal normalization and segmentation in information systems / M. S. Pastushenko, O. M. Pastushenko, T. A. Faizulaiev // Радіоелектроніка, інформатика, управління. – 2025. – № 4 (75). – C. 194-201.

EN:
Context. The paper considers the task of evaluating formant data (formant frequencies, their spectral density level, the envelope of the amplitude-frequency spectrum, and the spectral width of the formant frequencies) in voice authentication systems. The object of the study is the digital preprocessing of the voice signal during formant data extraction.
Objective. To evaluate the effectiveness of traditional procedures for digital preprocessing of a user's voice signal and to develop proposals for improving the quality of formant data extraction.
Method. A mathematical model for extracting formant data from an experimental voice signal has been developed to study how normalization and segmentation procedures affect the quality of the resulting estimates. By modeling the formant data extraction process, the results of digital processing of normalized and non-normalized voice signals are compared. The influence of the processed frame duration of the experimental voice signal on the quality of formant frequency estimation is assessed. The results are obtained for an experimental phoneme and morpheme.
Results. The results show that when a voice signal with a sufficient signal-to-noise ratio is processed, normalization procedures are not mandatory for extracting formant data. Moreover, normalization leads to a less accurate measurement of the spectral width of the formant frequencies. A processed frame duration of less than 40 ms is also unacceptable.
These results make it possible to modify the traditional method of voice signal preprocessing. Applying the modeling method to the experimental voice signal confirms the reliability of the results obtained.
Conclusions. The scientific novelty of the research results lies in the modification of the voice signal preprocessing methodology in authentication systems. Eliminating normalization procedures at high signal-to-noise ratios of the voice signal, which is the case in user authentication systems, makes it possible to increase the speed of formant data extraction and to estimate the spectral width of the formant frequencies more accurately. Selecting a frame duration of at least 40 ms for the processed signal significantly improves the accuracy of formant frequency determination. Otherwise, the formant frequencies will be overestimated. Moreover, when phonemes are processed, the voice signal need not be divided into frames. Practical application of the research results makes it possible to increase the efficiency and accuracy of formant data generation. Prospects for further research include studying the influence of normalization and framing procedures on other elements of the template of an authentication system user.

UK (translated from Ukrainian):
Relevance. The paper addresses the pressing task of estimating formant data (formant frequencies, their spectral density level, the envelope of the amplitude-frequency spectrum, and the spectral width of the formant frequencies) in voice authentication systems. The object of the study was the digital preprocessing of the voice signal during formant data extraction.
Objective. To evaluate the effectiveness of traditional procedures for digital preprocessing of a user's voice signal and to develop proposals for improving the quality of formant data extraction.
Method. A mathematical model for extracting formant data from an experimental voice signal was developed to study how normalization and segmentation procedures affect the quality of the resulting estimates. By modeling the formant data extraction process, the results of digital processing of normalized and non-normalized voice signals are compared. The influence of the processed frame duration of the experimental voice signal on the quality of formant frequency estimation is assessed. The results were obtained for an experimental phoneme and morpheme.
Results. The results show that when a voice signal with a sufficient signal-to-noise ratio is processed, normalization procedures are not mandatory for obtaining formant data. Moreover, normalization leads to a less accurate measurement of the spectral width of the formant frequencies. A processed frame duration of less than 40 ms is unacceptable. These results make it possible to modify the traditional voice signal preprocessing methodology. Applying the modeling method to the experimental voice signal confirms the reliability of the results obtained.
Conclusions. The experimental studies show that it is advisable to omit normalization procedures at high signal-to-noise ratios of the voice signal, which is the case in user authentication systems. This approach makes it possible to obtain formant data faster and to estimate the spectral width of the formant frequencies more accurately. The experimental study of the processed frame duration shows that it should be no less than 40 ms. Otherwise, the formant frequencies will be overestimated. Moreover, when phonemes are processed, the voice signal need not be split into frames. Practical application of the research results increases the speed and accuracy of formant data generation. Prospects for further research include studying the influence of normalization and framing procedures on other elements of the template of an authentication system user.

Language: en
Keywords: authentication; voice signal; normalization; segmentation; formant data
Keywords (Ukrainian): автентифікація; голосовий сигнал; нормалізація; сегментація; формантні дані
Title (Ukrainian): Аналіз процедур нормалізації та сегментації голосового сигналу в інформаційних системах
Type: Article
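
The two preprocessing steps discussed in the abstract, amplitude normalization and splitting the signal into frames, can be illustrated with a minimal sketch. It does not reproduce the authors' mathematical model or their experimental signal: the two-tone test signal, the 16 kHz sampling rate, the Hann window, and all function names are assumptions chosen for illustration. The sketch shows only two general facts: scaling a signal by a constant does not move its spectral peaks, and the spacing between DFT bins of an unpadded frame is the inverse of the frame duration (25 Hz for a 40 ms frame).

```python
import numpy as np


def peak_normalize(x):
    """Scale x so that its largest absolute sample equals 1 (hypothetical helper)."""
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x


def split_frames(x, fs, frame_ms):
    """Cut x into non-overlapping frames of frame_ms milliseconds; the tail is dropped."""
    n = int(round(fs * frame_ms / 1000.0))
    count = len(x) // n
    return x[:count * n].reshape(count, n)


def bin_spacing_hz(fs, frame_ms):
    """Spacing between DFT bins of an unpadded frame: fs / N, i.e. 1 / duration."""
    n = int(round(fs * frame_ms / 1000.0))
    return fs / n


def dominant_frequency(frame, fs):
    """Frequency of the strongest magnitude-spectrum bin of a Hann-windowed frame."""
    windowed = frame * np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]


# Assumed test signal: 200 ms of two tones standing in for two formant peaks.
fs = 16000
t = np.arange(fs // 5) / fs
signal = 0.3 * np.sin(2 * np.pi * 700 * t) + 0.1 * np.sin(2 * np.pi * 1200 * t)

frames = split_frames(signal, fs, 40)
raw_peak = dominant_frequency(frames[0], fs)
norm_peak = dominant_frequency(peak_normalize(frames[0]), fs)

print("peak without normalization:", raw_peak)
print("peak with normalization:   ", norm_peak)
for frame_ms in (10, 20, 40):
    print(frame_ms, "ms frame, bin spacing in Hz:", bin_spacing_hz(fs, frame_ms))
```

Shorter frames give coarser frequency bins, which is one plausible reason frame duration matters for formant frequency estimates. The sketch does not by itself establish the 40 ms threshold reported in the abstract.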