A novel voice feature AVA and its application to the pathological voice detection through machine learning

Shazlyn Milleana Shaharudin

Type :	Article
Subject :	Q Science (General)
ISSN :	2158-107X
Main Author :	Shazlyn Milleana Shaharudin
Title :	A novel voice feature AVA and its application to the pathological voice detection through machine learning

Place of Production :	Tanjung Malim
Publisher :	Fakulti Sains dan Matematik
Year of Publication :	2023
Notes :	International Journal of Advanced Computer Science and Applications
Corporate Name :	Universiti Pendidikan Sultan Idris

Abstract :

Voice pathology is a widespread problem that must be addressed. Traditionally, it is treated with surgical instruments in various healthcare settings. More recently, machine learning researchers have paid increasing attention to this problem by exploiting signal processing of the voice. To this end, numerous voice features have been used to classify voice signals as healthy or pathological. In particular, Mel-Frequency Cepstral Coefficients (MFCC) are a widely used feature in speech and audio signal processing; they represent the spectral characteristics of a voice signal, particularly of human speech. However, computing MFCC is time-consuming, which is at odds with the need for rapid results. This study develops another voice feature based on the average value of the amplitudes (AVA) of the voice signal. A Gaussian Naive Bayes classifier is employed to classify the given voice signals as healthy or pathological. The dataset was acquired from the Saarbrücken Voice Database (SVD) to demonstrate the workability of the proposed voice feature and its use in the classifier. The experiments yielded very promising results: the recall, F1, and accuracy scores obtained are 100%, 83%, and 80%, respectively. These results suggest that the proposed classifier could be deployed in various healthcare settings. © 2023, Science and Information Organization. All Rights Reserved.
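The pipeline described in the abstract (one AVA value per recording, a Gaussian Naive Bayes classifier, then recall, F1, and accuracy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say whether AVA averages raw or absolute amplitudes, so absolute values are assumed here; the sine-wave "recordings" are toy data invented for the example and are not taken from the SVD; and the names `ava`, `GaussianNB1D`, and `recall_f1_accuracy` are hypothetical.

```python
import math
from statistics import mean, pvariance


def ava(signal):
    """Average value of the amplitudes (ASSUMPTION: absolute amplitudes)."""
    return mean(abs(s) for s in signal)


class GaussianNB1D:
    """Gaussian Naive Bayes for a single scalar feature (hypothetical helper)."""

    def fit(self, xs, ys):
        self.params = {}
        for label in sorted(set(ys)):
            vals = [x for x, y in zip(xs, ys) if y == label]
            # Per-class mean, variance (small floor avoids division by zero), prior.
            self.params[label] = (mean(vals), pvariance(vals) + 1e-9,
                                  len(vals) / len(xs))
        return self

    def _log_posterior(self, x, mu, var, prior):
        # log prior + log Gaussian likelihood (up to the shared evidence term).
        return (math.log(prior) - 0.5 * math.log(2 * math.pi * var)
                - (x - mu) ** 2 / (2 * var))

    def predict(self, xs):
        return [max(self.params,
                    key=lambda c: self._log_posterior(x, *self.params[c]))
                for x in xs]


def recall_f1_accuracy(y_true, y_pred, positive=1):
    """Recall, F1, and accuracy with `positive` as the pathological class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return recall, f1, accuracy


def toy_signal(amplitude, n=200, period=50):
    """Synthetic stand-in for a voice recording (NOT real SVD data)."""
    return [amplitude * math.sin(2 * math.pi * k / period) for k in range(n)]


# Toy training set: label 0 = healthy, 1 = pathological (arbitrary amplitudes).
train_amplitudes = [1.0, 0.9, 1.1, 0.3, 0.4, 0.5]
train_labels = [0, 0, 0, 1, 1, 1]
train_features = [ava(toy_signal(a)) for a in train_amplitudes]

model = GaussianNB1D().fit(train_features, train_labels)
predictions = model.predict([ava(toy_signal(1.05)), ava(toy_signal(0.35))])
```

Because the feature is a single average per recording, extraction costs one pass over the samples, which is the speed argument the abstract makes against MFCC. The reported scores are also mutually consistent: with a recall of 100%, an F1 of 83% corresponds to a precision of roughly 71%, since F1 = 2PR / (P + R).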


