UPSI Digital Repository (UDRep)

Type :Article
Subject :L Education (General)
ISSN :2376-5992
Main Author :Norliza Abdul Majid
Title :Unlocking the potential of LSTM for accurate salary prediction with MLE, Jeffreys prior, and advanced risk functions
Hits :115
Place of Production :Tanjung Malim
Publisher :Fakulti Pembangunan Manusia
Year of Publication :2024
Notes :PeerJ Computer Science
Corporate Name :Universiti Pendidikan Sultan Idris

Abstract :
This article addresses the challenge of predicting the salaries of college graduates, a subject of significant practical value in human resources and career planning. Traditional prediction models often overlook diverse influencing factors and complex data distributions, limiting the accuracy and reliability of their predictions. Against this backdrop, we propose a novel prediction model that integrates maximum likelihood estimation (MLE), Jeffreys priors, the Kullback-Leibler risk function, and Gaussian mixture models to optimize LSTM models in deep learning. Compared to existing research, our approach offers three innovations. First, we improve the model’s predictive accuracy through the use of MLE. Second, we reduce the model’s complexity and enhance its interpretability by applying Jeffreys priors. Lastly, we employ the Kullback-Leibler risk function for model selection and optimization, while Gaussian mixture models further refine the capture of complex characteristics of the salary distribution. To validate the effectiveness and robustness of our model, we conducted experiments on two different datasets. The results show significant improvements in prediction accuracy, model complexity, and risk performance. This study not only provides an efficient and reliable tool for predicting the salaries of college graduates but also offers robust theoretical and empirical foundations for future research in this field. © 2024, Li et al. Distributed under Creative Commons CC-BY 4.0. All rights reserved.

This material may be protected under the Copyright Act, which governs the making of photocopies or reproductions of copyrighted materials.
You may use the digitized material for private study, scholarship, or research.
