UPSI Digital Repository (UDRep)

Type :final_year_project
Subject :QA Mathematics
Main Author :Zahirah Yahya
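Title :Aplikasi kaedah K-Jiran terdekat (KNN) dalam penggantian data hujan yang hilang di Selangor [Application of the K-Nearest Neighbours (KNN) method to the imputation of missing rainfall data in Selangor]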
Place of Production :Tanjong Malim
Publisher :Fakulti Sains dan Matematik
Year of Publication :2023
Corporate Name :Universiti Pendidikan Sultan Idris

Abstract :
This study aims to apply the K-Nearest Neighbours (KNN) method to the imputation of missing rainfall data for rainfall stations in Selangor. The study has two main objectives: (i) to apply the KNN method to the imputation of missing rainfall data in Selangor, and (ii) to analyse differences in the accuracy of KNN imputation of missing rainfall data across different stations in Selangor. The study was conducted on rainfall time series data from ten rainfall stations in Selangor. A total of 365 data points were used for each station, and values were removed from the data in the categories of 1%, 5%, 10%, 25% and 50%. The missing values were then replaced with values computed by KNN. Next, the accuracy of the KNN-computed values was assessed using the correlation coefficient. The results show that the KNN method can be applied to estimate missing rainfall data, but the correlation coefficients are low. Furthermore, the analysis of differences in the accuracy of KNN imputation for each percentage category of missing rainfall data shows that almost all categories obtained correlation coefficients in a range interpreted as very weak, namely between 0.01 and 0.20 (0.01
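The procedure described in the abstract (hide a percentage of one station's values, fill them with KNN, then compare the filled values with the originals using the correlation coefficient) can be sketched roughly as below. This is a minimal illustration and not the author's implementation: the record does not state the value of k, the distance measure, or how neighbours were defined, so the Euclidean distance over the other stations' same-day readings, `k=5`, the function names `knn_impute` and `evaluate`, and the synthetic gamma-distributed rainfall are all assumptions made for the example.

```python
import numpy as np


def knn_impute(data, target, k=5):
    """Fill NaNs in column `target` with the mean of that column over the
    k days whose readings at the other stations are closest (Euclidean).
    Assumes only the target column contains missing values."""
    filled = data.copy()
    others = np.delete(data, target, axis=1)      # readings at the other stations
    missing = np.isnan(data[:, target])
    donors = np.flatnonzero(~missing)             # days with an observed target value
    for day in np.flatnonzero(missing):
        dist = np.linalg.norm(others[donors] - others[day], axis=1)
        nearest = donors[np.argsort(dist)[:k]]    # the k most similar days
        filled[day, target] = data[nearest, target].mean()
    return filled


def evaluate(data, target, fraction, k=5, seed=0):
    """Hide `fraction` of the target station's values, impute them, and
    return the Pearson correlation between true and imputed values."""
    rng = np.random.default_rng(seed)
    n_days = data.shape[0]
    n_missing = max(2, round(fraction * n_days))  # need at least 2 points for a correlation
    hidden = rng.choice(n_days, size=n_missing, replace=False)
    masked = data.copy()
    masked[hidden, target] = np.nan
    filled = knn_impute(masked, target, k)
    return np.corrcoef(data[hidden, target], filled[hidden, target])[0, 1]


# Synthetic stand-in for 365 daily values at ten stations (NOT the study's data):
# a shared regional signal plus independent station-level noise.
rng = np.random.default_rng(42)
regional = rng.gamma(shape=0.5, scale=10.0, size=(365, 1))
synthetic = regional + rng.gamma(shape=0.5, scale=4.0, size=(365, 10))

for fraction in (0.01, 0.05, 0.10, 0.25, 0.50):
    r = evaluate(synthetic, target=0, fraction=fraction)
    print(f"{fraction:.0%} missing: r = {r:.2f}")
```

The correlations printed for the synthetic data depend entirely on how strongly the invented stations are related, so they say nothing about the values reported in the study.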


This material may be protected under Copyright Act which governs the making of photocopies or reproductions of copyrighted materials.
You may use the digitized material for private study, scholarship, or research.
