UPSI Digital Repository (UDRep)
Abstract : Universiti Pendidikan Sultan Idris
The objective of this study is to evaluate and compare the proposed statistical
downscaling models for the states of Kelantan and Terengganu. The study also identifies
the most accurate imputation method for handling missing atmospheric data and
the important predictors for statistical downscaling by reducing the
dimensionality of the data. The data used in this study include atmospheric data (predictors)
and daily rainfall data (predictand) from 1998 until 2007. As part of its methodology,
this study used an imputation method to handle missing data. Then, Principal
Component Analysis (PCA) was applied to rectify the issue of high-dimensional data
and select predictors for a two-phase model. A two-phase machine learning
approach was introduced as a precise statistical downscaling method for Kelantan and
Terengganu. The first phase is a classification using Support Vector
Classification (SVC), which separates dry days from wet days. The second phase is a regression that
estimates the amount of rainfall based on the frequency of wet days using Support
Vector Regression (SVR), an Artificial Neural Network (ANN), or a Relevance Vector
Machine (RVM). The proposed model was evaluated using the performance
measures Root Mean Square Error (RMSE) and Nash-Sutcliffe Efficiency
(NSE). The comparison of imputation methods shows that Random Forest (RF) has the
lowest RMSE value and the highest NSE value. The PCA results indicate that
two principal components were selected, with a cut-off eigenvalue of 1.6 and a cumulative
percentage of 70.29% of the total variance. In the conclusion of this study, the comparison of
results shows that the SVC and RVM hybrid reproduces the
most reasonable daily rainfall projection and captures high rainfall extremes,
making it a strong candidate for rainfall prediction research. The implication of this
study is that establishing the relationship between the predictand and the predictors
with a hybrid model can improve prediction accuracy in climate change
projections.
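
The two performance measures named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of RMSE and NSE; the function names and the sample rainfall values are assumptions chosen for illustration and are not taken from the study.

```python
import math


def rmse(observed, simulated):
    """Root Mean Square Error: 0 is a perfect fit, lower is better."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the skill
    of always predicting the observed mean, negative values are worse."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / spread


# Hypothetical daily rainfall in mm (illustrative values only).
observed = [0.0, 2.0, 4.0, 6.0]
simulated = [1.0, 2.0, 3.0, 6.0]

rmse_value = rmse(observed, simulated)
nse_value = nse(observed, simulated)
```

Under these definitions, the method with the lowest RMSE and the highest NSE is the one that tracks the observations most closely, which is the criterion the abstract applies when ranking imputation methods.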
References |
Abbott, D. (1999). Combining models to improve classifier accuracy and robustness. Proceedings of Second International Conference on …, January 1999, 1–7.
Abdel-Kader, H., Salam, M. A.-E., & ... (2021). Hybrid Machine Learning Model for Rainfall Forecasting. Journal of Intelligent …, 1(1), 5–12. https://doi.org/10.5281/zenodo.3376685
Acuña, E., & Rodriguez, C. (2004). The Treatment of Missing Values and its Effect on Classifier Accuracy. Classification, Clustering, and Data Mining Applications. https://doi.org/10.1007/978-3-642-17103-1_60
Advani, V. (2021). What is Machine Learning? How Machine Learning Works and future of it? Great Learning. https://www.mygreatlearning.com/blog/what-ismachine- learning/
Agrawal, A. (2019). Highlights the advantages and disadvantages of machine learning. Cyber Infrastructure, CIS. https://www.cisin.com/coffeebreak/ Enterprise/highlights-the-advantages-and-disadvantages-of-machinelearning. html
Ahmadkhani, S., & Adibi, P. (2016). Face recognition using supervised probabilistic principal component analysis mixture model in dimensionality reduction without loss framework. IET Computer Vision, 10(3), 193–201. https://doi.org/10.1049/iet-cvi.2014.0434
Aksornsingchai, P., & Srinilta, C. (2011). Statistical downscaling for rainfall and temperature prediction in Thailand. IMECS 2011 - International MultiConference of Engineers and Computer Scientists 2011, 1(January 1948), 356–361.
Albon, C. (2017). SVC Parameters When Using RBF Kernel. GitHub. https://chrisalbon.com/machine_learning/support_vector_machines/svc_paramet ers_using_rbf_kernel/
Ali, A. H., & Abdullah, M. Z. (2020). An efficient model for data classification based on SVM grid parameter optimization and PSO feature weight selection. International Journal of Integrated Engineering, 12(1), 1–12. https://doi.org/10.30880/ijie.2020.12.01.001
Aljuaid, T., & Sasi, S. (2017). Proper imputation techniques for missing values in data sets. Proceedings of the 2016 International Conference on Data Science and Engineering, ICDSE 2016. https://doi.org/10.1109/ICDSE.2016.7823957
Alsaber, A. R., Pan, J., & Al-Hurban, A. (2021). Handling complex missing data using random forest approach for an air quality monitoring dataset: A case study of kuwait environmental data (2012 to 2018). International Journal of Environmental Research and Public Health, 18(3), 1–26. https://doi.org/10.3390/ijerph18031333
Amirabadizadeh, M., Ghazali, A. H., Huang, Y. F., & Wayayok, A. (2016). Downscaling daily precipitation and temperatures over the Langat River Basin in Malaysia : A comparison of two statistical downscaling approaches. International Journal of Water Resources and Environmental Engineering, 8(December), 120–136. https://doi.org/10.5897/IJWREE2016.0585
Anandhi, A., Srinivas, V. V., NAnjundiah, R. S., & Kumar, D. N. (2008). Downscaling precipitation to river basin in India for IPCC SRES scenarions using support vector machine. International Journal of Climatology, 28(March 2008), 401–420. https://doi.org/10.1002/joc
Andridge, R. R., & Little, R. J. A. (2010). A review of hot deck imputation for survey non-response. International Statistical Review, 78(1), 40–64. https://doi.org/10.1111/j.1751-5823.2010.00103.x
Angra, S., & Ahuja, S. (2017). Machine learning and its applications: A review. Proceedings of the 2017 International Conference On Big Data Analytics and Computational Intelligence, ICBDACI 2017, April 2020, 57–60. https://doi.org/10.1109/ICBDACI.2017.8070809
Anguita, D., Ghelardoni, L., Ghio, A., Oneto, L., & Ridella, S. (2012). The ‘ K ’ in Kfold Cross Validation. European Symposium on Artificial Neural Networks- ESANN 2012 Proceedings, April.
Anguita, D., Ghio, A., Ridella, S., & Sterpi, D. (2009). K-Fold Cross Validation for Error Rate Estimate in Support Vector Machines. Vessels Fuel Consumption Forecast and Trim Optimisation: a Data Analytics Perspective View project KFold Cross Validation for Error Rate Estimate in Support Vector Machines. Proc. DMIN Int. Conf. Data Mining, January. https://www.researchgate.net/publication/220704948
Anguita, D., Ridella, S., Rivieccio, F., & Zunino, R. (2003). Hyperparameter design criteria for support vector classifiers. Neurocomputing, 55(1–2), 109–134. https://doi.org/10.1016/S0925-2312(03)00430-2
Arifin, F., Robbani, H., Annisa, T., & Ma’Arof, N. N. M. I. (2019). Variations in the Number of Layers and the Number of Neurons in Artificial Neural Networks: Case Study of Pattern Recognition. Journal of Physics: Conference Series, 1413(1). https://doi.org/10.1088/1742-6596/1413/1/012016
ASCE. (2000). Artificial Neural Network in Hydrology I: Preliminary Concepts. In Journal of Hydrologic Engineering (Vol. 5, Issue 2).
Assent, I. (2012). Clustering high dimensional data. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(4), 340–350. https://doi.org/10.1002/widm.1062
Ayesha, S., Hanif, M. K., & Talib, R. (2020). Overview and comparative study of dimensionality reduction techniques for high dimensional data. Information Fusion, 59(May 2019), 44–58. https://doi.org/10.1016/j.inffus.2020.01.005
Azid, A., Juahir, H., Toriman, M. E., Kamarudin, M. K. A., Saudi, A. S. M., Hasnam, C. N. C., Aziz, N. A. A., Azaman, F., Latif, M. T., Zainuddin, S. F. M., Osman, M. R., & Yamin, M. (2014). Prediction of the level of air pollution using principal component analysis and artificial neural network techniques: A case study in Malaysia. Water, Air, and Soil Pollution, 225(8). https://doi.org/10.1007/s11270-014-2063-1
Baghanam, A. H., Eslahi, M., Sheikhbabaei, A., & Seifi, A. J. (2020). Assessing the impact of climate change over the northwest of Iran: an overview of statistical downscaling methods. Theoretical and Applied Climatology, 141(3–4), 1135– 1150. https://doi.org/10.1007/s00704-020-03271-8
Bahari, N. I. S., Ahmad, A., & Aboobaider, B. M. (2014). Application of support vector machine for classification of multispectral data. IOP Conference Series: Earth and Environmental Science, 20(1). https://doi.org/10.1088/1755- 1315/20/1/012038
Bala, R., & Kumar, D. (2017). Classification Using ANN: A Review. International Journal of Computational Intelligence Research, 13(7), 1811–1820. http://www.ripublication.com
Baraldi, A. N., & Enders, C. K. (2010). An introduction to modern missing data analyses. Journal of School Psychology, 48(1), 5–37. https://doi.org/10.1016/j.jsp.2009.10.001
Barnard, J., & Meng, X.-L. (1999). Applications of multiple imputation in medical studies: from AIDS to NHANES. Statistical Methods in Medical Research, 8(1), 17–36. https://doi.org/10.1177/096228029900800103
Batista, G., & Monard, M. (2002). A Study of K -Nearest Neighbour as an Imputation Method. Argentine Symposium on Artificial Intelligence, October.
Beaudoin, A., Bernier, P. Y., Guindon, L., Villemaire, P., Guo, X. J., Stinson, G., Bergeron, T., Magnussen, S., & Hall, R. J. (2014). Mapping attributes of Canada’s forests at moderate resolution through kNN and MODIS imagery. Canadian Journal of Forest Research, 44(5), 521–532. https://doi.org/10.1139/cjfr-2013-0401
Bell, W., Brockwell, P. J., & Davis, R. A. (2009). Time Series: Theory and Methods. In Journal of the American Statistical Association (Vol. 84, Issue 405). https://doi.org/10.2307/2289896
Benestad, R., & Benestad, R. (2016). Downscaling Climate Information. In Oxford Research Encyclopedia of Climate Science (Issue June). https://doi.org/10.1093/acrefore/9780190228620.013.27
Bengio, Y., & Grandvalet, Y. (2004). No Unbiased Estimator of the Variance of KFold Cross Validation. Journal of Machine Learning Research, 5, 1089–1105. https://doi.org/10.1016/S0006-291X(03)00224-9
Bennett, D. A. (2001). How can I deal with missing data in my study? Australian and New Zealand Journal of Public Health, 25(5), 464–469. https://doi.org/10.1111/j.1467-842X.2001.tb00294.x
Beretta, L., & Santaniello, A. (2016). Nearest neighbor imputation algorithms: A critical evaluation. BMC Medical Informatics and Decision Making, 16(74). https://doi.org/10.1186/s12911-016-0318-z
Bergmeir, C., Hyndman, R. J., & Koo, B. (2018). A note on the validity of crossvalidation for evaluating autoregressive time series prediction. Computational Statistics and Data Analysis, 120, 70–83. https://doi.org/10.1016/j.csda.2017.11.003
Berrar, D. (2018). Cross-validation. Encyclopedia of Bioinformatics and Computational Biology: ABC of Bioinformatics, 1–3(April), 542–545. https://doi.org/10.1016/B978-0-12-809633-8.20349-X
Berzofsky, M., Biemer, P., & Kalsbeek, W. (2008). A Brief History of Classification Error Models. Proceeding of Joint Statistical Modeetings, 3667–3673.
Bethere, L., Sennikovs, J., & Bethers, U. (2017). Climate indices for the Baltic states from principal component analysis. Earth System Dynamics, 8(4), 951–962. https://doi.org/10.5194/esd-8-951-2017
Bhattacharya, A. (2014). Curse of Dimensionality. Fundamentals of Database Indexing and Searching, 141–148. https://doi.org/10.1201/b17767-13
Bhattacharya, D., Nisha, M. G., & Pillai, G. N. (2015). Relevance vector-machinebased solar cell model. International Journal of Sustainable Energy, 34(10), 685–692. https://doi.org/10.1080/14786451.2014.885030
Bhavsar, H., & Ganatra, A. (2012). A Comparative Study of Training Algorithms for Supervised Machine Learning. International Journal of Soft Computing and Engineering, 2(4), 74–81.
Bing, Q., Gong, B., Yang, Z., Shang, Q., & Zhou, X. (2015). Short-Term Traffic Flow Local Prediction Based on Combined Kernel Function Relevance Vector Machine Model. Mathematical Problems in Engineering, 2015. https://doi.org/10.1155/2015/154703
Böhner, J., & Bechtel, B. (2017). GIS in Climatology and Meteorology. In Comprehensive Geographic Information Systems (Vol. 3). https://doi.org/10.1016/B978-0-12-409548-9.09633-0
Boisberranger, J. du, Bossche, J. Van den, & Estève, L. (2017). RBF SVM parameters. Scikit-Learn Developers. https://scikitlearn. org/stable/about.html#authors
Borra, S., & Di Ciaccio, A. (2010). Measuring the prediction error. A comparison of cross-validation, bootstrap and covariance penalty methods. Computational Statistics and Data Analysis, 54(12), 2976–2989. https://doi.org/10.1016/j.csda.2010.03.004
Breiman, L. (2001). Random Forests. Machine Language, 45(1), 5–32. https://doi.org/10.14569/ijacsa.2016.070603
Breiman, L., Cutler, A., Liaw, A., & Wiener, M. (2018). Package “randomForest.” CRAN. https://doi.org/10.1023/A
Brence, J. R., & Brown, D. E. (2006). Improving the Robust Random Forest Regression Algorithm. In Systems and Information Engineering Technical Papers, Department of Systems and Information Engineering.
Brownlee, J. (2020). Train-Test Split for Evaluating Machine Learning Algorithms. Python Machine Learning. https://machinelearningmastery.com/train-test-splitfor- evaluating-machine-learning-algorithms/
Bunkley, Ni. (2008). Joseph Juran, 103, Pioneer in Quality Control, Dies. The New York Times. https://www.nytimes.com/2008/03/03/business/03juran.html
Bürger, G. (1996). Expanded downscaling for generating local weather scenarios. Climate Research, 7(2), 111–128. https://doi.org/10.3354/cr007111
Burman, P. (1989). A comparative study of ordinary cross-validation, v-fold crossvalidation and the repeated learning-testing methods. Biometrika, 76(3), 503– 514. https://doi.org/10.1093/biomet/76.3.503
Campion, W. M., & Rubin, D. B. (1989). Multiple Imputation for Nonresponse in Surveys. In Journal of Marketing Research (Vol. 26, Issue 4). https://doi.org/10.2307/3172772
Carleo, G., Cirac, I., Cranmer, K., Daudet, L., Schuld, M., Tishby, N., Vogt-Maranto, L., & Zdeborová, L. (2019). Machine learning and the physical sciences. Reviews of Modern Physics, 91(4), 45002. https://doi.org/10.1103/RevModPhys.91.045002
Castellano, C. M., & DeGaetano, A. T. (2017). Downscaling extreme precipitation from CMIP5 simulations using historical analogs. Journal of Applied Meteorology and Climatology, 56(9), 2421–2439. https://doi.org/10.1175/JAMC-D-16-0250.1
Chai, T., & Draxler, R. R. (2014). Root mean square error (RMSE) or mean absolute error (MAE)? -Arguments against avoiding RMSE in the literature. Geoscientific Model Development, 7(3), 1247–1250. https://doi.org/10.5194/gmd-7-1247-2014
Chandrashekar, G., & Sahin, F. (2014). A survey on feature selection methods. Computers and Electrical Engineering, 40(1), 16–28. https://doi.org/10.1016/j.compeleceng.2013.11.024
Che Mat Nor, S. M., Shaharudin, S. M., Ismail, S., Zainuddin, N. H., & Tan, M. L. (2020). A comparative study of different imputation methods for daily rainfall data in east-coast Peninsular Malaysia. Bulletin of Electrical Engineering and Informatics, 9(2), 635–643. https://doi.org/10.11591/eei.v9i2.2090
Cheema, J. R. (2014). A Review of Missing Data Handling Methods in Education Research. Review of Educational Research, 84(4), 487–508. https://doi.org/10.3102/0034654314532697
Chen, C., & Shyu, M. L. (2011). Clustering-based binary-class classification for imbalanced data sets. Proceedings of the 2011 IEEE International Conference on Information Reuse and Integration, IRI 2011, 384–389. https://doi.org/10.1109/IRI.2011.6009578
Chen, S., Gu, C., Lin, C., Zhang, K., & Zhu, Y. (2020). Multi-kernel optimized relevance vector machine for probabilistic prediction of concrete dam displacement. Engineering with Computers, 0123456789. https://doi.org/10.1007/s00366-019-00924-9
Chen, S. H., Jain, L., & Tai, C. C. (2005). Computational economics: A perspective from computational intelligence. In Computational Intelligence and its Applications Series (Issue May 2016). Idea Group Publishing. https://doi.org/10.4018/978-1-59140-649-5
Chen, S. T., Yu, P. S., & Tang, Y. H. (2010). Statistical downscaling of daily precipitation using support vector machines and multivariate analysis. Journal of Hydrology, 385(1–4), 13–22. https://doi.org/10.1016/j.jhydrol.2010.01.021
Chen, Z. (2001). Data-Mining and Uncertain Reasoning: An Integrated Approach. In Information Visualization. Wiley, New York. https://doi.org/10.1057/palgrave.ivs.9500041
Cheng, C. H., & Yang, J. H. (2016). A novel rainfall forecast model based on the integrated non-linear attribute selection method and support vector regression. Journal of Intelligent and Fuzzy Systems, 31(2), 915–925. https://doi.org/10.3233/JIFS-169021
Cheng, C. T., Niu, W. J., Feng, Z. K., Shen, J. J., & Chau, K. W. (2015). Daily reservoir runoff forecasting method using artificial neural network based on quantum-behaved particle swarm optimization. Water (Switzerland), 7(8), 4232– 4246. https://doi.org/10.3390/w7084232
Chhabra, G., Vashisht, V., & Ranjan, J. (2017). A Comparison of Multiple Imputation Methods for Data with Missing Values. Indian Journal of Science and Technology, 10(19), 1–7. https://doi.org/10.17485/ijst/2017/v10i19/110646
Chhabra, G., Vashisht, V., & Ranjan, J. (2019). A review on missing data value estimation using imputation algorithm. Journal of Advanced Research in Dynamical and Control Systems, 11(7 Special Issue), 312–318.
Cho, M. Y., & Hoang, T. T. (2017). Feature Selection and Parameters Optimization of SVM Using Particle Swarm Optimization for Fault Classification in Power Distribution Systems. Computational Intelligence and Neuroscience, 1–9. https://doi.org/10.1155/2017/4135465
Coulibaly, P. (2004). Downscaling daily extreme temperatures with genetic programming. Geophysical Research Letters, 31(16), 1–4. https://doi.org/10.1029/2004GL020075
Crusoveanu, L. (2021). Epoch in Neural Networks. Baeldung. https://www.baeldung.com/cs/epoch-neural-networks
Cummins, N., Sethu, V., Epps, J., & Krajewski, J. (2015). Relevance Vector Machine for Depression Prediction Industrial Psychology , Rhenish University of Applied Sciences Cologne , Germany. Interspeech 2015, 1(2), 110–114.
Daniel, F. (2020). What is Machine Learning? Emerj The Al Research and Advisory Company. https://emerj.com/ai-glossary-terms/what-is-machine-learning/
Das, J., & Nanduri, U. V. (2018). Assessment and evaluation of potential climate change impact on monsoon flows using machine learning technique over Wainganga River basin, India. Hydrological Sciences Journal, 63(7), 1020– 1046. https://doi.org/10.1080/02626667.2018.1469757
Davey, A., & Savla, J. (2010). Statistical Power Analysis with Missing Data. Routledge Taylor & Francis Group, LLC.
Dawson, C. W., Abrahart, R. J., Shamseldin, A. Y., & Wilby, R. L. (2006). Flood estimation at ungauged sites using artificial neural networks. Journal of Hydrology, 319(1–4), 391–409. https://doi.org/10.1016/j.jhydrol.2005.07.032
Deo, R. C., Samui, P., & Kim, D. (2016). Estimation of monthly evaporative loss using relevance vector machine, extreme learning machine and multivariate adaptive regression spline models. Stochastic Environmental Research and Risk Assessment, 30(6), 1769–1784. https://doi.org/10.1007/s00477-015-1153-y
Department of Irrigation and Drainage. (2018). Hydrological Standard for Rainfall Station Instrumentation.
Desai, K. M., Survase, S. A., Saudagar, P. S., Lele, S. S., & Singhal, R. S. (2008). Comparison of artificial neural network (ANN) and response surface methodology (RSM) in fermentation media optimization: Case study of fermentative production of scleroglucan. Biochemical Engineering Journal, 41(3), 266–273. https://doi.org/10.1016/j.bej.2008.05.009
Devak, M., & Dhanya, C. T. (2014). Downscaling of Precipitation in Mahanadi Basin , India. International Journal of Civil Engineering Research, 5(2), 111–120.
Dhiraj, K. (2019). Top 4 advantages and disadvantages of Support Vector Machine or SVM. Medium. https://dhirajkumarblog.medium.com/top-4-advantages-anddisadvantages- of-support-vector-machine-or-svm-a3c06a2b107
Dhurandhar, A., & Dobra, A. (2009). Evaluating Evaluation Measure. In Proceedings of Evaluation Methods in Machine Learning Workshop in International Conference on Machine Learning (ICML) 2009. https://doi.org/10.1002/pdh.264
Dominick, D., Juahir, H., Latif, M. T., Zain, S. M., & Aris, A. Z. (2012). Spatial assessment of air quality patterns in Malaysia using multivariate analysis. Atmospheric Environment, 60, 172–181. https://doi.org/10.1016/j.atmosenv.2012.06.021
Dong, Y., Wang, J., Wang, C., & Guo, Z. (2017). Research & application of hybrid forecasting model based on an optimal feature selection system-A case study on electrical load forecasting. Energies, 10(4). https://doi.org/10.3390/en10040490
Dorado, J., RabuñAL, J. R., Pazos, A., Rivero, D., Santos, A., & Puertas, J. (2003). Prediction and modeling of the rainfall-runoff transformation of a typical urban basin using ann and gp. Applied Artificial Intelligence, 17(4), 329–343. https://doi.org/10.1080/713827142
Drago, C., & Scepi, G. (2015). Time series clustering from high dimensional data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7627, Issue December 2014). https://doi.org/10.1007/978-3-662-48577-4_5
Duhan, D., & Pandey, A. (2015). Statistical downscaling of temperature using three techniques in the Tons River basin in Central India. Theoretical and Applied Climatology, 121(3–4), 605–622. https://doi.org/10.1007/s00704-014-1253-5
Efron, B., & Gong, G. (1985). A leisurely look at the Bootstrap, the Jackknife and Cross-Validation. American Statistician, 37(1), 36–48.
El-Shafie, A., Mukhlisin, M., Najah, A. A., & Taha, M. R. (2011). Performance of artificial neural network and regression techniques for rainfall-runoff prediction. International Journal of Physical Sciences, 6(8), 1997–2003. https://doi.org/10.5897/IJPS11.314
Enders, C. K. (2010). Applied missing data analysis. Guilford Press. https://books.google.com/books?hl=en&lr=&id=MN8ruJd2tvgC&oi=fnd&pg=P A1&dq=Enders,+2010&ots=dJnDs_Vls8&sig=gEP41sXuZcAE2DlqF1qEOo9A H8Q
Engel, D., Hüttenberger, L., & Hamann, B. (2012). A survey of dimension reduction methods for high-dimensional data analysis and visualization. OpenAccess Series in Informatics, 27, 135–149. https://doi.org/10.4230/OASIcs.VLUDS.2011.135
Erdal, H. I., & Karakurt, O. (2013). Advancing monthly streamflow prediction accuracy of CART models using ensemble learning paradigms. Journal of Hydrology, 477, 119–128. https://doi.org/10.1016/j.jhydrol.2012.11.015
Erichson, N. B., Zheng, P., Manohar, K., Brunton, S. L., Kutz, J. N., & Aravkin, A. Y. (2020). Sparse Principal Component Analysis via Variable Projection. SIAM Journal on Applied Mathematics, 80(2), 977–1002.
Falkenberg Nielsen, O., & Johnsen, G. (2015). Normal aldring. Anatomi Og Fysiologi, 1. Alsvåg H. Omsorg-med udgangspunkt i Kari Mart.
Fang, C., & Wang, C. (2020). Time Series Data Imputation: A Survey on Deep Learning Approaches. http://arxiv.org/abs/2011.11347
Ferguson, K. (2018). Why It’s Important to Standardize Your Data. Human of Data by Atlan. https://humansofdata.atlan.com/2018/12/datastandardization/#:~: text=Standardized data is essential for,data to measure it against.
Fogarty, D. J. (2006). Multiple imputation as a missing data approach to reject inference on consumer credit scoring. Interstat, December 2000, 1–41. http://interstat.statjournals.net/YEAR/2006/articles/0609001.pdf
Forghani, Y., Tabrizi, R. S., Yazdi, H. S., & Akbarzadeh-T, M. R. (2011). Fuzzy support vector regression. 2011 1st International EConference on Computer and Knowledge Engineering, ICCKE 2011, Vc, 28–33. https://doi.org/10.1109/ICCKE.2011.6413319
Fushiki, T. (2011). Estimation of prediction error by using K-fold cross-validation. Statistics and Computing, 21(2), 137–146. https://doi.org/10.1007/s11222-009- 9153-8
Gaag, M. van der, Hoffman, T., Remijsen, M., Hijman, R., de Haan, L., van Meijel, B., van Harten, P. N., Valmaggia, L., de Hert, M., Cuijpers, A., & Wiersma, D. (2006). The five-factor model of the Positive and Negative Syndrome Scale II: A ten-fold cross-validation of a revised model. Schizophrenia Research, 85(1–3), 280–287. https://doi.org/10.1016/j.schres.2006.03.021
Gao, L., Song, J., Liu, X., Shao, J., Liu, J., & Shao, J. (2017). Learning in highdimensional multimedia data: the state of the art. Multimedia Systems, 23(3), 303–313. https://doi.org/10.1007/s00530-015-0494-1
Gao, Y., Merz, C., Lischeid, G., & Schneider, M. (2018). A review on missing hydrological data processing. Environmental Earth Sciences, 77(2), 47. https://doi.org/10.1007/s12665-018-7228-6
Gaur, A., & Simonovic, S. P. (2018). Introduction to physical scaling: A model aimed to bridge the gap between statistical and dynamic downscaling approaches. In Trends and Changes in Hydroclimatic Variables: Links to Climate Variability and Change. Elsevier Inc. https://doi.org/10.1016/B978-0-12-810985-4.00004-9
Geisser, S. (1975). The predictive sample reuse method with applications. Journal of the American Statistical Association, 70(350), 320–328. https://doi.org/10.1080/01621459.1975.10479865
Ghahramani, Z. (2004). Unsupervised Learning. Machine Learning, 72–112.
Ghasemi, F., Mehridehnavi, A., Pérez-Garrido, A., & Pérez-Sánchez, H. (2018). Neural network and deep-learning algorithms used in QSAR studies: merits and drawbacks. Drug Discovery Today, 23(10), 1784–1790. https://doi.org/10.1016/j.drudis.2018.06.016
Ghosh, S., & Mujumdar, P. P. (2008). Statistical downscaling of GCM simulations to streamflow using relevance vector machine. Advances in Water Resources, 31(1), 132–146. https://doi.org/10.1016/j.advwatres.2007.07.005
Ghritlahre, H. K., & Prasad, R. K. (2018). Application of ANN technique to predict the performance of solar collector systems - A review. Renewable and Sustainable Energy Reviews, 84(September 2017), 75–88. https://doi.org/10.1016/j.rser.2018.01.001
Gill, M. K., Asefa, T., Kaheil, Y., & McKee, M. (2007). Effect of missing data on performance of learning algorithms for hydrologic predictions: Implications to an imputation technique. Water Resources Research, 43(7), 1–12. https://doi.org/10.1029/2006WR005298
Golub, G. H., Heath, M., & Wahba, G. (1979). Generalized Cross-Validation as a Method for Choosing a Good Ridge Parameter. Technometrics, 21(2), 215–223.
Goly, A., Teegavarapu, R. S. V., & Mondal, A. (2014). Development and evaluation of statistical downscaling models for monthly precipitation. Earth Interactions, 18(18), 1–28. https://doi.org/10.1175/EI-D-14-0024.1
Gondara, L. (2016). Random forest with random projection to impute missing gene expression data. Proceedings - 2015 IEEE 14th International Conference on Machine Learning and Applications, ICMLA 2015, 1251–1256. https://doi.org/10.1109/ICMLA.2015.29
Grace-Martin, K. (2013). Assessing the Fit of Regression Models. The Analysis Factor. https://www.theanalysisfactor.com/assessing-the-fit-of-regressionmodels/
Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549–576. https://doi.org/10.1146/annurev.psych.58.110405.085530
Gupta, P. (2017). Cross-Validation in Machine Learning. Towars Data Science. https://towardsdatascience.com/cross-validation-in-machine-learning- 72924a69872f
Hadipour, S., Harun, S., Arefnia, A., & Alamgir, M. (2016). Transfer function models for statistical downscaling of monthly precipitation. Jurnal Teknologi, 78(9–4), 55–62. https://doi.org/10.11113/jt.v78.9695
Halik, G., Anwar, N., Santosa, B., & Edijatno. (2015). Reservoir inflow prediction under GCM scenario downscaled by wavelet transform and support vector machine hybrid models. Advances in Civil Engineering, 2015(July). https://doi.org/10.1155/2015/515376
Hamidi, O., Poorolajal, J., Sadeghifar, M., Abbasi, H., Maryanaji, Z., Faridi, H. R., & Tapak, L. (2015). A comparative study of support vector machines and artificial neural networks for predicting precipitation in Iran. Theoretical and Applied Climatology, 119(3–4), 723–731. https://doi.org/10.1007/s00704-014-1141-z
Han, M., & Zhao, Y. (2010). Robust relevance vector machine with noise variance coefficient. Proceedings of the International Joint Conference on Neural Networks. https://doi.org/10.1109/IJCNN.2010.5596989
Hannah, L. (2015). The Climate System and Climate Change. In Climate Change Biology. https://doi.org/10.1016/b978-0-12-420218-4.00002-0
Hasan, N., Nath, N. C., & Rasel, R. I. (2016). A support vector regression model for forecasting rainfall. 2nd International Conference on Electrical Information and Communication Technologies, EICT 2015, Eict, 554–559. https://doi.org/10.1109/EICT.2015.7392014
Hayati Rezvan, P., Lee, K. J., & Simpson, J. A. (2015). The rise of multiple imputation: A review of the reporting and implementation of the method in medical research Data collection, quality, and reporting. BMC Medical Research Methodology, 15(1), 1–14. https://doi.org/10.1186/s12874-015-0022-1
Heitjan, D. F., Rubin, D. B., Heitjan, B. Y. D. F., & Rubin, D. B. (1991). Ignorability and Coarse Data. The Annals of Statistics, 19(4), 2244–2253.
Henn, B., Raleigh, M. S., Fisher, A., & Lundquist, J. D. (2013). A comparison of methods for filling gaps in hourly near-surface air temperature data. Journal of Hydrometeorology, 14(3), 929–945. https://doi.org/10.1175/JHM-D-12-027.1
Hewitson, B. C., & Crane, R. G. (1996). Climate downscaling: Techniques and application. Climate Research, 7(2), 85–95. https://doi.org/10.3354/cr007085
Hjelmfelt, A. T., & Wang, M. (1993). Predicting Runoff using Artificial Neural Networks. Proceedings of the International Conference on Hydrology and Water Resources, 16(December), 233–244. https://doi.org/10.1007/978-94-011-0389- 3_16
Hoi, S. C. H., Jin, R., Zhu, J., & Lyu, M. R. (2009). Semisupervised SVM batch mode active learning with applications to image retrieval. ACM Transactions on Information Systems, 27(3), 1–29. https://doi.org/10.1145/1508850.1508854
Hong, S., & Lynn, H. S. (2020). Accuracy of random-forest-based imputation of missing data in the presence of non-normality, non-linearity, and interaction. BMC Medical Research Methodology, 20(1), 1–12. https://doi.org/10.1186/s12874-020-01080-1
Hotelling, H. (1933). Analysis of a complex of statistical variables into principal component. Journal of Educational Psychology, 24(6), 417.
Hou, K., Shao, G., Wang, H., Zheng, L., Zhang, Q., Wu, S., & Hu, W. (2018). Research on practical power system stability analysis algorithm based on modified SVM. Protection and Control of Modern Power Systems, 3(1). https://doi.org/10.1186/s41601-018-0086-0
Hsu, C.-W., Chang, C.-C., & Lin, C.-J. (2016). A Practical Guide to Support Vector Classification. Department of Computer Science NAtional Taiwan University, 106. https://doi.org/10.1177/02632760022050997
Huang, S., Nianguang, C. A. I., Penzuti Pacheco, P., Narandes, S., Wang, Y., & Wayne, X. U. (2018). Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genomics and Proteomics, 15(1), 41–51. https://doi.org/10.21873/cgp.20063
Hunt, L. A. (2017). Missing data imputation and its effect on the accuracy of classification. Studies in Classification, Data Analysis, and Knowledge Organization, 195089, 3–14. https://doi.org/10.1007/978-3-319-55723-6_1
Hussain, M., Yusof, K. W., Mustafa, M. R., & Afshar, N. R. (2015). Application of statistical downscaling model (SDSM) for long term prediction of rainfall in Sarawak, Malaysia. Water Resources Management VIII, 1, 269–278. https://doi.org/10.2495/wrm150231
I, W., & Rahman S, S. S. U. (2015). Treatment of Missing Values in Data Mining. Journal of Computer Science & Systems Biology, 09(02), 51–53. https://doi.org/10.4172/jcsb.1000221
Idri, A., Abnane, I., & Abran, A. (2015). Systematic mapping study of missing values techniques in software engineering data. 2015 IEEE/ACIS 16th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2015 - Proceedings. https://doi.org/10.1109/SNPD.2015.7176280
Irawan, N. D., Wijono, W., & Setyawati, O. (2017). Perbaikan Missing value Menggunakan Pendekatan Korelasi Pada Metode K-Nearest Neighbor. Jurnal Infotel, 9(3). https://doi.org/10.20895/infotel.v9i3.286
Ishwaran, H., Kogalur, U. B., Blackstone, E. H., & Lauer, M. S. (2008). Random survival forests. Annals of Applied Statistics, 2(3), 841–860. https://doi.org/10.1214/08-AOAS169
Janecek, A., Gansterer, W. N. W., Demel, M., & Ecker, G. (2008). On the Relationship Between Feature Selection and Classification Accuracy. Fsdm, 4, 90–105.
Jemain, A. A. (2015). Penyurihan Ikhtisas Data Hujan. Dewan Bahasa dan Pustaka.
Jiang, P., & Chen, J. (2016). Displacement prediction of landslide based on generalized regression neural networks with K-fold cross-validation. Neurocomputing, 198, 40–47. https://doi.org/10.1016/j.neucom.2015.08.118
Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: A review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065). https://doi.org/10.1098/rsta.2015.0202
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260. https://doi.org/10.1126/science.aaa8415
Joseph, V. R., & Vakayil, A. (2021). SPlit: An Optimal Method for Data Splitting. Technometrics, 0(0), 1–11. https://doi.org/10.1080/00401706.2021.1921037
Journée, M., Nesterov, Y., Richtárik, P., & Sepulchre, R. (2010). Generalized power method for sparse principal component analysis. Journal of Machine Learning Research, 11, 517–553.
Juvonen, A., Sipola, T., & Hämäläinen, T. (2015). Online anomaly detection using dimensionality reduction techniques for HTTP log analysis. Computer Networks, 91, 46–56. https://doi.org/10.1016/j.comnet.2015.07.019
Kääriäinen, M. (2006). Semi-supervised model selection based on cross-validation. IEEE International Conference on Neural Networks - Conference Proceedings, 1894–1899. https://doi.org/10.1109/ijcnn.2006.246911
Kabanda, T., & Nenwiini, S. (2016). Impacts of climate variation on the length of the rainfall season: an analysis of spatial patterns in North-East South Africa. Theoretical and Applied Climatology, 125(1–2), 93–100. https://doi.org/10.1007/s00704-015-1498-7
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23(3), 187–200. https://doi.org/10.1007/BF02289233
Kaiser, J. (2014). Dealing with Missing Values in Data. Journal of Systems Integration, 42–51. https://doi.org/10.20470/jsi.v5i1.178
Kamaruzaman, I. F., Wan Zin, W. Z., & Mohd Ariff, N. (2017). A comparison of method for treating missing daily rainfall data in Peninsular Malaysia. Malaysian Journal of Fundamental and Applied Sciences, 13(4–1), 375–380. https://doi.org/10.11113/mjfas.v13n4-1.781
Kamble, V. B., & Deshmukh, S. N. (2017). Comparision Between Accuracy and MSE,RMSE by Using Proposed Method with Imputation Technique. Oriental Journal of Computer Science and Technology, 10(04), 773–779. https://doi.org/10.13005/ojcst/10.04.11
Kang, H. (2013). The prevention and handling of the missing data. Korean Journal of Anesthesiology, 64(5), 402–406. https://doi.org/10.4097/kjae.2013.64.5.402
Karamizadeh, S., Abdullah, S. M., Halimi, M., Shayan, J., & Rajabi, M. J. (2014). Advantage and drawback of support vector machine functionality. I4CT 2014 - 1st International Conference on Computer, Communications, and Control Technology, Proceedings, I4ct, 63–65. https://doi.org/10.1109/I4CT.2014.6914146
Karunanithi, N., Grenney, W. J., Whitley, D., & Bovee, K. (1995). Neural networks for river flow prediction. Journal of Computing in Civil Engineering, 8(2), 201–220. https://doi.org/10.1061/(ASCE)0887-3801(1995)9:4(293.x)
Kassambara. (2018). Evaluation of Classification Model Accuracy: Essentials. Statistical Tools for High-Throughput Data Analysis (STHDA). http://www.sthda.com/english/articles/36-classification-methods-essentials/143-evaluation-of-classification-model-accuracy-essentials/
Katal, A., Wazid, M., & Goundar, R. (2013). Big Data: Issues, Challenges, Tools and Good Practices. 2013 Sixth International Conference on Contemporary Computing (IC3), 404–409. https://doi.org/10.1109/IC3.2013.6612229.
Kavitha, R., & Kannan, E. (2016). An efficient framework for heart disease classification using feature extraction and feature selection technique in data mining. 1st International Conference on Emerging Trends in Engineering, Technology and Science, ICETETS 2016 - Proceedings. https://doi.org/10.1109/ICETETS.2016.7603000
Khan, F. U. F., Khan, K. U. Z., & Singh, S. K. (2018). Is Group Means Imputation Any Better Than Mean Imputation: A Study Using C5.0 Classifier. Journal of Physics: Conference Series, 1060(1), 1–5. https://doi.org/10.1088/1742-6596/1060/1/012014
Kim, J., & Ryu, J. H. (2016). A heuristic gap filling method for daily precipitation series. Water Resources Management, 30(7), 2275–2294. https://doi.org/10.1007/s11269-016-1284-z
Knoben, W. J. M., Freer, J. E., & Woods, R. A. (2019). Technical note: Inherent benchmark or not? Comparing Nash-Sutcliffe and Kling-Gupta efficiency scores. Hydrology and Earth System Sciences, 23(10), 4323–4331. https://doi.org/10.5194/hess-23-4323-2019
Koch, P., Konen, W., Flasch, O., & Bartz-Beielstein, T. (n.d.). Optimization of Support Vector Regression Models for Stormwater Prediction. 146–160.
Kohavi, R. (1995). A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. International Joint Conference of Artificial Intelligence, March 2001.
Kolmogorov, A. N. (1957). On the representation of continuous functions of several variables as superpositions of continuous functions of one variable and addition. Doklady Akademii Nauk SSSR, 114(5), 953–956. https://doi.org/10.18411/lj-12-2018-148
Kong, D., Chen, Y., Li, N., Duan, C., Lu, L., & Chen, D. (2019). Relevance vector machine for tool wear prediction. Mechanical Systems and Signal Processing, 127, 573–594. https://doi.org/10.1016/j.ymssp.2019.03.023
Kong, Q., Gong, H., Ding, X., & Hou, R. (2017). Classification Application Based on Mutual Information and Random Forest Method for High Dimensional Data. Proceedings - 9th International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2017, 1(Mi), 171–174. https://doi.org/10.1109/IHMSC.2017.45
Kotu, V., & Deshpande, B. (2019). Model Evaluation. Data Science, 263–279. https://doi.org/10.1016/b978-0-12-814761-0.00008-3
Kouhestani, S., Eslamian, S. S., Abedi-Koupai, J., & Besalatpour, A. A. (2016). Projection of climate change impacts on precipitation using soft-computing techniques: A case study in Zayandeh-rud Basin, Iran. Global and Planetary Change, 144(July), 158–170. https://doi.org/10.1016/j.gloplacha.2016.07.013
Kumar, P. S., Praveen, T. V., & Prasad, M. A. (2016). Artificial Neural Network Model for Rainfall-Runoff -A Case Study. International Journal of Hybrid Information Technology, 9(3), 263–272. https://doi.org/10.14257/ijhit.2016.9.3.24
Lang, K. M., & Little, T. D. (2018). Principled missing data treatments. Prevention Science, 19(3), 284–294. https://doi.org/10.1007/s11121-016-0644-5
Larson, S. C. (1931). The shrinkage of the coefficient of multiple correlation. Journal of Educational Psychology, 22(1), 45–55. https://doi.org/10.1037/h0072400
Lee, K. J., & Carlin, J. B. (2010). Multiple imputation for missing data: Fully conditional specification versus multivariate normal imputation. American Journal of Epidemiology, 171(5), 624–632. https://doi.org/10.1093/aje/kwp425
Lei, J. (2019). Cross-Validation With Confidence. Journal of the American Statistical Association, 115(532), 1978–1997. https://doi.org/10.1080/01621459.2019.1672556
Li, L., He, S., Zhang, J., & Ran, B. (2016). Short-term highway traffic flow prediction based on a hybrid strategy considering temporal–spatial information. Journal of Advanced Transportation, 50(8), 2029–2040. https://doi.org/10.1002/atr.1443
Liaw, A., & Wiener, M. (2002). Classification and Regression by randomForest. R News, 2(3), 18–22.
Lin, S., Zhang, S., Qiao, J., Liu, H., & Yu, G. (2008). A parameter choosing method of SVR for time series prediction. Proceedings of the 9th International Conference for Young Computer Scientists, ICYCS 2008, 130–135. https://doi.org/10.1109/ICYCS.2008.393
Lionello, P., Abrantes, F., Congedi, L., Dulac, F., Gacic, M., Gomis, D., Goodess, C., Hoff, H., Kutiel, H., Luterbacher, J., Planton, S., Reale, M., Schröder, K., Vittoria Struglia, M., Toreti, A., Tsimplis, M., Ulbrich, U., & Xoplaki, E. (2012). Introduction: Mediterranean Climate-Background Information. In The Climate of the Mediterranean Region. Elsevier. https://doi.org/10.1016/B978-0-12-416042-2.00012-4
Liu, C. W., Lin, K. H., & Kuo, Y. M. (2003). Application of factor analysis in the assessment of groundwater quality in a blackfoot disease area in Taiwan. Science of the Total Environment, 313(1–3), 77–89. https://doi.org/10.1016/S0048-9697(02)00683-6
Lo Presti, R., Barca, E., & Passarella, G. (2010). A methodology for treating missing data applied to daily rainfall data in the Candelaro River Basin (Italy). Environmental Monitoring and Assessment, 160(1–4), 1–22. https://doi.org/10.1007/s10661-008-0653-3
Lopez, C., Tucker, S., Salameh, T., & Tucker, C. (2018). An unsupervised machine learning method for discovering patient clusters based on genetic signatures. Journal of Biomedical Informatics, 85(June), 30–39. https://doi.org/10.1016/j.jbi.2018.07.004
Loyola R, D. G., Pedergnana, M., & Gimeno García, S. (2016). Smart sampling and incremental function learning for very large high dimensional data. Neural Networks, 78, 75–87. https://doi.org/10.1016/j.neunet.2015.09.001
Luo, J., & Sun, Y. (2020). Optimization of process parameters for the minimization of surface residual stress in turning pure iron material using central composite design. Measurement: Journal of the International Measurement Confederation, 163, 108001. https://doi.org/10.1016/j.measurement.2020.108001
MacKay, D. J. C. (1996). Bayesian Methods for Backpropagation Networks. Physics of Neural Networks, 211–254. https://doi.org/10.1007/978-1-4612-0723-8_6
Mahmood, B. (2016). 4 Reasons Your Machine Learning Model is Wrong (and How to Fix It). KD Nuggets. https://www.kdnuggets.com/2016/12/4-reasons-machine-learning-model-wrong.html
Majumder, S. K., Ghosh, N., & Gupta, P. K. (2005). Relevance vector machine for optical diagnosis of cancer. Lasers in Surgery and Medicine, 36(4), 323–333. https://doi.org/10.1002/lsm.20160
Malhi, A., & Gao, R. X. (2004). PCA-based feature selection scheme for machine defect classification. IEEE Transactions on Instrumentation and Measurement, 53(6), 1517–1525. https://doi.org/10.1109/TIM.2004.834070
Mandel J, S. P. (2015). A Comparison of Six Methods for Missing Data Imputation. Journal of Biometrics & Biostatistics, 06(01), 1–6. https://doi.org/10.4172/2155-6180.1000224
Manikandan, J., & Venkataramani, B. (2009). Design of a modified one-against-all SVM classifier. Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics, October, 1869–1874. https://doi.org/10.1109/ICSMC.2009.5346200
Marçais, J., & de Dreuzy, J.-R. (2018). Prospective Interest of Deep Learning for Hydrological Inference. Groundwater, Wiley, 55(5), 688–692. https://hal-insu.archives-ouvertes.fr/insu-01574652
McCuen, R. H., Knight, Z., & Cutter, A. G. (2006). Evaluation of the Nash–Sutcliffe Efficiency Index. Journal of Hydrologic Engineering, 11(6), 597–602. https://doi.org/10.1061/(asce)1084-0699(2006)11:6(597)
Mechoso, C. R., & Arakawa, A. (2015). Numerical Models: General Circulation Models. In Encyclopedia of Atmospheric Sciences: Second Edition (Second Edi, Vol. 4). Elsevier. https://doi.org/10.1016/B978-0-12-382225-3.00157-2
Mehta, P., Bukov, M., Wang, C. H., Day, A. G. R., Richardson, C., Fisher, C. K., & Schwab, D. J. (2019). A high-bias, low-variance introduction to Machine Learning for physicists. Physics Reports, 810, 1–124. https://doi.org/10.1016/j.physrep.2019.03.001
Mekonnen, D. G., Moges, M. A., Mulat, A. G., & Shumitter, P. (2019). The impact of climate change on mean and extreme state of hydrological variables in Megech watershed, Upper Blue Nile Basin, Ethiopia. In Extreme Hydrology and Climate Variability: Monitoring, Modelling, Adaptation and Mitigation (Issue 2009). Elsevier Inc. https://doi.org/10.1016/B978-0-12-815998-9.00011-7
Meng, C., Zeleznik, O. A., Thallinger, G. G., Kuster, B., Gholami, A. M., & Culhane, A. C. (2016). Dimension reduction techniques for the integrative analysis of multi-omics data. Briefings in Bioinformatics, 17(4), 628–641. https://doi.org/10.1093/bib/bbv108
Methaprayoon, K., Yingvivatanapong, C., Lee, W. J., & Liao, J. R. (2007). An integration of ANN wind power estimation into unit commitment considering the forecasting uncertainty. IEEE Transactions on Industry Applications, 43(6), 1441–1448. https://doi.org/10.1109/TIA.2007.908203
Minakshi Vohra, R. G. (2014). Missing Value Imputation in Multi Attribute Data Set. International Journal of Computer Science and Information Technologies, 5(4), 5315–5321.
Mishra, N., Soni, H. K., Sharma, S., & Upadhyay, A. K. (2018). Development and analysis of Artificial Neural Network models for rainfall prediction by using time-series data. International Journal of Intelligent Systems and Applications, 10(1), 16–23. https://doi.org/10.5815/ijisa.2018.01.03
Mishra, S., & Datta-Gupta, A. (2018). Data-Driven Modeling. Applied Statistical Modeling and Data Analytics, 195–224. https://doi.org/10.1016/b978-0-12-803279-4.00008-0
Moritz, S., Sardá, A., Bartz-Beielstein, T., Zaefferer, M., & Stork, J. (2015). Comparison of different Methods for Univariate Time Series Imputation in R. Preprint ArXiv:1510.03924, arXiv, 1–20. http://arxiv.org/abs/1510.03924
Moss, H. B., Leslie, D. S., & Rayson, P. (2018). Using J-K-fold cross validation to reduce variance when tuning NLP models. ArXiv.
Mosteller, F., & Tukey, J. W. (1968). Data analysis, including statistics. In Handbook of Social Psychology. Addison-Wesley. https://doi.org/10.1214/aos/1043351253
Mosteller, F., & Wallace, D. L. (1963). Inference in an Authorship Problem. Journal of the American Statistical Association, 58(302), 275–309. https://doi.org/10.1080/01621459.1963.10500849
Mubarak, S., Darwis, H., Umar, F., Ilmawan, L. B., Anraeni, S., & Mude, M. A. (2018). Feature Selection of Oral Cyst and Tumor Images Using Principal Component Analysis. Proceedings - 2nd East Indonesia Conference on Computer and Information Technology: Internet of Things for Industry, EIConCIT 2018, 322–325. https://doi.org/10.1109/EIConCIT.2018.8878641
Muhammad, I., & Yan, Z. (2015). Supervised Machine Learning Approaches: a Survey. ICTACT Journal on Soft Computing, 05(03), 946–952. https://doi.org/10.21917/ijsc.2015.0133
Murti, D. M. P., Pujianto, U., Wibawa, A. P., & Akbar, M. I. (2019). K-Nearest Neighbor (K-NN) based Missing Data Imputation. Proceeding - 2019 5th International Conference on Science in Information Technology: Embracing Industry 4.0: Towards Innovation in Cyber Physical System, ICSITech 2019, 83–88. https://doi.org/10.1109/ICSITech46713.2019.8987530
Naik, P., Wedel, M., Bacon, L., Bodapati, A., Bradlow, E., Kamakura, W., Kreulen, J., Lenk, P., Madigan, D. M., & Montgomery, A. (2008). Challenges and opportunities in high-dimensional choice data analyses. Marketing Letters, 19(3–4), 201–213. https://doi.org/10.1007/s11002-008-9036-3
Nanda, M. A., Seminar, K. B., Nandika, D., & Maddu, A. (2018). A comparison study of kernel functions in the support vector machine and its application for termite detection. Information (Switzerland), 9(1). https://doi.org/10.3390/info9010005
Nash, J. E., & Sutcliffe, J. V. (1970). River Flow Forecasting through Conceptual Models Part 1- A discussion of principles. In Journal of Hydrology (Vol. 10, Issue 3). https://doi.org/10.1080/00750770109555783
Nasteski, V. (2017). An overview of the supervised machine learning methods. Horizons.B, 4(December 2017), 51–62. https://doi.org/10.20544/horizons.b.04.1.17.p05
Nasution, M. Z. F., Sitompul, O. S., & Ramli, M. (2018). PCA based feature reduction to improve the accuracy of decision tree c4.5 classification. Journal of Physics: Conference Series, 978(1). https://doi.org/10.1088/1742-6596/978/1/012058
Newman, M. E. J. (2005). Power laws, Pareto distributions and Zipf’s law. Contemporary Physics, 46(5), 323–351. https://doi.org/10.1080/00107510500052444
Ng, S. C. (2017). Principal component analysis to reduce dimension on digital image. Procedia Computer Science, 111(2015), 113–119. https://doi.org/10.1016/j.procs.2017.06.017
Nikolaev, N., & Tino, P. (2005). Sequential relevance vector machine learning from time series. Proceedings of the International Joint Conference on Neural Networks, 2, 1308–1313. https://doi.org/10.1109/IJCNN.2005.1556043
Nishijima, M., Nieuwenhoff, N., Pires, R., & Oliveira, P. R. (2019). Movie films consumption in Brazil: an analysis of support vector machine classification. AI and Society, 0123456789. https://doi.org/10.1007/s00146-019-00899-7
Noor, M., Tarmizi Ismail, S. S., Bin, F. A. B., Nashwan, M. S., Khan, N., Ahmed, K., Shiru, M. S., Muhammad, M. K. I. Bin, A.Salman, S., Momade, M. H., Iqbal, Z., Sa’Adi, Z., & Khan, S. U. (n.d.). Annual Rainfall Variations in Peninsular Malaysia under Climate Change Scenarios. 1(15), 298–317.
Nourani, V., Razzaghzadeh, Z., Baghanam, A. H., & Molajou, A. (2019). ANN-based statistical downscaling of climatic parameters using decision tree predictor screening method. Theoretical and Applied Climatology, 137(3–4), 1729–1746. https://doi.org/10.1007/s00704-018-2686-z
Yamini, O., & Ramakrishna, S. (2015). A Study on Advantages of Data Mining Classification Techniques. International Journal of Engineering Research And, V4(09), 969–972. https://doi.org/10.17577/ijertv4is090815
Okkan, U., & Inan, G. (2015). Bayesian Learning and Relevance Vector Machines Approach for Downscaling of Monthly Precipitation. Journal of Hydrologic Engineering, 20(4), 04014051. https://doi.org/10.1061/(asce)he.1943-5584.0001024
Okkan, U., Serbes, Z. A., & Samui, P. (2014). Relevance vector machines approach for long-term flow prediction. Neural Computing and Applications, 25(6), 1393–1405. https://doi.org/10.1007/s00521-014-1626-9
Othman, A. S., & Tukimat, N. N. A. (2018). Assessment of the Potential Occurrence of Dry Period in the Long Term for Pahang State, Malaysia. MATEC Web of Conferences, 150, 1–6. https://doi.org/10.1051/matecconf/201815003004
Pal, M. (2011). Kernel Methods in Remote Sensing: A review. Ish Journal of Hydraulic Engineering, 15(1), 194–215. http://arxiv.org/abs/1101.2987
Panigrahi, R., & Borah, S. (2019). Classification and Analysis of Facebook Metrics Dataset Using Supervised Classifiers. In Social Network Analytics. Elsevier Inc. https://doi.org/10.1016/b978-0-12-815458-8.00001-3
Pantanowitz, A., & Marwala, T. (2009). Missing data imputation through the use of the random forest algorithm. Advances in Intelligent and Soft Computing, 61 AISC, 53–62. https://doi.org/10.1007/978-3-642-03156-4_6
Parmar, A., Mistree, K., & Sompurna, M. (2017). Machine Learning Techniques for Rainfall Prediction: A Review. 3(6), 913–917.
Allison, P. D. (2001). Missing data. Quantitative Applications in the Social Sciences. SAGE Publications.
Pearson, K. (1901). LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science Series, 2(11), 559–572. https://doi.org/10.1080/14786440109462720
Pekalska, E. (2015). Pattern Recognition Tools. Pattern Recognition Tools 37Steps. http://37steps.com/4859/cross-validation/
Pepinsky, T. B. (2018). A Note on Listwise Deletion versus Multiple Imputation. Political Analysis, 26(4), 480–488. https://doi.org/10.1017/pan.2018.18
Peterson, C., & Rognvaldsson, T. (1991). An Introduction to Artificial Neural Networks. In Fundamental of Neural Network: Architecture Algorithm and Application (pp. 113–169). 1991 CERN School of Computing.
Pett, M., Lackey, N., & Sullivan, J. (2011). An Overview of Factor Analysis. Making Sense of Factor Analysis, 2–12. https://doi.org/10.4135/9781412984898.n1
Peugh, J. L., & Enders, C. K. (2004). Missing data in educational research: A review of reporting practices and suggestions for improvement. Review of Educational Research, 74(4), 525–556. https://doi.org/10.3102/00346543074004525
Pour, S. H., Shahid, S., & Chung, E. S. (2016). A Hybrid Model for Statistical Downscaling of Daily Rainfall. Procedia Engineering, 154, 1424–1430. https://doi.org/10.1016/j.proeng.2016.07.514
Pramoditha, R. (2021). 11 Dimensionality reduction techniques you should know in 2021. Medium. https://towardsdatascience.com/11-dimensionality-reduction-techniques-you-should-know-in-2021-dcb9500d388b
Punlumjeak, W., Arunrerk, J., & Rachburee, N. (2017). An analytics prediction model of monthly rainfall time series: Case of Thailand. Journal of Telecommunication, Electronic and Computer Engineering, 9(2–6), 53–57.
Qian, L., Liu, C., Yi, J., & Liu, S. (2020). Application of hybrid algorithm of bionic heuristic and machine learning in nonlinear sequence. Journal of Physics: Conference Series, 1682(1). https://doi.org/10.1088/1742-6596/1682/1/012009
Qiu, M., Song, Y., & Akagi, F. (2016). Application of artificial neural network for the prediction of stock market returns: The case of the Japanese stock market. Chaos, Solitons and Fractals, 85, 1–7. https://doi.org/10.1016/j.chaos.2016.01.004
Qiu, S., Gao, L., & Wang, J. (2014). Classification and regression of ELM, LVQ and SVM for E-nose data of strawberry juice. Journal of Food Engineering, 144, 77–85. https://doi.org/10.1016/j.jfoodeng.2014.07.015
Quiñonero-Candela, J., & Hansen, L. K. (2002). Time Series Prediction based on the Relevance Vector Machine with Adaptive Kernels. 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 985–988.
Raghavendra, S., & Deka, P. C. (2014). Support vector machine applications in the field of hydrology: A review. Applied Soft Computing Journal, 19, 372–386. https://doi.org/10.1016/j.asoc.2014.02.002
Tanty, R., & Desmukh, T. S. (2015). Application of Artificial Neural Network in Hydrology- A Review. International Journal of Engineering Research And, V4(06), 2–7. https://doi.org/10.17577/ijertv4is060247
Raman, H., & Sunilkumar, N. (1995). Multivariate modelling of water resources time series using artificial neural networks. Hydrological Sciences Journal, 40(2), 145–163. https://doi.org/10.1080/02626669509491401
Rau, P., Bourrel, L., Labat, D., Melo, P., Dewitte, B., Frappart, F., Lavado, W., & Felipe, O. (2017). Regionalization of rainfall over the Peruvian Pacific slope and coast. International Journal of Climatology, 37(1), 143–158. https://doi.org/10.1002/joc.4693
Rawal, S., Gupta, S. C., & Singh, S. (2017). Predicting Missing Values in a Dataset: Challenges and Approaches. International Journal of Recent Research Aspects, 4(3), 34–38. https://www.ijrra.net/Vol4issue3/IJRRA-04-03-07.pdf
Ray, S. (2015). 7 Regression Techniques you should know! Analytics Vidhya. https://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/
Ray, S. (2017). Understanding Support Vector Machine(SVM) algorithm from examples (along with code). Analytics Vidhya. https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
Ries, A., Campbell, A., Strategic, A., Centre, M., & Zeldin, T. (1997). The 80/20 principle: The secret of achieving more with less. In Long Range Planning (Vol. 30, Issue 6). https://doi.org/10.1016/s0024-6301(97)80978-8
Ritter, A., & Muñoz-Carpena, R. (2013). Performance evaluation of hydrological models: Statistical significance for reducing subjectivity in goodness-of-fit assessments. Journal of Hydrology, 480, 33–45. https://doi.org/10.1016/j.jhydrol.2012.12.004
Rodríguez, R., Pastorini, M., Etcheverry, L., Chreties, C., Fossati, M., Castro, A., & Gorgoglione, A. (2021). Water-Quality Data Imputation with a High Percentage of Missing Values: A Machine Learning Approach. Sustainability, 13, 6318.
Rohani, A., Taki, M., & Abdollahpour, M. (2018). A novel soft computing model (Gaussian process regression with K-fold cross validation) for daily and monthly solar radiation forecasting (Part: I). Renewable Energy, 115, 411–422. https://doi.org/10.1016/j.renene.2017.08.061
Rosebrock, A. (2019). Why is my validation loss lower than my training loss? PyImageSearch. https://www.pyimagesearch.com/2019/10/14/why-is-my-validation-loss-lower-than-my-training-loss/
Roudier, P. (2017). Just enough machine learning to be dangerous. Creative Commons Attribution 4.0. http://pierreroudier.github.io/teaching/20171014-DSM-Masterclass-Hamilton/machine-learning-theory.html#for_more_information
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3), 581–592. https://doi.org/10.1186/1471-2105-12-432
Rubin, D. B., & Wiley, A. J. (2014). Statistical Analysis with Missing Data. NY: John Wiley & Sons.
Ruiming, F. (2019). Wavelet based relevance vector machine model for monthly runoff prediction. Water Quality Research Journal of Canada, 54(2), 134–141. https://doi.org/10.2166/wcc.2018.196
Rushton, A., Croucher, P., & Baker, P. (2014). Handbook of logistics and distribution management. Kogan Page Limited.
Sachindra, D. A., Ahmed, K., Rashid, M. M., Shahid, S., & Perera, B. J. C. (2018). Statistical downscaling of precipitation using machine learning techniques. Atmospheric Research, 212, 240–258. https://doi.org/10.1016/j.atmosres.2018.05.022
Sachindra, D. A., Huang, F., Barton, A., & Perera, B. J. C. (2013). Least square support vector and multi-linear regression for statistically downscaling general circulation model outputs to catchment streamflows. International Journal of Climatology, 33(5), 1087–1106. https://doi.org/10.1002/joc.3493
Saiful Samsudin, M., Azid, A., Iskandar Khalit, S., Milleana Shaharudin, S., Lananan, F., & Juahir, H. (2018). Pollution Sources Identification of Water Quality Using Chemometrics: a Case Study in Klang River Basin, Malaysia. International Journal of Engineering & Technology, 7(4.43), 83–89. https://www.researchgate.net/publication/331701453
Saini, O. and P. S. S. (2018). A Review on Dimension Reduction Techniques in Data Mining. Computer Engineering and Intelligent Systems, 9(1), 7–14.
Saitta, S. (2010). What is a good classification accuracy in data mining? Data Mining. http://www.dataminingblog.com/what-is-a-good-classification-accuracy-in-data-mining/
Salvi, K., Kannan, S., & Ghosh, S. (2013). High-resolution multisite daily rainfall projections in India with statistical downscaling for climate change impacts assessment. Journal of Geophysical Research: Atmospheres, 118(9), 3557–3578. https://doi.org/10.1002/jgrd.50280
Samsudin, M. S., Khalit, S. I., Azid, A., Juahir, H., Mohd Saudi, A. S., Sharip, Z., & Zaudi, M. A. (2017). Control limit detection for source apportionment in Perlis River Basin, Malaysia. Malaysian Journal of Fundamental and Applied Sciences, 13(3). https://doi.org/10.11113/mjfas.v13n3.687
Samui, P. (2012). Application of Relevance Vector Machine for Prediction of Ultimate Capacity of Driven Piles in Cohesionless Soils. Geotechnical and Geological Engineering, 30(5), 1261–1270. https://doi.org/10.1007/s10706-012-9539-9
Samui, P., & Dixon, B. (2012). Application of support vector machine and relevance vector machine to determine evaporative losses in reservoirs. Hydrological Processes, 26(9), 1361–1369. https://doi.org/10.1002/hyp.8278
Samui, P., Mandla, V. R., Krishna, A., & Teja, T. (2011). Prediction of Rainfall Using Support Vector Machine and Relevance Vector Machine. Earth Science India, 4(Iv), 188–200.
Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art. Psychological Methods, 7(2), 147–177. https://doi.org/10.1037/1082-989X.7.2.147
Schmidt, I., & Mosima, B. (2014). To Impute or Not Impute: That Is the Question? In G. J. Mellenbergh & H. J. Adèr (Eds.), Advising on research methods: Selected topics 2013. Johannes van Kessel Publishing. http://www.paultwin.com/wp-content/uploads/Lodder_1140873_Paper_Imputation.pdf
Schölkopf, B., Smola, A., & Müller, K. R. (1997). Kernel principal component analysis. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1327, 583–588.
Schoof, J. T. (2013). Statistical downscaling in climatology. Geography Compass, 7(4), 249–265. https://doi.org/10.1111/gec3.12036
Shaharudin, S. M., Ahmad, N., Zainuddin, N. H., & Mohamed, N. S. (2018). Identification of rainfall patterns on hydrological simulation using robust principal component analysis. Indonesian Journal of Electrical Engineering and Computer Science, 11(3), 1162–1167. https://doi.org/10.11591/ijeecs.v11.i3.pp1162-1167
Shaharudin, S. M., Andayani, S., Kismiantini, Binatari, N., Kurniawan, A., Basri, M. A. A., & Zainuddin, N. H. (2020). Imputation methods for addressing missing data of monthly rainfall in Yogyakarta, Indonesia. International Journal of Advanced Trends in Computer Science and Engineering, 9(1.4 Special Issue), 646–651. https://doi.org/10.30534/ijatcse/2020/9091.42020
Shamseldin, A. Y. (1997). Application of a neural network technique to rainfall-runoff modelling. Journal of Hydrology, 199(3–4), 272–294. https://doi.org/10.1016/S0022-1694(96)03330-6
Sherer, T., & JiayueHu. (2018). Training and Test Data sets.
Shetty, B. (2020). An In-Depth Guide to Supervised Machine Learning Classification. Built In. https://builtin.com/data-science/supervised-machine-learning-classification
Shi, C. R., & Adnan, R. (2014). Modified cross-validation as a method for estimating parameter. AIP Conference Proceedings, 1635(2014), 724–731. https://doi.org/10.1063/1.4903662
Shinozaki, T., & Ostendorf, M. (2008). Cross-validation and aggregated EM training for robust parameter estimation. Computer Speech and Language, 22(2), 185–195. https://doi.org/10.1016/j.csl.2007.07.005
Singh, K. P., Malik, A., & Sinha, S. (2005). Water quality assessment and apportionment of pollution sources of Gomti river (India) using multivariate statistical techniques - A case study. Analytica Chimica Acta, 538(1–2), 355–374. https://doi.org/10.1016/j.aca.2005.02.006
Smid, M., & Costa, A. C. (2018). Climate projections and downscaling techniques: a discussion for impact studies in urban systems. International Journal of Urban Sciences, 22(3), 277–307. https://doi.org/10.1080/12265934.2017.1409132
Soley-bori, M. (2013). Dealing with missing data: Key assumptions and methods for applied analysis. PM931 Directed Study in Health Policy and Management, 4, 20.
Song, F., Guo, Z., & Mei, D. (2010). Feature selection using principal component analysis. Proceedings - 2010 International Conference on System Science, Engineering Design and Manufacturing Informatization, ICSEM 2010, 1, 27–30. https://doi.org/10.1109/ICSEM.2010.14
Stahl, J. (2019). Overfitting in Machine Learning: What it is and How to prevent. Elite Data Science. https://elitedatascience.com/overfitting-in-machine-learning
Stekhoven, D. J., & Bühlmann, P. (2012). Missforest-Non-parametric missing value imputation for mixed-type data. Bioinformatics, 28(1), 112–118. https://doi.org/10.1093/bioinformatics/btr597
Stephen, O. (2012). Hybrid GA-SVM for Efficient Feature Selection in E-mail Classification. 3(3), 17–29.
Stone, M. (1974). Cross-Validatory Choice and Assessment of Statistical Predictions. Journal of the Royal Statistical Society. Series B (Methodological), 36(2), 111–147.
Stone, M. (1977). An Asymptotic Equivalence of Choice of Model by Cross-validation and Akaike's Criterion. Journal of the Royal Statistical Society. Series B (Methodological), 39(1), 44–47.
Su, Y., Huang, Y., & Kuo, C. C. J. (2018). Efficient Text Classification Using Tree-structured Multi-linear Principal Component Analysis. Proceedings - International Conference on Pattern Recognition, 2018-August, 585–590. https://doi.org/10.1109/ICPR.2018.8545832
Tahir, T., Hashim, A. M., & Yusof, K. W. (2018). Statistical downscaling of rainfall under transitional climate in Limbang River Basin by using SDSM. IOP Conference Series: Earth and Environmental Science, 140(1). https://doi.org/10.1088/1755-1315/140/1/012037
Tang, F., & Ishwaran, H. (2017). Random forest missing data algorithms. Statistical Analysis and Data Mining, 10(6), 363–377. https://doi.org/10.1002/sam.11348
Tang, J., Niu, X., Wang, S., Gao, H., Wang, X., & Wu, J. (2016). Statistical downscaling and dynamical downscaling of regional climate in China: Present climate evaluations and future climate projections. Journal of Geophysical Research: Atmospheres, 121, 2110–2129. https://doi.org/10.1038/175238c0
Tangri, N., Ansell, D., & Naimark, D. (2008). Predicting technique survival in peritoneal dialysis patients: Comparing artificial neural networks and logistic regression. Nephrology Dialysis Transplantation, 23(9), 2972–2981. https://doi.org/10.1093/ndt/gfn187
Tannenbaum, C. E. (2009). The Empirical Nature and Statistical Treatment of Missing data [University of Pennsylvania]. In ProQuest Dissertations Publishing. http://dx.doi.org/10.1016/j.jaci.2012.05.050
Tanwar, S., Ramani, T., & Tyagi, S. (2018). Dimensionality reduction using PCA and SVD in big data: A comparative case study. Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST, 220 LNICST, 116–125. https://doi.org/10.1007/978-3-319-73712-6_12
Tehrany, M. S., Pradhan, B., & Jebur, M. N. (2014). Flood susceptibility mapping using a novel ensemble weights-of-evidence and support vector machine models in GIS. Journal of Hydrology, 512, 332–343. https://doi.org/10.1016/j.jhydrol.2014.03.008
Tipping, M. E. (2000). The relevance vector machine. Advances in Neural Information Processing Systems, 653–658.
Tipping, M. E. (2001). Sparse Bayesian Learning and the Relevance Vector Machine. Journal of Machine Learning Research, 1(3), 211–244. https://doi.org/10.1162/15324430152748236
Tisseuil, C., Vrac, M., Lek, S., & Wade, A. J. (2010). Statistical downscaling of river flows. Journal of Hydrology, 385(1–4), 279–291. https://doi.org/10.1016/j.jhydrol.2010.02.030
Tokuç, A. A. (2021). Splitting a Dataset into Train and Test Sets. Baeldung. https://www.baeldung.com/cs/train-test-datasets-ratio
Tripathi, S., & Govindaraju, R. S. (2007). On selection of kernel parameters in relevance vector machines for hydrologic applications. Stochastic Environmental Research and Risk Assessment, 21(6), 747–764. https://doi.org/10.1007/s00477-006-0087-9
Tripathi, S., Srinivas, V. V., & Nanjundiah, R. S. (2006). Downscaling of precipitation for climate change scenarios: A support vector machine approach. Journal of Hydrology, 330(3–4), 621–640. https://doi.org/10.1016/j.jhydrol.2006.04.030
Trzaska, S., & Schnarr, E. (2014). A review of downscaling methods for climate change projections. United States Agency for International Development by Tetra Tech ARD, September, 1–42.
Tsakiri, K., Marsellos, A., & Kapetanakis, S. (2018). Artificial neural network and multiple linear regression for flood prediction in Mohawk River, New York. Water (Switzerland), 10(9). https://doi.org/10.3390/w10091158
Tutz, G., & Ramzan, S. (2015). Improved methods for the imputation of missing data by nearest neighbor methods. Computational Statistics and Data Analysis, 90, 84–99. https://doi.org/10.1016/j.csda.2015.04.009
Valentine, J. C., & McHugh, C. M. (2007). The Effects of Attrition on Baseline Comparability in Randomized Experiments in Education: A Meta-Analysis. Psychological Methods, 12(3), 268–282. https://doi.org/10.1037/1082-989X.12.3.268
Vallantin, L. (2018). Why you should not trust only in accuracy to measure machine learning performance. Medium. https://medium.com/@limavallantin/why-you-should-not-trust-only-in-accuracy-to-measure-machine-learning-performance-a72cf00b4516
van der Heijden, G. J. M. G., Donders, A. R. T., Stijnen, T., & Moons, K. G. M. (2006). Imputation of missing values is superior to complete case analysis and the missing-indicator method in multivariable diagnostic research: A clinical example. Journal of Clinical Epidemiology, 59(10), 1102–1109. https://doi.org/10.1016/j.jclinepi.2006.01.015
Van Heerden, C., Barnard, E., Davel, M., Van Der Walt, C., Van Dyk, E., Feld, M., & Müller, C. (2010). Combining regression and classification methods for improving automatic speaker age recognition. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 5174–5177. https://doi.org/10.1109/ICASSP.2010.5495006
Van Uytven, E., De Niel, J., & Willems, P. (2019). Uncovering the shortcomings of a weather typing based statistical downscaling method. Hydrology and Earth System Sciences Discussions, 1–35. https://doi.org/10.5194/hess-2019-40
Vandal, T., Kodra, E., & Ganguly, A. R. (2019). Intercomparison of machine learning methods for statistical downscaling: the case of daily and extreme precipitation. Theoretical and Applied Climatology, 137(1–2), 557–570. https://doi.org/10.1007/s00704-018-2613-3
Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. Springer Science. https://doi.org/10.1007/978-1-4757-2440-0
Visoni, P. (2015). Predictive Model Selection Criteria for Logistic Regression. Statistical Modelling, 8, 1000–1005. https://doi.org/10.1400/40307
Vrac, M., Stein, M., & Hayhoe, K. (2007). Statistical downscaling of precipitation through nonhomogeneous stochastic weather typing. Climate Research, 34(3), 169–184. https://doi.org/10.3354/cr00696
Vu, M. T., Aribarg, T., Supratid, S., Raghavan, S. V., & Liong, S. Y. (2016). Statistical downscaling rainfall using artificial neural network: significantly wetter Bangkok? Theoretical and Applied Climatology, 126(3–4), 453–467. https://doi.org/10.1007/s00704-015-1580-1
Wakefield, K. (2019). Predictive analytics and machine learning. SAS Analytics. https://www.sas.com/en_gb/insights/articles/analytics/a-guide-to-predictive-analytics-and-machine-learning.html
Waljee, A. K., Mukherjee, A., Singal, A. G., Zhang, Y., Warren, J., Balis, U., Marrero, J., Zhu, J., & Higgins, P. D. R. (2013). Comparison of imputation methods for missing laboratory data in medicine. BMJ Open, 3(8), 1–7. https://doi.org/10.1136/bmjopen-2013-002847
Wang, J. E., & Qiao, J. Z. (2014). Parameter selection of SVR based on improved k-fold cross validation. Applied Mechanics and Materials, 462–463, 182–186. https://doi.org/10.4028/www.scientific.net/AMM.462-463.182
Wang, Y., Xiao, Y., Lai, J., & Chen, Y. (2020). An adaptive k nearest neighbour method for imputation of missing traffic data based on two similarity metrics. Archives of Transport, 54(2), 59–73. https://doi.org/10.5604/01.3001.0014.2968
Wei, L., Yang, Y., Nishikawa, R. M., Wernick, M. N., & Edwards, A. (2005). Relevance vector machine for automatic detection of clustered microcalcifications. IEEE Transactions on Medical Imaging, 24(10), 1278–1285. https://doi.org/10.1109/TMI.2005.855435
Wen, Z., Li, B., Ramamohanarao, K., Chen, J., Chen, Y., & Zhang, R. (2017). Improving efficiency of SVM k-fold cross-validation by alpha seeding. 31st AAAI Conference on Artificial Intelligence, AAAI 2017, i, 2768–2774.
Wilby, R. L., & Wigley, T. M. L. (1997). Downscaling general circulation model output: a review of methods and limitations. Progress in Physical Geography, 21(4), 530–548.
Wilby, R. L., Charles, S. P., Zorita, E., Timbal, B., Whetton, P., & Mearns, L. O. (2004). Guidelines for Use of Climate Scenarios Developed from Statistical Downscaling Methods. Analysis, 27(August), 1–27. https://doi.org/citeulike-article-id:8861447
Wiskott, L. (2016). Lecture notes on Principal Component Analysis. https://doi.org/http://orcid.org/0000-0001-6237-740X
Wiskott, L., & Escalante-B., A. N. (2013). How to Solve Classification and Regression Problems on High-Dimensional Data with a Supervised Extension of Slow Feature Analysis. Journal of Machine Learning Research, 14, 3683–3719. http://cogprints.org/8966/
Wong, T. T. (2015). Performance evaluation of classification algorithms by k-fold and leave-one-out cross validation. Pattern Recognition, 48(9), 2839–2846. https://doi.org/10.1016/j.patcog.2015.03.009
Wu, Y., & Liu, Y. (2007). Robust truncated hinge loss support vector machines. Journal of the American Statistical Association, 102(479), 974–983. https://doi.org/10.1198/016214507000000617
Xia, Y. (2020). Correlation and association analyses in microbiome study integrating multiomics in health and disease. In Progress in Molecular Biology and Translational Science (1st ed., Vol. 171). Elsevier Inc. https://doi.org/10.1016/bs.pmbts.2020.04.003
Xu, R., Chen, N., Chen, Y., & Chen, Z. (2020). Downscaling and Projection of Multi-CMIP5 Precipitation Using Machine Learning Methods in the Upper Han River Basin. Advances in Meteorology, 2020. https://doi.org/10.1155/2020/8680436
Yadav, S., & Shukla, S. (2016). Analysis of k-Fold Cross-Validation over Hold-Out Validation on Colossal Datasets for Quality Classification. Proceedings - 6th International Advanced Computing Conference, IACC 2016, Cv, 78–83. https://doi.org/10.1109/IACC.2016.25
Ye, Y., Xiong, Y., Zhou, Q., Wu, J., Li, X., & Xiao, X. (2020). Comparison of Machine Learning Methods and Conventional Logistic Regressions for Predicting Gestational Diabetes Using Routine Clinical Data: A Retrospective Cohort Study. Journal of Diabetes Research, 2020. https://www.hindawi.com/journals/jdr/2020/4168340/
Zainudin, S., Jasim, D. S., & Bakar, A. A. (2016). Comparative analysis of data mining techniques for Malaysian rainfall prediction. International Journal on Advanced Science, Engineering and Information Technology, 6(6), 1148–1153. https://doi.org/10.18517/ijaseit.6.6.1487
Zhang, D., Tan, M. L., Dawood, S. R. S., Samat, N., Chang, C. K., Roy, R., Tew, Y. L., & Mahamud, M. A. (2020). Comparison of NCEP-CFSR and CMADS for hydrological modelling using SWAT in the Muda River Basin, Malaysia. Water (Switzerland), 12(11). https://doi.org/10.3390/w12113288
Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural networks in bankruptcy prediction: general framework and cross-validation analysis. European Journal of Operational Research, 116(1), 16–32. https://doi.org/10.1016/S0377-2217(98)00051-4
Zhang, Yongli. (2012). Support vector machine classification algorithm and its application. Communications in Computer and Information Science, 308 CCIS(PART 2), 179–186. https://doi.org/10.1007/978-3-642-34041-3_27
Zhang, Yudong, & Wu, L. (2012). Classification of fruits using computer vision and a multiclass support vector machine. Sensors (Switzerland), 12(9), 12489–12505. https://doi.org/10.3390/s120912489
Zhao, Y., & Miner, S. D. (2014). Data Mining Applications with R: ProQuest Tech Books. http://proquest.safaribooksonline.com.proxy1.library.mcgill.ca/book/programming/r/9780124115118
Zorita, E., & von Storch, H. (1997). A survey of statistical downscaling techniques. GKSS Report, 20. https://www.osti.gov/etdeweb/servlets/purl/595191
Zoro, R. (2012). How to explain poor classification performance of recall when using SVM. Cross Validated. https://stats.stackexchange.com/q/22208
This material may be protected under Copyright Act which governs the making of photocopies or reproductions of copyrighted materials. You may use the digitized material for private study, scholarship, or research.