UPSI Digital Repository (UDRep)

Type :thesis
Subject :QA Mathematics
Main Author :Nurul Ainina Filza Sulaiman
Title :Statistical downscaling of projecting rainfall amount based on SVC-RVM model
Place of Production :Tanjong Malim
Publisher :Fakulti Sains dan Matematik
Year of Publication :2022
Corporate Name :Universiti Pendidikan Sultan Idris

Abstract :
This study evaluates and compares a proposed statistical downscaling model for the states of Kelantan and Terengganu. It also identifies the most accurate imputation method for handling missing atmospheric data and, by reducing the dimensionality of the data, the important predictors for statistical downscaling. The data comprise atmospheric data (predictors) and daily rainfall data (predictand) from 1998 to 2007. Missing data were first handled with an imputation method. Principal Component Analysis (PCA) was then applied to address the high dimensionality of the data and to select predictors for a two-phase model. The two-phase machine learning approach was introduced as a precise statistical downscaling method for Kelantan and Terengganu. The first phase is a classification step that uses Support Vector Classification (SVC) to distinguish dry days from wet days. The second phase is a regression step that estimates the amount of rainfall based on the frequency of wet days, using Support Vector Regression (SVR), an Artificial Neural Network (ANN), and the Relevance Vector Machine (RVM). The proposed model was assessed with two performance measures, Root Mean Square Error (RMSE) and Nash-Sutcliffe Efficiency (NSE). Among the imputation methods, Random Forest (RF) had the lowest RMSE and the highest NSE. The PCA results indicate two selected principal components, with a cut-off eigenvalue of 1.6 and a cumulative 70.29% of the total variance. In conclusion, the comparison of results shows that the SVC-RVM hybrid reproduces the most reasonable daily rainfall projection and captures high rainfall extremes, making it a strong candidate for rainfall prediction research.
The implication of this study is that establishing the relationship between the predictand and the predictors with a hybrid model can improve prediction accuracy in climate change projections.
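
The two performance measures named in the abstract, RMSE and NSE, have standard textbook definitions. The following is a minimal sketch of those definitions in Python, not the thesis's own implementation, which is not shown in this record. The function names `rmse` and `nse` and the sample rainfall values are illustrative assumptions.

```python
import math


def rmse(observed, simulated):
    """Root Mean Square Error: 0 is a perfect fit; lower is better."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared_errors) / len(observed))


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit; 0 means the model
    is no better than predicting the mean of the observations."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / spread


# Hypothetical daily rainfall amounts in mm (illustrative, not thesis data).
observed = [0.0, 12.5, 3.0, 0.0, 40.0]
simulated = [0.0, 10.0, 4.0, 1.0, 35.0]

print(f"RMSE = {rmse(observed, simulated):.3f} mm")
print(f"NSE  = {nse(observed, simulated):.3f}")
```

Read together, a lower RMSE and an NSE closer to 1 indicate a better fit, which is the sense in which the abstract ranks Random Forest first among the imputation methods.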

References

Abbott, D. (1999). Combining models to improve classifier accuracy and robustness.

Proceedings of Second International Conference on …, January 1999, 1–7.

 

Abdel-Kader, H., Salam, M. A.-E., & ... (2021). Hybrid Machine Learning Model for

Rainfall Forecasting. Journal of Intelligent …, 1(1), 5–12.

https://doi.org/10.5281/zenodo.3376685

 

Acuña, E., & Rodriguez, C. (2004). The Treatment of Missing Values and its Effect

on Classifier Accuracy. Classification, Clustering, and Data Mining

Applications. https://doi.org/10.1007/978-3-642-17103-1_60

 

Advani, V. (2021). What is Machine Learning? How Machine Learning Works and

future of it? Great Learning. https://www.mygreatlearning.com/blog/what-ismachine-

learning/

 

Agrawal, A. (2019). Highlights the advantages and disadvantages of machine

learning. Cyber Infrastructure, CIS. https://www.cisin.com/coffeebreak/

Enterprise/highlights-the-advantages-and-disadvantages-of-machinelearning.

html

 

Ahmadkhani, S., & Adibi, P. (2016). Face recognition using supervised probabilistic

principal component analysis mixture model in dimensionality reduction without

loss framework. IET Computer Vision, 10(3), 193–201.

https://doi.org/10.1049/iet-cvi.2014.0434

 

Aksornsingchai, P., & Srinilta, C. (2011). Statistical downscaling for rainfall and

temperature prediction in Thailand. IMECS 2011 - International

MultiConference of Engineers and Computer Scientists 2011, 1(January 1948),

356–361.

 

Albon, C. (2017). SVC Parameters When Using RBF Kernel. GitHub.

https://chrisalbon.com/machine_learning/support_vector_machines/svc_paramet

ers_using_rbf_kernel/

 

Ali, A. H., & Abdullah, M. Z. (2020). An efficient model for data classification based

on SVM grid parameter optimization and PSO feature weight selection.

International Journal of Integrated Engineering, 12(1), 1–12.

https://doi.org/10.30880/ijie.2020.12.01.001

 

Aljuaid, T., & Sasi, S. (2017). Proper imputation techniques for missing values in data

sets. Proceedings of the 2016 International Conference on Data Science and

Engineering, ICDSE 2016. https://doi.org/10.1109/ICDSE.2016.7823957

 

Alsaber, A. R., Pan, J., & Al-Hurban, A. (2021). Handling complex missing data

using random forest approach for an air quality monitoring dataset: A case study

of kuwait environmental data (2012 to 2018). International Journal of

Environmental Research and Public Health, 18(3), 1–26.

https://doi.org/10.3390/ijerph18031333

 

Amirabadizadeh, M., Ghazali, A. H., Huang, Y. F., & Wayayok, A. (2016).

Downscaling daily precipitation and temperatures over the Langat River Basin in

Malaysia : A comparison of two statistical downscaling approaches.

International Journal of Water Resources and Environmental Engineering,

8(December), 120–136. https://doi.org/10.5897/IJWREE2016.0585

 

Anandhi, A., Srinivas, V. V., NAnjundiah, R. S., & Kumar, D. N. (2008).

Downscaling precipitation to river basin in India for IPCC SRES scenarions

using support vector machine. International Journal of Climatology, 28(March

2008), 401–420. https://doi.org/10.1002/joc

 

Andridge, R. R., & Little, R. J. A. (2010). A review of hot deck imputation for survey

non-response. International Statistical Review, 78(1), 40–64.

https://doi.org/10.1111/j.1751-5823.2010.00103.x

 

Angra, S., & Ahuja, S. (2017). Machine learning and its applications: A review.

Proceedings of the 2017 International Conference On Big Data Analytics and

Computational Intelligence, ICBDACI 2017, April 2020, 57–60.

https://doi.org/10.1109/ICBDACI.2017.8070809

 

Anguita, D., Ghelardoni, L., Ghio, A., Oneto, L., & Ridella, S. (2012). The ‘ K ’ in Kfold

Cross Validation. European Symposium on Artificial Neural Networks-

ESANN 2012 Proceedings, April.

 

Anguita, D., Ghio, A., Ridella, S., & Sterpi, D. (2009). K-Fold Cross Validation for

Error Rate Estimate in Support Vector Machines. Vessels Fuel Consumption

Forecast and Trim Optimisation: a Data Analytics Perspective View project KFold

Cross Validation for Error Rate Estimate in Support Vector Machines.

Proc. DMIN Int. Conf. Data Mining, January.

https://www.researchgate.net/publication/220704948

 

Anguita, D., Ridella, S., Rivieccio, F., & Zunino, R. (2003). Hyperparameter design

criteria for support vector classifiers. Neurocomputing, 55(1–2), 109–134.

https://doi.org/10.1016/S0925-2312(03)00430-2

 

Arifin, F., Robbani, H., Annisa, T., & Ma’Arof, N. N. M. I. (2019). Variations in the

Number of Layers and the Number of Neurons in Artificial Neural Networks:

Case Study of Pattern Recognition. Journal of Physics: Conference Series,

1413(1). https://doi.org/10.1088/1742-6596/1413/1/012016

 

ASCE. (2000). Artificial Neural Network in Hydrology I: Preliminary Concepts. In

Journal of Hydrologic Engineering (Vol. 5, Issue 2).

 

Assent, I. (2012). Clustering high dimensional data. Wiley Interdisciplinary Reviews:

Data Mining and Knowledge Discovery, 2(4), 340–350.

https://doi.org/10.1002/widm.1062

 

Ayesha, S., Hanif, M. K., & Talib, R. (2020). Overview and comparative study of

dimensionality reduction techniques for high dimensional data. Information

Fusion, 59(May 2019), 44–58. https://doi.org/10.1016/j.inffus.2020.01.005

 

Azid, A., Juahir, H., Toriman, M. E., Kamarudin, M. K. A., Saudi, A. S. M., Hasnam,

C. N. C., Aziz, N. A. A., Azaman, F., Latif, M. T., Zainuddin, S. F. M., Osman,

M. R., & Yamin, M. (2014). Prediction of the level of air pollution using

principal component analysis and artificial neural network techniques: A case

study in Malaysia. Water, Air, and Soil Pollution, 225(8).

https://doi.org/10.1007/s11270-014-2063-1

 

Baghanam, A. H., Eslahi, M., Sheikhbabaei, A., & Seifi, A. J. (2020). Assessing the

impact of climate change over the northwest of Iran: an overview of statistical

downscaling methods. Theoretical and Applied Climatology, 141(3–4), 1135–

1150. https://doi.org/10.1007/s00704-020-03271-8

 

Bahari, N. I. S., Ahmad, A., & Aboobaider, B. M. (2014). Application of support

vector machine for classification of multispectral data. IOP Conference Series:

Earth and Environmental Science, 20(1). https://doi.org/10.1088/1755-

1315/20/1/012038

 

Bala, R., & Kumar, D. (2017). Classification Using ANN: A Review. International

Journal of Computational Intelligence Research, 13(7), 1811–1820.

http://www.ripublication.com

 

Baraldi, A. N., & Enders, C. K. (2010). An introduction to modern missing data

analyses. Journal of School Psychology, 48(1), 5–37.

https://doi.org/10.1016/j.jsp.2009.10.001

 

Barnard, J., & Meng, X.-L. (1999). Applications of multiple imputation in medical

studies: from AIDS to NHANES. Statistical Methods in Medical Research, 8(1),

17–36. https://doi.org/10.1177/096228029900800103

 

Batista, G., & Monard, M. (2002). A Study of K -Nearest Neighbour as an Imputation

Method. Argentine Symposium on Artificial Intelligence, October.

 

Beaudoin, A., Bernier, P. Y., Guindon, L., Villemaire, P., Guo, X. J., Stinson, G.,

Bergeron, T., Magnussen, S., & Hall, R. J. (2014). Mapping attributes of

Canada’s forests at moderate resolution through kNN and MODIS imagery.

Canadian Journal of Forest Research, 44(5), 521–532.

https://doi.org/10.1139/cjfr-2013-0401

 

Bell, W., Brockwell, P. J., & Davis, R. A. (2009). Time Series: Theory and Methods.

In Journal of the American Statistical Association (Vol. 84, Issue 405).

https://doi.org/10.2307/2289896

 

Benestad, R., & Benestad, R. (2016). Downscaling Climate Information. In Oxford

Research Encyclopedia of Climate Science (Issue June).

https://doi.org/10.1093/acrefore/9780190228620.013.27

 

Bengio, Y., & Grandvalet, Y. (2004). No Unbiased Estimator of the Variance of KFold

Cross Validation. Journal of Machine Learning Research, 5, 1089–1105.

https://doi.org/10.1016/S0006-291X(03)00224-9

 

Bennett, D. A. (2001). How can I deal with missing data in my study? Australian and

New Zealand Journal of Public Health, 25(5), 464–469.

https://doi.org/10.1111/j.1467-842X.2001.tb00294.x

 

Beretta, L., & Santaniello, A. (2016). Nearest neighbor imputation algorithms: A

critical evaluation. BMC Medical Informatics and Decision Making, 16(74).

https://doi.org/10.1186/s12911-016-0318-z

 

Bergmeir, C., Hyndman, R. J., & Koo, B. (2018). A note on the validity of crossvalidation

for evaluating autoregressive time series prediction. Computational

Statistics and Data Analysis, 120, 70–83.

https://doi.org/10.1016/j.csda.2017.11.003

 

Berrar, D. (2018). Cross-validation. Encyclopedia of Bioinformatics and

Computational Biology: ABC of Bioinformatics, 1–3(April), 542–545.

https://doi.org/10.1016/B978-0-12-809633-8.20349-X

 

Berzofsky, M., Biemer, P., & Kalsbeek, W. (2008). A Brief History of Classification

Error Models. Proceeding of Joint Statistical Modeetings, 3667–3673.

 

Bethere, L., Sennikovs, J., & Bethers, U. (2017). Climate indices for the Baltic states

from principal component analysis. Earth System Dynamics, 8(4), 951–962.

https://doi.org/10.5194/esd-8-951-2017

 

Bhattacharya, A. (2014). Curse of Dimensionality. Fundamentals of Database

Indexing and Searching, 141–148. https://doi.org/10.1201/b17767-13

 

Bhattacharya, D., Nisha, M. G., & Pillai, G. N. (2015). Relevance vector-machinebased

solar cell model. International Journal of Sustainable Energy, 34(10),

685–692. https://doi.org/10.1080/14786451.2014.885030

 

Bhavsar, H., & Ganatra, A. (2012). A Comparative Study of Training Algorithms for

Supervised Machine Learning. International Journal of Soft Computing and

Engineering, 2(4), 74–81.

 

Bing, Q., Gong, B., Yang, Z., Shang, Q., & Zhou, X. (2015). Short-Term Traffic Flow

Local Prediction Based on Combined Kernel Function Relevance Vector

Machine Model. Mathematical Problems in Engineering, 2015.

https://doi.org/10.1155/2015/154703

 

Böhner, J., & Bechtel, B. (2017). GIS in Climatology and Meteorology. In

Comprehensive Geographic Information Systems (Vol. 3).

https://doi.org/10.1016/B978-0-12-409548-9.09633-0

 

Boisberranger, J. du, Bossche, J. Van den, & Estève, L. (2017). RBF SVM

parameters. Scikit-Learn Developers. https://scikitlearn.

org/stable/about.html#authors

 

Borra, S., & Di Ciaccio, A. (2010). Measuring the prediction error. A comparison of

cross-validation, bootstrap and covariance penalty methods. Computational

Statistics and Data Analysis, 54(12), 2976–2989.

https://doi.org/10.1016/j.csda.2010.03.004

 

Breiman, L. (2001). Random Forests. Machine Language, 45(1), 5–32.

https://doi.org/10.14569/ijacsa.2016.070603

 

Breiman, L., Cutler, A., Liaw, A., & Wiener, M. (2018). Package “randomForest.”

CRAN. https://doi.org/10.1023/A

 

Brence, J. R., & Brown, D. E. (2006). Improving the Robust Random Forest

Regression Algorithm. In Systems and Information Engineering Technical

Papers, Department of Systems and Information Engineering.

 

Brownlee, J. (2020). Train-Test Split for Evaluating Machine Learning Algorithms.

Python Machine Learning. https://machinelearningmastery.com/train-test-splitfor-

evaluating-machine-learning-algorithms/

 

Bunkley, Ni. (2008). Joseph Juran, 103, Pioneer in Quality Control, Dies. The New

York Times. https://www.nytimes.com/2008/03/03/business/03juran.html

 

Bürger, G. (1996). Expanded downscaling for generating local weather scenarios.

Climate Research, 7(2), 111–128. https://doi.org/10.3354/cr007111

 

Burman, P. (1989). A comparative study of ordinary cross-validation, v-fold crossvalidation

and the repeated learning-testing methods. Biometrika, 76(3), 503–

514. https://doi.org/10.1093/biomet/76.3.503

 

Campion, W. M., & Rubin, D. B. (1989). Multiple Imputation for Nonresponse in

Surveys. In Journal of Marketing Research (Vol. 26, Issue 4).

https://doi.org/10.2307/3172772

 

Carleo, G., Cirac, I., Cranmer, K., Daudet, L., Schuld, M., Tishby, N., Vogt-Maranto,

L., & Zdeborová, L. (2019). Machine learning and the physical sciences.

Reviews of Modern Physics, 91(4), 45002.

https://doi.org/10.1103/RevModPhys.91.045002

 

Castellano, C. M., & DeGaetano, A. T. (2017). Downscaling extreme precipitation

from CMIP5 simulations using historical analogs. Journal of Applied

Meteorology and Climatology, 56(9), 2421–2439.

https://doi.org/10.1175/JAMC-D-16-0250.1

 

Chai, T., & Draxler, R. R. (2014). Root mean square error (RMSE) or mean absolute

error (MAE)? -Arguments against avoiding RMSE in the literature. Geoscientific

Model Development, 7(3), 1247–1250. https://doi.org/10.5194/gmd-7-1247-2014

 

Chandrashekar, G., & Sahin, F. (2014). A survey on feature selection methods.

Computers and Electrical Engineering, 40(1), 16–28.

https://doi.org/10.1016/j.compeleceng.2013.11.024

 

Che Mat Nor, S. M., Shaharudin, S. M., Ismail, S., Zainuddin, N. H., & Tan, M. L.

(2020). A comparative study of different imputation methods for daily rainfall

data in east-coast Peninsular Malaysia. Bulletin of Electrical Engineering and

Informatics, 9(2), 635–643. https://doi.org/10.11591/eei.v9i2.2090

 

Cheema, J. R. (2014). A Review of Missing Data Handling Methods in Education

Research. Review of Educational Research, 84(4), 487–508.

https://doi.org/10.3102/0034654314532697

 

Chen, C., & Shyu, M. L. (2011). Clustering-based binary-class classification for

imbalanced data sets. Proceedings of the 2011 IEEE International Conference on

Information Reuse and Integration, IRI 2011, 384–389.

https://doi.org/10.1109/IRI.2011.6009578

 

Chen, S., Gu, C., Lin, C., Zhang, K., & Zhu, Y. (2020). Multi-kernel optimized

relevance vector machine for probabilistic prediction of concrete dam

displacement. Engineering with Computers, 0123456789.

https://doi.org/10.1007/s00366-019-00924-9

 

Chen, S. H., Jain, L., & Tai, C. C. (2005). Computational economics: A perspective

from computational intelligence. In Computational Intelligence and its

Applications Series (Issue May 2016). Idea Group Publishing.

https://doi.org/10.4018/978-1-59140-649-5

 

Chen, S. T., Yu, P. S., & Tang, Y. H. (2010). Statistical downscaling of daily

precipitation using support vector machines and multivariate analysis. Journal of

Hydrology, 385(1–4), 13–22. https://doi.org/10.1016/j.jhydrol.2010.01.021

 

Chen, Z. (2001). Data-Mining and Uncertain Reasoning: An Integrated Approach. In

Information Visualization. Wiley, New York.

https://doi.org/10.1057/palgrave.ivs.9500041

 

Cheng, C. H., & Yang, J. H. (2016). A novel rainfall forecast model based on the

integrated non-linear attribute selection method and support vector regression.

Journal of Intelligent and Fuzzy Systems, 31(2), 915–925.

https://doi.org/10.3233/JIFS-169021

 

Cheng, C. T., Niu, W. J., Feng, Z. K., Shen, J. J., & Chau, K. W. (2015). Daily

reservoir runoff forecasting method using artificial neural network based on

quantum-behaved particle swarm optimization. Water (Switzerland), 7(8), 4232–

4246. https://doi.org/10.3390/w7084232

 

Chhabra, G., Vashisht, V., & Ranjan, J. (2017). A Comparison of Multiple Imputation

Methods for Data with Missing Values. Indian Journal of Science and

Technology, 10(19), 1–7. https://doi.org/10.17485/ijst/2017/v10i19/110646

 

Chhabra, G., Vashisht, V., & Ranjan, J. (2019). A review on missing data value

estimation using imputation algorithm. Journal of Advanced Research in

Dynamical and Control Systems, 11(7 Special Issue), 312–318.

 

Cho, M. Y., & Hoang, T. T. (2017). Feature Selection and Parameters Optimization of

SVM Using Particle Swarm Optimization for Fault Classification in Power

Distribution Systems. Computational Intelligence and Neuroscience, 1–9.

https://doi.org/10.1155/2017/4135465

 

Coulibaly, P. (2004). Downscaling daily extreme temperatures with genetic

programming. Geophysical Research Letters, 31(16), 1–4.

https://doi.org/10.1029/2004GL020075

 

Crusoveanu, L. (2021). Epoch in Neural Networks. Baeldung.

https://www.baeldung.com/cs/epoch-neural-networks

 

Cummins, N., Sethu, V., Epps, J., & Krajewski, J. (2015). Relevance Vector Machine

for Depression Prediction Industrial Psychology , Rhenish University of Applied

Sciences Cologne , Germany. Interspeech 2015, 1(2), 110–114.

 

Daniel, F. (2020). What is Machine Learning? Emerj The Al Research and Advisory

Company. https://emerj.com/ai-glossary-terms/what-is-machine-learning/

 

Das, J., & Nanduri, U. V. (2018). Assessment and evaluation of potential climate

change impact on monsoon flows using machine learning technique over

Wainganga River basin, India. Hydrological Sciences Journal, 63(7), 1020–

1046. https://doi.org/10.1080/02626667.2018.1469757

 

Davey, A., & Savla, J. (2010). Statistical Power Analysis with Missing Data.

Routledge Taylor & Francis Group, LLC.

 

Dawson, C. W., Abrahart, R. J., Shamseldin, A. Y., & Wilby, R. L. (2006). Flood

estimation at ungauged sites using artificial neural networks. Journal of

Hydrology, 319(1–4), 391–409. https://doi.org/10.1016/j.jhydrol.2005.07.032

 

Deo, R. C., Samui, P., & Kim, D. (2016). Estimation of monthly evaporative loss

using relevance vector machine, extreme learning machine and multivariate

adaptive regression spline models. Stochastic Environmental Research and Risk

Assessment, 30(6), 1769–1784. https://doi.org/10.1007/s00477-015-1153-y

 

Department of Irrigation and Drainage. (2018). Hydrological Standard for Rainfall

Station Instrumentation.

 

Desai, K. M., Survase, S. A., Saudagar, P. S., Lele, S. S., & Singhal, R. S. (2008).

Comparison of artificial neural network (ANN) and response surface

methodology (RSM) in fermentation media optimization: Case study of

fermentative production of scleroglucan. Biochemical Engineering Journal,

41(3), 266–273. https://doi.org/10.1016/j.bej.2008.05.009

 

Devak, M., & Dhanya, C. T. (2014). Downscaling of Precipitation in Mahanadi Basin

, India. International Journal of Civil Engineering Research, 5(2), 111–120.

 

Dhiraj, K. (2019). Top 4 advantages and disadvantages of Support Vector Machine or

SVM. Medium. https://dhirajkumarblog.medium.com/top-4-advantages-anddisadvantages-

of-support-vector-machine-or-svm-a3c06a2b107

 

Dhurandhar, A., & Dobra, A. (2009). Evaluating Evaluation Measure. In Proceedings

of Evaluation Methods in Machine Learning Workshop in International

Conference on Machine Learning (ICML) 2009. https://doi.org/10.1002/pdh.264

 

Dominick, D., Juahir, H., Latif, M. T., Zain, S. M., & Aris, A. Z. (2012). Spatial

assessment of air quality patterns in Malaysia using multivariate analysis.

Atmospheric Environment, 60, 172–181.

https://doi.org/10.1016/j.atmosenv.2012.06.021

 

Dong, Y., Wang, J., Wang, C., & Guo, Z. (2017). Research & application of hybrid

forecasting model based on an optimal feature selection system-A case study on

electrical load forecasting. Energies, 10(4). https://doi.org/10.3390/en10040490

 

Dorado, J., RabuñAL, J. R., Pazos, A., Rivero, D., Santos, A., & Puertas, J. (2003).

Prediction and modeling of the rainfall-runoff transformation of a typical urban

basin using ann and gp. Applied Artificial Intelligence, 17(4), 329–343.

https://doi.org/10.1080/713827142

 

Drago, C., & Scepi, G. (2015). Time series clustering from high dimensional data. In

Lecture Notes in Computer Science (including subseries Lecture Notes in

Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7627, Issue

December 2014). https://doi.org/10.1007/978-3-662-48577-4_5

 

Duhan, D., & Pandey, A. (2015). Statistical downscaling of temperature using three

techniques in the Tons River basin in Central India. Theoretical and Applied

Climatology, 121(3–4), 605–622. https://doi.org/10.1007/s00704-014-1253-5

 

Efron, B., & Gong, G. (1985). A leisurely look at the Bootstrap, the Jackknife and

Cross-Validation. American Statistician, 37(1), 36–48.

 

El-Shafie, A., Mukhlisin, M., Najah, A. A., & Taha, M. R. (2011). Performance of

artificial neural network and regression techniques for rainfall-runoff prediction.

International Journal of Physical Sciences, 6(8), 1997–2003.

https://doi.org/10.5897/IJPS11.314

 

Enders, C. K. (2010). Applied missing data analysis. Guilford Press.

https://books.google.com/books?hl=en&lr=&id=MN8ruJd2tvgC&oi=fnd&pg=P

A1&dq=Enders,+2010&ots=dJnDs_Vls8&sig=gEP41sXuZcAE2DlqF1qEOo9A

H8Q

 

Engel, D., Hüttenberger, L., & Hamann, B. (2012). A survey of dimension reduction

methods for high-dimensional data analysis and visualization. OpenAccess Series

in Informatics, 27, 135–149. https://doi.org/10.4230/OASIcs.VLUDS.2011.135

 

Erdal, H. I., & Karakurt, O. (2013). Advancing monthly streamflow prediction

accuracy of CART models using ensemble learning paradigms. Journal of

Hydrology, 477, 119–128. https://doi.org/10.1016/j.jhydrol.2012.11.015

 

Erichson, N. B., Zheng, P., Manohar, K., Brunton, S. L., Kutz, J. N., & Aravkin, A.

Y. (2020). Sparse Principal Component Analysis via Variable Projection. SIAM

Journal on Applied Mathematics, 80(2), 977–1002.

 

Falkenberg Nielsen, O., & Johnsen, G. (2015). Normal aldring. Anatomi Og

Fysiologi, 1. Alsvåg H. Omsorg-med udgangspunkt i Kari Mart.

 

Fang, C., & Wang, C. (2020). Time Series Data Imputation: A Survey on Deep

Learning Approaches. http://arxiv.org/abs/2011.11347

 

Ferguson, K. (2018). Why It’s Important to Standardize Your Data. Human of Data

by Atlan. https://humansofdata.atlan.com/2018/12/datastandardization/#:~:

text=Standardized data is essential for,data to measure it

against.

 

Fogarty, D. J. (2006). Multiple imputation as a missing data approach to reject

inference on consumer credit scoring. Interstat, December 2000, 1–41.

http://interstat.statjournals.net/YEAR/2006/articles/0609001.pdf

 

Forghani, Y., Tabrizi, R. S., Yazdi, H. S., & Akbarzadeh-T, M. R. (2011). Fuzzy

support vector regression. 2011 1st International EConference on Computer and

Knowledge Engineering, ICCKE 2011, Vc, 28–33.

https://doi.org/10.1109/ICCKE.2011.6413319

 

Fushiki, T. (2011). Estimation of prediction error by using K-fold cross-validation.

Statistics and Computing, 21(2), 137–146. https://doi.org/10.1007/s11222-009-

9153-8

 

Gaag, M. van der, Hoffman, T., Remijsen, M., Hijman, R., de Haan, L., van Meijel,

B., van Harten, P. N., Valmaggia, L., de Hert, M., Cuijpers, A., & Wiersma, D.

(2006). The five-factor model of the Positive and Negative Syndrome Scale II: A

ten-fold cross-validation of a revised model. Schizophrenia Research, 85(1–3),

280–287. https://doi.org/10.1016/j.schres.2006.03.021

 

Gao, L., Song, J., Liu, X., Shao, J., Liu, J., & Shao, J. (2017). Learning in highdimensional

multimedia data: the state of the art. Multimedia Systems, 23(3),

303–313. https://doi.org/10.1007/s00530-015-0494-1

 

Gao, Y., Merz, C., Lischeid, G., & Schneider, M. (2018). A review on missing

hydrological data processing. Environmental Earth Sciences, 77(2), 47.

https://doi.org/10.1007/s12665-018-7228-6

 

Gaur, A., & Simonovic, S. P. (2018). Introduction to physical scaling: A model aimed

to bridge the gap between statistical and dynamic downscaling approaches. In

Trends and Changes in Hydroclimatic Variables: Links to Climate Variability

and Change. Elsevier Inc. https://doi.org/10.1016/B978-0-12-810985-4.00004-9

 

Geisser, S. (1975). The predictive sample reuse method with applications. Journal of

the American Statistical Association, 70(350), 320–328.

https://doi.org/10.1080/01621459.1975.10479865

 

Ghahramani, Z. (2004). Unsupervised Learning. Machine Learning, 72–112.

 

Ghasemi, F., Mehridehnavi, A., Pérez-Garrido, A., & Pérez-Sánchez, H. (2018).

Neural network and deep-learning algorithms used in QSAR studies: merits and

drawbacks. Drug Discovery Today, 23(10), 1784–1790.

https://doi.org/10.1016/j.drudis.2018.06.016

 

Ghosh, S., & Mujumdar, P. P. (2008). Statistical downscaling of GCM simulations to

streamflow using relevance vector machine. Advances in Water Resources,

31(1), 132–146. https://doi.org/10.1016/j.advwatres.2007.07.005

 

Ghritlahre, H. K., & Prasad, R. K. (2018). Application of ANN technique to predict

the performance of solar collector systems - A review. Renewable and

Sustainable Energy Reviews, 84(September 2017), 75–88.

https://doi.org/10.1016/j.rser.2018.01.001

 

Gill, M. K., Asefa, T., Kaheil, Y., & McKee, M. (2007). Effect of missing data on

performance of learning algorithms for hydrologic predictions: Implications to

an imputation technique. Water Resources Research, 43(7), 1–12.

https://doi.org/10.1029/2006WR005298

 

Golub, G. H., Heath, M., & Wahba, G. (1979). Generalized Cross-Validation as a

Method for Choosing a Good Ridge Parameter. Technometrics, 21(2), 215–223.

 

Goly, A., Teegavarapu, R. S. V., & Mondal, A. (2014). Development and evaluation

of statistical downscaling models for monthly precipitation. Earth Interactions,

18(18), 1–28. https://doi.org/10.1175/EI-D-14-0024.1

 

Gondara, L. (2016). Random forest with random projection to impute missing gene

expression data. Proceedings - 2015 IEEE 14th International Conference on

Machine Learning and Applications, ICMLA 2015, 1251–1256.

https://doi.org/10.1109/ICMLA.2015.29

 

Grace-Martin, K. (2013). Assessing the Fit of Regression Models. The Analysis

Factor. https://www.theanalysisfactor.com/assessing-the-fit-of-regressionmodels/

 

Graham, J. W. (2009). Missing data analysis: Making it work in the real world.

Annual Review of Psychology, 60, 549–576.

https://doi.org/10.1146/annurev.psych.58.110405.085530

 

Gupta, P. (2017). Cross-Validation in Machine Learning. Towars Data Science.

https://towardsdatascience.com/cross-validation-in-machine-learning-

72924a69872f

 

Hadipour, S., Harun, S., Arefnia, A., & Alamgir, M. (2016). Transfer function models

for statistical downscaling of monthly precipitation. Jurnal Teknologi, 78(9–4),

55–62. https://doi.org/10.11113/jt.v78.9695

 

Halik, G., Anwar, N., Santosa, B., & Edijatno. (2015). Reservoir inflow prediction

under GCM scenario downscaled by wavelet transform and support vector

machine hybrid models. Advances in Civil Engineering, 2015(July).

https://doi.org/10.1155/2015/515376

 

Hamidi, O., Poorolajal, J., Sadeghifar, M., Abbasi, H., Maryanaji, Z., Faridi, H. R., &

Tapak, L. (2015). A comparative study of support vector machines and artificial

neural networks for predicting precipitation in Iran. Theoretical and Applied

Climatology, 119(3–4), 723–731. https://doi.org/10.1007/s00704-014-1141-z

 

Han, M., & Zhao, Y. (2010). Robust relevance vector machine with noise variance

coefficient. Proceedings of the International Joint Conference on Neural

Networks. https://doi.org/10.1109/IJCNN.2010.5596989

 

Hannah, L. (2015). The Climate System and Climate Change. In Climate Change

Biology. https://doi.org/10.1016/b978-0-12-420218-4.00002-0

 

Hasan, N., Nath, N. C., & Rasel, R. I. (2016). A support vector regression model for

forecasting rainfall. 2nd International Conference on Electrical Information and

Communication Technologies, EICT 2015, Eict, 554–559.

https://doi.org/10.1109/EICT.2015.7392014

 

Hayati Rezvan, P., Lee, K. J., & Simpson, J. A. (2015). The rise of multiple

imputation: A review of the reporting and implementation of the method in

medical research Data collection, quality, and reporting. BMC Medical Research

Methodology, 15(1), 1–14. https://doi.org/10.1186/s12874-015-0022-1

 

Heitjan, D. F., Rubin, D. B., Heitjan, B. Y. D. F., & Rubin, D. B. (1991). Ignorability

and Coarse Data. The Annals of Statistics, 19(4), 2244–2253.

 

Henn, B., Raleigh, M. S., Fisher, A., & Lundquist, J. D. (2013). A comparison of

methods for filling gaps in hourly near-surface air temperature data. Journal of

Hydrometeorology, 14(3), 929–945. https://doi.org/10.1175/JHM-D-12-027.1

 

Hewitson, B. C., & Crane, R. G. (1996). Climate downscaling: Techniques and

application. Climate Research, 7(2), 85–95. https://doi.org/10.3354/cr007085

 

Hjelmfelt, A. T., & Wang, M. (1993). Predicting Runoff using Artificial Neural

Networks. Proceedings of the International Conference on Hydrology and Water

Resources, 16(December), 233–244. https://doi.org/10.1007/978-94-011-0389-

3_16

 

Hoi, S. C. H., Jin, R., Zhu, J., & Lyu, M. R. (2009). Semisupervised SVM batch mode

active learning with applications to image retrieval. ACM Transactions on

Information Systems, 27(3), 1–29. https://doi.org/10.1145/1508850.1508854

 

Hong, S., & Lynn, H. S. (2020). Accuracy of random-forest-based imputation of

missing data in the presence of non-normality, non-linearity, and interaction.

BMC Medical Research Methodology, 20(1), 1–12.

https://doi.org/10.1186/s12874-020-01080-1

 

Hotelling, H. (1933). Analysis of a complex of statistical variables into principal

component. Journal of Educational Psychology, 24(6), 417.

 

Hou, K., Shao, G., Wang, H., Zheng, L., Zhang, Q., Wu, S., & Hu, W. (2018).

Research on practical power system stability analysis algorithm based on

modified SVM. Protection and Control of Modern Power Systems, 3(1).

https://doi.org/10.1186/s41601-018-0086-0

 

Hsu, C.-W., Chang, C.-C., & Lin, C.-J. (2016). A Practical Guide to Support Vector

Classification. Department of Computer Science NAtional Taiwan University,

106. https://doi.org/10.1177/02632760022050997

 

Huang, S., Cai, N., Pacheco, P. P., Narandes, S., Wang, Y., & Xu, W. (2018).

Applications of support vector machine (SVM) learning in

cancer genomics. Cancer Genomics and Proteomics, 15(1), 41–51.

https://doi.org/10.21873/cgp.20063

 

Hunt, L. A. (2017). Missing data imputation and its effect on the accuracy of

classification. Studies in Classification, Data Analysis, and Knowledge

Organization, 195089, 3–14. https://doi.org/10.1007/978-3-319-55723-6_1

 

Hussain, M., Yusof, K. W., Mustafa, M. R., & Afshar, N. R. (2015). Application of

statistical downscaling model (SDSM) for long term prediction of rainfall in

Sarawak, Malaysia. Water Resources Management VIII, 1, 269–278.

https://doi.org/10.2495/wrm150231

 

I, W., & Rahman S, S. S. U. (2015). Treatment of Missing Values in Data Mining.

Journal of Computer Science & Systems Biology, 09(02), 51–53.

https://doi.org/10.4172/jcsb.1000221

 

Idri, A., Abnane, I., & Abran, A. (2015). Systematic mapping study of missing values

techniques in software engineering data. 2015 IEEE/ACIS 16th International

Conference on Software Engineering, Artificial Intelligence, Networking and

Parallel/Distributed Computing, SNPD 2015 - Proceedings.

https://doi.org/10.1109/SNPD.2015.7176280

 

Irawan, N. D., Wijono, W., & Setyawati, O. (2017). Perbaikan Missing value

Menggunakan Pendekatan Korelasi Pada Metode K-Nearest Neighbor. Jurnal

Infotel, 9(3). https://doi.org/10.20895/infotel.v9i3.286

 

Ishwaran, H., Kogalur, U. B., Blackstone, E. H., & Lauer, M. S. (2008). Random

survival forests. Annals of Applied Statistics, 2(3), 841–860.

https://doi.org/10.1214/08-AOAS169

 

Janecek, A., Gansterer, W. N. W., Demel, M., & Ecker, G. (2008). On the

Relationship Between Feature Selection and Classification Accuracy. Fsdm, 4,

90–105.

 

Jemain, A. A. (2015). Penyurihan Ikhtisas Data Hujan. Dewan Bahasa dan Pustaka.

 

Jiang, P., & Chen, J. (2016). Displacement prediction of landslide based on

generalized regression neural networks with K-fold cross-validation.

Neurocomputing, 198, 40–47. https://doi.org/10.1016/j.neucom.2015.08.118

 

Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: A review and recent

developments. Philosophical Transactions of the Royal Society A: Mathematical,

Physical and Engineering Sciences, 374(2065).

https://doi.org/10.1098/rsta.2015.0202

 

Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and

prospects. Science, 349(6245), 255–260. https://doi.org/10.1126/science.aaa8415

 

Joseph, V. R., & Vakayil, A. (2021). SPlit: An Optimal Method for Data Splitting.

Technometrics, 0(0), 1–11. https://doi.org/10.1080/00401706.2021.1921037

 

Journée, M., Nesterov, Y., Richtárik, P., & Sepulchre, R. (2010). Generalized power

method for sparse principal component analysis. Journal of Machine Learning

Research, 11, 517–553.

 

Juvonen, A., Sipola, T., & Hämäläinen, T. (2015). Online anomaly detection using

dimensionality reduction techniques for HTTP log analysis. Computer Networks,

91, 46–56. https://doi.org/10.1016/j.comnet.2015.07.019

 

Kääriäinen, M. (2006). Semi-supervised model selection based on cross-validation.

IEEE International Conference on Neural Networks - Conference Proceedings,

1894–1899. https://doi.org/10.1109/ijcnn.2006.246911

 

Kabanda, T., & Nenwiini, S. (2016). Impacts of climate variation on the length of the

rainfall season: an analysis of spatial patterns in North-East South Africa.

Theoretical and Applied Climatology, 125(1–2), 93–100.

https://doi.org/10.1007/s00704-015-1498-7

 

Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis.

Psychometrika, 23(3), 187–200. https://doi.org/10.1007/BF02289233

 

Kaiser, J. (2014). Dealing with Missing Values in Data. Journal of Systems

Integration, 42–51. https://doi.org/10.20470/jsi.v5i1.178

 

Kamaruzaman, I. F., Wan Zin, W. Z., & Mohd Ariff, N. (2017). A comparison of

method for treating missing daily rainfall data in Peninsular Malaysia. Malaysian

Journal of Fundamental and Applied Sciences, 13(4–1), 375–380.

https://doi.org/10.11113/mjfas.v13n4-1.781

 

Kamble, V. B., & Deshmukh, S. N. (2017). Comparision Between Accuracy and

MSE,RMSE by Using Proposed Method with Imputation Technique. Oriental

Journal of Computer Science and Technology, 10(04), 773–779.

https://doi.org/10.13005/ojcst/10.04.11

 

Kang, H. (2013). The prevention and handling of the missing data. Korean Journal of

Anesthesiology, 64(5), 402–406. https://doi.org/10.4097/kjae.2013.64.5.402

 

Karamizadeh, S., Abdullah, S. M., Halimi, M., Shayan, J., & Rajabi, M. J. (2014).

Advantage and drawback of support vector machine functionality. I4CT 2014 -

1st International Conference on Computer, Communications, and Control

Technology, Proceedings, I4ct, 63–65.

https://doi.org/10.1109/I4CT.2014.6914146

 

Karunanithi, N., Grenney, W. J., Whitley, D., & Bovee, K. (1995). Neural networks

for river flow prediction. Journal of Computing in Civil Engineering, 8(2), 201–

220. https://doi.org/10.1061/(ASCE)0887-3801(1995)9:4(293.x)

 

Kassambara, A. (2018). Evaluation of Classification Model Accuracy: Essentials.

Statistical Tools for High-Throughput Data Analysis (STHDA).

http://www.sthda.com/english/articles/36-classification-methods-essentials/143-evaluation-of-classification-model-accuracy-essentials/

 

Katal, A., Wazid, M., & Goundar, R. (2013). Big Data: Issues, Challenges, Tools and

Good Practices. 2013 Sixth International Conference on Contemporary

Computing (IC3), 404–409. https://doi.org/10.1109/IC3.2013.6612229

 

Kavitha, R., & Kannan, E. (2016). An efficient framework for heart disease

classification using feature extraction and feature selection technique in data

mining. 1st International Conference on Emerging Trends in Engineering,

Technology and Science, ICETETS 2016 - Proceedings.

https://doi.org/10.1109/ICETETS.2016.7603000

 

Khan, F. U. F., Khan, K. U. Z., & Singh, S. K. (2018). Is Group Means Imputation

Any Better Than Mean Imputation: A Study Using C5.0 Classifier. Journal of

Physics: Conference Series, 1060(1), 1–5. https://doi.org/10.1088/1742-6596/1060/1/012014

 

Kim, J., & Ryu, J. H. (2016). A heuristic gap filling method for daily precipitation

series. Water Resources Management, 30(7), 2275–2294.

https://doi.org/10.1007/s11269-016-1284-z

 

Knoben, W. J. M., Freer, J. E., & Woods, R. A. (2019). Technical note: Inherent

benchmark or not? Comparing Nash-Sutcliffe and Kling-Gupta efficiency scores.

Hydrology and Earth System Sciences, 23(10), 4323–4331.

https://doi.org/10.5194/hess-23-4323-2019

 

Koch, P., Konen, W., Flasch, O., & Bartz-Beielstein, T. (n.d.). Optimization of

Support Vector Regression Models for Stormwater Prediction. 146–160.

 

Kohavi, R. (1995). A Study of Cross-Validation and Bootstrap for Accuracy

Estimation and Model Selection. International Joint Conference of Artificial

Intelligence, March 2001.

 

Kolmogorov, A. N. (1957). On the representation of continuous functions of several

variables as superpositions of continuous functions of one variable and addition.

Doklady Akademii Nauk SSSR, 114(5), 953–956. https://doi.org/10.18411/lj-12-2018-148

 

Kong, D., Chen, Y., Li, N., Duan, C., Lu, L., & Chen, D. (2019). Relevance vector

machine for tool wear prediction. Mechanical Systems and Signal Processing,

127, 573–594. https://doi.org/10.1016/j.ymssp.2019.03.023

 

Kong, Q., Gong, H., Ding, X., & Hou, R. (2017). Classification Application Based on

Mutual Information and Random Forest Method for High Dimensional Data.

Proceedings - 9th International Conference on Intelligent Human-Machine

Systems and Cybernetics, IHMSC 2017, 1(Mi), 171–174.

https://doi.org/10.1109/IHMSC.2017.45

 

Kotu, V., & Deshpande, B. (2019). Model Evaluation. Data Science, 263–279.

https://doi.org/10.1016/b978-0-12-814761-0.00008-3

 

Kouhestani, S., Eslamian, S. S., Abedi-Koupai, J., & Besalatpour, A. A. (2016).

Projection of climate change impacts on precipitation using soft-computing

techniques: A case study in Zayandeh-rud Basin, Iran. Global and Planetary

Change, 144(July), 158–170. https://doi.org/10.1016/j.gloplacha.2016.07.013

 

Kumar, P. S., Praveen, T. V., & Prasad, M. A. (2016). Artificial Neural Network

Model for Rainfall-Runoff -A Case Study. International Journal of Hybrid

Information Technology, 9(3), 263–272.

https://doi.org/10.14257/ijhit.2016.9.3.24

 

Lang, K. M., & Little, T. D. (2018). Principled missing data treatments. Prevention

Science, 19(3), 284–294. https://doi.org/10.1007/s11121-016-0644-5

 

Larson, S. C. (1931). The shrinkage of the coefficient of multiple correlation. Journal

of Educational Psychology, 22(1), 45–55. https://doi.org/10.1037/h0072400

 

Lee, K. J., & Carlin, J. B. (2010). Multiple imputation for missing data: Fully

conditional specification versus multivariate normal imputation. American

Journal of Epidemiology, 171(5), 624–632. https://doi.org/10.1093/aje/kwp425

 

Lei, J. (2019). Cross-Validation With Confidence. Journal of the American Statistical

Association, 115(532), 1978–1997.

https://doi.org/10.1080/01621459.2019.1672556

 

Li, L., He, S., Zhang, J., & Ran, B. (2016). Short-term highway traffic flow prediction

based on a hybrid strategy considering temporal–spatial information. Journal of

Advanced Transportation, 50(8), 2029–2040. https://doi.org/10.1002/atr.1443

 

Liaw, A., & Wiener, M. (2002). Classification and Regression by randomForest. R

News, 2(3), 18–22.

 

Lin, S., Zhang, S., Qiao, J., Liu, H., & Yu, G. (2008). A parameter choosing method

of SVR for time series prediction. Proceedings of the 9th International

Conference for Young Computer Scientists, ICYCS 2008, 130–135.

https://doi.org/10.1109/ICYCS.2008.393

 

Lionello, P., Abrantes, F., Congedi, L., Dulac, F., Gacic, M., Gomis, D., Goodess, C.,

Hoff, H., Kutiel, H., Luterbacher, J., Planton, S., Reale, M., Schröder, K.,

Vittoria Struglia, M., Toreti, A., Tsimplis, M., Ulbrich, U., & Xoplaki, E. (2012).

Introduction: Mediterranean Climate-Background Information. In The Climate of

the Mediterranean Region. Elsevier. https://doi.org/10.1016/B978-0-12-416042-2.00012-4

 

Liu, C. W., Lin, K. H., & Kuo, Y. M. (2003). Application of factor analysis in the

assessment of groundwater quality in a blackfoot disease area in Taiwan. Science

of the Total Environment, 313(1–3), 77–89. https://doi.org/10.1016/S0048-9697(02)00683-6

 

Lo Presti, R., Barca, E., & Passarella, G. (2010). A methodology for treating missing

data applied to daily rainfall data in the Candelaro River Basin (Italy).

Environmental Monitoring and Assessment, 160(1–4), 1–22.

https://doi.org/10.1007/s10661-008-0653-3

 

Lopez, C., Tucker, S., Salameh, T., & Tucker, C. (2018). An unsupervised machine

learning method for discovering patient clusters based on genetic signatures.

Journal of Biomedical Informatics, 85(June), 30–39.

https://doi.org/10.1016/j.jbi.2018.07.004

 

Loyola R, D. G., Pedergnana, M., & Gimeno García, S. (2016). Smart sampling and

incremental function learning for very large high dimensional data. Neural

Networks, 78, 75–87. https://doi.org/10.1016/j.neunet.2015.09.001

 

Luo, J., & Sun, Y. (2020). Optimization of process parameters for the minimization of

surface residual stress in turning pure iron material using central composite

design. Measurement: Journal of the International Measurement Confederation,

163, 108001. https://doi.org/10.1016/j.measurement.2020.108001

 

MacKay, D. J. C. (1996). Bayesian Methods for Backpropagation Networks. Physics

of Neural Networks, 211–254. https://doi.org/10.1007/978-1-4612-0723-8_6

 

Mahmood, B. (2016). 4 Reasons Your Machine Learning Model is Wrong (and How

to Fix It). KDnuggets. https://www.kdnuggets.com/2016/12/4-reasons-machine-learning-model-wrong.html

 

Majumder, S. K., Ghosh, N., & Gupta, P. K. (2005). Relevance vector machine for

optical diagnosis of cancer. Lasers in Surgery and Medicine, 36(4), 323–333.

https://doi.org/10.1002/lsm.20160

 

Malhi, A., & Gao, R. X. (2004). PCA-based feature selection scheme for machine

defect classification. IEEE Transactions on Instrumentation and Measurement,

53(6), 1517–1525. https://doi.org/10.1109/TIM.2004.834070

 

Mandel J, S. P. (2015). A Comparison of Six Methods for Missing Data Imputation.

Journal of Biometrics & Biostatistics, 06(01), 1–6. https://doi.org/10.4172/2155-6180.1000224

 

Manikandan, J., & Venkataramani, B. (2009). Design of a modified one-against-all

SVM classifier. Conference Proceedings - IEEE International Conference on

Systems, Man and Cybernetics, October, 1869–1874.

https://doi.org/10.1109/ICSMC.2009.5346200

 

Marçais, J., & de Dreuzy, J.-R. (2018).

Prospective Interest of Deep Learning for Hydrological Inference. Groundwater,

Wiley, 55(5), 688–692. https://hal-insu.archives-ouvertes.fr/insu-01574652

 

McCuen, R. H., Knight, Z., & Cutter, A. G. (2006). Evaluation of the Nash–Sutcliffe

Efficiency Index. Journal of Hydrologic Engineering, 11(6), 597–602.

https://doi.org/10.1061/(asce)1084-0699(2006)11:6(597)

 

Mechoso, C. R., & Arakawa, A. (2015). Numerical Models: General Circulation

Models. In Encyclopedia of Atmospheric Sciences: Second Edition (2nd ed.,

Vol. 4). Elsevier. https://doi.org/10.1016/B978-0-12-382225-3.00157-2

 

Mehta, P., Bukov, M., Wang, C. H., Day, A. G. R., Richardson, C., Fisher, C. K., &

Schwab, D. J. (2019). A high-bias, low-variance introduction to Machine

Learning for physicists. Physics Reports, 810, 1–124.

https://doi.org/10.1016/j.physrep.2019.03.001

 

Mekonnen, D. G., Moges, M. A., Mulat, A. G., & Shumitter, P. (2019). The impact of

climate change on mean and extreme state of hydrological variables in Megech

watershed, Upper Blue Nile Basin, Ethiopia. In Extreme Hydrology and Climate

Variability: Monitoring, Modelling, Adaptation and Mitigation (Issue 2009).

Elsevier Inc. https://doi.org/10.1016/B978-0-12-815998-9.00011-7

 

Meng, C., Zeleznik, O. A., Thallinger, G. G., Kuster, B., Gholami, A. M., & Culhane,

A. C. (2016). Dimension reduction techniques for the integrative analysis of

multi-omics data. Briefings in Bioinformatics, 17(4), 628–641.

https://doi.org/10.1093/bib/bbv108

 

Methaprayoon, K., Yingvivatanapong, C., Lee, W. J., & Liao, J. R. (2007). An

integration of ANN wind power estimation into unit commitment considering the

forecasting uncertainty. IEEE Transactions on Industry Applications, 43(6),

1441–1448. https://doi.org/10.1109/TIA.2007.908203

 

Minakshi Vohra, R. G. (2014). Missing Value Imputation in Multi Attribute Data Set.

International Journal of Computer Science and Information Technologies, 5(4),

5315–5321.

 

Mishra, N., Soni, H. K., Sharma, S., & Upadhyay, A. K. (2018). Development and

analysis of Artificial Neural Network models for rainfall prediction by using

time-series data. International Journal of Intelligent Systems and Applications,

10(1), 16–23. https://doi.org/10.5815/ijisa.2018.01.03

 

Mishra, S., & Datta-Gupta, A. (2018). Data-Driven Modeling. Applied Statistical

Modeling and Data Analytics, 195–224. https://doi.org/10.1016/b978-0-12-803279-4.00008-0

 

Moritz, S., Sardá, A., Bartz-Beielstein, T., Zaefferer, M., & Stork, J. (2015).

Comparison of different Methods for Univariate Time Series Imputation in R.

Preprint ArXiv:1510.03924, arXiv, 1–20. http://arxiv.org/abs/1510.03924

 

Moss, H. B., Leslie, D. S., & Rayson, P. (2018). Using J-K-fold cross validation to

reduce variance when tuning NLP models. ArXiv.

 

Mosteller, F., & Tukey, J. W. (1968). Data analysis, including statistics. In Handbook

of Social Psychology. Addison-Wesley. https://doi.org/10.1214/aos/1043351253

 

Mosteller, F., & Wallace, D. L. (1963). Inference in an Authorship Problem.

Journal of the American Statistical Association, 58(302), 275–309.

https://doi.org/10.1080/01621459.1963.10500849

 

Mubarak, S., Darwis, H., Umar, F., Ilmawan, L. B., Anraeni, S., & Mude, M. A.

(2018). Feature Selection of Oral Cyst and Tumor Images Using Principal

Component Analysis. Proceedings - 2nd East Indonesia Conference on

Computer and Information Technology: Internet of Things for Industry,

EIConCIT 2018, 322–325. https://doi.org/10.1109/EIConCIT.2018.8878641

 

Muhammad, I., & Yan, Z. (2015). Supervised Machine Learning Approaches: a

Survey. ICTACT Journal on Soft Computing, 05(03), 946–952.

https://doi.org/10.21917/ijsc.2015.0133

 

Murti, D. M. P., Pujianto, U., Wibawa, A. P., & Akbar, M. I. (2019). K-Nearest

Neighbor (K-NN) based Missing Data Imputation. Proceeding - 2019 5th

International Conference on Science in Information Technology: Embracing

Industry 4.0: Towards Innovation in Cyber Physical System, ICSITech 2019, 83–

88. https://doi.org/10.1109/ICSITech46713.2019.8987530

 

Naik, P., Wedel, M., Bacon, L., Bodapati, A., Bradlow, E., Kamakura, W., Kreulen,

J., Lenk, P., Madigan, D. M., & Montgomery, A. (2008). Challenges and

opportunities in high-dimensional choice data analyses. Marketing Letters, 19(3–

4), 201–213. https://doi.org/10.1007/s11002-008-9036-3

 

Nanda, M. A., Seminar, K. B., Nandika, D., & Maddu, A. (2018). A comparison study

of kernel functions in the support vector machine and its application for termite

detection. Information (Switzerland), 9(1). https://doi.org/10.3390/info9010005

 

Nash, J. E., & Sutcliffe, J. V. (1970). River Flow Forecasting through Conceptual

Models Part 1- A discussion of principles. In Journal of Hydrology (Vol. 10,

Issue 3). https://doi.org/10.1080/00750770109555783

 

Nasteski, V. (2017). An overview of the supervised machine learning methods.

Horizons.B, 4(December 2017), 51–62.

https://doi.org/10.20544/horizons.b.04.1.17.p05

 

Nasution, M. Z. F., Sitompul, O. S., & Ramli, M. (2018). PCA based feature

reduction to improve the accuracy of decision tree c4.5 classification. Journal of

Physics: Conference Series, 978(1). https://doi.org/10.1088/1742-6596/978/1/012058

 

Newman, M. E. J. (2005). Power laws, Pareto distributions and Zipf’s law.

Contemporary Physics, 46(5), 323–351.

https://doi.org/10.1080/00107510500052444

 

Ng, S. C. (2017). Principal component analysis to reduce dimension on digital image.

Procedia Computer Science, 111(2015), 113–119.

https://doi.org/10.1016/j.procs.2017.06.017

 

Nikolaev, N., & Tino, P. (2005). Sequential relevance vector machine learning from

time series. Proceedings of the International Joint Conference on Neural

Networks, 2, 1308–1313. https://doi.org/10.1109/IJCNN.2005.1556043

 

Nishijima, M., Nieuwenhoff, N., Pires, R., & Oliveira, P. R. (2019). Movie films

consumption in Brazil: an analysis of support vector machine classification. AI

and Society, 0123456789. https://doi.org/10.1007/s00146-019-00899-7

 

Noor, M., Tarmizi Ismail, S. S., Bin, F. A. B., Nashwan, M. S., Khan, N., Ahmed, K.,

Shiru, M. S., Muhammad, M. K. I. Bin, A.Salman, S., Momade, M. H., Iqbal, Z.,

Sa’Adi, Z., & Khan, S. U. (n.d.). Annual Rainfall Variations in Peninsular

Malaysia under Climate Change Scenarios. 1(15), 298–317.

 

Nourani, V., Razzaghzadeh, Z., Baghanam, A. H., & Molajou, A. (2019). ANN-based

statistical downscaling of climatic parameters using decision tree predictor

screening method. Theoretical and Applied Climatology, 137(3–4), 1729–1746.

https://doi.org/10.1007/s00704-018-2686-z

 

Yamini, O., & Ramakrishna, S. (2015). A Study on Advantages of Data Mining

Classification Techniques. International Journal of Engineering Research And,

V4(09), 969–972. https://doi.org/10.17577/ijertv4is090815

 

Okkan, U., & Inan, G. (2015). Bayesian Learning and Relevance Vector Machines

Approach for Downscaling of Monthly Precipitation. Journal of Hydrologic

Engineering, 20(4), 04014051. https://doi.org/10.1061/(asce)he.1943-5584.0001024

 

Okkan, U., Serbes, Z. A., & Samui, P. (2014). Relevance vector machines approach

for long-term flow prediction. Neural Computing and Applications, 25(6), 1393–

1405. https://doi.org/10.1007/s00521-014-1626-9

 

Othman, A. S., & Tukimat, N. N. A. (2018). Assessment of the Potential Occurrence

of Dry Period in the Long Term for Pahang State, Malaysia. MATEC Web of

Conferences, 150, 1–6. https://doi.org/10.1051/matecconf/201815003004

 

Pal, M. (2011). Kernel Methods in Remote Sensing: A review. ISH Journal of

Hydraulic Engineering, 15(1), 194–215. http://arxiv.org/abs/1101.2987

 

Panigrahi, R., & Borah, S. (2019). Classification and Analysis of Facebook Metrics

Dataset Using Supervised Classifiers. In Social Network Analytics. Elsevier Inc.

https://doi.org/10.1016/b978-0-12-815458-8.00001-3

 

Pantanowitz, A., & Marwala, T. (2009). Missing data imputation through the use of

the random forest algorithm. Advances in Intelligent and Soft Computing, 61

AISC, 53–62. https://doi.org/10.1007/978-3-642-03156-4_6

 

Parmar, A., Mistree, K., & Sompurna, M. (2017). Machine Learning Techniques for

Rainfall Prediction: A Review. 3(6), 913–917.

 

Allison, P. D. (2001). Missing data: Quantitative applications in the social sciences.

SAGE Publication.

 

Pearson, K. (1901). LIII. On lines and planes of closest fit to systems of points

in space. The London, Edinburgh, and Dublin Philosophical Magazine and

Journal of Science Series, 2(11), 559–572.

https://doi.org/10.1080/14786440109462720

 

Pekalska, E. (2015). Pattern Recognition Tools. Pattern Recognition Tools 37Steps.

http://37steps.com/4859/cross-validation/

 

Pepinsky, T. B. (2018). A Note on Listwise Deletion versus Multiple Imputation.

Political Analysis, 26(4), 480–488. https://doi.org/10.1017/pan.2018.18

 

Peterson, C., & Rognvaldsson, T. (1991). An Introduction to Artificial Neuron

Network. In Fundamental of Neural Network: Architecture Algorithm and

Application (pp. 113–169). 1991 CERN School of Computing.

 

Pett, M., Lackey, N., & Sullivan, J. (2011). An Overview of Factor Analysis. Making

Sense of Factor Analysis, 2–12. https://doi.org/10.4135/9781412984898.n1

 

Peugh, J. L., & Enders, C. K. (2004). Missing data in educational research: A review

of reporting practices and suggestions for improvement. Review of Educational

Research, 74(4), 525–556. https://doi.org/10.3102/00346543074004525

 

Pour, S. H., Shahid, S., & Chung, E. S. (2016). A Hybrid Model for Statistical

Downscaling of Daily Rainfall. Procedia Engineering, 154, 1424–1430.

https://doi.org/10.1016/j.proeng.2016.07.514

 

Pramoditha, R. (2021). 11 Dimensionality reduction techniques you should know in

2021. Medium. https://towardsdatascience.com/11-dimensionality-reduction-techniques-you-should-know-in-2021-dcb9500d388b

 

Punlumjeak, W., Arunrerk, J., & Rachburee, N. (2017). An analytics prediction model

of monthly rainfall time series: Case of Thailand. Journal of Telecommunication,

Electronic and Computer Engineering, 9(2–6), 53–57.

 

Qian, L., Liu, C., Yi, J., & Liu, S. (2020). Application of hybrid algorithm of bionic

heuristic and machine learning in nonlinear sequence. Journal of Physics:

Conference Series, 1682(1). https://doi.org/10.1088/1742-6596/1682/1/012009

 

Qiu, M., Song, Y., & Akagi, F. (2016). Application of artificial neural network for the

prediction of stock market returns: The case of the Japanese stock market.

Chaos, Solitons and Fractals, 85, 1–7.

https://doi.org/10.1016/j.chaos.2016.01.004

 

Qiu, S., Gao, L., & Wang, J. (2014). Classification and regression of ELM, LVQ and

SVM for E-nose data of strawberry juice. Journal of Food Engineering, 144, 77–

85. https://doi.org/10.1016/j.jfoodeng.2014.07.015

 

Quiñonero-Candela, J., & Hansen, L. K. (2002). Time Series Prediction based on the

Relevance Vector Machine with Adaptive Kernels. 2002 IEEE International

Conference on Acoustics, Speech, and Signal Processing, 985–988.

 

Raghavendra, S., & Deka, P. C. (2014). Support vector machine applications in the

field of hydrology: A review. Applied Soft Computing Journal, 19, 372–386.

https://doi.org/10.1016/j.asoc.2014.02.002

 

Tanty, R., & Desmukh, T. S. (2015). Application of Artificial Neural

Network in Hydrology- A Review. International Journal of Engineering

Research And, V4(06), 2–7. https://doi.org/10.17577/ijertv4is060247

 

Raman, H., & Sunilkumar, N. (1995). Multivariate modelling of water resources time

series using artificial neural networks. Hydrological Sciences Journal, 40(2),

145–163. https://doi.org/10.1080/02626669509491401

 

Rau, P., Bourrel, L., Labat, D., Melo, P., Dewitte, B., Frappart, F., Lavado, W., &

Felipe, O. (2017). Regionalization of rainfall over the Peruvian Pacific slope and

coast. International Journal of Climatology, 37(1), 143–158.

https://doi.org/10.1002/joc.4693

 

Rawal, S., Gupta, S. C., & Singh, S. (2017). Predicting Missing Values in a Dataset:

Challenges and Approaches. International Journal of Recent Research Aspects,

4(3), 34–38. https://www.ijrra.net/Vol4issue3/IJRRA-04-03-07.pdf

 

Ray, S. (2015). 7 Regression Techniques you should know! Analytics Vidhya.

https://www.analyticsvidhya.com/blog/2015/08/comprehensive-guide-regression/

 

Ray, S. (2017). Understanding Support Vector Machine(SVM) algorithm from

examples (along with code). Analytics Vidhya.

https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/

 

Ries, A., Campbell, A., Strategic, A., Centre, M., & Zeldin, T. (1997). The 80/20

principle: The secret of achieving more with less. In Long Range Planning (Vol.

30, Issue 6). https://doi.org/10.1016/s0024-6301(97)80978-8

 

Ritter, A., & Muñoz-Carpena, R. (2013). Performance evaluation of hydrological

models: Statistical significance for reducing subjectivity in goodness-of-fit

assessments. Journal of Hydrology, 480, 33–45.

https://doi.org/10.1016/j.jhydrol.2012.12.004

 

Rodríguez, R., Pastorini, M., Etcheverry, L., Chreties, C., Fossati, M., Castro, A., &

Gorgoglione, A. (2021). Water-Quality Data Imputation with a High Percentage

of Missing Values: A Machine Learning Approach. Sustainability, 13, 6318.

 

Rohani, A., Taki, M., & Abdollahpour, M. (2018). A novel soft computing model

(Gaussian process regression with K-fold cross validation) for daily and monthly

solar radiation forecasting (Part: I). Renewable Energy, 115, 411–422.

https://doi.org/10.1016/j.renene.2017.08.061

 

Rosebrock, A. (2019). Why is my validation loss lower than my training loss?

PyImageSearch. https://www.pyimagesearch.com/2019/10/14/why-is-my-validation-loss-lower-than-my-training-loss/

 

Roudier, P. (2017). Just enough machine learning to be dangerous. Creative

Commons Attribution 4.0. http://pierreroudier.github.io/teaching/20171014-DSM-Masterclass-Hamilton/machine-learning-theory.html#for_more_information

 

Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3), 581–592.

https://doi.org/10.1093/biomet/63.3.581

 

Rubin, D. B., & Wiley, A. J. (2014). Statistical Analysis with Missing Data. NY: John

Wiley & Sons.

 

Ruiming, F. (2019). Wavelet based relevance vector machine model for monthly

runoff prediction. Water Quality Research Journal of Canada, 54(2), 134–141.

https://doi.org/10.2166/wcc.2018.196

 

Rushton, A., Croucher, P., & Baker, P. (2014). Handbook of logistics and

distribution management. Kogan Page Limited.

 

Sachindra, D. A., Ahmed, K., Rashid, M. M., Shahid, S., & Perera, B. J. C. (2018).

Statistical downscaling of precipitation using machine learning techniques.

Atmospheric Research, 212, 240–258.

https://doi.org/10.1016/j.atmosres.2018.05.022

 

Sachindra, D. A., Huang, F., Barton, A., & Perera, B. J. C. (2013). Least square

support vector and multi-linear regression for statistically downscaling general

circulation model outputs to catchment streamflows. International Journal of

Climatology, 33(5), 1087–1106. https://doi.org/10.1002/joc.3493

 

Samsudin, M. S., Azid, A., Khalit, S. I., Shaharudin, S. M., Lananan,

F., & Juahir, H. (2018). Pollution Sources Identification of Water Quality Using

Chemometrics: a Case Study in Klang River Basin, Malaysia. International

Journal of Engineering & Technology, 7(4.43), 83–89.

https://www.researchgate.net/publication/331701453

 

Saini, O. and P. S. S. (2018). A Review on Dimension Reduction Techniques in Data

Mining. Computer Engineering and Intelligent Systems, 9(1), 7–14.

 

Saitta, S. (2010). What is a good classification accuracy in data mining? Data

Mining. http://www.dataminingblog.com/what-is-a-good-classification-accuracy-in-data-mining/

 

Salvi, K., Kannan, S., & Ghosh, S. (2013). High-resolution multisite daily rainfall

projections in India with statistical downscaling for climate change impacts

assessment. Journal of Geophysical Research: Atmospheres, 118(9), 3557–3578.

https://doi.org/10.1002/jgrd.50280

 

Samsudin, M. S., Khalit, S. I., Azid, A., Juahir, H., Mohd Saudi, A. S., Sharip, Z., &

Zaudi, M. A. (2017). Control limit detection for source apportionment in Perlis

River Basin, Malaysia. Malaysian Journal of Fundamental and Applied

Sciences, 13(3). https://doi.org/10.11113/mjfas.v13n3.687

 

Samui, P. (2012). Application of Relevance Vector Machine for Prediction of

Ultimate Capacity of Driven Piles in Cohesionless Soils. Geotechnical and

Geological Engineering, 30(5), 1261–1270. https://doi.org/10.1007/s10706-012-9539-9

 

Samui, P., & Dixon, B. (2012). Application of support vector machine and relevance

vector machine to determine evaporative losses in reservoirs. Hydrological

Processes, 26(9), 1361–1369. https://doi.org/10.1002/hyp.8278

 

Samui, P., Mandla, V. R., Krishna, A., & Teja, T. (2011). Prediction of Rainfall Using

Support Vector Machine and Relevance Vector Machine. Earth Science India,

4(Iv), 188–200.

 

Schafer, J. L., & Graham, J. W. (2002). Missing data: Our view of the state of the art.

Psychological Methods, 7(2), 147–177. https://doi.org/10.1037/1082-989X.7.2.147

 

Schmidt, I., & Mosima, B. (2014). To Impute or Not Impute: That Is the Question?

In G. J. Mellenbergh & H. J. Adèr (Eds.), Advising on research methods:

Selected topics 2013. Johannes van Kessel Publishing.

http://www.paultwin.com/wp-content/uploads/Lodder_1140873_Paper_Imputation.pdf

 

Schölkopf, B., Smola, A., & Müller, K. R. (1997). Kernel principal component

analysis. Lecture Notes in Computer Science (Including Subseries Lecture Notes

in Artificial Intelligence and Lecture Notes in Bioinformatics), 1327, 583–588.

 

Schoof, J. T. (2013). Statistical downscaling in climatology. Geography Compass,

7(4), 249–265. https://doi.org/10.1111/gec3.12036

 

Shaharudin, S. M., Ahmad, N., Zainuddin, N. H., & Mohamed, N. S. (2018).

Identification of rainfall patterns on hydrological simulation using robust

principal component analysis. Indonesian Journal of Electrical Engineering and

Computer Science, 11(3), 1162–1167.

https://doi.org/10.11591/ijeecs.v11.i3.pp1162-1167

 

Shaharudin, S. M., Andayani, S., Kismiantini, Binatari, N., Kurniawan,

A., Basri, M. A. A., & Zainuddin, N. H. (2020). Imputation methods for

addressing missing data of monthly rainfall in Yogyakarta, Indonesia.

International Journal of Advanced Trends in Computer Science and

Engineering, 9(1.4 Special Issue), 646–651.

https://doi.org/10.30534/ijatcse/2020/9091.42020

 

Shamseldin, A. Y. (1997). Application of a neural network technique to rainfall-runoff

modelling. Journal of Hydrology, 199(3–4), 272–294.

https://doi.org/10.1016/S0022-1694(96)03330-6

 

Sherer, T., & JiayueHu. (2018). Training and Test Data sets.

 

Shetty, B. (2020). An In-Depth Guide to Supervised Machine Learning Classification.

Built In. https://builtin.com/data-science/supervised-machine-learning-classification

 

Shi, C. R., & Adnan, R. (2014). Modified cross-validation as a method for estimating

parameter. AIP Conference Proceedings, 1635(2014), 724–731.

https://doi.org/10.1063/1.4903662

 

Shinozaki, T., & Ostendorf, M. (2008). Cross-validation and aggregated EM training

for robust parameter estimation. Computer Speech and Language, 22(2), 185–195.

https://doi.org/10.1016/j.csl.2007.07.005

 

Singh, K. P., Malik, A., & Sinha, S. (2005). Water quality assessment and

apportionment of pollution sources of Gomti river (India) using multivariate

statistical techniques - A case study. Analytica Chimica Acta, 538(1–2), 355–374.

https://doi.org/10.1016/j.aca.2005.02.006

 

Smid, M., & Costa, A. C. (2018). Climate projections and downscaling techniques: a

discussion for impact studies in urban systems. International Journal of Urban

Sciences, 22(3), 277–307. https://doi.org/10.1080/12265934.2017.1409132

 

Soley-Bori, M. (2013). Dealing with missing data: Key assumptions and methods for

applied analysis. PM931 Directed Study in Health Policy and Management, 4,

20.

 

Song, F., Guo, Z., & Mei, D. (2010). Feature selection using principal component

analysis. Proceedings - 2010 International Conference on System Science,

Engineering Design and Manufacturing Informatization, ICSEM 2010, 1, 27–30.

https://doi.org/10.1109/ICSEM.2010.14

 

Stahl, J. (2019). Overfitting in Machine Learning: What it is and How to prevent.

Elite Data Science. https://elitedatascience.com/overfitting-in-machine-learning

 

Stekhoven, D. J., & Bühlmann, P. (2012). MissForest: Non-parametric missing value

imputation for mixed-type data. Bioinformatics, 28(1), 112–118.

https://doi.org/10.1093/bioinformatics/btr597

 

Stephen, O. (2012). Hybrid GA-SVM for Efficient Feature Selection in E-mail

Classification. 3(3), 17–29.

 

Stone, M. (1974). Cross-Validatory Choice and Assessment of Statistical Predictions.

Journal of the Royal Statistical Society. Series B (Methodological), 36(2), 111–147.

 

Stone, M. (1977). An Asymptotic Equivalence of Choice of Model by Cross-validation

and Akaike's Criterion. Journal of the Royal Statistical Society. Series

B (Methodological), 39(1), 44–47.

 

Su, Y., Huang, Y., & Kuo, C. C. J. (2018). Efficient Text Classification Using

Tree-structured Multi-linear Principal Component Analysis. Proceedings -

International Conference on Pattern Recognition, 2018-August, 585–590.

https://doi.org/10.1109/ICPR.2018.8545832

 

Tahir, T., Hashim, A. M., & Yusof, K. W. (2018). Statistical downscaling of rainfall

under transitional climate in Limbang River Basin by using SDSM. IOP

Conference Series: Earth and Environmental Science, 140(1).

https://doi.org/10.1088/1755-1315/140/1/012037

 

Tang, F., & Ishwaran, H. (2017). Random forest missing data algorithms. Statistical

Analysis and Data Mining, 10(6), 363–377. https://doi.org/10.1002/sam.11348

 

Tang, J., Niu, X., Wang, S., Gao, H., Wang, X., & Wu, J. (2016). Statistical

downscaling and dynamical downscaling of regional climate in China: Present

climate evaluations and future climate projections. Journal of Geophysical

Research: Atmospheres, 121, 2110–2129. https://doi.org/10.1038/175238c0

 

Tangri, N., Ansell, D., & Naimark, D. (2008). Predicting technique survival in

peritoneal dialysis patients: Comparing artificial neural networks and logistic

regression. Nephrology Dialysis Transplantation, 23(9), 2972–2981.

https://doi.org/10.1093/ndt/gfn187

 

Tannenbaum, C. E. (2009). The Empirical Nature and Statistical Treatment of

Missing data [University of Pennsylvania]. In ProQuest Dissertations

Publishing. http://dx.doi.org/10.1016/j.jaci.2012.05.050

 

Tanwar, S., Ramani, T., & Tyagi, S. (2018). Dimensionality reduction using PCA and

SVD in big data: A comparative case study. Lecture Notes of the Institute for

Computer Sciences, Social-Informatics and Telecommunications Engineering,

LNICST, 220 LNICST, 116–125. https://doi.org/10.1007/978-3-319-73712-6_12

 

Tehrany, M. S., Pradhan, B., & Jebur, M. N. (2014). Flood susceptibility mapping

using a novel ensemble weights-of-evidence and support vector machine models

in GIS. Journal of Hydrology, 512, 332–343.

https://doi.org/10.1016/j.jhydrol.2014.03.008

 

Tipping, M. E. (2000). The relevance vector machine. Advances in Neural

Information Processing Systems, 653–658.

 

Tipping, M. E. (2001). Sparse Bayesian Learning and the Relevance Vector Machine.

Journal of Machine Learning Research, 1(3), 211–244.

https://doi.org/10.1162/15324430152748236

 

Tisseuil, C., Vrac, M., Lek, S., & Wade, A. J. (2010). Statistical downscaling of river

flows. Journal of Hydrology, 385(1–4), 279–291.

https://doi.org/10.1016/j.jhydrol.2010.02.030

 

Tokuç, A. A. (2021). Splitting a Dataset into Train and Test Sets. Baeldung.

https://www.baeldung.com/cs/train-test-datasets-ratio

 

Tripathi, S., & Govindaraju, R. S. (2007). On selection of kernel parameters in

relevance vector machines for hydrologic applications. Stochastic Environmental

Research and Risk Assessment, 21(6), 747–764. https://doi.org/10.1007/s00477-006-0087-9

 

Tripathi, S., Srinivas, V. V., & Nanjundiah, R. S. (2006). Downscaling of

precipitation for climate change scenarios: A support vector machine approach.

Journal of Hydrology, 330(3–4), 621–640.

https://doi.org/10.1016/j.jhydrol.2006.04.030

 

Trzaska, S., & Schnarr, E. (2014). A review of downscaling methods for climate

change projections. United States Agency for International Development by

Tetra Tech ARD, September, 1–42.

 

Tsakiri, K., Marsellos, A., & Kapetanakis, S. (2018). Artificial neural network and

multiple linear regression for flood prediction in Mohawk River, New York.

Water (Switzerland), 10(9). https://doi.org/10.3390/w10091158

 

Tutz, G., & Ramzan, S. (2015). Improved methods for the imputation of missing data

by nearest neighbor methods. Computational Statistics and Data Analysis,

90, 84–99. https://doi.org/10.1016/j.csda.2015.04.009

 

Valentine, J. C., & McHugh, C. M. (2007). The Effects of Attrition on Baseline

Comparability in Randomized Experiments in Education: A Meta-Analysis.

Psychological Methods, 12(3), 268–282. https://doi.org/10.1037/1082-989X.12.3.268

 

Vallantin, L. (2018). Why you should not trust only in accuracy to measure machine

learning performance. Medium.

https://medium.com/@limavallantin/why-you-should-not-trust-only-in-accuracy-to-measure-machine-learning-performance-a72cf00b4516

 

van der Heijden, G. J. M. G., Donders, A. R. T., Stijnen, T., & Moons, K. G. M.

(2006). Imputation of missing values is superior to complete case analysis and

the missing-indicator method in multivariable diagnostic research: A clinical

example. Journal of Clinical Epidemiology, 59(10), 1102–1109.

https://doi.org/10.1016/j.jclinepi.2006.01.015

 

Van Heerden, C., Barnard, E., Davel, M., Van Der Walt, C., Van Dyk, E., Feld, M., &

Müller, C. (2010). Combining regression and classification methods for

improving automatic speaker age recognition. ICASSP, IEEE International

Conference on Acoustics, Speech and Signal Processing - Proceedings, 5174–5177.

https://doi.org/10.1109/ICASSP.2010.5495006

 

Van Uytven, E., De Niel, J., & Willems, P. (2019). Uncovering the shortcomings of a

weather typing based statistical downscaling method. Hydrology and Earth

System Sciences Discussions, 1–35. https://doi.org/10.5194/hess-2019-40

 

Vandal, T., Kodra, E., & Ganguly, A. R. (2019). Intercomparison of machine learning

methods for statistical downscaling: the case of daily and extreme precipitation.

Theoretical and Applied Climatology, 137(1–2), 557–570.

https://doi.org/10.1007/s00704-018-2613-3

 

Vapnik, V. N. (1995). The Nature of Statistical Learning Theory. Springer Science.

https://doi.org/10.1007/978-1-4757-2440-0

 

Visoni, P. (2015). Predictive Model Selection Criteria for Logistic Regression.

Statistical Modelling, 8, 1000–1005. https://doi.org/10.1400/40307

 

Vrac, M., Stein, M., & Hayhoe, K. (2007). Statistical downscaling of precipitation

through nonhomogeneous stochastic weather typing. Climate Research, 34(3),

169–184. https://doi.org/10.3354/cr00696

 

Vu, M. T., Aribarg, T., Supratid, S., Raghavan, S. V., & Liong, S. Y. (2016).

Statistical downscaling rainfall using artificial neural network: significantly

wetter Bangkok? Theoretical and Applied Climatology, 126(3–4), 453–467.

https://doi.org/10.1007/s00704-015-1580-1

 

Wakefield, K. (2019). Predictive analytics and machine learning. SAS Analytics.

https://www.sas.com/en_gb/insights/articles/analytics/a-guide-to-predictive-analytics-and-machine-learning.html

 

Waljee, A. K., Mukherjee, A., Singal, A. G., Zhang, Y., Warren, J., Balis, U.,

Marrero, J., Zhu, J., & Higgins, P. D. R. (2013). Comparison of imputation

methods for missing laboratory data in medicine. BMJ Open, 3(8), 1–7.

https://doi.org/10.1136/bmjopen-2013-002847

 

Wang, J. E., & Qiao, J. Z. (2014). Parameter selection of SVR based on improved k-fold

cross validation. Applied Mechanics and Materials, 462–463, 182–186.

https://doi.org/10.4028/www.scientific.net/AMM.462-463.182

 

Wang, Y., Xiao, Y., Lai, J., & Chen, Y. (2020). An adaptive k nearest neighbour

method for imputation of missing traffic data based on two similarity metrics.

Archives of Transport, 54(2), 59–73. https://doi.org/10.5604/01.3001.0014.2968

 

Wei, L., Yang, Y., Nishikawa, R. M., Wernick, M. N., & Edwards, A. (2005).

Relevance vector machine for automatic detection of clustered

microcalcifications. IEEE Transactions on Medical Imaging, 24(10), 1278–1285.

https://doi.org/10.1109/TMI.2005.855435

 

Wen, Z., Li, B., Ramamohanarao, K., Chen, J., Chen, Y., & Zhang, R. (2017).

Improving efficiency of SVM k-fold cross-validation by alpha seeding. 31st

AAAI Conference on Artificial Intelligence, AAAI 2017, i, 2768–2774.

 

Wilby, R. L., & Wigley, T. M. L. (1997). Downscaling general circulation model

output: A review of methods and limitations. Progress in Physical Geography,

21(4), 530–548.

 

Wilby, R. L., Charles, S. P., Zorita, E., Timbal, B., Whetton, P., & Mearns, L. O.

(2004). Guidelines for Use of Climate Scenarios Developed from Statistical

Downscaling Methods. Analysis, 27(August), 1–27. https://doi.org/citeulike-article-id:8861447

 

Wiskott, L. (2016). Lecture notes on Principal Component Analysis.

http://orcid.org/0000-0001-6237-740X

 

Wiskott, L., & Escalante-B., A. N. (2013). How to Solve Classification and Regression

Problems on High-Dimensional Data with a Supervised Extension of Slow

Feature Analysis. Journal of Machine Learning Research, 14, 3683–3719.

http://cogprints.org/8966/

 

Wong, T. T. (2015). Performance evaluation of classification algorithms by k-fold

and leave-one-out cross validation. Pattern Recognition, 48(9), 2839–2846.

https://doi.org/10.1016/j.patcog.2015.03.009

 

Wu, Y., & Liu, Y. (2007). Robust truncated hinge loss support vector machines.

Journal of the American Statistical Association, 102(479), 974–983.

https://doi.org/10.1198/016214507000000617

 

Xia, Y. (2020). Correlation and association analyses in microbiome study integrating

multiomics in health and disease. In Progress in Molecular Biology and

Translational Science (1st ed., Vol. 171). Elsevier Inc.

https://doi.org/10.1016/bs.pmbts.2020.04.003

 

Xu, R., Chen, N., Chen, Y., & Chen, Z. (2020). Downscaling and Projection of

Multi-CMIP5 Precipitation Using Machine Learning Methods in the Upper Han River

Basin. Advances in Meteorology, 2020. https://doi.org/10.1155/2020/8680436

 

Yadav, S., & Shukla, S. (2016). Analysis of k-Fold Cross-Validation over Hold-Out

Validation on Colossal Datasets for Quality Classification. Proceedings - 6th

International Advanced Computing Conference, IACC 2016, Cv, 78–83.

https://doi.org/10.1109/IACC.2016.25

 

Ye, Y., Xiong, Y., Zhou, Q., Wu, J., Li, X., & Xiao, X. (2020). Comparison of

Machine Learning Methods and Conventional Logistic Regressions for

Predicting Gestational Diabetes Using Routine Clinical Data: A Retrospective

Cohort Study. Journal of Diabetes Research, 2020.

https://www.hindawi.com/journals/jdr/2020/4168340/

 

Zainudin, S., Jasim, D. S., & Bakar, A. A. (2016). Comparative analysis of data

mining techniques for Malaysian rainfall prediction. International Journal on

Advanced Science, Engineering and Information Technology, 6(6), 1148–1153.

https://doi.org/10.18517/ijaseit.6.6.1487

 

Zhang, D., Tan, M. L., Dawood, S. R. S., Samat, N., Chang, C. K., Roy, R., Tew, Y.

L., & Mahamud, M. A. (2020). Comparison of NCEP-CFSR and CMADS for

hydrological modelling using SWAT in the Muda River Basin, Malaysia. Water

(Switzerland), 12(11). https://doi.org/10.3390/w12113288

 

Zhang, G., Hu, M. Y., Patuwo, B. E., & Indro, D. C. (1999). Artificial neural

networks in bankruptcy prediction: general framework and cross-validation

analysis. European Journal of Operational Research, 116(1), 16–32.

https://doi.org/10.1016/S0377-2217(98)00051-4

 

Zhang, Y. (2012). Support vector machine classification algorithm and its

application. Communications in Computer and Information Science, 308

CCIS(PART 2), 179–186. https://doi.org/10.1007/978-3-642-34041-3_27

 

Zhang, Y., & Wu, L. (2012). Classification of fruits using computer vision and a

multiclass support vector machine. Sensors (Switzerland), 12(9), 12489–12505.

https://doi.org/10.3390/s120912489

 

Zhao, Y., & Miner, S. D. (2014). Data Mining Applications with R: ProQuest Tech

Books.

http://proquest.safaribooksonline.com.proxy1.library.mcgill.ca/book/programming/r/9780124115118

 

Zorita, E., & von Storch, H. (1997). A survey of statistical downscaling techniques.

GKSS Report, 20. https://www.osti.gov/etdeweb/servlets/purl/595191

 

Zoro, R. (2012). How to explain poor classification performance of recall when using

SVM. Cross Validated. https://stats.stackexchange.com/q/22208

 


