UPSI Digital Repository (UDRep)

Type :thesis
Subject :Q Science
Main Author :Siti Mariana Che Mat Nor
Title :Spatiotemporal rainfall patterns recognition using Robust Principal Component Analysis and Fuzzy C-means
Place of Production :Tanjong Malim
Publisher :Fakulti Sains dan Matematik
Year of Publication :2021
Corporate Name :Universiti Pendidikan Sultan Idris
PDF Full Text :The author has requested the full text of this item to be restricted.

Abstract :
The main objective of this study is to identify spatiotemporal patterns of torrential rainfall on the East Coast of Peninsular Malaysia using Robust Principal Component Analysis combined with Fuzzy C-means (RPCA-FCM). The RPCA-FCM model was proposed to address difficulties in identifying torrential rainfall patterns. Rainfall records commonly contain missing values for various reasons, so the missing data mechanism was identified first in order to choose suitable imputation methods; RF-MLR was chosen as the best imputation method for handling the missing rainfall data. A dimension reduction method coupled with a clustering approach was then applied to reduce the data dimensions and partition the data into clusters. An RPCA based on Tukey's biweight correlation was proposed, together with an optimum breakdown point for extracting the number of components. The data used to evaluate the performance of the proposed statistical model were generated using Monte Carlo simulation. The results revealed that a breakdown point of 0.4 at an 85% cumulative percentage of variance efficiently extracts the number of components while avoiding low-frequency variations and clusters of insignificant spatial scale. The study also showed an improvement in that RPCA downweighted outliers far from the center before the cluster partitions were developed. K-means, however, assigns each element exclusively to one cluster. FCM was therefore combined with RPCA so that data elements can belong to more than one cluster, according to the structure of the rainfall data. In conclusion, RPCA-FCM shows a substantial improvement over the classical model in terms of the average number of clusters obtained and the cluster quality. As an implication, identifying spatiotemporal rainfall cluster patterns is useful for hydrologists analyzing environmental models and improves the assessment of climate change.
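
Two steps named in the abstract can be illustrated with a short sketch: choosing the number of components at a cumulative variance threshold such as 85%, and the soft membership that distinguishes Fuzzy C-means from K-means. This is a minimal illustration under stated assumptions, not the thesis implementation. The function names, the example eigenvalues, and the toy points are all hypothetical, and the sketch uses the textbook fuzzy c-means updates with fuzzifier m = 2. It does not include Tukey's biweight correlation, the breakdown point, or RF-MLR imputation.

```python
import math
import random


def n_components_for_variance(eigenvalues, threshold=0.85):
    """Smallest number of leading components whose cumulative share of
    total variance reaches `threshold` (for example 0.85 for 85%)."""
    vals = sorted(eigenvalues, reverse=True)
    total = sum(vals)
    running = 0.0
    for k, v in enumerate(vals, start=1):
        running += v
        # Small tolerance guards against floating-point round-off at the cut
        if running / total >= threshold - 1e-12:
            return k
    return len(vals)


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Textbook fuzzy c-means. Returns (centers, memberships), where
    memberships[i][j] is the degree to which point i belongs to cluster j."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, normalised so each row sums to 1
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        s = sum(row)
        U.append([u / s for u in row])
    centers = []
    for _ in range(n_iter):
        # Centers: means of the points weighted by membership ** m
        centers = []
        for j in range(n_clusters):
            w = [U[i][j] ** m for i in range(len(points))]
            centers.append([sum(wi * p[d] for wi, p in zip(w, points)) / sum(w)
                            for d in range(dim)])
        # Memberships: inverse-distance weights, renormalised per point
        shift = 0.0
        for i, p in enumerate(points):
            inv = [max(math.dist(p, c), 1e-12) ** (-2.0 / (m - 1.0))
                   for c in centers]
            s = sum(inv)
            new_row = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        # Stop once no membership changed by more than the tolerance
        if shift < tol:
            break
    return centers, U


if __name__ == "__main__":
    # Hypothetical eigenvalues of a correlation matrix
    print(n_components_for_variance([5.0, 3.0, 1.0, 1.0], 0.85))
    # Two tight groups plus one point lying between them (toy data)
    pts = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
           (5.0, 5.0), (5.1, 4.8), (4.9, 5.2), (2.5, 2.5)]
    centers, U = fuzzy_c_means(pts, n_clusters=2)
    for p, row in zip(pts, U):
        print(p, [round(u, 2) for u in row])
```

In the toy data, the point lying between the two groups receives a split membership rather than being forced into a single cluster, which is the property the abstract cites as the reason for replacing K-means with FCM.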


This material may be protected under Copyright Act which governs the making of photocopies or reproductions of copyrighted materials.
You may use the digitized material for private study, scholarship, or research.
