UPSI Digital Repository (UDRep)
Abstract : Universiti Pendidikan Sultan Idris
The main objective of this study is to identify the spatiotemporal patterns of torrential
rainfall on the East Coast of Peninsular Malaysia using Robust Principal Component
Analysis combined with Fuzzy C-means (RPCA-FCM). As a methodology, the RPCA-FCM
model was proposed to address difficulties in identifying torrential rainfall patterns.
Rainfall records commonly contain missing values for various reasons. The missing data
mechanism was identified in order to choose a suitable imputation method, and RF-MLR
was chosen as the best method for handling the missing rainfall data. A dimension
reduction method coupled with a clustering approach was applied to reduce the data
dimensions and perform the cluster partition. An RPCA based on Tukey’s biweight
correlation, together with an optimum breakdown point for extracting the number of
components in RPCA, was proposed. The data used in this study were generated using Monte Carlo simulation to
evaluate the performance of the proposed statistical model. The results revealed that a
breakdown point of 0.4 at an 85% cumulative percentage of variance efficiently extracts
the number of components while avoiding low-frequency variations or clusters of
insignificant spatial scale. The study also showed an improvement, in that the RPCA
downweighted the far-from-center outliers and developed the cluster partitions.
However, K-means assigns each element exclusively to a single cluster. As a solution,
FCM was incorporated to allow data elements to belong to more than one cluster, based
on the structure of the rainfall data. In conclusion, the results show that RPCA-FCM
substantially improves on the classical model in terms of the average number of
clusters obtained and the cluster quality. As an implication, the identification of
spatiotemporal rainfall cluster patterns is useful for hydrologists in analyzing
environmental models and improves the assessment of climate change.
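The two core steps described in the abstract, retaining components up to a cumulative variance threshold and then assigning soft cluster memberships, can be illustrated with a minimal Python sketch. It is not the study's implementation. As assumptions, it uses the ordinary Pearson correlation matrix as a stand-in for the Tukey's biweight correlation, omits the breakdown point entirely, runs on synthetic data, and uses hypothetical function names (n_components_for_variance, fuzzy_c_means). The membership update follows the standard fuzzy c-means formulation with fuzzifier m = 2.

```python
import numpy as np

def n_components_for_variance(corr, threshold=0.85):
    """Smallest number of leading eigenvalues whose cumulative share reaches threshold."""
    eigvals = np.linalg.eigvalsh(corr)[::-1]       # eigenvalues in descending order
    eigvals = np.clip(eigvals, 0.0, None)          # guard against tiny negative values
    cum_share = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cum_share, threshold) + 1)

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns (centers, membership matrix U of shape n x c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)              # each row of memberships sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                      # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    W = U ** m
    centers = (W.T @ X) / W.sum(axis=0)[:, None]   # centers consistent with final U
    return centers, U

# Synthetic stand-in data (assumption): 100 "days" x 6 "stations", two regimes.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0, 1, (50, 6)), rng.normal(5, 1, (50, 6))])

# Pearson correlation is used here only as a placeholder for a robust correlation.
corr = np.corrcoef(X, rowvar=False)
k = n_components_for_variance(corr, threshold=0.85)

# Project standardized data onto the k leading eigenvectors, then cluster softly.
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1][:k]
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scores = Z @ eigvecs[:, order]
centers, U = fuzzy_c_means(scores, c=2)
labels = U.argmax(axis=1)                          # hard labels, for inspection only
```

Unlike K-means, each row of U holds a degree of membership in every cluster, so an observation near a boundary can belong partly to more than one cluster.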