UPSI Digital Repository (UDRep)

Type: article
Subject: Q Science
Main Author: Shazlyn Milleana Shaharudin
Additional Authors: Shuhaida Ismail; Siti Mariana Che Mat Nor; Norhaiza Ahmad
Title: An efficient method to improve the clustering performance using hybrid robust principal component analysis-spectral biclustering in rainfall patterns identification
Place of Production: Tanjong Malim
Publisher: Fakulti Sains dan Matematik
Year of Publication: 2019
Corporate Name: Universiti Pendidikan Sultan Idris

Abstract:
In this study, a hybrid RPCA-spectral biclustering model is proposed for identifying rainfall patterns in Peninsular Malaysia. The model combines Robust Principal Component Analysis (RPCA) with biclustering in order to overcome the skewness present in the Peninsular Malaysia rainfall data. Compared with classical PCA, Robust PCA is more resilient to outliers because it assesses every observation and downweights those that deviate from the data center. Meanwhile, two-way clustering can cluster along two dimensions simultaneously and exhibits a higher correlation than one-way cluster analysis. The experimental results show that the best cumulative percentage of variation lies between 65% and 70% for both Robust and classical PCA. The number of clusters increased from six disjoint clusters with Robust PCA-kMeans to eight disjoint clusters with the proposed model. Further analysis shows that the proposed model has a smaller variation, 0.0034, compared with 0.030 for the Robust PCA-kMeans model. Given this small variation in the clustering result, the analysis shows that the proposed RPCA-spectral biclustering model is well suited to identifying rainfall patterns in Peninsular Malaysia.
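The cumulative percentage of variation mentioned in the abstract is commonly used to decide how many principal components to retain before clustering. The sketch below illustrates that selection step only, under stated assumptions: the function names `cumulative_variance_percent` and `components_for_threshold` and the eigenvalues are hypothetical and are not taken from the study, and the robust estimation and spectral biclustering steps are not shown.

```python
from itertools import accumulate


def cumulative_variance_percent(eigenvalues):
    """Cumulative percentage of total variance for the leading components."""
    # Components are ranked by decreasing eigenvalue (variance explained).
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    return [100.0 * running / total for running in accumulate(ordered)]


def components_for_threshold(eigenvalues, threshold_percent):
    """Smallest number of leading components reaching the given threshold."""
    percents = cumulative_variance_percent(eigenvalues)
    for count, percent in enumerate(percents, start=1):
        if percent >= threshold_percent:
            return count
    # Guard against rounding leaving the last value just under the threshold.
    return len(percents)


# Hypothetical eigenvalues for illustration; NOT results from the study.
example_eigenvalues = [4.0, 2.0, 1.0, 1.0, 1.0, 1.0]

if __name__ == "__main__":
    for threshold in (65, 70):
        kept = components_for_threshold(example_eigenvalues, threshold)
        print(f"threshold {threshold}%: keep {kept} components")
```

In a robust variant, the eigenvalues would come from a robust covariance or correlation matrix instead of the classical one; the selection rule itself stays the same.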




This material may be protected under the Copyright Act, which governs the making of photocopies or reproductions of copyrighted materials.
You may use the digitized material for private study, scholarship, or research.
