UPSI Digital Repository (UDRep)
Type: Thesis
Subject: TK Electrical engineering. Electronics. Nuclear engineering
Main Author: Noor Alhusna Madzlan
Title: Development of an automatic attitude recognition system: a multimodal analysis of video blogs
Place of Production: Tanjong Malim
Publisher: Fakulti Bahasa dan Komunikasi
Year of Publication: 2017
Corporate Name: Universiti Pendidikan Sultan Idris

Abstract:
Communicative content in human communication involves the expressivity of socio-affective states. Research in Linguistics, Social Signal Processing and, in particular, Affective Computing highlights the importance of affect, emotion and attitudes as sources of information for communicative content. Attitudes, considered as socio-affective states of speakers, are conveyed through a multitude of signals during communication. Understanding the expression of speakers' attitudes is essential for establishing successful communication. Taking an empirical approach to the study of attitude expressions, the main objective of this research is to contribute to the development of an automatic attitude classification system through a fusion of multimodal signals expressed by speakers in video blogs. The study describes a new communicative genre of self-expression through social media, video blogging, which provides opportunities for interlocutors to disseminate information through a myriad of multimodal characteristics. It outlines the main features of this novel communication medium and focuses attention on its possible exploitation as a rich source of information for human communication. The dissertation describes the manual annotation of attitude expressions in the vlog corpus, multimodal feature analysis, and the processes involved in developing an automatic attitude annotation system. An ontology of an attitude annotation scheme for speech in video blogs is elaborated, and five attitude labels are derived. Prosodic and visual feature extraction procedures are explained in detail. Discussion of the development of the automatic attitude classification model includes analysis of the automatic prediction of attitude labels from prosodic and visual features using machine-learning methods. The study also provides a detailed analysis of individual feature contributions and their predictive power for the classification task.
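As a rough illustration of the setup the abstract describes (feature-level fusion of prosodic and visual features, then supervised classification into five attitude labels), the following minimal Python/scikit-learn sketch trains an SVM on a concatenated prosodic+visual feature vector. The synthetic data, feature dimensions, and the SVM choice are assumptions made here for illustration, not the thesis's actual pipeline.

# Illustrative sketch only: early (feature-level) fusion of prosodic and
# visual features for 5-class attitude classification. Assumes features
# were already extracted per vlog segment (e.g., pitch/energy statistics
# and facial-expression descriptors); all shapes and data are placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_clips = 200                              # annotated vlog segments (placeholder)
prosodic = rng.normal(size=(n_clips, 12))  # e.g., F0, energy, voice-quality stats
visual = rng.normal(size=(n_clips, 20))    # e.g., facial/head-movement descriptors
labels = rng.integers(0, 5, size=n_clips)  # five attitude labels

# Early fusion: concatenate both modalities into a single feature vector.
fused = np.hstack([prosodic, visual])

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(clf, fused, labels, cv=5)
print(f"5-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

Per-modality contributions, as analysed in the dissertation, could be estimated the same way by scoring the classifier on each feature block separately before fusing them.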

This material may be protected under the Copyright Act, which governs the making of photocopies or reproductions of copyrighted materials.
You may use the digitized material for private study, scholarship, or research.

