UPSI Digital Repository (UDRep)
Abstract: Universiti Pendidikan Sultan Idris
Communicative content in human communication involves the expressivity of socio-affective states. Research in Linguistics, Social Signal Processing and Affective Computing in particular highlights the importance of affect, emotion and attitudes as sources of information for communicative content. Attitudes, considered as socio-affective states of speakers, are conveyed through a multitude of signals during communication, and understanding how speakers express attitudes is essential for establishing successful communication. Taking an empirical approach to studying attitude expressions, the main objective of this research is to contribute to the development of an automatic attitude classification system through a fusion of multimodal signals expressed by speakers in video blogs. The present study describes a new communicative genre of self-expression through social media, video blogging, which provides opportunities for interlocutors to disseminate information through a myriad of multimodal characteristics. This study describes the main features of this novel communication medium and focuses attention on its possible exploitation as a rich source of information for human communication. The dissertation describes the manual annotation of attitude expressions from the vlog corpus, multimodal feature analysis, and the processes for developing an automatic attitude annotation system. An ontology for an attitude annotation scheme for speech in video blogs is elaborated and five attitude labels are derived. Prosodic and visual feature extraction procedures are explained in detail. Discussion of the process of developing an automatic attitude classification model includes analysis of the automatic prediction of attitude labels from prosodic and visual features using machine-learning methods. The study also gives a detailed analysis of individual feature contributions and their predictive power for the classification task.
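The pipeline the abstract outlines (feature-level fusion of prosodic and visual cues, supervised classification into five attitude labels, and per-feature contribution analysis) can be sketched briefly. The following is a minimal illustration only, not the author's implementation: it uses scikit-learn with synthetic stand-in feature matrices, an SVM as one plausible classifier, and permutation importance as one plausible way to estimate feature contributions; the feature dimensions, label set and classifier choice are all assumptions.

    # Minimal sketch (not the dissertation's code) of attitude
    # classification from fused prosodic and visual features.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score, train_test_split
    from sklearn.inspection import permutation_importance

    rng = np.random.default_rng(0)

    # Stand-ins for per-utterance feature matrices extracted elsewhere
    # (e.g. pitch/intensity statistics from Praat, facial-movement
    # features from video); dimensions are illustrative assumptions.
    n = 200
    prosodic = rng.normal(size=(n, 6))
    visual = rng.normal(size=(n, 4))
    labels = rng.integers(0, 5, size=n)  # five attitude classes

    # Early (feature-level) fusion: concatenate the two modalities.
    X = np.hstack([prosodic, visual])

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
    scores = cross_val_score(clf, X, labels, cv=5)
    print(f"5-fold accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")

    # Estimate individual feature contributions on a held-out split.
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, random_state=0)
    clf.fit(X_tr, y_tr)
    imp = permutation_importance(clf, X_te, y_te, n_repeats=10,
                                 random_state=0)
    for i in np.argsort(imp.importances_mean)[::-1][:5]:
        print(f"feature {i}: importance {imp.importances_mean[i]:.3f}")

On real data the per-modality matrices would be replaced by the extracted prosodic and visual features, and the importance ranking would indicate which cues carry the most predictive power for each attitude label.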