Communicative content in human communication involves the expression of socio-affective
states. Research in Linguistics, Social Signal Processing and Affective Computing, in
particular, highlights the importance of affect, emotion and attitudes as sources of information for
communicative content. Attitudes, considered as socio-affective states of speakers, are
conveyed through a multitude of signals during communication. Understanding the expression of
attitudes of speakers is essential for establishing successful communication. Taking an
empirical approach to studying attitude expressions, this research aims
to contribute to the development of an automatic attitude classification system through a
fusion of multimodal signals expressed by speakers in video blogs. The present study
describes a new communicative genre of self-expression through social media: video blogging, which
provides opportunities for interlocutors to disseminate information through a myriad of
multimodal characteristics. This study describes the main features of this novel communication medium and
draws attention to its possible exploitation as a rich source of information for human
communication. The dissertation describes the manual annotation of attitude expressions from the vlog
corpus, multimodal feature analysis and the processes for developing an automatic attitude
annotation system. An ontology of the attitude annotation scheme for speech in video blogs is
elaborated and five attitude labels are derived. Prosodic and visual feature
extraction procedures are explained in detail. The discussion of the process of developing an automatic
attitude classification model includes an analysis of the automatic prediction of attitude labels
using prosodic and visual features through machine-learning methods. This study also presents a
detailed analysis of individual feature contributions and their predictive power for
the classification task.
