TY - CONF
T1 - Modeling dynamic patterns for emotional content in music
AU - Vaizman, Yonatan
AU - Granot, Roni Y.
AU - Lanckriet, Gert
PY - 2011
Y1 - 2011
N2 - Emotional content is a major component of music. Discovering the acoustic patterns in music that carry this emotional information, and that enable performers to communicate emotional messages to listeners, has long been a research topic of interest. Previous works searched the audio signal for local cues, most of which assume monophonic music, and their statistics over time. Here, we used generic audio features, which can be calculated for any audio signal, and focused on the progression of these features through time, investigating how informative the dynamics of the audio are for emotional content. Our data comprises piano and vocal improvisations by musically trained performers instructed to convey four categorical emotions. We applied the Dynamic Texture Mixture (DTM), which models both the instantaneous sound qualities and their dynamics, and demonstrated the strength of the model. We further showed that, once the dynamics are taken into account, even highly reduced versions of the generic audio features carry a substantial amount of information about the emotional content. Finally, we demonstrate how interpreting the parameters of the trained models can yield interesting cognitive suggestions.
UR - http://www.scopus.com/inward/record.url?scp=84873598676&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84873598676
SN - 9780615548654
T3 - Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011
SP - 747
EP - 752
BT - Proceedings of the 12th International Society for Music Information Retrieval Conference, ISMIR 2011
T2 - 12th International Society for Music Information Retrieval Conference, ISMIR 2011
Y2 - 24 October 2011 through 28 October 2011
ER -