Matthias Mauch, Robert M. MacCallum, Mark Levy, Armand M. Leroi
Published 6 May 2015. DOI: 10.1098/rsos.150081
In modern societies, cultural change seems ceaseless. The flux of fashion is especially obvious for popular music. While much has been written about the origin and evolution of pop, most claims about its history are anecdotal rather than scientific in nature. To rectify this, we investigate the US Billboard Hot 100 between 1960 and 2010. Using music information retrieval and text-mining tools, we analyse the musical properties of approximately 17 000 recordings that appeared in the charts and demonstrate quantitative trends in their harmonic and timbral properties. We then use these properties to produce an audio-based classification of musical styles and study the evolution of musical diversity and disparity, testing, and rejecting, several classical theories of cultural change. Finally, we investigate whether pop musical evolution has been gradual or punctuated. We show that, although pop music has evolved continuously, it did so with particular rapidity during three stylistic ‘revolutions’ around 1964, 1983 and 1991. We conclude by discussing how our study points the way to a quantitative science of cultural change.
The history of popular music has long been debated by philosophers, sociologists, journalists, bloggers and pop stars [1–7]. Their accounts, though rich in vivid musical lore and aesthetic judgements, lack what scientists want: rigorous tests of clear hypotheses based on quantitative data and statistics. Economics-minded social scientists studying the history of music have done better, but they are less interested in music than the means by which it is marketed [8–15]. The contrast with evolutionary biology—a historical science rich in quantitative data and models—is striking, the more so because cultural and organismic variety are both considered to be the result of modification-by-descent processes [16–19]. Indeed, linguists and archaeologists, studying the evolution of languages and material culture, commonly apply the same tools that evolutionary biologists do when studying the evolution of species [20–25].
Until recently, the single greatest impediment to a scientific account of musical history has been a want of data. That has changed with the emergence of large, digitized, collections of audio recordings, musical scores and lyrics. Quantitative studies of musical evolution have quickly followed [26–30]. Here, we use a corpus of digitized music to investigate the history of American popular music. Drawing inspiration from studies of organic and cultural evolution, we view the history of pop music as a ‘fossil record’ and ask the kinds of questions that a palaeontologist might: has the variety of popular music increased or decreased over time? Is evolutionary change in popular music continuous or discontinuous? And, if it is discontinuous, when did the discontinuities occur?
To delimit our sample, we focused on songs that appeared in the US Billboard Hot 100 between 1960 and 2010. We obtained 30-s-long segments of 17 094 songs covering 86% of the Hot 100, with a small bias towards missing songs in the earlier years. Because our aim is to investigate the evolution of popular taste, we did not attempt to obtain a representative sample of all the songs that were released in the USA in that period of time, but just those that were most commercially successful.
Like previous studies of pop-music history [28,30], our study is based on features extracted from audio rather than from scores. However, where these early studies focused on technical aspects of audio such as loudness, vocabulary statistics and sequential complexity, we have attempted to identify musically meaningful features. To this end, we adopted an approach inspired by recent advances in text-mining (figure 1). We began by measuring our songs for a series of quantitative audio features, 12 descriptors of tonal content and 14 of timbre (electronic supplementary material, M2–3). These were then discretized into ‘words’ resulting in a harmonic lexicon (H-lexicon) of chord changes, and a timbral lexicon (T-lexicon) of timbre clusters (electronic supplementary material, M4). To relate the T-lexicon to semantic labels in plain English, we carried out expert annotations (electronic supplementary material, M5). The musical words from both lexica were then combined into 8+8=16 ‘topics’ using latent Dirichlet allocation (LDA). LDA is a hierarchical generative model of a text-like corpus, in which every document (here: song) is represented as a distribution over a number of topics, and every topic is represented as a distribution over all possible words (here: chord changes from the H-lexicon, and timbre clusters from the T-lexicon). We obtain the most likely model by means of probabilistic inference (electronic supplementary material, M6). Each song, then, is represented as a distribution over eight harmonic topics (H-topics) that capture classes of chord changes (e.g. ‘dominant-seventh chord changes’) and eight timbral topics (T-topics) that capture particular timbres (e.g. ‘drums, aggressive, percussive’, ‘female voice, melodic, vocal’, derived from the expert annotations), with topic proportions q. These topic frequencies were the basis of our analyses.
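The way LDA turns a song's discretized ‘words’ into topic proportions q can be illustrated with a minimal sketch. It is not the inference procedure described in the electronic supplementary material (M6), which is not reproduced here. The sketch assumes a collapsed Gibbs sampler, a toy lexicon of six word ids and two topics rather than eight; the function name `fit_lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy songs are all hypothetical choices made for illustration.

```python
import random


def fit_lda_gibbs(docs, n_topics, vocab_size,
                  alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Estimate per-song topic proportions with collapsed Gibbs sampling.

    docs is a list of songs, each a list of integer word ids.
    alpha and beta are symmetric Dirichlet priors (assumed values).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]               # words per topic, per song
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                              # total words per topic

    # Start from a random topic assignment for every word occurrence.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample the topic given all other assignments.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions q for each song; each row sums to 1.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + alpha * n_topics)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]


# Toy corpus: four "songs" over a six-word lexicon (hypothetical data).
songs = [
    [0, 1, 0, 1, 2, 0],
    [1, 2, 0, 1, 1, 2],
    [3, 4, 5, 3, 4, 5],
    [4, 5, 3, 3, 5, 4],
]
q = fit_lda_gibbs(songs, n_topics=2, vocab_size=6)
for song_id, proportions in enumerate(q):
    print(song_id, [round(p, 3) for p in proportions])
```

In this sketch each row of `q` plays the role of one song's topic proportions. Applied in the manner the text describes, one such model would be fitted to words from the H-lexicon and another to words from the T-lexicon, each with eight topics.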