
From close-listening to distant-listening: Developing tools for Speech-Music discrimination of Danish music radio

Publication: Contribution to journal - Journal article - Research - Peer-reviewed

Digitization has changed flow music radio in several ways. Competition from music streaming services like Spotify and iTunes has to a large extent outperformed traditional playlist radio, and the global dissemination of software-generated playlists in public service radio stations in the 1990s has superseded the passionate music radio host (Russo 2013, Dubber 2014, Have 2019). But digitization has also changed the way we can do research on radio.
In Denmark, legal deposit of all broadcast material to the Royal Danish Library since 2005, together with the digitization of almost all radio programming back to 1989, has made it possible to actually listen to the archive and investigate how radio content has changed historically. The Danish digital radio archive is accessible for research through the digital infrastructure www.larm.fm. Larm.fm contains more than 1.5 million digitized Danish radio programs, but no tool has yet been developed for large-scale analysis of the archive. One of the ambitions behind the project presented in this article is to demonstrate a way to do that.
The specific research question behind this article is: How has the distribution of music and talk on the Danish Broadcasting Corporation’s radio channel P3 developed from 1989 to 2019? P3 is the most popular public service music radio channel broadcast nationwide in Denmark, the equivalent of the British BBC Radio 1. The thesis is that there has been a development from recorded music being the most important content of the programs towards an increasing emphasis on spoken words (chatting hosts, news etc.) on the channel. This thesis has previously been tested, and to some extent confirmed, in a qualitative study of seven cases of a specific morning music show in the period (Have 2018), but the present article zooms out to the entire P3 morning programming 1989-2019, presumably allowing more valid answers to the research question. Methodologically, this is a shift from close-listening to a few programs to large-scale distant-listening to more than 65,000 hours of radio.
Thus, the aim of the article is twofold: 1) to describe the methodological process and the challenges of developing a model for large-scale speech-music discrimination analysis of radio data; 2) to discuss and critically compare the methods, results, strengths and shortcomings of the qualitative and the quantitative analysis, respectively.
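At its core, the large-scale analysis in aim 1 reduces to aggregating per-segment classifier labels into a speech/music ratio per year. The sketch below is a hypothetical illustration of that bookkeeping step (the function name, the segment tuple format, and the toy numbers are assumptions, not the project's actual pipeline or data):

```python
from collections import defaultdict

def yearly_speech_ratio(segments):
    """Aggregate (year, label, duration_seconds) tuples into the
    fraction of airtime labeled "speech" per year.
    `segments` is assumed to be the output of a speech/music classifier."""
    totals = defaultdict(lambda: {"speech": 0.0, "music": 0.0})
    for year, label, duration in segments:
        totals[year][label] += duration
    return {year: d["speech"] / (d["speech"] + d["music"])
            for year, d in sorted(totals.items())}

# Toy example with fabricated numbers (not actual P3 data):
demo = [(1989, "music", 420), (1989, "speech", 180),
        (2019, "music", 240), (2019, "speech", 360)]
print(yearly_speech_ratio(demo))  # {1989: 0.3, 2019: 0.6}
```

A time series of such yearly ratios is what would let one test the thesis of a shift from music towards talk.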
It has previously been shown that Convolutional Neural Networks (CNNs) trained for image recognition on spectrograms outperform popular audio classifiers based on audio features, such as Support Vector Machines (SVMs). Our study shows that the CNN-based approach also generalizes to speech and music classification in Danish radio, with an overall accuracy of 98%. This level of performance allows for large-scale analysis of the relationship between speech and music, and while our approach focuses on speech and music classification in radio, it generalizes to other audio media such as audiobooks and podcasts, as well as to other prediction targets such as gender and mood. The computational part of the project is supported by DeIC’s National Cultural Heritage Cluster and the Royal Danish Library, which provide HPC facilities, and developed together with the Center for Humanistic Computing, Aarhus University.
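The key preprocessing step behind a CNN-on-spectrograms classifier is turning raw audio into a log-magnitude spectrogram image. A minimal NumPy sketch of that transformation is shown below; the function name and parameter values are illustrative assumptions, not the authors' actual implementation (which may use mel scaling or a dedicated audio library):

```python
import numpy as np

def stft_spectrogram(signal, frame_len=512, hop=256):
    """Compute a log-magnitude spectrogram via a short-time Fourier
    transform: Hann-windowed frames, FFT per frame, log compression.
    Returns an array of shape (freq_bins, time_frames), i.e. the
    image-like input a CNN would classify."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    return np.log1p(magnitudes).T

# Example: one second of a 440 Hz tone sampled at 16 kHz
sr = 16000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440 * t)
S = stft_spectrogram(sig)
print(S.shape)  # (257, 61): 257 frequency bins x 61 time frames
```

Speech and music produce visibly different patterns in such images (harmonic stacks and steady tones versus formant movement and pauses), which is what makes image-recognition CNNs effective for the discrimination task.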
Journal: Digital Humanities Quarterly
Status: Published - March 2021


ID: 177154164