Jan Van Balen


In June 2016, I defended my PhD on music representations and corpus analysis strategies for audio-based music research. The thesis covers a few different topics. Hopefully the breakdown below helps you find the part you're interested in.


PART I · Introduction to audio features and corpus analysis

Chapter 1: Introduction

This chapter introduces the motivations for the work in this thesis, and the Cogitch project of which it was a part. It also briefly introduces the fields of digital and empirical musicology, and discusses the challenges around empirical research in the humanities. We focus on a few perspectives that were found to be most useful throughout this thesis.

Chapter 2: Audio Description

There is a large amount of prior work on the description of musical audio content. In this chapter, we give an overview of the most relevant work from the Music Information Retrieval literature, focusing on audio description and its applications.

Chapter 3: Audio Corpus Analysis

This chapter gives an overview of previous work in corpus analysis and digital musicology. As this thesis aims to contribute to the audio description of popular music for the purpose of corpus analysis research, we look at the studies themselves as well as some methodological considerations. The chapter includes a case study in which we review the methods of Serrà et al. and Mauch et al. in their corpus analysis work on the evolution of popular music.

PART II · Chorus analysis & pitch description

Chapter 4: Chorus Analysis

As a first venture into audio corpus analysis of our own, we take a look at the properties of choruses and other song sections in two structure-annotated datasets of popular music.

Chapter 5: Cognition-informed Pitch Description

The previous chapter identifies pitch and harmony as promising areas for audio corpus analysis, but also as areas in which the description approaches can be improved. This chapter presents a set of new, compact pitch and harmony descriptors, along with a cover song detection experiment that tests how descriptive the new descriptors are.

Chapter 6: Audio Bigrams

The descriptors proposed in the previous chapter can be generalized. The result is a family of features that has the potential to be useful in a variety of audio content identification applications, which we jointly refer to as 'soft audio fingerprinting'. We present an implementation and an example evaluation using a larger cover song detection dataset.

PART III · Corpus analysis of hooks

Chapter 7: Hooked

Part III of this thesis talks about the corpus analysis of 'hooks'. We first introduce the concept of hooks and some of the existing literature around it. Then we discuss Hooked, the game we designed to collect data on popular music hooks and long-term music memory.

Chapter 8: Hook Analysis

Finally, we discuss the corpus analysis of the hook data we gathered with Hooked. Along the way, we introduce the notion of second-order audio features, as well as song-based and corpus-based second-order features, which allow us to model recurrence and conventionality.

Chapter 9: Conclusion