Jan Van Balen


Why Hip-Hop is Interesting (for Music Information Retrieval)

[abstract] [my slides] [Ethan's slides]

At ISMIR 2016 I collaborated with Ethan Hein (NYU) and Dan Brown (UWaterloo) on a tutorial titled 'Why Hip-Hop is Interesting'. Following the example of similarly titled tutorials at previous editions of the conference, we wanted to look at a music genre that (we believe) deserves more attention from the community.

The tutorial was divided into four parts. In the first part, I talked about how the influence of hip-hop on popular music is hard to overstate, while at the same time it gets very little attention in Music Information Retrieval and digital musicology. I illustrated this with a case study on the absence of hip-hop from corpus analysis research, and the challenges around music description that arise in genres in which melody and harmony are less important than in music styles descended from the European classical tradition.

In the second part, Ethan Hein gave his own perspective on the importance of hip-hop, as a music educator and music education researcher with a particular interest in technology. You can read some of his arguments over at his blog. Dan Brown then talked about the analysis of rhyme. Finally, I presented a short overview of my research on sampling, arguing that any music informatics system that aspires to understand popular music should also understand sampling. I also came back to another point I wanted to make with this tutorial: crucial information is lost when pop music is broken down to just those elements that can be encoded in Western music notation or symbolic digital formats.

Audio Corpus Analysis & Hooks


A talk about audio corpus analysis and the analysis of hooks (slides are mostly figures). I presented a version of this talk ahead of my PhD defence, for an audience of mixed technical and musical background that also included my family. I presented an earlier (more detailed) version when Bob Sturm invited me to talk at the Centre for Digital Music (Queen Mary University of London).

Corpus Analysis Tools for Computational Hook Discovery

[slides] [paper] [code]

ISMIR 2015 presentation on our analysis of the data from our game (Hooked!). The talk focuses on our contributions to the audio description and statistical analysis of popular music data. It also mentions the CATCHY toolbox, which has since been made available on GitHub.


Compared to studies with symbolic music data, advances in music description from audio have overwhelmingly focused on ground truth reconstruction and maximizing prediction accuracy, with only a small fraction of studies using audio description to gain insight into musical data. We present a strategy for the corpus analysis of audio data that is optimized for interpretable results. The approach brings two previously unexplored concepts to the audio domain: audio bigram distributions, and the use of corpus-relative or 'second-order' descriptors. To test the real-world applicability of our method, we present an experiment in which we model song recognition data collected in a widely-played music game. By using the proposed corpus analysis pipeline we are able to present a cognitively adequate analysis that allows a model interpretation in terms of the listening history and experience of our participants. We find that our corpus-based audio features are able to explain a comparable amount of variance to symbolic features for this task when used alone and that they can supplement symbolic features profitably when the two types of features are used in tandem. Finally, we highlight new insights into what makes music recognizable.