What a WAV file looks like
When the WaveNet paper was first published, it was accompanied by a blog post that featured the following helpful gif:
Since then, I've noticed that pretty much every other blog post about WaveNet re-uses the figure. It helps convey, to people with some experience in machine learning but perhaps not with audio data, how impressively dense audio data actually is. In particular, it suggests that modeling audio in the time domain, shown here (as opposed to the time-frequency domain), might just be really difficult.
Earlier today, Ethan Hein shared on Twitter that he's also using this figure in his teaching:
"This gif is the single most valuable teaching resource I have for explaining how digital audio works. https://t.co/tLj6kr1wnl" (Ethan Hein, @ethanhein, July 10, 2020)
However, I thought it might be even more useful if it also reflected that, when audio is digitized, it's not just the time axis that is discretized, but also the amplitude. So I made a version of the figure that reflects that a little better.
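To make the two kinds of discretization concrete, here is a minimal sketch that samples a sine wave at discrete times and then rounds each value to one of a fixed number of integer levels. The function name `sample_and_quantize`, the 440 Hz tone, the 16 kHz sample rate, and the deliberately coarse 4-bit depth are all my own illustrative assumptions, not values taken from the figure or from its code.

```python
import math

def sample_and_quantize(freq_hz, duration_s, sample_rate, bit_depth):
    """Sample a sine wave at discrete times and round each value to a signed integer level."""
    n_samples = int(round(duration_s * sample_rate))
    max_level = 2 ** (bit_depth - 1) - 1  # e.g. 32767 for 16-bit audio, 7 for 4-bit
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                      # discrete time axis
        x = math.sin(2 * math.pi * freq_hz * t)  # ideal amplitude in [-1, 1]
        samples.append(round(x * max_level))     # discrete amplitude axis
    return samples

# One millisecond of a 440 Hz tone at 16 kHz, quantized to a coarse 4-bit depth
# (hypothetical settings, chosen so the amplitude steps are easy to see).
coarse = sample_and_quantize(440.0, 0.001, 16000, 4)
print(len(coarse), sorted(set(coarse)))
```

With a realistic bit depth such as 16, the amplitude steps become far too fine to see at this scale, which is probably why figures like this one tend to emphasize only the time axis.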
Here's a few seconds of a piece for cello by Max Richter:
And here's a pop song, with a more complex arrangement:
Or how about some of the analog electronics of Elmer Bernstein's eerie soundtrack to (appropriately) Charles and Ray Eames' 1970 short film, The Powers of Ten:
Finally, here's a beautiful one contributed by Vincent Lostanlen, a Shepard-Risset glissando:
"it's y(t) = 2 y(2t) all the way down! pic.twitter.com/ztFR5ERWgA" (Vincent Lostanlen, @lostanlen, July 13, 2020)
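For readers who haven't met a Shepard-Risset glissando before, here is a rough sketch of one common way to synthesize it: a stack of sinusoids spaced an octave apart, all gliding upward together, each fading in at the bottom of the range and out at the top so that the rise never seems to end. This is not the code behind Vincent's example. The function name `shepard_risset` and every parameter value (6 octaves, a 55 Hz base, an 8 kHz sample rate, one octave per 2 seconds) are assumptions picked for illustration.

```python
import math

def shepard_risset(duration_s=2.0, sample_rate=8000, n_octaves=6,
                   base_hz=55.0, cycle_s=2.0):
    """Return float samples in [-1, 1] of a rising Shepard-Risset glissando (sketch)."""
    n_samples = int(duration_s * sample_rate)
    phases = [0.0] * n_octaves
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        value = 0.0
        for k in range(n_octaves):
            # Position of this partial in octaves above base_hz; it rises by one
            # octave every cycle_s seconds and wraps around at the top.
            pos = (k + t / cycle_s) % n_octaves
            freq = base_hz * 2.0 ** pos
            # Raised-cosine envelope over log-frequency: silent at both edges,
            # loudest in the middle, so the wrap-around is inaudible.
            amp = 0.5 - 0.5 * math.cos(2 * math.pi * pos / n_octaves)
            # Accumulate phase so the instantaneous frequency follows freq.
            phases[k] += 2 * math.pi * freq / sample_rate
            value += amp * math.sin(phases[k])
        out.append(value / n_octaves)  # normalize by the number of partials
    return out

# Half a second of the glissando; the highest partial stays below
# base_hz * 2 ** n_octaves = 3520 Hz, under the 4 kHz Nyquist limit.
signal = shepard_risset(duration_s=0.5)
```

To listen to it, the samples would still need to be scaled to integers and written out, for example with Python's built-in `wave` module.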
If you want to make your own, get the code here, or investigate below: