Monday, March 29, 2010

Automatically Generated Music Playlists

I have a massive music collection. I just finished doing a scan, and there are nearly 20,000 files sitting around on my hard drive. I used to be fairly good at organizing and maintaining playlists, but with the sheer volume of music and the occasional need to move files around on the HD, it is hard to keep playlist files up to date.

I have been a fan of online services such as Pandora (when it was available internationally) and last.fm for finding new music. Through them, I have found plenty of non-mainstream artists that I would never have found while listening to the radio in the car, and I have considerably expanded my music collection as a result.

Relational Databases: Eggs and Tomatoes so Cakes and Tomatoes?

Maintaining music playlists is sort of like making a meal: some dishes go great together and others don't. This is a challenging problem, since you don't necessarily want to be mixing classical music with hard rock (unless you have something creative in mind).

As music libraries get large, the combinatorics of making suitable playlists grow faster than exponentially. For those of you who have studied advanced mathematics: just putting n songs in order can be done in n! (n factorial) ways. It makes natural sense that a computer would be useful in dealing with these kinds of problems. This is precisely the kind of itch that online music recommendation systems such as Pandora and last.fm are trying to scratch, from a different angle. What these guys are trying to do is recommend music that you *don't* have that would make a great addition to your collection. If you end up buying the song, they get a cut and make some money. The problem that I want to solve is this: given a library of music, automatically generate a set of music that I would like to listen to. I am willing to bet that these guys recognized the problem far earlier than I did and are trying to turn it into an online business.
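To put a rough number on the combinatorics mentioned above, here is a small Python sketch. The function names are my own and purely illustrative. It counts the ways to order a playlist, and the ways to pick and order a short playlist out of a bigger library. `math.perm` needs Python 3.8 or later.

```python
import math

def playlist_orderings(n_songs):
    """Number of ways to order n_songs distinct songs: n!"""
    return math.factorial(n_songs)

def ordered_playlists(library_size, playlist_length):
    """Number of ordered playlists of a given length drawn from a library,
    with no song repeated: n! / (n - k)!"""
    return math.perm(library_size, playlist_length)

# Even tiny playlists have a huge number of possible orderings.
for n in (5, 10, 20):
    print(n, "songs:", playlist_orderings(n), "orderings")

# Ordered 10-song playlists out of a 20,000-file library.
print(ordered_playlists(20_000, 10))
```

Nobody would brute-force this, of course. The point is only that trying every combination is out of the question, so any playlist generator has to rely on some kind of shortcut or heuristic.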

Unfortunately, Pandora and last.fm are recommendation systems, not services for managing your music collection. The next best thing that I've found so far is iTunes' "Genius," which I am experimenting with right now.

Though I do not know the exact mechanics of how "Genius" works, I do know that the system reads the ID3 tags of each song (album name, artist, genre and other information) to make a guess as to which other songs would go with it in a good playlist.

Without knowing anything about the science of playlist generation, I can propose at least one technique: use a Bayesian filter as a first pass at a playlist generator. If you can get access to the sequence of music that a person plays, you can calculate P(A|B), the probability of song "A" being played given that song "B" was played. Using this, you could make a crude filtering mechanism. The interesting thing about this technique is that it is already used in e-mail spam filtering: you build a database of words that usually occur in non-spam e-mail and compare incoming mail against it. If a mail doesn't fit your "taste," it is generally labeled as spam and removed. Other techniques, such as a Markov chain, are closely related, since they are built from the same kind of conditional probabilities.
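Here is a minimal Python sketch of that idea, under some assumptions of my own. The play history is just a list of song names in the order they were played, and I read P(A|B) as "the chance that A is the very next song after B," which makes this a simple first-order Markov chain. The function names and the toy history are made up for illustration. This has nothing to do with how Genius or any real service works.

```python
from collections import Counter, defaultdict

def transition_probabilities(history):
    """Estimate P(next song | current song) from a play history.

    history is a list of song names in the order they were played.
    Returns a dict: current song -> {next song: probability}.
    """
    pair_counts = defaultdict(Counter)
    # Count how often each song directly follows each other song.
    for current, following in zip(history, history[1:]):
        pair_counts[current][following] += 1

    probs = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        probs[current] = {song: n / total for song, n in followers.items()}
    return probs

def suggest_next(probs, current):
    """Return the most likely next song, or None if current was never followed."""
    followers = probs.get(current)
    if not followers:
        return None
    return max(followers, key=followers.get)

# Toy listening history (hypothetical data).
history = ["A", "B", "C", "A", "B", "A", "C", "A", "B"]
probs = transition_probabilities(history)
print(probs["A"])               # A was followed by B three times, by C once
print(suggest_next(probs, "A"))
```

With real data you would need far more history than this, plus some way to handle songs that have never been played. As a crude first pass, though, it is about as simple as it gets.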

More advanced techniques analyze the song itself, identifying musical instruments, beats and style. I consider this a "hard problem" and would not want to touch it, but people from the Music Genome Project have been working on it, and that project was the kernel that gave birth to the Pandora online music service.

What I don't like about iTunes

After giving Genius a spin, I do like it. It's not 100% perfect, but it is good at making playlists of standard music genres: Pop, Alternative/Rock, BritPop, House/Electronica and other fairly common mixes. However, it does miss some less common genres, like the break-beat Jazz/Funk/Jazz-electronica kinds of mixes that I enjoy. There is also the problem that I would like to keep some genres out of some mixes. For Downbeat House/Electronica, say, I generally like to keep hard Trance/Techno out of the collection. But given a playlist, I can probably make minor tweaks to make it acceptable.

The other thing about iTunes I don't like is that it is slow and clunky; it isn't as snappy as the traditional music players like winamp that I've used on Windows. I've also started using programs to add and fix ID3 music information so that it matches the data stored in online CD databases, which should make the auto playlist generation more effective. The programs I've been using are handy in that they can also automatically rename files to any format I like, meaning if I had an album saved just as "track 01, track 02" it could be converted to "track number - artist - song name." I have done this with a lot of the data I have already, but the problem is that after renaming these files, the file links in Genius get broken. Putting the renamed filenames into the database is easy, but the old entries still remain and I have had to remove them manually, which is a pain in the ass. Still looking for alternatives, but it's been an interesting journey into the world of automatically generated musical playlists.
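For what it's worth, the renaming and relinking steps described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions. The tag values are passed in by hand rather than read from real ID3 tags, the "library" is a plain list of file paths rather than the actual iTunes database, and the function names and example paths are hypothetical.

```python
import re

def build_filename(track_number, artist, title, extension=".mp3"):
    """Build a 'track number - artist - song name' filename from tag values."""
    def clean(text):
        # Replace characters that are not allowed in Windows filenames.
        return re.sub(r'[\\/:*?"<>|]', "_", text).strip()
    return f"{int(track_number):02d} - {clean(artist)} - {clean(title)}{extension}"

def relink(library, renames):
    """Return a new library list with every renamed path swapped for its new
    name, so that no stale entries are left pointing at the old files.

    library is a list of file paths; renames maps old path -> new path.
    """
    return [renames.get(path, path) for path in library]

# Hypothetical example: one file gets renamed, the other is left alone.
library = ["album/track 01.mp3", "album/track 02.mp3"]
new_name = "album/" + build_filename(1, "Some Artist", "Some Song")
library = relink(library, {"album/track 01.mp3": new_name})
print(library)
```

The idea is to do the rename and the relink in one step, so the old entries never get a chance to pile up.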
