Show HN: Music Genre Classification App in Django

ianamartin · on June 11, 2017

This is really interesting, and it reminds me of a game I used to play with my brothers as a kid. We're all classical musicians, and we used to make mix tape challenges for each other when we were young. It would be a 1-second snippet of a piece of music followed by 4 seconds of silence. You had to name the piece before the next 1-second sample started to get a point.

If you didn't know the specific piece, you could still get a point by guessing the nationality of the composer and the time it was written within 10 years. The available time period for music was anything from Pythagoras' time to the present day (would have been late 80s at the time we were playing.)

It had to be reasonably describable as "classical" music. Not rock, jazz or blues or anything.

It was a hard fucking game for an 8-year-old, and we really worked each other over very hard.

I wonder if I could fork this and try to get it to tell the difference between, for example, Bach and Corelli, Mahler and Bruckner, or Zarlino and Palestrina.

Very cool project.

l1feh4ck · on June 11, 2017

I am glad that you find this project interesting. You can fork this project and find out how closely Bach and Corelli are related to classical music. Or if you give it a piece of music, e.g. `The Verve-Bitter Sweet Symphony`, it can say that it has a 28% chance of being classical, a 6% chance of being pop, and so on. Since the classical probability is the highest, the output label will be classical. You can use a music recognition API to get the metadata of the uploaded music and find song info like name, year, artist, etc.
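To make the labeling step concrete, here is a minimal sketch of picking the output label from per-genre probabilities. It is not taken from the project's code: the function name `predict_label` is hypothetical, and only the classical (28%) and pop (6%) figures come from the comment above. The other genres and values are invented so the example adds up to 100%.

```python
def predict_label(probabilities):
    """Return the genre whose probability is highest."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    # max() over the dict's keys, ranked by their probability values
    return max(probabilities, key=probabilities.get)


# Only "classical" and "pop" match the numbers quoted in the thread;
# the remaining entries are made up for illustration.
example = {
    "classical": 0.28,
    "rock": 0.22,
    "jazz": 0.18,
    "metal": 0.14,
    "blues": 0.12,
    "pop": 0.06,
}

print(predict_label(example))  # prints "classical"
```

Note that the winning label does not need a majority: 28% is enough as long as every other genre scores lower.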

ianamartin · on June 11, 2017

Thanks for the additional information. I'm looking forward to playing with it. I'm curious whether the mysvm library can learn very refined features that are easy for me to notice when listening but that I would have a hard time enumerating.

Things like the difference in timbre between a baroque violin and a modern violin, what pitch the strings (if any) are tuned to, the difference in group texture between a string quartet and a string orchestra, differences in contrapuntal texture between, say, Bach and Mozart, or the difference in harmonic styles between Italian and German music from the same time period.

For example, if you hear a string quartet, and it's tuned to A425 instead of A440, you're listening to a historically accurate recording of something most likely written in Germany in the 18th century. Probably Mozart or Haydn. If there's any meaningful melodic activity coming from anyone but the first violinist of the quartet, it's probably late Mozart or late Haydn.

Of course, if the quartet is tuned at modern pitches, it's a setback because most recordings are still done at modern pitches, so it could be anything from early Haydn to Shostakovich or George Crumb. So you'd have to rely on other features. And then some things are just impossibly difficult to infer. Fauré was French but mostly fascinated with contemporary German music from the Weimar school from an early age. Absent specific knowledge of his music, it would be insanely difficult to ferret out the harmonic and contrapuntal features that define French music of his time.

Music History is littered with edge cases like that. In the late 16th century there was a school of composition known for extreme chromaticism and hyper complex rhythmic structures. If you play some of this music for even well-trained professional classical musicians, almost all of them will be fooled into guessing that it's early 20th century atonal music from the Second Viennese school.

I can't think of any practical use case for this. I'm mostly just curious what the limits are for machine learning at this point. I've been "learning" these kinds of stylistic differences for 30 years now, though I mostly practice with the radio now on a classical station. Turn it on for one second, and everyone in the car gets a few seconds to guess. It's possible for a dedicated human to be good at this.

I guess I'm just curious what the current state of ML is at the moment with regard to a subject I'm intimately familiar with. I worked on real-time fraud detection at a credit card processing company years ago, but a fraudulent transaction is harder for me to relate to than a piece of music. I'm hoping that this turns out to be a good learning experience for me to be able to get a better handle on what happens when you tweak certain knobs in an ML model.

I'm also uncertain as to what makes for an apples-to-apples comparison between the classifier and a human tester. My first response was to not use an API for metadata because it seems unfair to be able to instantly look up the name and composer of a piece. But on the other hand, an experienced human with a lot of performing experience has basically absorbed all of that metadata. So it really is possible for a person to get a 90-95% or even better accuracy rate based on 1-second samples.

Anyway, that's enough of that. When I get a chance to play with this, I'll keep in touch about what I come up with. Thanks again for putting this out there.

brainless · on June 11, 2017

Does anyone find it odd that a music classifier app names the web framework in the title but not the classifying toolkit?

Django is a web framework, why does it matter so much in this project?

l1feh4ck · on June 11, 2017

While making this project, we first tried many classification algorithms to find the best one. We used Matlab for that. We called it mgc-matlab. Then we made a package in Python using a poly kernel SVM and called it mgc-python. Finally, when we made the web application, we named it mgc-django. Looks like we are not very good at nomenclature. What other name would you suggest?
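For anyone unfamiliar with the "poly kernel" part, here is a rough sketch of the polynomial kernel as it is commonly defined, K(x, y) = (gamma * <x, y> + coef0)^degree. This is not mgc-python's code. The function name is hypothetical, the parameter names follow the convention used by scikit-learn's `SVC`, and the default values are illustrative assumptions, not the project's settings.

```python
def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree.

    x and y are equal-length sequences of numbers (feature vectors).
    The defaults here are illustrative, not taken from the project.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


# Two toy feature vectors (made up; real audio features would be much longer)
print(polynomial_kernel([1, 2], [3, 4], degree=2))  # (1*3 + 2*4 + 1)^2 = 144.0
```

An SVM uses a kernel like this in place of a plain dot product, which lets it draw curved decision boundaries between genres without computing the higher-order feature combinations explicitly.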

brainless · on June 11, 2017

To me Django is not the core part of your project. Django could easily be replaced by any other Python web framework. The interesting bit is the SVM. I would stick to mgc-python. And in the title of the post/project perhaps highlight that instead of Django.

ubernostrum · on June 11, 2017

Same reason why you see so many posts of "(thing) for Rust", "(other thing) in Go", etc.

Some people are interested by the thing, others will be interested by what it was implemented with.

wyldfire · on June 11, 2017

> others will be interested by what it was implemented with

Agreed but I think the point is regarding which "with". Let's say your hot-dog-not-hot-dog project leverages glibc, linux, json, LLVM, scipy, pandas, OpenCV. When you write a title to feature your project on HN or similar, should you pick "hot-dog-not-hot-dog in OpenCV" or "hot-dog-not-hot-dog using glibc"? Yes, it's true that you couldn't have gotten it done at all w/o a libc. But that's probably not what you should lead with.

benjiza · on June 11, 2017

You might be interested in this challenge: https://multimediaeval.github.io/2017-AcousticBrainz-Genre-T...

(not affiliated)