Event Details

A proxy scale for separating sound waves in natural settings using a music genre classifier built on a neural network architecture.

Presenter: Sukhbani Virdi
Supervisor:

Date: Fri, August 2, 2019
Time: 13:30:00 - 14:30:00
Place: EOW 230

ABSTRACT

The aim of this project is to study the similarity between music genres and the human voice in a real-world scenario. We use music genres as a scale for measuring the tone, tempo, and loudness of human interactions. The main reason for using music as a proxy for categorizing the human voice is the lack of any data set of this kind: it is very difficult to sort human voice interactions, with their varying accents, tones, and loudness, into well-defined classes. Music, by contrast, is already well categorized and covers a wide variety of sounds, from very low to very high tempo, loudness, and pitch. Obtaining categorized music is also straightforward; we covered nine genres of music to build our pseudo scale.
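To illustrate the kind of measurements such a pseudo scale could rest on, the sketch below extracts tempo, loudness, and tonal descriptors from a single audio clip using the librosa library. The file path, clip length, and choice of features are assumptions made for illustration; they are not taken from the project's actual pipeline.

import librosa
import numpy as np

def extract_features(path):
    """Return a small feature vector (tempo, mean loudness, mean MFCCs) for one clip."""
    y, sr = librosa.load(path, duration=30.0)           # load the first 30 s of audio
    tempo, _ = librosa.beat.beat_track(y=y, sr=sr)      # estimated tempo in BPM
    loudness = librosa.feature.rms(y=y).mean()          # mean RMS energy as a loudness proxy
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # timbral / tonal descriptors
    return np.hstack([tempo, loudness, mfcc.mean(axis=1)])

# Example usage with a hypothetical clip path:
# features = extract_features("clips/blues_001.wav")    # shape: (15,)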

This pseudo scale would then be used as a proxy to separate various kinds of human interaction. Our hypothesis is that loud voices will correlate with genres such as rock, pop, or hip-hop, while ordinary conversations at moderate volume will be associated with genres such as blues, Latin, or country. The project can be applied in various ways, such as building a mood detector on top of the pseudo scale to automate music genre selection, or building a security system in which the microphones installed in CCTV cameras are used to pinpoint where an altercation is taking place in a large compound.

We built the classifier with a neural network model and then evaluated it on human voices to test our initial hypothesis.
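The abstract does not specify the network architecture. As a minimal sketch of the approach, a small fully connected network over per-clip feature vectors (such as those in the sketch above) could classify clips into the nine genres; the layer sizes, feature dimensionality, and training setup below are illustrative assumptions only.

import tensorflow as tf

NUM_GENRES = 9    # the nine genres that form the pseudo scale
FEATURE_DIM = 15  # tempo + loudness + 13 mean MFCCs (see the earlier sketch)

# Small fully connected classifier mapping clip features to genre probabilities.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(FEATURE_DIM,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(NUM_GENRES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# With labelled clips X of shape (n, 15) and genre indices y of shape (n,),
# training and scoring a human-voice clip would look like:
# model.fit(X, y, epochs=30, validation_split=0.2)
# genre_probs = model.predict(extract_features("voice_clip.wav")[None, :])

Given a trained classifier of this kind, a human-voice recording would be placed on the pseudo scale by reading off which genres receive the highest predicted probabilities.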