Here is the link to my GitHub: Click Here 

Master's Thesis on Generative Music

Raga Transfer using Generative Adversarial Networks - Master's Thesis 

I'm delighted to share that my thesis, "Raga Transfer Using Generative Adversarial Networks," has been published in the University of Miami's repository! This project blends my passion for music and technology, making the beautiful complexities of Carnatic music accessible to modern composers.

Music is a dynamic reflection of our cultural heritage. Inspired by the intricate melodies of Carnatic music, I aimed to bridge the gap between this ancient tradition and contemporary music using AI. My goal is to democratize access to Carnatic ragas, allowing composers from all backgrounds to explore and integrate these unique scales into their work.

Huge thanks to my thesis committee, Dr. Collins, Dr. Christopher Bennett, and Mr. Karthik, for their invaluable guidance and support. Special shoutout to Dr. Justin Mathew and the Music Computing and Psychology Lab of the University of York for their mentorship and collaboration. A heartfelt thanks to Mr. Alphons Joseph for his support and motivation throughout this project. And, of course, my heartfelt gratitude to my family and friends for their unwavering support.

Publication: https://lnkd.in/gcVcFqbF

Special thanks to Genis Plaja Roglans and the Music Technology Group for assisting me with accessing the Carnatic dataset and for their helpful guidance over email regarding preprocessing steps.

MIDIWeave - Gesture-based Expression MIDI Controller 

Demo.mp4

While working on a music production project, I found myself diving into a range of libraries and upgrading my software. However, one challenge kept surfacing: achieving smooth expression control. My MIDI keyboard's knobs occasionally produced sounds that felt a bit too mechanical or artificial for my taste.

To solve this, I built a dedicated gesture-controlled expression controller using computer vision! Now, instead of relying on hardware knobs, I can control expression more naturally with gestures like:

Closed Palm 👊 : Stops MIDI messages to maintain consistent expression.

Lasso Sign 👆 : Resumes MIDI control, allowing for real-time expression adjustments.

Expression Spread Control 🤲: A dynamic gesture that adjusts expression based on hand spread, mapped directly to MIDI values.

This setup allows for a fluid, authentic sound, giving me full, expressive freedom!
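The core mapping is straightforward: gestures gate the MIDI stream, and hand spread is scaled into the 0–127 controller range. Here is a minimal, hypothetical sketch of that logic (the real controller detects gestures with computer vision; the function names, thresholds, and the choice of CC#11 here are illustrative, not the project's exact implementation):

```python
def spread_to_cc(spread, min_spread=0.05, max_spread=0.30):
    """Map a normalized hand-spread distance (e.g. thumb-to-pinky,
    as a fraction of frame width) to a MIDI CC value in 0..127."""
    norm = (spread - min_spread) / (max_spread - min_spread)
    norm = min(max(norm, 0.0), 1.0)  # clamp to [0, 1]
    return round(norm * 127)

class ExpressionGate:
    """Closed palm pauses MIDI output; the lasso sign resumes it."""
    def __init__(self):
        self.active = True

    def on_gesture(self, gesture):
        if gesture == "closed_palm":
            self.active = False      # stop sending, hold expression
        elif gesture == "lasso":
            self.active = True       # resume real-time control

    def emit(self, spread):
        # Returns (status, controller, value) for a CC#11 expression
        # message on channel 1, or None while paused.
        if not self.active:
            return None
        return (0xB0, 11, spread_to_cc(spread))
```

The gate simply suppresses messages rather than sending a fixed value, so the synth holds whatever expression level was last received.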

Song Recognition and Recommendation System

Developing a song recognition and recommendation system based on audio fingerprinting and pattern discovery, combined with a deep learning model.

Here is the GitHub repo (only the sample framework is publicly accessible): https://github.com/navaneethskumar/Song-Recognition

The Song Recognition System is a web-based application that allows users to identify songs from short audio clips. The system follows a Shazam-like fingerprinting technique to recognize songs based on frequency peak analysis. It consists of a FastAPI backend for processing audio files and a React frontend for user interaction.

Currently working on the fingerprinting techniques on the backend. In the demo, I used SHA-256 hashing.
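The Shazam-style approach can be sketched in a few lines: pick out spectral peaks, pair nearby anchor/target peaks, and hash each pair with SHA-256. This is a simplified illustration, not the repo's actual code; the peak-picking rule, `fan_out` pairing, and token format are my own placeholder choices:

```python
import hashlib
import numpy as np

def find_peaks(spectrogram, threshold=5.0):
    """Collect local frequency-bin maxima per time frame (simplified)."""
    peaks = []
    for t in range(spectrogram.shape[1]):
        frame = spectrogram[:, t]
        for f in range(1, len(frame) - 1):
            if frame[f] > threshold and frame[f] > frame[f - 1] and frame[f] > frame[f + 1]:
                peaks.append((t, f))
    return peaks

def hash_peak_pairs(peaks, fan_out=3):
    """Hash (anchor freq, target freq, time delta) triples into
    SHA-256 fingerprints, stored with the anchor's time offset."""
    fingerprints = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            token = f"{f1}|{f2}|{t2 - t1}".encode()
            fingerprints.append((hashlib.sha256(token).hexdigest(), t1))
    return fingerprints
```

Matching then reduces to looking up query hashes in a database of song fingerprints and voting on the best time alignment.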

KALAM STANDARD 5B

In Kalam Standard 5B, I served as the composer for the original score, creating emotional soundscapes and producing background music to enhance character development and story progression. I also made a heartfelt rendition of Hum Honge Kamyab (We Shall Overcome), bringing a sense of inspiration and unity to the film. I composed a moving instrumental track for the end credits to leave a lasting impression on the audience. The film is planned for release on OTT platforms in March, promising to reach a wide audience with its impactful story and music.

SAMPA Research

SAMPA (Synchronized Action in Music-ensemble Playing and Athletics)

Working as a Research Assistant for the SAMPA project under the supervision of Dr. Tom Collins, Associate Professor in the Music Engineering Department.

More Projects Exploring the Intersection of Music and Artificial Intelligence

BachTransform

🎶 Excited to share my latest fun project, BachTransform! 🎹

BachTransform is a Music Transformer project exploring AI-generated music, specifically piano melodies in the style of the legendary composer Johann Sebastian Bach. 🤖🎵

Carnatic Bot.docx

CARNATIC BOT

A Carnatic music database interpreter using the OpenAI API

Audio Classification Using Convolutional Neural Networks

One of the obstacles in research on environmental sound classification is the scarcity of suitable, publicly available datasets. This project addresses that issue by presenting a neural network that classifies an annotated collection of 2,000 short clips spanning 50 classes of common sound events, alongside a unified compilation of 250,000 unlabeled auditory excerpts extracted from recordings available through the Freesound project. The project also evaluates classification accuracy on environmental sounds and compares it against selected baseline classifiers, using features derived from mel-spectrograms.
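The mel-spectrogram feature extraction feeding the network can be sketched in plain numpy. This is a minimal illustration; the FFT size, hop, and number of mel bands below are generic defaults, not necessarily the project's exact settings:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters laid over the FFT bin frequencies."""
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(signal, sr=22050, n_fft=1024, hop=512, n_mels=40):
    """Log mel-spectrogram: windowed STFT power mapped through the filterbank."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)
```

Each clip becomes a (frames × mel-bands) image that a CNN can treat like any 2-D input.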

Recording and Live Sound 

Hangli (Lizzie) Wu Classical Piano Recital:

https://www.youtube.com/live/NSHB5fCh8CE?si=7vzWnjbC45mEQf75

Recording Engineer


 

Alexander Waguespack, Classical Sax Recital :

https://www.youtube.com/live/k_S7tvY7Bt8?si=smRSZNuVc_GD62nq

Recording Engineer

 


SCI Concert:

https://www.youtube.com/live/DO-6k5YLlBg?si=haEtOWkhIMZBkTdW

Recording Engineer



 

Adriana Music Kids Recital 2024:

https://www.instagram.com/adrianamusickids

Recording Engineer

 

Michael Wu, Violin Recital:

https://www.youtube.com/live/psLCb4obCD0?si=eHWpdt4N8Jaoj-Ug

Recording Engineer

 

Gabriel Wallerstein Composition Recital:

https://www.youtube.com/live/8B1qM2kU0S0?si=qfgVdyBBtXxtmytk

Recording Engineer

 


Ke Xu, Classical Piano Recital:

https://www.youtube.com/live/KrWVAqs0vE0?si=kno-Pe7J99tg_zHC

Recording Engineer

 



Kamil Pacholec, Classical Piano Recital:

https://www.youtube.com/live/qBZRfaLCJjI?si=ijDX0y2uHnuz9doG

Recording Engineer


 

Samuel Strent, Composition Recital:

https://www.youtube.com/live/jE2YWs5fM_0?si=8XXXKzIX63FR74E8

Live Sound Engineer


 


Maria Joao Pires, Piano Recital:

https://events.miami.edu/event/maria_joao_pires

Recording Engineer



As a Recording and Live Sound Engineer at the Frost School of Music, I managed live sound recording and sound reinforcement for recitals and guest performances held at Gusman, Clarke, and Newman Halls. My responsibilities included setting up and operating professional audio equipment, ensuring optimal sound quality during performances, and recording live sessions for archival, live-stream, and production purposes. This role refined my expertise in live sound engineering and reinforced my ability to adapt to diverse acoustic settings and performance requirements.

PLUGIN UNIVERSE

GuitaRRRizz - a Guitar Plugin 

Built a plucked-string synthesizer using the Karplus-Strong algorithm in the JUCE framework. It supports polyphony for playing chords and is velocity-sensitive, controlling the dynamics of the notes.

Link to the demo

"to tweak the tone, modulate the amplitude, distortion, Delay and gain parameters to add more variation to the sound"

Available in VST3, AU, and as Standalone Plugin

The plugin's UI design is based on the Indian movie RRR.
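The heart of the plugin, the Karplus-Strong loop, fits in a few lines. The actual plugin is written in C++ with JUCE; this is a rough Python sketch of the algorithm itself, with the decay factor and burst scaling chosen for illustration:

```python
import numpy as np

def karplus_strong(freq, duration, sr=44100, decay=0.996, velocity=1.0):
    """Plucked-string tone: a noise burst circulating through a
    delay line with a first-order averaging (lowpass) filter."""
    n = int(sr * duration)
    period = int(sr / freq)                            # delay-line length
    buf = velocity * (2 * np.random.rand(period) - 1)  # excitation burst
    out = np.empty(n)
    for i in range(n):
        out[i] = buf[i % period]
        # average adjacent samples and apply decay (the KS lowpass)
        buf[i % period] = decay * 0.5 * (buf[i % period] + buf[(i + 1) % period])
    return out
```

Polyphony is just a sum of several of these strings, and velocity sensitivity comes from scaling the initial noise burst per note.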

GitHub Link: Click here


Binaural Panner

The Binaural Panner is a project I built to create an immersive 3D audio experience using SOFA files and HRIR data. I’ve always been fascinated by spatial audio, and this tool allows me to place sound sources at any point in a 360-degree space with control over both azimuth and elevation. Using the MIT KEMAR Normal Pinna dataset, I wrote algorithms to map specific positions to the closest HRIRs, which I then use to convolve with input audio for left and right ear channels. The result is binaural audio that feels much more natural and immersive.

To make it user-friendly, I created a simple GUI with Python’s Tkinter, where users can easily load SOFA and audio files, adjust the spatial parameters with sliders, and then save the output. This project not only helped me dive deeper into audio signal processing, but it also gave me a chance to work on a tool that could be useful in VR, gaming, and even music production—anything where 3D audio really adds to the experience.
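The two core steps, nearest-HRIR lookup and per-ear convolution, can be sketched as below. This is a simplified stand-in for the real tool: the synthetic `positions`/`hrirs` arrays here play the role of the SOFA/KEMAR data, and the distance metric is a basic squared-difference with azimuth wrapping:

```python
import numpy as np

def nearest_hrir(azimuth, elevation, positions, hrirs):
    """Pick the measured HRIR pair closest to the requested direction.
    positions: (N, 2) array of (azimuth, elevation) in degrees;
    hrirs: (N, 2, L) array of left/right impulse responses."""
    az_diff = (positions[:, 0] - azimuth + 180) % 360 - 180  # wrap azimuth
    dist = az_diff ** 2 + (positions[:, 1] - elevation) ** 2
    return hrirs[np.argmin(dist)]

def binauralize(mono, azimuth, elevation, positions, hrirs):
    """Convolve a mono signal with the selected left/right HRIRs."""
    h = nearest_hrir(azimuth, elevation, positions, hrirs)
    left = np.convolve(mono, h[0])
    right = np.convolve(mono, h[1])
    return np.stack([left, right])
```

The interaural level and time differences baked into the measured HRIRs are what make the result read as a direction rather than a simple pan.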

GitHub Link

Vocal Glider

Pitch Shifter: implemented using a phase vocoder; controls the pitch of the vocal.

Saturation & Bias: adds character and grit to the vocal. Users can apply distortion to the audio, opening up possibilities for edgier, non-conventional timbres.

Amplitude Modulation (Tremolo): a powerful tool for sculpting evolving, dynamic sounds. It lets users introduce amplitude changes over time, giving the audio an expressive edge.

Gain Control: crucial for achieving the right balance in the audio output.
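Of these stages, the tremolo is the simplest to show: the signal is multiplied by a slow sine LFO. A minimal numpy sketch (the rate and depth defaults are illustrative, not the plugin's):

```python
import numpy as np

def tremolo(x, sr=44100, rate=5.0, depth=0.5):
    """Amplitude modulation: scale the signal by a slow sine LFO.
    depth in [0, 1] sets how far the gain dips below unity."""
    t = np.arange(len(x)) / sr
    lfo = 1.0 - depth * 0.5 * (1.0 + np.sin(2.0 * np.pi * rate * t))
    return x * lfo
```

Keeping the LFO between `1 - depth` and `1` (rather than swinging around zero) means the effect only attenuates, so the output never clips louder than the input.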

GitHub Link: Click here

WaveStorm

"Ambient Sound Synthesizer developed using Juce. Handles Midi Events and create sounds using sine, square, triangle, and saw signal"

Available in VST3, AU, and as Standalone Plugin
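All four waveforms can be derived from a single running phase in [0, 1). A small Python sketch of that idea (the plugin itself generates these in JUCE's audio callback; these are naive, non-band-limited shapes for illustration):

```python
import numpy as np

def oscillator(shape, freq, duration, sr=44100):
    """Generate a basic waveform from a wrapped phase accumulator."""
    phase = (freq * np.arange(int(sr * duration)) / sr) % 1.0
    if shape == "sine":
        return np.sin(2 * np.pi * phase)
    if shape == "square":
        return np.where(phase < 0.5, 1.0, -1.0)
    if shape == "triangle":
        return 4.0 * np.abs(phase - 0.5) - 1.0
    if shape == "saw":
        return 2.0 * phase - 1.0
    raise ValueError(f"unknown shape: {shape}")
```

In a real synth the square and saw would be band-limited (e.g. PolyBLEP) to avoid aliasing, but the phase-accumulator structure stays the same.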

LaLa Delay

InShot_20230628_222844278.mp4

Developed a smooth delay plugin using JUCE Framework (C++) 

Smoothed the delay-buffer read index using linear interpolation, so the plugin can accept any floating-point delay time from the slider without producing crackling or popping artifacts in the output.
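The interpolated read is the key trick: a fractional index is split into its two neighboring integer samples and blended. A Python sketch of that one operation (the plugin does this in C++ inside the JUCE process block):

```python
import numpy as np

def fractional_delay_read(buf, read_pos):
    """Read a circular delay buffer at a fractional index using
    linear interpolation between the two neighboring samples."""
    n = len(buf)
    i0 = int(np.floor(read_pos)) % n
    i1 = (i0 + 1) % n                       # wrap at the buffer end
    frac = read_pos - np.floor(read_pos)
    return (1.0 - frac) * buf[i0] + frac * buf[i1]
```

Because the output changes continuously as `read_pos` sweeps, moving the delay-time slider glides between delay values instead of jumping sample-to-sample, which is what eliminates the clicks.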

Available in VST3, AU and as Standalone Plugin

Studio One - Marshmello-Alone 2023-04-21 18-19-04.mp4

Alone-Marshmello (recreation of sounds using Serum)

Yes, all the sounds you hear are designed using just Serum (except the vocals).

Keyword Spotting in Noise Using MFCC and LSTM Networks

Final Project_Navaneeth.docx

This project shows how to identify a keyword in noisy speech using a deep learning network. Keyword spotting (KWS) is an essential component of voice-assist technologies, where the user speaks a predefined keyword to wake a system before issuing a complete command or query. In this MATLAB project, a KWS deep network is trained on feature sequences of mel-frequency cepstral coefficients (MFCCs). The project also takes noisy environments into consideration, retraining the network on noise-augmented datasets. It uses long short-term memory (LSTM) networks, a type of recurrent neural network (RNN) well suited to sequence and time-series data, together with the Google Speech Commands Dataset to train the model. The main objective is to apply deep learning effectively to spot a keyword even in a noisy environment.
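The MFCC features that feed the LSTM are just a DCT-II applied to the log mel-band energies of each frame. The project itself is in MATLAB; here is a small numpy sketch of that step, with the coefficient count chosen for illustration:

```python
import numpy as np

def mfcc_from_log_mel(log_mel, n_coeffs=13):
    """MFCCs: a DCT-II over the log mel-band energies of each frame.
    log_mel: (n_frames, n_mels) array of log filterbank energies."""
    n_mels = log_mel.shape[1]
    k = np.arange(n_mels)
    # DCT-II basis: one row per cepstral coefficient
    basis = np.cos(np.pi * np.outer(np.arange(n_coeffs), 2 * k + 1) / (2 * n_mels))
    return log_mel @ basis.T
```

The DCT decorrelates the overlapping mel bands, so the first dozen or so coefficients compactly summarize each frame's spectral envelope, exactly the kind of compact sequence an LSTM handles well.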


Creating a Drum Loop using VCV Rack

1) Kick drum

2) Snare

3) Hi-hat

4) Bassline

5) Pad: use the mod wheel for the tremolo effect

6) Picky synth: to make the synth sound, press a key and use pitch bend to modulate its frequency

AUGMENTED CLOTHING

Augmented Clothing is the virtual equivalent of an in-store changing room. It enables shoppers to try on clothes to check size, fit, or style, but virtually rather than physically. The fitting room is based on Microsoft's Kinect 360 sensor, an innovative technology that provides a new way for humans and computers to interact. Here, we use hand gestures to virtually try on clothes and send a superimposed image to a website via FTP.

"Trying on clothes in stores today is one of the most time-consuming tasks. Usually long waiting periods have to be taken into account, for example when standing in front of full fitting rooms. Furthermore, additional time is lost when taking clothes on and off. Reducing this time and helping people to put on a large collection of garment in reduced time was a relevant motivation for this thesis."

Also, after the COVID-19 outbreak, it became risky to go out and purchase clothes from shops, so this is a remedial and efficient technology for enhancing online shopping platforms.

Please go through the video and PPT to get a better idea of the project.

Video Presentation

Augmented Clothing

 Block diagram