Hello everyone, I’m Yang Yang (aka yyoung), an undergraduate student at Shanghai Jiao Tong University, China. I was honored to be accepted as a Google Summer of Code 2021 student with the MetaBrainz Foundation to work on improvements to the external links editor. I had a good time with the MusicBrainz dev team this summer, and it was a valuable experience for me. This is a final report and overview of my work.
Continue reading “GSoC 2021: Push the URL relationship editor to the next level”
Hi everyone, I am Akash Gupta, currently pursuing my undergraduate degree at Kalinga Institute of Industrial Technology. This summer, I participated in Google Summer of Code and developed a new feature, Series Entity, for the BookBrainz project.
I was mentored by Nicolas Pelletier (monkey on IRC) during this period. This post summarizes my contributions to the project and the experiences that I had throughout the summer.
Continue reading “GSoC 2021: Series Entity for BookBrainz”
The MusicBrainz Mobile App developers have been working at full capacity, improving the user experience, incorporating more features and functionalities, while making sure the core purpose of the app remains as promised.
Since its inception in 2010, the official MusicBrainz app has come a long way. The app is actively maintained and open for contributions. Development follows a systematic approach, and updates are released on a regular basis.
The most important revamp which has been worked on for the past few months is the Tagger feature available in the MusicBrainz Android App.
Functionalities like fetching the local album arts, searching through all your local music files at one go, retrieving the cover art from the server, and heading to the recording directly are some of the key highlights of the upcoming Tagger.
Picard has finally made an official entry to the MusicBrainz App where users can now send their releases to the original Picard desktop app with the click of a button. This has been worked on in collaboration with the Picard team and proper documentation on its usage will be shared soon.
The completely new addition of Listen and Critique showcases the functionalities of ListenBrainz and CritiqueBrainz websites natively from the app. Currently, these will be available as advanced features on the app.
A well-prepared Onboarding and About section will take you through every important detail of the app and make sure you are aware of all its functionalities.
Proper documentation of every feature is being prepared. The app is finally out in production, so do head to the stores and give it a try!
We are really excited to make the MusicBrainz App as user-friendly as possible for you, while we take care of all the wonder behind it!
Play Store: MusicBrainz – Apps on Google Play
I am Rohit Dandamudi, more commonly known as diru1100 in IRC and all other sites. I am currently doing my final year in Computer Science and Engineering at Chaitanya Bharathi Institute of Technology, Hyderabad. This summer, I had the wonderful opportunity to work with MetaBrainz Foundation and it’s my first time participating in GSoC. I worked on the SpamBrainz project under the guidance of yvanzo to make a step forward on eliminating spam in MusicBrainz.
How it started
I started by looking for some cool projects to apply to for GSoC. After going through some that were on the web development side, I finally got to know about the MetaBrainz Foundation. It was already pretty late (around 2½ weeks before the proposal deadline), and most of my fellow GSoCers were already in good rapport with the community by then. After looking through the project ideas, I wanted to do my project on CritiqueBrainz, but I later found out that it was not being considered this year. In the end, I liked the concept of SpamBrainz and how it involves a good combination of technologies (Deep Learning and Web Development). After browsing through the project, I understood what I could, tried to make some changes to the codebase, and was able to run the model and add some documentation. Finally, I submitted the proposal, which got accepted.
My proposal was focused on extending the work done by Leo as part of GSoC 2018. It mainly involved the following:
- Do the research and implement online learning to:
  - Update the model dynamically as new variations of editor spam accounts appear.
  - Make the model self-sufficient without depending on a particular file or a batch of data.
  - Explore different types of learnings that are applicable to enhance LodBrok and for better performance in production.
- Complete SpamBrainz API to:
  - Use and update the model with API calls.
  - Connect LodBrok with MusicBrainz Server.
- Do detailed documentation to make the project more public and involve more contributors.
LodBrok model improvements
- Challenges faced initially were:
  - There was little documentation.
  - No access to real data.
  - This made it a little difficult to understand the model: how it works, what parameters are present, what is considered spam or not, etc.
- To overcome this obstacle in the future, I wrote a dummy data generation script.
- Predicted with the model trained on the generated data and got 100% accuracy against the test data.
- Retrained the model to simulate online learning after doing a lot of research and considering the use case of LodBrok in MusicBrainz.
- Retrained the model with a simulation of treating a spam account as a non_spam account; it was able to predict according to the new learnings while still remembering the original non_spam accounts.
- Added detailed documentation about the model covering its usage, internal working, and how to replicate it locally, with the help of images and helper functions.
- Finally, with all these included, the spambrainz_ml repo has published its first release, v-0.1, with the necessary binaries.
- Here is a navigation diagram explaining which notebooks and datasets are connected and the relationship between them.
Research for model live update
- To implement the online learning part, I had to explore and test different methods with the generated dataset and LodBrok model. For this, I went through various resources such as Keras’ community forums, research papers, StackOverflow, courses, and blogs.
- A few of the interesting findings I tested out were:
  - Retraining the model
    - This seemed to be the most obvious and easy fix to upgrade the model.
    - This StackOverflow answer explains how retraining is done.
    - But some things to consider here are: a separate db has to be maintained to store the dataset, and it has to be constantly updated by SpamNinja.
    - This is not feasible over time.
    - A lot of work is done just to transfer the data.
  - Transfer Learning (TL)
    - Official Keras blog explanation of the feature extraction and fine-tuning methods of TL.
    - Transfer Learning mainly involves deriving a new model from a pre-existing successful model (LodBrok), a process known as feature extraction, to tackle similar cases.
    - I was inspired by the fine-tuning feature of Transfer Learning, which has a learning method similar to the one I implemented.
  - Online Transfer Learning (OTL)
    - This, as the name suggests, is a combination of online learning and Transfer Learning, which helps us define models that can learn to classify similar spam accounts in MetaBrainz.
    - This research explains OTL and its use cases in the production environment.
  - The concept of model drift:
    - This article explains how the model degrades over time, the reasons for this, and how to handle it without depending on the production environment.
    - This is useful to know, as it will be needed when the model is finally in production handling real data.
- Retraining the model
  - In the end, I decided to go with refitting the model with a slow static learning rate. This seemed to be the best solution for the following reasons:
    - No need to store editor details for false-positive and false-negative cases, respecting MetaBrainz’s data privacy rules.
    - The model won’t go through catastrophic forgetting (forgetting old learnings of what is spam or not) and will be able to learn new patterns in spam accounts over time.
    - The structure of the data isn’t changing over time (editor account fields remain the same).
  - Resources which helped me make this decision:
    - Keras community help discussion about the exact same problem (online learning in Keras for an LSTM model [LodBrok])
    - StackOverflow answer explaining catastrophic forgetting and the role of the fit function
    - Machine Learning Mastery blog article explaining the importance of the learning rate for a model
    - Reading research papers similar to this one covering online deep learning, and consulting professors.
- Incorporated the above research in the SpamBrainz API, which consists of 2 endpoints, namely:
  - /predict to return classification results by LodBrok for the editor accounts
  - /train to retrain the model with incorrect results sent to SpamNinja
- After discussing with Leo, I decided to implement the API using a combination of Flask and Redis. Going with Redis over RabbitMQ is feasible here, as the API is pretty lightweight and has at most 2 events.
- Documented the entire API, with internal working, steps to replicate, and images to understand the results obtained.
- Completed dockerization of SpamBrainz_API for easier integration and testing with MusicBrainz docker.
- This diagram explains the current workflow of the implemented API:
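To make the earlier decision to refit with a slow, static learning rate more concrete, here is a toy sketch. It is not LodBrok (which is an LSTM built with Keras): it swaps in a tiny logistic regression written in plain Python, and the two-feature data, class name, and learning rates are all made up for illustration. The point it tries to show is that calling fit again on a few new examples with a small learning rate nudges the existing weights instead of starting over, so earlier examples stay correctly classified.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyClassifier:
    """Logistic regression trained by SGD; a hypothetical stand-in for a real model."""

    def __init__(self, n_features):
        self.w = [0.0] * n_features
        self.b = 0.0

    def predict_proba(self, x):
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def fit(self, samples, labels, learning_rate, epochs):
        # fit() never resets the weights, so a second call continues
        # from whatever was learned before (the "refit" in the text).
        for _ in range(epochs):
            for x, y in zip(samples, labels):
                error = self.predict_proba(x) - y
                self.w = [wi - learning_rate * error * xi
                          for wi, xi in zip(self.w, x)]
                self.b -= learning_rate * error


# Made-up training data: label 1 = spam, label 0 = not spam.
old_x = [[0.0, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.0]]
old_y = [0, 0, 1, 1]

model = TinyClassifier(n_features=2)
model.fit(old_x, old_y, learning_rate=0.5, epochs=500)

# New feedback arrives later: one account reported as spam.
new_x = [[0.4, 0.5]]
new_y = [1]

before = model.predict_proba(new_x[0])
# Small, fixed learning rate: move towards the new example gently.
model.fit(new_x, new_y, learning_rate=0.01, epochs=20)
after = model.predict_proba(new_x[0])

print(f"spam probability for the new account: {before:.3f} -> {after:.3f}")
```

With a much larger learning rate in the second call, the same few examples could shift the weights enough to flip predictions on the older data, which is the forgetting problem mentioned in the list above.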
Challenges ahead and future of SpamBrainz
- The API has to be integrated with MusicBrainz and should undergo more testing with real live data. Currently, my focus is on this part.
- Extend online learning to other use cases in MetaBrainz through Transfer Learning and Online Transfer Learning.
- I am also looking forward to writing a research paper about the work done and eventually publishing it in IEEE Transactions, as I plan on using SpamBrainz as my final year major project.
Special thanks to…
- My mentor, YvanZo, for being incredibly patient with me, helping me create quality commits, and overall making me a better programmer. I have always learned something new in every interaction with him.
- LeoVerto, for helping me out whenever stuck and getting me up to date with the project.
- MetaBrainz Foundation, for creating an open, inclusive, and productive environment to build some amazing stuff.
Hi everyone, I am Prabal Singh, currently studying at the Indian Institute of Technology, Guwahati. This summer I participated in Google Summer of Code and developed a new feature – User Collections – for the project BookBrainz.
I was mentored by Nicolas Pelletier (Mr_Monkey on IRC) during this period. This post summarizes my contributions to the project.
Continue reading “GSoC 2020: User Collection for BookBrainz”
Hey! My name is Shivam Kapila (shivam-kapila on IRC) and I am a final year undergrad at National Institute of Technology Hamirpur. I have been working on the ListenBrainz project this Summer as a participant of the Google Summer of Code program. The past four months were full of fun, hacking and loads of music!!
Landing into the MetaBrainz Community!
My journey with MetaBrainz began in late January this year, when I introduced myself to the community. My first PR improved the developer documentation by adding parts on setting up the Spark infrastructure locally, along with consolidating and improving other bits of documentation. I delved into real code while implementing front end components for Deleting Listens. Over the next few months, I fixed various bugs: making the Importer Modal responsive, fixing the DB setup scripts, fixing pagination issues while browsing listens, handling stat calculation errors in the Spark Reader, and flushing user stats when users delete their listens.
As a GSoC applicant, I proposed to add various Listen Management features like love/hate (aka feedback) and deleting individual listens in ListenBrainz. I also proposed a new design for the Listens page. This involved a lot of designing and research, going through UI/UX design guidelines and tuning colors, shades and shadows till we arrived at a presentable and subtle design.
And finally I onboarded the GSoC train 🙂 .
Bonding with the community
I had been a part of the community since January so I was familiar with how things work in ListenBrainz. So I decided to contribute to the TimescaleDB migration where we moved our primary listen store from InfluxDB to TimescaleDB, opening up a ton of features for us to work on. Here is the final migration PR containing the commits of my contribution.
I also contributed to easing the testing infrastructure for devs to test the patches on their local setups. Following this I upgraded the postgres-client to PG12 version when we migrated to Postgres 12. I also fixed a minor font bug on the profile page.
The GSoC journey begins
Laying the base
As the official coding period began, I started working on my proposed tasks. The first question was: how to store the feedback? So I began implementing the database changes to store the recording feedback and applying the necessary changes in production. Following this I added a Python module to interact with the database and implemented a Pydantic model to validate the feedback records before they are stored in the database or served over the API. Then I added the necessary APIs to store and fetch the feedback for a given user or recording. This was followed by improving the efficiency of the DB module.
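As a rough illustration of what validating a feedback record before storing it can look like, here is a minimal sketch. It uses a standard-library dataclass as a stand-in for the Pydantic model, and the field names, the UUID assumption for the recording identifier, and the 1/-1 score encoding are all assumptions made for this example rather than the actual ListenBrainz schema.

```python
import uuid
from dataclasses import dataclass

# Assumed encoding for this sketch: 1 = love, -1 = hate.
ALLOWED_SCORES = {1, -1}


@dataclass(frozen=True)
class FeedbackRecord:
    """Hypothetical feedback record, validated when it is created."""

    user_id: int
    recording_id: str  # assumed to be a UUID string
    score: int

    def __post_init__(self):
        if not isinstance(self.user_id, int) or self.user_id <= 0:
            raise ValueError("user_id must be a positive integer")
        try:
            uuid.UUID(self.recording_id)
        except (ValueError, AttributeError, TypeError):
            raise ValueError("recording_id must be a valid UUID string")
        if self.score not in ALLOWED_SCORES:
            raise ValueError("score must be 1 (love) or -1 (hate)")


loved = FeedbackRecord(
    user_id=1,
    recording_id="8f3471b5-7e6a-48da-86a9-c1c07a0f47ae",
    score=1,
)
print(loved)
```

Because the checks run in the constructor, both the code that writes to the database and the code that serves the API can rely on any existing record already being well formed.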
I also worked on dumping the recording feedback in the ListenBrainz public dumps. Since ListenBrainz had migrated the stats calculation infrastructure from Google BigQuery to Apache Spark, I also removed the BigQuery references from the ListenBrainz website. Once the Timescale migration work had become stable, I began working on the Delete a Listen feature.
Pulling out the front end brushes
Now that the base was ready for us to work on, I started working on the React components so that the feedback and deletion feature could actually be presented on the website. Around the same time, the Timescale release day was also getting near, so I helped with a few tests and finished up the work for deleting listens. The front end components also started looking good and we were ready to associate the back end with them.
Rectifying & Reactifying
Before long, the final phase started. Now that we were ready with a few components, we needed some tweaks in some production components to make them subtle. Hence I shot an improvement PR to tweak some shadows, adjust some fonts, adjust the heights of the components, stick the footer to the bottom, and reactify the loading spinner. Then came the Listen Count Card denoting the number of listens for a user. Following this, we moved to a Card based design for displaying listens.
This was followed by the much awaited feedback controls, and now we can love/hate the songs from our listen collection. Isn’t this amazing! Some minor tweaks were needed to handle the ‘playing now’ listens correctly. At the same time, following the MetaBrainz guidelines to write quality code, I worked on making the SQL queries more readable. Then came the much awaited Delete a Listen feature, and now we can finally get rid of the embarrassing listens!!
I also addressed some high priority tasks like giving the users an option to download their submitted feedback as JSON. We noticed some UI glitches, and then came three back-to-back PRs to update the feedback control shades, improve the listen time text, and smooth out the deletion animation. This is what the listen list looks like:
Oh, now comes the time when we talk about the current scenario. The tasks currently on my radar are adding cover art support so that the page looks more alive, and improving the Spotify imports to only import listens that the user listened to after the latest Spotify listen we have for them.
After this I aim to work on the recommendation stuff that’s being actively pursued by the team. Also, Mr_Monkey and I have been working on some design concepts for the All New ListenBrainz. I am pretty excited to work on it. Wanna take a sneak peek?
A new fam
The journey with MetaBrainz has been so amazing, that I am so tempted to stick here. I feel ecstatic to be a part of GSoC with the best org 🙂 . The best part is – it’s never all about code. There’s a lot to gain. Each day marked gaining maturity and thinking more and more like a real developer. I started feeling at ease with the communicate → code → integrate chain. It really feels fortunate to be a part of the MetaBrainz family where everyone is a ping away ❤ .
GSoC marks the kickstart of my journey with MetaBrainz and I will be here lurking on IRC, shooting PRs to make the projects more and more awesome.
- Robert Kaye (ruaok) for being a mentor and a companion, guiding me through the dev life and real life.
- Param Singh (iliekcomputers) for always keeping the spirits high.
- Nicolas Pelletier (Mr_Monkey) for guarding me against Cascading Snot Swab issues.
- Alastair Porter (alastairp) for fishing out the best practices from his pool of intelligence.
- Vansika Pareek (pristine___) for some awesome playlists.
- Frederik “Freso” S. Olesen (Freso) & C. “CatCat” Holm (CatQuest) for the best end user perspectives and reviews.
- GSoC & MetaBrainz for such a wonderful experience.
Hey everyone! I am Ishaan Shah (ishaanshah), a sophomore at International Institute of Information Technology – Hyderabad, India. This summer, I worked on ListenBrainz as a participant in Google Summer of Code ’20. My project involved generating statistics and visualisations for users using Apache Spark. This blog is an overview of the work I did and my experience working with ListenBrainz.
I started contributing to ListenBrainz in January 2020. My first PR was for LB-179, a small Quality of Life improvement to the LastFM importer. My first major contribution was porting the LastFM importer to ReactJS. Over the next two months, I continued working on the frontend, where I mainly worked on improving the frontend infrastructure by adding support for automated testing, porting the codebase to TypeScript and standardising the frontend code using ESLint and Prettier.
After making a few patches, I understood how ListenBrainz worked and got comfortable with the codebase. I decided to make a proposal for adding statistics to ListenBrainz using Apache Spark. While writing the proposal, I referred to many other websites, blogs, as well as community discussions for different ideas about statistics which could be added. After some research, I narrowed down on the specific graphs and statistics that I wanted to calculate during GSoC.
Community Bonding Period
Since I had been working with the MetaBrainz community since January, I was familiar with how things worked in the community. So we decided to use the Community Bonding Period for fixing and updating the Top Artists charts for a user. The first task that I took up was to add an API endpoint for fetching the Top Artists data for a user programmatically. Until then, I had mostly spent my time working on the frontend, so this task helped me get familiar with the backend architecture. Next, I worked on porting the Top Artist graph from d3 to nivo – a charting library built with ReactJS and d3. The Top Artists graph only supported All Time statistics before, so I worked on adding support for more time ranges. This was the first time I worked with Apache Spark and the PR for this took quite some time, but it was essential that we got it right, as most of the statistics we built later would use a similar workflow. After we were satisfied with the overall flow of the data from our Spark cluster to the web server, I started working on showing the stats for different time ranges on the website. Although this task seemed easy at first, it took much longer than expected. We encountered some bugs and received some user feedback when we deployed the graph to production. The rest of this period was spent on incorporating the user feedback and fixing the bugs.
First Coding Period
We now had a somewhat stable pipeline for calculating the stats and sending them to the server. I started working on the backend for Top Releases stats for a user. We ran into memory issues when calculating these stats on the cluster, so I spent some time finding the cause of the issue and realised that we were collecting the results all at once, which was causing the driver to run out of memory. I fixed this by collecting the results for each user separately and tweaking some RabbitMQ parameters to make sure that messages aren’t dropped while sending them to the server (PR #897). After this, I added Top Recordings for a user. Now we had a brand new Charts page that displayed the user’s Top Artists/Releases/Recordings for different time ranges. Next I started working on temporal statistics for a user, i.e., the number of listens in a past time range. The query that I wrote for calculating this data turned out to be pretty inefficient for larger datasets. So I ended up writing two versions of the same query: one for large datasets and one for smaller ones. While working on displaying these stats on the frontend, I tried various representations of the data. I finally settled on displaying the data as bar graphs, as shown on this report view.
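The memory fix described above can be sketched in plain Python, without Spark or RabbitMQ. The idea illustrated here is only the shape of the change: results are produced and sent one user at a time instead of being gathered into one large structure first. The function names, the listen dictionaries, and the send callback are hypothetical and not taken from the ListenBrainz code.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter


def top_releases_per_user(listens, limit=2):
    """Yield (user, top releases) one user at a time."""
    ordered = sorted(listens, key=itemgetter("user"))
    for user, user_listens in groupby(ordered, key=itemgetter("user")):
        counts = Counter(listen["release"] for listen in user_listens)
        yield user, counts.most_common(limit)


def publish_stats(listens, send, limit=2):
    """Send one small message per user rather than one huge payload."""
    for user, top in top_releases_per_user(listens, limit):
        send({"user": user, "top_releases": top})


# Made-up listens for two users.
listens = [
    {"user": "a", "release": "X"},
    {"user": "a", "release": "X"},
    {"user": "a", "release": "Y"},
    {"user": "b", "release": "Z"},
]

messages = []  # stands in for a message queue in this sketch
publish_stats(listens, messages.append)
print(messages)
```

In this toy version the sort still touches all the data at once; in a cluster setting that work would be distributed, and only the per-user results would reach the process that sends messages.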
Second Coding Period
I added two more graphs in this period: Daily Activity and Artist Origins. The Daily Activity graph shows the number of listens a user has at a particular time of the day. I implemented the query for calculating this data in a slightly different way compared to the Listening Activity query. This change improved the query speed significantly. I had some trouble finding a correct way to represent this data. My mentor helped me in this by suggesting the usage of a Heatmap, and the results turned out to be pretty good.
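To give an idea of the data behind such a heatmap, here is a small sketch that buckets listen timestamps by weekday and hour. It assumes Unix timestamps interpreted in UTC and a weekday by hour grid; both are assumptions for illustration, and the real statistic is computed with a Spark query rather than with this code.

```python
from collections import Counter
from datetime import datetime, timezone

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def daily_activity(listened_at):
    """Count listens per (weekday, hour) bucket from Unix timestamps (UTC)."""
    counts = Counter()
    for ts in listened_at:
        moment = datetime.fromtimestamp(ts, tz=timezone.utc)
        counts[(DAYS[moment.weekday()], moment.hour)] += 1
    return counts


def to_heatmap_rows(counts):
    """One row per weekday with 24 hourly cells; empty buckets become 0."""
    return [[counts.get((day, hour), 0) for hour in range(24)]
            for day in DAYS]


# 1970-01-01 was a Thursday, so the first three timestamps fall on Thursday.
example = daily_activity([0, 60, 3600, 86400])
rows = to_heatmap_rows(example)
print(rows[3][:2])  # Thursday, hours 0 and 1
```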
Next, we worked on the Artist Origins graph, which provides an insight into the geographical diversity of a user’s musical taste. I had a lot of help from the ListenBrainz team for this graph and I couldn’t have done it without them. This was by far the most interesting stat that I worked on during the project. Furthermore, it laid a general framework to calculate statistics using the data from MusicBrainz. After deploying this map on production, we received feedback from the users that the map looked plain for most of them and there wasn’t much colour difference between different regions. This happened because people generally tend to listen to more songs from their home country, so there is a huge difference between the artist count of the top country and the average artist count of the other countries. We fixed this issue by changing the colour scale from linear to logarithmic.
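The effect of that change can be seen with a few made-up numbers. The sketch below maps artist counts to a colour intensity between 0 and 1 using a linear scale and a logarithmic one; the exact formula used on the site is not given in this post, so the log1p-based version here is an assumption.

```python
import math


def linear_scale(count, max_count):
    """Colour intensity in [0, 1], proportional to the count."""
    return count / max_count


def log_scale(count, max_count):
    """Colour intensity in [0, 1] on a logarithmic scale.

    log1p(x) = log(1 + x) keeps a count of zero at intensity 0.
    """
    return math.log1p(count) / math.log1p(max_count)


# Hypothetical artist counts: one dominant home country, a few others.
counts = {"home": 1000, "second": 10, "third": 5}
top = max(counts.values())

for country, count in counts.items():
    print(country,
          round(linear_scale(count, top), 3),
          round(log_scale(count, top), 3))
```

On the linear scale the second and third countries land at 0.01 and 0.005, which is practically the same colour as an empty country. On the logarithmic scale they move to roughly a third and a quarter of the full intensity, so the differences become visible.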
Final Coding Period
We now turned our attention towards calculating some stats for the whole website. We decided to make a graph for the Top Artists over different time ranges. We thought that this would be relatively easy given that we had already done something similar for individual users before. However, we hit an unexpected bump: the data we were calculating was not accurate, mainly because artist data came from various sources, and minor differences in an artist’s name or metadata resulted in separate entries with separate listen counts for the same artist. Moreover, we found a couple of users spamming our website for self promotion and we did not have a solid way to deal with this. Around this time, my college resumed and the amount of time I could dedicate to LB reduced severely. So we decided to use the remaining time to work on improving the frequency at which stats are updated. I have an open PR (#1052) for doing this at the time of writing, and we should be able to implement this functionality in the near future.
The past 4 months have taught me a lot of things. I learnt new technical concepts every day. I started writing code as a developer rather than a programmer. I understood the importance of proper unit and integration testing (even though it was my least favourite part while adding a new functionality). I also found it much easier to talk and interact with people both online and in real life. Frequent deployments of new features to production helped us a lot. We were able to catch bugs when we still had some context over the code written and also received feedback from the users about how we could improve the new features added. It also kept me motivated to keep working on new graphs and statistics and gave me a sense of satisfaction when I saw them on the production server. I also learnt that things don’t always go the way we expect them to. More often than not, you will run into some bumps while adding new features, so it is better to keep some extra time to deal with these issues.
GSoC gave me a wonderful opportunity to work with some amazing people from all over the globe. I was not able to complete all the graphs that I had planned for this summer, but I do plan to continue working on ListenBrainz to add more statistics and new features.
- Param Singh (iliekcomputers) for being an amazing mentor and helping me whenever I was stuck on an issue.
- Robert Kaye (ruaok) for providing some really insightful feedback and the MusicBrainz data that was required for calculating the Artist Origin map.
- Nicolas Pelletier (Mr_Monkey) for helping me with the frontend for the user Charts page and providing some amazing tips for ReactJS.
We’ve recently received our annual $30,000 support from Google. This brings the total amount donated to us by Google’s Open Source Programs Office to over $470,000; hopefully next year we’ll cross the half million dollar threshold!
I can’t quite express my gratitude for this level of support! Without Google’s help, especially early on, MetaBrainz may never have made it to sustainability. Google has helped us in a number of ways, including Google Code-In and Summer of Code — all of these forms of support have shaped our organization quite heavily over the past 15 or so years.
Thank you to Google and everyone at the Google Open Source Programs Office — we truly appreciate your support over the years!
For Starters… Who Am I?
My name is Aidan Lawford-Wickham, better known as aidanlw17 on IRC, and I’m entering my second year of undergraduate study in Engineering Science at the University of Toronto. This summer, I had the opportunity to participate in my first Google Summer of Code with the MetaBrainz Foundation. Working on the AcousticBrainz project under the mentorship of Alastair Porter (alastairp), I used previous work on measuring track to track similarity as the basis for a similarity pipeline using the entire AB database.
How Did I Get Involved?
When I started applying for GSoC, I needed to find an organization that paired a challenging learning environment with a project of personal interest. Given my own passion for listening to music, playing music, and exploring its overlap with culture, MetaBrainz quickly became my top priority. I jumped on the #metabrainz IRC channel for the first time, and I’ve been active daily ever since!
From there, the whole community welcomed me with open arms and responded thoughtfully to my questions about setting up my local development environment. I made my first pull request for AcousticBrainz, AB-387, which added the ability to include dataset and class descriptions when importing datasets as CSV files. This allowed me to work alongside my soon-to-be mentor for the first time and further acquaint myself with the acousticbrainz-server source code.
I was excited about my first PR and wanted to contribute more. Not only was this a project related to my passions, but it had already begun to teach me about technologies that I hadn’t used before. I was struck by the possibility to contribute more, and work with great people on a non-profit, open source project. I quickly decided that MetaBrainz was the only place I would apply for GSoC and began to think about proposals. I read through the previous work on recording similarity done by Philip Tovstogan, which was based upon a PostgreSQL solution with shortcomings in terms of speed. With a strong supporting background, high community interest, and my own dreams of the possibilities to come from predicting similar tracks, I created a proposal to build a similarity pipeline using Spotify’s nearest neighbours library, Annoy. The timeline and tasks shown on the full proposal were adjusted throughout the summer, but the general objectives were maintained. Looking back on the summer now, the basic requirements for the project were as follows:
- Using the previous work, define metrics for measuring similarity that will translate recording features from the AB database into vectors. Compute and store these vectors for every recording in the database.
- Create an Annoy index for each of these metrics, adding the metric’s vector for each recording to the index.
- Develop methods of querying an index, such as outputting nearest neighbours (similar recordings) to a specific recording or many recordings, or finding the similarity between two recordings.
- Allow users to query the indices via an API.
- Create an evaluation that allows us to measure the success of our indices in the public eye, fine tune our parameters, and display index queries via a graphical user interface.
Community Bonding Period
After losing sleep before the announcement, and a huge sigh of relief on May 6th, I was ecstatic to get started.
There was plenty of required reading, and I familiarized myself with the different elements of building similarity into AB. After discussing with Rob (ruaok) and Alastair and cementing our decision to use Annoy as the nearest neighbours algorithm of choice, I took to reading through Annoy documentation and making a small implementation to grasp the concepts. Annoy works blazing fast, and uses small, static files – these are points that would prove advantageous for us in terms of querying indices many times, as quickly as possible. Static index files allow for them to be shared across processes and could potentially make them simple to redistribute to others in the future – a major benefit for further similarity research.
I studied Philip’s previous work, gained an understanding of the metrics he used in his thesis, and reimplemented all of his code to better grasp the concepts and use them as a basis for the summer. Much of Philip’s work was built to be easily expandable, and flexible to different types of metrics. Notably, when integrating it with a full pipeline including Annoy, priorities like speed meant that we lost some of this flexibility. I found this to be an interesting contrast between the code structure for an ongoing research purpose, and the code ready to be deployed in production on a website.
All the while, I kept a frequent dialogue with Alastair to gel as a team, clarify issues with the codebase, and further develop our plans for the pipeline. To build on my development skills, learn more about contributing guidelines and source control, and improve the site, I worked on some exciting PRs during the bonding period. Most notably, I completed AB-406 over a series of 3 PRs, which allowed us to introduce a submission offset column in the low-level table to handle multiple submissions of a single recording. This reduced the need for complexity in queries to the API, decreasing the load on the server. Additionally, I added some documentation related to contributions and created an API endpoint that would allow users to only select specific features rather than an entire low-level document for a recording – aiming at reducing server load.
Last but not least, I got really involved with the weekly meetings at MB! We have meetings every Monday on #metabrainz to give reviews of the last week, and discuss any other important community topics. I love this aspect of the community. Working remotely, it creates a strong team atmosphere and brings us all a bit closer together – even if we’re living time zones apart. During one meeting, we discussed whether or not past GSoC proposals should be available to students. What do you think? This prompted me to share my own experience with the application process at MetaBrainz and look into if/how we could improve it.
… And so it began, we dove into the first coding period.
The Key Components, a Deeper Look
Computing Similarity Metrics
Having explored the previous similarity work from Philip, I used his definitions of metric classes and focused on developing a script to compute metrics for each recording in the database incrementally. Recognizing that we would also need a method of computing metrics for a single recording on submission, I made this script as open ended as possible. After successfully computing all metrics for the first time, we went through an iterative process of altering the logic and methodology to dramatically improve its speed. Ultimately, we used a query to get the batch of low-level recordings that haven’t had similarity computations, complete with their low-level data and all high-level models. Though we revised and found bugs in this script time and time again, I’m confident in saying that with perseverance we finally got it working.
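The incremental approach described above can be sketched roughly as follows. This is a minimal illustration and not the actual AcousticBrainz script: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table names (lowlevel, similarity), the column names, and the single "onset_rate" metric are all assumptions made up for the example.

```python
import json
import sqlite3


def compute_metric_vector(lowlevel_json):
    """Hypothetical metric: a one-element vector holding the onset rate."""
    document = json.loads(lowlevel_json)
    return json.dumps([document["onset_rate"]])


def compute_pending_metrics(conn, batch_size=2):
    """Process recordings that have no similarity row yet, batch by batch.

    Returns the number of recordings processed in this run.
    """
    processed = 0
    while True:
        # Fetch the next batch of low-level rows without a similarity row.
        rows = conn.execute(
            """
            SELECT ll.id, ll.data
              FROM lowlevel AS ll
         LEFT JOIN similarity AS s ON s.id = ll.id
             WHERE s.id IS NULL
          ORDER BY ll.id
             LIMIT ?
            """,
            (batch_size,),
        ).fetchall()
        if not rows:
            return processed
        conn.executemany(
            "INSERT INTO similarity (id, vector) VALUES (?, ?)",
            [(row_id, compute_metric_vector(data)) for row_id, data in rows],
        )
        conn.commit()
        processed += len(rows)


def make_demo_database(onset_rates):
    """Build an in-memory database with made-up low-level documents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lowlevel (id INTEGER PRIMARY KEY, data TEXT NOT NULL)")
    conn.execute("CREATE TABLE similarity (id INTEGER PRIMARY KEY, vector TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO lowlevel (data) VALUES (?)",
        [(json.dumps({"onset_rate": rate}),) for rate in onset_rates],
    )
    conn.commit()
    return conn
```

Because each batch only selects rows that are still missing from the similarity table, the script can be stopped and restarted at any time, and running it again after new submissions arrive only touches the new rows.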
Prior to the beginning of the project I had limited experience working with SQL databases, and this objective pushed me to develop new ways to approach problems, and gave me a much deeper understanding of PostgreSQL.
Building Annoy Indices
With all that vectorized recording data from the metrics computation, nothing sounds better than adding it to an ultra-fast index built for querying nearest neighbours! Feeding the data into an index and watching it output similar recordings in milliseconds became the most satisfying feeling. The Annoy library is a platform for nearest neighbours of all sorts, and it is generally simple: define the index, add items with an identifier and a vector, build the index, save it for later use, load it up, and then use its built-in methods to query for similar items. Easy, right? The added challenge is making this interface with recordings from our database as items, and meeting our needs in terms of speed and alterability when new items are added. Annoy is built without checks in many places, and we required a custom cycle of building, loading, and saving indices to ensure they were operable for our purposes (once an index is built, new items may not be added). At this point, the index model is open to saving new indices with different parameters, which allows us to tune as we further develop the pipeline.
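To make the "no new items after building" constraint concrete, here is a toy sketch of the lifecycle. It does not use Annoy: a brute-force search in plain Python stands in for it so the example runs without extra dependencies. The class name RecordingIndex and the method add_after_build are hypothetical and are not taken from the project's code.

```python
import math


class RecordingIndex:
    """Toy stand-in for an Annoy-backed index, using brute-force search."""

    def __init__(self, dimension):
        self.dimension = dimension
        self._items = {}      # integer item id -> vector
        self._built = False

    def add_item(self, item_id, vector):
        # Mirrors the constraint from the text: a built index is frozen.
        if self._built:
            raise RuntimeError("index is already built; rebuild it to add items")
        if len(vector) != self.dimension:
            raise ValueError("vector has the wrong number of dimensions")
        self._items[item_id] = list(vector)

    def build(self):
        self._built = True

    def add_after_build(self, item_id, vector):
        """Rebuild cycle: copy all items into a fresh index, add one, build."""
        fresh = RecordingIndex(self.dimension)
        for existing_id, existing_vector in self._items.items():
            fresh.add_item(existing_id, existing_vector)
        fresh.add_item(item_id, vector)
        fresh.build()
        return fresh

    def get_nns_by_item(self, item_id, n):
        """Return the n closest item ids, the query item itself included."""
        if not self._built:
            raise RuntimeError("build the index before querying it")
        query = self._items[item_id]
        ranked = sorted(
            self._items,
            key=lambda other: (math.dist(query, self._items[other]), other),
        )
        return ranked[:n]
```

A real index trades this exhaustive scan for approximate search over trees, which is where the speed comes from; the rebuild cycle is the part this sketch is meant to show.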
After wrapping the index in a class that interfaced with our needs, we added scripts to build all indices and save them, and scripts to remove indices if need be. Currently, the project has 12 indices, one for each metric in use:
- Weighted MFCCs
- Weighted GFCCs
- Onset Rate
Making API endpoints available was a high priority activity and was an exciting aspect of the project since it would allow users to interact with the data provided by a similarity pipeline. Using the index model, I created three API endpoints:
- Get the n most similar recordings to a recording specified by an (MBID, offset) combination.
- Get the n most similar recordings to a number of recordings that are specified (bulk endpoint).
- Get the distance between two recordings.
For each endpoint, a parameter indicates the metric in question, determining which index should be used. Currently, the endpoints also allow varying index parameters, such as the distance type (method of distance calculation) and number of trees used in building the index (precision increases with trees, while speed decreases).
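As a rough illustration of what the three endpoints compute, the sketch below shows the underlying lookups in plain Python. It is not the project's API code: the function names, the in-memory dictionary layout, and the use of exact Euclidean distance (instead of an Annoy index) are assumptions made for the example.

```python
import math


def _vectors_for(vectors_by_metric, metric):
    """Pick the data for one metric, mirroring the per-metric index choice."""
    if metric not in vectors_by_metric:
        raise ValueError("unknown metric: " + metric)
    return vectors_by_metric[metric]


def get_similar(vectors_by_metric, metric, recording, n):
    """Return the n recordings closest to one (MBID, offset) pair."""
    vectors = _vectors_for(vectors_by_metric, metric)
    query = vectors[recording]
    others = [candidate for candidate in vectors if candidate != recording]
    others.sort(key=lambda candidate: (math.dist(query, vectors[candidate]), candidate))
    return others[:n]


def get_similar_bulk(vectors_by_metric, metric, recordings, n):
    """Bulk variant: one result list per requested (MBID, offset) pair."""
    return {
        recording: get_similar(vectors_by_metric, metric, recording, n)
        for recording in recordings
    }


def get_distance(vectors_by_metric, metric, recording_a, recording_b):
    """Return the distance between two recordings under one metric."""
    vectors = _vectors_for(vectors_by_metric, metric)
    return math.dist(vectors[recording_a], vectors[recording_b])


# Made-up data: metric name -> {(mbid, offset): vector}. The MBIDs are
# placeholders, not real identifiers.
DEMO_VECTORS = {
    "onset_rate_demo": {
        ("mbid-a", 0): [0.0, 0.0],
        ("mbid-a", 1): [3.0, 4.0],
        ("mbid-b", 0): [1.0, 0.0],
    }
}
```

Keying the data by (MBID, offset) rather than by MBID alone is what lets two submissions of the same recording appear as separate items.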
A full explanation of the API endpoints is documented in the source code.
As I said, an index can be altered using multiple parameters that impact the build speed, query speed, and precision in finding nearest neighbours. Assessing the query results from our indices with public opinion is a top priority, since it gives us valuable data for understanding the quality of similarity predictions. With the evaluation we will be able to collect feedback from the community on a set of similar recordings – do they seem accurate, or should a recording have been more or less similar? What recording do you think is the most similar? With this sort of feedback, we can measure the success of different parameters for Annoy, eventually optimizing our results.
Moreover, this form of evaluation provides a graphical user interface to interact with similar recordings, as a user-friendly alternative to the API endpoints. Written using React, it feels snappy and fast, and I feel that it provides a pleasing display of similar recordings. At this point in the project I was glad to accept a frontend challenge which differed from the bulk of my work thus far.
Documentation and Project Links
Similarity pipeline related:
- The project set-up documentation (pull request)
- The full similarity pipeline complete with evaluation (pull request)
- Code only for computing metrics (pull request)
- Specific low-level features endpoint (pull request)
- Integrate submission offsets into low-level table (pull request 1, pull request 2, and pull request 3)
- Bulk get items with single database query (closed, unnecessary) (pull request)
- Bug tracking and git workflow documentation (pull request)
- Dataset and class descriptions for CSV import/export (pull request)
This summer allowed us to build on previous similarity work to the point of developing a fast, full pipeline. At this point, there is still a vast amount of work to be done on the pipeline and I am eager to see it through. In the upcoming year I plan to continue contributing to AcousticBrainz and the MetaBrainz Foundation as a whole. These are the areas that I’m interested in developing further for the recording similarity pipeline:
- Parameter tuning on Annoy indices
- Adding more metrics to cover other recording features
- Adding support for hybrid metrics that consider multiple features (this was started by Philip and should be integrated to provide more holistic similarity)
- Making indices available for offline use
- Creating statistics and visualizations of vectors for each metric
To say the least, this has been a highly rewarding experience. MetaBrainz is a community full of extraordinary, thoughtful, and friendly developers and enthusiasts. I will be forever thankful for this opportunity and the lessons that I gained this summer. I am so excited to meet everyone at the summit this September! I’d like to personally thank my mentor, Alastair Porter (alastairp), for his perceptive guidance, his support, his friendship, and his own contributions to the project. Thanks to Robert Kaye (ruaok) for his support, thoughts, and enthusiasm towards this project, as well as for his dedication to MetaBrainz. Thanks to Google for making this all possible – SoC is a highly unique opportunity to learn about open source software and make new connections! Cheers.
I am Anirudh Jain (Cyna on IRC), an undergraduate student at Bharati Vidyapeeth’s College of Engineering, New Delhi, India. I’ve been working on the MusicBrainz project of the MetaBrainz Foundation as a participant in Google Summer of Code 2019. This year marks the beginning of me as an Open Source developer. My work during the GSoC 2019 period can be found in my “temp” branch in my musicbrainz-server clone. The changes there will slowly get merged into the “cyna-gsoc” branch in the main musicbrainz-server repository on GitHub as they’re reviewed.