GSoC 2017: Rating System in CritiqueBrainz

Hello!

I am Pinank Solanki, an undergrad at Indian Institute of Technology (IIT) Mandi, India. I worked with the MetaBrainz Foundation on one of its projects, as part of the Google Summer of Code 2017 over the last summer. It was one of the best and exciting summers I ever had.

Let me begin from the beginning. I first came to know about MusicBrainz in January and first contacted the community in February and was immediately hooked. Initially I decided to make a proposal for addition of book reviews for CritiqueBrianz, but it was not possible because the BookBrainz web service was unstable and the CritiqueBrainz’s host didn’t have direct access to BookBrainz database. So I tried to pitch my own ideas. But then, in one of the weekly meetings, I saw great support and enthusiasm among the community members for rating system for reviews —and I personally liked the idea of the project and thought it would be a great addition to CritiqueBrainz. I submitted my proposal, got accepted and a treat to the friends was due!

Overview

The aim of the project was to add support for three types of reviews: text, rating, text+rating (CB supported only-text reviews).

The schema changes and data-access functions are completed and merged. The frontend part is mainly completed including the fundamental functionality along with additional features. It took a lot of time to select and modify the rating input plugin perfectly satisfying the project’s needs. There is still some work to be done, most of which is based on the rating scale conversion in db package. Similarly, most of the web service part is completed and is held up due to the rating scale conversion PR.

Implementation

Schema changes

The schema changes done are quite different than what was mentioned in the proposal. My mentor for the project, Roman Tsukanov (Gentlecat), recommended some changes which would make keeping track of revisions a lot easier. You can see the schema here and the PR here.

Data-access functions

By the time I started working on the project, CB has migrated off the ORM. So, I wrote raw SQL queries and its tests. See the PR here. The rating scale was decided to be 1-5 but for storage a scale of 0-100 is used just like MusicBrainz keeping the possibility of migration of ratings from MB to CB in mind (more info at CB-245). This part is covered in the PR here.

Changes in user-interface

This plugin is used for rendering the rating star icons. The code can be seen in this PR. See the images below to get a good idea about the implementation.

Write review page:

cb-write-review

Review page:

cb-review

Entity page:

cb-entity-page

Revision comparison:

cb-revision-comparison

Web service

All the functionalities added to CritiqueBrainz had to be implemented in the web service (API) as well. All three types of reviews and other features are now supported via the web service. See the PR here.

Documentation

The chief part of the documentation was to update the schema. Other than that, rating parameter and several notes were added to the API documentation. See the PR here.

Other PRs relevant to the project can be found here.

Future work

First of all, I will complete the leftover work. Web service and frontend PRs are dependent on the rating scale PR. Once it gets merged, it’s 2–3 days of work to complete the rest.

Other than that, I look forward to keep contributing to CritiqueBrainz and other MetaBrainz projects. I am sure many interesting ideas will be discussed at the annual MetaBrainz Summit in Barcelona.

Conclusion

It was quite an eventful summer and GSoC was the biggest of them. Thanks to Roman for his constant help and guidance over the entire summer and also to all the other community members. It was so cool to work on an open-source project and I would definitely suggest for any music and data lover to explore the MetaBrainz projects.

GSoC 2017: Hacking on ListenBrainz

Namaste!

I am Param Singh, an undergraduate at the National Institute of Technology, Hamirpur, India, and I worked on ListenBrainz over the summer as part of the Google Summer of Code program. I started contributing code to ListenBrainz in January 2017 and have been working on new features and bug fixes since. I’ll be writing about the work I did and my experience working on LB in this blog post.

After a few of my patches had made it in and I was comfortable with the ListenBrainz codebase (which was a really nice example of software architecture for me), I talked with the LB team about what possible contributions I could make over the summer, and we decided that a Google BigQuery based statistics system is something that would be useful to have in ListenBrainz after we release a beta and have listen data that is permanently archived. I made a proposal for adding statistics to ListenBrainz which got accepted! During the community bonding period, we decided to try to get a solid and stable beta of ListenBrainz released before starting with the relatively large code additions that would be required by my project proposal. We tracked issues that we wanted fixed before a release in the MetaBrainz ticket tracker here. This work of fixing release blocking issues went into the coding period and we decided to continue working on a solid beta instead of adding new features for the time being.

I started with fixing bugs and adding new features to get a beta released as soon as possible. Some cool stuff I worked on during this time was dockerizing MessyBrainz (see PR here), migrating the codebases of MessyBrainz and ListenBrainz to Python 3 (PRs here and here) and improving the startup resilience of various parts of ListenBrainz to make sure that the server is able to self-heal (partially) if some part of it like RabbitMQ goes down (ticket here).

Later on, I did a big refactor of the LB code so that adding new modules would be easier in the future (PR here). I also spent a lot of time fixing bugs in our listen deduplication. Relevant pull requests for this are here and here.

Another feature I added to ListenBrainz while working on the beta was incremental imports. Earlier, LB didn’t keep track of previous imports of a user and did a full Last.FM import every time. However, now we keep track of the last time each user imported listens and only import new data since then. The PR adding incremental imports is here.

My mentor, Robert Kaye (ruaok) set up a test instance of the ListenBrainz server that was used by the community and as the community kept throwing their data at us, bugs kept popping up. A particularly weird bug caused LB to lose data for users with special characters in their usernames. The PR to fix this took a lot of time to create.

We kept on fixing bugs for a long time and the biggest thing I took away from this period of GSoC was the Ninety-ninety rule: «The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time.» This summer has drilled this into my mind.

As soon as the beta was released, I started with writing code for statistics, making schema changes (PR here) and adding some user stats (PRs here and here). I’ll be continuing on the stats work after Summer of Code. The basic foundation of stats is mostly done and soon I’ll start with showing statistics to the users.

By the end of the official GSoC coding period, I have made 266 commits in the ListenBrainz codebase and have opened a total of 111 pull requests. The current production ListenBrainz running on https://listenbrainz.org has 253 commits by me, most of which were made during the GSoC period.

Over the summer, I have fallen in love with the MetaBrainz community and have learned a lot of stuff. I’m really looking forward to adding more features to ListenBrainz soon, so that the data that the community is contributing becomes useful to everyone. I loved working on a really cool open-source project like ListenBrainz this summer and am very thankful to Google for providing me this opportunity. I would encourage everyone reading this to give the ListenBrainz beta a try and contribute to ListenBrainz if possible.

GSoC 2017: Directly accessing MusicBrainz DB in CritiqueBrainz

Hello, everyone! This summer was fantastic for me!
I’m Suyash Garg, an undergraduate at National Institute of Technology, Hamirpur and I participated in Google Summer of Code 2017 contributing code to CritiqueBrainz. Alastair Porter mentored me during this GSoC programme. This post summarizes my contributions to the project and experiences that I had throughout the summer.

I started contributing to CritiqueBrainz in January, 2017 and before the start of the SoC programme, I mainly worked on writing raw SQL for retrieving data from the CB database and replacing the ORM code (CB-230). Other than that I worked on issues like CB-120, CB-235 and other minor bugs and issues. They were my first proper contributions to the open source world. Thank you MetaBrainz!!

For the Google Summer of Code 2017, my project involved retrieving data related to various entities (release-groups, artists, releases, events and places) directly from the MusicBrainz database instead of querying the MusicBrainz web service (CB-231). This became necessary as some pages on CB required to fetch too much data and thus made many requests to the MB web service. These pages were taking a long time to load. Thus, by connecting directly to the database, we could reduce the load time of these pages.

Here is a summary of my contributions to the project during the summer:

Accessing the MusicBrainz database
New Infrastructure is allowing us to easily read data directly from the MusicBrainz database. For accessing the database in the development environment, another service running the MusicBrainz database was added which uses an existing Docker image which the MusicBrainz project was already using. This allowed us to share resources between projects. I worked on adding an option to download the database dumps and import the data into the database (see PR#523). Also, I added the service in CB docker-compose files and updated the documentation for setting up the development environment (see PR#115 and PR#92).
Fetching data using mbdata.models
After setting up the development environment, my mentor suggested to me to use the mbdata package for writing queries to fetch data from the database instead of writing raw SQL. I worked on retrieving information for the entity: places and added helpers for fetching the relationship information. Following that, I worked on retrieving information for other entities (release-groups, releases, events, and artists). Also, since SQLAlchemy makes lazy queries to the database, a number of queries were being issued to the database. This could slow things down as for each query it was going to require one trip to the SQL server (network trip in production). So, as suggested by my mentor, I also worked on reducing the number of queries made for fetching data related to each entity (see PR#135). For pages that made a number of requests to the web service, I made this PR#121 for fetching information related to multiple entities at the same time.
Testing
For testing, the database queries are mocked using the unittest.mock Python package. The tests added make sure that the code (serializing RowProxy objects to dictionaries, caching, etc.) works properly (see PR#134). Adding up a new service (as a separate Docker container) in the test environment and running tests was taking too much time (in creating the tables and truncating them). So as suggested by my mentors, mocking the database queries was the best option. Throughout my GSoC period, I learned how important it was to write tests (especially when you break things more when you fix something) and make them run fast. I learned that «If tests don’t run fast, they would be a distraction rather than a help» (quoting from the book “The Art of Agile Development” by James Shore).

Other than these, I also worked on some UI/UX issues, namely CB-80 (adding option to filter releases with reviews), CB-84 (ordering release groups according to release year) and CB-261 (authenticating requests to Spotify Web API). CB-130 (reviewing entities with MBID redirects – see PR#145) was also solved while solving a production server issue.

This summer was awesome for me. I learned a lot of new things and techniques for writing better code. Thanks to my mentors, Alastair Porter and Roman Tsukanov. Also, great thanks to the lovely MetaBrainz community and Google for this opportunity. I’m really looking forward to keep contributing to CritiqueBrainz and to dive into other MetaBrainz projects.

MetaBrainz has been accepted to Summer of Code 2017!

I’m pleased to announce that MetaBrainz has been accepted into the Google Summer of Code program for 2017!

If you are an eligible university student who would like to participate in Summer of Code and get paid to hack on a MetaBrainz project over the summer, take a look at our ideas page for 2017. If this sounds interesting, take a look at our getting started page.

We kindly ask that you carefully read the ideas page and the getting started page before you contact us for help!

Thanks and good luck applying!

Wrapping up Google Code‐in 2016

As Google has now announced the finalists of this round of Google Code-in, it feels only right that we also take a look back at what’s been taking up a large chunk of our time the last couple of months.

Divya Prakash Mittal (India – returning from last year!), Daniel Theis (Germany), and Tigran Kostandyan (Russia) all made it to the finalists, with Anshuman Agarwal (India) and Daniel Hsing (Hong Kong) additionally taking home the Grand Prize Winner positions! Congratulations to all of you and thank you so much for your contributions this year. 🙂 Continue reading “Wrapping up Google Code‐in 2016”

GSoC ’16 + ListenBrainz = fun :)

Hello,

I am Pinkesh Badjatiya and I have been working on ListenBrainz as part of GSoC ’16. I was largely involved in implementing the most requested features in ListenBrainz.
I began my journey with MetaBrainz not long before the Final Organization list was out. I started with MusicBrainz but moved quickly to ListenBrainz, and have been working on it since then.

About the project

The project consisted of creating a proxy scrobbling API similar to last.fm’s which could be used by existing desktop clients to submit listens to listenbrainz.org. I submitted my initial idea, that involved creating a new API along with few other optional features that were very much required (import, export, etc.).
The project made its way through the approval process, and I worked with ruaok (my mentor) & alastairp to get important things done. Yey!

Here are some of the snapshots of the my journey with ListenBrainz.

API_compat

ListenBrainz already had its own API which can be used to fetch/submit listens but all the existing clients that support scrobbling to last.fm use the ws.audioscrobbler.com’s API. To add support for these clients, I ended up creating a proxy API, api_compat (as in “compatible API”), that translates every request that is sent to “api.listenbrainz.org/2.0/” in the native format. This is an additional API which can be used along with the existing native ListenBrainz’s API.

This was largely the main goal of my project proposal. The instructions for scrobbling using Audacious are attached along with the source code.

Import lastfm-backup

The import page now allows users to import listens from the last.fm scrobbles or from the backup file which was downloaded from the older version of the last.fm website.
import_backup On successful import of listens from backup, you’ll get the following notification.import_success

Export listens

This allows users to export the listens from the listenbrainz.org website. This is useful for users who want to keep track of their listen history offline as well.
The export feature can be accessed from the drop-down menu.

export_dropdown_menuexport_page

Playing Now

With the support for API-Compat, the support for currently playing song was needed. This keeps the currently playing song on the website in sync with your favourite player.
playing_now

Import scraper uses audioscrobbler API

I also worked on updating the import scraper which now use the ws.audioscrobbler.com‘s API allowing users to import without opening their last.fm profiles. This also provides other useful track information to ListenBrainz.

Migrate to PostgreSQL

Another important change to ListenBrainz was how it stored listens. We moved from using Cassandra to PostgreSQL. Cassandra was fast and effective but getting more information other than the user’s listens (ex. generating statistics) was not possible. So we switched to Postgres + Redis. This opened more possibilities for future.

Experience

After 3.5 months, I ended up with 15 merged and 3 closed PR’s and a bunch of features for ListenBrainz that improved its look and feel.

My pull requests: https://github.com/metabrainz/listenbrainz-server/pulls?utf8=%E2%9C%93&q=is%3Apr%20author%3Apinkeshbadjatiya%20

I have worked on quite a lot of varied things in the past 4 months. A lot of them were actually not the part of the GSoC proposal but they were done largely in the same timeline or were optional targets, so I suppose they would count significantly towards GSoC.
I worked largely with alastairp, ruaok and Gentlecat. Gentlecat helped improve my coding style by providing feedback on my PR’s. I worked with alastairp and ruaok regarding the ideas/suggestions on how to address a problem and its possible solutions. It was a interesting experience working with the community and getting to know about MetaBrainz. Now that my understanding of the project and the community has increased, I look forward to making some great contributions!

Conclusion

In short, ListenBrainz went through a hell lot of changes in the past 4 months. If you were waiting for it to improve before using it, then now is the time that you should try it. I bet you’ll love its new look and you won’t be disappointed. 😀

Summer of Code ’16 with Picard

Hi! I’m Rahul Raturi, GSoC participant for Picard. This was my first GSoC, and it’s been a pretty awesome experience. Following is the overview of my project.

About the project

The outline of the project is to allow searching for albums, artists and tracks from within Picard. This avoids switching back and forth between web browser and Picard for searching, say release. If Picard fails to auto tag a file usual flow to tag the file with correct metadata is to first select the file, then click on “Lookup in Browser”, then search correct release, and load it into Picard by clicking the green “Tagger” button. In some systems, the “Tagger” button wouldn’t show, which was also a nuisance. With this patch, the entities can be searched and optionally loaded into Picard using built-in search dialogs, so no application switching.

Search dialogs

Picard already provides search options (through a web browser) for three entities; namely track, artist, and album. So I’ve built search dialogs for these three.

  1. Track Search Dialog — Searches for tracks and allows optionally loading corresponding album back into Picard. track_dialog
  2. Album Search Dialog — Searches albums and optionally allows loading the selected one into Picard. Screenshot from 2016-08-15 17-08-09
  3. Artist Search Dialog —  Displays basic information about the artists. To get more information about the selected artist, there’s an option to lookup him/her in browser. Screenshot from 2016-08-20 15-55-42

Searching similar tracks/releases

This is another important part of the project. Sometimes Picard fails to auto tag a file (or a cluster), or incorrectly tag it. These dialogs may prove useful here. To get expected data, right click on the file (should be in “Unmatched Files” cluster), and select “Search for similar tracks…”. The track search dialog would pop up, and expected release can be looked up there. Same procedure is for searching clusters.

Links to my work

Each PR is based on the previous one. A new dialog in each, plus some improvements to existing dialog. For trying the dialog, clone the artist search branch, until it gets merged into master. It has the most recent changes.

Note: To use these dialog for searching, an option in User Interface setting about built-in search needs to be enabled.

Conclusion

It was quite fun doing this project. Thanks to Michael Wiencek (mentor) for the guidance and leniency :). Also the Picard team for the reviews. I look forward to contribute more to Picard, now that I’ve a better understanding of the code. Also for another Summer of Code.

BookBrainz GSoC Gamification/Achievement System

Hi guys, I’m Max (AKA QuoraUK), a university student working with BookBrainz as part of Google Summer of Code. My project this summer has been to build a new gamification system, that introduces rewards for BookBrainz users and recognises their achievements. Here I will explain the system and the features I’ve implemented.

Overview

My original specification for the gamification system is here. To summarise, the idea behind gamification is to add game-like elements to the site in order to make it more engaging for users. The plan for the gamification of BookBrainz was:

  • Add badges and titles for users to earn on the BookBrainz site
  • Allow users to display badges and titles on their profile page
  • Encourage regular and high quality content

To implement this plan we have added 12 achievement tracks – once an achievement track is completed a title is unlocked. The artwork for the badges is currently “programmer art” and we are very open to other people designing replacements for them. This could be a part of this year’s Google Code-In. The achievements that will be available on launch are:

revisioncreator
Revisionist: Perform (1, 50, 250) Revision(s); Creator Creator: Create (1, 10, 100) Creator(s)
limitedpublisher
Limited Edition: Create (1, 10, 100) Edition(s); Publisher: Create (1, 10, 100) Publication(s)
pubcreatworker
Publisher Creator: Create (1, 10, 100) Publisher(s); Worker Bee: Create (1, 10, 100) Work(s)
runnerexplorer
Sprinter: Create 10 revisions in an hour; Fun Runner: Create a revision a day for a week; Marathoner: Create a revision a day for 30 days; Explorer: View (10, 100, 1000) Entities
timetrack
Time Traveller: Create an edition before it is released; Hot Off the Press: Create an edition within a week of release

All of these are unit tested and have unique badges for each tier on the track. If you would have already unlocked these achievements before the system was launched, you will earn them with your next revision/creation. Badge templates are available for developers to introduce new badges and adding achievements can be as simple as making a badge and adding a few lines of code.

Profile Page

profilednd
Profile Page, Drag and drop badge selector

The gamification system also brings some changes to the profile page. There is now a badge box which can contain your three favorite badges. Additionally, your selected title is shown next to your username. You can select your favorite badges in the new achievements menu on the profile, then drag and drop your favorites into the boxes. Titles can be selected by going to Edit Profile, and selecting them from the drop down menu.

Other Areas

2016-08-20_16-12-21
Achievement Alert

On creation of an entity or revision you will now see an alert if an achievement is unlocked. This will prompt you to go to your profile page and set the ones you want to display. Usernames in other areas of the site can be hovered over in order to see the title they have set.

Demonstration

Here is a demonstration video I’ve made for the system:


Continue reading “BookBrainz GSoC Gamification/Achievement System”

GSoC 2016 students and projects

Google announced the final list of Google Summer of Code 2016 students and their projects yesterday. The list of MetaBrainz’ projects can be seen at our page on the GSoC site, but just for good measure, here’s the rundown:

MusicBrainz
Jeff Weeks (weeksio) returns to finish up the SOLR search server. We’re really hoping that this will be the end of our current search server woes. He will be mentored by the German duo of Ulrich Klauer (chirlu) and Rob Kaye (ruaok).
MusicBrainz Picard
Rahul Raturi (rahulr) will be working on improving searching MusicBrainz from within Picard, mentored by MusicBrainz’ senior developer Michael Wiencek (bitmap).
BookBrainz
Max Prettyjohns (QuoraUK) is going to try and take on adding gamification to our fledgling book/literature database. He will be supervised by the BookBrainz project leads and lead developers Ben Ockmore (LordSputnik) and Sean Burke (Leftmost).
ListenBrainz
Pinkesh Badjatiya (armalcolite) has pledged to tackle adding a much requested feature for our youngest project: implementing a Last.FM compatible submission API. Robert Kaye (ruaok) will be the one guiding him along.
AcousticBrainz
Daniele Scarano (hellska) will be spending the summer writing a toolkit for creating datasets, which should help researchers using AcousticBrainz. He will be mentored by MetaBrainz software engineer Roman Tsukanov (Gentlecat).
Kartik Gupta (kartikgupta0909) has set out to create an offline client for computing AcousticBrainz dataset evaluations. Alastair Porter (alastairp), the AcousticBrainz project lead, will be their mentor.
Goran Cetusic (cetko), our final student of this year, will be exploring how AcousticBrainz data can be utilised within Google’s BigQuery storage under the guidance of Alastair Porter.

Congratulations and good luck to all our students! We’re looking forwards to following your progress over the summer and see what you end up with. 🙂

For all the students that applied but did not get accepted: we appreciate your applications, and even if you did not make the cut this year, we hope that you will stick around and apply with us again next year when we know you better – and you know us better.

For now, let the community bonding… begin! 🙌

MetaBrainz & GSoC 2016

Google announced who the chosen organisations for their Google Summer of Code were earlier tonight… And MetaBrainz was amongst them for yet another year (our 10th!)!

If you’re a prospective GSoC student who is hoping to apply with MetaBrainz this year, be sure to check out our guide to getting started and our currently suggested projects – but also note that we’d love for you to come up with your own suggestion! You may also want to take a look at the application template, which contains suggestions about
some of the things you should do and know before submitting your proposal.

For those of you who are already members of our community, you’re also welcome to add ideas to the suggestions page, and please be gentle when the GSoC hopefuls come along. 🙂

That’s really all for now, but expect to hear more about GSoC and our participation over the coming months!