Potential Security Leak

What Happened?

On March 29th, 2013 we discovered that one of the MusicBrainz database dumps contained password hashes for a large portion of MusicBrainz accounts. While we don’t believe that these password hashes are either useful or widely distributed, we are requiring all users to change their passwords.

What Data Was Leaked?

bcrypt password hashes, with a cost parameter of 8, for all accounts as of March 25th 2013.

Why Did This Happen?

We recently began work on a long-standing ticket against MusicBrainz Server – MBS-357, “don’t store passwords in clear text”. We’re moving away from clear text passwords, and we’ve decided to use one of the current industry standards for hashing passwords – bcrypt. Using bcrypt means that MusicBrainz will store only the hash of each password, which in layman’s terms is a “fingerprint” of the password, and never the actual password. There are many hashing functions available; bcrypt is designed to be expensive to compute, with an adjustable “cost”, which makes it very hard to recover the original password via brute force attacks.

While this does mean that it’s hard to extract passwords from the hashes, it also makes the initial round of hashing, needed to move away from clear text, time consuming. As such, we built a small program that would gradually hash passwords over the course of a few days, so that the switch from clear text passwords to secure password hashes could be made with as little downtime as possible.

This script hashed the password into the bcrypt_password column for all editors, and would also be notified when users changed their password in order to update the hash. Unfortunately, our database dump scripts sanitize this data by excluding data after-the-fact, rather than declaring what data to dump before running the script. As such, they dumped the entire editor table including the new column, because we forgot to add a rule to exclude this column.
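
As a rough illustration of what an adjustable “cost” means, the sketch below uses PBKDF2 from Python’s standard library as a stand-in for bcrypt, which the standard library does not provide. The function names and the mapping from a bcrypt-style cost to an iteration count are assumptions made for this example; this is not the MusicBrainz migration script.

```python
import hashlib
import hmac
import os
from typing import Tuple


def hash_password(password: str, cost: int = 8) -> Tuple[bytes, int, bytes]:
    """Return (salt, cost, digest). Raising cost by 1 doubles the work."""
    salt = os.urandom(16)  # a fresh random salt for every password
    iterations = 2 ** cost  # assumption: bcrypt-style exponential cost
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, cost, digest


def verify_password(password: str, salt: bytes, cost: int, digest: bytes) -> bool:
    """Re-hash the candidate with the stored salt and cost, then compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 2 ** cost)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison
```

Only the salt, cost and digest would be stored. Checking a login means repeating the same expensive computation, which is also what makes each brute force guess expensive.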

Our Response

The database dumps that contain this data were promptly deleted, and have been replaced with correctly sanitized database dumps. Unfortunately logs from this server do show that this database dump was downloaded, and as we have no real indication of where this data now is, we’re treating this seriously. We have adjusted our database dumping scripts to be very specific about exactly which data they should export, so that in the future we will not leak private data by making the same mistake again.
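
The difference between excluding private columns after the fact and declaring up front what may be exported can be sketched as follows. The table and column names in the allow-list are hypothetical and the function is illustrative only; it is not the actual dump script.

```python
# Hypothetical allow-list: only columns named here may appear in a public dump.
EXPORTED_COLUMNS = {
    "editor": ["id", "name"],
}


def build_export_query(table: str) -> str:
    """Build a SELECT that names every exported column explicitly."""
    if table not in EXPORTED_COLUMNS:
        # A table without an export rule is refused rather than dumped whole.
        raise KeyError(f"table {table!r} has no export rule, refusing to dump it")
    columns = ", ".join(EXPORTED_COLUMNS[table])
    return f"SELECT {columns} FROM {table}"
```

With this approach a newly added column such as bcrypt_password stays out of the dump until someone deliberately adds it to the list, so forgetting a rule fails safe.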

We’re extremely sorry about this mistake, and while we don’t believe this data should allow attackers to retrieve user passwords, we can’t be 100% certain. As such, we require that all users change their password as soon as possible.

Search server fixes released

Last week’s search server release had some bugs that we decided should be fixed sooner rather than later. Paul Taylor rose to the challenge and fixed 4 important bugs and we just finished releasing the updated code. Thanks for your efforts, Paul!

Release Notes – MusicBrainz Search Server – Version 2013-04-04

Bug

  • [SEARCH-279] – Seach server returning wrong results
  • [SEARCH-280] – Artist search DAVID BOWIE → FRANZ SCHUBERT (score 100) !? Bowie (score 0)
  • [SEARCH-281] – If set explain=true option with dismax search it actually does a non-indexed search

Improvement

  • [SEARCH-267] – Create new rewrite method for Dismax FuzzySearch

Picard 1.2 released

Version 1.2 of Picard has been released today, with some notable features and bug fixes.

As announced recently, PUID support is being removed from MusicBrainz on October 15. This release of Picard has followed suit by removing all support for AmpliFIND/PUID scanning and submission. AcoustID is now the default (and only) method of scanning files. If you’re currently using AmpliFIND as your fingerprinting system, then after installing Picard 1.2 you must enable AcoustID in your options, under the “Fingerprinting” section.

Those who have the Picard cover art plugin installed will likely want to remove it. Picard 1.2 has built-in support for downloading cover images from a variety of sources, without the need for a plugin. Please see the new configuration under Options -> Cover Art.

Something important to note for people who build/package Picard is that Python 2.6 is now the minimum required version.

Special thanks to Wieland Hoffmann, Laurent Monin, Lukáš Lalinský, and everyone who’s helped contribute/report bugs for this release!

Picard 1.2 can be downloaded at:
http://musicbrainz.org/doc/MusicBrainz_Picard

Changes since 1.1:

  • Picard now requires at least Python 2.6
  • Removed support for AmpliFIND/PUIDs
  • Add support for the Ogg Opus file format
  • It’s now possible to download cover images without any plugin. Cover Art Archive images can be downloaded by image type
  • Improved directory scanning performance
  • Prefer already-loaded releases of the same RG when matching files
  • Allow dropping new files onto specific targets
  • Add basic collections management support (PICARD-84)
  • Allow adding custom tags in the tag editing dialog (PICARD-349)
  • Fix replacing of Windows-incompatible characters (PICARD-393)
  • Save both primary and secondary release types (PICARD-240)
  • Handle errors from the AcoustID service better (PICARD-391)
  • Accept HTTPS URLs on drag-and-drop (PICARD-378)

Updated search server now live

We’ve just updated our search servers to the latest version. Thanks to Paul Taylor for his long hard work porting our code to Lucene 4.1. Big thanks also to Murdos and Nikki for helping with this release!

Also, we’ve installed a small gigabit ethernet switch for our search server cluster so that we can move new search indexes around much faster. Hopefully we will see indexes updating in just over 2 hours from now on.

Read on for all the details of this release:

Release Notes – MusicBrainz Search Server – Version 2013-03-29

Bug

  • [SEARCH-239] – Search updater doesn’t update the index last-updated value returned by SEARCH-232
  • [SEARCH-244] – Since the October Schema Change Release, search server is now returning empty join phrases when once doesnt exist , whereas before it didn’t display it all
  • [SEARCH-247] – Runtime Exception: The property or field count on the class org.musicbrainz.mmd2.Medium$TrackList is required to be included in the propOrder element of the XmlType annotation.
  • [SEARCH-263] – Search server applications do not gracefully disconnect from PostgreSQL on termination

Improvement

  • [SEARCH-217] – Allow searching and displaying of folksnomy tags for the release entity
  • [SEARCH-246] – Extend support for searching for blank parameters to ISWC and ISRC

New Feature

  • [SEARCH-228] – Let Dismax Search for Labels search Label Code

Task

Updating our privacy policy

A number of bugs have been piling up pointing out inconsistencies in our privacy policy. We’ve realized that the existing privacy policy is quite brittle with respect to our fast-paced server development cycle. I’ve edited the privacy policy in an attempt to future-proof it while not eroding anyone’s privacy. The revised policy has already had some community review and now I would like to cast the net to a wider audience for review.

However, I would like to set the stage carefully here: This is not a call to re-write our privacy policy or to nit-pick it to pieces. I am trying to nail down the open bugs and ensure that our privacy policy reflects our current actual activities. So, unless you find an actual problem that needs to be addressed, I am not likely to make any further changes to the revised policy.

That said, here is the list of bugs I’m trying to close: MBS-5708, MBS-5709, OTHER-144, OTHER-145, MBS-5942, MBS-5948.

Here is the new privacy policy that should fix the bugs above. Compare that to our current policy and also have a look at the diff of the wiki markup between the two pages.

Server update 2013-03-25

We’ve just finished deploying changes from the last fortnight. This release is mostly bug fixes, but we also have a bit more validation when entering label URLs and some improvements to reports. Many thanks to Alastair Porter, Frederik “Freso” S. Olesen, Lukáš Lalinský, Nicolás Tamargo, Paul Taylor, Pavan Chander and the MusicBrainz team for their work on this release. Detailed changes follow:

Bug

  • [MBS-4364] – Various Artists should not have a gender (and other attributes)
  • [MBS-4868] – Regression : « → view aliases » link disappeared from alias edits
  • [MBS-5947] – Changed AC display doesn’t detect unchanged gaps
  • [MBS-5953] – oauth: web application reauthorization creates table spam
  • [MBS-5958] – Release editor did something very very wrong
  • [MBS-5979] – Automatic redirect to beta clear release editor seeding
  • [MBS-5996] – Release and work level relationships getting mixed up in release pages
  • [MBS-6012] – Internal server error in Wikipedia extract
  • [MBS-6016] – browse works by invalid artist id causes an error

Improvement

  • [MBS-2535] – Lookup labels based on Label Code when entering releases
  • [MBS-4894] – Reject non-BBC Music URLs when using the BBC Music relationship
  • [MBS-5491] – Check for “vs.” and “feat.” in PossibleCollaborations report
  • [MBS-5912] – Report of releases with cover art relationships

Task

  • [MBS-5975] – Add Anime News Network Encyclopedia to “Other Databases” whitelist
  • [MBS-5976] – Add Generasia.com to the “other databases” whitelist

The Git tag for this release is v-2013-03-25.

PUIDs are deprecated and will be removed on 15 October, 2013

tl;dr: On 15 October, we’re going to: drop table PUID;

In 2006 we added support for PUID acoustic fingerprints from MusicIP. MusicIP went out of business some years ago and the PUID service has been passed along, through various hands. Along the way it became neglected and the quality of the service went downhill. This spurred the creation of AcoustID which is our preferred solution for fingerprinting inside MusicBrainz today. We set out to let AcoustID support and PUID support live side-by-side in MusicBrainz for a while and we feel that almost enough time has passed. Therefore we’re going to remove PUID support from MusicBrainz in our autumn schema change release on 15 October, 2013.

If you depend on PUID support today, we encourage you to move over to AcoustID as soon as possible.

Official schema change notification for 15 May, 2013

We’re nearly done implementing the SQL portions of our tickets for the upcoming schema change on 15 May, 2013. We’ve settled on the following tickets that we plan to release:

  • MBS-5861: Dynamic work attributes
  • MBS-3978: Support more than one barcode on same release
  • MBS-4756: Move the wiki transclusion index to the database
  • MBS-799: Location, venue and event support
  • MBS-3985: Support multiple artist countries
  • MBS-4925: Add country of birth and country of death to Artist (person)
  • MBS-4115: Cover art archive: Support .png SQL changes
  • MBS-1839: Track MBID SQL changes
  • MBS-5809: Add a “description” field to collections
  • MBS-5314: Drop the work.artist_credit column
  • MBS-5302: Store International Standard Name Identifier (ISNI, ISO 27729) for artists and labels
  • MBS-5528: Change short_link_phrase to long_link_phrase
  • MBS-2229: Allow multiple release events per release
  • MBS-2417: Support multiple countries/regions on a single release
  • MBS-5772: Generate relationship documentation (semi-)automatically
  • MBS-5848: Instrument credits

Each of the tickets above will give you a complete idea of how we plan to change our schema on May 15th. Questions? Post a question in the comments and we will answer it.

Finally, we are going to require Postgres 9.1 as the minimum version of Postgres going forward. I’ve spoken to many people about this and it seems that a large percentage of people are already using Postgres 9.1, so this should not be a major change.

Thanks!

Server Update, 2013-03-11

Thanks to work from Nicolás Tamargo, nikki, Lukáš Lalinský, Paul Taylor, and the MusicBrainz team, we’ve just released a new version of the MusicBrainz website. As usual, this is mostly a bug fix release, but we do have one shiny new feature once again… OAuth2! It is now possible to authenticate with MusicBrainz using this, rather than the old digest auth system. For more details, see the Development/OAuth2 documentation page. This feature is new, so if you encounter any problems (here or elsewhere), please be sure to let us know!

Here’s a full list of what’s changed:

Bug

  • [MBS-4155] – Clicking twice the same relationship link → Internal Server Error
  • [MBS-4419] – Memcached should not be used for persistent data
  • [MBS-5358] – Can’t enter fuzzy ending date in the same year as the start date
  • [MBS-5829] – Internal server error when requesting /tracklist/ with an invalid ID
  • [MBS-5856] – Reorder medium edits use “Disc”
  • [MBS-5866] – “Direct database search” for a new artist is missing at least Gender and Country
  • [MBS-5873] – Modbot recommending to merge label with an artist
  • [MBS-5875] – Internal server error approving an edit with no votes
  • [MBS-5877] – Internal server error when adding a release with a non-existent release group MBID
  • [MBS-5878] – Internal server error when doing tag lookup
  • [MBS-5884] – Internal server error when viewing editing history
  • [MBS-5896] – Error in Spanish even though the interface language is German
  • [MBS-5908] – Some Wikipedia extracts not appearing on translated MB
  • [MBS-5914] – Tagger icons are not showing up in tables
  • [MBS-5915] – Add ISWC/IPI buttons display incorrectly
  • [MBS-5938] – sever-side warnings generated on a proper client request on /ws/2/
  • [MBS-5939] – User applications page shows empty tables when there’s nothing to display
  • [MBS-5945] – oauth2: entering an invalid URI then switching to “Installed Application” fails to submit (silently)
  • [MBS-5946] – oauth: web applications don’t remember authorizations for the same scope
  • [MBS-5950] – Crappy-written multiple-artist paragraph on FreeDB import
  • [MBS-5951] – Display the actions column for applications the same as elsewhere on the site
  • [MBS-5954] – Confirm revoking OAuth access

Improvement

  • [MBS-950] – Only allow permitted sites for some relationships
  • [MBS-5020] – Report:: recordings linked to same work more than once
  • [MBS-5218] – Report: Duplicate dated/undated relationships
  • [MBS-5839] – Relate To URL should auto-identify beatport as “can be purchased for download” relation.
  • [MBS-5940] – Remove fref=ts from Facebook URLs
  • [MBS-5949] – Update the wording about deleting accounts

New Feature

Sub-task

  • [MBS-2917] – Report:Releases with unknown track times

A Request for Feedback on the Upcoming "Changed MBID" Service

A common problem for users of MusicBrainz is that of synchronizing a local collection against the main MusicBrainz servers. Our current rate limit stipulates that you make at most 1 request per second, which we understand is extremely limiting – especially if you’re trying to fetch thousands of releases! During our first hack weekend, we created the beginnings of a service to allow you to get a list of MBIDs that have been updated. We have finished the preliminaries of this service, and now we need to hear from you how you’d want to utilize this.

Change Logs

The most basic data we currently gather is a JSON document containing a list of MBIDs that have changed per hour. For each of our data replication packets, we generate a JSON packet that summarizes all of the MBIDs that have changed, either directly or indirectly (such as the addition of more relationships).
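
A client consuming these packets directly would essentially intersect the changed MBIDs with the MBIDs it holds locally. The sketch below assumes a hypothetical packet layout, a JSON object mapping an entity type to a list of MBIDs; the real packet format is not described here and may differ.

```python
import json


def changed_in_library(packets, library_mbids):
    """Return the library MBIDs that appear in any of the change packets."""
    changed = set()
    for raw in packets:
        packet = json.loads(raw)  # assumed layout: {"release": [...], "artist": [...]}
        for mbids in packet.values():
            changed.update(mbids)
    # Set intersection: keep only the changes that concern this library.
    return changed & set(library_mbids)
```

Each packet covers one replication window, so a client that last synchronized some time ago would pass in every packet generated since then.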

A “What’s Changed?” Service

The first piece of feedback we received was that people were not really interested in consuming this data stream, but would rather have a service that allows them to query what data has changed in a given window of time. Having to manually fetch packets and perform set intersections is not particularly difficult, but the more hoops people have to jump through, the less likely they are to even use the service. We’ve been pondering how best to implement this service, and we would like feedback on the following options:

  1. Filter a list of MBIDs

    The service would allow you to POST a set of MBIDs, and would in turn return the subset of these MBIDs that have been changed. You are able to specify any date and have all changes since that date. For example, you could find all changes to all releases in your library since you last checked 2 weeks ago.

    Because every MBID would take 36 bytes to submit, there will be a limit on the number of MBIDs that can be submitted in order to preserve bandwidth.

  2. Provide client libraries

    Rather than having people craft their own web service requests, MusicBrainz should provide a library to do this. This will allow us to use more advanced techniques (for example, Bloom filters) to both conserve bandwidth, and allow for larger queries. In this scheme the web service will be documented, but users are not expected to consume it directly.

  3. Support Both!

    MusicBrainz could offer a simplified API, which is based on option 1, while also supporting larger queries through option 2. For example, we might limit option 1 to have a maximum of 4000 MBIDs per request/response, while the service that depends on our client libraries could handle many more.

  4. Allow filtering based on collections

    MusicBrainz already has the concept of collections, which have an associated unique identifier, so these will be used to filter the list of changes. This limits the service to only deal with releases, and will require people to set up collections before they can do queries. Again, due to the possibility of large collections, there will likely be pagination on responses – though the per-page limit will probably be fairly high.
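
To give a feel for the bandwidth argument behind option 2, here is a minimal Bloom filter sketch. It is only an illustration under stated assumptions: the class, the filter size of 10 bits per MBID and the 7 hash functions are choices made for this example, not a proposed wire format, and the MBIDs are synthetic.

```python
import hashlib
import uuid


class BloomFilter:
    """A fixed-size bit array that answers "possibly present" or "definitely absent"."""

    def __init__(self, size_bits: int, num_hashes: int) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


# Synthetic library of 4000 MBIDs (each is a 36-character UUID string).
library = [str(uuid.UUID(int=i)) for i in range(4000)]

bloom = BloomFilter(size_bits=10 * len(library), num_hashes=7)
for mbid in library:
    bloom.add(mbid)

raw_bytes = 36 * len(library)   # 144000 bytes to send every MBID as text
filter_bytes = len(bloom.bits)  # 5000 bytes to send the filter instead
```

A server receiving such a filter could test each changed MBID against it and return the matches. False positives are possible, so a client may occasionally be told about an MBID it does not hold, but an MBID that was added to the filter is never missed.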

These are the ideas that we’ve been debating, and we’d love to know which of these would work for you. If you have other ideas, we’re also very interested in hearing what those are!