ruaok – Page 19 – MetaBrainz Blog

Updating our privacy policy

A number of bugs have been piling up pointing out inconsistencies in our privacy policy. We’ve realized that the existing privacy policy is quite brittle with respect to our fast-paced server development cycle. I’ve edited the privacy policy in an attempt to future-proof it without eroding anyone’s privacy. The revised policy has already had some community review, and now I would like to cast the net to a wider audience for review.

However, I would like to set the stage carefully here: This is not a call to re-write our privacy policy or to nit-pick it to pieces. I am trying to nail down the open bugs and ensure that our privacy policy reflects our current actual activities. So, unless you find an actual problem that needs to be addressed, I am not likely to make any further changes to the revised policy.

That said, here is the list of bugs I’m trying to close: MBS-5708, MBS-5709, OTHER-144, OTHER-145, MBS-5942, MBS-5948.

Here is the new privacy policy that should fix the bugs above. Compare that to our current policy and also have a look at the diff of the wiki markup between the two pages.

PUIDs are deprecated and will be removed on 15 October, 2013

tl;dr: On 15 October, we’re going to: drop table PUID;

In 2006 we added support for PUID acoustic fingerprints from MusicIP. MusicIP went out of business some years ago and the PUID service has been passed along through various hands. Along the way it became neglected and the quality of the service went downhill. This spurred the creation of AcoustID, which is our preferred solution for fingerprinting inside MusicBrainz today. We set out to let AcoustID support and PUID support live side-by-side in MusicBrainz for a while, and we feel that almost enough time has passed. Therefore we’re going to remove PUID support from MusicBrainz in our autumn schema change release on 15 October, 2013.

If you depend on PUID support today, we encourage you to move over to AcoustID as soon as possible.

Official schema change notification for 15 May, 2013

We’re nearly done implementing the SQL portions of our tickets for the upcoming schema change on 15 May, 2013. We’ve settled on the following tickets that we plan to release:

MBS-5861: Dynamic work attributes
MBS-3978: Support more than one barcode on same release
MBS-4756: Move the wiki transclusion index to the database
MBS-799: Location, venue and event support
MBS-3985: Support multiple artist countries
MBS-4925: Add country of birth and country of death to Artist (person)
MBS-4115: Cover art archive: Support .png SQL changes
MBS-1839: Track MBID SQL changes
MBS-5809: Add a “description” field to collections
MBS-5314: Drop the work.artist_credit column
MBS-5302: Store International Standard Name Identifier (ISNI, ISO 27729) for artists and labels
MBS-5528: Change short_link_phrase to long_link_phrase
MBS-2229: Allow multiple release events per release
MBS-2417: Support multiple countries/regions on a single release
MBS-5772: Generate relationship documentation (semi-)automatically
MBS-5848: Instrument credits

Each of the tickets above will give you a complete idea of how we plan to change our schema on May 15th. Questions? Post a question in the comments and we will answer it.

Finally, Postgres 9.1 will be the minimum required version of Postgres going forward. I’ve spoken to many people about this and it seems that a large percentage of people are already using Postgres 9.1, so this should not be a major change.
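If you want to check an existing installation against this requirement, something like the following might help. It is only a sketch: it assumes you have already obtained the version string yourself (for example from `SHOW server_version;`) and that it is a plain dotted number such as "9.1.9". The function name `meets_minimum` is made up for this example.

```python
from typing import Tuple

def meets_minimum(server_version: str, minimum: Tuple[int, int] = (9, 1)) -> bool:
    """Return True if a dotted version string is at least `minimum`.

    Assumes a purely numeric string such as "9.1.9"; beta or release
    candidate strings like "9.2beta1" are not handled here.
    """
    major, minor = (int(part) for part in server_version.split(".")[:2])
    return (major, minor) >= minimum

print(meets_minimum("9.1.9"))   # True
print(meets_minimum("9.0.13"))  # False
```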

Thanks!

A Request for Feedback on the Upcoming "Changed MBID" Service

A common problem for users of MusicBrainz is that of synchronizing a local collection against the main MusicBrainz servers. Our current rate limit stipulates that you make at most 1 request per second, which we understand is extremely limiting – especially if you’re trying to fetch thousands of releases! During our first hack weekend, we created the beginnings of a service to allow you to get a list of MBIDs that have been updated. We have finished the preliminaries of this service, and now we need to hear from you how you’d want to utilize this.
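To put the rate limit into numbers, here is a rough back-of-the-envelope sketch. It assumes one web service request per release, which is a simplification; the function name `sync_duration` is made up for this example.

```python
from datetime import timedelta

def sync_duration(num_releases: int, requests_per_second: float = 1.0) -> timedelta:
    """Minimum wall-clock time to fetch every release once at the given rate.

    Assumes exactly one request per release and no retries.
    """
    return timedelta(seconds=num_releases / requests_per_second)

# A 10,000 release library takes close to three hours to refresh in full.
print(sync_duration(10_000))  # 2:46:40
```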

Change Logs

The most basic data we currently gather is a JSON document containing a list of MBIDs that have changed per hour. For each of our data replication packets, we generate a JSON packet that summarizes all of the MBIDs that have changed, either directly or indirectly (such as the addition of more relationships).

A “What’s Changed?” Service

The first piece of feedback we received was that people were not really interested in consuming this data stream, but would rather have a service that allows them to query what data has changed in a given window of time. Having to manually fetch packets and perform set intersections is not particularly difficult, but the more hoops people have to jump through, the less likely they are to even use the service. We’ve been pondering how best to implement this service, and we would like feedback on the following options:
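For reference, the manual approach of fetching packets and intersecting them with your own library could look roughly like this. This is a sketch only: the packet layout is not specified in this post, so the "mbids" key and the sample packets below are assumptions made for illustration.

```python
import json
from typing import Iterable, Set

def changed_in_library(packets: Iterable[str], library: Set[str]) -> Set[str]:
    """Union the MBIDs from every packet, then intersect with the local library."""
    changed: Set[str] = set()
    for raw in packets:
        # "mbids" is a hypothetical key; the real packet layout may differ.
        changed.update(json.loads(raw)["mbids"])
    return changed & library

# Two made-up hourly packets and a two-release local library.
packets = [
    '{"mbids": ["11111111-1111-4111-8111-111111111111"]}',
    '{"mbids": ["22222222-2222-4222-8222-222222222222",'
    ' "33333333-3333-4333-8333-333333333333"]}',
]
library = {
    "22222222-2222-4222-8222-222222222222",
    "44444444-4444-4444-8444-444444444444",
}
stale = changed_in_library(packets, library)
print(stale)  # only the release that is both changed and in the library
```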

Filter a list of MBIDs

The service would allow you to POST a set of MBIDs, and would in turn return the subset of these MBIDs that have been changed. You are able to specify any date and have all changes since that date. For example, you could find all changes to all releases in your library since you last checked 2 weeks ago.
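The intended behaviour can be modelled locally in a few lines. This is not the service itself, just a sketch of the contract described above; the `last_changed` mapping and the function name `changed_since` are invented for the example.

```python
from datetime import datetime
from typing import Dict, Iterable, Set

def changed_since(
    submitted: Iterable[str],
    last_changed: Dict[str, datetime],
    since: datetime,
) -> Set[str]:
    """Return the subset of submitted MBIDs whose last change is on or after `since`."""
    return {
        mbid
        for mbid in submitted
        if mbid in last_changed and last_changed[mbid] >= since
    }

# Made-up change times standing in for what the server would know.
last_changed = {
    "release-a": datetime(2013, 1, 20),
    "release-b": datetime(2012, 11, 2),
}
result = changed_since(
    ["release-a", "release-b", "release-c"],
    last_changed,
    since=datetime(2013, 1, 15),
)
print(result)  # {'release-a'}
```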

Because every MBID would take 36 bytes to submit, there will be a limit on the number of MBIDs that can be submitted in order to preserve bandwidth.

Provide client libraries

Rather than having people craft their own web service requests, MusicBrainz should provide a library to do this. This will allow us to use more advanced techniques (for example, Bloom filters) to both conserve bandwidth, and allow for larger queries. In this scheme the web service will be documented, but users are not expected to consume it directly.
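To give a feel for why a Bloom filter saves bandwidth, here is a minimal sketch. It is not a proposed wire format: the bit count, the number of hash functions and the use of SHA-256 are all arbitrary choices made for this illustration. The idea is that a client sends the compact filter instead of its full MBID list, and the server tests each changed MBID against it. False positives are possible; false negatives are not.

```python
import hashlib
import uuid
from typing import Iterator

class BloomFilter:
    """A small fixed-size Bloom filter over strings."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

# 1000 made-up MBIDs: 36,000 bytes as text, 1,024 bytes as a filter.
library = [str(uuid.uuid5(uuid.NAMESPACE_DNS, f"release-{n}")) for n in range(1000)]
bloom = BloomFilter()
for mbid in library:
    bloom.add(mbid)

print(len(library) * 36)  # 36000
print(len(bloom.bits))    # 1024
```

The price of the smaller payload is that the server may occasionally report a change for an MBID that is not actually in your library, so a client would still want to check the results against its own list.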

Support Both!

MusicBrainz could offer a simplified API, which is based on option 1, while also supporting larger queries through option 2. For example, we might limit option 1 to have a maximum of 4000 MBIDs per request/response, while the service that depends on our client libraries could handle many more.
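A client using the simplified API would need to split a large library into several requests. A rough sketch of that, using the example limit of 4000 MBIDs and the 36 bytes per MBID mentioned earlier (the names here are made up, and the byte count ignores separators and any request framing):

```python
from typing import Iterator, List, Sequence

MBID_LENGTH = 36              # bytes per MBID as text
MAX_MBIDS_PER_REQUEST = 4000  # example limit only, not a final number

def batches(mbids: Sequence[str], size: int = MAX_MBIDS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` MBIDs."""
    for start in range(0, len(mbids), size):
        yield list(mbids[start:start + size])

# Placeholder identifiers standing in for 10,000 real MBIDs.
library = [f"mbid-{n}" for n in range(10_000)]
sizes = [len(batch) for batch in batches(library)]

print(sizes)                                # [4000, 4000, 2000]
print(MAX_MBIDS_PER_REQUEST * MBID_LENGTH)  # 144000 bytes of MBIDs in a full request
```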

Allow filtering based on collections

MusicBrainz already has the concept of collections, which have an associated unique identifier, so these would be used to filter the list of changes. This limits the service to only deal with releases, and will require that people set up collections before they can do queries. Again, due to the possibility of large collections, there will likely be pagination on responses – though the per-page limit will probably be fairly high.

These are the ideas that we’ve been debating, and we’d love to know which of these would work for you. If you have other ideas, we’re also very interested in hearing what those are!

Housecleaning part 2: Moving our mailing lists

Part 2 in our housecleaning series concerns our mailing lists. Hosting mailing lists is quite a pain and we’d rather leave this pain to people who specialize in mailing lists. So, we are proposing to do the following things:

Remove the under-utilized list musicbrainz-italian.
Remove the musicbrainz-commits mailing list. Github (and similar sites) have better notification systems, so we don’t really need this list anymore.
Ask the Xiph Foundation to find a new home for the XSPF Playlist mailing list.
Remove the under-utilized musicbrainz-users list since the forums are predominantly used for end-user discussion. We’ll point people to the forums for those.

Finally, we would like to get some suggestions and feedback on where we should host our mailing lists. We’re considering:

Nabble: This has gotten mixed reviews from various users.
Librelist: This site is quite new and UI reservations have been noted about it.
Savannah: This site has many more features than just mailing lists. We’re not certain if we can move only our mailing lists here.
Google Groups: We’ve heard complaints about spam and spam fighting tools. Has this improved recently?

If you have any comments on any of these solutions or proposed list consolidation ideas, please let us know. Also, if you know of a cheap/free/good list provider that we didn’t list, please let us know!

Housecleaning part 1: Please help us create a new theme for our blog

We have one aging machine (scooby) that has been in continuous service since 2006. Back then we didn’t have as many options for hosting source code, mailing lists and blogs. Today, we have a lot more choice and we’re opting to host fewer things so that we can focus our energy on hosting MusicBrainz and not a bunch of ancillary stuff. Our goal is to retire scooby soon and move the services that run on that server elsewhere.

Our blog is the first thing to move: We’re moving it to wordpress.com and we’re nearly done with the move. But we don’t have a decent MusicBrainz theme for WordPress yet. If anyone is interested in taking an existing WordPress theme and making it a custom MusicBrainz theme, we would love your help!

If you’re interested, please leave a comment and we’ll get in touch with you to coordinate this process.

Thanks!

Please welcome AOL Music into the MetaBrainz ecosystem!

The continued economic turmoil persisted in 2012 and thus it was a slow year for adding new customers for MetaBrainz. However, we did add one high profile customer in 2012: AOL Music.

For a number of reasons we felt that it was prudent to get MusicBrainz integrated into AOL before making public news about it. Now the time is finally right to talk about our relationship with AOL and Winamp. I had been talking to Geno Yoham (GM of Winamp) and Lisa Namerow (GM of AOL Music) about MusicBrainz at various conferences for several years. Forging relationships with large companies takes quite a long time and the formation of our relationship was really no different. At the end of 2011 Geno, Lisa and team were ready to take action and surprised me by pledging a sizeable donation to the MetaBrainz Foundation. This donation was received early in 2012, about the same time that we signed the data license contract. And just last week we received another donation for 2012!! Thanks AOL and Winamp!

Early in 2012 AOL launched updated services underpinned by MusicBrainz data:

The Now Playing feature in Winamp allows a user to find out more about the artist that is currently playing in Winamp.
The AOL Music Artist pages also use MusicBrainz data to display discography information and to provide some of the links for the other content shown on those pages.

Our relationship with AOL follows a similar pattern to our relationship to the BBC. The BBC has done wonders for highlighting and lending credibility to MusicBrainz and I expect that our relationship with AOL will bring about similar benefits for MusicBrainz.

Thank you team AOL and especially to Geno Yoham and Lisa Namerow for believing in us!

We have a new community calendar

We’ve been scheduling more meetings for discussing various complex topics, but communication about those dates has not been clear. In order to fix this, we’ve created a community curated calendar:

http://calendar.musicbrainz.org

reosarevok, nikki, ian, ollie, warp and I can put things onto the calendar. If you have something you’d like to have added to the calendar, please ask one of these folks.

Preparing for the May 15th schema change release

It is time for us to start the process towards the next schema change release. Starting today and for the next two weeks, we’re going to seek people to be the champion (sponsor) of a ticket. If you feel strongly about a schema change ticket getting taken care of, you should consider championing this ticket. Once you’ve decided to adopt a ticket, you should assign the ticket to yourself.

Then, over the next two weeks it will be up to you to do the following:

Drive consensus around the core concept of the ticket. If you go through the process of working up a ticket, but no one agrees with what you’re proposing, you’ve wasted your time. Make sure that you get buy-in from others in the community. For instance, if Nikki doesn’t like it, chances are it’s not going to fly. 🙂
Each schema change feature requires two tickets: 1) An SQL ticket that implements the actual changes to the database and defines the queries used to fetch the data. 2) A UI change ticket that implements the UI portions of the schema change ticket.
Ensure that the ticket clearly states what needs to be done to implement the ticket. The ticket should essentially become or link to a requirements document. This requirements document should explain what the new feature should do. It should not explain how it should be done — we should leave the how to our developers who are going to implement the feature.
Provide as much supporting documentation as you can. Mock-ups for UIs are deeply appreciated (even if they delve into the how realm of things) and very useful for meaningfully discussing these tickets.
Have the ticket reviewed by a developer for clarity and completeness, then address any issues said developer may raise.

On 15 February, we’re going to look at the list of tickets that people have taken on and choose the ones that are clear enough to move forward. If you’ve done all the work outlined above, the chances are good that your ticket will be chosen to move forward. If your ticket is chosen to move forward, there will be more questions that the developers will raise; hopefully those can be tackled in the space of a week. After that we will take all of the well-defined tickets and schedule them for implementation. All the other tickets that are not clear enough to implement will be rejected and will have to make another pass through this process in the autumn.

If you’re still interested, here is the list of schema change tickets that should be considered for this.

We’re going to follow this schedule:

1 Feb: Schema change ticket selection starts
15 Feb: Select schema change tickets for implementation, start making tickets fully actionable
1 March: Tickets must be fully actionable. Tickets that are not actionable will be dropped from the 15 May release.
15 March: SQL tickets must be fully implemented.
1 May: UI tickets must be fully implemented, start final ticket testing phase
15 May: Release day

All of these dates have been added to our new community calendar.

IMPORTANT: Proposed changes to the data returned by our web service

Our current web service at the /ws/2 endpoint returns too much data in a lot of cases and in many cases we suspect that the programs making the calls to the service don’t actually consume all of that data. We’d like to reduce the amount of unused data our web service returns, in order to reduce our bandwidth costs. We propose that:

The web service will no longer include aliases and tags in relation elements. Regardless of what entity you may request, if the results of your request include a relation element, any alias or tag elements that are currently returned inside it will no longer be returned.
The web service will no longer include aliases and tags for the Various Artists artist anywhere, unless you specifically request Various Artists from the /ws/2/artist endpoint.
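If you want to see how your own client would cope with the first change, one option is to strip those elements from a saved response and run your code against the result. The following is only a sketch: the sample document is invented and heavily simplified, it omits the XML namespace that real /ws/2 responses carry, and the element names `relation`, `alias-list` and `tag-list` are assumptions about the response layout.

```python
import xml.etree.ElementTree as ET

STRIPPED_TAGS = ("alias-list", "tag-list")  # assumed element names

def strip_relation_extras(xml_text: str) -> str:
    """Remove alias and tag lists found anywhere inside relation elements."""
    root = ET.fromstring(xml_text)
    # Materialise the iterators first so the tree is not mutated mid-walk.
    for relation in list(root.iter("relation")):
        for parent in list(relation.iter()):
            for child in list(parent):
                if child.tag in STRIPPED_TAGS:
                    parent.remove(child)
    return ET.tostring(root, encoding="unicode")

# Invented, namespace-free sample response.
sample = (
    "<metadata><recording><title>Example</title>"
    "<tag-list><tag>kept</tag></tag-list>"
    "<relation-list><relation type='performer'><artist><name>Someone</name>"
    "<alias-list><alias>Somebody</alias></alias-list>"
    "<tag-list><tag>dropped</tag></tag-list>"
    "</artist></relation></relation-list>"
    "</recording></metadata>"
)
trimmed = strip_relation_extras(sample)
print(len(sample), len(trimmed))
```

Tags attached directly to the requested entity are left alone; only those nested inside a relation are removed.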

We’ve mocked up these changes in the following XML files:

We think that this will have a minimal impact on our web service users. If you use our web service, please tell us what you think about this. If you know someone who is using our web service, but may not read this blog, please forward a link to this post to them.

For more background on our research into this topic, please take a look at this document.