A Request for Feedback on the Upcoming "Changed MBID" Service

A common problem for users of MusicBrainz is that of synchronizing a local collection against the main MusicBrainz servers. Our current rate limit stipulates that you make at most 1 request per second, which we understand is extremely limiting – especially if you’re trying to fetch thousands of releases! During our first hack weekend, we created the beginnings of a service to allow you to get a list of MBIDs that have been updated. We have finished the preliminaries of this service, and now we need to hear from you how you’d want to utilize this.

Change Logs

The most basic data we currently gather is a JSON document containing a list of MBIDs that have changed per hour. For each of our data replication packets, we generate a JSON packet that summarizes all of the MBIDs that have changed, either directly or indirectly (such as the addition of more relationships).
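
To make this concrete, here is a minimal sketch of how a client might check its own library against one such packet. The packet layout (a JSON object with a `changed` list) and the function name are assumptions made purely for illustration; the real packets may be structured differently.

```python
import json


def changed_in_library(packet_text, local_mbids):
    """Return the local MBIDs that appear in one change packet.

    Assumes (hypothetically) that the packet is a JSON object whose
    "changed" key holds a list of MBID strings.
    """
    packet = json.loads(packet_text)
    changed = {mbid.lower() for mbid in packet.get("changed", [])}
    local = {mbid.lower() for mbid in local_mbids}
    # Set intersection: MBIDs that are both in the packet and in the library.
    return sorted(changed & local)


# Made-up packet and library, for illustration only.
example_packet = json.dumps({
    "changed": [
        "11111111-1111-4111-8111-111111111111",
        "22222222-2222-4222-8222-222222222222",
    ]
})
library = [
    "22222222-2222-4222-8222-222222222222",
    "33333333-3333-4333-8333-333333333333",
]

print(changed_in_library(example_packet, library))
# → ['22222222-2222-4222-8222-222222222222']
```

A client following this approach would repeat the intersection for every packet published since its last check, which is the manual work that a query service would remove.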

A “What’s Changed?” Service

The first piece of feedback we received was that people were not really interested in consuming this data stream, but would rather have a service that allows them to query what data has changed in a given window of time. Having to manually fetch packets and perform set intersections is not particularly difficult, but the more hoops people have to jump through, the less likely they are to even use the service. We’ve been pondering how best to implement this service, and we would like feedback on the following options:

  1. Filter a list of MBIDs

    The service would allow you to POST a set of MBIDs, and would in turn return the subset of these MBIDs that have been changed. You would be able to specify any date and receive all changes since that date. For example, you could find all changes to all releases in your library since you last checked 2 weeks ago.

    Because every MBID would take 36 bytes to submit, there will be a limit on the number of MBIDs that can be submitted in order to preserve bandwidth.

  2. Provide client libraries

    Rather than having people craft their own web service requests, MusicBrainz should provide a library to do this. This will allow us to use more advanced techniques (for example, Bloom filters) to both conserve bandwidth, and allow for larger queries. In this scheme the web service will be documented, but users are not expected to consume it directly.

  3. Support Both!

    MusicBrainz could offer a simplified API, which is based on option 1, while also supporting larger queries through option 2. For example, we might limit option 1 to have a maximum of 4000 MBIDs per request/response, while the service that depends on our client libraries could handle many more.

  4. Allow filtering based on collections

    MusicBrainz already has the concept of collections, which have an associated unique identifier, so these will be used to filter the list of changes. This limits the service to only deal with releases, and will require people to set up collections before they can do queries. Again, due to the possibility of large collections, there will likely be pagination on responses – though the per-page limit will probably be fairly high.
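
For options 1 and 3, a client with a large library would have to split its MBIDs across several requests. The sketch below shows the arithmetic, using the 36 bytes per MBID mentioned above and the 4000-MBID figure floated as an example in option 3. That limit is not a confirmed value, and the helper name is hypothetical.

```python
import uuid

MBID_BYTES = 36               # length of an MBID in its canonical text form
MAX_MBIDS_PER_REQUEST = 4000  # example limit from option 3; an assumption, not a confirmed value


def batches(mbids, size=MAX_MBIDS_PER_REQUEST):
    """Yield consecutive slices of at most `size` MBIDs."""
    for start in range(0, len(mbids), size):
        yield mbids[start:start + size]


# A made-up library of 10,000 random MBIDs.
library = [str(uuid.uuid4()) for _ in range(10_000)]
chunks = list(batches(library))

print(len(chunks))                         # → 3 (4000 + 4000 + 2000 MBIDs)
print(MAX_MBIDS_PER_REQUEST * MBID_BYTES)  # → 144000 bytes of MBIDs in a full request
```

The byte count covers only the MBIDs themselves; separators and HTTP overhead would add to it.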

These are the ideas that we’ve been debating, and we’d love to know which of these would work for you. If you have other ideas, we’re also very interested in hearing what those are!
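
For readers unfamiliar with the Bloom filters mentioned in option 2, here is a minimal sketch of the idea. How a real client library would use one has not been decided. One possibility, assumed here, is that the client sends a compact filter describing its library, the server tests each changed MBID against it, and the client discards the occasional false positive. The class, its parameters, and the sizing (10 bits per MBID, 7 hash functions) are illustrative choices, not part of any MusicBrainz API.

```python
import hashlib
import uuid


class BloomFilter:
    """A tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive the bit positions from two halves of one SHA-256 digest
        # (the "double hashing" trick), so only one hash call is needed.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# A made-up library of 4000 random MBIDs, at 10 bits per MBID.
library = [str(uuid.uuid4()) for _ in range(4000)]
bloom = BloomFilter(num_bits=10 * len(library), num_hashes=7)
for mbid in library:
    bloom.add(mbid)

print(len(bloom.bits))                   # → 5000 bytes, versus 144000 for the raw MBIDs
print(all(m in bloom for m in library))  # → True: no false negatives
```

The trade-off is that an MBID outside the library will occasionally match the filter, so the client still needs to check any returned MBIDs against its real list.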

Server Update, 2013-02-25

Thanks to work from Ben Ockmore, Frederik “Freso” S. Olesen, Michael Wiencek, Nicolás Tamargo, nikki, Sean Burke, and the MusicBrainz team, we’ve just released a new version of the MusicBrainz website. As usual, this is mostly a bug fix release, but we do have one shiny new feature… collection subscriptions! It is now possible to subscribe to your own collections, or other public collections, and receive daily emails summarizing edits recently made that might affect releases in these collections. This feature is new, so if you encounter any problems, please be sure to let us know!

Here’s a full list of what’s changed:

Bug

  • [MBS-3514] – Part of an edit was applied but not shown in preview nor in edit history
  • [MBS-3952] – Last Updated time is wrong on the Statistics page
  • [MBS-4543] – Seeding the create relationship page ignores some parameters
  • [MBS-4774] – Edit to swap two recordings creates a blank edit
  • [MBS-4947] – /edit/open won’t load
  • [MBS-5016] – Buttons misaligned in Firefox
  • [MBS-5309] – Not all artist credit changes are shown in tracklist edits
  • [MBS-5388] – Relationship editor allows linking an entity to itself
  • [MBS-5397] – Edit medium’s diff (insert track) : shows track duration change where there are none
  • [MBS-5402] – Country (and language) dropdown is sorted by English alphabet
  • [MBS-5461] – No space between releases in collection and button to remove them
  • [MBS-5520] – Translation problem on disc IDs page
  • [MBS-5547] – Relationship Editor: Can’t unselect orchestra type
  • [MBS-5560] – Internal server error (after creating recording-work relationship via relation-editor)
  • [MBS-5565] – Link to add new label is missing when there are results
  • [MBS-5677] – URL MBIDs missing from the webservice
  • [MBS-5726] – Internal server error when requesting /ws/2/discid in JSON
  • [MBS-5765] – Very long releases have bad release length parameters
  • [MBS-5787] – i18n: “This beta test server allows testing of new features with the live database” is untranslatable
  • [MBS-5799] – Relationship editor doesn’t allow adding multiple relationships with different attributes
  • [MBS-5815] – Use “Died” in edit artist page like on sidebar
  • [MBS-5819] – Instrument select in relat. editor requires accents
  • [MBS-5821] – i18n: work types are untranslatable inside the relationship editor
  • [MBS-5824] – Internal server error loading Wikipedia extract for page name containing &
  • [MBS-5825] – Internal server error when requesting an artist with a really large ID
  • [MBS-5828] – Internal server error on /chrome_frame
  • [MBS-5836] – Relationship editor accepts POST data containing mismatched entity/link type combos
  • [MBS-5842] – “Position” is untranslatable in add_cover_art.tt
  • [MBS-5880] – Internal server error for Wikipedia extract when Wikipedia page is a redirect
  • [MBS-5889] – Subscription summary test fails, does not check collection subscriptions

Improvement

  • [MBS-1955] – Sort relationship types tree
  • [MBS-2225] – Subscribe to collection
  • [MBS-3990] – Add a user preference for using beta version of MB
  • [MBS-4094] – Make artist credits more compact in edits
  • [MBS-5568] – Work types list should be sorted
  • [MBS-5644] – relationship editor : check all/several recordings and batch-add relationship to same work
  • [MBS-5759] – Make reorder cover art an auto-edit
  • [MBS-5823] – Work alias page doesn’t allow to select the guess case options

New Feature

  • [MBS-5833] – Show off the latest releases on the homepage

Task

  • [MBS-3856] – Revert MBS-1052 “Amazon should lookup cover art by barcode too”
  • [MBS-5802] – Add autoselect + sidebar links to VIAF
  • [MBS-5849] – Adding lyricsnmusic.com to the lyrics whitelist

Sub-task

  • [MBS-4270] – Release editor JS strings should be in text.js or similar so they are translatable
  • [MBS-4997] – Add lookup url to webservice
  • [MBS-5490] – CAA coverart over SSL

The Git tag for this release is v-2013-02-25.

Housecleaning part 2: Moving our mailing lists

Part 2 in our housecleaning series concerns our mailing lists. Hosting mailing lists is quite a pain and we’d rather leave this pain to people who specialize in mailing lists. So, we are proposing to do the following things:

  1. Remove the under-utilized list musicbrainz-italian.
  2. Remove the musicbrainz-commits mailing list. GitHub (and similar sites) have better notification systems, so we don’t really need this list anymore.
  3. Ask the Xiph Foundation to find a new home for the XSPF Playlist mailing list.
  4. Remove the under-utilized musicbrainz-users list since the forums are predominantly used for end-user discussion. We’ll point people to the forums for those.

Finally, we would like to get some suggestions and feedback on where we should host our mailing lists. We’re considering:

  • Nabble: This has gotten mixed reviews from various users.
  • Librelist: This site is quite new, and some reservations about its UI have been noted.
  • Savannah: This site has many more features than just mailing lists. We’re not certain if we can move only our mailing lists here.
  • Google Groups: We’ve heard complaints about spam and spam fighting tools. Has this improved recently?

If you have any comments on any of these solutions or proposed list consolidation ideas, please let us know. Also, if you know of a cheap/free/good list provider that we didn’t list, please let us know!

Housecleaning part 1: Please help us create a new theme for our blog

We have one aging machine (scooby) that has been in continuous service since 2006. Back then we didn’t have as many options for hosting source code, mailing lists and blogs. Today, we have a lot more choice and we’re opting to host fewer things so that we can focus our energy on hosting MusicBrainz and not a bunch of ancillary stuff. Our goal is to retire scooby soon and move the services that run on that server elsewhere.

Our blog is the first thing to move: We’re moving it to wordpress.com and we’re nearly done with the move. But we don’t have a decent WordPress MusicBrainz theme for our blog. If anyone is interested in taking an existing WordPress theme and making it a custom MusicBrainz theme, we would love your help!

If you’re interested, please leave a comment and we’ll get in touch with you to coordinate this process.

Thanks!

Server update, 2013-02-11 and an important notice regarding edits

We’ve just finished pushing out another two weeks’ worth of changes to the MusicBrainz web site. While this release is predominantly a bug fix release with a few small improvements, we’ve made a fairly substantial change to the way edits are applied.

As of this release, all subsequent edits entered will have an expiration period of 7 days – a reduction from the previous 14 days. We’ve made this change to reduce the time that editors have to wait for changes to be applied, which should lead to an improved user experience, and in the hope of making the edit queue a little more manageable. This change is exploratory, so if you find it counterproductive, we’d love to hear your thoughts. IRC, the forums and the mailing lists are all good channels to voice your feedback.

Also, we have finally made the switch to GitHub. While the existing repository URLs will continue to work, they will no longer be updated. If you want to stay up to date with the latest code, make sure to update your checkout information.

Many thanks to Frederik “Freso” S. Olesen, Michael Wiencek, Nicolás Tamargo and the rest of the MusicBrainz team for their work on this release. Here’s what’s new:

Bug

  • [MBS-3457] – Direct search results incorrectly reports a recording as "standalone"
  • [MBS-3962] – Edit search doesn’t really exclude artists with "is not"
  • [MBS-4522] – Empty annotations are not merged correctly
  • [MBS-5144] – The MusicBrainz logo in the top-left corner has a different background color than the rest of the header
  • [MBS-5395] – "Actions" column on alias page is too narrow for translations
  • [MBS-5432] – Internal server error when editing cover art
  • [MBS-5434] – Inconsistent terminology: primary/secondary types and type/extra types
  • [MBS-5506] – Edit search for "My vote is not" does not work as expected
  • [MBS-5525] – Blank edit relationship type edits
  • [MBS-5566] – No more autoedit mark ?
  • [MBS-5567] – Internal server error entering remove cover art edit
  • [MBS-5617] – "new image goes here" is not translatable
  • [MBS-5650] – ISE when attempting to approve already-closed release group edit
  • [MBS-5655] – It’s possible to "Remove [a] release label" although there’s none
  • [MBS-5696] – ModBot can fail to close ‘edit artist’ edits that violate uniqueness on (name, comment).
  • [MBS-5698] – Some move disc ID edits display nothing for the old release
  • [MBS-5700] – Subscribers aren’t removed when a user deletes their account
  • [MBS-5705] – Rows in the appearances section of the artist relationships tab are sometimes one column too short
  • [MBS-5719] – The track number of the last track on some 2-disc releases is missing
  • [MBS-5724] – Sort/copy name missing for work aliases
  • [MBS-5728] – Possible to enter ‘edit release group’ edits that fail to apply due to spaces in artist credits
  • [MBS-5732] – Empty labels don’t warn for deletion pending
  • [MBS-5740] – Useless "select all" checkbox on Release Duplicates tab
  • [MBS-5742] – Editing a work without JS on will enter a silent remove ISWC edit
  • [MBS-5748] – Not possible to "Approve" an edit where yours is the only "No" vote
  • [MBS-5750] – Subscriptions report filtering ignores labels
  • [MBS-5754] – Editing recordings from tracklist does not work if "Release Duplicate" is chosen
  • [MBS-5762] – "Work Type" and "Work Language" incorrectly left blank for "Merge Works" edits when viewing predefined edit searches
  • [MBS-5763] – Edit release edits not properly translated
  • [MBS-5764] – Work language column in the stats links to releases
  • [MBS-5770] – There is no constraint on release_label that either (or both) the label or catalog number are not null.
  • [MBS-5771] – Add utamap.com (うたまっぷ) in the lyrics URL white list
  • [MBS-5774] – "Vote on all edits" is not translatable
  • [MBS-5776] – Cover art types are not being translated on the cover art tab
  • [MBS-5777] – Users’ work ratings page has no title
  • [MBS-5786] – i18n: secondary RG type labels on overview are untranslatable
  • [MBS-5788] – No edit is created when trying to submit/attach a DiscID
  • [MBS-5791] – ISRCs can’t be removed
  • [MBS-5797] – Internal server error if the Wikipedia link is not really a Wikipedia link

Improvement

  • [MBS-1413] – Make profile bios support WikiFormat
  • [MBS-1774] – Webservice: expose UUID for AR type
  • [MBS-3535] – Stack traces should mention which server was handling the request
  • [MBS-4880] – Add Help/Info about How "Merge mediums and recordings" works
  • [MBS-4902] – Add Google+ links to the sidebar
  • [MBS-5505] – Add j-lyric.net to the lyrics whitelist
  • [MBS-5605] – Ratings on collection pages
  • [MBS-5707] – Add autoselect for IMDb URL relationships for labels
  • [MBS-5727] – Disc title missing from edit relationships page
  • [MBS-5767] – Add an index on (editor, id DESC)
  • [MBS-5775] – Update Facebook URL cleanup to use https
  • [MBS-5779] – Remove useless nbsp on common-macros.tt
  • [MBS-5798] – Normalise wikisource URLs to http
  • [MBS-5803] – Add autoselect for SecondHandSongs artist URL

Task

  • [MBS-5757] – Consider diminishing time edits stay open
  • [MBS-5766] – Open up /release/ and /work/ in robots.txt
  • [MBS-5810] – Add Open Library to the other databases whitelist

The Git tag for this release is v-2013-02-11.

Please welcome AOL Music into the MetaBrainz ecosystem!

The economic turmoil persisted in 2012, and thus it was a slow year for adding new customers for MetaBrainz. However, we did add one high profile customer in 2012: AOL Music.

For a number of reasons we felt that it was prudent to get MusicBrainz integrated into AOL before making public news about it. Now the time is finally right to talk about our relationship with AOL and Winamp. I had been talking to Geno Yoham (GM of Winamp) and Lisa Namerow (GM of AOL Music) about MusicBrainz at various conferences for several years. Forging relationships with large companies takes quite a long time and the formation of our relationship was really no different. At the end of 2011 Geno, Lisa and team were ready to take action and surprised me by pledging a sizeable donation to the MetaBrainz Foundation. This donation was received early in 2012, about the same time that we signed the data license contract. And just last week we received another donation for 2012!! Thanks AOL and Winamp!

Early in 2012 AOL launched updated services underpinned by MusicBrainz data:

  • The Now Playing feature in Winamp allows a user to find out more about the artist that is currently playing in Winamp.
  • The AOL Music Artist pages also use MusicBrainz data to display discography information and to provide some of the links for the other content shown on those pages.

Our relationship with AOL follows a similar pattern to our relationship with the BBC. The BBC has done wonders for highlighting and lending credibility to MusicBrainz and I expect that our relationship with AOL will bring about similar benefits for MusicBrainz.

Thank you, team AOL, and especially Geno Yoham and Lisa Namerow, for believing in us!

We have a new community calendar

We’ve been scheduling more meetings for discussing various complex topics, but communication about those dates has not been clear. In order to fix this, we’ve created a community curated calendar:

http://calendar.musicbrainz.org

reosarevok, nikki, ian, ollie, warp and I can put things onto the calendar. If you have something you’d like to have added to the calendar, please ask one of these folks.

Preparing for the May 15th schema change release

It is time for us to start the process towards the next schema change release. Starting today and for the next two weeks, we’re going to seek people to be the champion (sponsor) of a ticket. If you feel strongly about a schema change ticket getting taken care of, you should consider championing this ticket. Once you’ve decided to adopt a ticket, you should assign the ticket to yourself.

Then, over the next two weeks it will be up to you to do the following:

  1. Drive consensus around the core concept of the ticket. If you go through the process of working up a ticket, but no one agrees with what you’re proposing, you’ve wasted your time. Make sure that you get buy-in from others in the community. For instance, if Nikki doesn’t like it, chances are it’s not going to fly. 🙂
  2. Each schema change feature requires two tickets: 1) An SQL ticket that implements the actual changes to the database and defines the queries used to fetch the data. 2) A UI change ticket that implements the UI portions of the schema change ticket.
  3. Ensure that the ticket clearly states what needs to be done to implement the ticket. The ticket should essentially become or link to a requirements document. This requirements document should explain what the new feature should do. It should not explain how it should be done — we should leave the how to our developers who are going to implement the feature.
  4. Provide as much supporting documentation as you can. Mock-ups for UIs are deeply appreciated (even if they delve into the how realm of things) and very useful for meaningfully discussing these tickets.
  5. Have the ticket reviewed by a developer for clarity and completeness, then address any issues said developer may raise.

On 15 February, we’re going to look at the list of tickets that people have taken on and choose the ones that are clear enough to move forward. If you’ve done all the work outlined above, the chances are good that your ticket will be chosen to move forward. If your ticket is chosen to move forward, there will be more questions that the developers will raise; hopefully those can be tackled in the space of a week. After that we will take all of the well-defined tickets and schedule them for implementation. All the other tickets that are not clear enough to implement will be rejected and will have to make another pass through this process in the autumn.

If you’re still interested, here is the list of schema change tickets that should be considered for this.

We’re going to follow this schedule:

  • 1 Feb: Schema change ticket selection starts
  • 15 Feb: Select schema change tickets for implementation, start making tickets fully actionable
  • 1 March: Tickets must be fully actionable. Tickets that are not actionable will be dropped from the 15 May release.
  • 15 March: SQL tickets must be fully implemented.
  • 1 May: UI tickets must be fully implemented, start final ticket testing phase
  • 15 May: Release day

All of these dates have been added to our new community calendar.