Server update, 2013-05-28

Now that the fires around the last schema upgrade seem to be dying down to a much more manageable level, we’ve just put out the next version of MusicBrainz server. As you might already expect, this release is mostly a bug fix release. Thanks to Lukáš Lalinský, Michael Wiencek, Nicolás Tamargo and the rest of the MusicBrainz team for their work on this release. Here’s what’s changed:

Bug

  • [MBS-3059] – Using "Submit votes & edit notes" added edit note to all edits
  • [MBS-6158] – Reset password form claims the user/email address does not exist
  • [MBS-6172] – Lost password claims to work when user has no email
  • [MBS-6209] – Direct search for work doesn’t check aliases
  • [MBS-6230] – Internal server error removing a relationship in the relationship editor
  • [MBS-6271] – Internal server error trying to set track lengths
  • [MBS-6274] – Capitalisation of "release events" in sidebar and edits is inconsistent
  • [MBS-6299] – "License" header in release sidebars is displayed even if the following list is empty
  • [MBS-6302] – Country column empty in search results page
  • [MBS-6304] – Table on Attach CD TOC page broken (too many columns)
  • [MBS-6305] – Regression: Release dates are no longer optional
  • [MBS-6318] – Search hint alias for areas require a sortname when adding
  • [MBS-6319] – Internal server error editing an area alias
  • [MBS-6329] – REGRESSION. no more dates in collection pages
  • [MBS-6335] – Regression: Tracklists not shown when merging releases
  • [MBS-6341] – Area alias aren’t found by direct search unless they match an area name

Improvement

  • [MBS-3083] – Do not use placeholder text for artist credit names
  • [MBS-5603] – Edit relationships page should show disambiguation comments
  • [MBS-6198] – Deleted Editor Stats
  • [MBS-6213] – Permit ISRCs starting with TC
  • [MBS-6219] – Stop using two edit types and two places in the UI for the "change release group" functionality
  • [MBS-6222] – Update FeaturingRecordings report
  • [MBS-6223] – Display language and script in the right order on release edits
  • [MBS-6294] – Release events are weirdly indented on release pages
  • [MBS-6340] – Redirect to direct search for indexed area searches as long as the latter are not actually available
  • [MBS-6348] – Display the type of area on inline search

New Feature

  • [MBS-6018] – Add /oauth2/tokeninfo and /oauth2/userinfo handlers for 3rd party login

Task

  • [MBS-6229] – Add autoselect, sidebar links and pretty URLs for Wikidata

The Git tag for this release is v-2013-05-28.

Schema 17/18 upgrade instructions

We’ve just completed our extra schema upgrade. The full instructions for upgrade follow:

Schema 16 to schema 17 upgrade

If you already ran the migration that was announced May 15th, or if you imported a data dump from May 15th or later, skip to the next section.

  1. Run replication with carton exec -Ilib -- ./admin/replication/LoadReplicationChanges until it cannot apply any packets in schema 16.
  2. Take down the web server running MusicBrainz, if you’re running a web server.
  3. Turn off cron jobs if you are automatically updating the database via cron jobs.
  4. Make sure your REPLICATION_TYPE setting is RT_SLAVE in lib/DBDefs.pm
  5. Switch to the new code with git fetch origin followed by git checkout schema-16-to-17
  6. Run carton install --deployment to install any new perl modules.
  7. Run carton exec -Ilib -- ./upgrade.sh from the top of the source directory.
  8. Set DB_SCHEMA_SEQUENCE to 17 in lib/DBDefs.pm
  9. Turn cron jobs back on, if needed.
  10. Restart the MusicBrainz web server, if needed.

Schema 17 to schema 18 upgrade

  1. Run replication with carton exec -Ilib -- ./admin/replication/LoadReplicationChanges until it cannot apply any packets in schema 17.
  2. Take down the web server running MusicBrainz, if you’re running a web server.
  3. Turn off cron jobs if you are automatically updating the database via cron jobs.
  4. Make sure your REPLICATION_TYPE setting is RT_SLAVE in lib/DBDefs.pm
  5. Switch to the new code with git fetch origin followed by git checkout v-2013-05-24
  6. Run carton exec -Ilib -- ./upgrade.sh from the top of the source directory.
  7. Set DB_SCHEMA_SEQUENCE to 18 in lib/DBDefs.pm
  8. Turn cron jobs back on, if needed.
  9. Restart the MusicBrainz web server, if needed. EDIT: also restart memcached here, see http://tickets.musicbrainz.org/browse/MBS-6376

Note that the tags to check out for the two migrations are different.

Changes

For the list of changes in schema 17, see the former blog post. The changes for schema 18 are:

  1. Fix the track table corruption that the schema 16-17 upgrade created, by importing a copy of the ‘track’ table from the production database.
  2. Fix some indexes and constraints that should not be on slaves or which had bad names starting with ‘medium2013’ or ‘track2013’
  3. Create a missing index on medium.release that dramatically improves performance.
  4. Fix the ref_count column of the artist_credit table, which was not updated properly at the schema 16-17 upgrade.

Urgent schema update required

On Friday 24 May, 2013 at 15:00UTC we’re going to make an urgent schema update to fix a problem that occurred during our schema change last week. Please read this whole blog post carefully!

This update will not make any changes to the schema, but it will fix some data issues that have appeared on slave servers.

We apologize profusely for these problems — we’re working hard to rectify this problem and we’re going to improve our processes going forward to ensure that future releases will not encounter these problems.

What went wrong

Due to a misunderstanding of our database system, the ‘track’ table will be corrupted on the majority of replicated slave databases after the schema 17 migration. Specifically, depending on the internal choices of a given postgresql installation’s query planner and other system details, any particular server can end up with a variety of incompatible permutations of the track table, where ‘id’/’gid’ pairs will generally point to the incorrect track data. Unfortunately, this problem is compounded by replication, which is based on the ‘id’ column. Therefore, replication packets since the schema change are likely to have deleted and modified the incorrect rows of the ‘track’ table on slaves.

How are we fixing it

In order to ensure that no slaves continue to replicate incompatible changes, we are incrementing the schema number again to 18, which will force operators of slave servers to intervene appropriately. To ensure that slaves have a correct version of the ‘track’ table, we are providing an upgrade script that will download an exported snapshot of the production server’s ‘track’ table at a known point and import it, as well as correct some smaller issues. By importing this snapshot, slave servers will be reset to a correct version of this table and replication can continue.

Specific step by step instructions on running this upgrade will be in a separate blog post. Watch this space!

What problems may have arisen

  1. In the unlikely case an external program directly references track row ID numbers, or if it uses the newly-added track MBID field (the ‘gid’ column), these will not be correct if they were taken from any server but the production server. If an application stores either of these identifiers in any way, that data should be rebuilt.
  2. Due to the compounding problems from replication, some tracklists will have incorrect information — missing tracks, misnumbered tracks, links to the wrong recordings, wrong durations, and/or wrong track artist credits. Information of this sort that was derived from replicated slaves during the affected period should be regenerated after upgrading.

FAQ about this update

Q: We don’t use the track table, we use recordings. Am I affected?
A: You are not affected if you use recordings directly, i.e., looking up recording information by a stored recording MBID, except if you use track information linked to those recordings (for example, if you create a list of releases a given recording appears on). Since the link between the recording and the release tables is via the ‘track’ table, anything that connects these two entities is likely to be affected.

Q: How can I tell if any of the tracklists I am using are affected?
A: Due to the random permutation issue, it’s not completely possible to be 100% sure. However, it’s possible to know of tracklists that definitely have problems by two means: track counts, and sequence issues. The former can be tested with a fairly simple query: “

SELECT medium.id, medium.track_count, count(track.id) as track_track_count,
     medium.track_count  count(track.id) AS counts_differ
FROM medium join track on track.medium = medium.id
GROUP BY medium.id, medium.track_count
HAVING count(track.id)  medium.track_count;

Any medium that appears in that query has been affected and its tracklist should not be trusted (select ‘medium.release’ to get release IDs, if that’s your jam). Sequence issues are a more complex query:

SELECT distinct m.id FROM
  (SELECT DISTINCT medium.* FROM
    ( SELECT track.medium, min(track.position) AS first_track, max(track.position)
      AS last_track, count(track.position) AS track_count, sum(track.position)
      AS track_pos_acc
      FROM track
      GROUP BY track.medium) s
    JOIN medium ON medium.id = s.medium
    WHERE first_track != 1 OR last_track != s.track_count OR
        (s.track_count * (1 + s.track_count)) / 2  track_pos_acc
    ) m

(note: if you only get 10 rows for this query, you’re fine — they’re these ten, which are known problems)

For more safety, don’t trust anything in the track table that’s been updated since the schema change:

SELECT distinct medium
FROM track
WHERE last_updated > ‘2013-05-15’

If it’s possible in your application, it’s probably best to throw out any updates to tracklists since 2013-05-15.

Again, we’re sorry for the trouble this update may have caused you!

Search server regressions fixed

Yesterday we pushed out a new version of our search servers to fix some regressions introduced last week. Thanks to Paul Taylor for fixing these bugs so quickly.

Release Notes – MusicBrainz Search Server – Version 2013-20-05

Bug

  • [SEARCH-290] – REGRESSION WS2 RECORDING query returns cropped artist-credit
  • [SEARCH-294] – REGRESSION:Search results no longer include medium-list count attribute
  • [SEARCH-298] – REGRESSION:ws/1 release search seems broken

Improvement

  • [SEARCH-296] – Update README to point to up-to-date mmd-schema repository

Issues with 2013-05-15 schema change and the 'track' table.

As a heads-up for anyone using postgresql 9.1 or later (9.0 is the only confirmed-correct version) anyone running a slave server, it appears that there’s an issue with the upgrade script which will result in an incorrect track table in most cases.

An ostensible fix that was previously mentioned here does not work. We’re still working on a fix and will update this post as we have more details.

There is a fix, see http://blog.musicbrainz.org/?p=1962 for instructions. Thanks for your patience!

Search server release: 2013-05-15

Coninciding with our main server release, we’ve updated our search servers. This version fixes some bugs from the last release and adds support for countries and track ids.

Thanks for your hard work on this release, Paul!

Release Notes – MusicBrainz Search Server – Version 2013-05-15

Bug

  • [SEARCH-236] – Incomplete VA artist credit included for releases in recording search
  • [SEARCH-282] – REGRESSION:Johanne Sebastian Bach is not the first result when search for artist Bach
  • [SEARCH-283] – REGRESSION:"-" is returned instead of an empty list when there are no ISWCs for a work
  • [SEARCH-284] – REGRESSION:"-" is returned instead of an empty list when there are no ISRCs for a recording

Improvement

  • [SEARCH-219] – Include alias sortnames when searching labels or artists
  • [SEARCH-257] – entity search : entity name should have more weight than aliases and artist credits
  • [SEARCH-268] – Add extended alias info to the ws search results
  • [SEARCH-269] – WS searches don’t return aliases that match the artist name

Task

  • [SEARCH-274] – Support for changes to Countries in forthcoming Schema release
  • [SEARCH-285] – Support for TrackIds in forthcoming Schema Release

Schema change release, 2013-05-15

Today we released a schema change update for MusicBrainz. Schema change updates change the format of the underlying MusicBrainz database and allow us to store more information, or model information in a richer/more correct form.

Summary

This release specifically includes some exciting new features:

Areas

A new entity is this release is the “area” entity, which can track countries, subdivisions of countries, cities, and other such location entities (venues, however, will be another entity). While this release primarily only introduces the entity, migrating only our existing list of countries, it’s now also possible to add start and end locations to artists, and to mark works as anthems. Editing of areas, their aliases and annotations, and area-area and area-url relationships, is limited to a new class of “Location Editor”.

ISNI Codes

The International Standard Name Identifier (ISNI) is an ISO standard that identifies public identities of parties. We now have support for storing ISNI codes inside the MusicBrainz database, which will make it easier to cross-reference data in MusicBrainz with other databases.

Multiple Release Events

Releases can now have multiple date and country pairs, whereas previously they could only have one country and one date. This will allow us to more accurately store information about releases that occur in different areas at different dates, but are otherwise the same physical product.

Forthcoming Features

We also began work on the database support for some future MusicBrainz features:

Track MBIDs

All tracks on mediums now have unique identifiers. This will allow people to refer to a specific track in a release in a way that is more resilient to editing than just the track name or position. Currently we have database support for this, but track identifiers are not yet exposed in either the website or the web service.

Dynamic Work Attributes

Dynamic work attributes will let us introduce new attributes to describe works without schema changes

Free Text Relationship Attribute Credits

This feature will let editors specify an alternative name for relationship attributes to specifically exactly which model guitar was used in a recording, rather than the current vague “electric guitar” attribute. Support for this feature is now in the database (in the link_attribute_credit) table, but the UI to do this editing is still to be finished.

Support more formats in the Cover Art Archive

Uploading cover art to the cover art archive will soon support a few other image formats, starting with PNG.

Other Schema Changes

The remaining smaller schema changes are:

  • The wiki transclusion version mapping is now stored in the database, not a flat file.
  • The link_type.short_link_phrase was renamed to link_type.long_link_phrase.
  • The work.artist_credit column was dropped.
  • Collections can now have a description.

Upgrading

If you are currently running a slave database, then you will need to perform a few manual steps to upgrade to the new version:

  1. Take down the web server running MusicBrainz, if you’re running a web server.
  2. Turn off cron jobs if you are automatically updating the database via cron jobs.
  3. Make sure your REPLICATION_TYPE setting is RT_SLAVE
  4. Switch to the new code with git fetch origin followed by git checkout v-2013-05-15
  5. Run carton install --deployment to install any new perl modules.
  6. Run carton exec -Ilib -- ./upgrade.sh from the top of the source directory.
  7. Set DB_SCHEMA_SEQUENCE to 17 in lib/DBDefs.pm
  8. Turn cron jobs back on, if needed.
  9. Restart the MusicBrainz web server, if needed.

Please see the instructions in our more recent blog post instead of these instructions.

Release Notes

This release wouldn’t have possible without help from Alastair Porter, Michael Wiencek, Nicolás Tamargo or the rest of the MusicBrainz team – thank you all for your hard work! As we missed the previous release, there are a few other changes in this release. Here are the full release notes:

Bug

  • [MBS-4703] – Add Medium edit does not correctly display the auto-edit note
  • [MBS-5834] – Users’ votes page mistitled as “edits”
  • [MBS-5851] – Instruments are missing from the JSON webservice responses
  • [MBS-5979] – Automatic redirect to beta clear release editor seeding
  • [MBS-6015] – Edit Artist Credit edits affecting track ACs don’t appear in related release edit histories
  • [MBS-6129] – Relationship editor isn’t correctly parsing attributes
  • [MBS-6149] – Entity merges silently dropping aliases with locales
  • [MBS-6178] – URL page headers are completely inconsistent
  • [MBS-6196] – work/edit_form.tt includes artist credit docs
  • [MBS-6249] – Add Event button broken on add release with cdtoc

Improvement

  • [MBS-1346] – New Report: Artists with 0 subscribers
  • [MBS-2229] – Allow multiple release events per release
  • [MBS-3626] – Display license logos in the sidebar
  • [MBS-3669] – Merge dated and undated relationships
  • [MBS-4115] – Cover art archive: Support .png SQL changes
  • [MBS-4294] – Add a “description” field to collections (SQL/UI)
  • [MBS-4756] – Move the wiki transclusion index to the database
  • [MBS-4866] – URL autoselect: ameblo.jp -> has blog at
  • [MBS-4867] – Timeline graph rate-of-change graph should change its vertical scaling depending on where it’s zoomed
  • [MBS-4925] – Add country of birth and country of death to Artist (person)
  • [MBS-5528] – Change short_link_phrase to long_link_phrase
  • [MBS-5772] – Generate relationship documentation (semi-)automatically
  • [MBS-5848] – Instrument credits (SQL)
  • [MBS-6023] – Track MBID UI changes
  • [MBS-6141] – Add discography page URL matching for universal-music.co.jp, lantis.jp, jvcmusic.co.jp, wmg.jp, avexnet.jp and kingrecords.co.jp
  • [MBS-6142] – Prevent Wikipedia links from being added as discography page relationships
  • [MBS-6188] – Remove rating from work merge page
  • [MBS-6189] – Show work languages on ISWC page
  • [MBS-6190] – Artist credit diffs per word, join phrase per char

New Feature

  • [MBS-799] – Location, venue and event support
  • [MBS-1839] – Track MBID SQL changes
  • [MBS-2417] – Support multiple countries/regions on a single release
  • [MBS-3296] – Add dynamic attributes
  • [MBS-3985] – Support multiple artist countries
  • [MBS-5272] – Create daily and weekly “rollup” replication packets
  • [MBS-5302] – Store International Standard Name Identifier (ISNI, ISO 27729) for artists and labels
  • [MBS-5861] – Dynamic work attributes (SQL)

Task

  • [MBS-5314] – Drop the work.artist_credit column
  • [MBS-6133] – Ensure JPEG uploads still work through CAA
  • [MBS-6167] – Add iTunes links to the sidebar
  • [MBS-6170] – Add Bandcamp links to the sidebar

Sub-task

  • [MBS-4217] – Spotify relationship under the External links section
  • [MBS-5809] – SQL changes for MBS-4294: Add a “description” field to collections
  • [MBS-5917] – transclusion to DB: SQL
  • [MBS-5918] – transclusion to DB: UI
  • [MBS-5919] – locations: SQL
  • [MBS-5920] – locations: UI
  • [MBS-5929] – Schema changes to store relationship type guidelines and example usages in the database
  • [MBS-5930] – Generate documentation automatically/allow managing guidelines in DB
  • [MBS-5933] – Schema changes to support multiple (date, country) pairs on releases
  • [MBS-6187] – UI changes to support multiple country/date pairs on releases

Schema change release tomorrow 15 May, 1300UTC

Things have been quiet at MusicBrainz in the past few weeks, but don’t let this fool you! We’ve been working hard on getting the schema change release put together. We’re on schedule for releasing this tomorrow at 13:00 UTC, 15 May, 2013.

We’re going to start the release process at 1300UTC, but the site may not go down just yet then. We’ll get started once we have all of our ducks in a row — to get more updates from us before we start, please follow @musicbrainz on Twitter or join us in IRC at #musicbrainz on irc.freenode.net.

Thanks, and wish us luck for a smooth schema change tomorrow!