May 2016 schema change release details

In about two months’ time we’ll have the next schema change release: May 16, 2016. Even after skipping the fall schema change release, this release will have few changes that impact our downstream users. Most of the tickets in this release make minor improvements to database indexes and edit tables. If you are one of the few users of our edit data, you should delve deeper into the list of tickets in this release. For everyone else, I will summarize the tickets with a greater impact.

In a previous blog post we also talked about upgrading the minimum required version of Postgres. We received no real feedback asking us to upgrade to 9.4, but we did receive some feedback that some people would prefer 9.5, which is our preference as well. Based on that feedback, we’re going to make PostgreSQL version 9.5 the minimum required version. If you’d like to run a MusicBrainz replicated instance via our Live Data Feed, you will need to run Postgres 9.5!
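For anyone scripting a check of this requirement, here is a minimal sketch in Python. It assumes you already have the version string your server reports (for example from `SELECT version()` or `psql --version`). The helper names `parse_pg_version` and `meets_minimum` are made up for illustration and are not part of MusicBrainz Server.

```python
import re

# Minimum required version stated in this post.
MIN_POSTGRES = (9, 5)


def parse_pg_version(text):
    """Extract (major, minor) from a string such as 'PostgreSQL 9.5.2 on x86_64'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(text, minimum=MIN_POSTGRES):
    """True when the reported version is at least the required one."""
    return parse_pg_version(text) >= minimum


if __name__ == "__main__":
    for reported in ("PostgreSQL 9.4.7", "PostgreSQL 9.5.2"):
        print(reported, "->", "ok" if meets_minimum(reported) else "too old")
```

Comparing the versions as integer tuples avoids the string-comparison trap where "9.10" would sort before "9.5".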

The official minimum supported Ubuntu release is still Ubuntu 10.04 LTS (Lucid Lynx), which reached end-of-life a year ago. We will upgrade that to Ubuntu 14.04 LTS (Trusty Tahr) at the schema change release. In particular, this means that we might start using Perl 5.18 features in the MusicBrainz Server code (as opposed to Perl 5.10 currently).

We understand that this is potentially a lot of work for some of our users, but occasionally we need to upgrade our requirements. We try to limit these sorts of upgrades as much as possible, so please bear with us.

Finally, on to the details of the release. Please take a look at the list of issues that will be addressed in this release. The few tickets worth discussing in detail are:

  • MBS-8838 – “Add gids to all *_type tables”. This ticket adds MBIDs (GIDs in schema lingo) to all of our tables that define a type for some database element. Given that we recommend that external users never reference our data by row ids, we really need to provide proper permanent MBIDs to all elements of our database.
  • MBS-6024 – “Support more than one barcode on same release (SQL edition)”. This ticket adds the ability for the database to contain more than one barcode for a given release. However, this ticket does not include the user interface portions of this feature. The team will add the user interface/edit portions of this feature in a later, non-schema-change release.
  • MBS-4501 – “Alternative tracklists”. This ticket creates a new feature that would allow an alternative tracklist to be used for a given release. This is a better solution for handling conflicts between our style guidelines and how the data appears on the release. It is also a more elegant solution for translations of releases into different languages.

As usual, we will post final details about the release shortly before the release happens. If you have any questions about this release, feel free to ask specific questions in the tickets or general questions in the comments below.

(Edited 2016-03-16 at 12:55 UTC to add the upgraded Ubuntu requirement.)

Upgrading Postgres for MusicBrainz Live Data Feed users

We’re slowly approaching that time of year: schema change release time. After skipping our fall update to focus on some internal tasks, we’re ready to have another schema change release in the spring: May 16, 2016.

We have started collecting the features we wish to release for this schema change release and we’ll be publishing that list in the coming weeks. However, we’re contemplating the impact of one more change we’d like to make: upgrading to a more recent version of Postgres.

Internally we are going to upgrade to Postgres 9.5, which was recently released, so we expect that the Postgres team will have worked out the most significant kinks before we’re ready to move to it. However, even though we are moving to 9.5, we are considering the impact on our downstream users/customers who need to make the same or a similar change.

While we are moving to version 9.5 of Postgres, we have the option of only adopting features from Postgres 9.4, which means that our downstream users may continue to use Postgres 9.4. However, Postgres 9.5 has some nice features we’d like to use (e.g. UPSERT), so we’re pondering whether it is possible for us to require Postgres 9.5 from all of our Live Data Feed users starting on May 16, 2016.

We have already informally queried a few of our users and so far it seems that requiring Postgres 9.5 is feasible. If you are a Live Data Feed user and feel that this requirement of Postgres 9.5 is too much for you and your organization by May 16, 2016, please leave a comment on this blog post!

There will be no autumn 2015 schema change

Schema changes are always a lot of work for us: we end up spending much time preparing for them and then even more time cleaning up/catching up afterwards. As a result, some critical non-schema change features keep getting pushed back… to the point that we never get to them.

To try and break this cycle, we’re going to skip the Autumn 2015 schema change. Instead we will focus on other tasks such as hosting and community features.

We will resume our schedule with the next planned schema change around 15 May, 2016. After that release we will determine if we want to go ahead with 1 or 2 schema change releases a year.

Schema change release, 2015-05-18 (including upgrade instructions)

Our previously mentioned schema change release is finished! Below are the upgrade instructions, including configuration updates for replication access tokens.

This release does not include UI for several of the schema change patches, which will (hopefully) happen for the next release on June 1. The incomplete patches are MBS-7489 (credits for artists in relationships), MBS-4145 (tag upvote/downvote), and MBS-8004 (collections for additional entity types). These patches have had their schema change components finished, but the UI was incomplete or needed more work.

Schema Change Upgrade Instructions

These are largely the same as previous upgrade instructions, using the tag v-2015-05-18-schema-change. The primary difference is the addition of a step to configure an access token for replication.

  1. Make sure your REPLICATION_TYPE setting is RT_SLAVE and your DB_SCHEMA_SEQUENCE is set to 21 in lib/DBDefs.pm. If you’re running a standalone server, you can run the upgrade, but it may be easier to just import a new data dump!
  2. Ensure you’ve replicated up to the most recent replication packet available with the old schema. (if you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to update, it should say “Mismatched schema sequence, 21 (database) vs 22 (replication packet)”).
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you are automatically updating the database via cron jobs.
  5. Switch to the new code with git fetch origin followed by git checkout v-2015-05-18-schema-change
  6. Run ./upgrade.sh (or carton exec -Ilib -- ./upgrade.sh if you’re using carton, with very old setups).
  7. Run cpanm --installdeps --notest . to ensure your perl-based dependencies are up to date. This release adds a dependency on LWP::Protocol::https, for fetching replication packets from the new server; many systems may already have this installed, but it should be verified.
  8. Set DB_SCHEMA_SEQUENCE to 22 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh
  9. Assuming you have been updating your server with replication, it will now be necessary to configure an access token:
    1. Go to https://metabrainz.org/supporters/account-type and choose your account type as applicable. If you’re an individual, non-commercial user of the data, choose “non-commercial”; if not, choose an applicable tier in the “commercial” section. If you’re not sure of the appropriate tier, make your best guess; it can be adjusted if necessary.
    2. Then, from https://metabrainz.org/profile, create an access token, which should be a 40-character random alphanumeric string provided by the site.
    3. Finally, add this token to lib/DBDefs.pm under the REPLICATION_ACCESS_TOKEN configuration option. The final configuration section should look something like sub REPLICATION_ACCESS_TOKEN { "ck3UpgwgOXhWC6SpFcd99rZOTjzfrei3gQlgZZ9z" }.
    4. Don’t reveal your access token! If you do, inadvertently, you can use the MetaBrainz site to generate a new token, invalidating the old one. (The one in the example above is one I created for myself and then invalidated — don’t get any ideas, it won’t work!)
  10. Turn cron jobs back on, if applicable.
  11. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart memcached.
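Two of the steps above lend themselves to a quick scripted sanity check: the "Mismatched schema sequence" message from step 2 and the access token format from step 9. The following Python sketch assumes the message format quoted in step 2 and the 40-character alphanumeric token described in step 9. The helper names `ready_for_upgrade` and `token_config_line` are hypothetical and not part of MusicBrainz Server.

```python
import re

# Message format quoted in step 2 of the instructions above.
MISMATCH_RE = re.compile(
    r"Mismatched schema sequence, (\d+) \(database\) vs (\d+) \(replication packet\)"
)
# Step 9 describes the token as a 40-character random alphanumeric string.
TOKEN_RE = re.compile(r"[A-Za-z0-9]{40}")


def ready_for_upgrade(output, old=21, new=22):
    """True if the replication output reports the expected old/new sequences."""
    match = MISMATCH_RE.search(output)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) == (old, new)


def token_config_line(token):
    """Return the lib/DBDefs.pm line for a token, rejecting malformed tokens."""
    if TOKEN_RE.fullmatch(token) is None:
        raise ValueError("expected a 40-character alphanumeric token")
    return 'sub REPLICATION_ACCESS_TOKEN { "%s" }' % token
```

You would feed `ready_for_upgrade` the captured output of ./admin/replication/LoadReplicationChanges, and paste the result of `token_config_line` into lib/DBDefs.pm by hand. The sketch deliberately does not edit any files itself.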

Finally, the list of bugs closed this release:

Bug

  • [MBS-4436] – Medium titles cannot be longer than 255 charaters

Improvement

  • [MBS-1347] – Implement aliases for release groups, releases and recordings
  • [MBS-7906] – maybe don’t show “”≠null diff. in edit pages
  • [MBS-8279] – Remove empty_artists etc. database functions

New Feature

  • [MBS-8302] – Add Live Data Feed access token support

Task

  • [MBS-8266] – Make medium titles VARCHAR NOT NULL
  • [MBS-8278] – Update DB_SCHEMA_SEQUENCE in DbDefs.pm.sample
  • [MBS-8283] – Remove DB constraint that disallows empty event names

Not included in this list but also relevant is MBS-8349, which while fixed for a previous release, in this release is also applied to old slave servers, which may help performance for some queries.

Schema upgrade downtime: Monday, 18 May, 2015 @ 17:00 UTC

Our next schema change version will be released on Monday, 18 May, 2015 around 10am PDT/1pm EDT/17:00 UTC/18:00 BST/19:00 CEST. We expect that MusicBrainz will be unavailable for 15 – 30 minutes during this time. We will put up the downtime notification on the site and tweet from @musicbrainz right before the release.

Since we’re total slackers, we still haven’t set up our backup database server since it suffered a hardware failure. This means that we won’t be able to put the site into read-only mode and will need a full downtime instead. Hopefully for our next schema change we’ll have tackled our backlog of sysadmin duties and will have a backup DB server to make the release easier.

Sorry for any trouble this may cause you.

P.S. Look for another blog post on Sunday for details on where to get your access tokens for the Live Data Feed.

Schema change upgrade instructions, schema 21

This upgrade shouldn’t be substantially different from past upgrades, now that we’ve fixed a few bugs with the process. To upgrade:

  1. Make sure your REPLICATION_TYPE setting is RT_SLAVE and your DB_SCHEMA_SEQUENCE is set to 20 in lib/DBDefs.pm.
  2. Ensure you’ve replicated up to the most recent replication packet available with the old schema. (if you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you).
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you are automatically updating the database via cron jobs.
  5. Switch to the new code with git fetch origin followed by git checkout schema-change-20-to-21
  6. Run ./upgrade.sh (or carton exec -Ilib -- ./upgrade.sh if you’re using carton, with very old setups).
  7. Set DB_SCHEMA_SEQUENCE to 21 in lib/DBDefs.pm
  8. Turn cron jobs back on, if needed.
  9. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart memcached.

That’s it! The only real difference from the past is the specific tag to be used: schema-change-20-to-21, which is a couple of fix-up commits past the regular release tag.
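If you script your upgrades, the only parts that vary between releases are the tag and whether you use carton. A minimal Python sketch of steps 5 and 6 follows. The function name `upgrade_commands` is made up for illustration, and the sketch only builds the command list rather than running anything.

```python
def upgrade_commands(tag, use_carton=False):
    """Return the shell commands from steps 5 and 6, in order, for a given tag."""
    upgrade = "./upgrade.sh"
    if use_carton:
        # Variant given in the instructions for very old, carton-based setups.
        upgrade = "carton exec -Ilib -- ./upgrade.sh"
    return ["git fetch origin", f"git checkout {tag}", upgrade]


if __name__ == "__main__":
    # Print the plan for this release so it can be reviewed before running it.
    for command in upgrade_commands("schema-change-20-to-21"):
        print(command)
```

Printing the commands first, instead of executing them directly, leaves room to confirm that the web server and cron jobs are stopped (steps 3 and 4) before anything touches the database.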

Downtime for fall schema change

Our next schema change version will be released on Monday, 17 November, 2014 around Noon PST/3pm EST/20:00 GMT/21:00 CET. We expect that MusicBrainz will be unavailable for 30 – 60 minutes during this time. We will put up the downtime notification on the site and tweet from @musicbrainz right before the release.

Sadly, our backup database server suffered a hardware failure and we ran out of time to get a replicated database set up after the hardware was fixed. This means that we won’t be able to put the site into read-only mode and will need to take a full downtime.

It sucks and we’re not happy about it either, but there is only so much we can accomplish with our limited resources. 🙁

Sorry for any trouble this may cause you.

2014-11-17 schema change release details

We’re now 60 days away from our fall schema change, so we’re announcing the tickets we intend to implement for the next schema change release:

  • [MBS-1059] – Types of list/collection: This new feature will allow a user to specify what type of list/collection they have.
  • [MBS-5458] – CD Stubs replication: Replicate the CD Stub data as part of our replicated data feed.
  • [MBS-6201] – Add an “event” entity: This is the big feature for this release. This feature allows us to record events like concerts or recordings, both future events and past events!
  • [MBS-7551] – Add folksonomy tag support to all entities without them: This feature will allow users to tag any of our entities.
  • [MBS-7638] – CreateIndexes for instruments wrongly looks at label tables: During our last release we created an incorrect index. This fixes this mistake.
  • [MBS-7784] – Support for data tracks in tracklists: This new feature would allow us to properly track Audio-CD data tracks in our tracklists.

Besides the events, there isn’t anything earth-shattering in here.

New autumn schema change date: 17 November, 2014

Due to two conflicting summits (our own and the GSoC Summit) around our usual schema change release date, we’ve decided to move the autumn schema change to 17 November, 2014. This will ensure that our developers are properly rested before attempting a hard task such as a schema change.

We hope this won’t cause too much trouble for everyone downstream.

Server upgrade and schema change, 2014-05-14

Hello again! This fortnight, as we’ve mentioned in the past few posts, is our twice-annual schema change release! This means that we’ve got some big changes, as well as special upgrade instructions.

In the former category, we’ve added support for two new entities: Series and Instruments, each of which is exactly what it sounds like. We’ve also done various cleanup: adding tables that were missing from replication, making some attribute-style tables (e.g., label types) into trees and more properly orderable, and removing sortnames that weren’t useful. In non-schema-change fixes, we’ve fixed some merging for Artist Credits, we now routinely show Area names after Places, and perhaps most excitingly, we’ve added smaller versions of the relationship editor for other entities. It’s now possible to add relationships to any entity from most entity edit pages, expanding on our previous addition of URL editing!

As far as upgrading:

  1. Ensure you’ve replicated up to the most recent replication packet available with the old schema. (if you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you).
  2. Take down the web server running MusicBrainz, if you’re running a web server.
  3. Turn off cron jobs if you are automatically updating the database via cron jobs.
  4. Make sure your REPLICATION_TYPE setting is RT_SLAVE
  5. Switch to the new code with git fetch origin followed by git checkout v-2014-05-14-schema-change
  6. Run ./upgrade.sh (or carton exec -Ilib -- ./upgrade.sh if you’re using carton, with older installs).
  7. Set DB_SCHEMA_SEQUENCE to 20 in lib/DBDefs.pm
  8. Turn cron jobs back on, if needed.
  9. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart memcached.

The git tag for this release, as mentioned above in the instructions, is v-2014-05-14-schema-change.

Full release notes, as usual:

Bug

  • [MBS-5978] – Replication feed is missing release_tag
  • [MBS-6709] – “None” is no longer the last Packaging type after adding Book and Cassette Case
  • [MBS-7482] – Artist merge with AC renaming does not merge identical ACs

Improvement

  • [MBS-2410] – Label types not a tree anymore
  • [MBS-2714] – Add support for Series.
  • [MBS-5897] – Make it possible to see edit JSON in the case of an ISE or poor data display
  • [MBS-6144] – Remove the apparently-unused script_language table
  • [MBS-6602] – Remove sortnames from areas
  • [MBS-6603] – Remove sortnames from labels
  • [MBS-6651] – Make it possible to disable dates for relationship types
  • [MBS-6886] – Display area after place names
  • [MBS-6887] – Model coordinates without nullable latitude and longitude
  • [MBS-7205] – Link types should track assumed cardinality
  • [MBS-7411] – Don’t require disambiguation comments for places from different areas
  • [MBS-7470] – Merging/combining RG types (primary/secondary) is unintuitive

New Feature

  • [MBS-3674] – Make instruments entities
  • [MBS-6234] – Add a relationship editor to artists, labels, recordings, release-groups, places, areas and works

Task

  • [MBS-7441] – Check non-replicated changes to DB that have happened since last schema change