Schema change release, 2016-05-23 (with upgrade instructions)

Starting with this release, PostgreSQL 9.5 is now our minimum supported version. In order to import any future data sets, you will need to upgrade your installation to version 9.5.

Due to unforeseen problems with the Live Data Feed (AKA replication), users with slave databases will be required to first import a fresh data dump into their new 9.5 installation. We apologize that this is the case, but even had this stream not been broken, doing a clean import is faster and easier than doing the migration. For details on what happened during this rather lengthy schema change release, stay tuned for a post mortem blog post that covers the details.

If you have a non-replicated standalone database, you can use pg_upgrade and run ./upgrade.sh directly, but for simplicity we strongly recommend importing the latest data dump. Thus, we will only provide instructions for a clean import:

  1. Make sure you have PostgreSQL 9.5 installed, and your database settings in lib/DBDefs.pm are updated to point to the 9.5 installation if you currently have an older version of postgres running. If you already have postgres 9.5 and want to replace the existing database there, you’ll need to drop it first (using dropdb or from within psql). Be careful that you’re not dropping any important data if this is a standalone database that you’ve made changes to.
  2. Take down the web server running MusicBrainz, if you’re running a web server.
  3. Turn off cron jobs if you are automatically updating the database via cron jobs.
  4. Switch to the new code with git fetch origin followed by git checkout v-2016-05-23-schema-change-v2
  5. Run cpanm --installdeps --notest . to ensure your perl-based dependencies are up to date. Note the dot at the end.
  6. Set DB_SCHEMA_SEQUENCE to 23 in lib/DBDefs.pm
  7. Download the latest data dumps. If you don’t need historical edit data, excluding the edit dump will speed up your import significantly.
  8. Initialize a new database from the data dumps downloaded in step 7. Detailed instructions for doing this are located in INSTALL.md in the musicbrainz-server repository; if your data dumps are in /tmp, the command should simply be something like ./admin/InitDb.pl --createdb --import /tmp/mbdump*.tar.bz2.
  9. After the import has finished, turn cron jobs back on, if applicable.
  10. Restart the MusicBrainz web server, as well as memcached, if applicable.

We would like to thank bitmap, Gentlecat, zas, chirlu, reosarevok, gcilou for contributing directly to the release and we’d also like to thank all of the people who helped test, debug or otherwise offer support in this quite difficult release. Thank you!

And finally, here’s the list of changes you can expect in the upgrade:

Bug

  • [MBS-6406] – Admins can’t change email addresses
  • [MBS-8288] – Missing indexes for inverse lookup on *_gid_redirect tables
  • [MBS-8669] – Primary key for place table missing on old slaves
  • [MBS-8906] – Release pages ISE if CB doesn’t return JSON from its API for whatever reason
  • [MBS-8928] – If you submit the release editor without being logged in, it displays “[object Object]” as an error mesage
  • [MBS-8943] – Some pages do not respect DB_READ_ONLY setting

Improvement

  • [MBS-1873] – Fix vote tallies for edits
  • [MBS-3887] – Duplicate artist and label names not being checked against alias
  • [MBS-8287] – Log deleted entities that were in a subscribed collection
  • [MBS-8433] – Work attributes don’t have a uuid
  • [MBS-8716] – Store the edit data in a JSONB column
  • [MBS-8717] – Move the edit data to a separate table
  • [MBS-8838] – Add gids to all *_type* tables
  • [MBS-8873] – Convert and unify artist credit editors to React
  • [MBS-8909] – Add logos to IMDb and VGMdb links in the sidebar
  • [MBS-8939] – Update the Instagram logo used in the sidebar
  • [MBS-8940] – Let banner message editors dismiss the banner only temporarily

Task

  • [MBS-8656] – Bring edit table indexes back into sync
  • [MBS-8719] – Stop materializing of edit and vote counts
  • [MBS-8720] – Add a materialized view of edit note recipients
  • [MBS-8727] – Prevent duplicate votes
  • [MBS-8800] – Create the earthdistance extension and add a geodetic index for place coordinates
  • [MBS-8804] – Add BRIN indexes for timestamp columns
  • [MBS-8897] – add new entity icons
  • [MBS-8938] – Schema changes to support alternative tracklists

Schema change update

We’ve finally completed the schema update and things are returning to normal. We need to get a new data dump out and then we will provide upgrade instructions tomorrow. As you might be able to guess, unless you are already on Postgres 9.5, we are going to recommend a clean data import, rather than a migration, if you have a replicated slave.

And, if anyone even dare ask (within the next week) when an updated VM will be released, you owe the whole development team each 2 bars of high quality chocolate.

Feeling lucky, punk?

P.S. Can you tell we’ve been up too long? 🙂

Important: Schema change delayed to May 23

With our ongoing hosting issues due to massive traffic increases and failing hardware we’ve been too distracted trying to manage those issues to finish all of the testing for the schema change release that was scheduled for today.

We deeply regret having to do this, but we’re going to delay the schema change release by a week. It is now scheduled for May 23, 2016. This week long delay will give us a chance to further tweak our server configuration (more on this in the next blog post) and to test the schema change release in much more detail.

We are, however, going to upgrade our database server to Postgres 9.5 either later today or tomorrow. During this upgrade we are going to employ a back-up database server and keep MusicBrainz running in read-only mode with a slightly reduced overall capacity (I’m sure everyone know what that means by now). This upgrade should have no other effects on our downstream data users.

We will give people plenty of notice before we start the postgres upgrade via our site banner and via our Twitter account (@musicbrainz).

Sorry for the continued drama affecting our services — we’re working hard to keep things together!

May 2016 schema change release details

In about two months time we’ll have the next schema change release: May 16, 2016. Even after skipping the fall schema change release, this release is going to have few changes that will impact our downstream users. Most of the tickets in this release will make minor improvements to database indexes and edit tables. If you are one of the few users of our edit data, then you should delve deeper into the list of tickets in this release. For everyone else, I will summarize the tickets with a greater impact.

In a previous blog post we also talked about upgrading the minimum required version of postgres. We received no real feedback requesting for us to upgrade to 9.4, but we did receive some feedback that some people would prefer 9.5, which is our preference as well. Based on that feedback, we’re going to make PostgreSQL version 9.5 the minimum required version. If you’d like to run a MusicBrainz replicated instance via our Live Data Feed, you will need to run Postgres 9.5!

The official minimum supported Ubuntu release as of now is still Ubuntu 10.04 LTS (Lucid Lynx) which reached end-of-life a year ago. We will upgrade that to Ubuntu 14.04 LTS (Trusty Tahr) at the schema change release. In particular, this means that we might start using Perl 5.18 features in the MusicBrainz Server code (as opposed to Perl 5.10 currently).

We understand that this is potentially a lot of work for some of our users, but occasionally we need to upgrade our requirements. We try and limit these sorts of upgrades as much as possible, so please bear with us.

Finally onward to the details of the release. Please take a look at the list of issues that will be addressed in this release. The few tickets worth discussing in details are:

  • MBS-8838 – “Add gids to all *_type tables“. This ticket adds MBIDs (GIDs in schema lingo) to all of our tables that define a type for some database element. Given that we recommend that external users never reference our data by row ids, we really need to provide proper permanent MBIDs to all elements of our database.
  • MBS-6024 – “Support more than one barcode on same release (SQL edition)“. This ticket adds the ability for the database to contain more than one barcode for a given release. However, this ticket does not include the user interface portions of this feature. The team will add the user interface/edit portions of this feature in a later, non schema change release.
  • MBS-4501 – “Alternative tracklists“. This ticket creates a new feature that would allow an alternative tracklist to be used for a given release. This is a better solution for handling conflicts between our style guidelines and how the data appears on the release. It is also a more elegant solution for translations of releases into different languages.

As usual, we will post final details about the release shortly before the release happens. If you have any questions about this release, feel free to ask specific questions in the tickets or general questions in the comments below.

(Edited 2016-03-16 at 12:55 UTC to add the upgraded Ubuntu requirement.)

Upgrading Postgres for MusicBrainz Live Data Feed users

We’re slowly approaching that time of year: Schema change release time. After skipping our fall update to focus on some internal tasks, we’re ready to have another schema change release in the spring: May 16, 2016

We have started the process to collect features we wish to release for this schema change release and we’ll be publishing that list in the coming weeks. However, we’re contemplating the impact of one more change we’d like to make: Upgrading to a more recent version of Postgres.

Internally we are going upgrade to Postgres 9.5, which was recently released, so we expect that the Postgres team will have worked out the most significant kinks before we’re ready to move to it. However, even though we are moving to 9.5, we are considering the impact on our downstream users/customers who need to make the same or similar change.

While we are moving to version 9.5 of Postgres, we have the option of only adopting features from Postgres 9.4, which means that our downstream users may continue to use Postgres 9.4. However, Postgres 9.5 has some nice features we’d like to use (e.g. UPSERT), so we’re pondering if it is possible for us to require Postgres 9.5 from all of ours Live Data Feed users starting on May 16, 2016. 

We have already informally queried a few of ours users and so far it seems that requiring Postgres 9.5 is feasible. If you are a Live Data Feed user and feel that this requirement of Postgres 9.5 is too much for your and your organization by May 16, 2016, please leave a comment to this blog post!

There will be no autumn 2015 schema change

Schema changes are always a lot of work for us and we end up spending much time preparing for it and then even more time cleaning up/catching up after it. As a result, some critical non-schema change features keep getting pushed back… to the point that we never get to them.

To try and break this cycle, we’re going to skip the Autumn 2015 schema change. Instead we will focus on other tasks such as hosting and community features.

We will resume our schedule with the next planned schema change around 15 May, 2016. After that release we will determine if we want to go ahead with 1 or 2 schema change releases a year.

Schema change release, 2015-05-18 (including upgrade instructions)

Our previously mentioned schema change release is finished! Below will be upgrade instructions, including configuration updates for replication access tokens.

This release does not include UI for several of the schema change patches, which will (hopefully) happen for next release on June 1. The incomplete patches are MBS-7489 (credits for artists in relationships), MBS-4145 (tag upvote/downvote), and MBS-8004 (collections for additional entity types). These patches have had their schema change components finished, but the UI was incomplete or needed more work.

Schema Change Upgrade Instructions

These are largely as previous upgrade instructions, using the tag v-2015-05-18-schema-change. The primary difference is the inclusion of configuring an access token for replication.

  1. Make sure your REPLICATION_TYPE setting is RT_SLAVE and your DB_SCHEMA_SEQUENCE is set to 21 in lib/DBDefs.pm. If you’re running a standalone server, you can run the upgrade, but it may be easier to just import a new data dump!
  2. Ensure you’ve replicated up to the most recent replication packet available with the old schema. (if you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to update, it should say “Mismatched schema sequence, 21 (database) vs 22 (replication packet)”).
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you are automatically updating the database via cron jobs.
  5. Switch to the new code with git fetch origin followed by git checkout v-2015-05-18-schema-change
  6. Run ./upgrade.sh (or carton exec -Ilib -- ./upgrade.sh if you’re using carton, with very old setups).
  7. Run cpanm --installdeps --notest . to ensure your perl-based dependencies are up to date. This release adds a dependency on LWP::Protocol::https, for fetching replication packets from the new server; many systems may already have this installed, but it should be verified.
  8. Set DB_SCHEMA_SEQUENCE to 22 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh
  9. Assuming you have been updating your server with replication, it will now be necessary to configure an access token:
    1. Go to https://metabrainz.org/supporters/account-type and choose your account type as applicable. If you’re an individual, non-commercial user of the data, choose “non-commercial”; if not, choose an applicable tier in the “commercial” section. If you’re not sure of the appropriate tier, make your best guess; it can be adjusted if necessary.
    2. Then, from https://metabrainz.org/profile, create an access token, which should be a 40-character random alphanumeric string provided by the site.
    3. Finally, add this token to lib/DBDefs.pm under the REPLICATION_ACCESS_TOKEN configuration option. The final configuration section should look something like sub REPLICATION_ACCESS_TOKEN { "ck3UpgwgOXhWC6SpFcd99rZOTjzfrei3gQlgZZ9z" }.
    4. Don’t reveal your access token! If you do, inadvertently, you can use the MetaBrainz site to generate a new token, invalidating the old one. (The one in the example above is one I created for myself and then invalidated — don’t get any ideas, it won’t work!)
  10. Turn cron jobs back on, if applicable.
  11. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart memcached.

Finally, the list of bugs closed this release:

Bug

  • [MBS-4436] – Medium titles cannot be longer than 255 charaters

Improvement

  • [MBS-1347] – Implement aliases for release groups, releases and recordings
  • [MBS-7906] – maybe don’t show “”≠null diff. in edit pages
  • [MBS-8279] – Remove empty_artists etc. database functions

New Feature

  • [MBS-8302] – Add Live Data Feed access token support

Task

  • [MBS-8266] – Make medium titles VARCHAR NOT NULL
  • [MBS-8278] – Update DB_SCHEMA_SEQUENCE in DbDefs.pm.sample
  • [MBS-8283] – Remove DB constraint that disallows empty event names

Not included in this list but also relevant is MBS-8349, which while fixed for a previous release, in this release is also applied to old slave servers, which may help performance for some queries.