MusicBrainz schema change release, 2021-05-17 (with upgrade instructions)

We’re happy to announce the release of our May 2021 schema change today! Thanks to all who were patient during today’s downtime as we released everything to our production servers.

This is a fairly minor release as far as schema changes go, but please do report any issues that you come across, especially related to the display of recordings, releases and release groups on artist and release group pages.

New user-facing changes in this release are limited to the new ability to merge collections (MBS-10208) and the addition of ratings for places (MBS-11451). Additionally, MBS-11463 adds a new view that is used to fix a couple of small requests related to disc IDs (MBS-11268) and release length calculation (MBS-11349). Two other changes, adding a first-release-date field to recordings (MBS-1424) and support for PKCE in OAuth (MBS-11097), are more or less end-user affecting but were already released on the main MusicBrainz servers a while ago. All other changes are under the hood only.

We ran into a few complications while working on this schema change update, so we decided to postpone two changes to our October schema change, to ensure that we only release changes we are confident in. Those are MBS-11457, which involves dropping the ordering_attribute column for series and would have had no direct effect on user experience, and MBS-11456, which would have added MBIDs for artist credits.

A few of the released new features and improvements — namely the first-release-date field for recordings, and the performance improvements to artist pages — make use of new materialized tables. These tables aren’t dumped, nor are they replicated, since they’re derived entirely from primary table data. Rather, we’ve added a new script to build them (admin/BuildMaterializedTables, included in the upgrade instructions below), and triggers to keep them up-to-date once they’re built. These triggers are created on replicated servers, too. If you use the web interface or web service at all, just note the extra step of running BuildMaterializedTables after upgrade.sh below!
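As a rough illustration of the build-then-trigger pattern described above, here is a minimal Python sketch using the standard sqlite3 module. The table, trigger and function names are hypothetical and greatly simplified. This is not the MusicBrainz schema: the real tables live in PostgreSQL and are built by admin/BuildMaterializedTables.

```python
import sqlite3


def create_primary(db):
    """Create a toy primary table (hypothetical) and load some existing rows."""
    db.executescript("""
        CREATE TABLE recording_release (recording INTEGER, release_date TEXT);
        INSERT INTO recording_release VALUES
            (1, '1999-05-17'), (1, '1997-03-01'), (2, '2005-10-19');
    """)


def build_materialized(db):
    """One-off build of the derived table, then install a trigger to maintain it."""
    db.executescript("""
        -- Derived entirely from primary data, so it never needs to be dumped.
        CREATE TABLE first_release_date (
            recording INTEGER PRIMARY KEY,
            release_date TEXT
        );
        INSERT INTO first_release_date
            SELECT recording, MIN(release_date)
            FROM recording_release
            GROUP BY recording;

        -- Only inserts are handled here; a fuller version would also
        -- need triggers for updates and deletes.
        CREATE TRIGGER recording_release_ins AFTER INSERT ON recording_release
        BEGIN
            DELETE FROM first_release_date WHERE recording = NEW.recording;
            INSERT INTO first_release_date
                SELECT NEW.recording, MIN(release_date)
                FROM recording_release
                WHERE recording = NEW.recording;
        END;
    """)


def first_release_date(db, recording):
    """Read the earliest date from the derived table (None if unknown)."""
    row = db.execute(
        "SELECT release_date FROM first_release_date WHERE recording = ?",
        (recording,),
    ).fetchone()
    return row[0] if row else None
```

The point of the split is the same as in the upgrade: the derived table is populated once from existing rows, and from then on the trigger adjusts it whenever new primary data arrives.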

A new release of MusicBrainz Docker is also available that solves an issue for live indexing and matches this update of MusicBrainz Server. See the release notes for update instructions.

Now, on to the instructions.

Schema Change Upgrade Instructions

Note: Importing the latest data dump is always a valid alternative to running ./upgrade.sh on an existing database, if you’d prefer to also get new data in one go. Just follow the relevant instructions in INSTALL.md. The git tag is v-2021-05-19-hotfixes. The rest of the instructions here assume an in-place upgrade.

  1. Make sure DB_SCHEMA_SEQUENCE is set to 25 in lib/DBDefs.pm.
  2. If you’re using the live data feed (your REPLICATION_TYPE is set to RT_SLAVE), ensure you’ve replicated up to the most recent replication packet available with the old schema. If you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to upgrade, it should say “This replication packet matches schema sequence #26, but the database is currently at #25.”
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you’re automatically updating the database via cron jobs.
  5. If you’re using the live search indexing, stop it and, assuming sir is under the same directory as musicbrainz-server, run cd ../sir && python2.7 -m sir triggers && cd - && ./admin/psql < ../sir/sql/DropTriggers.sql && ./admin/psql < ../sir/sql/DropFunctions.sql
  6. Switch to the new code with git fetch origin followed by git checkout v-2021-05-19-hotfixes.
  7. Install the newer dependencies (Perl 5.30 or later and Node.js 16) according to the install prerequisites.
  8. Run cpanm --installdeps --notest . (note the dot at the end) to ensure your perl-based dependencies are up to date.
  9. Run ./upgrade.sh (it may take a while to vacuum at the end).
  10. Set DB_SCHEMA_SEQUENCE to 26 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh.
  11. If you’re using the web interface or web service, run ./admin/BuildMaterializedTables --database=MAINTENANCE all to build new materialized tables. These will take several additional gigabytes of space and be kept up-to-date automatically via triggers. For more information, see INSTALL.md.
  12. If you’re using the live search indexing, assuming sir is under the same directory as musicbrainz-server, run cd ../sir && git fetch origin && git checkout v2.1.0 && python2.7 -m sir triggers && cd - && ./admin/psql < ../sir/sql/CreateFunctions.sql && ./admin/psql < ../sir/sql/CreateTriggers.sql and rebuild indexes (by running cd ../sir && python2.7 -m sir reindex && cd -) then start it in watch mode (with cd ../sir && git fetch origin && git checkout v2.1.0 && python2.7 -m sir amqp_watch)
  13. Turn cron jobs back on, if applicable.
  14. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart Redis. If you’re accessing your MusicBrainz server in a web browser, run ./script/compile_resources.sh.
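
If you script your upgrades, the readiness check from step 2 could be automated along these lines. This is only a sketch: it assumes you have captured the output of ./admin/replication/LoadReplicationChanges yourself, that the message is worded exactly as quoted in step 2, and the function name is hypothetical.

```python
import re

# Pattern for the message quoted in step 2 of the instructions.
_MESSAGE = re.compile(
    r"matches schema sequence #(\d+), but the database is currently at #(\d+)"
)


def ready_for_schema_upgrade(output, old_sequence, new_sequence):
    """Return True if the output reports a packet at new_sequence
    while the database is still at old_sequence."""
    match = _MESSAGE.search(output)
    if match is None:
        # Message not found: either more packets remain or something else happened.
        return False
    packet_sequence, database_sequence = (int(group) for group in match.groups())
    return packet_sequence == new_sequence and database_sequence == old_sequence
```

For this release you would call it with old_sequence=25 and new_sequence=26.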

Here’s the list of resolved tickets:

New Feature

  • [MBS-10208] – Allow merging collections
  • [MBS-11451] – Support ratings for places
  • [MBS-11463] – Add view to easily access medium track lengths
  • [MBS-11652] – Add support for artist series (hotfixed)

Improvement

  • [MBS-10962] – Speed up listing artist’s releases
  • [MBS-11268] – Show “Set track durations” on release/discids page
  • [MBS-11460] – Add materialized tables to fetch release groups by artist or track artist

Database Schema Change Task

  • [MBS-10647] – Add [no label] to b_del_label_special trigger for labels
  • [MBS-11453] – Change entity0_cardinality, entity1_cardinality to SMALLINT
  • [MBS-11459] – Create the edit_data_type_info function on mirrors
  • [MBS-11464] – Drop table statistics.log_statistic
  • [MBS-11466] – Change language.frequency and script.frequency to SMALLINT

Previously Released Changes

  • [MBS-1424] – Add a ‘First release date’ field to recordings
  • [MBS-10821] – Edit changing medium tracklist and format is stuck
  • [MBS-11097] – Support PKCE (Proof Key for Code Exchange) by OAuth clients
  • [MBS-11431] – Speed up /ws/js/check_duplicates

Schema change release: May 17, 2021

We’re having a schema change release on May 17, mostly to make small changes that will make our queries more efficient, ensure better constraints, and make some hardcoded options editable without schema changes in the future. We are also upping the required versions of both Perl and Node.js to 5.30 and 16.0 respectively (see the “Minimum version requirements” section below.)

Here’s our list of tickets for the Spring 2021 schema change, with descriptions of what’s being changed:

Schema changes

  • MBS-1424: Add a “first release date” field to recordings. A very popular request for years, this allows requesting the date of the first ever release a recording appeared on. This adds materialized tables recording_first_release_date and release_first_release_date which are updated via triggers whenever the earliest date changes. The change was released as an optional extension to the main MusicBrainz server schema on Dec 16, 2020, but it will be added to the main schema during this schema change.
  • MBS-10208: Allow merging collections. Users who decide two of their collections should be joined into a larger one should be able to do so without having to move all the entities in the collection manually. This requires adding an editor_collection_gid_redirect table (equivalent to other existing x_gid_redirect tables) to ensure the old collection links redirect to the one they have been merged into.
  • MBS-10566: Convert allowed_series_entity_type and allowed_collection_entity_type to tables to allow for additions without schema changes. The constraints allowed_series_entity_type and allowed_collection_entity_type specify which types of entity can be used in series and collections, respectively. As such, if we want to add the possibility to create series of artists, we need to modify the constraint during a schema change. For ease of use, we are moving the constraints to be their own tables instead, allowing us to update them as needed in the future outside of a schema change release.
  • MBS-10647: Add [no label] to b_del_label_special trigger for labels. The b_del_label_special trigger ensures that any attempt to remove a special purpose label fails. Currently it only checks the special case “Deleted label”, but since “[no label]” is also a special purpose label that should never be deleted, we will add its ID to the trigger check.
  • MBS-10821: Remove orphaned recordings from collections for deletion. Replaces a single function, delete_orphaned_recordings(), to add a new clause that makes it so that recordings referenced only in collections (but not linked to anything else in the database) can be deleted as orphans. This was released on the main MusicBrainz servers on June 15, 2020, but it will be added to the main schema during this schema change.
  • MBS-10962 / MBS-11460: Add materialized tables and indexes to fetch releases and release groups by artist or track artist. These tickets will address performance issues on our current artist pages. They do not modify any existing tables, but as mentioned, add some new tables (to be updated via triggers) and indexes.
  • MBS-11097: Support PKCE (Proof Key for Code Exchange) by OAuth clients. Adds two new columns to the editor_oauth_token table. This feature is opt-in, but allows public OAuth2 clients to mitigate auth code interception attacks. The change was released as an optional extension to the main MusicBrainz server schema on Sep 21, 2020, but it will be added to the main schema during this schema change.
  • MBS-11431: Speed up /ws/js/check_duplicates. Adds new indexes only (on the artist, label, place, and series tables, plus their respective alias tables). Improves some slow queries in the editing interface related to duplicate checking, i.e. finding other entities with the same name. Since this is a non-breaking change, it was released on the main MusicBrainz servers on Mar 15, 2021, but it will be added to the main schema during this schema change.
  • MBS-11451: Support ratings for places. Places can be reviewed in CritiqueBrainz yet cannot be rated in MusicBrainz. This is a strange state of affairs: clearly, if they are worth reviewing in depth, they also deserve the option to rate them. As such, we will add place_rating_raw and place_meta tables, in the same way we have for other ratable entities.
  • MBS-11453: Change entity0_cardinality, entity1_cardinality to SMALLINT. The cardinality columns of the link_type table are used to indicate whether the entity on each side of the relationship is expected to have only a few uses of the relationship type in question, or many (too many to comfortably display/edit). This is what stops every single recording of a work showing up in an edit work page, for example. At the moment we allow only two values for cardinality, 0/1. While it is possible that we will want to allow a few other values in between what is effectively “do not show anywhere ever” and “show all the time”, it is clear we will not need more than 32,000 values. As such, we are moving these columns from INTEGER to SMALLINT to reduce table size.
  • MBS-11456: Add MBIDs and redirect tables for artist credits. Adds a gid column to the artist_credit table, and a new artist_credit_gid_redirect table. The MBIDs will allow public identification of artist credits outside of MusicBrainz, and open the door to useful features in the future.
  • MBS-11457: Drop series ordering_attribute. This column was added back when we were expecting to have different types of ordering attributes for series, but we have never used it. We are planning to just drop the column.
  • MBS-11459: Add a script to create edit_data_type_info. edit_data_type_info is a function used for development that we added in 2020 and we will now add to mirrors as well, both for consistency and for anyone who uses their mirror for development and just wants to use it.
  • MBS-11463: Add view to easily access medium track lengths. Our current way of finding the length of a medium is to either load all its tracks, then sum the durations, or to use the duration of its disc ID when present. The first of these requires a lot more processing than just getting the durations straight from the database, while the second ignores any data tracks not on the CD TOC. This adds a medium_track_durations view that allows easy access to the duration of the track lengths for any medium.
  • MBS-11464: Drop statistics.log_statistic. We added the statistics.log_statistic table for a Google Summer of Code project back during a 2012 schema change, but the work never got finished and implemented. We are not planning to implement it anymore, so this table is useless and we will be dropping it.
  • MBS-11466: Change language.frequency and script.frequency to SMALLINT. The frequency column of a language or script indicates how often it’s used and lets us sort the most frequently used entries at the top of our lists. But we don’t store an exact count, just a number from 0-4 indicating “frequently used,” “hidden,” or “other.” Like MBS-11453 above, we don’t expect to need a full INTEGER type to store these columns at any point in the future, so can safely move them to SMALLINT.
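
For client authors wondering what PKCE (MBS-11097 above) involves on their side, here is a minimal Python sketch of the S256 transform as defined in RFC 7636. The function names are our own for illustration. The sketch only covers generating the verifier and challenge, not the requests a client sends to the server, whose parameters are not described in this post.

```python
import base64
import hashlib
import secrets


def make_code_verifier(num_bytes=32):
    """Generate a random, URL-safe code verifier (43 characters for 32 bytes)."""
    return secrets.token_urlsafe(num_bytes)


def code_challenge_s256(verifier):
    """Derive the S256 code challenge: base64url(SHA-256(verifier)), unpadded."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client keeps the verifier secret, sends only the challenge with the authorization request, and reveals the verifier when exchanging the code for a token. An intercepted authorization code is therefore of no use without the verifier.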

Minimum version requirements

We’re raising the minimum required version for Node.js to v16 (which is the next LTS release coming in April 2021). Our current required version, v10, is hitting the end of its life cycle in April, and given there shouldn’t be a particular difficulty installing the current Node.js LTS on any system, it makes sense to just upgrade to the most recent LTS by the time of the schema change day.

We’re also raising the minimum required version for Perl to 5.30. The latest version is 5.32, but the current Ubuntu LTS (20.04), which is likely to be the next base image for our Docker containers, only provides 5.30. Our current required version, 5.18, was released all the way back in 2013, so moving to 5.30 is already a fairly significant improvement.

We’ll post final details about this release just prior to the release and shortly after we complete it, including instructions on how you can update your own copy. If you have any questions, please do leave a comment below or on the linked JIRA tickets!

MusicBrainz Server update, 2020-10-19

Today’s MusicBrainz Server update brings a new data report, a continued conversion to React, some bugfixes and small improvements, as well as some test refactoring.

Meanwhile, the search server has been updated twice in a row, mostly to fix bugs in JSON output, with MB Solr 3.2 (release notes) and MB Solr 3.3 (release notes), including the MusicBrainz API breaking change announced last month.

A new release of MusicBrainz Docker is also available that matches this update of MusicBrainz Server. See the release notes for update instructions.

Thanks to amCap1712 for fixing bugs in MB Solr, and to loujine for contributing code with yet another new data report. Thanks to Avamander, bonchiver_, chaban, draconx, eloise_freya, GTF1982, hawke, hibiscuskazeneko, jesus2099, jrv, kellnerd, Kid Devine, Psychoadept, selflessself, and wcw1966 for having reported bugs and suggested improvements. Thanks to kellnerd, mfmeulenbelt, and salorock for updating the translations. And thanks to all others who tested the beta version!

The git tag is v-2020-10-19.

Bug

  • [MBS-10221] – Track Parser not filling in artists
  • [MBS-11149] – Misspelling of the word “misspellings”
  • [MBS-11150] – Add CD-TOC wrongly defaults to last listed artist when none selected
  • [MBS-11156] – Track parser unsets “Various Artists” track artist credits
  • [MBS-11162] – Work type description bubble starts as default even if type is selected
  • [MBS-11174] – Editor profile added entities: Missing Add release edit type 216
  • [MBS-11176] – Heading of the release group section in the external links sidebar has disappeared

Improvement

  • [MBS-5225] – Allow showing tracklists everywhere when attaching/viewing discIDs
  • [MBS-7256] – Add “Expand all mediums” option to the release page
  • [MBS-8725] – Allow mediums to have an unknown tracklist
  • [MBS-11115] – Show detailed information when attaching disc IDs
  • [MBS-11139] – Use HTTPS for display on Library of Congress links
  • [MBS-11163] – Show type descriptions when editing entities
  • [MBS-11165] – Update the VK logo used in the sidebar
  • [MBS-11167] – Normalize vk.com links to HTTPS
  • [MBS-11173] – When deleting users, change their No and Yes votes on pending edits to Abstain

New Feature

  • [MBS-11117] – Report for mediums with very long durations from discID

React Conversion Task

  • [MBS-11141] – Convert Edit Relationship edit to React
  • [MBS-11152] – Convert entity/ratings page to React

Other Task

  • [MBS-11148] – Remove Google Play links from the sidebar

MusicBrainz Server update, 2020-09-21

React conversion tasks are conspicuously absent from today’s release, but that’s just because we needed to take some time to get it all working with the recent refactoring. This new server update mainly brings strong security improvements for the OAuth service. It also comes with a fair amount of smaller bugfixes and improvements. The most noticeable of these probably are the added details to the merge recordings’ form and the statistics by entity type on editors’ profile pages.

Announcement for MusicBrainz API users: A small but breaking change will be deployed on October 19th (one month from now), to fix the JSON formatting of release packaging in search results (SEARCH-579).

A new release of MusicBrainz Docker is also available that matches this update of MusicBrainz Server. See the release notes for update instructions.

Thanks to kellnerd and loujine for contributing code. Thanks to calculator.ftvb, chaban, hibiscuskazeneko, jesus2099, kellnerd, lalinksy, psychoadept, rdswift, and spitzwegerich for having reported bugs and suggested improvements. Thanks to jesus2099, kellnerd, mfmeulenbelt, outsidecontext, and salorock for updating the translations. And thanks to all others who tested the beta version!

The git tag is v-2020-09-21.

Bug

  • [MBS-10880] – Series automatic ordering (without numbers) fails for new release group
  • [MBS-11065] – Smart link blocks affecting legitimate links
  • [MBS-11069] – Diff highlighting not visible for certain display resolutions
  • [MBS-11098] – Big Cartel URLs are denied for labels
  • [MBS-11101] – Series relationships not showing for work series

Improvement

  • [MBS-2768] – Display AcoustIDs, Annotation and any other useful info when merging recordings
  • [MBS-7473] – Adding a new discid: Allow to specify the target by its releaseid
  • [MBS-11017] – Normalize IMSLP URLs to HTTPS and add validation
  • [MBS-11058] – Tighten security of OAuth service
  • [MBS-11061] – Don’t allow MusicBrainz URLs in relationships
  • [MBS-11062] – Link basic how-tos from the front page
  • [MBS-11086] – Add icon for tabs with errors in release editor
  • [MBS-11109] – Block further more smart links
  • [MBS-11119] – Set a Content-Security-Policy header on account/admin related forms

New Feature

  • [MBS-7485] – OAuth token revokation through API
  • [MBS-9769] – Show entities added statistics on editor profile page
  • [MBS-10835] – Disallow creating new accounts with an e-mail already in use
  • [MBS-11097] – Support PKCE (Proof Key for Code Exchange) by OAuth clients

Task

  • [MBS-10921] – Clear editing history of unrelated recording-of relationship edits

Reminder: Upgrading to PostgreSQL 12 on May 18, 2020

As we announced in February, in two weeks’ time (May 18, 2020) we’ll be upgrading our production database server to PostgreSQL v12 (from v9.5). At the same time, v12 will become the minimum supported version for MusicBrainz Server, so we ask that you upgrade as soon as possible afterwards! If you’re still unsure, a Q&A is below.

When do I need to upgrade my postgres by?

As soon as possible after May 18 if you’d like to keep your musicbrainz-server code up to date.

How do I perform the upgrade?

We’ll provide instructions closer to May 18. It’s recommended that you don’t upgrade until then, since we’ll be providing scripts to resolve some issues.

Will the live data feed (replication packets) stop working right away if I don’t upgrade?

No, as long as you keep your musicbrainz-server code checkout on the v-2020-05-11 tag (which will be the final release before May 18) or earlier. Future releases may work for a while too.

This is not a schema change release, so replication will continue to work smoothly until you upgrade. No tables or views will change.

However, to make the upgrade process smoother we’ll be dropping the musicbrainz-collate and musicbrainz-unaccent extensions, instead using PG’s builtin collation support for the former and replacing the latter with the unaccent extension from postgresql-contrib. A few SQL functions are being added to enable this, and some indexes need to be rebuilt. This will all happen as part of upgrade scripts we provide (or you can import from scratch). Some features of musicbrainz-server that use these old extensions may cease to work if you don’t apply them.

The extension changes above don’t actually make use of any new PG 12 features. We’ll avoid using such features for at least 1 month.

If I’m already running PostgreSQL 12, do I need to do anything?

Yes, but things will be easier for you. As mentioned in the previous answer, we’ll be dropping the musicbrainz-collate and musicbrainz-unaccent extensions to make the upgrade process smoother for pre-v12 instances. So you’ll only have to run some upgrade scripts we provide to replace those extensions and rebuild some indexes.

My host/distribution doesn’t have PostgreSQL 12 yet!

If you’re running Debian or Ubuntu, the PGDG maintains an APT repository with the latest versions. These are the same packages MetaBrainz uses in production.

Amazon RDS has supported PostgreSQL 12 since March 31.

I absolutely cannot upgrade yet! What should I do?

You can stay on the v-2020-05-11 release of musicbrainz-server or earlier until then. Replication packets (i.e. the live data feed) will continue to work until the next schema change on that tag, but you’ll have upgraded to v12 by then, right?

Instead of performing a pg_upgrade and running these upgrade scripts you mentioned, can I just import fresh data dumps into a new v12 cluster?

Of course. Just make sure your musicbrainz-server git checkout is on the v-2020-05-18 tag (once that’s released) or later before performing the import. And keep in mind it may be slower than a direct upgrade.

MusicBrainz schema change release, 2019-05-13 (with upgrade instructions)

We’re happy to announce the release of our May 2019 schema change today! Thanks to all who were patient during today’s downtime as we released everything to our production servers.

This is a fairly minor release as far as schema changes go, but please do report any issues that you come across, especially any related to genres and collections.

Visible changes with this release are limited to an indication if a specific artist credit is being edited (MBS-5387). Work on some of the changes to collections and genres is quite advanced, and we’re hoping to release some of the new features onto beta already in a week or so from now, while others might take a while longer.

Now, on to the instructions.

Schema Change Upgrade Instructions

Note: Importing the latest data dump is always a valid alternative to running ./upgrade.sh on an existing database, if you’d prefer to also get new data in one go. Just follow the relevant instructions in INSTALL.md. The git tag is v-2019-05-13-schema-change. The rest of the instructions here assume an in-place upgrade.

  1. Make sure DB_SCHEMA_SEQUENCE is set to 24 in lib/DBDefs.pm.
  2. If you’re using the live data feed (your REPLICATION_TYPE is set to RT_SLAVE), ensure you’ve replicated up to the most recent replication packet available with the old schema. If you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to upgrade, it should say “This replication packet matches schema sequence #25, but the database is currently at #24.”
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you’re automatically updating the database via cron jobs.
  5. Switch to the new code with git fetch origin followed by git checkout v-2019-05-13-schema-change.
  6. Install the newer dependencies (Yarn and Node.js 8 or later) according to the install prerequisites.
  7. Run cpanm --installdeps --notest . (note the dot at the end) to ensure your perl-based dependencies are up to date.
  8. Run ./upgrade.sh (it may take a while to vacuum at the end).
  9. Set DB_SCHEMA_SEQUENCE to 25 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh.
  10. Turn cron jobs back on, if applicable.
  11. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart redis. If you’re accessing your MusicBrainz server in a web browser, run ./script/compile_resources.sh.

Here’s the list of resolved tickets:

Bug

  • [MBS-5387] – ACs being edited aren’t marked as having pending edits on the aliases tab
  • [MBS-9365] – event_meta_fk_id was never created as part of any upgrade script
  • [MBS-9462] – Standalone databases created before schema 21 are missing some l_event_url triggers
  • [MBS-10146] – Regression: ISE on Remove DiscID page
  • [MBS-10149] – Swap track titles with artist credits fails to update both fields properly
  • [MBS-10150] – Regression: The link to the release group reviews in the release page is broken

Improvement

  • [MBS-9664] – Add database constraints to disallow loop relationship
  • [MBS-10044] – Add place area to place lists

Database Schema Change Task

  • [MBS-10052] – Add new schema for the event art archive
  • [MBS-10173] – Create a genre table in the DB and populate it with existing genres
  • [MBS-10174] – Create an addition timestamp in the DB for new collection items
  • [MBS-10175] – Create a position integer in the DB for collection items
  • [MBS-10176] – Create a comment text field in the DB for collection items
  • [MBS-10177] – Create an editor_collection_collaborator table for collaborative collections
  • [MBS-10178] – Create a genre_alias table
  • [MBS-10181] – Create filesize for cover art and each thumb in the DB

React Conversion Task

  • [MBS-9925] – Convert collection pages to React
  • [MBS-10179] – Convert all entity list components to React

Schema change release, 2017-05-15 (including upgrade instructions)

We’re happy to announce the release of our May 2017 schema change today! Thanks to all who were patient during today’s downtime as we released everything to our production servers.

This is a fairly minor release as far as schema changes go, but please do report any issues that you come across.

Currently, the only visible change for editors is the ability to add multiple lyrics languages to works. We’ve also modified the schema to support dynamic attributes for entities other than works, but the UI for that won’t be complete for another release or two.

Now, on to the instructions.

Schema Change Upgrade Instructions

Note: Importing the latest data dump is always a valid alternative to running ./upgrade.sh on an existing database, if you’d prefer to also get new data in one go. Just follow the relevant instructions in INSTALL.md. The rest of the instructions here assume an in-place upgrade.

  1. Make sure DB_SCHEMA_SEQUENCE is set to 23 in lib/DBDefs.pm.
  2. If you’re using the live data feed (your REPLICATION_TYPE is set to RT_SLAVE), ensure you’ve replicated up to the most recent replication packet available with the old schema. If you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to upgrade, it should say “This replication packet matches schema sequence #24, but the database is currently at #23.”
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you’re automatically updating the database via cron jobs.
  5. Switch to the new code with git fetch origin followed by git checkout v-2017-05-15-schema-change.
  6. Run cpanm --installdeps --notest . (note the dot at the end) to ensure your perl-based dependencies are up to date.
  7. Downgrade DBD::Pg by running cpanm TURNSTEP/DBD-Pg-3.5.3.tar.gz (version 3.6.0 breaks things currently).
  8. Run ./upgrade.sh (it may take a while to vacuum at the end).
  9. Set DB_SCHEMA_SEQUENCE to 24 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh.
  10. Turn cron jobs back on, if applicable.
  11. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart redis. If you’re accessing your MusicBrainz server in a web browser, run npm install followed by ./script/compile_resources.sh.

For those curious, here’s the list of resolved tickets (excluding MBS-8393):

New Feature

  • [MBS-9271] – Prevent usernames from being reused

Task

  • [MBS-9273] – Fix the a_ins_edit_note function in older setups to not populate edit_note_recipient for own notes
  • [MBS-9274] – Fix the edit_note_idx_post_time_edit index in older setups to handle NULL post_time

Improvement

  • [MBS-5452] – Support multiple lyric language values for works

May 2017 Schema Change Release: May 15, 2017

We have picked our set of tickets and the date for our May 2017 schema change release: May 15th, 2017. This will be a fairly standard and minor schema change release: we’re only tackling 3 tickets that affect downstream users and no other infrastructure changes.

Take a look at our list of tickets for this schema change release. There really are only two tickets that will affect most of our downstream users:

  • MBS-8393: “Extend dynamic attributes to all entities” Currently our works have the concept of additional attributes, which allow the community to decide which sorts of new attributes to apply to a work (e.g. catalog numbers, rhythmic structures, etc.). This ticket will extend these attributes to all of our entities. This ticket will not change any of the existing database tables; it will only add new tables.
  • MBS-5452: “Support multiple lyric language values for works” Currently only one language or the special case “multiple languages” may be used to identify the language used in lyrics. This ticket allows more than one language to be specified for lyrics of a work.

The following tickets are special cases: they will not really affect our downstream users who do not have edit data loaded into their system. We are only including these changes at the schema change release time in order to bring some older replicated systems up to date. If you do not use the edit data, then please ignore these tickets.

  • MBS-9271: “Prevent usernames from being reused” This ticket does not change the schema, but for the sake of minimizing downstream disruption, we’re going to carry out this ticket during the schema change.
  • MBS-9274: “Fix the edit_note_idx_post_time_edit index in older setups to handle NULL post_time” This ticket fixes an SQL index on an edit related table.
  • MBS-9273: “Fix the a_ins_edit_note function in older setups to not populate edit_note_recipient for own notes” This ticket fixes an SQL function on an edit related table.

This is it — really minor this time around. If you have any questions, feel free to post them in the comments or on the tickets themselves.

 

Schema change release: What happened?

Now that we’ve finally finished the schema change release, I wanted to give an account of what happened in this arduous process. Before I dive into the details, I want to offer a picture that best sums up our current situation and challenges:

[Image: personal-container-mngmnt3]

The shipping container is MusicBrainz and the boat is our hosting infrastructure. This picture perfectly describes the sort of challenges we’ve faced over the past few days. 🙂

Here is what happened:

Because the site had recently been running slowly and our search servers kept crashing, Zas and I were not available to help Bitmap prepare for the schema change release. This long process was left to Bitmap and Gentlecat to take care of on their own. When the due date came, we quickly realized that we were not ready for the release, so we delayed it by one week.

Sunday 22 May

Finally we were ready to proceed with the Postgres 9.5 upgrade. Once we started the process, we kept running into small problems that we didn’t get in our test setups. We do not have access to enough infrastructure to have a complete clone of our production environment, so we can only do so much to prepare for all the things that might happen when we run upgrades on our production servers.

While we were attempting to start the upgrade, our backup database server was running much slower than anticipated. In the end we figured out that a step for optimizing the database (analyzing it) hadn’t been carried out. During this time the site was really slow or unusable, but by the time the problem became apparent we had started the upgrade and could not turn back.
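For anyone running their own server: analyzing a database refreshes the statistics that the Postgres query planner relies on, and without them queries can be planned very badly. Here is a minimal sketch in Python of scripting that step, assuming the standard `vacuumdb` client tool is on the PATH. The function names and the database name `musicbrainz_db` are illustrative assumptions, not necessarily the exact commands used during this upgrade.

```python
import subprocess


def analyze_command(all_databases=True, dbname=None, in_stages=True):
    """Build a vacuumdb command line that refreshes planner statistics."""
    cmd = ["vacuumdb"]
    if all_databases:
        cmd.append("--all")
    elif dbname:
        cmd += ["--dbname", dbname]
    # --analyze-in-stages gathers rough statistics quickly, then refines them;
    # --analyze-only gathers full statistics in one pass. Neither one vacuums.
    cmd.append("--analyze-in-stages" if in_stages else "--analyze-only")
    return cmd


def run_analyze(**kwargs):
    """Run the command; raises CalledProcessError if vacuumdb fails."""
    subprocess.run(analyze_command(**kwargs), check=True)


# Usage: inspect the command before running it against a real server.
# "musicbrainz_db" is a hypothetical database name.
print(" ".join(analyze_command(all_databases=False, dbname="musicbrainz_db")))
```

Building the command separately from running it makes it easy to review or log exactly what will be executed before touching a production database.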

Once the upgrade was done, optimizing the database took much longer than usual: 3 hours! This process wasn’t started until about 1am local time, which made for a very long night before it finished. And even then we hit snags and had to start over a couple of times. At about 4:30am we had the site running on Postgres 9.5 in read-only mode. The plan was to rest and start the schema change release in the morning.

Monday 23 May

Of course we had spent all of our time working on the Postgres upgrade and site stability, so our document that we use to plan the schema change was not in place. We spent the day preparing this and other bits for the release. To get an appreciation for what this document looks like, have a look! Note that some steps could be instant, others might take hours to carry out. Others might involve a sub-step or 20 not included in the document.

In the evening we were ready to make the change. By this point our backup DB was performing much better, so the read-only site worked acceptably. Thus, we started the release. Overall, the actual release process was reasonably smooth – we hit a few snags and had to do a lot of waiting for our slow servers. At about 1am things were finally complete. We proceeded with our sanity checks to make sure things had gone smoothly, and all of them passed.

We proceeded to put the site into read-write mode and immediately saw portions of Postgres crashing, which is really bad. With community feedback we quickly deduced that some write operations were causing Postgres back-end processes to crash. We went back to read-only mode on the site, things stabilized, and we finally went to bed at 3am.

Tuesday 24 May

In the morning we quickly found the source of the database trouble with help from the Postgres people on IRC. Thanks for the swift help, Johto! We found that the steps for installing the updated third-party extensions into Postgres had not completed correctly. Repeating the steps by hand fixed this problem.

Sadly, yesterday morning we got an email informing us that our Live Data Feed replication stream had become corrupted. 😦 This was heartbreaking news to us, since it means a great inconvenience to all of our Live Data Feed users. We immediately split into two teams: Zas, chirlu and I would fix the root cause of the issue, and Bitmap would investigate fixing the stream.

I proceeded to set up a test environment and was able to quickly reproduce the problem. Zas and chirlu were an amazing support team, Googling issues as I came across them. Within a fairly short time we fixed the problem and deployed the fix to our database server. The problem was caused by a bug in a piece of code that we’ve been using for 13 years! A change in Postgres caused this bug to actually become a problem and corrupt our replication feed. 😦

Once the problems were fixed, we needed to initiate a new data dump and check that the replication stream was working correctly. Of course we found another problem, fixed it, and restarted the process to dump the data. Loads of hurry-up-and-wait situations to try our patience!

When we were satisfied that things were working correctly, we re-enabled the site as read-write at about 1am and allowed people to continue editing. Exhausted, we stumbled into bed, waiting for the data dumps to sync out to the FTP site.

Wednesday 25 May

Today Bitmap was flying home, and as soon as WiFi became available on his flight he started working and helping to put the schema change to bed. We’ve verified that everything is working as expected. At last this saga comes to an end and we can all take a break and catch up on sleep!

Thank you for your patience through all of this.