MusicBrainz Server update, 2020-10-19

Today’s MusicBrainz Server update brings a new data report, continued conversion to React, some bugfixes and small improvements, and some test refactoring.

Meanwhile, the search server has been updated twice in a row, with MB Solr 3.2 (release notes) and MB Solr 3.3 (release notes), mostly to fix bugs in JSON output. These updates include the MusicBrainz API breaking change announced last month.

A new release of MusicBrainz Docker is also available that matches this update of MusicBrainz Server. See the release notes for update instructions.

Thanks to amCap1712 for fixing bugs in MB Solr, and to loujine for contributing code with yet another data report. Thanks to Avamander, bonchiver_, chaban, draconx, eloise_freya, GTF1982, hawke, hibiscuskazeneko, jesus2099, jrv, kellnerd, Kid Devine, Psychoadept, selflessself, and wcw1966 for having reported bugs and suggested improvements. Thanks to kellnerd, mfmeulenbelt, and salorock for updating the translations. And thanks to all others who tested the beta version!

The git tag is v-2020-10-19.

Bug

  • [MBS-10221] – Track Parser not filling in artists
  • [MBS-11149] – Misspelling of the word “misspellings”
  • [MBS-11150] – Add CD-TOC wrongly defaults to last listed artist when none selected
  • [MBS-11156] – Track parser unsets “Various Artists” track artist credits
  • [MBS-11162] – Work type description bubble starts as default even if type is selected
  • [MBS-11174] – Editor profile added entities: Missing Add release edit type 216
  • [MBS-11176] – Heading of the release group section in the external links sidebar has disappeared

Improvement

  • [MBS-5225] – Allow showing tracklists everywhere when attaching/viewing discIDs
  • [MBS-7256] – Add “Expand all mediums” option to the release page
  • [MBS-8725] – Allow mediums to have an unknown tracklist
  • [MBS-11115] – Show detailed information when attaching disc IDs
  • [MBS-11139] – Use HTTPS for display on Library of Congress links
  • [MBS-11163] – Show type descriptions when editing entities
  • [MBS-11165] – Update the VK logo used in the sidebar
  • [MBS-11167] – Normalize vk.com links to HTTPS
  • [MBS-11173] – When deleting users, change their No and Yes votes on pending edits to Abstain

New Feature

  • [MBS-11117] – Report for mediums with very long durations from discID

React Conversion Task

  • [MBS-11141] – Convert Edit Relationship edit to React
  • [MBS-11152] – Convert entity/ratings page to React

Other Task

  • [MBS-11148] – Remove Google Play links from the sidebar

MusicBrainz Server update, 2020-09-21

React conversion tasks are conspicuously absent from today’s release, but that’s just because we needed to take some time to get it all working with the recent refactoring. This new server update mainly brings strong security improvements for the OAuth service. It also comes with a fair amount of smaller bugfixes and improvements. The most noticeable of these probably are the added details to the merge recordings’ form and the statistics by entity type on editors’ profile pages.

Announcement for MusicBrainz API users: A small but breaking change will be deployed on October 19th (in one month from now), to fix the JSON formatting of release packaging in search results (SEARCH-579).

A new release of MusicBrainz Docker is also available that matches this update of MusicBrainz Server. See the release notes for update instructions.

Thanks to kellnerd and loujine for contributing code. Thanks to calculator.ftvb, chaban, hibiscuskazeneko, jesus2099, kellnerd, lalinksy, psychoadept, rdswift, and spitzwegerich for having reported bugs and suggested improvements. Thanks to jesus2099, kellnerd, mfmeulenbelt, outsidecontext, and salorock for updating the translations. And thanks to all others who tested the beta version!

The git tag is v-2020-09-21.

Bug

  • [MBS-10880] – Series automatic ordering (without numbers) fails for new release group
  • [MBS-11065] – Smart link blocks affecting legitimate links
  • [MBS-11069] – Diff highlighting not visible for certain display resolutions
  • [MBS-11098] – Big Cartel URLs are denied for labels
  • [MBS-11101] – Series relationships not showing for work series

Improvement

  • [MBS-2768] – Display AcoustIDs, Annotation and any other useful info when merging recordings
  • [MBS-7473] – Adding a new discid: Allow to specify the target by its releaseid
  • [MBS-11017] – Normalize IMSLP URLs to HTTPS and add validation
  • [MBS-11058] – Tighten security of OAuth service
  • [MBS-11061] – Don’t allow MusicBrainz URLs in relationships
  • [MBS-11062] – Link basic how-tos from the front page
  • [MBS-11086] – Add icon for tabs with errors in release editor
  • [MBS-11109] – Block further more smart links
  • [MBS-11119] – Set a Content-Security-Policy header on account/admin related forms

New Feature

  • [MBS-7485] – OAuth token revokation through API
  • [MBS-9769] – Show entities added statistics on editor profile page
  • [MBS-10835] – Disallow creating new accounts with an e-mail already in use
  • [MBS-11097] – Support PKCE (Proof Key for Code Exchange) by OAuth clients

Task

  • [MBS-10921] – Clear editing history of unrelated recording-of relationship edits

Reminder: Upgrading to PostgreSQL 12 on May 18, 2020

As we announced in February, in two weeks’ time (May 18, 2020) we’ll be upgrading our production database server to PostgreSQL v12 (from v9.5). At the same time, v12 will become the minimum supported version for MusicBrainz Server, so we ask that you upgrade afterwards as soon as possible! If you’re still unsure, a Q&A is below.

When do I need to upgrade my postgres by?

As soon as possible after May 18 if you’d like to keep your musicbrainz-server code up to date.

How do I perform the upgrade?

We’ll provide instructions closer to May 18. It’s recommended that you don’t upgrade until then, since we’ll be providing scripts to resolve some issues.

Will the live data feed (replication packets) stop working right away if I don’t upgrade?

No, as long as you keep your musicbrainz-server code checkout on the v-2020-05-11 tag (which will be the final release before May 18) or earlier. Future releases may work for a while too.

This is not a schema change release, so replication will continue to work smoothly until you upgrade. No tables or views will change.

However, to make the upgrade process smoother we’ll be dropping the musicbrainz-collate and musicbrainz-unaccent extensions, instead using PG’s builtin collation support for the former and replacing the latter with the unaccent extension from postgresql-contrib. A few SQL functions are being added to enable this, and some indexes need to be rebuilt. This will all happen as part of upgrade scripts we provide (or you can import from scratch). Some features of musicbrainz-server that use these old extensions may cease to work if you don’t apply them.

The extension changes above don’t actually make use of any new PG 12 features. We’ll avoid using such features for at least 1 month.
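
For readers unfamiliar with what an "unaccent" function is for, the sketch below illustrates the general idea in Python using Unicode decomposition. It is only an illustration: PostgreSQL's unaccent extension works from its own rules file and handles cases this does not (a letter such as "ø" has no decomposition and is left alone here), and the function name is hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # Keep every character that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

With this, a search for "Bjork" and the stored name "Björk" reduce to the same string, which is the kind of accent-insensitive matching such a function enables.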

If I’m already running PostgreSQL 12, do I need to do anything?

Yes, but things will be easier for you. As mentioned in the previous answer, we’ll be dropping the musicbrainz-collate and musicbrainz-unaccent extensions to make the upgrade process smoother for pre-v12 instances. So you’ll only have to run some upgrade scripts we provide to replace those extensions and rebuild some indexes.

My host/distribution doesn’t have PostgreSQL 12 yet!

If you’re running Debian or Ubuntu, the PGDG maintains an APT repository with the latest versions. These are the same packages MetaBrainz uses in production.

Amazon RDS has supported PostgreSQL 12 since March 31.

I absolutely cannot upgrade yet! What should I do?

You can stay on the v-2020-05-11 release of musicbrainz-server or earlier until then. Replication packets (i.e. the live data feed) will continue to work until the next schema change on that tag, but you’ll have upgraded to v12 by then, right?

Instead of performing a pg_upgrade and running these upgrade scripts you mentioned, can I just import fresh data dumps into a new v12 cluster?

Of course. Just make sure your musicbrainz-server git checkout is on the v-2020-05-18 tag (once that’s released) or later before performing the import. And keep in mind it may be slower than a direct upgrade.
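
If you script your deployments, the tag dates mentioned in the answers above can be checked mechanically. This is a rough sketch that assumes release tags always start with v-YYYY-MM-DD (as the tags quoted in these posts do); the function names and the cutoff constant are introduced here for illustration only.

```python
import re
from datetime import date

# Per this post: v12 becomes the minimum supported version on May 18, 2020.
PG12_MINIMUM_FROM = date(2020, 5, 18)

_TAG_RE = re.compile(r"^v-(\d{4})-(\d{2})-(\d{2})")


def tag_date(tag: str) -> date:
    """Extract the release date from a tag such as 'v-2020-05-11'."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag format: {tag!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)


def needs_postgres_12(tag: str) -> bool:
    """True if the tag is dated on or after the PostgreSQL 12 cutoff."""
    return tag_date(tag) >= PG12_MINIMUM_FROM
```

For example, needs_postgres_12("v-2020-05-11") is false and needs_postgres_12("v-2020-05-18") is true; suffixes such as "-schema-change" after the date are ignored.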

MusicBrainz schema change release, 2019-05-13 (with upgrade instructions)

We’re happy to announce the release of our May 2019 schema change today! Thanks to all who were patient during today’s downtime as we released everything to our production servers.

This is a fairly minor release as far as schema changes go, but please do report any issues that you come across, especially any related to genres and collections.

Visible changes with this release are limited to an indication if a specific artist credit is being edited (MBS-5387). Work on some of the changes to collections and genres is quite advanced, and we’re hoping to release some of the new features onto beta already in a week or so from now, while others might take a while longer.

Now, on to the instructions.

Schema Change Upgrade Instructions

Note: Importing the latest data dump is always a valid alternative to running ./upgrade.sh on an existing database, if you’d prefer to also get new data in one go. Just follow the relevant instructions in INSTALL.md. The git tag is v-2019-05-13-schema-change. The rest of the instructions here assume an in-place upgrade.

  1. Make sure DB_SCHEMA_SEQUENCE is set to 24 in lib/DBDefs.pm.
  2. If you’re using the live data feed (your REPLICATION_TYPE is set to RT_SLAVE), ensure you’ve replicated up to the most recent replication packet available with the old schema. If you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to upgrade, it should say “This replication packet matches schema sequence #25, but the database is currently at #24.”
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you’re automatically updating the database via cron jobs.
  5. Switch to the new code with git fetch origin followed by git checkout v-2019-05-13-schema-change.
  6. Install newer dependencies Yarn and NodeJS 8 or later according to install prerequisites.
  7. Run cpanm --installdeps --notest . (note the dot at the end) to ensure your perl-based dependencies are up to date.
  8. Run ./upgrade.sh (it may take a while to vacuum at the end).
  9. Set DB_SCHEMA_SEQUENCE to 25 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh.
  10. Turn cron jobs back on, if applicable.
  11. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart redis. If you’re accessing your MusicBrainz server in a web browser, run ./script/compile_resources.sh.
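
Steps 1 and 9 above both depend on the value of DB_SCHEMA_SEQUENCE, so a small pre-flight check can save a failed run. The sketch below assumes the setting appears in lib/DBDefs.pm as a Perl sub of the form "sub DB_SCHEMA_SEQUENCE { 24 }"; verify that against your own file, since a commented-out line of the same shape would also match. The function names are hypothetical.

```python
import re
from typing import Optional

# Assumed layout of the setting in lib/DBDefs.pm: sub DB_SCHEMA_SEQUENCE { 24 }
_SEQ_RE = re.compile(r"sub\s+DB_SCHEMA_SEQUENCE\s*\{\s*(\d+)\s*\}")


def read_schema_sequence(dbdefs_text: str) -> Optional[int]:
    """Return the configured schema sequence, or None if it is not found."""
    match = _SEQ_RE.search(dbdefs_text)
    return int(match.group(1)) if match else None


def ready_for_upgrade(dbdefs_text: str, expected_old: int) -> bool:
    """True if the configured sequence equals the pre-upgrade value."""
    return read_schema_sequence(dbdefs_text) == expected_old
```

A typical call would read the file first, for example ready_for_upgrade(open("lib/DBDefs.pm").read(), 24) before running ./upgrade.sh for this release.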

Here’s the list of resolved tickets:

Bug

  • [MBS-5387] – ACs being edited aren’t marked as having pending edits on the aliases tab
  • [MBS-9365] – event_meta_fk_id was never created as part of any upgrade script
  • [MBS-9462] – Standalone databases created before schema 21 are missing some l_event_url triggers
  • [MBS-10146] – Regression: ISE on Remove DiscID page
  • [MBS-10149] – Swap track titles with artist credits fails to update both fields properly
  • [MBS-10150] – Regression: The link to the release group reviews in the release page is broken

Improvement

  • [MBS-9664] – Add database constraints to disallow loop relationship
  • [MBS-10044] – Add place area to place lists

Database Schema Change Task

  • [MBS-10052] – Add new schema for the event art archive
  • [MBS-10173] – Create a genre table in the DB and populate it with existing genres
  • [MBS-10174] – Create an addition timestamp in the DB for new collection items
  • [MBS-10175] – Create a position integer in the DB for collection items
  • [MBS-10176] – Create a comment text field in the DB for collection items
  • [MBS-10177] – Create an editor_collection_collaborator table for collaborative collections
  • [MBS-10178] – Create a genre_alias table
  • [MBS-10181] – Create filesize for cover art and each thumb in the DB

React Conversion Task

  • [MBS-9925] – Convert collection pages to React
  • [MBS-10179] – Convert all entity list components to React

Schema change release, 2017-05-15 (including upgrade instructions)

We’re happy to announce the release of our May 2017 schema change today! Thanks to all who were patient during today’s downtime as we released everything to our production servers.

This is a fairly minor release as far as schema changes go, but please do report any issues that you come across.

Currently, the only visible change for editors is the ability to add multiple lyrics languages to works. We’ve also modified the schema to support dynamic attributes for entities other than works, but the UI for that won’t be complete for another release or two.

Now, on to the instructions.

Schema Change Upgrade Instructions

Note: Importing the latest data dump is always a valid alternative to running ./upgrade.sh on an existing database, if you’d prefer to also get new data in one go. Just follow the relevant instructions in INSTALL.md. The rest of the instructions here assume an in-place upgrade.

  1. Make sure DB_SCHEMA_SEQUENCE is set to 23 in lib/DBDefs.pm.
  2. If you’re using the live data feed (your REPLICATION_TYPE is set to RT_SLAVE), ensure you’ve replicated up to the most recent replication packet available with the old schema. If you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to upgrade, it should say “This replication packet matches schema sequence #24, but the database is currently at #23.”
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you’re automatically updating the database via cron jobs.
  5. Switch to the new code with git fetch origin followed by git checkout v-2017-05-15-schema-change.
  6. Run cpanm --installdeps --notest . (note the dot at the end) to ensure your perl-based dependencies are up to date.
  7. Downgrade DBD::Pg by running cpanm TURNSTEP/DBD-Pg-3.5.3.tar.gz (version 3.6.0 breaks things currently).
  8. Run ./upgrade.sh (it may take a while to vacuum at the end).
  9. Set DB_SCHEMA_SEQUENCE to 24 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh.
  10. Turn cron jobs back on, if applicable.
  11. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart redis. If you’re accessing your MusicBrainz server in a web browser, run npm install followed by ./script/compile_resources.sh.
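
The command-line part of the list above (steps 5 to 8) can be sketched as a small script. This is a convenience sketch rather than an official tool: the function names are made up here, it has to be run from the musicbrainz-server checkout, and it deliberately leaves out the web server, cron, and DBDefs.pm steps, which depend on your setup.

```python
import subprocess
from typing import List


def upgrade_commands(tag: str) -> List[List[str]]:
    """Return the commands for steps 5-8 of the list above, in order."""
    return [
        ["git", "fetch", "origin"],
        ["git", "checkout", tag],
        ["cpanm", "--installdeps", "--notest", "."],
        ["cpanm", "TURNSTEP/DBD-Pg-3.5.3.tar.gz"],  # the DBD::Pg downgrade from step 7
        ["./upgrade.sh"],
    ]


def run_all(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it too unless dry_run is set."""
    for argv in commands:
        print("+ " + " ".join(argv))
        if not dry_run:
            # check=True raises on a non-zero exit, stopping at the first failure.
            subprocess.run(argv, check=True)
```

Calling run_all(upgrade_commands("v-2017-05-15-schema-change")) only prints the plan; pass dry_run=False once the output looks right.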

For those curious, here’s the list of resolved tickets (excluding MBS-8393):

Bug

New Feature

  • [MBS-9271] – Prevent usernames from being reused

Task

  • [MBS-9273] – Fix the a_ins_edit_note function in older setups to not populate edit_note_recipient for own notes
  • [MBS-9274] – Fix the edit_note_idx_post_time_edit index in older setups to handle NULL post_time

Improvement

  • [MBS-5452] – Support multiple lyric language values for works

May 2017 Schema Change Release: May 15, 2017

We have picked our set of tickets and the date for our May 2017 schema change release: May 15th, 2017. This will be a fairly standard and minor schema change release, with only 3 tickets that affect downstream users and no other infrastructure changes.

Take a look at our list of tickets for this schema change release. There really are only two tickets that will affect most of our downstream users:

  • MBS-8393: “Extend dynamic attributes to all entities” Currently our works have the concept of additional attributes, which allows the community to decide which sorts of new attributes to apply to a work (e.g. catalog numbers, rhythmic structures, etc.). This ticket will extend these attributes to all of our entities. It will not change any of the existing database tables; it will only add new tables.
  • MBS-5452: “Support multiple lyric language values for works” Currently only one language or the special case “multiple languages” may be used to identify the language used in lyrics. This ticket allows more than one language to be specified for lyrics of a work.

The following tickets are special cases — they will not really affect our downstream users who do not have edit data loaded into their system. We are only including this change at the schema change release time in order to bring some older replicated systems up to date. If you do not use the edit data, then please ignore these tickets.

  • MBS-9271: “Prevent usernames from being reused” This ticket does not change the schema, but for sake of minimizing downstream disruption, we’re going to carry out this ticket during the schema change.
  • MBS-9274: “Fix the edit_note_idx_post_time_edit index in older setups to handle NULL post_time” This ticket fixes an SQL index on an edit related table.
  • MBS-9273: “Fix the a_ins_edit_note function in older setups to not populate edit_note_recipient for own notes” This ticket fixes an SQL function on an edit related table.

This is it — really minor this time around. If you have any questions, feel free to post them in the comments or on the tickets themselves.

 

Schema change release: What happened?

Now that we’ve finally finished the schema change release, I wanted to give an account of what happened in this arduous process. Before I dive into the details, I want to offer a picture that best sums up our current situation and challenges:

[Image: personal-container-mngmnt3]

The shipping container is MusicBrainz and the boat is our hosting infrastructure. This picture perfectly describes the sort of challenges we’ve faced over the past few days. 🙂

Here is what happened:

Because the site was recently running slow and our search servers kept crashing, Zas and I were not available to help Bitmap prepare for the schema change release. This long process was left to Bitmap and Gentlecat to take care of on their own. We quickly realized that we were not ready for the release when the due date came and thus we delayed one week.

Sunday 22 May

Finally we were ready to proceed with the Postgres 9.5 upgrade. Once we started the process, we kept running into small problems that we didn’t get in our test setups. We do not have access to enough infrastructure to have a complete clone of our production environment, so we can only do so much to prepare for all the things that might happen when we run upgrades on our production servers.

While we attempted to start the upgrade, our backup database server was running much slower than anticipated. In the end we figured out that a step for optimizing the database (analyzing it) wasn’t carried out. During this time the site was really slow/unusable, but by the time the problem became apparent we had started the upgrade and could not turn back.

Once the upgrade was done, optimizing the database took much much longer than usual: 3 hours! This process wasn’t started until about 1am local time, which made for a very long night before that process finished. And even then we hit snags and had to start over a couple of times. At about 4:30am we had the site running on Postgres 9.5 in read only mode. The plan was to rest and start the schema change release in the morning.

Monday 23 May

Of course we had spent all of our time working on the Postgres upgrade and site stability, so our document that we use to plan the schema change was not in place. We spent the day preparing this and other bits for the release. To get an appreciation for what this document looks like, have a look! Note that some steps could be instant, others might take hours to carry out. Others might involve a sub-step or 20 not included in the document.

In the evening we were ready to make the change. By this point our backup DB was performing much better, so the read-only site worked acceptably. Thus, we started the release. Overall, the actual release process was reasonably smooth – we hit a few snags and had to do a lot of waiting for our slow servers. At about 1am things were finally complete. We proceeded with our sanity checks to make sure things went smoothly and all of them passed.

We proceeded to put the site into read-write mode and immediately saw portions of Postgres crashing, which is really bad. With community feedback we quickly deduced that some write operations were causing Postgres back-end processes to crash. We went back to read-only mode on the site and things stabilized and we finally went to bed at 3am.

Tuesday 24 May

In the morning we quickly found the source of database trouble with the help from the Postgres people on IRC. Thanks for the swift help Johto! We found that the steps for installing the updated third party extensions into Postgres had not completed correctly. Repeating the steps by hand fixed this problem.

Sadly yesterday morning we got an email informing us that our Live Data Feed replication stream had become corrupted. 😦 This was heartbreaking news to us, since it means a great inconvenience to all of our Live Data Feed users. We immediately split into two teams: Zas, chirlu and myself to fix the root cause of the issue and Bitmap to investigate fixing the stream.

I proceeded to set up a test environment and was able to quickly reproduce the problem. Zas and chirlu were an amazing support team, Googling issues as I came across them. Within a fairly short time we fixed the problem and deployed the fix to our database server. The problem was caused by a bug in a piece of code that we’ve been using for 13 years! A change in Postgres caused this bug to actually become a problem and corrupt our replication feed. 😦

Once the problems were fixed we needed to initiate a new data dump and check to make sure the replication stream is working correctly. Of course we found a problem that we fixed and re-started the process to dump the data. Loads of hurry-up-and-wait situations to try our patience!

When we were satisfied that things were working correctly we re-enabled the site as read-write at about 1am and allowed people to continue editing. Exhausted we stumbled into bed waiting for data dumps to sync out to the FTP site.

Wednesday 25 May

Today Bitmap was flying home and as soon as WiFi became available on his flight he started working and helping with putting the schema change to bed. We’ve verified that everything is working as expected. At last this saga comes to an end and we can all take a break and catch up on sleep!

Thank you for your patience through all of this.

Schema change release, 2016-05-23 (with upgrade instructions)

Starting with this release, PostgreSQL 9.5 is now our minimum supported version. In order to import any future data sets, you will need to upgrade your installation to version 9.5.

Due to unforeseen problems with the Live Data Feed (AKA replication), users with slave databases will be required to first import a fresh data dump into their new 9.5 installation. We apologize that this is the case, but even had this stream not been broken, doing a clean import is faster and easier than doing the migration. For details on what happened during this rather lengthy schema change release, stay tuned for a post mortem blog post that covers the details.

If you have a non-replicated standalone database, you can use pg_upgrade and run ./upgrade.sh directly, but for simplicity we strongly recommend importing the latest data dump. Thus, we will only provide instructions for a clean import:

  1. Make sure you have PostgreSQL 9.5 installed, and your database settings in lib/DBDefs.pm are updated to point to the 9.5 installation if you currently have an older version of postgres running. If you already have postgres 9.5 and want to replace the existing database there, you’ll need to drop it first (using dropdb or from within psql). Be careful that you’re not dropping any important data if this is a standalone database that you’ve made changes to.
  2. Take down the web server running MusicBrainz, if you’re running a web server.
  3. Turn off cron jobs if you are automatically updating the database via cron jobs.
  4. Switch to the new code with git fetch origin followed by git checkout v-2016-05-23-schema-change-v2
  5. Run cpanm --installdeps --notest . to ensure your perl-based dependencies are up to date. Note the dot at the end.
  6. Set DB_SCHEMA_SEQUENCE to 23 in lib/DBDefs.pm
  7. Download the latest data dumps. If you don’t need historical edit data, excluding the edit dump will speed up your import significantly.
  8. Initialize a new database from the data dumps downloaded in step 7. Detailed instructions for doing this are located in INSTALL.md in the musicbrainz-server repository; if your data dumps are in /tmp, the command should simply be something like ./admin/InitDb.pl --createdb --import /tmp/mbdump*.tar.bz2.
  9. After the import has finished, turn cron jobs back on, if applicable.
  10. Restart the MusicBrainz web server, as well as memcached, if applicable.
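
Steps 7 and 8 above can be sketched as follows for anyone automating the import. The sketch assumes the edit dump is the file named mbdump-edit.tar.bz2 (check the names of the files you actually downloaded); the helper names are hypothetical, and only the InitDb.pl invocation itself comes from the instructions.

```python
from pathlib import Path
from typing import Iterable, List

EDIT_DUMP_NAME = "mbdump-edit.tar.bz2"  # assumed file name of the edit dump


def select_dumps(paths: Iterable[str], include_edits: bool = False) -> List[str]:
    """Pick mbdump*.tar.bz2 files, skipping the edit dump unless requested."""
    chosen = []
    for path in sorted(paths):
        name = Path(path).name
        if not (name.startswith("mbdump") and name.endswith(".tar.bz2")):
            continue  # not a data dump archive
        if name == EDIT_DUMP_NAME and not include_edits:
            continue  # leaving out edit history speeds up the import
        chosen.append(path)
    return chosen


def initdb_command(dumps: List[str]) -> List[str]:
    """Build the InitDb.pl command line from step 8 for the given dumps."""
    return ["./admin/InitDb.pl", "--createdb", "--import", *dumps]
```

The resulting list can be handed to subprocess.run from the musicbrainz-server checkout, or simply printed and pasted into a shell.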

We would like to thank bitmap, Gentlecat, zas, chirlu, reosarevok, and gcilou for contributing directly to the release and we’d also like to thank all of the people who helped test, debug or otherwise offer support in this quite difficult release. Thank you!

And finally, here’s the list of changes you can expect in the upgrade:

Bug

  • [MBS-6406] – Admins can’t change email addresses
  • [MBS-8288] – Missing indexes for inverse lookup on *_gid_redirect tables
  • [MBS-8669] – Primary key for place table missing on old slaves
  • [MBS-8906] – Release pages ISE if CB doesn’t return JSON from its API for whatever reason
  • [MBS-8928] – If you submit the release editor without being logged in, it displays “[object Object]” as an error mesage
  • [MBS-8943] – Some pages do not respect DB_READ_ONLY setting

Improvement

  • [MBS-1873] – Fix vote tallies for edits
  • [MBS-3887] – Duplicate artist and label names not being checked against alias
  • [MBS-8287] – Log deleted entities that were in a subscribed collection
  • [MBS-8433] – Work attributes don’t have a uuid
  • [MBS-8716] – Store the edit data in a JSONB column
  • [MBS-8717] – Move the edit data to a separate table
  • [MBS-8838] – Add gids to all *_type* tables
  • [MBS-8873] – Convert and unify artist credit editors to React
  • [MBS-8909] – Add logos to IMDb and VGMdb links in the sidebar
  • [MBS-8939] – Update the Instagram logo used in the sidebar
  • [MBS-8940] – Let banner message editors dismiss the banner only temporarily

Task

  • [MBS-8656] – Bring edit table indexes back into sync
  • [MBS-8719] – Stop materializing of edit and vote counts
  • [MBS-8720] – Add a materialized view of edit note recipients
  • [MBS-8727] – Prevent duplicate votes
  • [MBS-8800] – Create the earthdistance extension and add a geodetic index for place coordinates
  • [MBS-8804] – Add BRIN indexes for timestamp columns
  • [MBS-8897] – add new entity icons
  • [MBS-8938] – Schema changes to support alternative tracklists

Upgrading Postgres for MusicBrainz Live Data Feed users

We’re slowly approaching that time of year: Schema change release time. After skipping our fall update to focus on some internal tasks, we’re ready to have another schema change release in the spring: May 16, 2016.

We have started the process to collect features we wish to release for this schema change release and we’ll be publishing that list in the coming weeks. However, we’re contemplating the impact of one more change we’d like to make: Upgrading to a more recent version of Postgres.

Internally we are going to upgrade to Postgres 9.5, which was recently released, so we expect that the Postgres team will have worked out the most significant kinks before we’re ready to move to it. However, even though we are moving to 9.5, we are considering the impact on our downstream users/customers who need to make the same or similar change.

While we are moving to version 9.5 of Postgres, we have the option of only adopting features from Postgres 9.4, which means that our downstream users may continue to use Postgres 9.4. However, Postgres 9.5 has some nice features we’d like to use (e.g. UPSERT), so we’re pondering if it is possible for us to require Postgres 9.5 from all of our Live Data Feed users starting on May 16, 2016.
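
For readers who have not met UPSERT: Postgres 9.5 added INSERT ... ON CONFLICT, which inserts a row or updates the existing one in a single statement. The sketch below demonstrates the idea with Python's bundled SQLite (3.24 or later) as a stand-in, since it accepts a very similar clause; the table and function names are invented for the example and have nothing to do with the MusicBrainz schema.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical counter table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counter (name TEXT PRIMARY KEY, hits INTEGER NOT NULL)")
    return conn


def bump(conn: sqlite3.Connection, name: str) -> None:
    """Insert a row with hits=1, or increment hits if the name already exists."""
    conn.execute(
        "INSERT INTO counter (name, hits) VALUES (?, 1) "
        "ON CONFLICT (name) DO UPDATE SET hits = counter.hits + 1",
        (name,),
    )
```

Without this clause, the same behaviour needs a separate lookup followed by an INSERT or UPDATE, which is more code and can race with concurrent writers.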

We have already informally queried a few of our users and so far it seems that requiring Postgres 9.5 is feasible. If you are a Live Data Feed user and feel that this requirement of Postgres 9.5 is too much for you and your organization by May 16, 2016, please leave a comment on this blog post!