MusicBrainz schema change release, 2022-05-16 (with upgrade instructions)

We’re happy to announce the release of our May 2022 schema change today! Thanks to all who were patient during today’s downtime as we released everything to our production servers, and thanks to ikerm2003, mfmeulenbelt rinmon and salo.rock for updating the translations.

This is once again a fairly minor release as far as schema changes go, but please do report any issues that you come across, especially related to the propagation of ratings and tags.

New, user-facing changes with this release include withdrawn-only release groups showing in the official overview again (MBS-12208) and the final disappearance of Amazon cover art (MBS-12200). To this regard, the report of releases that have Amazon cover art but no Cover Art Archive front cover will stay available for editors to check their subscribed entities.

Additionally, several small changes will allow, in the next couple of releases, to store more information about genres (including URL relationships) and to recognize and special-case mood tags (MBS-12190). Another new feature that will start to be used in the API and artist credit pages is the addition of MusicBrainz IDs for artist credits, which allow referring to them with a unique and more persistent ID (MBS-11456). Finally, a few more under-the-hood only changes are made, which should ensure better performance for finding artists, events, etc. for all areas contained in a given one, and less bugs when adding and changing tags and ratings.

The area containment changes make use of a new materialized table. Like the ones we added last year, this table isn’t dumped nor replicated, since it is derived entirely from primary table data. Rather, it will be created during migration (or, in a new install, by running the admin/BuildMaterializedTables script) and triggers will be added to keep it up-to-date once it has been built. These triggers are created on replicated servers, too.

The accompanying new version of the search index rebuilder brings performance improvements for both the main server and mirrors, and simplifies maintenance. See the release notes for details.

A new release of MusicBrainz Docker is also available that matches this update of MusicBrainz Server. See the release notes for update instructions.

Now, on to the instructions.

Schema Change Upgrade Instructions

Note: Importing the latest data dump is always a valid alternative to running ./upgrade.sh on an existing database, if you’d prefer to also get new data in one go. Just follow the relevant instructions in INSTALL.md. The git tag is v-2022-05-16.1-schema-change. The rest of the instructions here assume an in-place upgrade.

  1. Make sure DB_SCHEMA_SEQUENCE is set to 26 in lib/DBDefs.pm.
  2. If you’re using the live data feed (your REPLICATION_TYPE is set to RT_SLAVE), ensure you’ve replicated up to the most recent replication packet available with the old schema. If you’re not sure, run ./admin/replication/LoadReplicationChanges and see what it tells you; if you’re ready to upgrade, it should say “This replication packet matches schema sequence #27, but the database is currently at #26.”
  3. Take down the web server running MusicBrainz, if you’re running a web server.
  4. Turn off cron jobs if you’re automatically updating the database via cron jobs.
  5. If you’re using the live search indexing, stop it and, assuming sir is under the same directory as musicbrainz-server, run cd ../sir && python2.7 -m sir triggers && cd - && ./admin/psql < ../sir/sql/DropTriggers.sql && ./admin/psql < ../sir/sql/DropFunctions.sql
  6. Switch to the new code with git fetch origin followed by git checkout v-2022-05-16.1-schema-change.
  7. Run cpanm --installdeps --notest . (note the dot at the end) to ensure your perl-based dependencies are up to date.
  8. Run ./upgrade.sh (it may take a while to vacuum at the end).
  9. Set DB_SCHEMA_SEQUENCE to 27 in lib/DBDefs.pm as instructed by the output of ./upgrade.sh.
  10. If your REPLICATION_TYPE is set to RT_SLAVE, change it to RT_MIRROR. (The previous terminology will work for the time being, but is now deprecated.)
  11. If you’re using the live search indexing, assuming again that sir is under the same directory as musicbrainz-server, run cd ../sir && git fetch origin && git checkout v3.0.1 && python2.7 -m sir triggers && cd - && ./admin/psql < ../sir/sql/CreateFunctions.sql && ./admin/psql < ../sir/sql/CreateTriggers.sql and rebuild indexes which takes hours (by running cd ../sir && python2.7 -m sir reindex && cd -) then start it in watch mode (with cd ../sir && git fetch origin && git checkout v3.0.1 && python2.7 -m sir amqp_watch)
  12. Turn cron jobs back on, if applicable.
  13. Restart the MusicBrainz web server, if applicable. It’s also recommended you restart Redis. If you’re accessing your MusicBrainz server in a web browser, run ./script/compile_resources.sh.

Here’s the list of resolved tickets:

Fixed Bug

  • [MBS-5359] – *_tag tables are corrupt and need to be regenerated
  • [MBS-11760] – Removing the last use of a tag does not always remove the tag
  • [MBS-12369] – Standalone databases may be missing foreign keys for the documentation schema

New Feature

  • [MBS-12190] – Add Mood support in the database

Improvement

  • [MBS-11456] – Add MBIDs to artist credits in the database with merge
  • [MBS-12141] – Block tag names that are empty or have uncontrolled whitespace with database constraints
  • [MBS-12224] – Keep tags’ ref_count and aggregate vote counts updated with triggers
  • [MBS-12249] – Add a materialized area_containment table kept up-to-date with triggers
  • [MBS-12256] – Keep rating and rating_count column on *_meta tables up-to-date with triggers
  • [MBS-12313] – Clarify item naming in the Search drop down menu

Database Schema Change Task

  • [MBS-11457] – Drop the series ordering_attribute column
  • [MBS-11755] – Remove unused tags
  • [MBS-12157] – Remove support for Amazon cover art
  • [MBS-12200] – Drop schema objects related to Amazon cover art support
  • [MBS-12225] – Rename “slave” to “mirror” (inclusive language update)
  • [MBS-12250] – Create dbmirror2 schema on production and mirror servers
  • [MBS-12252] – Add edit_genre table
  • [MBS-12253] – Add relationship tables for genres
  • [MBS-12254] – Add genre_annotation table
  • [MBS-12255] – Add genre_alias_type table and make genre_alias consistent

10 thoughts on “MusicBrainz schema change release, 2022-05-16 (with upgrade instructions)”

  1. Thank you for your effort and this instructions!

    Do you have an ETA when your full dumps for the MB and Search Server data will be ready, exported and mirrored for this new schema #27?

  2. The MB database (schema 27) is already exported and mirrored.
    The search server is still under maintenance, dumps are unlikely to be available today.

  3. I was just asking in case you didn’t know. Because the searches on the website (indexed by default) do work for me, otherwise.
    Maybe you redirected indexed searches towards direct searches, or something like that.
    But anyway, fine, no problem if it is known. 🙂

  4. Oh OK, it’s specifically the index for recordings that is HS!
    Thanks for your IRC answer!

  5. Now it finds recordings, but not yet all recordings.
    Good luck fixing, yvanzo, I still see the maintenance notice.

  6. Indexed search is now fully back in service.
    Dump will be available on Saturday as usual.

  7. Hi, two problems relating to coverartarchive:

    1. I did a “recreatedb.sh -fetch http://ftp.eu.metabrainz.org/pub/musicbrainz” which succeeded with the notable exception of not importing the (successfully downloaded) coverartarchive tables. Errors were of the type:

    No data file found for ‘cover_art_archive.art_type’, skipping

    2. To resolve, I did a manual copy of the 5 tables in the downloaded mbdump-cover-art-archive.tar.bz2; e.g.

    psql -d musicbrainz -h db -U musicbrainz -c ‘COPY cover_art_archive.art_type FROM stdin’ < art_type

    However, there is only a single row for art_type:

    musicbrainz=# SELECT * FROM cover_art_archive.art_type;
    id | name | parent | child_order | description | gid
    —-+——–+——–+————-+————-+————————————–
    1 | Poster | | 0 | | 7ced53fc-bb27-33ae-aeef-79d6e24fec3c
    (1 row)

    but there are many more type_ids in the cover_art_type table:

    musicbrainz=# SELECT type_id, count(type_id) FROM cover_art_archive.cover_art_type cat GROUP BY type_id ORDER BY type_id;
    type_id | count
    ———+———
    1 | 1805469
    2 | 445565
    3 | 722235
    4 | 425675
    5 | 30269
    6 | 170857
    7 | 21246
    8 | 62405
    9 | 70652
    10 | 15736
    11 | 7866
    12 | 27916
    13 | 4637
    14 | 20406
    15 | 2576
    (15 rows)

    Please would you look into this? Thanks.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.