Steps to fix missing genre and tag data on MusicBrainz mirror servers

If you recently updated your mirror server to the 2022-05-16 schema change release, we’re sorry to say that a bug in our upgrade script caused aggregate genre and tag data (if you had imported any) to be deleted. If you need this data, it can be re-imported from a recent dump, and we’ve written a script to help automate that.

You can safely ignore this post if

To restore the genre and tag data, follow these steps:

  1. Ensure you’ve replicated up to the most recent replication packet available. If you’re not sure, run ./admin/replication/LoadReplicationChanges. If you’re up-to-date, it should log “Replication packet … is not available.”
  2. Run git checkout production && git pull origin production.
  3. Turn off any cron jobs that update the database, including for replication.
  4. Run ./admin/sql/updates/20220720-mbs-12508.sh.
  5. Restart any cron jobs that you disabled.

You can verify that this process worked by checking the number of tags in the database: echo 'SELECT count(*) FROM tag' | ./admin/psql READWRITE. It should be over 200,000.
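
If you'd rather script that check, here is a minimal sketch in shell. The helper name tag_count_ok is made up for this example and is not part of musicbrainz-server. It assumes psql prints its default aligned output (a header, a separator, then the count on a line of its own), and it uses the 200,000 figure quoted above as its default threshold.

```shell
# Hypothetical helper: reads psql output on stdin and succeeds only if the
# first line made up purely of digits is greater than the threshold.
# The threshold defaults to 200000 and can be overridden as the first argument.
tag_count_ok() {
    awk -v min="${1:-200000}" '
        /^[ \t]*[0-9]+[ \t]*$/ { found = 1; ok = ($1 + 0 > min + 0); exit }
        END { exit !(found && ok) }
    '
}

# Intended use, from the musicbrainz-server checkout:
#   echo 'SELECT count(*) FROM tag' | ./admin/psql READWRITE | tag_count_ok \
#       && echo "tag data looks restored" \
#       || echo "tag count is still low; the import may not have run"
```

Because the helper only reads text on standard input, it exits non-zero both when the count is too low and when no count could be found in the output at all.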

Sorry for the inconvenience, and let us know if you encounter any further issues.

Schema change release: May 16, 2022

Today we’re announcing a MusicBrainz database schema change release planned for May 16, 2022. The majority of these changes follow the theme of improving data integrity and consistency, performance, or just cleaning up old cruft. Others relate to new features for genres and artist credits. We’re also introducing a new tag-based entity, following in the footsteps of genres: Mood. See below for more details, including information on how these changes affect the schema or existing data. We expect people will encounter zero breaking changes, but it doesn’t hurt to double-check, especially if you have a specific or non-standard use of the database!

Here’s our list of tickets for the Spring 2022 schema change:

Schema changes

  • MBS-12256: Keep rating and rating_count column on *_meta tables up-to-date with triggers. An internal-only change to help us keep aggregate rating information (i.e. an average rating and count of ratings for each entity) up-to-date more easily, and help keep these values accurate. This change affects master/standalone databases, but should have no impact on mirror servers, where such triggers are not created. It’s possible that some existing aggregate ratings data was out of sync and will be updated with this change.
  • MBS-12224: Keep tags’ ref_count and aggregate vote counts updated with triggers. Like the change above for ratings, this is primarily internal-only and intended to help us keep tag counts in sync, though an adjacent goal is to make features like MBS-960 easier to implement. This also revives the tag.ref_count column, which hasn’t been updated for years, in order to provide a faster way of sorting tags/genres by usage count. Like above, this change should have no impact on mirror servers schema-wise, but will fix some existing corrupt tag counts.
  • MBS-12249: Add a materialized area_containment table kept up-to-date with triggers. Pages that make use of area containments, e.g. the list of artists from an area, which are expected to account for sub-areas, are currently quite slow and we’d like to improve upon this. The slowness is related to the recursive queries we use to get contained sub-areas – these queries are uncached and calculated on-the-fly. This ticket addresses these performance issues by caching area containment information in a new aptly-named area_containment table. Consistent with the tag and rating tickets above, this table will also be kept up-to-date with triggers. This change should have no impact on mirror servers except to make certain area requests faster; it does not affect existing data.
  • MBS-12250: Create dbmirror2 schema on production and mirror servers. The dbmirror extension we use to generate our replication packets (a.k.a. the Live Data Feed) is a 20-year-old tool. It has issues and limitations that are difficult to fix, and we aim to replace it with something more maintainable. We wrote dbmirror2 to do that, but still have the task of getting it deployed to mirrors seamlessly. This will happen invisibly without any changes needed on mirrors! The action for this ticket is simply to create the schema for dbmirror2; it’s not actually used for replication yet. We’ll first have a testing phase and make sure external projects like mbdata work with the new replication packet format.
  • MBS-12200: Drop schema objects related to Amazon cover art support. For a long while, releases with Amazon URLs would be checked for cover art on Amazon, and if found, a link to the image would be cached for display. Unfortunately Amazon’s API to do this changed, and we haven’t synced artwork from them in years. We still have many old images cached and we still display those, but they aren’t guaranteed to be in sync. Last year we decided to drop support for displaying these, while giving time for users to upload any correct images to the Cover Art Archive. To help with this, we have a report of releases with Amazon cover art but no Cover Art Archive front cover.

    The schema change here involves dropping the release_coverart table (which was private and non-replicated) and the release_meta.amazon_store column (which was completely empty and unused). This change should have no impact on mirror servers, unless you were using this table or column for your own purposes, because they should be otherwise empty.
  • MBS-12141: Block tag names that are empty or have uncontrolled whitespace with database constraints. Recently we discovered a bug where empty or blank tag names could be submitted in the web service. This has since been fixed, but we’d also like to prevent such empty tags by adding a database constraint. This change has no effect on mirror servers schema-wise, where such constraints are not created. A few blank tags have already been deleted from the production database, but otherwise existing data is not affected. If any such tags exist in your standalone database, they’ll be deleted.
  • MBS-12252: Add edit_genre table. A requirement to start storing edit history for genres (right now changes leave no trail). This will add the empty table to mirror servers as well, but will not affect any existing data.
  • MBS-12253: Add relationship tables for genres. A requirement to be able to relate genres to other entities (such as URLs for the equivalent Wikipedia or Rate Your Music pages). This will add the empty tables (and accompanying example tables in the documentation namespace) to mirror servers as well, but will not affect any existing data.
  • MBS-12254: Add genre_annotation table. A requirement to make it possible to eventually add annotations to genres. This will add the empty table to mirror servers as well, but will not affect any existing data.
  • MBS-12255: Add genre_alias_type table and make genre_alias consistent. Originally the (as yet unused) genre_alias table was designed as a heavily simplified version of the alias tables for other entities. In retrospect, this was not a good decision, since it would make it harder to just use the generic implementation of our alias code for genres. As such, we’re adding a genre_alias_type table (originally genre aliases had no types) and replacing the genre_alias table with one having the extra columns matching other entities’ alias tables. These changes will also happen to mirror servers, but since the genre_alias table was completely unused it should not cause any issues. In case any standalone (not mirrored) servers were using genre_alias, we will ensure any existing data is transferred to the new version of it.
  • MBS-12241: Drop the whitespace_collapsed database constraint. We’ve had a constraint for years that tries to ensure that columns like entity names do not contain multiple consecutive spacing characters (disallowing names such as “This    Title”). In retrospect, this was overreaching, since there are several cases in which a specific number of spaces in a title can be shown to be artist intent. Additionally, we recently discovered that we had some very old data in the database that actually violated this constraint (causing issues when importing data to a standalone server). The data seemed to actually be correct (i.e. some of the aforementioned edge cases) so rather than amending it, we’re removing the constraint. This will have no effect on mirrors since they don’t run constraints.
  • MBS-12225: Rename “slave” to “mirror” (inclusive language update). We recently got a request from a long-time supporting organization to pick a different term for what we’ve historically called “slave server”. Since we were already sometimes using “mirror server” to mean the exact same thing, we are just changing the official name and will use “mirror server” in the future. RT_SLAVE will still work in DBDefs.pm, at least for now, but we’d suggest changing to the new (and equivalent) RT_MIRROR in your mirror servers’ DBDefs.pm. We’re likely to eventually drop support for RT_SLAVE in a future schema change, so we’ll remind you about changing it in future upgrade instructions.
  • MBS-12190: Add Mood support. Music mood was originally meant to be automatically calculated by the now-discontinued AcousticBrainz project. That’s obviously no longer in the cards, but this data would still be quite useful for ListenBrainz. As such, we’re planning to add basic support for mood tags in MusicBrainz in the same way we currently do genres. This can then be leveraged by ListenBrainz to collect the information directly from users playing music through their BrainzPlayer, and it can also of course be entered directly from MusicBrainz in the same way other tags (including genre tags) can. This will add new mood tables to mirrors and will detect some previously generic tags as mood tags, but it won’t cause any changes to the underlying data.
  • MBS-11760: Expand the database triggers which remove empty tags to all entities. When the last use of a tag (up- or downvote) is removed, we use triggers to completely remove the tag from the database. Some of the relevant triggers (for events, places, recordings and releases) were never created though, so if the last use was on an entity of one of those types the empty tag would not get purged. This doesn’t affect mirrors directly since the triggers don’t run on mirrors (but they should no longer get unused tags coming through replication).
  • MBS-11457: Drop series ordering_attribute. This column was added back when we were expecting to have different types of ordering attributes for series, but we have never used it. We planned to remove it last year already, but that required equivalent changes on the search server that couldn’t be made at the time. We are planning to just drop the column.
  • MBS-11456: Add MBIDs and redirect tables for artist credits. Adds a gid column to the artist_credit table, and a new artist_credit_gid_redirect table. It generates MBIDs for existing artist credits that will be replicated to mirrors.

    Artist credit MBIDs will be mainly exposed through the web service at first. The MBIDs will allow public identification of artist credits outside of MusicBrainz, and open the possibility of some more features in the future.
  • MBS-12208: Show withdrawn release groups in the official artist overview. The Withdrawn release status was added recently. Release groups containing only releases with this status were meant to be shown in the main (official) artist overview, but the way this is implemented means a small edit to the get_artist_release_group_rows function was needed. Mirrors using the materialized tables will need to update the data; more info will be provided as part of the upgrade instructions.
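
For the RT_SLAVE to RT_MIRROR change described under MBS-12225 above, the edit to DBDefs.pm can be sketched as a small shell helper. This is only an illustration under some assumptions: the function name rename_replication_type is hypothetical, the default path lib/DBDefs.pm is the location INSTALL.md uses and may differ on your setup, and the constant is assumed to appear literally in the file.

```shell
# Hypothetical helper: replace RT_SLAVE with RT_MIRROR in a DBDefs.pm file,
# keeping the original as a .bak copy next to it.
# The path defaults to lib/DBDefs.pm (an assumption; pass your own if needed).
rename_replication_type() {
    dbdefs="${1:-lib/DBDefs.pm}"
    if [ ! -f "$dbdefs" ]; then
        echo "not found: $dbdefs" >&2
        return 1
    fi
    # Nothing to do if the old name is not present.
    grep -q 'RT_SLAVE' "$dbdefs" || return 0
    # Back up first, then rewrite the file from the backup.
    cp "$dbdefs" "$dbdefs.bak" &&
        sed 's/RT_SLAVE/RT_MIRROR/g' "$dbdefs.bak" > "$dbdefs"
}

# Intended use, from the musicbrainz-server checkout:
#   rename_replication_type lib/DBDefs.pm
```

It's worth diffing the file against the .bak copy afterwards, since the substitution is a plain text replacement and will also touch any comments that mention the old name.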

We’ll post upgrade instructions for standalone/mirror servers on the day of the release. If you have any questions, feel free to comment below, or on the linked JIRA tickets if relevant there!

NB: This post was updated on 21 March to include a ticket (MBS-12208) that we forgot to list originally.

Planned downtime for Wednesday, March 2, 16:00 UTC

We’ll be performing some critical maintenance on our database server starting at 16:00 UTC on March 2. Some brief downtime (hopefully less than 15 minutes) will be required to transfer services from our primary database server to our standby, which will allow us to upgrade the primary’s system and perform hardware checks. Thanks for your patience during this time.

MusicBrainz Server update, 2021-09-06

Today we’re happy to release yyoung’s work to improve the external links editor; see his detailed blog post for more information. Thanks for your hard work, yyoung!

This release also contains quite a few bug fixes and improvements, given the amount of time since the last release. Several of them fix guess-case issues, and we’ve refactored the guess-case code to be more maintainable in the future. A visible improvement you may notice is that we now display icons next to entity links in relationships to help distinguish different types of entities. (If you’ve been editing on MusicBrainz for a long time, you may remember something similar from the classic, pre-NGS website.)

A new release of MusicBrainz Docker is also available that matches this update of MusicBrainz Server. See the release notes for update instructions.

Thanks to aerozol, angriestchair, Beckfield, bsammon, chaban, CyberSkull, danBLOO, Dibou, Freso, HDS, jesus2099, KRSCuan, mr_maxis, mtrolley, navap, salo.rock, Shepard, yindesu, and yyoung for having reported bugs and suggested improvements. Thanks to Besnik, ffff23, kellnerd, mfmeulenbelt, panos, salo.rock, th1rtyf0ur, V6lur, and yoshi818 for updating the translations. And thanks to all others who tested the beta version!

The git tag is v-2021-09-06.

Fixed Bug

  • [MBS-11793] – Wikipedia abstract is fetched even though URL is ended
  • [MBS-11795] – Text overflow in external link relationship type description tooltip
  • [MBS-11802] – ISE on “Edit relationship type” edit for removed type
  • [MBS-11806] – Relationships for different tracks are wrongly grouped on release bottom credit display
  • [MBS-11812] – Merge queue: Missing whitespace before “New disc title”
  • [MBS-11832] – artist-credit/id page gives TypeError if the id does not exist
  • [MBS-11854] – Guess Case doesn’t properly capitalize after Unicode hyphen U+2010
  • [MBS-11861] – Weird odd/even classes client-side for tablesorted statistics
  • [MBS-11864] – Some DNB links are wrongly marked as invalid
  • [MBS-11922] – pre-NGS release type not shown for compilation
  • [MBS-11933] – /oauth2/token doesn’t validate the code parameter

Improvement

  • [MBS-2221] – Description for how to set a relationship as “in” a date
  • [MBS-2418] – Show “Edit URL” edits in entity edit histories
  • [MBS-2421] – Small icon near recording / work / release / artist / … names to distinguish them
  • [MBS-3774] – Add URL relationship with begin and end dates
  • [MBS-7859] – Hide relationships from original recordings to other/derived versions in release view
  • [MBS-10054] – Better editing/viewing of URL relationships from Artist page
  • [MBS-10910] – Display renamed labels on Overview
  • [MBS-11267] – Show label for cover art pieces when reordering cover art
  • [MBS-11391] – Show changes made to external link when editing URL relationship
  • [MBS-11622] – Clean up Apple Music label URLs
  • [MBS-11650] – Add Tag statistics to profile page
  • [MBS-11680] – Group editing URL relationships by external link
  • [MBS-11693] – Give useful message when rejecting Musixmatch /album links
  • [MBS-11722] – Don’t preselect basic as language proficiency
  • [MBS-11732] – Remove LYRICSnMUSIC from lyrics whitelist
  • [MBS-11733] – Remove WikiaParoles from lyrics whitelist
  • [MBS-11788] – Guess case: Lowercase “official” in ETI
  • [MBS-11796] – Add Internet Archive logo for sidebar
  • [MBS-11797] – Guess case: Lowercase “censored”, “uncensored”, “explicit” in ETI
  • [MBS-11798] – Disallow Instagram /accounts link and other internal links
  • [MBS-11808] – Don’t show entities in tag pages where vote count for tag is lower than 1
  • [MBS-11810] – Merge queue: Rename “disc title” to “medium title”
  • [MBS-11811] – Make Tracklist editing buttons behave consistently on collapsed mediums
  • [MBS-11823] – Don’t insert line breaks inside tag words
  • [MBS-11824] – featured artist reports should look for featured artists without a space
  • [MBS-11825] – Use consistent order for art types when editing vs adding cover art
  • [MBS-11833] – Drop “f.” from the featured artists reports
  • [MBS-11846] – Display release artist on release group view
  • [MBS-11850] – Make footer links more visible
  • [MBS-11862] – Do not show deprecated relationship types with 0 uses in selectors
  • [MBS-11863] – Allow DNB links for works
  • [MBS-11875] – Braille should not be looked for in “Releases with unlikely language/script pairs” in connection to a spoken/written language
  • [MBS-11888] – Automatically set/disable ended when setting an end date in DateRangeFieldset
  • [MBS-11891] – Use HTTPS when linking to Jira
  • [MBS-11912] – Also allow Mainly Norfolk as a lyrics source
  • [MBS-11913] – Auto-select “stream” in addition to “download” for Jamendo URLs
  • [MBS-11915] – Don’t show area icon if there’s already a flag icon
  • [MBS-11924] – Remove redundant tabs/links/info from deleted editor profiles

New Feature

  • [MBS-9426] – Interface to remove usernames from blocked list
  • [MBS-9902] – Support auto-select/cleanup/validation of more than one relationship type for external links
  • [MBS-11689] – Report: pseudo-releases marked as the original tracklist
  • [MBS-11828] – Add admin interface for checking whether a username is blocked
  • [MBS-11848] – Add report for “Releases with Amazon cover art without any Cover Art Archive images”

React Conversion Task

  • [MBS-11834] – Convert Add Release edit to React
  • [MBS-11835] – Convert Change Wikidoc edit to React
  • [MBS-11836] – Convert Edit Barcodes edit to React
  • [MBS-11837] – Convert Edit Release Label edit to React
  • [MBS-11838] – Convert Edit Release edit to React
  • [MBS-11839] – Convert Remove Relationship Attribute edit to React
  • [MBS-11840] – Convert Reorder Mediums edit to React

Other Task

  • [MBS-11805] – Add flow typing to guess case code
  • [MBS-11856] – Remove reports for releases with cover art relationships
  • [MBS-11928] – Drop consul-template for deployment

PostgreSQL 12 Upgrade Instructions for MusicBrainz Server

Thanks to everyone for your patience during our downtime today. As promised, here are steps to follow to upgrade your own PG instance to v12. (Confused? See the previous blog post on this subject.)

If you’re already running v12, there are still some instructions you must follow!

For MusicBrainz Docker

If you’re running the new MusicBrainz Docker setup, an upgrade script exists for you to use. See the release notes for specific – hopefully brief – instructions.

For a Manual Setup (INSTALL.md Based)

If you aren’t using Docker but rather set up musicbrainz-server by hand following INSTALL.md, see the steps below.

Know that as an alternative, you can always import new data dumps from scratch (again following the steps in INSTALL.md) into a new PG 12 cluster. Just make sure you’re on the v-2020-05-18-postgres12 tag of musicbrainz-server while doing so.

If on the other hand you don’t mind getting your hands a bit dirty, you can use the quicker method below. Like INSTALL.md, this assumes you’re using Ubuntu/Debian and their postgresql-common cluster management tools.

If you’re already running v12, you should still follow these steps; however, you can skip the ones involving apt-get, pg_dropcluster, and pg_upgradecluster. The main steps you need to follow in this case are running the 20200518-pg12-before-upgrade.sql and 20200518-pg12-after-upgrade.sql scripts in that order.

On distros other than Debian/Ubuntu where the postgresql-common tools aren’t available, you’ll have to manage with initdb and pg_upgrade on your own.

  1. First take down the web server running MusicBrainz (stop plackup) to prevent database access.
  2. Turn off any cron jobs updating or accessing the database (e.g. for the live data feed/replication packets).
  3. Switch to the latest musicbrainz-server code with:
    git fetch origin && \
    git checkout v-2020-05-18-postgres12
  4. With PG 9.5 (or whatever version you’re using) still running, run the following “pre-upgrade” script:
    psql -U postgres -d musicbrainz_db \
    -f admin/sql/updates/20200518-pg12-before-upgrade.sql

    This assumes that “postgres” is the name of your PG superuser, and “musicbrainz_db” is the name of your database. If you see a few messages about things not existing, that’s normal.

  5. Install packages for PostgreSQL 12. On Ubuntu/Debian you can obtain them from the PGDG apt repo.
    apt-get update && \
    apt-get install postgresql-12 postgresql-server-dev-12

    If you’re installing postgresql-12 for the first time, this will automatically create a new cluster at /var/lib/postgresql/12/main. Remove that empty cluster with the command below, but don’t run it if you already had v12 installed and have data there!

    pg_dropcluster --stop 12 main
    If you did already have v12 installed with musicbrainz_db running there, leave the cluster alone and skip the next step involving pg_upgradecluster.

    In the unlikely event that you already have a v12 cluster, but also have musicbrainz_db running in a separate, older cluster, these instructions won’t work for you. We recommend importing fresh data dumps into the v12 cluster and dropping the old one.

  6. Upgrade the old cluster. This assumes it’s version 9.5; if you’re using version 10 or 11, make sure to replace 9.5 below with 10 or 11. If you have other databases in your old cluster besides musicbrainz_db, be aware that this will upgrade all of them to PG 12.
     pg_upgradecluster -v 12 9.5 main
  7. If all goes well, the new cluster should be up and running. (You can drop the old one if you like; the output of the pg_upgradecluster command will tell you how.) Now run the following “post-upgrade” script on the database:
    psql -U postgres -d musicbrainz_db -f \
    admin/sql/updates/20200518-pg12-after-upgrade.sql
    This may take a bit, as it has to recreate some indexes.
  8. The upgrade is complete. You can turn cron jobs back on, if applicable.
  9. Restart the MusicBrainz web server / plackup, if applicable. If you’re accessing the server in a web browser, the usual release upgrade steps apply, like running ./script/compile_resources.sh again.
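
Step 6 above depends on which version your old cluster is running. As a rough sketch, the version can be read from pg_lsclusters (part of postgresql-common) instead of being typed by hand. The helper names old_main_version and upgrade_command are made up for this example, and it assumes your old cluster is named "main" as in the steps above. The command is only printed, not run, so you can review it first.

```shell
# Hypothetical helper: print the version of the first cluster named "main"
# that is not already v12. Expects `pg_lsclusters -h` output (header
# omitted) on stdin, where the first two columns are version and name.
old_main_version() {
    awk '$2 == "main" && $1 != "12" { print $1; exit }'
}

# Hypothetical helper: print (rather than run) the upgrade command from
# step 6 for the given old version.
upgrade_command() {
    printf 'pg_upgradecluster -v 12 %s main\n' "$1"
}

# Intended use:
#   version="$(pg_lsclusters -h | old_main_version)"
#   [ -n "$version" ] && upgrade_command "$version"
```

If old_main_version prints nothing, there is no pre-v12 "main" cluster to upgrade, which matches the case where you can skip pg_upgradecluster entirely.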

If you run into any trouble following the above, please let us know and we’ll try to help resolve your issue as soon as possible!

Reminder: Upgrading to PostgreSQL 12 on May 18, 2020

As we announced in February, in two weeks’ time (May 18, 2020) we’ll be upgrading our production database server to PostgreSQL v12 (from v9.5). At the same time, v12 will become the minimum supported version for MusicBrainz Server, so we ask that you upgrade afterwards as soon as possible! If you’re still unsure, a Q&A is below.

When do I need to upgrade my postgres by?

As soon as possible after May 18 if you’d like to keep your musicbrainz-server code up to date.

How do I perform the upgrade?

We’ll provide instructions closer to May 18. It’s recommended that you don’t upgrade until then, since we’ll be providing scripts to resolve some issues.

Will the live data feed (replication packets) stop working right away if I don’t upgrade?

No, as long as you keep your musicbrainz-server code checkout on the v-2020-05-11 tag (which will be the final release before May 18) or earlier. Future releases may work for a while too.

This is not a schema change release, so replication will continue to work smoothly until you upgrade. No tables or views will change.

However, to make the upgrade process smoother we’ll be dropping the musicbrainz-collate and musicbrainz-unaccent extensions, instead using PG’s builtin collation support for the former and replacing the latter with the unaccent extension from postgresql-contrib. A few SQL functions are being added to enable this, and some indexes need to be rebuilt. This will all happen as part of upgrade scripts we provide (or you can import from scratch). Some features of musicbrainz-server that use these old extensions may cease to work if you don’t apply them.

The extension changes above don’t actually make use of any new PG 12 features. We’ll avoid using such features for at least 1 month.

If I’m already running PostgreSQL 12, do I need to do anything?

Yes, but things will be easier for you. As mentioned in the previous answer, we’ll be dropping the musicbrainz-collate and musicbrainz-unaccent extensions to make the upgrade process smoother for pre-v12 instances. So you’ll only have to run some upgrade scripts we provide to replace those extensions and rebuild some indexes.

My host/distribution doesn’t have PostgreSQL 12 yet!

If you’re running Debian or Ubuntu, the PGDG maintains an APT repository with the latest versions. These are the same packages MetaBrainz uses in production.

Amazon RDS has supported PostgreSQL 12 since March 31.

I absolutely cannot upgrade yet! What should I do?

You can stay on the v-2020-05-11 release of musicbrainz-server (or earlier) until you’re able to upgrade. Replication packets (i.e. the live data feed) will continue to work on that tag until the next schema change, but you’ll have upgraded to v12 by then, right?

Instead of performing a pg_upgrade and running these upgrade scripts you mentioned, can I just import fresh data dumps into a new v12 cluster?

Of course. Just make sure your musicbrainz-server git checkout is on the v-2020-05-18 tag (once that’s released) or later before performing the import. And keep in mind it may be slower than a direct upgrade.
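
One way to check the tag before importing is sketched below. It relies on release tags following the v-YYYY-MM-DD pattern, so that a plain byte-wise sort also orders them by date. The helper name tag_at_least is hypothetical, and the sketch assumes your checkout sits exactly on a release tag.

```shell
# Hypothetical helper: succeed if tag $1 is the same as, or sorts after,
# the minimum tag $2. Works because v-YYYY-MM-DD tags sort by date.
tag_at_least() {
    # After sorting, the minimum tag must come first (or be equal).
    [ "$(printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | head -n 1)" = "$2" ]
}

# Intended use, from the musicbrainz-server checkout:
#   current="$(git describe --tags --exact-match)"
#   tag_at_least "$current" v-2020-05-18 \
#       && echo "OK to import into a v12 cluster" \
#       || echo "check out v-2020-05-18 or later first"
```

A tag with a suffix after the date sorts after the bare date, so it passes the check as well.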

MusicBrainz Server update, 2020-01-20

This release mostly fixes small bugs. Please note that the display code for release lists (for area, artist, collection, instrument, label and series) has been reworked too.

Thanks to chaban for continuously reporting issues, hibiscuskazeneko for paying attention to external links, rotab who fixed a couple of bugs, and all others who reported issues or helped test or translate today’s release!

The git tag is v-2020-01-20.

Bug

  • [MBS-10492] – Regression: When minute component starts with 0 (zero) it’s omitted
  • [MBS-10501] – Collaborator avatars missing from collection page
  • [MBS-10522] – Subscribers not transferred after entity merge
  • [MBS-10531] – Invalid requests are sent to maps service when access token is not set
  • [MBS-10536] – Release group link “see all versions of this release” has span.name-variation
  • [MBS-10553] – User report reason is sent to admins translated
  • [MBS-10560] – Regression: release edits display abbreviated rather than full country names in their release events
  • [MBS-10565] – Can’t add a new type for series
  • [MBS-10567] – Only show allowed series entity types when creating series types
  • [MBS-10571] – Localized ModBot notes are not properly formatted when sent via email
  • [MBS-10572] – Pages that display release events trigger an error when a non-English UI language is selected: “Domain `countries` was not found.”

Improvement

  • [MBS-10552] – Add Deezer links to the sidebar

MusicBrainz Server update, 2019-11-25

Starting with this release, we read our genres list from the genre table rather than a hardcoded list inside a JSON file. This should have no user-visible impact, but let us know if you encounter any new issues related to genres. (This change should however help us improve genres further.)

We also have a small list of bug fixes and improvements, listed below. One neat new feature is the ability to sort edit searches by date closed or closing.

Thanks to chaban, culinko, drsaunde, jesus2099, mglubb, lotheric, psychoadept, sothotalker, and all others who reported issues or helped test or translate today’s release!

The git tag is v-2019-11-25.

Bug

  • [MBS-7097] – Release listed multiple times in “Non-digital releases with download relationships” report
  • [MBS-10466] – MusicBrainz Happy Birthday wishes doesn’t take into account timezones
  • [MBS-10467] – Pages ported to React do not show the new edit notes banner
  • [MBS-10473] – Static resources fail to build when NODE_ENV=production
  • [MBS-10485] – User profile’s “Statistics Edits (view)” links to bogus URL
  • [MBS-10488] – Regression: User profile subscribe links no longer work

New Feature

  • [MBS-9491] – Move genres to be read from the database

Improvement

  • [MBS-4299] – Warning when merging releases with diff. recording artists should show disambiguation
  • [MBS-10204] – Better overview of user edits on user page
  • [MBS-10471] – Add option to view edits by date closed

React Conversion Task

  • [MBS-9922] – Convert the series public pages to React