Schema change release: May 15, 2023

MusicBrainz is announcing a new schema change release for May 15, 2023. The actual schema (database) changes we’ve detailed below shouldn’t have much perceivable impact on mirror servers, especially if you only use the web service; most are to remove unused tables/columns or tweak how certain tables are materialized.

The only breaking change to a replicated table is dropping the cdtoc.degraded column (MBS-12573). It’s unused in our codebase, so will have no effect on the web service. However, if you’re querying the cdtoc table directly, you’ll want to make sure your SQL queries don’t expect this column to exist.

One minor change affects first-release-date data for release groups and recordings in the web service (MBS-12800), but only where cancelled releases exist. As of this writing, there are currently only 133 releases with the “cancelled” status.

One change we consider significant is that, as part of this release, we plan to enable replication with dbmirror2 (MBS-12107) by default. This should require no changes on the part of mirror server operators and should be completely invisible/seamless during the upgrade if all goes according to plan. We consider it significant because it’s rewriting a critical part of our replication system and changing the packet format used underneath.

Tools that operate on the current (let’s call them version 1) replication packets, like mbdata, will continue to work; MusicBrainz will continue to generate the old v1 packets for the foreseeable future while we work to move external tools to the new, v2 packet format.

Without further introduction, here’s the list of tickets for the Spring 2023 schema change:

Schema changes

The following tickets change the database schema in some way.

MBS-4685: Allow removing own edit notes when there’s been no follow-up / MBS-11312: Allow original editors and admins to modify edit notes. In order to give editors and admins the option to modify or delete edit notes, this will add a new (non-replicated) table, edit_note_change, where the information about any changes made will be stored. No changes will be made to the structure of the existing edit_note table.
MBS-12573: Drop unused degraded column from cdtoc. This was a column that was used many years ago to store whether CD TOCs were based on a less accurate algorithm. For a long time, it’s always contained FALSE (since we dropped the less accurate data at some point). It’s also completely unused. We are just dropping the column entirely.
MBS-12704: Remove legacy “Watch artist” code. Over 10 years ago, we had code to follow artists and be notified of new releases and whatnot, but the whole system was never upgraded during the NGS migration. We have decided that if we re-implement this, it will be as part of ListenBrainz, not MusicBrainz. As such, we are dropping the relevant tables: editor_watch_artist, editor_watch_preferences, editor_watch_release_group_type, editor_watch_release_status.
MBS-12794: Don’t use spammer tags/ratings when calculating tag counts / rating averages. We do not delete spammer accounts (we just hide them) in order to train our code to identify other potential spammers. An unfortunate side effect of this is that any ratings and tags left by spammers are currently still taken into account when calculating aggregate ratings and tags for MusicBrainz entities. We’ll be making a change to our triggers that update these aggregates to ignore any ratings and tags by editors marked as spammers. This change affects master/standalone databases, but should have no impact on mirror servers, where these triggers do not exist. Some tag and rating values will be changed by this though.
Edit (Mar. 23): The team has removed MBS-12794 from this release after deciding that the implementation needs more discussion.
MBS-12800: Exclude cancelled releases when calculating first recording / release group dates. We store data about releases which were announced, but then were cancelled before they were released. These are often stored with release dates, representing the date the release was meant to have been released. While this makes sense, it has the side effect that these dates affect the calculation of the first release date for any recordings linked to these releases, and the release groups they are in. Clearly, the recordings and release groups were not released on these dates, and we should use the date of their first actual release instead. This will change the triggers that update the release_first_release_date materialized table, which itself will affect the calculations of recording_first_release_date and the release date columns in release_group_meta. These triggers run on mirrors as well.

The remaining tickets below do not make any changes to the database schema.

Replication system changes

MBS-12107: Replication with dbmirror2. See the introduction above and the linked ticket for more information. The prerequisite database changes required for this were already applied as part of the Spring 2022 schema change, so enabling the v2 system only requires code changes.

Upgrade script changes

MBS-12370: Rename schema change upgrade script suffixes to clearly indicate which nodes they run on. This is an internal-only change which should have no impact on users.

We’ll post upgrade instructions for standalone/mirror servers on the day of the release. If you have any questions, feel free to comment below or on the relevant JIRA tickets.

One thought on “Schema change release: May 15, 2023”

Ooh, editing edit notes, I didn’t realize this was happening! Going to mean a lot less typo-based double commenting, nice

One thought on “Schema change release: May 15, 2023”

Leave a Reply Cancel reply