Musing: compacting replication packets

I’ve recently been trying to write more about MusicBrainz internals and my thoughts about the project. This blog doesn’t see many posts, and most of them are on official topics like releases, so putting this here is an experiment. I hope you’ll all enjoy hearing about something a bit less concrete (and perhaps less dry, if more technical) than usual!

How Replication Works

Replication is a pretty important part of MusicBrainz, though to the average user it’s somewhat hidden. For those readers who aren’t familiar with it, replication is the mechanism behind the live data feed: users keep their own PostgreSQL database and, using tools we provide (or some third-party alternatives), regularly download and apply so-called “replication packets”, which describe the changes made to the database over a specific period.

Replication packets are .tar.bz2 archives containing a collection of files: COPYING with the license info, README with a very sparse description of replication, SCHEMA_SEQUENCE with the version of the database schema the packet applies to, REPLICATION_SEQUENCE with a sequence number that the code uses to apply packets in the correct order, TIMESTAMP with, well, a timestamp, and finally a folder mbdump, which contains two files: dbmirror_pending and dbmirror_pendingdata. Those of you who use the MusicBrainz data dumps may recognize this layout: it’s the same format the data dumps use. dbmirror_pending and dbmirror_pendingdata correspond to the two database tables replication uses to store data about changes while those changes are being applied to the database.
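As a quick illustration, here’s a minimal Python sketch that peeks inside one of these archives; the packet file name is hypothetical, and it assumes the metadata files sit at the top level of the tar, as described above.

import tarfile

# Hypothetical local copy of an hourly replication packet.
packet_path = "replication-12345.tar.bz2"

with tarfile.open(packet_path, "r:bz2") as tar:
    # List everything in the packet: COPYING, README, SCHEMA_SEQUENCE,
    # REPLICATION_SEQUENCE, TIMESTAMP and the mbdump folder.
    for member in tar.getmembers():
        print(member.name)

    # The sequence files are tiny text files used for schema checks and
    # for applying packets in the correct order.
    schema = tar.extractfile("SCHEMA_SEQUENCE").read().decode().strip()
    sequence = tar.extractfile("REPLICATION_SEQUENCE").read().decode().strip()
    print("schema:", schema, "replication sequence:", sequence)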

Let’s look more closely at what those two tables contain. dbmirror_pending has columns for a sequence ID, a fully-qualified table name, an operation, and a transaction ID. dbmirror_pendingdata has columns for a sequence ID, a boolean indicating whether the data specifies key columns only, and finally the change data itself. Conceptually, the two tables combine into an ordered list of operations to perform on the database. Since I tend to think in JSON, here’s a way you could imagine a single operation looking:

{"table": "musicbrainz.release",
"operation": "update",
"existing_row": "\"id\"='1290306' \"name\"='620308' \"artist_credit\"='234861' \"release_group\"='1269028' \"status\"='1' \"packaging\"='1' \"country\"='150' \"language\"='120' \"script\"=",
"update_row": "\"id\"='1290306' \"gid\"='e37dfeea-0f25-48fa-85c0-b4d174ff172d' \"name\"='620308' \"artist_credit\"='234861' \"release_group\"='1269028' \"status\"='1' \"packaging\"='1' \"country\"='150' \"language\"='120' \"script\"= \"date_year\"='2009' \"date_month\"= \"date_day\"= \"barcode\"='8715777007870' \"comment\"='' \"edits_pending\"='3' \"quality\"='-1' \"last_updated\"='2013-05-15 13:01:05.065623+00'"}

This operation specifies that the table musicbrainz.release should be updated by taking the row whose id is 1290306, name is 620308, artist_credit is 234861, etc. (as listed in existing_row) and changing it to have id 1290306, gid e37dfeea-0f25-48fa-85c0-b4d174ff172d, name 620308, etc. (as listed in update_row).
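To make the packed "column"='value' format a bit more concrete, here’s a small Python sketch of a parser for it. The function name and the regular expression are my own simplification (the real dbmirror format also escapes quotes inside values, which this ignores); it is not code from musicbrainz-server.

import re

# One packed pair looks like "name"='value', or "name"= for SQL NULL.
PAIR = re.compile(r'"([^"]+)"=(?:\'([^\']*)\')?')

def parse_packed_row(packed):
    row = {}
    for match in PAIR.finditer(packed):
        # group(2) is None when there is no quoted value at all (SQL NULL);
        # an empty string value ('') stays an empty string.
        row[match.group(1)] = match.group(2)
    return row

example = '"id"=\'1290306\' "name"=\'620308\' "script"='
print(parse_packed_row(example))
# {'id': '1290306', 'name': '620308', 'script': None}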

Compacting replication packets

So there’s a summary of how replication works, in rough, abstract terms. Now on to the real topic of the post: making replication packets more compact. As the system currently works, every change is put into the replication packet; that is, if in the course of an hour (one replication packet) a row changes twenty times, then twenty operations will end up in the packet. Sometimes this is useful: some data users use database-level triggers to update their own derived information, and that sometimes requires seeing every change, even if it changes again very soon thereafter. However, for most people, replication is just a way to get their database up to date every hour. For these people, having all twenty updates is wasteful: as far as they’re concerned, it could just be a single update from how the row looked at the start of the hour to how it looked at the end. This is especially true for the (currently rather underused) larger replication packets (daily and weekly, specifically), which include more changes, and thus more changes to the same rows.

To formalize this idea a bit more, let’s make one more abstraction: a chain of operations. A chain is, informally, an ordered list of operations on the same data (or the same row). The simplest chain is just one operation, and from there we can work with chains in a variety of ways:

  • Chains can be combined: if the final state of a chain corresponds to the initial state of another chain, and the first chain’s final operation is before the initial operation of the second (without any other chains whose initial state corresponds to the first chain’s final state in-between), then the two chains can be combined into one chain.
  • Chains can be reordered: if there is no way to combine two chains by the above rule (including by way of intermediate operations or chains), then the two chains operate on completely separate data and can therefore happen in any order. (For the database-savvy who might have noticed: slave databases, the ones that use replication, don’t have foreign keys or other constraints that might make this untrue.)
  • Chains can be combined even more: if the final operation of a chain is a deletion, and the initial operation of another is an insertion, the two chains can be combined by turning the deletion + insertion into a single update. (As you might imagine, the ability to change the order of chains helps a lot here!)
  • Chains can be collapsed (perhaps the most important rule!): any number of updates in a chain can be turned into a single update, from the initial state of the first update to the final state of the last update. Additionally, an insertion followed by an update can be turned into a single insertion, directly to the final state of the update. Similarly, an update followed by a deletion can be turned into a single deletion, directly from the initial state of the update. Finally, an insertion followed by a deletion can be ignored entirely, since it has no lasting effect on the database. (A small sketch of these collapse rules in code follows this list.)
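Here’s a minimal sketch of those collapse rules in Python. The (kind, old_state, new_state) tuple layout and the function name are my own illustration rather than anything from the MusicBrainz codebase; None stands for “no row”.

def collapse(first, second):
    """Collapse two consecutive operations on the same row, returning the
    list of operations left over afterwards (possibly empty)."""
    kind1, old1, _new1 = first
    kind2, _old2, new2 = second
    if kind1 == "update" and kind2 == "update":
        return [("update", old1, new2)]   # many updates -> one update
    if kind1 == "insert" and kind2 == "update":
        return [("insert", None, new2)]   # insert + update -> one insert
    if kind1 == "update" and kind2 == "delete":
        return [("delete", old1, None)]   # update + delete -> one delete
    if kind1 == "insert" and kind2 == "delete":
        return []                         # insert + delete -> nothing at all
    return [first, second]                # anything else stays as it was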

Thus, by creating, combining, reordering, and collapsing chains, we can make replication packets perform far fewer operations (which, in turn, can make the packets smaller and let them apply faster). I’ve still glossed over the details of how this could be implemented, though, so to wrap things up, here’s a basic algorithm for optimizing packets, followed by a rough sketch of it in code:

  1. Loop through operations in the order ProcessReplicationChanges (the script responsible for applying changes to the database from a replication packet) would: order transactions ascending by the maximum sequence ID they contain, and order operations within each transaction by ascending sequence ID.
  2. Take action depending on the type of operation:
    • If an insertion: see if the output already contains a deletion on the same table. If so, take the initial state of the deletion and the final state of the insertion and output an update instead of the deletion. If no such deletion exists, simply copy the insertion to the output.
    • If an update: see if the output already contains an insertion or an update on the same data (that is, find an operation whose final state matches the initial state of the update). If you find one, replace it with an operation of the same type (and initial state, if applicable) but with the final state of the new update instead of what it previously included. If you don’t find one, just copy over the update to the output.
    • If a deletion: see if the output already contains an insertion on the same data. If it does, remove it, add nothing to the output, and move on. If not, look for an update on the same data. If you find one, replace it with a deletion from the initial state of the update. If not, look for an insertion on the same table. If you find one, replace it with an update from the initial state of the deletion to the final state of the insertion. Finally, if you found nothing, copy the deletion to the output.
  3. Dump the output as a replication packet. Transactions shouldn’t matter, so put them all in the same transaction.
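And here’s that algorithm as a rough, unoptimized Python sketch. Operations are (table, kind, old_state, new_state) tuples, already sorted as described in step 1; states are any hashable full-row representation (for example the packed strings shown earlier), with None meaning “no row”. The tuple layout and helper names are my own, not anything from musicbrainz-server, and the linear searches would obviously want proper indexing in a real implementation.

def compact(operations):
    output = []

    def find(predicate):
        # Return the index of the first output operation matching predicate.
        for i, op in enumerate(output):
            if predicate(op):
                return i
        return None

    for table, kind, old, new in operations:
        if kind == "insert":
            i = find(lambda op: op[0] == table and op[1] == "delete")
            if i is not None:
                # delete + insert on the same table becomes a single update
                output[i] = (table, "update", output[i][2], new)
            else:
                output.append((table, "insert", None, new))
        elif kind == "update":
            i = find(lambda op: op[0] == table
                     and op[1] in ("insert", "update") and op[3] == old)
            if i is not None:
                # extend the chain: keep its type and initial state,
                # adopt the new final state
                output[i] = (table, output[i][1], output[i][2], new)
            else:
                output.append((table, "update", old, new))
        else:  # delete
            i = find(lambda op: op[0] == table and op[1] == "insert"
                     and op[3] == old)
            if i is not None:
                del output[i]          # insert + delete cancels out entirely
                continue
            i = find(lambda op: op[0] == table and op[1] == "update"
                     and op[3] == old)
            if i is not None:
                output[i] = (table, "delete", output[i][2], None)
                continue
            i = find(lambda op: op[0] == table and op[1] == "insert")
            if i is not None:
                # reorder so the deletion precedes the insertion, then merge
                output[i] = (table, "update", old, output[i][3])
                continue
            output.append((table, "delete", old, None))
    return output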

Final notes, FAQs

  • Why isn’t this already being done? Daily and weekly packets are relatively new, and since some data users do want to see every single operation, it doesn’t make sense to do this to the hourly packets. It’s also somewhat awkward to get the operations into the right order, because they aren’t stored in order in the replication packet dump, due to how transactions have to be processed. The musicbrainz-server code gets around this by importing the data into a PostgreSQL table and letting the database do the work of putting things in the right order. mbslave instead loads the entire packet into memory and sorts it there, which is potentially dangerous with larger packets (which, as noted above, are the ones most likely to benefit from this process). Altogether, the safest way to implement this would be to mimic musicbrainz-server’s process, but to do it on the production servers it would need to use different table names so as not to interfere with the normal replication packet creation process. But mostly: because nobody’s written it yet.
  • How much benefit would this really bring? Simple answer: I don’t really know, but I know there’s some. Autoedits (additions, especially) have a tendency to produce multiple operations where one would do, because they first increment the ‘edits_pending’ column of whatever they’re editing in one operation, then decrement it while applying the edit (automatically, immediately, and in the same transaction). Editors also tend to change a bunch of different things about the same entity at around the same time, sometimes not in one edit but in several. And any deletion can easily be paired with one of the many insertions that happen all the time in MusicBrainz; perhaps most notably, daily scripts run to clean up unused entities, so any packet that includes those changes will probably be able to collapse some insertion + deletion pairs. So there are already several cases where obvious chains would appear.

Hopefully those of you who’ve made it down this far found this enlightening — or, at least, interesting. At some point in the future this might be something we do with the daily and weekly replication packets we’re already creating (currently just by concatenating together hourly packets), but for now there’s no solid plan to do so. Thus, just musing for now!

Thanks for reading!

Server update, 2013-11-25

Hello again! We’ve got another freshly-pressed release of musicbrainz-server, just sent out to our agents in the field. Thanks to JesseW, Freso, and nikki for supplementing my and ollie’s work this round.

Some things you might be excited about in this release:

  • For artists with only standalone recordings, those recordings will now be shown on the overview tab, similar to artists with only VA releases (thanks JesseW!)
  • Bandcamp URLs for artists and labels should now be cleaned up and have the newly-added Bandcamp relationship type autoselected (thanks Freso!)
  • Images in Wikimedia Commons will now appear on artist and place pages (thanks nikki!)
  • Place coordinates now accept a few more formats, including a comma as the decimal separator and the format used on the Japanese Wikipedia.
  • You can now provide a list of statistics to the timeline graph by separating the raw statistic names with ‘+’. For example, showing the pace of addition of geonames URLs to areas: https://musicbrainz.org/statistics/timeline/count.area+count.ar.links.l_area_url.geonames#+r-

Otherwise, a variety of bug fixes and small improvements. We hope you like it! The tag for this release is v-2013-11-25, which this week I’ve remembered to push to both github and bitbucket.

Bug

  • [MBS-6529] – “An error occured while loading this edit” for an old edit
  • [MBS-6888] – XML Web Service omits ASINs in output for Releases which have ASINs assigned
  • [MBS-6902] – ISE: Caught exception in MusicBrainz::Server::Controller::Artist->edits “Can’t call method “id” on an undefined value at lib/MusicBrainz/Server/Data/Utils.pm line 410, line 3.”
  • [MBS-6924] – Merging places does not copy the address to the target
  • [MBS-6925] – IMDb artist autoselect shouldn’t block company
  • [MBS-6935] – Cover art uploader incorrectly falls back to old uploader in Safari
  • [MBS-6947] – Release inline credits broken for recording-recording relationships when both recordings are in the release
  • [MBS-6951] – tracks with recording pending edits are not marked as edit pending any more in release pages
  • [MBS-6956] – Medium title(s) not shown on release pages
  • [MBS-6960] – AC bubble not linking to the artists
  • [MBS-6961] – AC not shown on tracks even though it differs from release AC
  • [MBS-6963] – Release editor incorrectly claims Various Artists has been used for tracks
  • [MBS-6966] – Edit/remove relationship links shown on release page when not logged in
  • [MBS-6980] – inline search : oversized width outside of screen
  • [MBS-6986] – Match recordings by MBID not working any more

Improvement

  • [MBS-1754] – Display standalone recordings on overview if artist has no release groups
  • [MBS-6400] – Display on release merge edit the same info displayed on merge preview page (release events/labels/catalog numbers etc)
  • [MBS-6755] – Create whitelist and display images in the sidebar
  • [MBS-6891] – “Add Label”/”Add Event” links/buttons need cursor:pointer
  • [MBS-6899] – Accept coordinates using a comma as a decimal point
  • [MBS-6929] – Support coordinates from the Japanese Wikipedia
  • [MBS-7011] – Allow specifying a list of arbitrary lines to graph in the timeline graph.

New Feature

  • [MBS-6998] – Add autoselect for Bandcamp URLs for Labels and Artists

Task

  • [MBS-6920] – Add muzikum.eu to the lyrics whitelist

Update: an earlier version of this post failed to include MBS-6755

Venue and Studio Support: Introducing Places

MusicBrainz now supports venues and studios via our new “place” entity!

This was one of our Google Summer of Code projects for this year; many thanks to Nicolás Tamargo for his work on it. We released his work a few weeks ago and, after a few initial hiccups, it’s looking good, so we want to let you all know about it. 🙂

So what can we do with places?

The most obvious thing we can do now is store information about recording, mixing and mastering locations.

For example, the studios listed in the credits for Universe by Kyoko Fukada:

[Image: places-releasecredits]

and the venue for the recordings on Live in Cartoon Motion by Mika:

[Image: places-recordingcredits]

We can of course link the place to a variety of external sites, as can be seen in the list of URLs for Wembley Arena:

[Image: places-urls]

Some places are made up of several parts. In those cases, we can link one place as being part of another. For example, the various studios at Abbey Road Studios:

[Image: places-parts]

or the hall and theatre of the Barbican Centre:

[Image: places-parts2]

We were already able to add engineers to the database as artists; now we can also say which studio they work at, as seen here for the studio Railroad Tracks:

[Image: places-engineers]

Many orchestras, and sometimes other artists, have a home venue where they perform on a regular basis. These can now be linked too, as you can see for the Barbican Centre’s Barbican Hall:

[Image: places-primaryvenue]

A premiere is sometimes held for a work, and now we can link such works to the place where the premiere was held, e.g. the following works which were premiered at Carnegie Hall:

[Image: places-premiere]

Places can also have coordinates, which make it possible to pinpoint the location on a map. The MusicBrainz website doesn’t show any maps at present, but here’s a map by Mineo of all the places with coordinates:

[Image: places-map50]

Events?

No, we do not yet support events.

Thanks to nikki for writing this post.

Server update, 2013-11-11

Another fortnight, another release; thanks to Freso, warp, bitmap, reosarevok, and the MusicBrainz team for their work!

The tag for this release is v-2013-11-11. We had some small problems during release; sorry to anyone who ran into an error during the short period before we reverted things to get our server configurations in order!

Bug

  • [MBS-4438] – Release editor: Track durations are not loaded the first time you access the recordings tab
  • [MBS-5592] – Relationship editor permits multiple identical relationships
  • [MBS-6066] – Random internal server errors when searching
  • [MBS-6298] – ‘View all relationships’ links to a tab that’s not in the list of tabs
  • [MBS-6449] – Logic for showing “at least” in the edit search for the number of results is wrong
  • [MBS-6661] – work_attribute check is wrong
  • [MBS-6673] – Cover art uploading not working in IE
  • [MBS-6689] – Pasting an MBID initiates a search
  • [MBS-6769] – Inline search shows sort name even when identical to name
  • [MBS-6785] – Tagger button broken in Opera
  • [MBS-6851] – Can’t relate recordings to places in the relationship editor
  • [MBS-6858] – Relationship type documentation not accessible
  • [MBS-6872] – Tags page does not show places
  • [MBS-6873] – delete_unused_url and delete_orphaned_recordings don’t account for places
  • [MBS-6878] – Inline search check for non-latin characters treats Vietnamese characters as non-latin
  • [MBS-6883] – Relationship editor fails to load existing relationships
  • [MBS-6884] – Guess case treats “studio” in place names as extra title information
  • [MBS-6892] – Relationship editor needs “(more documentation)” link after relationship type description
  • [MBS-6900] – Cannot edit places with empty coordinates
  • [MBS-6901] – Places lat/long parser does not understand »+55° 54′ 14.49″, +8° 31′ 51.64″«
  • [MBS-6905] – beta: Comma shown in coordinates field when editing places with no coordinates
  • [MBS-6907] – beta: Coordinates parsing should not require seconds
  • [MBS-6908] – Internal server error searching for multiple editor flags in the edit search
  • [MBS-6909] – “Editor flag is not” in edit search does not work
  • [MBS-6914] – beta: Clicking result in inline search hides popup instead of selecting result
  • [MBS-6916] – beta: No dropdown in inline search when there are no results
  • [MBS-6921] – Artist Credit join phrase not displayed in tracklist

Improvement

  • [MBS-1421] – Require an edit note for all destructive edits
  • [MBS-2985] – Report: download relationships in non-digital releases.
  • [MBS-6239] – Use Wikidata URLs to fetch interwiki links
  • [MBS-6353] – Display on release group merge edit the same info displayed on merge preview page (mainly release group types)
  • [MBS-6394] – Automatically cut out hyphens during ISRC addition
  • [MBS-6456] – Show country and subdivision when displaying areas in sidebar and on profile pages
  • [MBS-6554] – Uppercase letters when entering ISRCs instead of rejecting lowercase ones
  • [MBS-6771] – Use localised aliases in the inline search where possible
  • [MBS-6824] – Coordinate fields should understand and convert degree minute second format
  • [MBS-6828] – Coordinates should be editable as one field
  • [MBS-6830] – Coordinate fields should strip degrees signs
  • [MBS-6831] – Coordinate fields should understand directions
  • [MBS-6832] – Coordinates should be presented better
  • [MBS-6838] – Release group dropdown in add release does not contain sufficient info
  • [MBS-6839] – Video attribute should be shown in merge recording edits
  • [MBS-6846] – Strip excessive digits in coordinates
  • [MBS-6859] – Inline search for places should show localised aliases
  • [MBS-6882] – Titles for video pages should say “Video”
  • [MBS-6911] – Make MB.Control.ArtistCredit a view model for knockout, use $.widget for MB.Control.Autocomplete

Task

  • [MBS-6729] – Add whosampled to the Other databases whitelist
  • [MBS-6893] – Add “Rockens Danmarkskort” to “Other databases” whitelist for Places

Server Update, 2013-08-19

Another two weeks, another release! A variety of bug fixes, primarily, and some assorted improvements. For those of you using replication on resource-constrained servers, this release includes code to make ProcessReplicationChanges run in constant memory, which should hopefully be an improvement.

With help from Michael Wiencek and the rest of the MusicBrainz team, here’s what we’ve done in the last fortnight:

Bug

  • [MBS-1549] – Guess Case: “¿” in sentence mode
  • [MBS-2151] – Editing Track Times should remove existing value
  • [MBS-2959] – Release-group XML first release date is using incorrect sorting logic for incomplete dates
  • [MBS-4389] – Editing artist credits shows “undefined” as disambiguation comment
  • [MBS-4788] – Editing → Add Release allows user to set release type even when existing release group has been selected
  • [MBS-5508] – Uploading cover art from https pages causes a warning about insecure pages
  • [MBS-5965] – Remove relationship edits store translated attributes
  • [MBS-6192] – Internal server error on ws query if offset is non-numeric
  • [MBS-6220] – tooltip (title) shows “&amp;” instead of “&”
  • [MBS-6243] – “In x hours” is only shown for edit expiry times for the same day
  • [MBS-6418] – Relationship editor allows adding more relationships than the server can handle
  • [MBS-6546] – When the search server reports an internal server error the client receives a 400 status code
  • [MBS-6558] – Internal server error logging in with a Unicode username
  • [MBS-6570] – Removing relationship dates in the relationship editor fails
  • [MBS-6571] – Filename display in cover art edits is hardcoded to jpg
  • [MBS-6588] – “Direct database search” returns the same work as 2 different names
  • [MBS-6591] – JSON: Composer and release lookup
  • [MBS-6610] – Various Artist Alias and Tags should not be shown unless looking up Various Artist artist endpoint
  • [MBS-6622] – All log in pages are not served via HTTPS
  • [MBS-6628] – ModBot is unable to close some merge label edits
  • [MBS-6630] – Remove button when uploading covers is shown after the image has been submitted
  • [MBS-6632] – Relationship documentation examples don’t work with release groups
  • [MBS-6637] – http://musicbrainz.org/release/add doesn’t redirect to beta when desired
  • [MBS-6638] – ‘use beta site’ on main site will unset preference
  • [MBS-6650] – Editing a relationship type allows submitting the form with no changes
  • [MBS-6656] – Relationship editor fails when counting number of selected recordings
  • [MBS-6657] – Internal server error when using a negative limit
  • [MBS-6667] – Beta: No edit note field in the relationship editor
  • [MBS-6677] – Donations from random end users dropped to nearly nil.
  • [MBS-6679] – beta: ISE displaying artist edits

Improvement

  • [MBS-193] – Add open edit and cancelled edit stats to users
  • [MBS-1820] – Release search results should include more information
  • [MBS-2108] – More detail for works quicksearch
  • [MBS-3204] – Better name/explanation or warning for type “Pseudo-Release”
  • [MBS-6644] – ProcessReplicationChanges should be able to run in low memory environments

Upcoming feature: contested edit extension

The next release (a week from Monday) will include a useful new feature: extending the expiration of edits that receive ‘No’ votes! I’d like to take a bit to explain how it’ll work.

The problem

Especially since the amount of time edits stay open was reduced to 7 days, but also before that, several problematic situations could arise when edits were contested:

  • If voters cast ‘No’ votes shortly before the expiration of the edit, the original editor may not have time to respond to the concerns before the edit closes. As a result, it’s generally been considered bad etiquette to cast ‘No’ votes right before an edit expires unless the edit is particularly destructive.
  • In a somewhat related case, sometimes an edit can get many ‘No’ votes in short succession. Since 3 unanimous ‘No’ votes will close an edit, the period between the first vote cast and the edit being closed can be as short as an hour, which is certainly not enough time for the original editor or other voters to respond.
  • It’s also occasionally possible for edits to be put at risk of failing without an email being sent. Specifically, the current code only sends an email on the very first ‘No’ vote. Therefore, if a voter votes ‘No’ early in the voting period and later changes their vote, a second voter later voting ‘No’ would not result in an email being sent. However, a tied vote or a majority of ‘No’ votes will result in an edit being closed, so even a lone vote can tip the balance.

The solution

In light of all of these problems, the next release will work differently to give editors time to respond to votes against their edits.

In short: editors will always have at least 72 hours (three days) to respond after the first vote against their edits.

More specifically, and more technically:

To address the third point above, the emails for ‘No’ votes will now be sent whenever the count of ‘No’ votes goes from 0 to 1. That is: if two people vote ‘No’ with neither changing their vote in-between, only one email will be sent. But, in a case like the one described above, where an early ‘No’ vote is superseded and the total count goes back to 0, a subsequent ‘No’ vote will send a new email.

To address the second point above, ModBot will not reject an edit before its expiration time due to three unanimous ‘No’ votes unless 72 hours have passed since the earliest ‘No’ vote (that is, the vote which resulted in an email being sent). If the expiration time passes or an edit has three unanimous ‘No’ votes after 72 hours, the edit will be closed as usual.

Finally, to address the first point above, when new ‘No’ votes are cast close to an edit’s expiration time, the edit’s expiration time will be extended to allow 72 hours for response. This extension will, once again, only happen when the total count of ‘No’ votes goes from 0 to 1 – so only when an edit becomes contested and previously was not.
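For the technically curious, here’s a rough sketch of those three rules in Python. The EditState class, field names, and function names are invented for illustration; the real logic lives in musicbrainz-server’s Perl code and database, not in anything resembling this.

from dataclasses import dataclass
from datetime import datetime, timedelta

RESPONSE_WINDOW = timedelta(hours=72)

@dataclass
class EditState:
    expires_at: datetime
    yes_votes: int = 0
    no_votes: int = 0
    first_no_vote_at: datetime = None

def register_no_vote(edit, now):
    """Handle a new 'No' vote; the interesting case is the count going
    from 0 to 1, i.e. the edit becoming contested."""
    if edit.no_votes == 0:
        # This is when the notification email is sent, and when the
        # expiration is pushed back if it is less than 72 hours away.
        if edit.expires_at - now < RESPONSE_WINDOW:
            edit.expires_at = now + RESPONSE_WINDOW
        edit.first_no_vote_at = now
    edit.no_votes += 1

def may_close_early(edit, now):
    """ModBot may reject before expiration only with three unanimous 'No'
    votes and at least 72 hours since the vote that triggered the email."""
    return (edit.no_votes >= 3 and edit.yes_votes == 0
            and edit.first_no_vote_at is not None
            and now - edit.first_no_vote_at >= RESPONSE_WINDOW)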

In total, these changes should hopefully ensure that editors are better informed about edits that are in danger of being voted down, and given sufficient time to respond to voter concerns.

In summary

First of all, this change will be fully live on Monday, June 24th. Before then, votes cast on the beta server may result in a small number of edits having their expiration times extended, but it won’t happen on the main server or for the majority of edits.

While editing: Rest assured you’ll be informed and given time to fix problems with your edits!

While voting: Don’t worry too much about casting ‘No’ votes when edits need improvement. Certainly be ready to supersede your votes if things do get fixed up, but if you find an edit in need of fixing just before it closes, or which already has a bunch of recent ‘No’ votes, don’t hold back or vote differently to give the original editor time to respond. This should take care of that for you!

Happy editing!

Schema 17/18 upgrade instructions

We’ve just completed our extra schema upgrade. The full instructions for upgrade follow:

Schema 16 to schema 17 upgrade

If you already ran the migration that was announced May 15th, or if you imported a data dump from May 15th or later, skip to the next section.

  1. Run replication with carton exec -Ilib -- ./admin/replication/LoadReplicationChanges until it cannot apply any packets in schema 16.
  2. Take down the web server running MusicBrainz, if you’re running a web server.
  3. Turn off cron jobs if you are automatically updating the database via cron jobs.
  4. Make sure your REPLICATION_TYPE setting is RT_SLAVE in lib/DBDefs.pm
  5. Switch to the new code with git fetch origin followed by git checkout schema-16-to-17
  6. Run carton install --deployment to install any new perl modules.
  7. Run carton exec -Ilib -- ./upgrade.sh from the top of the source directory.
  8. Set DB_SCHEMA_SEQUENCE to 17 in lib/DBDefs.pm
  9. Turn cron jobs back on, if needed.
  10. Restart the MusicBrainz web server, if needed.

Schema 17 to schema 18 upgrade

  1. Run replication with carton exec -Ilib -- ./admin/replication/LoadReplicationChanges until it cannot apply any packets in schema 17.
  2. Take down the web server running MusicBrainz, if you’re running a web server.
  3. Turn off cron jobs if you are automatically updating the database via cron jobs.
  4. Make sure your REPLICATION_TYPE setting is RT_SLAVE in lib/DBDefs.pm
  5. Switch to the new code with git fetch origin followed by git checkout v-2013-05-24
  6. Run carton exec -Ilib -- ./upgrade.sh from the top of the source directory.
  7. Set DB_SCHEMA_SEQUENCE to 18 in lib/DBDefs.pm
  8. Turn cron jobs back on, if needed.
  9. Restart the MusicBrainz web server, if needed. EDIT: also restart memcached here, see http://tickets.musicbrainz.org/browse/MBS-6376

Note that the tags to check out for the two migrations are different.

Changes

For the list of changes in schema 17, see the earlier blog post. The changes for schema 18 are:

  1. Fix the track table corruption that the schema 16-17 upgrade created, by importing a copy of the ‘track’ table from the production database.
  2. Fix some indexes and constraints that should not be on slaves or which had bad names starting with ‘medium2013’ or ‘track2013’.
  3. Create a missing index on medium.release that dramatically improves performance.
  4. Fix the ref_count column of the artist_credit table, which was not updated properly at the schema 16-17 upgrade.

Issues with 2013-05-15 schema change and the 'track' table.

As a heads-up for anyone running a slave server using PostgreSQL 9.1 or later (9.0 is the only confirmed-correct version), it appears that there’s an issue with the upgrade script which will result in an incorrect track table in most cases.

An ostensible fix that was previously mentioned here does not work. We’re still working on a fix and will update this post as we have more details.

There is now a fix; see http://blog.musicbrainz.org/?p=1962 for instructions. Thanks for your patience!

Server Update, 2013-03-11

Thanks to work from Nicolás Tamargo, nikki, Lukáš Lalinský, Paul Taylor, and the MusicBrainz team, we’ve just released a new version of the MusicBrainz website. As usual, this is mostly a bug fix release, but we do have one shiny new feature once again… OAuth2! It is now possible to authenticate with MusicBrainz using OAuth2 rather than the old digest auth system. For more details, see the Development/OAuth2 documentation page. This feature is new, so if you encounter any problems (here or elsewhere), please be sure to let us know!

Here’s a full list of what’s changed:

Bug

  • [MBS-4155] – Clicking twice the same relationship link → Internal Server Error
  • [MBS-4419] – Memcached should not be used for persistent data
  • [MBS-5358] – Can’t enter fuzzy ending date in the same year as the start date
  • [MBS-5829] – Internal server error when requesting /tracklist/ with an invalid ID
  • [MBS-5856] – Reorder medium edits use “Disc”
  • [MBS-5866] – “Direct database search” for a new artist is missing at least Gender and Country
  • [MBS-5873] – Modbot recommending to merge label with an artist
  • [MBS-5875] – Internal server error approving an edit with no votes
  • [MBS-5877] – Internal server error when adding a release with a non-existent release group MBID
  • [MBS-5878] – Internal server error when doing tag lookup
  • [MBS-5884] – Internal server error when viewing editing history
  • [MBS-5896] – Error in Spanish even though the interface language is German
  • [MBS-5908] – Some Wikipedia extracts not appearing on translated MB
  • [MBS-5914] – Tagger icons are not showing up in tables
  • [MBS-5915] – Add ISWC/IPI buttons display incorrectly
  • [MBS-5938] – server-side warnings generated on a proper client request on /ws/2/
  • [MBS-5939] – User applications page shows empty tables when there’s nothing to display
  • [MBS-5945] – oauth2: entering an invalid URI then switching to “Installed Application” fails to submit (silently)
  • [MBS-5946] – oauth: web applications don’t remember authorizations for the same scope
  • [MBS-5950] – Crappy-written multiple-artist paragraph on FreeDB import
  • [MBS-5951] – Display the actions column for applications the same as elsewhere on the site
  • [MBS-5954] – Confirm revoking OAuth access

Improvement

  • [MBS-950] – Only allow permitted sites for some relationships
  • [MBS-5020] – Report:: recordings linked to same work more than once
  • [MBS-5218] – Report: Duplicate dated/undated relationships
  • [MBS-5839] – Relate To URL should auto-identify beatport as “can be purchased for download” relation.
  • [MBS-5940] – Remove fref=ts from Facebook URLs
  • [MBS-5949] – Update the wording about deleting accounts

New Feature

Sub-task

  • [MBS-2917] – Report:Releases with unknown track times

Replication issues, and packet 64833 is large

  1. We’ve ushered in the new year by discovering, then solving, some issues with replication: packet number 64831 (from yesterday, 1AM UTC) didn’t build correctly and needed a bit of manual prodding. However, it’s now been pushed out and replication should be back to normal.
  2. As part of our fix process, we turned off production of replication packets for the bulk of today. As a result, packet number 64833 covers what would otherwise have been about 20 packets, and thus is somewhat large. The import process for this packet will accordingly take somewhat longer than usual.

Sorry for any inconvenience, and happy new year!