If your replicated slave threw an error trying to apply packet #104949 (showing the message
ERROR: duplicate key value violates unique constraint "artist_alias_idx_primary"), then you can un-break things by doing the following:
- Get the latest code from the master branch:
git checkout master && git pull origin master
(Or, if you don’t want to update your code, clear the dbmirror_pending tables instead:
echo 'TRUNCATE dbmirror_pending CASCADE; TRUNCATE dbmirror_pendingdata CASCADE;' | ./admin/psql READWRITE)
- Proceed with replication as normal, either via cron or by running the replication script manually (a full session is sketched below).
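Putting the steps above together, a full recovery session might look roughly like the following. This is a sketch against a standard musicbrainz-server checkout; the replication script name at the end is my assumption and may differ by version:

```
cd musicbrainz-server   # your server checkout

# Either update to the latest code, which includes the fix...
git checkout master && git pull origin master

# ...or, if you would rather not update, clear the stuck dbmirror tables:
echo 'TRUNCATE dbmirror_pending CASCADE; TRUNCATE dbmirror_pendingdata CASCADE;' | ./admin/psql READWRITE

# Then resume replication via your usual cron job, or by hand, e.g.
# (script name is an assumption; check your checkout):
./admin/replication/LoadReplicationChanges
```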
You can also re-import from dump 20170605-031203 or later, and replicate from there. We’re very sorry for the inconvenience.
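For those choosing the re-import route, it is roughly as follows (a sketch: the InitDb.pl flags and dump file names are assumptions to adapt to your setup):

```
# Download the 20170605-031203 (or later) mbdump archives, then:
./admin/InitDb.pl --createdb --import mbdump*.tar.bz2 --echo

# Replication then resumes from the packet number recorded in that dump.
```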
The issue here was caused by a bug in our alias merge code that interacted strangely with dbmirror. Since that code went untouched for years, the trigger for this issue must have been extremely rare. I’ve put in place a fix for the merge logic to ensure it doesn’t happen there again, and am investigating dbmirror’s behavior to see why it didn’t sequence the updates correctly.
We have picked our set of tickets and the date for our May 2017 schema change release: May 15th, 2017. This will be a fairly standard and minor schema change release — we’re only tackling 3 tickets that affect downstream users, with no other infrastructure changes.
Take a look at our list of tickets for this schema change release. There really are only two tickets that will affect most of our downstream users:
- MBS-8393: “Extend dynamic attributes to all entities” Currently our works have the concept of additional attributes, which lets the community decide which sorts of new attributes to apply to a work (e.g. catalog numbers, rhythmic structures, etc.). This ticket extends these attributes to all of our entities. Note that it will not change any of the existing database tables; it will only add new ones.
- MBS-5452: “Support multiple lyric language values for works” Currently only one language or the special case “multiple languages” may be used to identify the language used in lyrics. This ticket allows more than one language to be specified for lyrics of a work.
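To make the shape of the MBS-5452 change concrete: conceptually it turns the single lyrics-language value into a work-to-language link table, so a work can carry several languages at once. The sketch below is purely illustrative; the table and column names are my assumptions, not the actual migration:

```
./admin/psql READWRITE <<'SQL'
-- Illustrative only: a work may now link to any number of lyric languages.
CREATE TABLE work_language (
    work     INTEGER NOT NULL,  -- references work.id
    language INTEGER NOT NULL,  -- references language.id
    PRIMARY KEY (work, language)
);
SQL
```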
The following tickets are special cases — they will not really affect downstream users who do not have edit data loaded into their system. We are only including these changes in the schema change release in order to bring some older replicated systems up to date. If you do not use the edit data, please ignore these tickets.
- MBS-9271: “Prevent usernames from being reused” This ticket does not change the schema, but for the sake of minimizing downstream disruption, we’re going to carry it out during the schema change.
- MBS-9274: “Fix the edit_note_idx_post_time_edit index in older setups to handle NULL post_time” This ticket fixes an SQL index on an edit-related table.
- MBS-9273: “Fix the a_ins_edit_note function in older setups to not populate edit_note_recipient for own notes” This ticket fixes an SQL trigger function on an edit-related table.
This is it — really minor this time around. If you have any questions, feel free to post them in the comments or on the tickets themselves.
I’m pleased to announce that MetaBrainz has been accepted into the Google Summer of Code program for 2017!
If you are an eligible university student who would like to participate in Summer of Code and get paid to hack on a MetaBrainz project over the summer, take a look at our ideas page for 2017. If anything there sounds interesting, see our getting started page.
We kindly ask that you carefully read the ideas page and the getting started page before you contact us for help!
Thanks and good luck applying!
I’m very happy to announce that we have a brand new Supporter Catalyst on our team. Elizabeth Bigger, AKA Quesito, joined our team at the beginning of the year and is now coming up to speed.
Her duties include making contact with any supporters who sign up on the MetaBrainz site and sorting out any questions they may have about working with a quirky organization like MetaBrainz. She’ll also be reaching out to established customers to make sure that they are on the right support level and that things are working smoothly for our supporters.
I anticipate her also helping out with other tasks such as putting on our annual summit and other events we may hold in our office in Barcelona.
Welcome on board, Elizabeth!
This release features code from GCI student dpmittal, who fixed four of the tickets below under our mentorship. One of those tickets was for displaying the excellent artist icons that former GCI student (and current mentor) gcilou created. Those icons are displayed to the left of the name at the top of artist pages (examples: person, group, choir, orchestra, character, other). Nice work, gcilou and dpmittal! We also have various fixes and improvements thanks to chirlu and Zastai, listed below.
The git tag is
- [MBS-4159] – Vimeo relationship under the External links section
- [MBS-7009] – Exception if replication type is slave but no data in replication_control
- [MBS-8268] – Ratings (stars) display does not update on its own
- [MBS-9117] – CD Stub track count not serialized correctly
- [MBS-8359] – Add “Guess Case” function for Event names
- [MBS-8870] – Add Setlist.fm links to the sidebar
- [MBS-1352] – Different icon for Unknown/Person/Group on Artist pages
- [MBS-8542] – Blacklist Jaikoz from making barcode edits
In my post from yesterday I talked about our continuing struggle to fix our replication stream. Overnight we learned two new things:
- The lengthy hard drive recovery process has failed and yielded no useful results. 😦
- We have a working DB diff program in place that allows us to create missing replication packets from two DBs at known packet numbers.
#2 is a great step forward in fixing our replication stream, but we’re missing a specific replicated database at a specific point in time. If you have a replicated database, please read on and see if you can help us:
- Is your database at replication packet 99847? The way to find out your current replication sequence is to look in slave.log in your server directory or to issue the query
select current_replication_sequence from replication_control;
at the SQL prompt. (A ready-to-run version of this query is sketched below.)
- Do you have a complete replicated MusicBrainz database, including the cover_art_archive schema?
- Are you willing to make a dump of the DB and send it to us?
- Do you have a fast internet connection to make #3 possible?
If you’ve answered yes to all of the above, please send email to support at metabrainz dot org.
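If you’d like to check your packet number or prepare a dump, something along these lines should work. The psql invocation matches the one used earlier in this post; the pg_dump database name and flags are assumptions to adapt to your installation:

```
# Which replication packet is this database at?
echo 'SELECT current_replication_sequence FROM replication_control;' | ./admin/psql READWRITE

# If it says 99847: make a compressed dump of the whole database
# (database name is an assumption -- use your own).
pg_dump -Fc musicbrainz_db > musicbrainz-99847.dump
```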
It has been a long week since our move to the new hosting provider in Germany. Our move across the Atlantic worked out fairly well in the grand scheme of things. The new servers are performing well, the site is more stable and we have a modern infrastructure for most of our projects.
However, such moves are not without problems. While we didn’t hit many of them, the most significant one was our failure to copy two small replication packets off the old servers. We didn’t notice this until after the server in question had been decommissioned. Oops.
And thus began a recovery effort almost worthy of a bad Hollywood B-movie plot. Between my traveling and the team finishing the most critical migration bits, it took us 2 days to realize the problem and find a volunteer to fetch the drives from the broken server. Only in a small and wealthy place such as San Luis Obispo could a stack of recycled servers sit in an open container for 2 days and not be touched at all. My friend collected the drives and immediately noticed that they had been damaged in the recycling process, which isn’t surprising. We can consider ourselves really lucky that this drive didn’t contain private data — those drives have been physically destroyed!
Since then, my friend has been working with Linux disk-recovery tools to try to recover the two replication packets from the drive. Given that he is working with a 1TB drive, the recovery process takes a while and must fully complete before any data can be pulled off. For now we wait.
At the same time, we’re actively cobbling together a method to regenerate the lost packets. In theory it is possible, but it involves stupidly heroic efforts. We’re expending that effort, but so far it has borne no fruit.
In the meantime, for all of the people who use our replicated (Live Data) feed — you have the following choices:
- If you need data updates flowing again as soon as possible, we strongly recommend importing a new data set. We have a new data dump and fresh replication packets being put out, so you can do this whenever you’re ready.
- If the need for updates is not urgent yet and you’d rather not reload the data, sit tight. We’re continuing our stupidly heroic efforts to recover the replication packets.
- Chocolate: It really makes everything better. It may not help with your data problems, but at least it takes the edge off.
We’re terribly sorry for the hassle in all of this! Our geek pride has been sufficiently dinged that our chocolate coping mechanisms will surely cause us to put on a pound or two.
UPDATE 1: The first recovery examination has not located the files, but my friend will do a second pass tomorrow and turn over file fragments to us that might allow us to recover files. But that won’t be for another 8 hours or so.