Post schema change fix for importing clean data

Yesterday we found a bug that prevents importing a post-schema-change data set. We’ve pushed out a fix for this and tagged it with:

v-2012-05-15-import-fix

If you’re planning on importing a new data set, make sure to check out this tag, rather than the tag mentioned in the schema change release entry.

Schema change server update, 2012-05-15

Nearly one year after we released NGS, we have another schema change update with lots of new features!

This release contains 9 new features and improvements that take advantage of the new schema. These are:

  • More social user profiles, which can now have Gravatars, languages (and the user’s proficiency), age and country.
  • More expressive aliases for artists, labels and works. Aliases can now have types and sort names, and multiple aliases may be used per locale, along with the ability to mark one alias as ‘primary’ for that locale.
  • Release group types have been separated into primary and secondary types. A release group now has 1 primary type and may have multiple secondary types. This allows us to have ‘remix compilation albums’, for example.
  • Works may have multiple ISWCs
  • Artists, labels and relationships may be marked as ‘ended’ to indicate that they have ended, but the exact date is not known
  • Vinyl style/free text track numbers are now supported.
  • Works may have a lyrics language associated with them
  • Artists and labels may have multiple IPIs
  • We have moved to use ISO 639-3 for our language table. While not all languages are exposed at the moment, this gives us a lot more flexibility going forward.
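
To make the alias change a little more concrete, here is a minimal Python sketch of the rule described above: several aliases per locale, at most one of them marked ‘primary’. This is not the server’s code (the server is written in Perl), and the names Alias, check_primaries and display_name are made up for illustration. Skipping aliases with no locale is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Alias:
    # Hypothetical model for illustration; not the real database schema.
    name: str
    sort_name: str
    locale: Optional[str] = None  # e.g. "en"; None means no locale set
    primary: bool = False         # primary alias for its locale


def check_primaries(aliases: List[Alias]) -> None:
    """Raise ValueError if any locale has more than one primary alias."""
    seen = set()
    for alias in aliases:
        # Assumption: aliases without a locale are ignored by this check.
        if alias.primary and alias.locale is not None:
            if alias.locale in seen:
                raise ValueError(
                    f"locale {alias.locale!r} has more than one primary alias"
                )
            seen.add(alias.locale)


def display_name(default: str, aliases: List[Alias], locale: str) -> str:
    """Return the primary alias for `locale`, or `default` if there is none."""
    for alias in aliases:
        if alias.primary and alias.locale == locale:
            return alias.name
    return default
```

With aliases modelled like this, a locale can carry any number of aliases, and a lookup for a locale without a primary alias simply falls back to the entity’s own name.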

Many thanks to nikki for going way beyond our expectations for testing (and patience!); to Ian McEwen for his continued work on statistics; and to the MusicBrainz team for making this all happen.

If you have a replicated instance of MusicBrainz, please follow these instructions to get your server running on the new schema:

  1. Take down the web server running MusicBrainz, if you’re running a web server.
  2. Turn off cron jobs if you are automatically updating the database via cron jobs.
  3. Make sure your REPLICATION_TYPE setting is RT_SLAVE.
  4. Switch to the new code with git fetch origin followed by git checkout v-2012-05-15-schema-change
  5. Run carton install --deployment. If you have not switched your installation to using carton, please read INSTALL.md on how to do this.
  6. Run carton exec -- ./upgrade.sh from the top of the source directory.
  7. Set DB_SCHEMA_SEQUENCE to 15 in lib/DBDefs.pm
  8. Turn cron jobs back on, if needed.
  9. Restart the MusicBrainz web server, if needed.
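
If you would like to review the command-line part of these instructions (steps 4 to 6) before running anything, a small dry-run helper along these lines may help. It is only a sketch: the function names are invented, it prints the commands rather than executing them, and steps such as stopping the web server, editing lib/DBDefs.pm and re-enabling cron still have to be done by hand.

```python
import shlex
from typing import List

SCHEMA_CHANGE_TAG = "v-2012-05-15-schema-change"  # tag named in step 4


def upgrade_commands(tag: str = SCHEMA_CHANGE_TAG) -> List[List[str]]:
    """Return the commands from steps 4 to 6, in order, as argument lists."""
    return [
        ["git", "fetch", "origin"],
        ["git", "checkout", tag],
        ["carton", "install", "--deployment"],
        ["carton", "exec", "--", "./upgrade.sh"],
    ]


def render(commands: List[List[str]]) -> str:
    """Format the commands one per line so they can be read before running."""
    return "\n".join(shlex.join(cmd) for cmd in commands)


if __name__ == "__main__":
    # Dry run only: nothing is executed, the commands are just printed.
    print(render(upgrade_commands()))
```

Run it from anywhere; the printed commands are meant to be executed from the top of the source directory, as step 6 says.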

If you are running an mbslave mirror, check out the latest code and read the upgrade instructions in the README file.

Bug

  • [MBS-3189] – Remove unused ref_count column and related functions
  • [MBS-4616] – Add work language statistics
  • [MBS-4629] – /cover-art page shows no collections
  • [MBS-4637] – Timeline graph won’t graph anything without an entry in statistics/view.js
  • [MBS-4640] – Clicking cover art opens box with “����” (4 U+FFFD)
  • [MBS-4642] – Thickbox CSS interferes with MB CSS
  • [MBS-4647] – Cover art page allows submitting edit with no cover art when JS is off
  • [MBS-4648] – Changing cover art type from “other” to unset causes Internal Server Error
  • [MBS-4678] – upgrade.sh is not ready for testing
  • [MBS-4679] – Internal server error adding secondary types to a release

Improvement

  • [MBS-1485] – Alias types
  • [MBS-1798] – Lyrics language for works
  • [MBS-1799] – Add ISO 639-3 language codes to the database
  • [MBS-1981] – Add blog feed to the home page
  • [MBS-2240] – Aliases: certain locale can be used only once in the list of aliases
  • [MBS-2532] – Allow more than one IPI per artist
  • [MBS-2851] – Timeline graph events should be in the database
  • [MBS-2885] – Allow more than one ISWC per work
  • [MBS-3646] – Split release group attributes into two types
  • [MBS-3788] – Alias improvements
  • [MBS-4625] – Improve wording of cover art tab when cover art comes from relationships
  • [MBS-4676] – Do not allow people entering deprecated relationships

New Feature

  • [MBS-842] – Allow vinyl style track numbers and sides
  • [MBS-1385] – Support unknown end dates
  • [MBS-3704] – Allow adding sort names to artist aliases
  • [MBS-4337] – Make user profile more social: add (optional) fields avatar, gender, birth year, country

Next schema change release: May 15

We’ve talked a bit about our upcoming schema change release, but we hadn’t nailed down the exact date of the release. Now that we’re tangibly close, we’ve settled on May 15th as the actual release date.

As a reminder, here are the tickets that will change our schema and here are all the tickets scheduled for the 15th.

Looking for Language Liaisons

As some of you may know, this summer through Google Summer of Code I’m working on internationalization of musicbrainz-server. As outlined in my proposal, I’m currently looking to find what I call “language liaisons”: folks willing to be the go-to person about a given language for me and other developers.

What’s expected of liaisons:

  1. Willing to be pestered occasionally, by me or other developers, about language-specific concerns: when adding new features, and thus adding new strings, we’d like to be able to ensure nothing’s added that will need to be changed before it can be translated into a given language.
  2. Willing to file bugs for strings already in the database that are untranslatable, should you find them.
  3. Be on the musicbrainz-i18n mailing list; this will be the main venue for organization and communication about i18n issues.
  4. Ideally, to be an active translator for your language – but this isn’t a requirement, because I’d like to get the widest global coverage I can; even if a language doesn’t currently have a translation, we don’t want to unintentionally sabotage future translators with untranslatable strings!

I’ll also be determining a (related) list of “target languages” for the summer, with the intention of releasing translations into these languages on musicbrainz.org at the end of the summer. Languages will be considered for this list if they are both in active translation on Transifex and have a language liaison.

If you’re interested in being a language liaison, please contact me: ianmcorvidae (at) musicbrainz (dot) org, editor ianmcorvidae, or ianmcorvidae on IRC, and join the mailing list.

If you’re interested in i18n generally, please join the musicbrainz-i18n list. For more information on my project and musicbrainz-server i18n, see the server internationalization wiki page, my post on my personal blog, and my official proposal, or come ask about it on IRC or the mailing list!

Server update, 2012-04-30

We’ve released another set of bug fixes and improvements for the server. Thanks to Joachim LeBlanc, Johannes Weißl and the rest of the MusicBrainz team for helping on this release!

Bug

  • [MBS-2553] – Change wording for no External Links in the sidebar
  • [MBS-4041] – Nonexistent elections cause Internal Server Error
  • [MBS-4544] – Predefined advanced searches on /search no longer work
  • [MBS-4547] – URL cleanup broken on URL edit page
  • [MBS-4563] – Plugin::Diff is broken
  • [MBS-4564] – robots.txt served as octet-stream on test.mb
  • [MBS-4566] – Advanced Search Syntax Link missing from Search
  • [MBS-4573] – Show a message when a label has no releases
  • [MBS-4597] – Merging can fail if entities have aliases with the same locale
  • [MBS-4609] – Comment element on release group in /ws/2 should be removed

Improvement

  • [MBS-1167] – "Read more" for annotation previews should load the full annotation in place, not take you to a different page
  • [MBS-1881] – Move relationship types/attributes/instruments lists
  • [MBS-2479] – Mark approvals differently from normal Yes votes
  • [MBS-2814] – Reports don’t highlight entities with pending edits
  • [MBS-3684] – Improve sorting of the "Releases with superfluous data tracks" report
  • [MBS-4038] – Let users edit their permissions on test servers
  • [MBS-4063] – Exclude stuff marked Single AND UK from the SeparateDiscs report
  • [MBS-4125] – Sort report for creative commons download relationships
  • [MBS-4476] – Provide indication of non-front cover art
  • [MBS-4493] – ISRCs with multiple recordings: Do something about the DEF05 ISRCs
  • [MBS-4530] – Remove the recordings with CC download relationships report
  • [MBS-4560] – "No votes" failed edit should not fail in silence
  • [MBS-4572] – Open up /artist/ in robots.txt

New Feature

  • [MBS-4551] – Display which git branch is active
  • [MBS-4567] – Use markdown for README and INSTALL
  • [MBS-4576] – Allow for google analytics support
  • [MBS-4590] – Use /doc/About instead of /doc/About_MusicBrainz
  • We’ve also improved cover art support slightly.

Sub-task

  • [MBS-4160] – SoundCloud relationship under the External links section

Server update, 2012-04-10

Sorry for being a week behind on this release, but we’ve just finished pushing out another set of changes. Many thanks to Lukáš Lalinský, Paul Taylor and the rest of the MusicBrainz team for making this release happen! Here’s what we’ve just released:

Bug

  • [MBS-3619] – Statistics page doesn’t validate
  • [MBS-3794] – A "no votes" edits got wrong status "failed vote"
  • [MBS-3834] – Hovering over artist names in "edit medium" tracklist changes does not show the artist sort name
  • [MBS-4082] – Titles including quote marks are truncated when adding from a CD Stub
  • [MBS-4131] – Edits show credited-as name when it’s the same as the artist name
  • [MBS-4183] – Release editor: enters new artist if no radio button selected on ‘add missing entities’ tab
  • [MBS-4314] – Add interface elements to reorder tracks on a tracklist without having to edit the tracknumbers.
  • [MBS-4367] – Current search URLs are broken on the test server
  • [MBS-4418] – beta.mb is unresponsive
  • [MBS-4467] – ws/2/label doesn’t include label comment
  • [MBS-4484] – Limited user can vote on edits
  • [MBS-4500] – ISE: ‘Can’t call method "is_auto_editor"’ when attaching a TOC
  • [MBS-4504] – Set default unknown value for medium.format_name

Improvement

  • [MBS-684] – TOC lookup displays too little release info
  • [MBS-834] – Weird behaviour of the search checkboxes
  • [MBS-1728] – Make ModBot’s edit notes grey again
  • [MBS-1764] – Inconsistent display for merge edits
  • [MBS-2242] – Disable editing of Medium title when there’s only one medium
  • [MBS-2412] – User tags should be sorted
  • [MBS-2867] – RE: "Add n track(s)" should reset to 1
  • [MBS-3171] – Overview and Recording pages need to be able to exclude featured guest spots from display
  • [MBS-3208] – Trim leading/trailing whitespace in release editor prior to Add Missing Entities check
  • [MBS-3398] – Detect "M" in front of track numbers
  • [MBS-3482] – "YouTube Relationship Type" needs autofix and doc
  • [MBS-3628] – Prevent people from adding broken Facebook URLs
  • [MBS-4234] – Move Basic search logic from mbserver to searchserver
  • [MBS-4390] – Display disambiguation comments better in tooltips
  • [MBS-4490] – "CD N" in Add disc from existing tracklist is confusing
  • [MBS-4523] – Statistics: Use commas and right-align numbers
  • [MBS-4535] – Make visited links more prominent

New Feature

  • [MBS-3160] – Add view to artist pages that shows release groups/recordings/etc credited to that artist only (solo releases only – exclude collaborations, feat., etc)
  • [MBS-3266] – Allow filtering by artist credit

The Git tag for this release is v-2012-04-10-ngs-bug-fixes.

Official schema change release announcement for May 2012 release

We’ve been working on hammering out the details of the upcoming schema change release and we’ve settled on 11 tickets that we’re going to implement. For a detailed description of how our database will change, please refer to our documentation for these tickets.

This release will happen on or about 15 May, 2012.

Search server release: 2012-03-23

Earlier today we switched the latest release of our search server live. Paul says:

In this new release of the search server the basic search will now search more fields and return variations of what you searched such as spelling mistakes and the scoring has also been improved. Advanced search now lets you search for when a field doesn’t exist and you can do exact searches that take accent characters into account using the new accent fields.

We’ve also upgraded to the latest version of Lucene, added a way to see how scores are calculated for results and added integration tests. Hopefully this will all help with future stability.

UPDATE: This code lives in svn and can be found at r13480. I’ve also created this tag.

Thanks for your hard work on this release, Paul. And also thanks to Ollie for taking time to push the server portions of this live today. Here are our release notes:

Bug

  • [SEARCH-168] – Recording search doesn’t consider track duration, only recording duration
  • [SEARCH-191] – Duplicate qdur values being added to document

Improvement

  • [SEARCH-152] – Update Code to use Lucene 3.5
  • [SEARCH-160] – Searching artist by name and initial(s) is impossible, the initials are useless
  • [SEARCH-163] – Search for exact string, with no accents
  • [SEARCH-166] – Artist search does not always offer the obvious (to a human!) results
  • [SEARCH-170] – Move Basic search logic from mbserver to searchserver
  • [SEARCH-171] – Basic Search should work more like Solr Dismax Parser
  • [SEARCH-173] – Index Catalogos without space
  • [SEARCH-174] – Provide a way to search for empty/non-empty fields
  • [SEARCH-177] – Support for initialize index remotely http://localhost:8080/?init=mmap
  • [SEARCH-188] – Add Integration Tests

New Feature

  • [SEARCH-172] – Make it possible to search for barcode:"[none]"
  • [SEARCH-194] – Add debugging option to be able to get lucene explain for results
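
As a rough illustration of the new barcode:"[none]" query from SEARCH-172, here is a short Python sketch that builds a URL-encoded search request. The endpoint in ASSUMED_ENDPOINT and the `query` parameter name are assumptions about how the web service exposes search, not something stated in these release notes, so please check them against the web service documentation before relying on them. The search_url helper is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: release search is reachable at this address and takes a
# `query` parameter holding the Lucene query string.
ASSUMED_ENDPOINT = "https://musicbrainz.org/ws/2/release/"


def search_url(query: str, endpoint: str = ASSUMED_ENDPOINT) -> str:
    """Build a search URL, percent-encoding the Lucene query string."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # The quotes and brackets must be encoded to survive in a URL.
    print(search_url('barcode:"[none]"'))
```

The main point is the encoding: the colon, quotes and square brackets in the query are not safe to paste into a URL unescaped, so it is easier to let urlencode handle them.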

Schema change update

As per my previous blog post, today is the deadline for submitting tickets for consideration in the May 15 2012 schema change release. Eleven tickets have been championed for this release — you can see each of these tickets in the schema change release fix version in jira.

I would like to call your attention to three tickets that I think ought to get more input from the community:

  1. Split release group attributes into two types
  2. Lyrics language for works
  3. Add ISO 639-3 language codes to the database

If any of these issues interest you, please take a look and leave a comment if you have any input. Finally, I will be keeping track of the progress of the schema change release on this wiki page.

On April 2, I’m hoping to put together a complete schema change document that outlines the exact changes to our schema.

Many thanks to all of the volunteers who adopted tickets and are moving them forward!