The raw data feeding into the duplicate artist/album reports has been updated. The reports are available under Edit The Data/Suggestions as the last two entries: “Albums that might need merging” and “More possibly duplicate artists”. The last time these reports were generated (July) we had 1703 possible album duplicates and 1872 possible artist duplicates. We now have 2714 possible album duplicates and 2424 possible artist duplicates.
Note that there is a delay between when I upload the raw data and when it is reflected on the server – I think this was set up to happen once a day, but it may be only once a week.
As always, if anyone is looking at these, and there is a confirmed false positive, let me know and I will (a) make sure it doesn’t show up in the next report, and (b) see if I can improve the overall reporting. So far very few people have submitted false positives.
It’s daily (well, nightly if you’re in California’s time zone). Also I’ve just re-run those reports by hand, so the reports should now be up-to-date.
Would it be possible to combine the two reports into a third that might be shorter but more certain, i.e. “artists with similar names who also each have an album with the same tracklist”?
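A rough sketch of what such a combined report might look like, in Python. This is not how the existing reports are generated; the function names, the input layout (artist name mapped to albums, each album mapped to its track titles), and the 0.85 similarity threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    # Lowercase and collapse whitespace so trivial differences don't matter.
    return " ".join(text.lower().split())


def similar(a, b, threshold=0.85):
    # Fuzzy name match; the threshold is an arbitrary illustrative value.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def tracklist_key(tracks):
    # An ordered, normalized tuple of track titles identifies a tracklist.
    return tuple(normalize(t) for t in tracks)


def likely_duplicate_artists(albums_by_artist, threshold=0.85):
    """albums_by_artist: {artist_name: {album_title: [track titles]}} (hypothetical layout)."""
    results = []
    for a, b in combinations(sorted(albums_by_artist), 2):
        if not similar(a, b, threshold):
            continue
        keys_a = {tracklist_key(t) for t in albums_by_artist[a].values()}
        keys_b = {tracklist_key(t) for t in albums_by_artist[b].values()}
        # Keep the pair only if at least one tracklist appears under both names.
        if keys_a & keys_b:
            results.append((a, b))
    return results
```

Because this only compares albums across two different artist names, a multi-disc set by a single artist whose discs share a tracklist would not be flagged by it. Comparing every pair of artists is quadratic, so a real report would presumably need to narrow the candidates first.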
Zaireeka by the Flaming Lips is a false positive (4-disc set, each disc with the same tracklist).
I’ve added Wiki pages describing these reports, with pages for reporting false positives:
http://wiki.musicbrainz.org/wiki.pl?DuplicateSuggestions
Incidentally, the reports that Dave generated by hand this afternoon are still using the June data:
Data last supplied: 2004-06-14 12:27:34 EDT
Report generated: 2004-10-25 15:03:10 EDT
The latest (2004-10-25) reports ran at 10 PM GMT with current data, so it all seems to be working. Thanks again for getting this going (again).
@alex