DERI semantic-web researcher Alexandre Passant just announced a semantic-web-based music-recommendation engine called dbrec. It runs on DBpedia data and computes a "Linked Data Semantic Distance" between bands to find likely suggestions. This is an intriguing premise, and probably a worthy experiment.  

The site isn't labeled "intriguing experiment", though; it's labeled "intelligent music recommendations". Here are its top "intelligent" recommendations for, just to pick a random example near the beginning of the alphabet, Annihilator:  

Jeff Waters
Primal Fear
Randy Black
Bif Naked  

"Intelligent" is not really the word for this.  

On the one hand, the quality of the recommendations is mostly not Passant's fault. The underlying data isn't that great, and you can see how not-great it is in Passant's generated explanations, like this one for how Jeff Waters and Annihilator relate:  

Annihilator (band) is 'associated musical artist' of Jeff Waters (7 artists sharing it)
Annihilator (band) is 'associated acts' of Jeff Waters (7 artists sharing it)
Annihilator (band) is 'associated band' of Jeff Waters (7 artists sharing it)
Jeff Waters is 'current members' of Annihilator (band) (2 artists sharing it)
Annihilator (band) and Jeff Waters share the same value for 'genre'
- Thrash metal (529 artists sharing it)
- Groove metal (101 artists sharing it)
- Speed metal (170 artists sharing it)
- Heavy Metal Music (1534 artists sharing it)
Annihilator (band) and Jeff Waters share the same value for 'reference' (1 artists sharing it)

This is an incredibly obtuse way of saying what the human-readable Wikipedia article about Waters says in its first sentence: "Jeff Waters is the guitarist and mastermind of the thrash metal band Annihilator". Passant's data doesn't quite record this fact, so he's left to try to make sense of the differences among "associated musical artist", "associated acts", "associated band" and "reference".  

Unsurprisingly, not much interesting sense results. Everything in this example is connected primarily through personnel overlap. Drummer Randy Black has played in both Primal Fear and Annihilator, and on one Bif Naked album. D.O.A. founder Randall Archibald sang on two Annihilator albums. Extreme are on the list because drummer Mike Mangini played briefly in both bands. Bif and Annihilator share the hometown "Canada".  
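
To make the mechanics concrete, here is a minimal sketch of how a link-counting distance ends up dominated by personnel overlap. It is not Passant's actual formula. The weighting function, the function names, and the "Anthrax" row are all assumptions made up for illustration. Only the Jeff Waters sharing counts are copied from the explanation quoted above. The one idea it borrows is that a link shared by fewer artists counts for more.

```python
import math

# (property, number of artists sharing that link with Annihilator).
# The Jeff Waters counts come from the dbrec explanation quoted above.
# The Anthrax row is HYPOTHETICAL: it assumes genre overlap only.
SHARED_LINKS = {
    "Jeff Waters": [
        ("associated musical artist", 7),
        ("associated acts", 7),
        ("associated band", 7),
        ("current members", 2),
        ("genre: Thrash metal", 529),
        ("genre: Groove metal", 101),
        ("genre: Speed metal", 170),
        ("genre: Heavy Metal Music", 1534),
        ("reference", 1),
    ],
    "Anthrax (hypothetical)": [
        ("genre: Thrash metal", 529),
        ("genre: Speed metal", 170),
        ("genre: Heavy Metal Music", 1534),
    ],
}


def link_weight(sharing_count):
    """Assumed weighting: the rarer a shared link, the more it counts."""
    return 1.0 / (1.0 + math.log(sharing_count))


def similarity(links):
    """Sum the weights of every shared link."""
    return sum(link_weight(count) for _, count in links)


def distance(links):
    """Turn similarity into a distance in (0, 1]; smaller means closer."""
    return 1.0 / (1.0 + similarity(links))


if __name__ == "__main__":
    # Print candidates from closest to farthest.
    for name, links in sorted(SHARED_LINKS.items(), key=lambda kv: distance(kv[1])):
        print(f"{distance(links):.3f}  {name}")
```

Under these assumptions, the four shared genres together contribute less than the single rare "reference" link, and the same personnel relationship is counted three times under three property names. A band's own guitarist comes out "closer" than a band that merely sounds similar, which is a fine answer to a question nobody asked.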

These connections aren't irrelevant, exactly. If you were trying to get the phone number of Annihilator's booking agent, they might be worth scanning through in case you spot somebody you went to high school with.  

As musical recommendations, though, they suck. As "intelligent" musical recommendations they're idiotic. Annihilator is a thrash-metal band, Extreme were metal-derived MOR pop, Bif Naked is a punk singer. Compare the list, with no claim of "intelligence", for Annihilator on empath:  

0.228 Anthrax
0.224 Dio
0.206 Sodom
0.182 Saxon
0.176 Black Label Society
0.175 Onslaught (Gbr)
0.170 Grave Digger

A person could probably do better than this, too, especially if they're allowed extra adjectives and a finer granularity than artists ("like Kill 'Em All-era Metallica", or "started off like early Slayer, but with more emphasis on technique"; and I don't even know Annihilator very well), but this list is at least not inane.  

And yet, Passant's work is almost certainly more technically sophisticated than mine. I used one genre, one data-source and one connection-metric, and produced a deliberately simple web-site with almost no ancillary information. Passant had to confront the sprawl of the Linked Open Data "cloud", figure out non-obvious weightings for a bunch of different connection paths, and display a lot more information than I deal with, in both breadth and depth.  

And yet, and yet, and yet: The recommendations are bad. Or, more accurately, the connections are what they are, but calling them recommendations is bad. Calling them "intelligent" is worse, and presenting the combination of "intelligent", "recommendation" and "linked data" to the general public is deadly. If "Linked Data" and "Semantic Web" mean ways for machines to tell me that if I like Annihilator I should listen to Extreme, then nobody needs them. If Linked Data, the movement, can't tell the difference between intelligent and idiotic, it's not to be trusted on anything.  

[2013 Postscript]

Site contents published by glenn mcdonald under a Creative Commons BY/NC/ND License except where otherwise noted.