23 October 2007 to 18 September 2007

It is possible for even the most preternaturally precocious child to actually miss a diaper from one tenth of an inch away.

Here is a gift, of unspecified value, to the field of set-comparison math: The Empath Coefficient, an alternate measure of the alignment between two sets. Conceptually this is intended as a rough proxy for measuring the degree to which the unseen or impractical-to-measure motivation behind the membership of set A also informs the membership of set B, but the math is what it is, so the next time you find yourself comparing the Cosine, Dice and Tanimoto coefficients, looking for something faster than TF-IDF to make some sense of your world, here's another thing to try. This is the one I used in ***em*path**, my recent similarity-analysis of heavy-metal bands, if you want to see lots of examples of it in action.

At its base, the Empath Coefficient is an asymmetric measure, based on the idea that in a data distribution with some elements that appear in many sets and some that appear in only a few, it is not very interesting to discover that everything is "similar" to the most-popular things. E.g., "People who bought *Some Dermatological Diseases of the Domesticated Turtle* also bought *Harry Potter and the...*". In the Empath calculation, then, the size of the *Harry Potter* set (the one you're comparing) affects the similarity more than the size of the *Turtle* set (the one you're trying to learn about). I have arrived at a 1:3 weighting through experimenting with a small number of data-sets, and do not pretend to offer any abstract mathematical justification for this ratio, so if you want to parameterize the second-set weight and call that the *N*path Coefficient, go ahead.

Where the Dice Coefficient, then, divides the size of the overlap by the average size of the two sets (call A the size of the first set, B the size of the second set, and V the size of the overlap):

V/((A+B)/2)

or

2V/(A+B)

the core of the Empath Coefficient adjusts this to:

V/((A+3B)/4)

or

4V/(A+3B)

By itself, though, that calculation will still be uninformatively dominated by small overlaps between small sets, so I further discount the similarities based on the overlap size. Like this:

(1-1/(V+1)) * V/((A+3B)/4)

or

4V(1-1/(V+1))/(A+3B)

So if the overlap size (V) is only 1, the core score is multiplied by 1/2 [1-1/(1+1)], if it's 2 the core score is multiplied by 2/3 [1-1/(2+1)], etc. And then, for good measure, I parameterize the whole thing to allow the assertion of a minimum overlap size, M, which goes into the adjustment numerator like this:

4V(1-M/(V+1))/(A+3B)

This way the sample-size penalties are automatically calibrated to the threshold, and below the threshold the scores actually go negative. You can obviously overlay a threshold on the other coefficients in pre- or post-processing, but I think it's much cooler to have the math just take care of it.
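To see how the minimum-overlap parameter behaves, here is a minimal Python sketch of the formula above. The set sizes (10 and 10) and the choice of M = 3 are made-up values for illustration only. Worth noticing: the adjustment factor 1-M/(V+1) is exactly zero when V = M-1, and only turns negative for overlaps smaller than that.

```python
def empath(a: int, b: int, v: int, m: int = 1) -> float:
    """Empath Coefficient: 4V(1 - M/(V+1)) / (A + 3B)."""
    return 4 * v * (1 - m / (v + 1)) / (a + 3 * b)


# Hypothetical example: two sets of size 10, minimum overlap M = 3.
for v in range(1, 6):
    print(f"V={v}: {empath(10, 10, v, m=3):+.3f}")
```

With these numbers the score is negative at V=1, zero at V=2, and positive from V=3 upward.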

I also sometimes use another simpler asymmetric calculation, the Subset Coefficient, which produces very similar rankings to Empath's for any given A against various Bs (especially if the sets are all large):

(V-1)/B

The concept here is that we take A as stipulated, and then compare B to A's subset of B, again deducting points for small sample-sizes. The biggest disadvantage of Subset is that scores for As of different sizes are not calibrated against each other, so comparing A1/B1 similarity to A2/B2 similarity won't necessarily give you useful results. But sometimes you don't care about that.

This is the one I used for calculating artist clusters from 2006 music-poll data, where cross-calibration was inane to worry about because the data was so limited to begin with.

Here, then, are the formulae for all five of these coefficients:

Cosine: V/sqrt(AB)

Dice: 2V/(A+B)

Tanimoto: V/(A+B-V)

Subset: (V-1)/B

Empath: 4V(1-M/(V+1))/(A+3B)
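For anyone who wants to try these, the five formulae can be sketched directly in Python. This is only an illustration of the arithmetic above, not the tooling actually used for the analysis; the `sizes` helper and the two sample sets are hypothetical.

```python
from math import sqrt


def cosine(a, b, v):
    return v / sqrt(a * b)


def dice(a, b, v):
    return 2 * v / (a + b)


def tanimoto(a, b, v):
    return v / (a + b - v)


def subset(a, b, v):
    return (v - 1) / b


def empath(a, b, v, m=1):
    return 4 * v * (1 - m / (v + 1)) / (a + 3 * b)


def sizes(set_a, set_b):
    """Return (A, B, V): the two set sizes and the size of their overlap."""
    return len(set_a), len(set_b), len(set_a & set_b)


# Hypothetical sets giving A=10, B=5, V=3.
first = set(range(10))
second = set(range(7, 12))
a, b, v = sizes(first, second)
for f in (cosine, dice, tanimoto, subset, empath):
    # The symmetric measures give the same score in both directions;
    # Subset and Empath do not.
    print(f"{f.__name__}: {f(a, b, v):.3f} forward, {f(b, a, v):.3f} reversed")
```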

And here are some example scores and ranks:

# | A | B | V | Dice | Rank | Tanimoto | Rank | Cosine | Rank | Subset | Rank | Empath | Rank |

1 | 100 | 100 | 100 | 1.000 | 1 | 1.000 | 1 | 1.000 | 1 | 0.990 | 1 | 0.990 | 1 |

2 | 10 | 10 | 10 | 1.000 | 1 | 1.000 | 1 | 1.000 | 1 | 0.900 | 2 | 0.909 | 2 |

3 | 10 | 10 | 5 | 0.500 | 3 | 0.333 | 3 | 0.500 | 5 | 0.400 | 4 | 0.417 | 4 |

4 | 10 | 10 | 2 | 0.200 | 12 | 0.111 | 12 | 0.200 | 12 | 0.100 | 11 | 0.133 | 12 |

5 | 10 | 5 | 3 | 0.400 | 6 | 0.250 | 6 | 0.424 | 6 | 0.400 | 5 | 0.360 | 5 |

6 | 5 | 10 | 3 | 0.400 | 6 | 0.250 | 6 | 0.424 | 6 | 0.200 | 8 | 0.257 | 8 |

7 | 10 | 5 | 2 | 0.267 | 10 | 0.154 | 10 | 0.283 | 10 | 0.200 | 7 | 0.213 | 10 |

8 | 5 | 10 | 2 | 0.267 | 10 | 0.154 | 10 | 0.283 | 10 | 0.100 | 11 | 0.152 | 11 |

9 | 6 | 6 | 2 | 0.333 | 9 | 0.200 | 9 | 0.333 | 9 | 0.167 | 9 | 0.222 | 9 |

10 | 6 | 4 | 2 | 0.400 | 6 | 0.250 | 6 | 0.408 | 8 | 0.250 | 6 | 0.296 | 6 |

11 | 6 | 2 | 2 | 0.500 | 3 | 0.333 | 3 | 0.577 | 3 | 0.500 | 3 | 0.444 | 3 |

12 | 2 | 6 | 2 | 0.500 | 3 | 0.333 | 3 | 0.577 | 3 | 0.167 | 9 | 0.267 | 7 |

A few things to note:

- In 1 & 2, notice that Dice, Tanimoto and Cosine all produce 1.0 scores for congruent sets, no matter what their size. Subset and Empath only *approach* 1, and give higher scores to larger sets. The idea is that the larger the two sets are, the more unlikely it is that they coincide by chance.

- 5 & 6, 7 & 8 and 11 & 12 are reversed pairs, so you can see how the two asymmetric calculations handle them.

- Empath produces the finest granularity of scores, by far, including no ties even within this limited set of examples. Whether this is good or bad for any particular data-set of yours is up to you to decide.

- Since all of these work with only the set and overlap sizes, none of them take into account the significance of two sets overlapping *at some specific element*. If you want to probability-weight, to say that sharing a seldom-shared element is worth more than sharing an often-shared element, then look up term frequency -- inverse document frequency, and plan to spend more calculation cycles. Sometimes you need this. (I used tf-idf for comparing music-poll voters, where the set of *set-sizes* was so small that without taking into account the popularity/obscurity of the *albums* on which voters overlapped, you couldn't get any interesting numbers at all.)

There may or may not be some clear mathematical way to assess the fitness of each of these various measurements for a given data-set, based on its connectedness and distribution, but at any rate I am not going to provide one. If you actually have data with overlapping sets whose similarity you're trying to measure, I suggest trying all five, and examining their implications for some corner of your data you personally understand, where you yourself can meta-evaluate the scores and rankings that the math produces. I do not contend that my equations produce more-objective truths than the other ones; only that the stories they tell me about things I know are plausible, and the stories they have told me about things I didn't know have usually proven to be interesting.

¶

**How You Win** · 4 October 2007

My favorite moment in the New England Revolution's 3-2 defeat of FC Dallas for the 2007 U.S. Open Cup is not unheralded rookie Wells Thompson beating semi-heralded international Adrian Serioux to the ball and slipping it past almost-semi-heralded international Dario Sala for what ends up being the game-winning goal. It is not once-unheralded-rookie Pat Noonan's no-look back-flick into Thompson's trailing run. It is not once-unheralded-rookie Taylor Twellman's pinpoint cross to Noonan's feet. It is before that, as Twellman and Noonan work forward, and Noonan's pass back towards Twellman goes a little wide. As he veers to chase it down, Twellman gives one of his little chugging acceleration moves, a tiny but unmistakable physical manifestation of his personality. I've watched him do this for years. I can recognize him out of the corner of my eye on a tiny TV across a crowded room, just from how he runs. Just from how he *steps* as he runs.

This is the Revolution's first trophy, after losing three MLS Cups and one prior Open Cup, all in overtime or worse. They are my team because I live here; that's how sports fandom mainly works. But I *care* about them, not just support them, because Steve Nicol runs the team with deliberate atavistic moral clarity. The Revs *develop* players. Of the 12 players who appeared in this victory, 8 were Revolution draft picks, 1 was a Revolution discovery player, 2 of the remaining 3 were acquired in trades before Nicol took over, and the last one (Matt Reis) was acquired in an off-season trade before Nicol's first season even started. Of the 5 other field-players on the bench last night, even, 3 are Nicol draft-picks and the other 2 are Revs discoveries. As is Shalrie Joseph, suspended for a red-card picked up in the semi-final during an altercation with, ironically enough, yet another Revolution draft-pick now playing in the USL. Even the Revs' misfortunes are products of their own dedication.

Arguably other teams, using more opportunistic methods, have acquired better players. Several of them have acquired more trophies. But none of them, I think, are more coherently themselves. None of them can hold up a trophy and know that they *earned* it, as a self-contained organization, this completely.

Sometimes, as a sports fan, you get to be happy. Much more rarely, you get to be *proud*.

It is a small, powerful thing to rescue small truths hidden in seas of numbers. It is an even smaller and unfathomably deeper joy to hover, enraptured, in the countless endless instants between when some tiny thing happens to her for the first time, and when she shrieks with the joy of a universe expanding.

The Deciblog just published Justin Foley's reply to my implication that he botched his analysis of first letters of heavy-metal band names. [Read those if you want the rest of *this* to make any sense, not that I'm saying you need to want that...]

Foley cc'ed a bunch of other people in the actual email, and in an ensuing thread that got well-underway before I noticed it in my spam filter (which wouldn't have happened if I'd had the good sense to put all Southern Lord label personnel in my Address Book proactively), someone beat me to taking statistical issue with Foley's idea that my 50,000+ EM-derived sample-size was "too large", but agreed that in the abstract some sort of weighting scheme could account for the idea that Metallica earns M more points than some unknown band called The Austerity Program earns for A (or, in Foley's original analysis, T).

To all of which I said:

Weighting is easy. Let's say that a band only counts if somebody has actually bothered to write a review of one of their releases, and we'll weight them by the number of releases that have reviews. This method counts 6778 of EM's artists, who have 14057 releases between them.

Here are the percentages from the whole sample, the smaller sample unweighted, and the smaller sample weighted:

? | All | SU | SW |

# | 0.3 | 0.4 | 0.3 |

A | 9.1 | 9.8 | 9.8 |

B | 5.9 | 6.2 | 6.2 |

C | 6.3 | 6.4 | 6.0 |

D | 8.9 | 8.1 | 8.3 |

E | 4.9 | 4.6 | 4.3 |

F | 3.6 | 3.7 | 3.1 |

G | 3.0 | 3.5 | 3.4 |

H | 3.9 | 3.9 | 3.7 |

I | 3.7 | 3.5 | 3.7 |

J | 0.6 | 0.6 | 0.8 |

K | 2.2 | 2.3 | 2.6 |

L | 3.1 | 3.0 | 2.8 |

M | 7.4 | 6.8 | 8.2 |

N | 4.2 | 4.0 | 4.0 |

O | 2.3 | 2.4 | 2.6 |

P | 4.0 | 3.7 | 3.6 |

Q | 0.2 | 0.2 | 0.3 |

R | 3.3 | 2.9 | 3.2 |

S | 10.8 | 10.6 | 10.5 |

T | 4.5 | 4.6 | 4.4 |

U | 1.3 | 1.3 | 1.2 |

V | 2.7 | 2.8 | 2.8 |

W | 2.7 | 3.3 | 2.8 |

X | 0.3 | 0.4 | 0.4 |

Y | 0.2 | 0.3 | 0.3 |

Z | 0.7 | 0.7 | 0.5 |

As you see, both restricting the sample and weighting do make small differences in the percentages, but S still wins, and D is still only in third.
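The weighting described above (each band counts once unweighted, or once per reviewed release when weighted) can be sketched roughly like this in Python. The input shape, the `#` bucket for names that do not start with a letter, and the review counts in `sample` are all assumptions for illustration; the real numbers came from EM data that is not reproduced here.

```python
from collections import Counter


def first_letter_shares(bands, weighted=False):
    """bands maps band name -> number of reviewed releases (hypothetical shape)."""
    totals = Counter()
    for name, reviewed_releases in bands.items():
        first = name[0].upper()
        key = first if first.isalpha() else "#"
        # Unweighted: every band counts once. Weighted: once per reviewed release.
        totals[key] += reviewed_releases if weighted else 1
    grand_total = sum(totals.values())
    return {key: 100 * n / grand_total for key, n in sorted(totals.items())}


# Made-up review counts, for illustration only.
sample = {"Metallica": 9, "Sepultura": 6, "Slayer": 5, "Sacramentum": 1, "1349": 2}
print(first_letter_shares(sample))
print(first_letter_shares(sample, weighted=True))
```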

It's also easy to *rigorously* calculate the most metal of all names, in essentially exactly the way [FH] suggests. Using only the smaller sample, we can build up the name by taking, at each position, the most common letter (again weighting each band name by the number of reviewed releases) among the names which match what we have so far, working towards a goal length obtained in the same weighted-average fashion. This produces this incremental search result:

goal length: 10

searching: [] 6778 partial matches

searching: [s] 718 partial matches

searching: [sa] 122 partial matches

searching: [sac] 23 partial matches

searching: [sacr] 23 partial matches

searching: [sacri] 9 partial matches

searching: [sacrif] 5 partial matches

searching: [sacrifi] 5 partial matches

searching: [sacrific] 5 partial matches

searching: [sacrifici] 4 partial matches

searching: [sacrificia] 3 partial matches

I submit that when Daree Eeee and the mighty Sacrificia tour together, Daree Eeee will be going on first, and carrying their own mangy amps off the stage when they're done with their 3 crappy songs...

glenn

PS: I most definitely did not type in any numbers by hand.

PPS: Excel is a fine tool for lots of things. Not *these* things, though.
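The prefix-building procedure described in the email can be sketched in a few lines of Python. This is a guess at the shape of the computation, not the actual tool: the function name, the input format (lowercase name mapped to a weight) and the weights in `sample` are hypothetical, and ties are broken arbitrarily by whichever letter was tallied first.

```python
from collections import Counter


def most_typical_name(bands):
    """bands maps lowercase band name -> weight (e.g. reviewed releases)."""
    total_weight = sum(bands.values())
    # Goal length: weighted average name length, rounded.
    goal_length = round(sum(len(n) * w for n, w in bands.items()) / total_weight)
    prefix = ""
    while len(prefix) < goal_length:
        # Only names that extend what we have so far stay in play.
        matches = {n: w for n, w in bands.items()
                   if n.startswith(prefix) and len(n) > len(prefix)}
        if not matches:
            break
        next_chars = Counter()
        for name, weight in matches.items():
            next_chars[name[len(prefix)]] += weight
        prefix += next_chars.most_common(1)[0][0]
    return prefix


# Made-up weights, for illustration only.
sample = {"sacrifice": 3, "sacrament": 2, "sodom": 4,
          "slayer": 5, "metallica": 6, "sabbat": 1}
print(most_typical_name(sample))
```

Because every step filters on the whole prefix, the result is always the beginning of at least one real name in the sample.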

The aforementioned FH then clarified the less-rigorous most-metal algorithm he had in mind, which was also easy to produce:

It's more or less just as easy to do it that way, considering only the weighted likelihood of a given letter at a given position with a given preceding character.

searching: [] 6778 candidates

searching: [s] 718 candidates

searching: [sa] 1109 candidates

searching: [sar] 870 candidates

searching: [sara] 533 candidates

searching: [saran] 450 candidates

searching: [saran ] 521 candidates

searching: [saran o] 419 candidates

searching: [saran or] 271 candidates

searching: [saran ore] 270 candidates

searching: [saran orer] 182 candidates

I think Saran Orer get a guitar tech and some sandwiches, and go on after Daree Eeee, but they're still playing for people who are there to hail Sacrificia.
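For comparison, the less-rigorous variant (most likely letter at a position, given only the preceding character) might look like the following Python sketch. As before, this is an assumed reconstruction rather than the real code, and the names and weights in `sample` are made up. Unlike the prefix version, it can wander off into names that no band actually has.

```python
from collections import Counter, defaultdict


def bigram_name(bands, goal_length):
    """bands maps lowercase band name -> weight; returns the built-up name."""
    # counts[(position, previous_char)][char] = summed weight
    counts = defaultdict(Counter)
    for name, weight in bands.items():
        previous = ""
        for position, char in enumerate(name):
            counts[(position, previous)][char] += weight
            previous = char
    result, previous = "", ""
    for position in range(goal_length):
        options = counts.get((position, previous))
        if not options:
            break
        # Only position and previous character matter, not the whole prefix.
        previous = options.most_common(1)[0][0]
        result += previous
    return result


# Made-up weights, for illustration only.
sample = {"sabbat": 3, "saxon": 2, "slayer": 4, "darkane": 6, "morgoth": 1}
print(bigram_name(sample, 6))
```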

I hope everything is clear now, as I'm way overdue to get back to posting pictures of my daughter...

[Discussion, if you can bear the thought, here on vF.]

And for completeness, here are the top bands by average rating across all releases, counting only the bands that have reviews from at least 10 different reviewers.

# | Artist | Reviewers | Average | Spread |

1 | Repulsion | 11 | 96.43 | 3.064 |

2 | Esoteric (UK) | 13 | 96.2 | 3.544 |

3 | Gorguts | 15 | 95.43 | 4.03 |

4 | Lykathea Aflame | 12 | 94.6 | 4.363 |

5 | Atheist | 12 | 94.56 | 4.272 |

6 | Solitude Aeturnus | 10 | 93.9 | 5.485 |

7 | Sacramentum | 11 | 93.7 | 6.067 |

8 | Disembowelment | 10 | 93.43 | 3.959 |

9 | Deeds of Flesh | 12 | 93.33 | 5.375 |

10 | The Axis of Perdition | 10 | 93.33 | 4.607 |

11 | Martyr (Can) | 11 | 93.29 | 4.399 |

12 | Cult of Luna | 10 | 92.56 | 6.735 |

13 | Persuader | 11 | 92.4 | 5.886 |

14 | Katharsis (Ger) | 10 | 92.0 | 5.715 |

15 | Novembre | 12 | 91.86 | 5.436 |

16 | Vintersorg | 21 | 91.67 | 6.968 |

17 | Demilich | 15 | 91.58 | 7.522 |

18 | Saint Vitus | 13 | 91.36 | 6.526 |

19 | Belphegor (Aut) | 15 | 90.94 | 6.571 |

20 | Manticora | 11 | 90.64 | 7.889 |

21 | Windir | 21 | 90.64 | 6.986 |

22 | Agent Steel | 15 | 90.2 | 6.002 |

23 | Negură Bunget | 10 | 90.17 | 10.123 |

24 | Pentagram (US) | 10 | 90.13 | 6.827 |

25 | Maudlin of the Well | 11 | 90.08 | 8.558 |

26 | Deströyer 666 | 15 | 90.07 | 7.676 |

Vintersorg and Windir are the only bands to get an average above 90 with 20 or more reviewers. So clearly *those* are the greatest bands in all of heavy metal.

The worst metal band in the world is Apocalypse, who got an average review of 7.0 from 14 reviewers. Dishonorable mention to Six Feet Under, the only band with at least 4 releases and 10 reviewers who averaged below 50 (49.91 from 47 reviewers).

And here are the 25 *least* consistent:

# | Artist | Spread | Average |

1 | Sepultura | 25.022 | 67.55 |

2 | In Flames | 21.621 | 60.74 |

3 | Megadeth | 19.333 | 71.54 |

4 | Krieg | 18.455 | 71.19 |

5 | Deicide | 18.007 | 74.15 |

6 | Deathspell Omega | 17.925 | 83.93 |

7 | Metallica | 17.863 | 69.16 |

8 | Virgin Steele | 17.747 | 74.5 |

9 | Six Feet Under (US) | 17.601 | 55.15 |

10 | Dissection (Swe) | 17.275 | 70.67 |

11 | Sentenced | 16.626 | 66.79 |

12 | Moonspell | 16.312 | 78.28 |

13 | Nuclear Assault | 16.234 | 72.04 |

14 | Mayhem (Nor) | 15.754 | 67.69 |

15 | Machine Head (US) | 15.453 | 55.06 |

16 | Within Temptation | 15.379 | 61.25 |

17 | Slayer (US) | 14.537 | 73.97 |

18 | Children of Bodom | 13.943 | 77.63 |

19 | Black Label Society | 13.937 | 75.05 |

20 | Pantera | 13.921 | 69.97 |

21 | Celtic Frost | 13.717 | 73.4 |

22 | Cannibal Corpse | 13.476 | 74.89 |

23 | Motörhead | 13.319 | 78.23 |

24 | Danzig | 13.037 | 80.0 |

25 | Pain of Salvation | 12.934 | 87.77 |

Most of these follow the "great once, crap now" pattern (I think we can now officially call this "Sepulturding"), which makes one wonder whether developing a fan-base is really worth the bother in the end. Deathspell Omega deserve a special note: if they'd had the sense to release *Infernal Battles* under a different name, their other 4 albums would give them a standard deviation of 1.66 on an average of 92.86, and we could have a very obscure statistical argument over whether that means they are in fact even greater than Fates Warning.

My analytical tools make various otherwise-elusive questions easy to answer, so while I'm playing with heavy-metal data, here's another thing I wondered about: which bands have the narrowest and widest *ranges* of ratings? To answer this meaningfully I counted only releases that have 4 or more reviews, and only bands that have 4 or more of these releases and at least 10 different reviewers. For these I then averaged the ratings for each such release, and ran standard deviations on the sets of averages. So a low standard deviation means there's some consensus that the quality of the band's output is consistent. High means consensus that the quality varies widely.

Here are the 25 most consistent. "Spread" is the standard deviation, "Average" is the average rating of the releases used in the calculation.

# | Artist | Spread | Average |

1 | Coroner | 0.908 | 88.21 |

2 | Helstar | 1.455 | 90.54 |

3 | Moonsorrow | 1.676 | 89.98 |

4 | Dark Angel (US) | 1.767 | 82.15 |

5 | Candlemass | 1.842 | 89.78 |

6 | Lamb of God | 1.845 | 68.5 |

7 | Obituary | 2.004 | 85.32 |

8 | Type O Negative | 2.035 | 89.16 |

9 | Accept | 2.193 | 88.06 |

10 | Agent Steel | 2.479 | 90.49 |

11 | Fates Warning | 2.531 | 93.36 |

12 | Alice in Chains | 2.538 | 88.83 |

13 | Iron Savior | 3.025 | 88.25 |

14 | Falconer | 3.083 | 84.42 |

15 | Therion (Swe) | 3.159 | 90.38 |

16 | Sodom | 3.294 | 83.4 |

17 | Kamelot | 3.463 | 90.52 |

18 | Gorgoroth | 3.496 | 84.71 |

19 | Judas Iscariot | 3.602 | 89.03 |

20 | Bolt Thrower | 3.652 | 88.31 |

21 | Suffocation (US) | 3.701 | 86.48 |

22 | Angra | 3.758 | 88.63 |

23 | Enslaved (Nor) | 3.926 | 88.85 |

24 | Vader | 4.162 | 85.78 |

25 | Bal-Sagoth | 4.249 | 89.9 |

I sense a hastily-assembled cash-in Coroner boxset in our future. I think this also means that Fates Warning is the most consistently great band in all of heavy metal. So now we know. And Lamb of God gets some sort of weird prize for being the most consistently mediocre.
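The consistency calculation described in this post (filter releases by review count, average each release, then take the standard deviation of those averages per band) can be sketched in Python as below. Several details are assumptions, since the post does not pin them down: the tuple layout of `reviews`, counting distinct reviewers across all of a band's reviews rather than only the qualifying releases, and using the population standard deviation (`pstdev`) rather than the sample one.

```python
from statistics import mean, pstdev


def band_consistency(reviews, min_reviews=4, min_releases=4, min_reviewers=10):
    """reviews: iterable of (band, release, reviewer, rating) tuples.

    Returns {band: (spread, average)} for bands passing all thresholds.
    """
    by_release = {}
    reviewers = {}
    for band, release, reviewer, rating in reviews:
        by_release.setdefault((band, release), []).append(rating)
        reviewers.setdefault(band, set()).add(reviewer)

    # Average rating per release, keeping only well-reviewed releases.
    release_averages = {}
    for (band, release), ratings in by_release.items():
        if len(ratings) >= min_reviews:
            release_averages.setdefault(band, []).append(mean(ratings))

    return {
        band: (pstdev(averages), mean(averages))
        for band, averages in release_averages.items()
        if len(averages) >= min_releases and len(reviewers[band]) >= min_reviewers
    }


# Tiny made-up example with lowered thresholds so it produces a result.
toy = [("X", "a", "r1", 80), ("X", "a", "r2", 90),
       ("X", "b", "r1", 70), ("X", "b", "r3", 70)]
print(band_consistency(toy, min_reviews=2, min_releases=2, min_reviewers=3))
```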

If you're going to waste your time doing obsessive analysis of data on which nobody's life or ecology depends, you ought to at least do it diligently and efficiently.

About a month ago The Deciblog published Justin Foley's attempt to answer the timeless question "How likely is a metal band to start their name with a particular letter of the alphabet?". For his sample set, Foley took the combined rosters of several metal labels (he doesn't reveal either list), which gave him 814 names, for which he then calculated the first-letter distributions, reaching the startling conclusion that the *most* likely letter is S.

Foley put his results in a bar-chart, which I assume means he used a spreadsheet, so hopefully he didn't spend a whole lot of time hand-counting. But he should have spent even less. The Encyclopaedia Metallum is not only just sitting there with a collaboratively-amassed and collectively-moderated database of 50,000+ metal bands, but they've even already split it up by first-letter and there are band-counts right at the top of each letter-page. Add, divide, and you're done.

Here, then, is the much better-informed version of this still-pointless breakdown:

? | % | * |

# | 0.3% | |

A | 9.1% | ********* |

B | 5.9% | ***** |

C | 6.3% | ****** |

D | 8.9% | ******** |

E | 4.9% | **** |

F | 3.6% | *** |

G | 3.0% | *** |

H | 3.9% | *** |

I | 3.7% | *** |

J | 0.6% | |

K | 2.2% | ** |

L | 3.1% | *** |

M | 7.4% | ******* |

N | 4.2% | **** |

O | 2.3% | ** |

P | 4.0% | *** |

Q | 0.2% | |

R | 3.3% | *** |

S | 10.8% | ********** |

T | 4.5% | **** |

U | 1.3% | * |

V | 2.7% | ** |

W | 2.7% | ** |

X | 0.3% | |

Y | 0.2% | |

Z | 0.7% | |

Most of Foley's numbers aren't that far off. His small sample-size leads him to overestimate J and Y, and underestimate Q and V, but the absolute numbers for these letters are small anyway. He also seems to underestimate A, P and R, for reasons which are not apparent in his opaque reporting, but might have to do with language tendencies, as EM's list is probably more global than his.

But the biggest discrepancy in Foley's numbers, by far, is T, which he credits with 9.3% of the band-names, where EM data indicates less than half that. Here I have a wearyingly mundane but highly plausible theory: Foley has accidentally counted all the bands whose names begin with "The " as T, despite specifically saying that he didn't. This is obviously both philosophically and methodologically repugnant, and although I regret the maelstrom of blogospheric outrage that will undoubtably accompany my public exposure of this error, I think we owe ourselves (the) Truth.
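The suspected "The " mistake is easy to demonstrate with a small Python sketch. Everything here is illustrative: the function name and the `skip_the` flag are hypothetical, and the five names are just a toy sample, not Foley's list or EM's.

```python
from collections import Counter


def first_letter_table(names, skip_the=True):
    """Percentage of names per first letter, optionally ignoring a leading 'The '."""
    counts = Counter()
    for name in names:
        key = name.strip()
        if skip_the and key.lower().startswith("the "):
            key = key[4:]
        if not key:
            continue
        first = key[0].upper()
        counts[first if first.isalpha() else "#"] += 1
    total = sum(counts.values())
    return {letter: 100 * n / total for letter, n in sorted(counts.items())}


names = ["The Austerity Program", "Metallica", "Sepultura",
         "Slayer", "The Axis of Perdition"]
print(first_letter_table(names))                  # "The " ignored
print(first_letter_table(names, skip_the=False))  # "The " counted as T
```

In this toy sample, forgetting to skip the article moves 40% of the names from A to T, which is the kind of inflation the theory above proposes.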

Of course, the thousands of metal fans around the world who have put time and effort into building the Encyclopaedia Metallum did it because they care about the music, not the alphabet. The site is the definitive central reference source for most metal-related matters, and certainly the final arbiter of obscurity for the vast unknown majority of the bands it lists.

It also has, in addition to its factual content, tens of thousands of percentage-scored, user-attributed, peer-moderated reviews of metal recordings. What it does *not* have is any sort of similarity analysis to make use of the huge data-graph represented by the connections between bands, users and ratings. There is a wearyingly mundane reason for this, too: trying to do similarity analysis with SQL queries will make you want to eat your own neck.

So here is an actual contribution to the world's knowledge on this admittedly peripheral subject:

***em**path*, the missing similarity analysis of EM user/review/band data, accurate as of yesterday. Pick a band, see the other bands that people who like the first band also like. Data wants to form shapes. In a better world, this would be just as easy for EM to do themselves, updating live, as it is for them to serve their raw data into web pages.

Easier, actually. I bet it took me less time to do *this* analysis than it took Foley to make a bar-chart of first letters. But I have better tools. I have better tools because at the moment I'm paid to design better tools. If I do my job well enough, eventually you'll have my better tools, too. I'm not designing them to tabulate heavy metal, I'm designing them to answer questions. Not all answers turn out to be shaped like Truths, of course. But if you can't answer them, you can't be sure which are which.