13 May 2008 to 20 December 2007
Eventually, probably, we will figure out how to have computers make some kind of sense of human language. That will be cool and useful, and will change things.  

But it's a hard problem, and in the short term I think much of the work required is mostly harder than it is valuable. The big current problems I care about in information technology involve letting computers do things computers are already good at, not beating human heads against them hoping they'll become more human out of sympathy.  

So I'm already not the most receptive audience for Powerset, the latest attempt at "improving search" via natural-language processing. I don't think "search" is the problem, to begin with, and I don't think "searching" by typing sentences in English is an improvement even if it works.  

And I don't think it works. But make up your own mind. I put together a very simple comparison page for running a search on Powerset and Google side-by-side. And then I ran some. Like these:  

what's the closest star?
who was the King of England in 1776?
what movie were Gena Rowlands and Michael J. Fox in together?
new MacBook Pros today?
who are the members of Apple's board of directors?
what's the population of Puerto Rico?
when is Father's Day?
what was the last major earthquake in Tokyo?
bands like Enslaved
who is Nightwish's new singer?
who is Anette Olzon?  

and then, because Powerset suggested it:  

who is Anette Olson?  

and then, because Google suggested it:  

who is Annette Olson?  

I think, given these results, it's very hard to argue that Powerset's NLP is doing us much good. At least, not yet. And I'm not their (or anybody's) VC, but I wouldn't be betting a team of salaries that it's going to any time soon.  

[12 August 2008 note: the above queries are all still live, and some generate different results today than they did when I posted this. Powerset now gets Anette Olzon, although they still also suggest Anette Olson despite having no interesting results for it, and it still takes Google to suggest fixing Anette Olson to Annette Olson.  

The most bizarre new result, though, is that currently the Powerset query-result page for "what movie were Gena Rowlands and Michael J. Fox in together?" is itself the top hit in Google for that query.]

15 years of gas prices and Exxon/Mobil stock.

The core of the "semantic web" idea, at least as far as I'm concerned, is that we're trying to do for data what the first web did for pages. We're trying to make dataspaces, both individual and aggregate, that can be explored and analyzed both by people directly, and by machines on our behalf. The people half of this, at least, is not mysterious or obscure or even speculative. It looks like IMDb, or any other site where there's pretty much a page for each individual thing, and you can click your way from every thing to everything else.  

The machine part is more complicated, but only by a little. Instead of regular old-web links, which just tell the computer where to go, a "semantic" link also says what it means to go there. So the old-web page for Rush Hour 3 links to Jackie Chan and Chris Tucker, but also to ads and the IMDb front page and job-listings for IMDb.com, and as far as the machines can tell, these links are all essentially equal. When IMDb gets their act from web 2.0 to 3.0, the links will be annotated so that the ones that go to Jackie and Chris and the other cast members are labeled "actor", and the other links aren't, and then you can ask a question less like "What web pages mention the words 'Jackie' and 'Chan' and 'older'?" and more like "How many people in that movie were older than him, anyway?", and the machines might have enough material to figure it out for you.  
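As a rough illustration of the difference, here is a minimal Python sketch in which each link carries a label saying what it means. The `Link` type, the `LINKS` list, and the hand-entered `BIRTH_YEAR` table are assumptions made up for this example, not anything IMDb actually publishes; the cast list is deliberately partial.

```python
from typing import NamedTuple


class Link(NamedTuple):
    source: str
    label: str  # what it means to follow the link; "" for a plain old-web link
    target: str


# Hypothetical, hand-entered data for illustration only (partial cast).
LINKS = [
    Link("Rush Hour 3", "actor", "Jackie Chan"),
    Link("Rush Hour 3", "actor", "Chris Tucker"),
    Link("Rush Hour 3", "actor", "Max von Sydow"),
    Link("Rush Hour 3", "", "IMDb front page"),
    Link("Rush Hour 3", "", "Jobs at IMDb.com"),
]

BIRTH_YEAR = {"Jackie Chan": 1954, "Chris Tucker": 1971, "Max von Sydow": 1929}


def actors_in(movie):
    """Follow only the links labeled "actor"; unlabeled links are ignored."""
    return [l.target for l in LINKS if l.source == movie and l.label == "actor"]


def older_castmates(movie, person):
    """Cast members of the movie born before the given person."""
    return [
        a for a in actors_in(movie)
        if a != person and BIRTH_YEAR[a] < BIRTH_YEAR[person]
    ]


print(older_castmates("Rush Hour 3", "Jackie Chan"))
```

With unlabeled links, the ad and job-listing targets would be indistinguishable from the cast; the label is the only thing that lets `actors_in` leave them out.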

And that, and not coruscating pie-charts, is how you'll start to recognize the pieces of the new web as it begins to emerge: its sites will help you get real answers to real questions without you having to get out scratch-paper and click a hundred links yourself. The more time you spend thinking about this idea, I believe, the more revolutionary you'll realize it is. In terms of how computers augment human capacities for understanding information, the jump from the regular web to the semantic web will be a bigger deal than the jump from magazines and books and newspapers to the web. Maybe bigger by a lot.  

Which is why I was excited to finally get an invitation to the private beta program for Twine, despite basically not knowing what it was. My wildly hopeful guess, from the pre-release hints about "personal information", had been that Twine might be the long-awaited reincarnation of the soul of Lotus Agenda, a personal information management program in a world where a lot more people now have enough information piling up around them for "managing" it to be a generalizable problem.  

Twine, it turns out, at least so far, is a social bookmarking application. Bookmarking is not exactly what I meant by information management, any more than daytimer+contacts is what I meant by it in 1992. I gather that there is semantic-web technology behind Twine, somewhere, and I think this is supposed to make the "other tags" Twine recommends for your bookmarks better than the other tags del.icio.us recommends, or the other feeds Google Reader recommends, or the microwave that Amazon tells you was purchased by other people who pre-ordered a Douglas Coupland novel. Or it's supposed to eventually make this true, anyway, some day when/if there are more bookmarks and more people in Twine, which is after all still "in beta", which means that you're supposed to imagine that it will eventually get smart about everything it's currently dumb about.  

And in Twine's case, this might eventually make it a really good social bookmarking application. If so, I will happily switch from del.icio.us to Twine for my minimal and basically expendable social-bookmarking needs.  

But as an ambassador for the Semantic Web, Twine is an embarrassment. Or, maybe more accurately, it's embarrassed. It buries its semantic-web-ness inside, like it's the information-technology version of oat bran, and the reaction they're going for is "Oh, these donuts taste so good you'd barely know they had any Semantic Web in them!" But oat bran doesn't keep donuts from being junk food, and RDF-storage and named-entity-extraction don't make social-bookmarking any less page-oriented.  

And I probably wouldn't care if Nova hadn't set up so much semantic-web context around himself and his company and their product. But we've collectively screwed up the presentation of this simple idea about how the next web will be better, somehow, and a lot of people have become convinced that the semantic web is some kind of clanking information C-3PO from an idiot-fantasy future, complaining about etiquette and waddling like it has a Commodore 64 wedged up its ass. So for a little while, at least, anybody working on the tools for building the new web is automatically an apologist learning how to be an evangelist instead. So I want everything that says Semantic Web on it to point clearly to the way the future is really and simply better. I don't want it to look like NLP alchemy, or like temperamental magic someone is trying to use in place of levers or pulleys or Perl. And I especially don't want it to look like some old thing that most people already didn't need.  

But then, this is the standard I will be held to, too, if we manage to build and ship the semantic-web application I'm working on. I want to be part of the way the world gets better, and to do something that is not embarrassed of the future it is helping to build. We'll see.

You were eating pieces of cereal, I was rounding up the ones that got away from you. I leaned over to brush a few closer, and you looked at me, reached up, and put one in my mouth. Not your first gift to me, by any means, but maybe the first you picked yourself. And so nowhere near the first time I am thankful in your presence, but maybe the first thanks to go simply between us, not everywhere around us.

between inmity and outstarting
these hurt feeders pray for us to freeze
and glacial promise melts into alluvial years

[and in pencil on the back of the page, in a different hand]  

our upflung tears (re)fill the lakes of the moon

1. Nightwish: "Amaranth"
2. Nightwish: "Last of the Wilds"
3. Deathspell Omega: "Bread of Bitterness"
4. Rotting Christ: "Enuma Elish"
5. In This Moment: "Beautiful Tragedy"
6. Dir en grey: "CLEVER SLEAZOID"
7. Wolves in the Throne Room: "Dea Artio"
8. Jesu: "Stanlow"
9. Jesu: "Storm Comin' On"
10. Asrai: "Sour Ground"
11. Secrets of the Moon: "Confessions"
12. Dark Tranquillity: "Misery's Crown"  

13. Tori Amos: "Bouncing Off Clouds"
14. Tori Amos: "Secret Spell"
15. Low: "Murderer"
16. Runrig: "Clash of the Ash"
17. Manic Street Preachers: "Autumnsong"
18. Jimmy Eat World: "Chase This Light"
19. Maxïmo Park: "Girls Who Play Guitars"
20. Parts & Labor: "Brighter Days"
21. Paramore: "That's What You Get"
22. Radiohead: "All I Need"
23. Amiina: "Seoul"  

24. OLIVIA: "Stars shining out"
25. Damone: "Revolution!"
26. L’Arc~en~Ciel: "Seventh Heaven"
27. FINE LINES: "Spin Into Love"
28. Manic Street Preachers: "Boxes & Lists"
29. Editors: "Bones"
30. Epica: "Fools of Damnation (The Embrace That Smothers, Part IX)"
31. Eyes of Eden: "Sleeping Minds"
32. Bat for Lashes: "Prescilla"
33. Camera Obscura: "Alaska"
34. Sigur Rós: "Hljómalind"
35. Amon Tobin: "The Killer's Vanilla"
36. Candlemass: "Clearsight"
37. Dimmu Borgir: "The Conspiracy Unfolds"
38. Diary of Dreams: "hypo)crypticK(al"
39. Ulrich Schnauss: "Medusa"
40. Samael: "Suspended Time"
41. Sirenia: "Sundown"
42. Tarja: "Die Alive"
43. Helloween: "As Long as I Fall"  

44. Life Without Buildings: "The Leanover"  


The Best of 2007 (The War Against Silence)  

This Scrabble variant occurred to me in the shower this morning: instead of working with 7 tiles at a time, drawn blindly, you get to pick your tiles. You still use the blind draw to determine who goes first, but then you spread out all the tiles face up, and the players take turns picking one tile each until they're all distributed. You each keep your tiles face-up in front of you for the whole game, so there's no mystery about who has what, and nobody needs to try to keep track of what's left. Each turn, then, you can use anything you have in front of you. No exchanging tiles, obviously, since they're all distributed, and you can't use more than 7 in any one turn. Bingos are way too easy in this version, so no bonus points for them.  
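The draft and the per-turn limit can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `draft`, `can_play`, and the `choose` callback are hypothetical, the tiles are plain letters, and blanks, scoring, and the board itself are left out.

```python
from collections import Counter

MAX_TILES_PER_TURN = 7  # from the rule above: no more than 7 tiles in one turn


def draft(tiles, players, choose):
    """Players take turns picking one face-up tile until none are left.

    `players` is already in turn order (decided by the usual blind draw).
    `choose(player, pool, racks)` is a hypothetical strategy callback that
    returns one tile currently in `pool`.
    """
    pool = list(tiles)
    racks = {p: [] for p in players}
    turn = 0
    while pool:
        player = players[turn % len(players)]
        tile = choose(player, pool, racks)
        pool.remove(tile)
        racks[player].append(tile)
        turn += 1
    return racks


def can_play(tiles_used, rack):
    """True if the tiles placed this turn all come from the player's own
    pile and number no more than MAX_TILES_PER_TURN."""
    if len(tiles_used) > MAX_TILES_PER_TURN:
        return False
    need, have = Counter(tiles_used), Counter(rack)
    return all(have[t] >= n for t, n in need.items())


# Toy run with six tiles and a "take the first tile on the table" strategy.
racks = draft("AABCQZ", ["first", "second"], lambda player, pool, racks: pool[0])
print(racks)
```

Because every pile is face-up, `choose` receives everybody's racks, which is where the planning would come in.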

I haven't tried this, but it sounds intriguing. In play it seems like it would be kind of the Chess version of Scrabble, much more about planning and board-position. Plus the tile-draw at the beginning evokes trading-card-game deck-building. Or being picked for teams in elementary school, although in this case you're always a captain, so it shouldn't be as traumatic. Presumably the opening rounds of the draw would be amenable to analysis, if not optimization, but chess openings are fairly exhaustively explored and it doesn't seem to ruin the fun, so I think that's probably fine.  


This discussion led me to one more rule: The first word by each player must use only 1-point tiles. This makes the first move more clearly a strategic one, rather than just an exercise in playing the highest-scoring word you can make in a vacuum.
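
As a small sketch of that opening rule, assuming the standard English tile values (where A, E, I, L, N, O, R, S, T and U are the 1-point letters) and a hypothetical helper name `is_legal_opening`, the check might look like this in Python. It looks only at the tiles a player places and ignores blanks.

```python
# Letters worth 1 point in the standard English-language tile set.
ONE_POINT_TILES = set("AEILNORSTU")


def is_legal_opening(tiles_used):
    """True if a player's first play places at least one tile and every
    placed tile is a 1-point letter."""
    letters = tiles_used.upper()
    return len(letters) > 0 and all(t in ONE_POINT_TILES for t in letters)


print(is_legal_opening("STONE"))
print(is_legal_opening("QUIZ"))
```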
Site contents published by glenn mcdonald under a Creative Commons BY/NC/ND License except where otherwise noted.