Quality Issues (Still) Plague People Search

Dec 31, 2008 (10:12 AM EST)

"If you want to keep your job, use Spoke," advises a recent e-mail from the folks behind "the fastest growing and most up-to-date business network in the U.S." Sounds like something to look into, since social and people networks are among the most important BI assets to have emerged in recent years, and I figured I owed Spoke another chance after panning it back in 2004. Grading according to the same accuracy, completeness, quality, usefulness, and usability standards I'd apply to other BI tools, I'm afraid I'd give Spoke a low C. Here's why.

Spoke's e-mail says, "in uncertain times, you probably find yourself working twice as hard to maintain the same level of business success." It says that "last month,... information about over 2.5 million people was updated or verified!"

My own preference is (per the hackneyed expression) to work smarter, not harder. Who knows more about working smarter than my IE colleague, Neil Raden, co-author with James Taylor of Smart (Enough) Systems? Plus I picked on Neil for my interview of Andrew Borthwick of people-search competitor Spock, so using him again allows me to compare results.

Searching Spoke for information on Neil, I learn that he's with "Hired Brains Magazine" (well no, Hired Brains is a consultancy) and that the years Spoke lists for his time at Archer Decision Sciences are wrong. Spoke makes a peculiar jump to say that Neil worked for Archer Daniels Midland (one "Archer" is the same as another, eh?), and it doesn't list his prior employment by such notable organizations as now-infamous insurer AIG.

So based on this example and the several more trials I did but won't write about — and the fact that there's a ton of material by or about Neil on the Web that could have been harvested to augment his Spoke profile — Spoke does not rate highly on the accuracy or the completeness of its information holdings.

Actually my Spoke search on Neil's name was prompted by an inquiry from him. Neil had noticed that I'm (also) a Facebook friend with someone he went to school with, a neighbor of mine, Bruce. I told Neil that Bruce is a lawyer and lobbyist here in Washington and Neil wanted to know whom Bruce lobbies for. I didn't know, so I did a Google search and a Spoke page came up ranked high. I'll link to an image.

Spoke found a bio of Bruce on a college Web page. The bio clearly states "Bruce is an attorney and a registered lobbyist." It also, prior to that line, states that Bruce and his wife have a daughter who is a junior at the college. Spoke's conclusion: Bruce, class of '74, is himself a junior at that college. Other individuals from the college from Spoke's database are therefore, incorrectly, listed as Bruce's colleagues. Clearly whatever information-extraction algorithms Spoke is applying are way sub-par.

My last Spoke search for the purposes of this article was on Spoke CEO Frank Vaculin, whose name appears as an example in the search box. I was logged in when I ran the search, which produced a sketchy result that I captured as a screenshot.

I separately ran a Google search on "Frank Vaculin" that, by contrast, brought up a different, much richer and more correct page. So much for a "single version of the truth." That better page also claims, under "Frank Vaculin's Colleagues," "4,146 contacts at Spoke Software, Inc." That figure is clearly incorrect.

Further, incredibly, the Google search exposed what I assume is Vaculin's private e-mail address! I'd guess he used it as his account username.

This has not been an enjoyable article to write. After writing the bulk of it, I wrote Spoke CEO Frank Vaculin for comment. I told him, "Spoke search results still look pretty inaccurate and also incomplete in terms of harvesting and presenting information about individuals that is freely available on the Web." I asked, "What is Spoke doing to improve results?" and I asked him specifically about mining LinkedIn profiles. LinkedIn's use of hResume microformat markup practically invites this use of the site!
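To make the hResume point concrete: hResume is a microformat that labels resume fields with HTML class names such as "fn" (full name), "org", and "title", so a harvester can pick them out without guessing. The sketch below is only an illustration under stated assumptions. The sample markup is a simplified, hypothetical profile (not LinkedIn's actual page), and the helper names MicroformatFields and extract_fields are invented here. It assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Hypothetical, simplified hResume-style markup (not a real profile page).
SAMPLE = """
<div class="hresume">
  <address class="vcard"><span class="fn">Jane Doe</span></address>
  <div class="experience vevent">
    <span class="title">Analyst</span> at <span class="org">Example Corp</span>
  </div>
</div>
"""

WANTED = {"fn", "title", "org"}           # class names we want to harvest
VOID = {"br", "img", "hr", "meta", "input", "link"}  # tags with no end tag


class MicroformatFields(HTMLParser):
    """Collect the text of elements whose class is one of WANTED."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._stack = []  # one entry per open element: a wanted class or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return  # void elements never get a closing tag, so don't push
        classes = set((dict(attrs).get("class") or "").split())
        hit = classes & WANTED
        self._stack.append(min(hit) if hit else None)

    def handle_endtag(self, tag):
        if tag not in VOID and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        current = self._stack[-1] if self._stack else None
        text = data.strip()
        if current and text:
            self.fields.setdefault(current, []).append(text)


def extract_fields(html):
    """Return a dict mapping each wanted class name to the text found."""
    parser = MicroformatFields()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    print(extract_fields(SAMPLE))
```

Because the fields are labeled in the markup itself, nothing here has to infer from free text who holds which job, which is exactly the kind of inference that went wrong with Bruce's bio.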

Vaculin replied, "these are good but hard questions." He elaborated on steps Spoke is taking to address them, but he asked me not to publish his reply. Board member Philippe Cases, whom Vaculin included in the exchange, mentioned an April 2007 change in the company's business model, but because my 2004 Spoke account still works and for other reasons, it's clear Spoke did not fundamentally rebuild its database or methods.

Fortunately, there are alternatives: Spock, ZoomInfo, and a new entrant, 123people, among them. Why are their results, while not perfect, superior to Spoke's? Some combination of better algorithms, better execution, and fresher data.

According to 123people publicist Jennifer Green, "123people is a high performance people search functionality that finds and identifies people's information from publicly available sources across the Web, searching popular social media sources, news sources, images and pictures from international sources such as traditional search engines, Flickr, Wikipedia, YouTube, Facebook, LinkedIn and many more."

That is, the engine federates a search to a variety of sites and aggregates the results, creating what Green characterizes as a "mashup of real time search results." Unfortunately, she was not forthcoming about technical underpinnings or even about the folks behind the company, but the results are nonetheless decent, better than Spoke's, even if the site's a black box. (Among other people-search engines, Wink is marginally better than Spoke, and yoName is in some ways worse.)
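Since 123people would not describe its internals, the following is only my guess at the general shape of a federated search, not the company's method. It fans one query out to several sources in parallel and merges what comes back into a single profile. The two source functions, the sample records, and the name federated_search are all hypothetical stand-ins; a real engine would call actual sites and would need far more careful entity matching.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical sources: each takes a name and returns a list of records.
def fake_social(name):
    return [{"name": name, "employer": "Example Corp", "source": "social"}]


def fake_news(name):
    return [{"name": name, "title": "Analyst", "source": "news"}]


def federated_search(name, sources):
    """Query every source in parallel and merge the records into one profile."""
    workers = max(1, len(sources))  # the executor rejects zero workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of `sources` in its results.
        batches = list(pool.map(lambda source: source(name), sources))

    profile = {"name": name, "sources": []}
    for batch in batches:
        for record in batch:
            profile["sources"].append(record["source"])
            for key, value in record.items():
                if key not in ("name", "source"):
                    # First source to supply a field wins; later ones are ignored.
                    profile.setdefault(key, value)
    return profile


if __name__ == "__main__":
    print(federated_search("Jane Doe", [fake_social, fake_news]))
```

Note the "first source wins" merge rule. It is the simplest possible choice, and it is also where quality problems creep in: if the first source is wrong, the error survives into the merged profile unless the engine weighs sources against one another.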

People search is a business-intelligence tool. It's of increasing importance given the rapid growth of on-line social content and networks. We should apply appropriate quality standards, learn from the shortcomings of sites such as Spoke that invite scrutiny, and choose the sites we use accordingly.