Archive for August, 2008
The social bookmarking phenomenon emerged several years ago as the trendy habit of people using free social bookmarking sites to categorize and manage their favorite webpages.
Traditionally, we relied on the search engine’s internal classification system to determine our websites’ themes. Search engines use the keywords on each webpage, together with the keyword density, to classify webpages by topic. However, since search engines do not really understand webpages and keywords, they rely on statistical methods, classifying your webpage by comparing it with their existing database of webpages that contain similar keywords.
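To make the idea of keyword density concrete, here is a minimal sketch in Python. It is only an illustration of the general statistical idea, not how any real search engine works; the function name `keyword_density`, the tiny stop-word list, and the sample sentence are all my own assumptions.

```python
import re
from collections import Counter

# Hypothetical, very small stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "the", "this", "by", "of", "and", "to", "is"}

def keyword_density(text, top_n=3):
    """Return the top_n non-stop words with their share of all words in text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return []
    # Rank only the words that are not stop words.
    counts = Counter(word for word in words if word not in STOP_WORDS)
    # Density is the word's count divided by the total number of words.
    return [(word, count / total) for word, count in counts.most_common(top_n)]

sample = "This webpage explains tagging. A webpage author tags a webpage by topic."
for word, density in keyword_density(sample, top_n=1):
    print(f"{word}: {density:.0%}")
```

A classifier built on counts like these sees only which words occur most often, which is why it can miss what a page is really about.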
Detecting when words are “similar” is easy for human beings. However, this is not an easy job for search engines, which are powered by computers. Computers do not understand synonyms that are different in spelling but similar in meaning. As I discussed in a previous post, several years ago Google acquired a company called Applied Semantics that attempts to handle this problem with its own invention, semantic technology.
Now the practice of “tagging” comes along to solve this problem. The best entity to determine a particular webpage’s classification is the webpage’s author, who is a human being and fully understands what the webpage s/he has written is about. Thus s/he “tags” it, using different words or sets of words to summarize the content of his/her website pages.
For example, since I am the author of this webpage, I can tag it with keywords such as “social bookmarking”, “tag”, “tagging”, “Applied Semantics” or even some other related topics (such as the social bookmarking sites “Technorati” and “del.icio.us”) that I think are the most important keywords related to this page. On the other hand, as viewed by a search engine, this webpage might be classified as “webpage”, as this word appears most frequently in the article. Do you see the difference between machine and human being?
At the same time, websites like Technorati and del.icio.us have emerged as the mainstream so-called social bookmarking services. They allow users to register an account and bookmark their favorite websites with appropriate tags that the users themselves assign to the webpages. Users can even share their bookmarks with others (hence the term “social”). These bookmarking websites are steadily emerging as a good source of “commentaries” on and “classifications” of webpages in cyberspace. Some people further comment that the goal of tagging is not to classify, but to memorize.
It’s very logical that search engines will also consider the information from these bookmarking websites as a source of authoritative sites and webpages for particular popular keywords. This leads some people to manipulate the social bookmarking websites (e.g., by creating multiple user accounts to bookmark their own webpages, with carefully chosen tags as keywords) to artificially make their own webpages “popular” within social bookmarking websites. Such people hope this will improve the position of their webpages in the search engines’ results.
An interesting book on this topic can be found here. This book teaches you how to use this tactic when blogging with popular website software like WordPress, and it also reveals the drawback, from the search engines’ perspective, of judging a webpage by its tags: tags can be manipulated by human beings, and so can create bias for a webpage.
The use of different variations of a keyword, such as “programme”, “program”, “programmes”, “programs” for the same concept can create a lot of confusion as well, creating additional problems with tagging.
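One rough way to reduce this confusion is to map the variants of a tag onto a single canonical form before counting or comparing them. The sketch below is only an assumption about how that could be done: the names `normalize_tag` and `group_tags` and the hand-maintained `SPELLING_VARIANTS` table are hypothetical, and the plural rule is deliberately crude (it would wrongly turn “news” into “new”, for instance).

```python
# Hypothetical, hand-maintained table of spelling variants.
SPELLING_VARIANTS = {"programme": "program"}

def normalize_tag(tag):
    """Lowercase a tag, strip a simple plural 's', and unify known spellings."""
    tag = tag.strip().lower()
    # Crude plural handling: drop a trailing 's' unless the word ends in 'ss'.
    if len(tag) > 3 and tag.endswith("s") and not tag.endswith("ss"):
        tag = tag[:-1]
    return SPELLING_VARIANTS.get(tag, tag)

def group_tags(tags):
    """Group the original tags under their normalized form."""
    groups = {}
    for tag in tags:
        groups.setdefault(normalize_tag(tag), []).append(tag)
    return groups

print(group_tags(["programme", "program", "programmes", "programs"]))
```

With this table, all four variants from the example above end up under the single key `program`.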
One way to take advantage of this growing trend is to add a user-friendly component to your webpage that allows users to easily add your webpage to their favorite social bookmarking websites. If you take a look at the end of each post in my blog, you will see a row of links to popular bookmarking websites like Del.icio.us, Spurl, Furl, Simpy, Blink, Digg, and those specializing in blogs like Technorati. Those links allow my visitors to easily bookmark my webpage in their social bookmarking accounts. Someone coined the term “Social Media Optimization” (SMO), parallel to what we commonly refer to as “Search Engine Optimization” (SEO). But note that SMO also extends to optimization for Web 2.0 sites such as Facebook.com, Myspace.com, etc. We’ll talk about this in a later post.
Tags: Social Media Optimisation, Web 2.0 Optimization, Social Bookmarking
Cuil.com, a new search engine launched two days ago, is set to be another competitor for Google in the web search industry.
What makes this search engine different from others is the profile of its founders. Most of them are ex-employees of Google, Inc. In particular, one of the main architects of this new search engine, Anna Patterson, was an important contributor to Google’s present search algorithm.
Cuil.com’s launch attracted so much curious traffic that the site was periodically out of service on the first day.
According to some news sources, Patterson left Google because of its refusal to try innovative changes to its search algorithm. Patterson’s own search technology was acquired by Google in 2004, when her search algorithm was incorporated into Google’s search engine. She left Google for a new venture, creating another search engine that debuts its self-proclaimed innovative search algorithm.
Unlike Powerset, the natural-language search engine recently acquired by Microsoft, Cuil focuses its full effort on improving the cost and speed of indexing web pages (with its search algorithms remaining a mystery to us), hoping to return more relevant and powerful search results to web surfers than Google.
After my first few attempts, the only thing that impresses me so far is the format of its output pages. The magazine-style output page tries to provide pictures together with the content of the returned pages to enrich the user’s search experience. From what I can tell, though, the pictures on its search result pages are mostly extracted from the returned web pages, and some of those (as you can probably imagine) are really silly extractions that hardly represent the recommended web sites accurately.
Most importantly, Cuil.com fails to return web pages that I know are important for a particular search term. I conducted an interesting test, using this search engine to search for its own name, “cuil”, but none of the returned web pages even showed the web site’s own link, http://www.cuil.com!
Interestingly, if I use Google to search for the same term, it returns the related news about cuil.com, and indeed the first search result is the search engine’s own hyperlink, http://www.cuil.com. (Quite ridiculous, eh?)
Whether this search engine can establish a foothold in the search engine industry remains unknown. What is certain is that the emergence of a new search engine gives us more choices for quality web searching, and that is truly beneficial to all of us.
To Google, perhaps this is also another push to improve its search algorithm to handle the new competition. That could be good. In fact, this view is shared by Google itself. A Google official said they welcome the new search engine to the competition, since it drives them to provide even better service to their customers.
I have set up a new Google Alert to track the term “cuil.com” for any updated news about this search engine. Have you?
Tags: Cuil, cuil.com, Google, Anna Patterson, Powerset





