User:Ray3055/Notes

From Wikipedia, the free encyclopedia

KISBEE, Thomas (Naval Officer)

Inventor of the 'Kisbie ring' and the 'breeches buoy'. Also 1st Lieutenant on board the HMSV Driver, which was the first steam paddle sloop to circumnavigate the world (1841-47). Thomas spent 1846-7 transporting Governor Grey around the North Island of New Zealand during the Maori Uprising. See also Kisbee Bay.

Born 1792, Farcet, Hunts., England. Died 1877, Great Yarmouth, England. (Info provided by Judith Kisbee) [www.kisbee.co.uk]

Florence Nightingale

1859 first draft of her 'Notes... publication: [4]

Stemming

Multilingual Stemming

Multilingual stemming applies morphological rules of two or more languages simultaneously instead of rules for only a single language when interpreting a search query. Commercial systems using multilingual stemming exist.[1]

Commercial Usage

Many commercial companies have been using stemming since at least the 1980s and have produced algorithmic and lexical stemmers in many languages.[2][3]

The Snowball stemmers have been compared with commercial lexical stemmers, with varying results.[4][5]

Stemming Strength

Weak and strong -- refs:

Paice -- A 'weak' or 'light' stemmer is one which only merges a few of the most highly related words together - for example, just singular and plural forms, or inflective variants of verbs ("react", "reacts", "reacting", "reacted"). A 'strong' or 'heavy' stemmer, on the other hand, merges a much wider variety of forms (… "reaction", "reactions", "reactive", "reactivity", "reactant" etc.). In IR searching, light stemming seems to be most effective; heavy stemming increases the chances of ambiguity and confusion (for instance, merging "author" with "authority" is generally not helpful). In other situations - e.g., for displaying terms in a user interface - a heavier stemmer may be quite useful.
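The light/heavy distinction in the Paice passage can be illustrated with a toy suffix stripper. This is only a sketch: the suffix lists, the `min_stem` limit and the function names are assumptions made up for this note, not rules taken from Paice or from any real stemmer.

```python
# Toy suffix lists, chosen only for illustration (hypothetical, not from any real stemmer).
LIGHT_SUFFIXES = ("ing", "ed", "s")  # inflectional endings only
HEAVY_SUFFIXES = LIGHT_SUFFIXES + ("ivity", "ions", "ion", "ive", "ant", "ity")


def strip_suffix(word, suffixes, min_stem=3):
    """Remove the longest matching suffix, keeping at least min_stem letters."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def light_stem(word):
    """Merge only inflectional variants."""
    return strip_suffix(word, LIGHT_SUFFIXES)


def heavy_stem(word):
    """Also merge derivational variants."""
    return strip_suffix(word, HEAVY_SUFFIXES)


if __name__ == "__main__":
    words = ["react", "reacts", "reacting", "reacted", "reaction",
             "reactions", "reactive", "reactivity", "reactant",
             "author", "authority"]
    for w in words:
        print(f"{w:12} light={light_stem(w):12} heavy={heavy_stem(w)}")
```

With these toy rules the light stemmer keeps "reaction" and "authority" apart from "react" and "author", while the heavy stemmer merges them, which is the kind of over-merging the passage warns about.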

Whether the impact of a stemmer is positive depends not only on its strength, but also on whether it performs its task accurately - stemmer strength metrics do not represent actual stemming accuracy.

Frakes & Fox (in press) have listed the following ways to measure stemmer strength:

Number of words per conflation class: This is the average size of the groups of words converted to a particular stem (regardless of whether they are all correct). Thus, if the words "engineer", "engineering" and "engineered", and no others, were all stemmed to the stem "engineer", then the size of that conflation class would be 3. If the conflation of 1,000 different words resulted in 250 distinct stems, then the mean number of words per conflation class would be 4. This metric is obviously dependent on the number of words processed, but for a word collection of given size, a higher value indicates a heavier stemmer. The value is easily calculated as follows:

WC = N / S

where

WC = Mean number of words per conflation class

N = Number of unique words before stemming

S = Number of unique stems after stemming
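As a rough sketch, the measure can be computed by dividing the number of unique words by the number of unique stems, which matches the worked figures in the note (1,000 words and 250 stems give 4). The function names and the toy stemmer below are hypothetical and serve only to exercise the calculation.

```python
def mean_words_per_conflation_class(words, stem):
    """Return N / S: unique words before stemming over unique stems after."""
    unique_words = set(words)                       # N
    unique_stems = {stem(w) for w in unique_words}  # S
    if not unique_stems:
        raise ValueError("need at least one word")
    return len(unique_words) / len(unique_stems)


def toy_stem(word):
    """Hypothetical stemmer: strips a few inflectional endings only."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    sample = ["engineer", "engineering", "engineered"]
    print(mean_words_per_conflation_class(sample, toy_stem))
```

Because the value depends on the size of the vocabulary, it is only meaningful for comparing stemmers over the same word collection.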

Thus, a stemmer may replace the endings "-ies" and "-ied" by "-y". We could count this as 3 (number of letters removed), or 2 (reduction in length), or even 4 (letters removed + letters added). Alternatively, we could use a metric such as the Hamming .....
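The three ways of counting a suffix replacement mentioned above can be written out as a small sketch. The function and key names are hypothetical; only the three counts for "-ies" to "-y" come from the note.

```python
def suffix_change_counts(removed, added):
    """Count a suffix replacement in the three ways described in the note."""
    return {
        "letters_removed": len(removed),
        "length_reduction": len(removed) - len(added),
        "removed_plus_added": len(removed) + len(added),
    }


if __name__ == "__main__":
    # "-ies" replaced by "-y", as in the example in the note
    print(suffix_change_counts("ies", "y"))
```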

Others state weak stemmers are just 'inflectional' stemmers, while strong stemmers use 'derivational' stemming also.

References

  1. "Understanding Stemming". Coveo Knowledge Base (2006)
  2. International Developer Tools. dtSearch
  3. Building Multilingual Solutions by using SharePoint Products and Technologies. Microsoft TechNet
  4. CLEF 2003: Stephen Tomlinson compared the Snowball stemmers with the Hummingbird lexical stemming (lemmatization) system.
  5. CLEF 2004: Stephen Tomlinson, "Finnish, Portuguese and Russian Retrieval with Hummingbird SearchServer"

External links

http://www.clef-campaign.org/


Search Engines

Thunderstone Search Appliance

http://www.highbeam.com/doc/1G1-97130092.html

http://www.infoworld.com/archives/emailPrint.jsp?R=printThis&A=/article/04/10/15/42TCsearch_1.html - Review Google/Thunderstone

Thunderstone

performance solutions for enterprise search, text mining, real-time alerting, web crawling/indexing, Internet publishing and information management to corporations, government agencies, on-line service providers and developers worldwide. Thunderstone's flagship product, Texis, is the foundation of its entire product line. Texis, in one package, provides every full-text, SQL, multimedia management, and dynamic publishing operation needed to support an enterprise search application. Thunderstone's unique technology offers concept-based search, fast pattern matching, geographic searching, set logic, foreign language support and the ability to query structured and unstructured information. Thunderstone also offers the Thunderstone Search Appliance, an all-in-one hardware, software, crawling, indexing and search solution based on its Texis technology. The Thunderstone Search Appliance is GSA Schedule 70 listed.


Expansion Programs International, Inc. (Thunderstone Software LLC), Cleveland, Ohio - privately held California corporation, founded in 1981 as EPI Inc. Develops and markets a suite of software applications and tools that search, manage, filter and retrieve information.

Thunderstone's products are licensed to corporations, government agencies, on-line service providers, Internet publishers and developers worldwide.

From 1980-1995 most of Thunderstone's product licenses were embedded within OEM packages developed and sold by other organizations. Notable examples include WordPerfect Corp., Dow Jones, C3 - Telos, The Japan Times, and Reality Software.

Since 1995, the increased popularity of Internet search technology has raised Thunderstone's profile in large single-site applications like those at eBay, Novell, Advance Publications, Pactel, Associated Press, Ziff-Davis, and Bill Gates' Corbis.

List of search engines

A real mess of a list at present - needs a definition at top OR a description of what it is: "A list of Wikipedia articles about search engines, including web search engines, meta search engines, desktop search tools, and web portals and vertical market websites that have a search facility for online databases."

Shorter Oxford English Dictionary (5th ed) - Computing: A program which searches for and identifies specified items in a database or network, esp. the Internet.

Baidu/Robin Li

He was listed in CNN Money's annual "50 people who matter now" in 2007.[1]

"And even before Google did it, Baidu allowed advertisers to bid for ad space and then pay Baidu every time a customer clicked on an ad." New York Times, September 2006.

Also, the model appears to be pay-for-position, rather than sponsored results listed separately as on Google.

Google license AdWords technology?


Baidu/Shawn Wang

In December 2007, Baidu became the first company from China to be included in the NASDAQ-100 index.[2]


http://ir.baidu.com/phoenix.zhtml?c=188488&p=irol-newsArticle_Print&ID=1090264&highlight=

Shawn Wang joined Baidu (BIDU: Baidu.com Inc.) as CFO in September 2004; he had previously been a partner at the accounting and consulting firm PricewaterhouseCoopers. He helped lead the company through its successful initial public offering on NASDAQ in August 2005 and through its recent inclusion in the NASDAQ-100, making Baidu the first company from China to be included in the index.

He was named "CFO of the Year" by CFO Magazine in 2005.

Shawn Wang was appointed as an independent director on the Board of Directors of WuXi PharmaTech (WX), a Shanghai provider of outsourced drug-development services, in July 2007, when the company was preparing its public listing on the New York Stock Exchange. During his brief tenure, and as Chairman of the Audit Committee, Shawn advised the company's management in particular areas of financial reporting, Sarbanes-Oxley Act compliance and investor relations.

He died in an accident while vacationing in China on Thursday, December 27, 2007.


Analysts said Baidu would be able to avoid a crisis in the absence of Wang. "While employee sentiment is likely to be negative in the near term, we believe the business impact is not significant," said Dick Wei, a technology strategist with JPMorgan in Hong Kong, in a note Monday. Wei cited Baidu.com's senior management team and "solid" finance and human resources departments as capable of steering the firm through the transition ahead. "We remain positive of Baidu, the dominant market leader in China's online search market, which is still in an early growth stage," Wei said.

Baidu went public on Nasdaq in August 2005 at $27 a share; less than 2½ years later, it's trading at nearly 15 times its IPO price.

Baidu said the CFO's duties would be assumed by the company's senior managers for now. JPMorgan said Haoyu Shen, Baidu's vice president of business operations, would likely succeed Wang in overseeing finance operations. JPMorgan maintained Baidu's share price target at $400.

Sohu

For the fiscal year ended 31 December 2007, Sohu.com Inc.'s revenues increased 41% to $188.9M. Net income increased 31% to $35M.[3]

  1. CNN Money, June 2007, "50 people who matter now"
  2. LA Times, 10 Dec 2007, "Baidu search yields success in China". http://www.latimes.com/news/printedition/front/la-fi-baidu10dec10,1,7585164.story?ctrack=1&cset=true
  3. Company Profile, reuters.com