Keyword Services Platform
The Keyword Services Platform (KSP) is a keyword research tool available through Microsoft adCenter, which contains a set of algorithms for providing information about keywords used in search engine queries.
The KSP was originally conceived by ZhaoHui Tang, Dylan Huang, Wayne Guan, Jiong Feng, Li Luo, Ken Kwok, and Fred Nie at Microsoft adCenter Labs in May 2006. It underwent a major overhaul in 2011, and the platform in its current form was developed by Nimeesh Patel, Shravana Aadith Ramia Bapulal, and Vivek Vinodchandra Pradhan. The platform aims to provide a core set of data and technology to support search engine marketing and keyword research. The KSP delivers a standardized set of keyword technologies through a Web services model, accessible via an application programming interface (API) and a Microsoft Excel add-in.
KSP API beta access is available to researchers and developers upon request through the Keyword Services Platform feedback link.
Architecture
The Keyword Services Platform architecture comprises the following components:
- Keyword API. Set of standard Web services for various keyword tasks. These services are based on the Windows Communication Foundation and can be consumed by client applications (e.g., the Microsoft Excel add-in) or mashups.
- Provider Plug-in Framework. System that allows researchers to incorporate new algorithms or data mining models within the Service Container. Each provider brings a specific keyword technology to the KSP architecture—for instance, keyword association, keyword extraction, or keyword classification.
- Security. Component that handles permissions that pertain to provider procedures (methods) and Stored Procedure implementation.
- Server Object Model. Data model that allows developers to leverage different services.
- Service Container. Set of providers that support various keyword technologies. Host to all service providers and Stored Procedures, which allows parallel execution.
- Shared Services. Core components, consisting of a crawler, in-memory data structures, word stemming algorithms, etc. These services are used by different providers and executed by stored procedures.
- Stored Procedures. Procedures for consolidating and centralizing the logic behind applications. Selected sets of these procedures are made available to users.
Developers may use .NET programming languages to create procedures that combine the use of different providers, or implement additional business logic processing based on the output from a provider.
Keyword API
The Keyword Services Platform defines an API for each class of keyword service. These Web service interfaces include keyword extraction (ITermExtraction), keyword categorization (ITermCategorization), keyword suggestion (ITermSuggestion), keyword forecast (ITermForecast), keyword monetization (ITermMonetization), and several others. The APIs define the signature of each Web service.
Keyword suggestion
Keyword suggestions are handled via the ITermSuggestion interface. To find the five most closely related keywords to "BMW", the following method call may be used: GetTermSuggestion("BMW",5). The query result is shown in the following table, and by default, sorted by confidence:
OriginalTerm | Term |
---|---|
BMW | Auto |
BMW | Car |
BMW | Lexus |
BMW | BMW cars |
BMW | BMW Z4 |
To view the five suggested terms with their corresponding confidence scores, a third parameter can be used to indicate that statistics should be returned: GetTermSuggestion("BMW",5,true). The query result is shown in the following table along with columns for score and support. The results are similar to those available through the Data Mining Extensions (DMX) in SQL Server. Score represents the confidence or probability; support represents the number of cases supporting the rule in the training dataset.
OriginalTerm | Term | Score | Support |
---|---|---|---|
BMW | Auto | 0.96 | 10000 |
BMW | Car | 0.89 | 9000 |
BMW | Lexus | 0.89 | 11000 |
BMW | BMW cars | 0.83 | 12000 |
BMW | BMW Z4 | 0.78 | 12800 |
To return only those terms with a high confidence score, a filter can be used on the Score column with the following method call: GetTermSuggestion("BMW",5,true,"Score>0.8"). The query result is shown in the following table. In this case, only four rows are returned, as these are the only terms that meet the criterion of the filter.
OriginalTerm | Term | Score | Support |
---|---|---|---|
BMW | Auto | 0.96 | 10000 |
BMW | Car | 0.89 | 9000 |
BMW | Lexus | 0.89 | 11000 |
BMW | BMW cars | 0.83 | 12000 |
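The interaction between the count and the filter can be sketched locally. The following C# sketch does not call the KSP: it replays the sample rows from the tables above and assumes, as the four-row result suggests, that the top-N cut is applied before the score filter. The `Suggestion` type, the `SuggestionSketch` class, and the use of a delegate in place of the filter string are all hypothetical choices made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type mirroring the columns of the result tables.
public sealed class Suggestion
{
    public Suggestion(string originalTerm, string term, double score, int support)
    {
        OriginalTerm = originalTerm;
        Term = term;
        Score = score;
        Support = support;
    }

    public string OriginalTerm { get; }
    public string Term { get; }
    public double Score { get; }
    public int Support { get; }
}

public static class SuggestionSketch
{
    // Sample rows copied from the tables in this section (not live KSP output).
    public static readonly IReadOnlyList<Suggestion> SampleRows = new List<Suggestion>
    {
        new Suggestion("BMW", "Auto", 0.96, 10000),
        new Suggestion("BMW", "Car", 0.89, 9000),
        new Suggestion("BMW", "Lexus", 0.89, 11000),
        new Suggestion("BMW", "BMW cars", 0.83, 12000),
        new Suggestion("BMW", "BMW Z4", 0.78, 12800),
    };

    // Assumed semantics: rank by score, keep the top `count` rows, then filter.
    public static List<Suggestion> TopThenFilter(
        IEnumerable<Suggestion> rows, int count, Func<Suggestion, bool> filter)
    {
        return rows
            .OrderByDescending(r => r.Score) // stable sort: ties keep input order
            .Take(count)
            .Where(filter)
            .ToList();
    }

    public static void Main()
    {
        // Local stand-in for GetTermSuggestion("BMW",5,true,"Score>0.8").
        foreach (Suggestion row in TopThenFilter(SampleRows, 5, r => r.Score > 0.8))
        {
            Console.WriteLine(row.Term);
        }
    }
}
```

With the sample rows, the filter drops only "BMW Z4" (score 0.78), which matches the four-row table.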
When the input may include thousands of keywords, batch query syntax can be used. For example, suppose that the keywords are stored in myInputTermTable, and only the two most relevant terms for each keyword should be returned: GetTermSuggestion(myInputTermTable,2). The query result is shown in the following table.
OriginalTerm | Term |
---|---|
BMW | Auto |
BMW | Car |
Honda | Lexus |
Honda | Sedan |
Ford | Pickup |
Ford | Truck |
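The shape of the batch result, at most N rows per input term, can be illustrated with a small grouping sketch. This is a local illustration only, not the KSP implementation: the candidate rows are taken from the tables in this section and are assumed to be already in rank order for each input term, and the `BatchSuggestionSketch` class and `TopPerTerm` method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchSuggestionSketch
{
    // Candidate rows from the tables in this section, assumed ranked per input term.
    public static readonly (string OriginalTerm, string Term)[] Candidates =
    {
        ("BMW", "Auto"), ("BMW", "Car"), ("BMW", "Lexus"),
        ("BMW", "BMW cars"), ("BMW", "BMW Z4"),
        ("Honda", "Lexus"), ("Honda", "Sedan"),
        ("Ford", "Pickup"), ("Ford", "Truck"),
    };

    // Keep the first `perTerm` rows of each input term, preserving input order.
    public static List<(string OriginalTerm, string Term)> TopPerTerm(
        IEnumerable<(string OriginalTerm, string Term)> rankedRows, int perTerm)
    {
        return rankedRows
            .GroupBy(r => r.OriginalTerm)          // groups appear in first-seen key order
            .SelectMany(g => g.Take(perTerm))      // rows inside a group keep source order
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in TopPerTerm(Candidates, 2))
        {
            Console.WriteLine(row.OriginalTerm + " | " + row.Term);
        }
    }
}
```

Grouping by the input term keeps one call's output aligned with the input table, which is why the batch result above lists two rows each for BMW, Honda, and Ford.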
Keyword demographics
Keyword demographics are handled via the ITermDemographics interface. To obtain the demographic distribution for the keyword "Minivan", the following method call could be used: GetTermDemographics("minivan"). The query result is shown in the following table.
Term | Male | Female | 0-13 | 13-18 | 18-25 | 25-35 | 35-50 | 50-65 | 65+ |
---|---|---|---|---|---|---|---|---|---|
Minivan | 0.40 | 0.60 | 0 | 0 | 0.1 | 0.2 | 0.4 | 0.2 | 0.1 |
Keyword monetization
Keyword monetization values specific to paid search are handled via the ITermMonetization interface. The following method call returns the key performance indicators (KPIs) for the keyword "Online bank" based on the previous week's paid search data, in the third position of sponsored listings: GetTermKPIs("online bank",TimeInterval.LastWeek,3). The result of the query is shown below, containing the input keyword, the number of clicks on the sponsored link for "Online bank", overall impressions for the keyword, position, average click-through rate (CTR), and average cost per click (CPC).
Term | Clicks | Impressions | Position | CTR | CPC |
---|---|---|---|---|---|
Online bank | 42 | 2915 | 3 | 0.014 | 1.325 |
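The CTR column is consistent with the other columns: 42 clicks over 2915 impressions is about 0.0144, which rounds to the 0.014 shown. The short sketch below reproduces that arithmetic. It is an illustration only; `KpiSketch` and its methods are hypothetical helpers, and the total-cost calculation (clicks multiplied by average CPC) is an added derived figure, not a field returned by the KSP.

```csharp
using System;

public static class KpiSketch
{
    // CTR = clicks / impressions; returns 0 when there are no impressions.
    public static double ClickThroughRate(int clicks, int impressions)
    {
        return impressions == 0 ? 0.0 : (double)clicks / impressions;
    }

    // Total cost = clicks * average cost per click (in the currency reported).
    public static decimal TotalCost(int clicks, decimal averageCpc)
    {
        return clicks * averageCpc;
    }

    public static void Main()
    {
        double ctr = ClickThroughRate(42, 2915);
        Console.WriteLine(Math.Round(ctr, 3));      // CTR rounded to three decimals
        Console.WriteLine(TotalCost(42, 1.325m));   // derived total cost for the week
    }
}
```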
Keyword extraction
Keyword extraction is handled via the ITermExtraction interface. The following method call extracts the eight most relevant keywords from the webpage "autos.msn.com", and provides the corresponding statistics: GetTermExtraction("autos.msn.com",8,true). The result of the query is shown below, where the Score column represents the relevance of the extracted keyword to the page content, while the Support column represents the number of occurrences of a keyword on the page.
URL | Term | Score | Support |
---|---|---|---|
autos.msn.com | auto reviews | 0.62 | 3 |
autos.msn.com | MSN autos | 0.54 | 2 |
autos.msn.com | cars | 0.48 | 5 |
autos.msn.com | sport cars | 0.39 | 2 |
autos.msn.com | used cars | 0.38 | 1 |
autos.msn.com | compare car | 0.34 | 1 |
autos.msn.com | new cars | 0.32 | 1 |
autos.msn.com | luxury cars | 0.30 | 1 |
Sample code
The following code fragment connects to the Keyword Services Platform server and calls the keyword forecast Web service (ITermForecast).
// Requires: using System; using System.Data; using System.ServiceModel;
// plus the namespace of the KSP client library.
// term (a single keyword), terms (a batch of keywords), and DisplayResults
// are assumed to be defined by the caller.
using (KeywordServer server = new KeywordServer("https://ksp.microsoft.com")) {
    server.UserName = "username";
    server.Password = "********";
    ITermForecast provider = null;
    try {
        server.Open();
        // Context can be set if needed. It will remain during the following calls.
        provider = server.GetProviderByImplementation<ITermForecast>("Microsoft.adCenterLabs.Providers.KeywordForecastProvider");
        if (provider != null) {
            // Single mode API
            DataTable result = provider.GetTermForecast(term, -5, 3);
            DisplayResults(result);
            // Batch mode API
            result = provider.GetTermForecast(terms, -5, 3);
            DisplayResults(result);
        }
    }
    catch (FaultException) {
        // Handle fault returned from calling the proxy method.
        // Caught first because FaultException derives from CommunicationException.
    }
    catch (CommunicationException) {
        // Handle lost network connection error
    }
    catch (TimeoutException) {
        // Handle time-out error
    }
    finally {
        // Always release the provider, even if a call failed.
        if (provider != null)
            server.ReleaseService(provider);
    }
}
Providers
Each Keyword Services Platform provider supplies a specific type of keyword technology by implementing one of the keyword interfaces (e.g., ITermSuggestion, ITermForecast, ITermExtraction). The API defines the signature of each Web service and the format of the returned data. A KSP provider is a server-side object that encapsulates a particular implementation of a keyword technology and exposes its functionality through service contracts in the Windows Communication Foundation (WCF), Microsoft's unified programming model for building secure, reliable, transacted service-oriented applications. To integrate seamlessly into the KSP, and in turn with third-party tools and applications, a provider must meet several conditions:
- custom configuration settings stored in configuration files rather than in the source code;
- standard .NET tracing and message logging to enable service monitoring and diagnostics;
- standard Windows Management Instrumentation (WMI) performance counters for performance monitoring; and
- documented service contracts, expressed in a service description language, for better understanding and testing of the provider.
Stored procedures
Developers can write stored procedures (sprocs) using any .NET programming language. These procedures are executed on the Keyword Services Platform server, which hosts the Common Language Runtime (CLR). Similar to a database sproc, a KSP sproc is designed to let developers implement business logic on the server side after retrieving result data from providers. KSP sprocs require no configuration management or setup.
Two types of stored procedures are supported: Managed Assembly Stored Procedure (MASP) and Common Language Runtime Stored Procedure (CLRSP). A MASP consists of a compiled .NET assembly containing a public interface exposed through the KSP, together with any dependent files. Once the MASP is uploaded to the KSP through its management interface, it becomes callable by KSP client programs. A CLRSP consists of a source file written in one of the supported CLR programming languages (C#, Visual Basic .NET, Managed Extensions for C++, and others). The functionality of a CLRSP is exposed through a public interface defined in the source file. Once the CLRSP is deployed to the KSP through its management interface, it is compiled on demand by the KSP and becomes callable by KSP client programs. Unlike database sprocs, KSP sprocs are object-oriented: a sproc may contain a set of related functions, or even identically named functions with different signatures.
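The object-oriented style described above can be sketched as a class with two identically named methods that post-process provider output. This is a minimal local sketch, not a deployable MASP or CLRSP: `ISuggestionSource`, `SampleSuggestionSource`, and `HighConfidenceTerms` are hypothetical names, the stand-in interface is not the real ITermSuggestion contract, and the data replays the BMW sample rows from the keyword suggestion section.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a suggestion provider; not the real ITermSuggestion contract.
public interface ISuggestionSource
{
    IEnumerable<(string Term, double Score)> Suggest(string term, int count);
}

// In-memory source that replays the BMW sample rows from this article.
public sealed class SampleSuggestionSource : ISuggestionSource
{
    private static readonly (string Term, double Score)[] Rows =
    {
        ("Auto", 0.96), ("Car", 0.89), ("Lexus", 0.89), ("BMW cars", 0.83), ("BMW Z4", 0.78),
    };

    public IEnumerable<(string Term, double Score)> Suggest(string term, int count)
    {
        return term == "BMW"
            ? Rows.Take(count)
            : Enumerable.Empty<(string Term, double Score)>();
    }
}

// Sproc-style class: one entry-point name, two signatures.
public sealed class HighConfidenceTerms
{
    private readonly ISuggestionSource _source;

    public HighConfidenceTerms(ISuggestionSource source)
    {
        _source = source;
    }

    // Overload that applies a default threshold of 0.8.
    public List<string> Run(string term, int count)
    {
        return Run(term, count, 0.8);
    }

    // Overload with an explicit threshold: business logic applied to provider output.
    public List<string> Run(string term, int count, double minScore)
    {
        return _source.Suggest(term, count)
            .Where(s => s.Score > minScore)
            .Select(s => s.Term)
            .ToList();
    }
}

public static class SprocDemo
{
    public static void Main()
    {
        var sproc = new HighConfidenceTerms(new SampleSuggestionSource());
        Console.WriteLine(string.Join(", ", sproc.Run("BMW", 5)));
        Console.WriteLine(string.Join(", ", sproc.Run("BMW", 5, 0.9)));
    }
}
```

Passing the source through the constructor keeps the business logic independent of any particular provider, so the same class can be exercised against an in-memory source like the one here.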
Server Object Model and Shared Services
The Keyword Services Platform Server Object Model and Shared Services enable KSP service providers and stored procedure developers to access server-side objects and functionality easily and consistently. The object model consists of the following three collections:
- Service providers: This collection enables callers to access server-side Service Provider objects by name, implementation interface, and/or class name. Once callers obtain the Service Provider object, all of the functionalities of the service provider are accessible through its public interface.
- Stored procedures: This collection enables callers to access server-side Stored Procedure objects by name, implementation interface, and/or class name. Once callers obtain the Stored Procedure object, all of the functionalities of the stored procedure are accessible through its public interface.
- Services: This collection enables callers to access server-side shared services by name, implementation interface, and/or class name. Once callers obtain the shared service object, all of the functionalities of the shared service are accessible through its public interface.
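The three lookup styles shared by these collections (by name, by implementation interface, and by class name) can be sketched with a small in-memory registry. This is an illustration under assumptions, not the KSP object model: `ObjectCollectionSketch`, `IDemoForecast`, and `DemoForecastProvider` are hypothetical names introduced for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical marker interface and provider used only for this sketch.
public interface IDemoForecast { }
public sealed class DemoForecastProvider : IDemoForecast { }

public sealed class ObjectCollectionSketch
{
    private readonly Dictionary<string, object> _byName = new Dictionary<string, object>();

    public void Register(string name, object instance)
    {
        _byName[name] = instance;
    }

    // Lookup by registered name; returns null when the name is unknown.
    public object GetByName(string name)
    {
        object instance;
        return _byName.TryGetValue(name, out instance) ? instance : null;
    }

    // Lookup by implementation interface; returns null when nothing implements T.
    public T GetByInterface<T>() where T : class
    {
        return _byName.Values.OfType<T>().FirstOrDefault();
    }

    // Lookup by full class name; returns null when no object has that type.
    public object GetByClassName(string fullClassName)
    {
        return _byName.Values.FirstOrDefault(o => o.GetType().FullName == fullClassName);
    }
}

public static class ObjectModelDemo
{
    public static void Main()
    {
        var providers = new ObjectCollectionSketch();
        providers.Register("forecast", new DemoForecastProvider());

        Console.WriteLine(providers.GetByName("forecast") != null);
        Console.WriteLine(providers.GetByInterface<IDemoForecast>() != null);
        Console.WriteLine(
            providers.GetByClassName(typeof(DemoForecastProvider).FullName) != null);
    }
}
```

Whichever lookup is used, the caller ends up with the same object and works with it through its public interface, as the list above describes.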
Cloud server model
The Microsoft adCenter Keyword Services Platform server farm provides a scalable platform for keyword technologies. Each server in the farm can have a different configuration to suit a variety of service providers and stored procedures. A dynamic service load-balancing server, called the cloud server, is the hub of the KSP server farm. When a KSP server is added to the server farm via the cloud server, all available keyword service providers and stored procedures are dynamically discovered and registered with the cloud server. Any changes in the availability of the KSP server, or of its running service providers and stored procedures, are likewise discovered and registered automatically.
The cloud server distributes requests for services running on a KSP server farm through its load balancer provider. The default implementation of the load balancer provider uses a round-robin scheduling approach. Over time, the cloud server accumulates usage patterns and statistics for the service providers and stored procedures running on each KSP server in the farm. It uses this information to decide where to deploy additional service providers and stored procedures automatically. For example, if the Keyword Forecast provider is being used heavily in the server farm and the providers running on machine "A" are used lightly, the cloud server will automatically deploy the Keyword Forecast provider to machine "A" and route requests to that machine to balance the load for the Keyword Forecast provider.
When a client application calls a service provider or stored procedure through the server, a KSP server with a matching service provider or stored procedure is selected by the load balancer provider, and the request is routed to the appropriate KSP server. If a server, service provider, or stored procedure in the KSP server farm is unavailable, it will be taken out of rotation by the load balancer automatically.
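Round-robin selection with automatic removal from rotation can be sketched in a few lines. The class below is an illustrative, single-threaded sketch and not the KSP load balancer provider: `RoundRobinBalancer` and its methods are hypothetical names, servers are identified by plain strings, and the matching of requests to servers that host a particular provider is omitted for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class RoundRobinBalancer
{
    private readonly List<string> _servers;
    private readonly HashSet<string> _unavailable = new HashSet<string>();
    private int _next; // index of the server to try first on the next request

    public RoundRobinBalancer(IEnumerable<string> servers)
    {
        _servers = servers.ToList();
    }

    // Take a server out of rotation, or put it back.
    public void MarkUnavailable(string server) { _unavailable.Add(server); }
    public void MarkAvailable(string server) { _unavailable.Remove(server); }

    // Returns the next available server in rotation, or null if none is available.
    public string Next()
    {
        for (int attempts = 0; attempts < _servers.Count; attempts++)
        {
            string candidate = _servers[_next];
            _next = (_next + 1) % _servers.Count;
            if (!_unavailable.Contains(candidate))
            {
                return candidate;
            }
        }
        return null;
    }
}

public static class BalancerDemo
{
    public static void Main()
    {
        var balancer = new RoundRobinBalancer(new[] { "A", "B", "C" });
        Console.WriteLine(balancer.Next() + balancer.Next() + balancer.Next());

        balancer.MarkUnavailable("B"); // B is skipped until it is marked available again
        Console.WriteLine(balancer.Next() + balancer.Next() + balancer.Next());
    }
}
```

Each call tries at most one full pass over the server list, so a farm with every server out of rotation yields null instead of looping forever.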
Data mart
A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data subject, that may be distributed to support business needs. Many Keyword Services Platform providers require real-time database access. The database may contain a list of reference keywords, their corresponding traffic, the most recent click-through data, and data mining model contents. This data is updated regularly through extract, transform, load (ETL) pipelines, on a schedule set by each provider's requirements.
Technology transfer
The Keyword Services Platform's architecture permits agile development and rapid technology transfer by giving researchers a way to ship their research results to a live system quickly. The API defines the standard contract between the research models and developers. Researchers need only implement providers and deploy them to a selected set of KSP cloud server machines. Because a new provider runs on a limited set of machines, its scope is contained, which makes live testing straightforward. Once the provider is live-tested and proven, the KSP can make it the default provider without any changes on the application side. This infrastructure enables researchers at Microsoft and in academic settings to speed up innovation in keyword technology and deploy the latest research results to KSP consumers.
KSP data access with Microsoft Excel 2007
Microsoft adCenter released an add-in for Microsoft Excel 2007 that allows users to consume Keyword Services Platform data directly in Excel rather than through the API. The add-in is an example of the kind of mashup and creative use of data that the KSP makes possible. It delivers features such as keyword extraction, suggestion, forecasting, and monetization.
Applications of the KSP
The Keyword Services Platform incorporates keyword technologies from Microsoft adCenter Labs and other Microsoft Research groups. Keyword APIs can be consumed by third-party business applications for paid search, content advertisements, behavioral targeting, presale business intelligence, and so on.
The KSP can be used in advertising campaign creation and management:
- The Keyword Association provider can help advertisers generate a set of the most relevant keywords for a campaign, leading to more efficient planning and improved return on investment.
- The Keyword Forecasting provider can help advertisers to understand traffic history and trends, and eventually help to manage an integrated campaign budget that makes seasonal allowances.
- The Keyword Extraction provider can extract the important keywords on a publisher's webpage, helping to identify what advertisements should be served for that page, thus facilitating landing page analysis.
The KSP can also be used in behavioral targeting and display advertising:
- The Keyword Demographic and Geographic Distribution providers can help advertisers understand various customer segments and their keyword usage patterns, leading to more effectively targeted advertising and a decreased overall spend.
- Keyword Association providers can help to expand existing customer segments to include other customers with similar interests based on language patterns.
Further reading
- Wen-tau Yih, Joshua Goodman, Vitor R. Carvalho: Finding advertising keywords on web pages. WWW 2006: 213-222
- Ning Liu, Shuzhen Nong, Jun Yan, Benyu Zhang, Zheng Chen, Ying Li: Similarity of Temporal Query Logs Based on ARIMA Model. ICDM 2006: 975-979
- Honghua (Kathy) Dai, Lingzhi Zhao, Zaiqing Nie, Ji-Rong Wen, Lee Wang, Ying Li: Detecting online commercial intention (OCI). WWW 2006: 829-837
- Lee Wang, Chuang Wang, Xing Xie, Josh Forman, Yansheng Lu, Wei-Ying Ma, Ying Li: Detecting dominant locations from search queries. SIGIR 2005: 424-431
- ZhaoHui Tang, Jamie Maclennan, Pyungchul (Peter) Kim: Building data mining solutions with OLE DB for DM and XML for analysis. SIGMOD Record 34(2): 80-85 (2005)
- ZhaoHui Tang, Jamie Maclennan: Data Mining with SQL Server 2008, Wiley, 2008.