TREX search engine
TREX is a search engine in the SAP NetWeaver integrated technology platform produced by SAP AG. It uses columnar storage. The TREX engine is a standalone component that can be used in a range of system environments, but it is used primarily as an integral part of SAP products such as Enterprise Portal, Knowledge Warehouse, and Business Intelligence (BI, formerly SAP Business Information Warehouse). In SAP NetWeaver BI, the TREX engine powers the BI Accelerator, a plug-in appliance that improves the performance of online analytical processing. The name "TREX" stands for Text Retrieval and information EXtraction, but it is not a registered trademark of SAP and is not used in marketing collateral.
TREX supports various kinds of text search, including exact search, Boolean search, wildcard search, linguistic search (grammatical variants are normalized for the index search) and fuzzy search (input strings that differ from an index term by a few letters are normalized for the index search). Result sets are ranked using term frequency-inverse document frequency (tf-idf) weighting, and results can include snippets with the search terms highlighted.
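The tf-idf weighting mentioned above can be illustrated with a minimal Python sketch. It is not TREX code: the function names `tokenize` and `tf_idf_rank`, the whitespace tokenizer, and the particular tf and idf formulas (relative term frequency and the natural logarithm of N divided by document frequency) are assumptions chosen for illustration, and real engines use many variants.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real linguistic analysis."""
    return text.lower().split()


def tf_idf_rank(query, documents):
    """Return (doc_index, score) pairs sorted by descending tf-idf score."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for idx, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if counts[term] == 0:
                continue  # term absent from this document
            tf = counts[term] / len(tokens)            # relative term frequency
            idf = math.log(n_docs / doc_freq[term])    # rarer terms weigh more
            score += tf * idf
        scores.append((idx, score))

    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

In this sketch a document that mentions a query term more often, relative to its length, ranks higher, and a term that occurs in every document contributes nothing to the score.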
TREX supports text mining and classification using a vector space model. Groups of documents can be classified using query-based classification, example-based classification, or a combination of these plus keyword management.
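As an illustration of example-based classification in a vector space model, the following Python sketch represents each document as a term-count vector and assigns a new document to the category whose example documents are most similar by cosine similarity. The helper names (`to_vector`, `cosine`, `classify`) and the choice of raw term counts and cosine similarity are assumptions made for this sketch, not a description of how TREX implements classification.

```python
import math
from collections import Counter


def to_vector(text):
    """Represent a document as a sparse term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, examples):
    """Assign text to the category whose summed example vector is most similar."""
    doc = to_vector(text)
    centroids = {}
    for category, texts in examples.items():
        centroid = Counter()
        for example in texts:
            centroid.update(to_vector(example))  # sum the example vectors
        centroids[category] = centroid
    return max(centroids, key=lambda c: cosine(doc, centroids[c]))
```

A query-based classifier could reuse the same similarity function by treating a stored query string as the only "example" of its category.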
TREX supports structured data search not only for document metadata but also for mass business data and data in SAP Business Objects. Indexes for structured data are stored compactly using data compression, and the data can be aggregated in linear time, which allows large volumes of data to be processed entirely in memory.
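The text does not specify which compression scheme TREX uses. As one common technique in column stores, the Python sketch below assumes dictionary encoding: each distinct column value is replaced by a small integer ID, and a grouped sum is then computed in a single pass over the encoded column. The function names and sample data are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value with a small integer ID; return (dictionary, ids)."""
    dictionary = sorted(set(values))
    position = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [position[v] for v in values]


def sum_by_group(group_column, measure_column):
    """Sum a measure per group with one linear scan over the encoded column."""
    dictionary, ids = dictionary_encode(group_column)
    totals = [0] * len(dictionary)  # one accumulator per distinct group value
    for group_id, amount in zip(ids, measure_column):
        totals[group_id] += amount  # array lookup by ID, no string comparison
    return dict(zip(dictionary, totals))
```

For example, `sum_by_group(["EU", "US", "EU"], [10, 20, 5])` scans the three rows once and accumulates into two counters, one per distinct region.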
Recent developments include:
- A join engine to join structured data from different fields in business objects
- A fast update capability that writes changes to a delta index alongside the main index and merges the two offline while a second delta index takes updates
- A data mining feature pack for advanced mathematical analysis
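The delta index mechanism in the list above can be sketched as a toy inverted index in Python. The class name `DeltaIndexedStore`, its methods, and the in-memory dictionaries are all hypothetical and are meant only to show the idea: new writes go to a small delta, a merge freezes that delta and opens a second one, and searches consult every part so that results stay complete during the merge.

```python
class DeltaIndexedStore:
    """Toy inverted index with a read-optimized main part and write-optimized deltas."""

    def __init__(self):
        self.main = {}       # term -> sorted tuple of doc ids (read-optimized)
        self.delta = {}      # term -> set of doc ids (receives new writes)
        self.merging = None  # frozen delta currently being merged, if any

    def add(self, doc_id, text):
        """Index a document into the active delta only; main is untouched."""
        for term in text.lower().split():
            self.delta.setdefault(term, set()).add(doc_id)

    def search(self, term):
        """Look up a term in main, the active delta, and any frozen delta."""
        hits = set(self.main.get(term, ()))
        hits |= self.delta.get(term, set())
        if self.merging is not None:
            hits |= self.merging.get(term, set())
        return sorted(hits)

    def begin_merge(self):
        """Freeze the current delta and open a second one for incoming updates."""
        self.merging, self.delta = self.delta, {}

    def finish_merge(self):
        """Fold the frozen delta into the main index and discard it."""
        if self.merging is None:
            return
        for term, doc_ids in self.merging.items():
            merged = set(self.main.get(term, ())) | doc_ids
            self.main[term] = tuple(sorted(merged))
        self.merging = None
```

In this sketch, documents added between `begin_merge` and `finish_merge` land in the second delta and remain searchable throughout.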
The first code for the engine was written in 1998 and TREX became an SAP component in 2000. The SAP NetWeaver BI Accelerator was first rolled out in 2005. As of Q1 2013, the current release of TREX is SAP NW 7.1.