TREX search engine
TREX is a search engine that uses columnar storage. It is part of the SAP NetWeaver integrated technology platform produced by SAP SE. The TREX engine is a standalone component that can be used in a range of system environments, but it is used primarily as an integral part of SAP products such as Enterprise Portal, Knowledge Warehouse, and Business Intelligence (BI, formerly SAP Business Information Warehouse). In SAP NetWeaver BI, the TREX engine powers the BI Accelerator, a plug-in appliance that enhances the performance of online analytical processing. The name "TREX" stands for Text Retrieval and information EXtraction, but it is not a registered trademark of SAP and is not used in marketing collateral.
TREX supports various kinds of text search, including exact search, boolean search, wildcard search, linguistic search (grammatical variants are normalized for the index search) and fuzzy search (input strings that differ by a few letters from an index term are normalized for the index search). Result sets are ranked using term frequency-inverse document frequency (tf-idf) weighting, and results can include snippets with the search terms highlighted.
TREX supports text mining and classification using a vector space model. Groups of documents can be classified using query-based classification, example-based classification, or a combination of these plus keyword management.
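Example-based classification in a vector space model can be sketched as follows. This is a minimal illustration under stated assumptions, not TREX's algorithm: documents become bag-of-words count vectors, similarity is the cosine of the angle between vectors, and a document receives the label whose example documents are most similar on average. The helper names `to_vector`, `cosine`, and `classify` and the sample labels are hypothetical.

```python
import math
from collections import Counter


def to_vector(text):
    """Bag-of-words term-count vector (whitespace tokenizer is an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, examples):
    """Return the label whose example documents are most similar on average.

    `examples` maps a (hypothetical) class label to a list of sample
    documents. Returns None when the text shares no term with any example.
    """
    vec = to_vector(text)
    best_label, best_score = None, 0.0
    for label, samples in examples.items():
        if not samples:
            continue
        score = sum(cosine(vec, to_vector(s)) for s in samples) / len(samples)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


examples = {
    "finance": ["invoice payment due", "payment received invoice"],
    "hr": ["employee leave request", "new employee onboarding"],
}
label = classify("invoice for payment", examples)
```

A query-based classifier could reuse the same `cosine` function by comparing each document with a query vector per class instead of with example documents.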
TREX supports structured data search not only for document metadata but also for mass business data and data in SAP Business Objects. Indexes for structured data are implemented compactly using data compression and the data can be aggregated in linear time, to enable large volumes of data to be processed entirely in memory.
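The text does not say which compression scheme TREX uses. Dictionary encoding is one technique commonly used in column stores, and the sketch below assumes it purely for illustration: each distinct value is stored once, the column holds small integer IDs, and a grouped sum is computed in a single pass over the column. The function names and sample data are hypothetical.

```python
def dictionary_encode(values):
    """Replace each value with an integer ID into a sorted dictionary."""
    dictionary = sorted(set(values))
    value_to_id = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [value_to_id[v] for v in values]


def sum_by_group(group_ids, measures, n_groups):
    """Aggregate a measure column by group ID in one linear pass."""
    totals = [0] * n_groups
    for gid, amount in zip(group_ids, measures):
        totals[gid] += amount
    return totals


# Two columns of a toy table, stored column by column rather than row by row.
region = ["EMEA", "APJ", "EMEA", "AMER", "APJ", "EMEA"]
revenue = [100, 40, 60, 75, 10, 5]

dictionary, ids = dictionary_encode(region)
totals = sum_by_group(ids, revenue, len(dictionary))
result = dict(zip(dictionary, totals))
```

Because the group IDs index directly into the `totals` list, the aggregation touches each row exactly once and needs no hashing or sorting of the original string values.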
Recent developments include:
- A join engine to join structured data from different fields in business objects
- A fast update capability to write a delta index beside a main index and to merge them offline while a second delta index takes updates
- A data mining feature pack for advanced mathematical analysis
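The delta index mechanism in the list above can be sketched with a toy inverted index. This is a single-threaded illustration under assumptions, not TREX code: the class name `DeltaMergeIndex` and its methods are hypothetical, and the offline merge is modelled as two explicit steps so that writes arriving in between land in the second delta index.

```python
class DeltaMergeIndex:
    """Toy inverted index with a main index and write-optimized delta indexes."""

    def __init__(self):
        self.main = {}      # term -> set of document IDs (read-optimized)
        self.delta = {}     # receives all new writes
        self.frozen = None  # delta currently being merged, if any

    def add(self, doc_id, text):
        """Index a document; writes only ever touch the active delta."""
        for term in text.lower().split():
            self.delta.setdefault(term, set()).add(doc_id)

    def begin_merge(self):
        """Freeze the current delta and open a second delta for new writes."""
        if self.frozen is not None:
            raise RuntimeError("a merge is already in progress")
        self.frozen, self.delta = self.delta, {}

    def finish_merge(self):
        """Fold the frozen delta into a new main index and swap it in."""
        if self.frozen is None:
            raise RuntimeError("no merge in progress")
        merged = {term: set(ids) for term, ids in self.main.items()}
        for term, ids in self.frozen.items():
            merged.setdefault(term, set()).update(ids)
        self.main, self.frozen = merged, None

    def search(self, term):
        """Query the main index plus every delta that is still live."""
        term = term.lower()
        hits = set(self.main.get(term, ()))
        hits |= self.delta.get(term, set())
        if self.frozen is not None:
            hits |= self.frozen.get(term, set())
        return hits


idx = DeltaMergeIndex()
idx.add(1, "columnar storage")
idx.begin_merge()               # first delta is frozen for merging
idx.add(2, "columnar search")   # second delta takes this update
during_merge = idx.search("columnar")
idx.finish_merge()
```

Searches consult the main index and both delta indexes, so documents stay visible throughout the merge. Building the merged index as a fresh structure and swapping it in at the end means readers never see a half-merged main index.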
The first code for the engine was written in 1998 and TREX became an SAP component in 2000. The SAP NetWeaver BI Accelerator was first rolled out in 2005. As of Q1 2013, the current release of TREX is SAP NW 7.1.
A security vulnerability in TREX was first identified and fixed in 2015 (see SAP Security Note 2234226). The vulnerability was caused by a lack of authentication in TREXnet, an internal communication protocol. The patch addressed the problem by removing some critical functionality.
Mathieu Geli, head of threat intelligence at ERPScan, later continued to investigate the vulnerability and found that it was still exploitable. A successful attack would allow a remote attacker to gain full control of the server without authorization. The vulnerability was finally patched via SAP Security Note 2419592.