List of text mining software

Text mining software is available from many commercial vendors and open-source projects.


  • AeroText – a suite of text mining applications for content analysis. Content used can be in multiple languages.
  • Angoss – Angoss Text Analytics provides entity and theme extraction, topic categorization, sentiment analysis and document summarization capabilities via the embedded Lexalytics Salience Engine. The software can merge the output of unstructured, text-based analysis with structured data to provide additional predictive variables for predictive models and association analysis.
  • Attensity – hosted, integrated and stand-alone text mining (analytics) software that uses natural language processing technology to address collective intelligence in social media and forums; the voice of the customer in surveys and emails; customer relationship management; e-services; research and e-discovery; risk and compliance; and intelligence analysis.
  • AUTINDEX – a commercial text mining software package based on sophisticated linguistics, by IAI (Institute for Applied Information Sciences), Saarbrücken.
  • Autonomy – text mining, clustering and categorization software
  • Averbis – provides text analytics, clustering and categorization software, as well as terminology management and enterprise search
  • Basis Technology – provides a suite of text analysis modules to identify language, enable search in more than 20 languages, extract entities, and efficiently search for and translate entities.
  • Clarabridge – text analytics (text mining) software, including natural language processing (NLP), machine learning, clustering and categorization. Provides SaaS, hosted and on-premises text and sentiment analytics that enable companies to collect, listen to, analyze, and act on the Voice of the Customer (VOC) from both external sources (Twitter, Facebook, Yelp!, product forums, etc.) and internal sources (call center notes, CRM, Enterprise Data Warehouse, BI, surveys, emails, etc.).
  • Complete Discovery Source – provides software and services for data discovery and data analytics via Nytrix CIY and other proprietary tools.
  • Endeca Technologies – provides software to analyze and cluster unstructured text.
  • Expert System S.p.A. – suite of semantic technologies and products for developers and knowledge managers.
  • FICO Score – provider of analytics.
  • General Sentiment – social intelligence platform that uses natural language processing to discover affinities between fans of brands and fans of traditional television shows in social media. Its stand-alone text analytics captures a social knowledge base on billions of topics, with data stored back to 2004.
  • IBM LanguageWare – the IBM suite for text analytics (tools and runtime).
  • IBM SPSS – provider of Modeler Premium (previously called IBM SPSS Modeler and IBM SPSS Text Analytics), which contains advanced NLP-based text analysis capabilities (multilingual sentiment, event and fact extraction) that can be used in conjunction with predictive modeling. Text Analytics for Surveys categorizes survey responses using NLP-based capabilities for further analysis or reporting.
  • Idibon – provider of text analytics and natural language processing in any language. Idibon's cloud-based natural language processing services enable organizations to efficiently structure and organize their language data.
  • Inxight – provider of text analytics, search, and unstructured visualization technologies. (Inxight was acquired by Business Objects, which was in turn acquired by SAP AG in 2008.)
  • LanguageWare – text analysis libraries and customization software from IBM.
  • Language Computer Corporation – text extraction and analysis tools, available in multiple languages.
  • Lexalytics – provider of a text analytics engine used in social media monitoring, voice of the customer, survey analysis, and other applications.
  • LexisNexis – provider of business intelligence solutions based on an extensive news and company information content set. LexisNexis acquired DataOps to pursue search.
  • Luminoso – enterprise feedback and text analytics solutions developed over a decade of natural language processing (NLP), machine learning and artificial intelligence research at MIT Media Lab. Enables clients to understand, measure and act on large amounts of consumer feedback, across multiple channels.[1][2]
  • Mathematica – provides built-in tools for text alignment, pattern matching, clustering and semantic analysis.
  • Medallia – offers one system of record for survey, social, text, written and online feedback.
  • Megaputer Intelligence – provides software that derives actionable knowledge from large volumes of text and structured data, using natural language processing (NLP), machine learning, sentiment analysis, entity extraction, clustering, and categorization.
  • NetOwl – suite of multilingual text and entity analytics products, including entity extraction, link and event extraction, sentiment analysis, geotagging, name translation, name matching, and identity resolution, among others.
  • RapidMiner with its Text Processing Extension – data and text mining software.
  • SAS – SAS Text Miner and Teragram; commercial text analytics, natural language processing, and taxonomy software used for Information Management.
  • Semantria – offers text analytics services via an API and an Excel plugin. It is a spinoff of the text analysis software provider Lexalytics, and differs in that it incorporates a bigger knowledge base and uses deep learning.
  • Smartlogic – Semaphore; Content Intelligence platform containing commercial text analytics, natural language processing, rule-based classification, ontology/taxonomy modelling and information visualization software used for Information Management.
  • StatSoft – provides STATISTICA Text Miner as an optional extension to STATISTICA Data Miner, for Predictive Analytics Solutions.
  • Sysomos – provider of a social media analytics software platform, including text analytics and sentiment analysis of online consumer conversations.
  • Textalytics – "Meaning as a Service": a set of text analytics APIs that offer vertical, high-level functionality targeted at specific usage scenarios, including semantic publishing, media analysis, and voice of the customer.
  • WordStat – content analysis and text mining add-on module of QDA Miner for analyzing large amounts of text data.
  • Xpresso – an engine developed by Abzooba's core technology group, focused on the automated distillation of expressions in social media conversations.[3]

Commercial and Research

  • RxNLP API for Text Mining and NLP – text mining APIs for both research and commercial use. The APIs include n-gram generation, sentence clustering, opinion summarization, and others.
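
N-gram generation, one of the operations named in the entry above, can be illustrated with a minimal Python sketch. This is not the RxNLP API or any vendor's implementation; the function names `ngrams` and `ngram_counts` are hypothetical, and the tokenizer is a naive lowercase whitespace split chosen only to keep the example self-contained.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams (as tuples) in a token sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n):
    """Count n-grams using a naive lowercase whitespace tokenizer (an assumption)."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n))


counts = ngram_counts("text mining finds patterns in text mining corpora", 2)
print(counts.most_common(1))  # [(('text', 'mining'), 2)]
```

Production tools typically replace the whitespace split with language-aware tokenization and may filter stop words before counting.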

Open source

  • Carrot2 – text and search results clustering framework.
  • GATE – General Architecture for Text Engineering, an open-source toolbox for natural language processing and language engineering
  • Gensim – large-scale topic modelling and extraction of semantic information from unstructured text (Python)
  • OpenNLP – natural language processing
  • Natural Language Toolkit (NLTK) – a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language.
  • Text Mechanic – simple, single-task, browser-based text manipulation tools.[4]
  • The programming language R provides a framework for text mining applications in the package tm.[5] The Natural Language Processing task view contains tm and other text mining library packages.[6]
  • The KNIME Text Processing extension.
  • KH Coder – for content analysis, text mining or corpus linguistics.
  • The PLOS Text Mining Collection[7]


  1. ^ Alba, Davey (12 February 2015). "The Startup That Helps You Analyze Twitter Chatter in Real Time". Wired. Retrieved 4 March 2015.
  2. ^ Lohr, Steve (27 June 2014). "The U.S.-Germany Match Through a Social Media Lens". New York Times. Retrieved 4 March 2015. 
  3. ^ ":: Welcome to Abzooba ::". Retrieved 2013-10-13. 
  4. ^ "Text Mechanic". TextMechanic. 
  5. ^ Introduction to the tm Package: Text Mining in R
  6. ^ CRAN Task View: Natural Language Processing
  7. ^ "Table of Contents: Text Mining". PLOS. 

External links