List of text mining software

Text mining computer programs are available from many commercial vendors and open-source projects.


  • Amenity Analytics – develops cloud-based text analytics solutions using natural language processing and machine learning to draw insights at scale from any source of unstructured data.
  • Angoss – Angoss Text Analytics provides entity and theme extraction, topic categorization, sentiment analysis and document summarization capabilities via the embedded
  • AUTINDEX – a commercial text mining software package based on sophisticated linguistics, by IAI (Institute for Applied Information Sciences), Saarbrücken.
  • Autonomy – text mining, clustering and categorization software
  • Averbis – provides text analytics, clustering and categorization software, as well as terminology management and enterprise search
  • Basis Technology – provides a suite of text analysis modules to identify language, enable search in more than 20 languages, extract entities, and efficiently search for and translate entities.
  • Clarabridge – text analytics (text mining) software, including natural language processing (NLP), machine learning, clustering and categorization. Provides SaaS, hosted and on-premises text and sentiment analytics that enable companies to collect, listen to, analyze, and act on the Voice of the Customer (VOC) from both external sources (Twitter, Facebook, Yelp!, product forums, etc.) and internal sources (call center notes, CRM, Enterprise Data Warehouse, BI, surveys, emails, etc.).
  • DigitalMR – social media listening and text and image analytics tool for market research.
  • Endeca Technologies – provides software to analyze and cluster unstructured text.
  • FICO Score – provider of analytics.
  • General Sentiment – social intelligence platform that uses natural language processing to discover affinities between fans of brands and fans of traditional television shows in social media. Its stand-alone text analytics captures a social knowledge base on billions of topics, stored back to 2004.
  • IBM LanguageWare – the IBM suite for text analytics (tools and runtime).
  • IBM SPSS – provider of Modeler Premium (previously called IBM SPSS Modeler and IBM SPSS Text Analytics), which contains advanced NLP-based text analysis capabilities (multilingual sentiment, event and fact extraction) that can be used in conjunction with predictive modeling. Text Analytics for Surveys categorizes survey responses using NLP-based capabilities for further analysis or reporting.
  • Inxight – provider of text analytics, search, and unstructured visualization technologies. (Inxight was bought by Business Objects, which was in turn bought by SAP AG in 2008.)
  • Language Computer Corporation – text extraction and analysis tools, available in multiple languages.
  • Lexalytics – provider of the Salience Engine, a text analytics engine used in social media monitoring, voice of customer, survey analysis, and other applications. The software can merge the output of unstructured, text-based analysis with structured data to provide additional predictive variables for predictive models and association analysis.
  • LexisNexis – provider of business intelligence solutions based on an extensive news and company information content set. LexisNexis acquired DataOps to pursue search
  • Linguamatics – provider of natural language processing (NLP) based enterprise text mining and text analytics software, I2E, for high-value knowledge discovery and decision support.
  • Luminoso – enterprise feedback and text analytics solutions developed over a decade of natural language processing (NLP), machine learning and artificial intelligence research at MIT Media Lab. Enables clients to understand, measure and act on large amounts of consumer feedback, across multiple channels.[1][2]
  • Mathematica – provides built-in tools for text alignment, pattern matching, clustering and semantic analysis. See Wolfram Language, the programming language of Mathematica.
  • MATLAB – offers Text Analytics Toolbox for importing text data and converting it to numeric form for use in machine learning and deep learning, sentiment analysis and classification tasks.[3]
  • MeaningCloud – formerly known as Textalytics; a set of customizable text analytics APIs offered both as SaaS and on-premises, with SDKs and plug-ins for integration into other systems.
  • Medallia – offers one system of record for survey, social, text, written and online feedback.
  • Megaputer Intelligence – provides software that derives actionable knowledge from large volumes of text and structured data, using natural language processing (NLP), machine learning, sentiment analysis, entity extraction, clustering, and categorization.
  • NetOwl – suite of multilingual text and entity analytics products, including entity extraction, link and event extraction, sentiment analysis, geotagging, name translation, name matching, and identity resolution, among others.
  • PoolParty Semantic Suite – used to develop knowledge graphs that structure and represent prioritised knowledge domains. The PoolParty service extracts entities and terms using a text mining algorithm.
  • RapidMiner with its Text Processing Extension – data and text mining software.
  • SAS – SAS Text Miner and Teragram; commercial text analytics, natural language processing, and taxonomy software used for Information Management.
  • Semantria – a spinoff of the text-analysis software Lexalytics that is offered via API and Excel plugin; it incorporates a bigger knowledge base and uses deep learning.
  • Sketch Engine – corpus manager and analysis software that supports creating text corpora from uploaded texts, the Web, or a particular website, including part-of-speech tagging and lemmatization.[4]
  • Smartlogic – Semaphore; Content Intelligence platform containing commercial text analytics, natural language processing, rule-based classification, ontology/taxonomy modelling and information visualization software used for Information Management.
  • StatSoft – provides STATISTICA Text Miner as an optional extension to STATISTICA Data Miner, for Predictive Analytics Solutions.
  • Sysomos – provider of a social media analytics software platform, including text analytics and sentiment analysis of online consumer conversations.
  • WordStat – content analysis and text mining add-on module of QDA Miner for analyzing large amounts of text data.

Open source

  • Carrot2 – text and search results clustering framework.
  • Coding Analysis Toolkit – CAT is a free, web-based, and open source text analysis service. Load, code, and annotate text data in teams. Measure inter-rater reliability and adjudicate differences between coders. Report on the accuracy of codes and coders over time. Train better coders through systematic iterations.
  • GATE – General Architecture for Text Engineering, an open-source toolbox for natural language processing and language engineering.
  • Gensim – large-scale topic modelling and extraction of semantic information from unstructured text (Python).
  • Natural Language Toolkit (NLTK) – a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for the Python programming language.
  • OpenNLP – natural language processing.
  • Orange with its text mining add-on.
  • Stanbol – an open source text mining engine targeted at semantic content management.
  • The programming language R provides a framework for text mining applications in the package tm.[5] The Natural Language Processing task view contains tm and other text mining library packages.[6]
  • The KNIME Text Processing extension.
  • The PLOS Text Mining Collection.[7]
  • Voyant Tools – a web-based text analysis environment, created as a scholarly project.
  • spaCy – open-source natural language processing library for Python.


  1. Alba, Davey (12 February 2015). "The Startup That Helps You Analyze Twitter Chatter in Real Time". Wired. Retrieved 4 March 2015.
  2. Lohr, Steve (27 June 2014). "The U.S.-Germany Match Through a Social Media Lens". The New York Times. Retrieved 4 March 2015.
  3. "Text Analytics Toolbox". Retrieved 10 July 2019.
  4. "Text analysis with Sketch Engine". Sketch Engine. Lexical Computing CZ s.r.o. Retrieved 17 January 2018.
  5. "Introduction to the tm Package: Text Mining in R".
  6. "CRAN Task View: Natural Language Processing".
  7. "Table of Contents: Text Mining". PLOS.

External links