Spark NLP
This article, Spark NLP, has recently been created via the Articles for creation process. Please check to see if the reviewer has accidentally left this template after accepting the draft and take appropriate action as necessary.
Original author(s) | John Snow Labs |
---|---|
Initial release | October 2017[1] |
Stable release | 2.0 / March 2019 |
Repository | github |
Written in | Python, Scala |
Operating system | Linux, Windows, macOS |
Type | Natural language processing |
License | Apache License 2.0 |
Website | www |
Spark NLP is an open-source text processing library built on top of Apache Spark and its Spark ML library.[2][3][4][5][6] It aims to provide an API for natural language processing annotations that scales across large distributed environments.
Main features
Several annotators are provided out of the box for both Python and Scala:
- Tokenizer: Word tokens
- Normalizer: Text cleaning
- Stemmer: Hard stems
- Lemmatizer: Lemmas
- RegexMatcher: Rule matching
- TextMatcher: Phrase matching
- Chunker: Meaningful phrase matching
- DateMatcher: Date-time parsing
- SentenceDetector: Sentence boundary detection
- DeepSentenceDetector: Sentence boundary detection with machine learning
- POSTagger: Part-of-speech tagging
- ViveknSentimentDetector: Sentiment analysis
- SentimentDetector: Sentiment analysis
- Named Entity Recognition CRF annotator
- Named Entity Recognition Deep Learning annotator
- SpellChecker: Norvig algorithm
- SpellChecker: Symmetric delete
- Dependency Parser: Unlabeled grammatical relations
- Typed Dependency Parser: Labeled grammatical relations
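The way such annotators build on one another can be illustrated with a minimal plain-Python sketch. This is not Spark NLP's API and does not use Spark: the names Annotation, document, tokenize, normalize and run_pipeline are hypothetical stand-ins, and the tokenizing and cleaning rules are deliberately simplistic assumptions. It only shows the general idea that each stage reads the annotations produced by an earlier stage and adds a new named column of its own.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Annotation:
    """One annotated span: inclusive character offsets plus the resulting string."""
    begin: int
    end: int
    result: str


# A row maps column names to lists of annotations, loosely like one DataFrame row.
Row = Dict[str, List[Annotation]]


def document(text: str) -> Row:
    """Wrap raw text as a single 'document' annotation (hypothetical entry stage)."""
    return {"document": [Annotation(0, len(text) - 1, text)]}


def tokenize(row: Row) -> Row:
    """Toy tokenizer: every run of word characters becomes one token."""
    doc = row["document"][0]
    tokens = [
        Annotation(m.start(), m.end() - 1, m.group())
        for m in re.finditer(r"\w+", doc.result)
    ]
    return {**row, "token": tokens}


def normalize(row: Row) -> Row:
    """Toy normalizer: lowercase each token and drop anything that is not a-z."""
    cleaned = []
    for tok in row["token"]:
        text = re.sub(r"[^a-z]", "", tok.result.lower())
        if text:  # tokens that become empty (e.g. pure digits) are discarded
            cleaned.append(Annotation(tok.begin, tok.end, text))
    return {**row, "normalized": cleaned}


def run_pipeline(text: str, stages: List[Callable[[Row], Row]]) -> Row:
    """Apply the stages in order; each one adds a column to the row."""
    row = document(text)
    for stage in stages:
        row = stage(row)
    return row


row = run_pipeline("Spark NLP runs on Apache Spark 2.", [tokenize, normalize])
tokens = [a.result for a in row["token"]]
normalized = [a.result for a in row["normalized"]]
print(tokens)      # ['Spark', 'NLP', 'runs', 'on', 'Apache', 'Spark', '2']
print(normalized)  # ['spark', 'nlp', 'runs', 'on', 'apache', 'spark']
```

Because every stage keeps the earlier columns and preserves character offsets, a later stage can always trace its output back to the original text. In the library itself the equivalent stages would run as distributed Spark ML pipeline stages rather than as local functions.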
References
- ^ Talby, David. "Introducing the Natural Language Processing Library for Apache Spark". databricks.com. Databricks. Retrieved 2019-03-29.
- ^ Team, Editorial (2018-09-04). "The Use of NLP to Extract Unstructured Medical Data From Text". insideBIGDATA. Retrieved 2019-03-29.
- ^ "John Snow Labs' Natural Language Understanding Software Gets "State of the Art" Recognition in Three Industry Events". StartUp Beat. 2018-07-19. Retrieved 2019-03-29.
- ^ Ellafi, Saif Addin (2018-02-28). "Comparing production-grade NLP libraries: Running Spark-NLP and spaCy pipelines". O'Reilly Media. Retrieved 2019-03-29.
- ^ Ellafi, Saif Addin (2018-02-28). "Comparing production-grade NLP libraries: Accuracy, performance, and scalability". O'Reilly Media. Retrieved 2019-03-29.
- ^ Ewbank, Kay. "Spark Gets NLP Library". www.i-programmer.info.
Category:Software Category:Open-source artificial intelligence