spaCy

Original author(s): Matthew Honnibal
Developer(s): Explosion AI, various
Initial release: February 2015[1]
Stable release: 2.1.8 / 8 August 2019[2]
Written in: Python, Cython
Operating system: Linux, Windows, macOS, OS X
Platform: Cross-platform
Type: Natural language processing
License: MIT
Website: spacy.io

spaCy (/speɪˈsiː/ spay-SEE) is an open-source software library for advanced natural language processing, written in the programming languages Python and Cython.[3][4] The library is published under the MIT license and currently offers statistical neural network models for English, German, Spanish, Portuguese, French, Italian and Dutch, a multi-language model for named entity recognition (NER), and tokenization for various other languages.[5]
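
The tokenization support mentioned above can be tried without downloading a trained model. The following is a minimal sketch, assuming spaCy is installed; `spacy.blank("en")` builds a pipeline that contains only the English tokenizer, and the sample sentence is an arbitrary illustration rather than anything taken from the spaCy documentation.

```python
import spacy

# A blank pipeline contains only the language-specific tokenizer,
# so no trained statistical model has to be installed.
nlp = spacy.blank("en")

# Calling the pipeline on a string returns a Doc, a sequence of Token objects.
doc = nlp("spaCy is written in Python and Cython.")

tokens = [token.text for token in doc]
print(tokens)
```

Passing a different language code to `spacy.blank` selects that language's tokenization rules, provided spaCy ships rules for it.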

Unlike NLTK, which is widely used for teaching and research, spaCy focuses on providing software for production usage.[6][7] As of version 1.0, spaCy also supports deep learning workflows[8] that allow connecting statistical models trained by popular machine learning libraries such as TensorFlow, Keras, scikit-learn or PyTorch.[9] spaCy's machine learning library, Thinc, is also available as a separate open-source Python library.[10] spaCy 2.0 features convolutional neural network models for part-of-speech tagging, dependency parsing and named entity recognition, as well as API improvements around training and updating models and constructing custom processing pipelines.
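
The three annotation tasks named above (part-of-speech tagging, dependency parsing and named entity recognition) are exposed as attributes on the processed document. The sketch below is illustrative only: the helper `annotate` and the sample sentence are hypothetical, and it assumes the small English model `en_core_web_sm` has been installed separately. If the model is missing, the code falls back to a tokenizer-only pipeline, in which case the tag, dependency and entity fields are simply empty.

```python
import spacy


def annotate(nlp, text):
    """Run a pipeline over text and collect token-level and entity annotations.

    `annotate` is a hypothetical helper for this example, not part of spaCy.
    """
    doc = nlp(text)
    # For each token: surface form, coarse part-of-speech tag,
    # dependency label, and the text of its syntactic head.
    tokens = [(t.text, t.pos_, t.dep_, t.head.text) for t in doc]
    # Named entities are spans over the document with a label such as ORG or GPE.
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return tokens, entities


try:
    # Assumption: the model was installed beforehand, for example with
    # `python -m spacy download en_core_web_sm`.
    nlp = spacy.load("en_core_web_sm")
except OSError:
    # Fallback: tokenizer only, so tags, dependency labels and entities stay empty.
    nlp = spacy.blank("en")

tokens, entities = annotate(nlp, "Apple is looking at buying a startup in London.")
for text, pos, dep, head in tokens:
    print(text, pos, dep, head)
print(entities)
```

Keeping the pipeline object as a parameter of the helper means the same function works with any loaded model or language.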

Main features

Extensions and visualizers

Dependency parse tree visualization generated with the displaCy visualizer

spaCy comes with several extensions and visualizers, such as the displaCy dependency visualizer, that are available as free, open-source libraries.
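
As an illustration of the displaCy visualizer, the sketch below renders a dependency parse to SVG markup through the `spacy.displacy` module. It assumes spaCy 2.0 or later is installed. To avoid depending on a trained model, the parse is written by hand and passed with `manual=True`; the words, tags and arcs are made-up example data, not output from a spaCy model.

```python
from spacy import displacy

# Hand-written parse in displaCy's manual input format:
# "words" lists the tokens with their tags, "arcs" lists labelled
# dependency arrows between token indices.
parse = {
    "words": [
        {"text": "This", "tag": "DT"},
        {"text": "is", "tag": "VBZ"},
        {"text": "a", "tag": "DT"},
        {"text": "sentence", "tag": "NN"},
    ],
    "arcs": [
        {"start": 0, "end": 1, "label": "nsubj", "dir": "left"},
        {"start": 2, "end": 3, "label": "det", "dir": "left"},
        {"start": 1, "end": 3, "label": "attr", "dir": "right"},
    ],
}

# jupyter=False forces the markup to be returned as a string
# instead of being displayed inline in a notebook.
svg = displacy.render(parse, style="dep", manual=True, jupyter=False)
print(len(svg), "characters of SVG markup")
```

The returned string can be written to an `.svg` file or embedded in an HTML page. With a trained pipeline, a processed `Doc` can be passed to `displacy.render` directly, without `manual=True`.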

See also

References

  1. "Introducing spaCy". explosion.ai. Retrieved 2016-12-18.
  2. "Releases - explosion/spaCy". GitHub. Retrieved 2019-08-12.
  3. Choi et al. (2015). It Depends: Dependency Parser Comparison Using A Web-based Evaluation Tool.
  4. "Google's new artificial intelligence can't understand these sentences. Can you?". Washington Post. Retrieved 2016-12-18.
  5. "Models & Languages | spaCy Usage Documentation". spacy.io. Retrieved 2017-11-08.
  6. "Facts & Figures - spaCy". spacy.io. Retrieved 2017-11-08.
  7. Bird, Steven; Klein, Ewan; Loper, Edward; Baldridge, Jason (2008). "Multidisciplinary instruction with the Natural Language Toolkit" (PDF). Proceedings of the Third Workshop on Issues in Teaching Computational Linguistics, ACL.
  8. "explosion/spaCy". GitHub. Retrieved 2016-12-18.
  9. "Facts & Figures | spaCy Usage Documentation". spacy.io. Retrieved 2017-11-08.
  10. "explosion/thinc". GitHub. Retrieved 2016-12-30.
  11. "Models & Languages - spaCy". spacy.io. Retrieved 2017-11-08.
  12. "Models & Languages | spaCy Usage Documentation". spacy.io. Retrieved 2017-11-08.
  13. Trask et al. (2015). sense2vec - A Fast and Accurate Method for Word Sense Disambiguation In Neural Word Embeddings.

External links