From Wikipedia, the free encyclopedia
Original author(s): Douglas Lenat
Developer(s): Cycorp, Inc.
Initial release: 1984
Stable release: 4.0 / 13 June 2012
Written in: Lisp, CycL
Type: Ontology and inference engine

Cyc (/ˈsaɪk/) is an artificial intelligence project that attempts to assemble a comprehensive ontology and knowledge base of everyday common-sense knowledge, with the goal of enabling AI applications to perform human-like reasoning.

The project was started in 1984 by Douglas Lenat at MCC and is developed by Cycorp. Until early 2017, parts of the project were released as OpenCyc, which provided an API, an RDF endpoint, and a data dump under an open-source license.


The project was started in 1984 as part of the US Government-sponsored Microelectronics and Computer Technology Corporation (MCC) of Bobby Ray Inman, a former deputy director of the Central Intelligence Agency, in order "to counter a then ominous Japanese effort in AI, the 'fifth-generation' project."[1]

The objective was to codify, in machine-usable form, the millions of pieces of knowledge that compose human common sense. CycL presented a proprietary knowledge representation schema that utilized first-order relationships.[2] In 1986, Doug Lenat estimated that completing Cyc would require 250,000 rules and 350 man-years of effort.[3] The Cyc Project was spun off into Cycorp, Inc. in Austin, Texas, in 1994.

The name "Cyc" (from "encyclopedia", pronounced [saɪk], like "syke") is a registered trademark owned by Cycorp. The original knowledge base is proprietary, but a smaller version of it, intended to establish a common vocabulary for automatic reasoning, was available until early 2017 as OpenCyc under an open source (Apache) license. More recently, Cyc has been made available to AI researchers under a research-purposes license as ResearchCyc.

Typical pieces of knowledge represented in the database are "Every tree is a plant" and "Plants die eventually". When asked whether trees die, the inference engine can draw the obvious conclusion and answer the question correctly. The Knowledge Base (KB) contains over one million human-defined assertions, rules or common sense ideas. These are formulated in the language CycL, which is based on predicate calculus and has a syntax similar to that of the Lisp programming language.

Much of the current work on the Cyc project continues to be knowledge engineering, representing facts about the world by hand, and implementing efficient inference mechanisms on that knowledge. Increasingly, however, work at Cycorp involves giving the Cyc system the ability to communicate with end users in natural language, and to assist with the knowledge formation process via machine learning.

Like many companies, Cycorp has ambitions to use Cyc's natural language processing[4] to parse the entire internet and extract structured data.[5]

In 2008, Cyc resources were mapped to many Wikipedia articles,[6] potentially making it easier to connect Cyc with other open datasets such as DBpedia and Freebase.

Knowledge base

The concept names in Cyc are known as constants. Constants start with an optional "#$" and are case-sensitive. There are constants for:

  • Individual items known as individuals, such as #$BillClinton or #$France.
  • Collections, such as #$Tree-ThePlant (containing all trees) or #$EquivalenceRelation (containing all equivalence relations). A member of a collection is called an instance of that collection.
  • Functions, which produce new terms from given ones. For example, #$FruitFn, when provided with an argument describing a type (or collection) of plants, will return the collection of its fruits. By convention, function constants start with an upper-case letter and end with the string "Fn".
  • Truth functions, which can be applied to one or more other concepts and return either true or false. For example, #$siblings is the sibling relationship, true if the two arguments are siblings. By convention, truth function constants start with a lower-case letter. Truth functions may be broken down into logical connectives (such as #$and, #$or, #$not, #$implies), quantifiers (#$forAll, #$thereExists, etc.) and predicates.
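The naming conventions in the list above can be turned into a rough classifier. The following Python sketch is an illustration only: the function name `classify_constant` and its labels are assumptions of this example, not part of Cyc, and spelling alone cannot tell individuals from collections.

```python
def classify_constant(name: str) -> str:
    """Guess the kind of a Cyc constant from its spelling alone.

    Only the two conventions stated in the text are used: truth functions
    start with a lower-case letter, and function constants start with an
    upper-case letter and end in "Fn". Everything else is reported as
    "individual or collection", since names do not distinguish the two.
    """
    # The "#$" prefix is optional, so strip it when present.
    bare = name[2:] if name.startswith("#$") else name
    if not bare:
        raise ValueError("empty constant name")
    if bare[0].islower():
        return "truth function"
    if bare[0].isupper() and bare.endswith("Fn"):
        return "function"
    return "individual or collection"


for constant in ("#$FruitFn", "#$siblings", "#$implies", "#$France", "Tree-ThePlant"):
    print(constant, "->", classify_constant(constant))
```

A real system would consult the knowledge base (for example, the #$isa assertions about the constant) rather than rely on spelling; the sketch only shows what the conventions let a reader infer at a glance.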

The most important predicates are #$isa and #$genls. The first states that an item is an instance of some collection; the second states that one collection is a subcollection of another. Facts about concepts are asserted using CycL sentences. Predicates are written before their arguments, in parentheses:

 (#$isa #$BillClinton #$UnitedStatesPresident)

"Bill Clinton belongs to the collection of U.S. presidents" and

 (#$genls #$Tree-ThePlant #$Plant)

"All trees are plants".

 (#$capitalCity #$France #$Paris)

"Paris is the capital of France."
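Because CycL sentences are parenthesized prefix expressions, they can be read with a very small S-expression reader. The Python sketch below is an illustration under stated assumptions: the `tokenize` and `parse` helpers are hypothetical and are not Cyc's own reader, and they handle only parentheses and whitespace-separated tokens (no quoted strings).

```python
def tokenize(text: str) -> list:
    """Split a CycL sentence into parentheses and bare tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(text: str):
    """Parse one CycL sentence into nested tuples of token strings."""
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            # Read sub-expressions until the matching ")" is reached.
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(read())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume ")"
            return tuple(items)
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        return token

    result = read()
    if pos != len(tokens):
        raise ValueError("trailing input after sentence")
    return result


predicate, *arguments = parse("(#$capitalCity #$France #$Paris)")
print(predicate, arguments)
```

The predicate is always the first element of the resulting tuple, which mirrors the "predicate before arguments" layout of the sentences above.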

Sentences can also contain variables, strings starting with "?". These sentences are called "rules". One important rule asserted about the #$isa predicate reads

 (#$implies
   (#$and
     (#$isa ?OBJ ?SUBSET)
     (#$genls ?SUBSET ?SUPERSET))
   (#$isa ?OBJ ?SUPERSET))

with the interpretation "if OBJ is an instance of the collection SUBSET and SUBSET is a subcollection of SUPERSET, then OBJ is an instance of the collection SUPERSET". Another typical example is

 (#$relationAllExists #$biologicalMother #$ChordataPhylum #$FemaleAnimal)

which means that for every instance of the collection #$ChordataPhylum (i.e. for every chordate), there exists a female animal (instance of #$FemaleAnimal) which is its mother (described by the predicate #$biologicalMother).
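The #$isa/#$genls rule quoted earlier in this section amounts to following subcollection links upward from each collection an object belongs to. The Python sketch below illustrates that reading; it is not how Cyc's inference engine is implemented. The dictionaries, the individual `OakInMyGarden` and the collection `LivingThing` are hypothetical names introduced for the example, while the fact that #$Tree-ThePlant is a subcollection of #$Plant comes from the text.

```python
# Direct "instance of" facts: object -> collections (hypothetical individual).
ISA = {"OakInMyGarden": {"Tree-ThePlant"}}

# Direct "subcollection of" facts: collection -> supercollections.
# Tree-ThePlant -> Plant is from the text; Plant -> LivingThing is assumed.
GENLS = {"Tree-ThePlant": {"Plant"}, "Plant": {"LivingThing"}}


def all_genls(collection, genls):
    """Return the collection plus every direct or indirect supercollection."""
    result = set()
    stack = [collection]
    while stack:
        current = stack.pop()
        if current in result:
            continue  # already visited; also guards against cycles
        result.add(current)
        stack.extend(genls.get(current, ()))
    return result


def all_isa(obj, isa, genls):
    """Apply the rule: isa(OBJ, SUBSET) and genls(SUBSET, SUPERSET) => isa(OBJ, SUPERSET)."""
    collections = set()
    for direct in isa.get(obj, ()):
        collections |= all_genls(direct, genls)
    return collections


print(sorted(all_isa("OakInMyGarden", ISA, GENLS)))
```

Only one #$isa fact is stored for the object; its membership in the more general collections is derived on demand rather than asserted by hand.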

The knowledge base is divided into microtheories (Mt), collections of concepts and facts typically pertaining to one particular realm of knowledge. Unlike the knowledge base as a whole, each microtheory is required to be free from contradictions. Each microtheory has a name which is a regular constant; microtheory constants contain the string "Mt" by convention. An example is #$MathMt, the microtheory containing mathematical knowledge. The microtheories can inherit from each other and are organized in a hierarchy: one specialization of #$MathMt is #$GeometryGMt, the microtheory about geometry.
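Inheritance between microtheories can be pictured as follows: a query posed in a specialized microtheory also sees the facts of the microtheories it inherits from. The Python sketch below is a simplified illustration, not Cyc's actual mechanism. The `PARENTS` and `FACTS` tables and the placeholder fact strings are assumptions made for the example; only the two microtheory names come from the text.

```python
# Each microtheory lists the microtheories it inherits from.
PARENTS = {"#$GeometryGMt": ["#$MathMt"], "#$MathMt": []}

# Placeholder facts asserted directly in each microtheory (hypothetical).
FACTS = {
    "#$MathMt": {"general-math-fact"},
    "#$GeometryGMt": {"geometry-fact"},
}


def visible_facts(microtheory):
    """Collect the facts asserted in a microtheory and in all of its ancestors."""
    seen = set()
    facts = set()
    stack = [microtheory]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        facts |= FACTS.get(current, set())
        stack.extend(PARENTS.get(current, []))
    return facts


print(sorted(visible_facts("#$GeometryGMt")))
print(sorted(visible_facts("#$MathMt")))
```

Facts flow only from the general microtheory to the specialized one, so each context can stay internally consistent even if the knowledge base as a whole is not.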

Inference engine

An inference engine is a computer program that tries to derive answers from a knowledge base. The Cyc inference engine performs general logical deduction (including modus ponens, modus tollens, universal quantification and existential quantification).[7]
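Two of the deduction patterns named above can be illustrated on simple propositions. The Python sketch below is a toy under stated assumptions: rules are plain (premise, conclusion) pairs, the function names are hypothetical, and nothing here reflects how the Cyc engine is built or optimized.

```python
def modus_ponens(rules, known_true):
    """From "P implies Q" and P, conclude Q; repeat until nothing new appears."""
    derived = set(known_true)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


def modus_tollens(rules, known_false):
    """From "P implies Q" and not Q, conclude not P; repeat until stable."""
    refuted = set(known_false)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if conclusion in refuted and premise not in refuted:
                refuted.add(premise)
                changed = True
    return refuted


# Rules paraphrasing the examples used earlier in the article.
RULES = [("x is a tree", "x is a plant"), ("x is a plant", "x dies eventually")]

print(sorted(modus_ponens(RULES, {"x is a tree"})))
print(sorted(modus_tollens(RULES, {"x dies eventually"})))
```

The first call derives that a tree is a plant and dies eventually; the second shows the reverse direction, where something that never dies cannot be a plant and therefore cannot be a tree.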



OpenCyc

The first version of OpenCyc was released in spring 2002 and contained only 6,000 concepts and 60,000 facts. The knowledge base was released under the Apache License. Cycorp stated its intention to release OpenCyc under parallel, unrestricted licenses to meet the needs of its users. The CycL and SubL interpreter (the program that allows users to browse and edit the database as well as to draw inferences) was released free of charge, but only as a binary, without source code. It was made available for Linux and Microsoft Windows. The open source Texai[8] project released the RDF-compatible content extracted from OpenCyc.[9]

The latest version of OpenCyc, 4.0, was released in June 2012. OpenCyc 4.0 included the entire Cyc ontology, containing hundreds of thousands of terms along with millions of assertions relating the terms to each other; however, these are mainly taxonomic assertions, not the complex rules available in Cyc. The knowledge base contained 239,000 concepts and 2,093,000 facts.

As of 2017, OpenCyc is no longer available.


ResearchCyc

In July 2006, Cycorp released the executable of ResearchCyc 1.0, a version of Cyc aimed at the research community, at no charge. (ResearchCyc was in the beta stage of development throughout 2004; a beta version was released in February 2005.) In addition to the taxonomic information contained in OpenCyc, ResearchCyc includes significantly more semantic knowledge (i.e., additional facts) about the concepts in its knowledge base, as well as a large lexicon, English parsing and generation tools, and Java-based interfaces for knowledge editing and querying. It also contains a system for ontology-based data integration.


Terrorism Knowledge Base

The comprehensive Terrorism Knowledge Base is an application of Cyc, under development, that aims ultimately to contain all relevant knowledge about "terrorist" groups, their members, leaders, ideology, founders, sponsors, affiliations, facilities, locations, finances, capabilities, intentions, behaviors, tactics, and full descriptions of specific terrorist events. The knowledge is stored as statements in mathematical logic, suitable for computer understanding and reasoning.[10]


Cyclopedia

Cyclopedia is being developed; it superimposes Cyc keywords on pages taken from Wikipedia.[11][12]

Cleveland Clinic Foundation

The Cleveland Clinic has used Cyc to develop a natural language query interface for biomedical information.[13] A query is parsed into a set of CycL (higher-order logic) fragments with open variables; various constraints are then applied (medical domain knowledge, common sense, discourse pragmatics, syntax), and the fragments are finally fit together into one semantically meaningful formal query.[14]


Criticisms

The Cyc project has been described as "one of the most controversial endeavors of the artificial intelligence history".[15] Machine-learning scientist Pedro Domingos refers to the project as a "catastrophic failure" for several reasons, including the unending amount of data required to produce any viable results and the inability of Cyc to evolve on its own.[16]

Notable employees

This is a list of notable people who work or have worked on Cyc either as employees of MCC (where Cyc was first started) or Cycorp.

See also


References

  1. "The World in a Box". Scientific American. Retrieved 2017-06-08.
  2. Lenat, Douglas. "Hal's Legacy: 2001's Computer as Dream and Reality. From 2001 to 2001: Common Sense and the Mind of HAL". Cycorp, Inc. Archived from the original on 2006-10-06. Retrieved 2006-09-26.
  3. The Editors of Time-Life Books (1986). Understanding Computers: Artificial Intelligence. Amsterdam: Time-Life Books. p. 84. ISBN 0-7054-0915-5.
  4. "Cyc's Natural Language".
  5. "Cyc R&D". Archived from the original on 2009-02-20. Retrieved 2009-02-19.
  6. "Integrating Cyc and Wikipedia: Folksonomy meets rigorously defined common-sense" (PDF). Retrieved 2013-05-10.
  7. "cyc Inference engine". Retrieved 2015-06-04.
  8. The open source Texai project.
  9. Texai SourceForge project files.
  10. "The Comprehensive Terrorism Knowledge Base in Cyc". CiteSeerX accessible.
  11. "DBpedia and (Open-)Cyc". Archived from the original on 2007-10-11. Retrieved 2009-06-09.
  12. Cyclopedia sample showing a highlighted Cyc concept for family. Archived 2010-07-09 at the Wayback Machine.
  13.
  14.
  15. Bertino, Piero & Zarri 2001, p. 275.
  16. Domingos, Pedro (2015). The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World. ISBN 978-0465065707.

Further reading

External links