The Wikipedia Library
OCLC, the Online Computer Library Center, is a "nonprofit, membership, computer library service and research organization dedicated to the public purposes of furthering access to the world’s information and reducing information costs". Founded in 1967, OCLC is a worldwide library cooperative, owned, governed and sustained by its members. OCLC serves 72,000 institutions, archives and museums in 170 countries. OCLC provides economical services to libraries to help manage their collections and services in a cost-effective way that scales. OCLC is motivated by the idea that libraries should go to the people, to keep track of the world's information to best serve researchers and scholars and the public.
OCLC's mission is to "work together to improve access to the information held in libraries around the globe and find ways to reduce costs for libraries through collaboration..." Towards that end, OCLC aims to "establish, maintain and operate a computerized library network and to promote the evolution of library use, of libraries themselves and of librarianship, and to provide processes and products for the benefit of library users and libraries, including such objectives as increasing availability of library resources to individual library patrons and reducing the rate-of-rise of library per-unit costs, all for the fundamental public purpose of furthering ease of access to and use of the ever-expanding body of worldwide scientific, literary and educational knowledge and information."
OCLC also conducts research for the library community, and makes its research outcomes known through various publications. The organization advocates for “advancing research, scholarship, education, community development, information access, and global cooperation.” OCLC connects libraries in a community dedicated to the values of librarianship: cooperation, resource sharing and universal access.
The OCLC network links members to its online infrastructure providing intelligent databases and a cooperative platform to collectively innovate and drive efficiency in metadata creation, interlibrary loan, digitization, discovery and delivery. OCLC provides bibliographic, abstract and full-text information to anyone. The Open WorldCat program makes records of library-owned materials in OCLC's WorldCat database available to Web users on popular Internet search, bibliographic, and bookselling sites.
OCLC and its member libraries cooperatively produce and maintain WorldCat—the OCLC Online Union Catalog, the largest online public access catalog (OPAC) in the world. WorldCat has holding records from public and private libraries worldwide; there are 300 million bibliographic records in the catalog and 2 billion holdings from around the globe. Over half of the items in that catalog are non-English and over half are non-book.
Areas for collaboration
Full text reliable sources
OCLC will publish an API in fall 2013 to connect editors to electronic full text available online through affiliated libraries. This fulfillment service sends a query via the API to OCLC, which processes the query, affiliates the editor with a library, and checks WorldCat to see whether the requested citation is available within the library's collection. If there is a match, the API returns a link to Wikipedia that connects either directly to the requested full text or to the library's OpenURL resolver. Simply put, OCLC can deliver full-text sources directly to editors whose IP address alone would give them access to the source. Open access full-text sources can be displayed to anyone who finds them on the Web.
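The fulfillment flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OCLC's actual API: the `Library` record, the sample IP range, the holdings table, the `affiliate` and `fulfil` functions, and the resolver query parameter are all hypothetical stand-ins for whatever the real service does.

```python
import ipaddress
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import urlencode


@dataclass
class Library:
    name: str
    network: str          # CIDR range the library authenticates by IP (hypothetical)
    resolver_url: str     # base URL of the library's OpenURL resolver (hypothetical)
    # OCLC number -> direct full-text URL, or None when only the resolver can route it
    holdings: Dict[str, Optional[str]]


# Made-up sample data standing in for WorldCat and the affiliation registry.
LIBRARIES = [
    Library(
        name="Example University Library",
        network="192.0.2.0/24",
        resolver_url="https://resolver.example.edu/openurl",
        holdings={"1001": "https://fulltext.example.edu/1001", "1002": None},
    ),
]


def affiliate(ip: str) -> Optional[Library]:
    """Return the library whose IP range contains the editor's address, if any."""
    addr = ipaddress.ip_address(ip)
    for lib in LIBRARIES:
        if addr in ipaddress.ip_network(lib.network):
            return lib
    return None


def fulfil(ip: str, oclc_number: str) -> Optional[str]:
    """Return a link for the citation, or None when there is no match."""
    library = affiliate(ip)
    if library is None or oclc_number not in library.holdings:
        return None
    direct = library.holdings[oclc_number]
    if direct is not None:
        return direct  # the "gold standard" case: direct full text
    # Otherwise hand off to the library's OpenURL resolver (parameter is illustrative).
    return library.resolver_url + "?" + urlencode({"rft_id": "info:oclcnum/" + oclc_number})
```

With this sample data, `fulfil("192.0.2.10", "1001")` returns the direct full-text link, `fulfil("192.0.2.10", "1002")` returns a resolver link, and an editor outside any affiliated range gets `None`, which is the case where no link should be shown.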
The WorldCat Search API can help editors to 1) check whether a book cited in an article is available in a local library, and 2) conduct broader research within WorldCat to locate further resources from nearby institutions. This production service is currently used within Wikipedia's Book sources page, among other resources. Simply put, OCLC can allow editors to search the library catalogues of nearby institutions without leaving their computer or even logging into a separate website.
The WorldCat Search API can also help editors discover collections that libraries have digitized and registered in WorldCat. These e-collections can be searched to find and access original records that may be useful as primary or secondary sources, and to identify materials, such as images, that may be copyright-compatible and could be added to an article or image collection.
WorldCat already serves several institutions that are active in the Wikipedia Loves Libraries (WLL) program. By working with OCLC, we can reach out to these institutions to provide greater access for Wikipedia editors and also try to bring in many more OCLC members as WLL partners.
Mission alignment and mutual benefit
OCLC and Wikipedia are both non-profit organizations that seek the distribution of knowledge to humanity. Wikipedia targets individuals through articles, while OCLC targets libraries and those who would benefit from better research. Wikipedia is an encyclopedia built on reliable sources, so access to the most comprehensive and up-to-date information on articles, books, and digital collections would be a powerful tool for its editors. Meanwhile, OCLC seeks to bring as many libraries as possible into its services and community, and sharing those institutions' collections with Wikipedia editors or readers amplifies the reach of each member library. Libraries have the mission of sharing their collections with as many people as possible; Wikipedia is where the majority of the world's readers are getting their information. The fit is natural and mutually beneficial.
OCLC has numerous relationships with publishers. These publishers are exploring more open access models, which we have leveraged several times previously in The Wikipedia Library's account donations. OCLC may be able to introduce us to an order of magnitude more such partners and help us to form relationships with them.
We also already have some nice connections with OCLC through their Wikipedian-in-Residence Max Klein, who has been working with Merrilee Proffitt at OCLC.
- Privacy. Wikipedia editors' private information must be fully respected and never used or held without consent. APIs can match up IP addresses with institutions; this address information is tightly controlled on Wikipedia, and any sharing of information with a third party such as a library or university would have to be fully disclosed and opt-in only. OCLC has agreed to build in whatever privacy protections we require. Hosting the API on Wikimedia Labs could increasingly permit IP information to be used without ever being disclosed or visible to anyone at Labs, even administrators, as Labs builds in that capability. Ideologically, OCLC comes from the vigilantly privacy-conscious library field and brings the same ethic to protecting privacy as Wikipedia does. One precedent in this area is the Forward to libraries navigation box which John Mark Ockerbloom (User:JohnMarkOckerbloom) set up. This feature dealt with any related WMF and Labs privacy issues; collaboration with OCLC would not present any more significant challenges than that.
- Branding. Any collaboration with Wikipedia has to respect Wikipedia's tremendous organizational reputation. Partnerships, even informal ones, cannot detract from that in any way. OCLC has offered to provide services informally, non-exclusively, free of charge, and without any branding whatsoever. What that might look like is a link on some Wikipedia page that says, "Find a library" or "Full text source". OCLC need not ever be mentioned, and they are fine with that arrangement.
- Effectiveness. While access to digital and library catalogues would be useful, it's not a use case that is highly valuable or in demand among Wikipedians. The gold standard use case, and the one we need to maximize and focus on, is the situation where a Wikipedia editor is shown a link to a full text source only when that source is available without any extra authentication. To the extent that OCLC can provide direct full text access where it was otherwise unavailable (or available only with difficulty), this collaboration becomes far more useful to us.
- Politics. OCLC is a non-profit organization held in excellent regard among its members. However, it competes with for-profit library service providers, and it has not been immune from controversy. A lawsuit was brought against OCLC by Innovative Interfaces that accused OCLC of "monopolistic practices"; the lawsuit was later dismissed without any findings. Even viewed in a more cynical light, the appearance of a collaboration with OCLC will have to be taken into account in any work that we do with them going forward. Generally speaking, OCLC, unlike its competitors, is application neutral and content neutral. There is no pay-for-placement in their database and there is no requirement that certain programs be used. OCLC has also recently built more positive relationships with competitors EBSCO, GALE, and ProQuest, as they serve the same users despite their differing non-profit and for-profit models. OCLC also has a "mandate to innovate", and their research has been a source of positive disruption in the industry. This is a feature, not a bug, but it puts OCLC in the spotlight when new features threaten established business models. Wikipedia is no stranger to that dynamic.
- Community approval. All decisions on Wikipedia come from the community and nothing can happen without community support. Work with OCLC will begin in an open phase of discussion to find areas of potential collaboration, then OCLC would demo their services. If what they show us seems useful, we can look into setting up API access on Wikimedia Labs, or on a Wikipedia-space page. Further integration will require extended community discussions and appropriate consensus.
- Address each of the above concerns fully
- Build a demo which can access the APIs
- Test data to measure percentage of "gold standard" hits, direct full text source access as a percentage of tested citations
- Map access to determine geographic impact
- Configure the APIs to maximally respect privacy
- Set up access through Wikimedia Labs
- Hold an on-wiki discussion about hosting a page on English Wikipedia which could access the API
- Research citation template integration with WorldCat IDs (OCLC numbers), similar to PMID, DOI, and ISBN resolvers
- Explore further areas for integration
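The measurement step in the list above (the percentage of "gold standard" hits among tested citations) could be computed along these lines. This is a minimal sketch, assuming each tested citation has already been classified into one of three outcomes; the outcome labels, the `hit_rates` function, and the sample data are hypothetical and are not results from any real test.

```python
from collections import Counter

# Hypothetical outcome labels for a tested citation.
GOLD = "direct_full_text"   # link goes straight to full text, no extra authentication
RESOLVER = "resolver_only"  # link goes to a library's OpenURL resolver
NO_ACCESS = "no_access"     # no affiliated library holds the item


def hit_rates(outcomes):
    """Return each outcome's share of the tested citations, as a percentage."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: 100.0 * n / total for kind, n in counts.items()}


# Made-up sample of four tested citations.
sample = [GOLD, GOLD, RESOLVER, NO_ACCESS]
rates = hit_rates(sample)
print(rates[GOLD])  # 50.0 for this made-up sample
```

Grouping the same outcomes by the editor's country or region before calling `hit_rates` would give the geographic breakdown that the mapping step asks for.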
- Jake Orlowitz, User:Ocaasi - project organization and integration with The Wikipedia Library
- Merrilee Proffitt, User:Merrilee - OCLC Research. Has been active and engaged with the library community about the importance of Wikipedia/Wikimedia.
- Doug Loynes - OCLC member development
- Cindy Cunningham - OCLC member relations
- Bruce Washburn - OCLC API development and research
- Roy Tennant - OCLC API development and research
- Jeff Penka - OCLC API development
- Max Klein, User:Maximilianklein - OCLC Wikipedian-in-Residence (working on VIAF bots and authority control for infoboxes and Wikidata)