California Digital Newspaper Collection

From Wikipedia, the free encyclopedia

The California Digital Newspaper Collection (CDNC) is a freely available archive of digitized California newspapers, accessible through the project's website. The collection contains 433,033 issues comprising 4,976,984 pages and 32,437,924 articles.[1] The project is part of the Center for Bibliographical Studies and Research (CBSR) at the University of California, Riverside.


The Center for Bibliographical Studies and Research was one of six initial participants in the National Digital Newspaper Program (NDNP),[2] a newspaper digitization project established through a partnership between the Library of Congress and the National Endowment for the Humanities. Between 2005 and 2011, the CBSR received three two-year grants and contributed around 300,000 pages to Chronicling America,[3] the public face of the NDNP. Newspaper titles submitted include the San Francisco Call, Los Angeles Daily Herald, Amador Ledger, and Imperial Valley Press. In 2015, the CBSR received a fourth grant from the National Digital Newspaper Program. Between 2015 and 2017, the project contributed another 100,000 pages of newspapers from the Gold Rush era, as well as foreign-language newspapers.

The California Digital Newspaper Collection was officially launched in 2007 and contained the initial 100,000 pages produced for the National Digital Newspaper Program from 2005 to 2007. Another 50,000 pages were created with support from the Institute of Museum and Library Services under the provisions of the Library Services and Technology Act (LSTA), administered in California by the State Librarian. All content contributed to the NDNP is also hosted in the CDNC, with important differences noted below in Digitization. Between 2007 and 2013, the CDNC digitized roughly 300,000 pages through the LSTA program, administered by the California State Library. In 2014, the project announced a five-year plan, supported by LSTA, to digitize one title per county, up through 1923.[4]

In 2010, the CDNC initiated the Born Digital Project, with the goal of collecting and hosting contemporary PDFs from newspaper publishers. Roughly a dozen publishers have participated or currently participate in the project. See the California Digital Newspaper Collection website for more information.

Digitization


The California Digital Newspaper Collection follows standards established by the National Digital Newspaper Program. Microfilm or newsprint is scanned to create TIFF images; whenever possible, master negative film is used. The CBSR manages an archive of approximately 100,000 reels of negative film. These are stored and maintained by the California Newspaper Microfilm Archive.[5] When negative film is not available, positive film can be used, but image quality and optical character recognition (OCR) accuracy will not be as good.

The TIFF images are then processed, or "digitized", to create derivative files, including a JP2 (JPEG 2000) image, a PDF, and METS/ALTO XML for each page.
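
As an illustration of what the ALTO part of these derivatives holds, the following minimal Python sketch reads OCR text out of an ALTO-style XML document. The sample fragment, `SAMPLE_ALTO`, and the helper names `local_name` and `page_text` are hypothetical and heavily simplified; they are not taken from the CDNC's own files or tooling. The sketch relies only on the ALTO convention that recognized words are stored in the `CONTENT` attribute of `String` elements grouped under `TextLine` elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified ALTO fragment for one page. Real files carry a
# version-specific XML namespace and many more elements and attributes.
SAMPLE_ALTO = """<alto>
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="SAN" HPOS="100" VPOS="50" WIDTH="60" HEIGHT="20"/>
            <SP/>
            <String CONTENT="FRANCISCO" HPOS="170" VPOS="50" WIDTH="150" HEIGHT="20"/>
          </TextLine>
          <TextLine>
            <String CONTENT="CALL" HPOS="100" VPOS="80" WIDTH="70" HEIGHT="20"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Drop any '{namespace}' prefix so the same code handles namespaced files."""
    return tag.rsplit("}", 1)[-1]


def page_text(alto_xml):
    """Return the OCR text of a page, one output line per ALTO TextLine."""
    root = ET.fromstring(alto_xml)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) == "TextLine":
            # Each String child holds one recognized word in CONTENT.
            words = [child.get("CONTENT", "") for child in elem
                     if local_name(child.tag) == "String"]
            lines.append(" ".join(words))
    return "\n".join(lines)


print(page_text(SAMPLE_ALTO))
```

The position attributes (`HPOS`, `VPOS`, `WIDTH`, `HEIGHT`) are ignored here; a viewer would typically use them to highlight search hits on the page image.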

Unlike the NDNP, the CDNC has traditionally digitized to article level rather than just page level. Individual "segments" on a page (articles, illustrations, advertisements, etc.) are identified during digitization and can be retrieved by the researcher. For an illustration of the difference between page-level and article-level digitization, compare the San Francisco Call in the CDNC to the same title in Chronicling America.
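
The practical difference can be sketched with a small, purely hypothetical data model in Python. The class and method names (`Segment`, `Page`, `page_level_text`, `search_segments`) and the sample text are assumptions made for illustration and do not describe the CDNC's actual software. With page-level data only the combined text of a page is available, while article-level data lets a search return the individual segment that matched.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One identified region of a page (hypothetical model)."""
    kind: str   # e.g. "article", "illustration", "advertisement"
    title: str
    text: str


@dataclass
class Page:
    number: int
    segments: list = field(default_factory=list)

    def page_level_text(self):
        """Page-level view: all OCR text on the page as one undivided string."""
        return "\n".join(s.text for s in self.segments)

    def search_segments(self, term):
        """Article-level view: return only the segments that contain the term."""
        term = term.lower()
        return [s for s in self.segments if term in s.text.lower()]


# Invented sample content, for illustration only.
page = Page(1, [
    Segment("article", "Harbor News", "The steamer arrived in port on Tuesday."),
    Segment("advertisement", "Dry Goods", "Fine linens for sale on Market Street."),
])

hits = page.search_segments("steamer")
print([s.title for s in hits])
```

A page-level search for the same term could only report that page 1 matched, leaving the reader to locate the passage on the page.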

Recently, the CDNC has begun digitizing some titles to page level, but most are still digitized to article level. The main advantage of page-level digitization is lower cost, since it can be done in an automated fashion without human input.

Papers covered


  1. "California Digital Newspaper Collection". Retrieved 2019-11-12.
  2. "National Digital Newspaper Program | Library of Congress".
  3. National Endowment for the Humanities. "Chronicling America | Library of Congress".
  4. "California Digital Newspaper Collection". Retrieved 2019-11-12.
  5. "California Newspaper Microfilm Archive".

External links