
Digital preservation: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 23:31, 6 April 2014

In library and archival science, digital preservation is a formal endeavor to ensure that digital information of continuing value remains accessible and usable.[1] It involves planning, resource allocation, and application of preservation methods and technologies,[2] and it combines policies, strategies and actions to ensure access to reformatted and "born-digital" content, regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time.[3]

Challenges of digital preservation

Society's heritage has been presented on many different materials, including stone, vellum, bamboo, silk, and paper. Now a large quantity of information exists in digital forms, including emails, blogs, social networking websites, national elections websites, web photo albums, and sites which change their content over time. With digital media it is easier to create content and keep it up-to-date, but at the same time there are many challenges in the preservation of this content, both technical and economic.

Unlike traditional analog objects such as books or photographs where the user has unmediated access to the content, a digital object always needs a software environment to render it. These environments keep evolving and changing at a rapid pace, threatening the continuity of access to the content.[4] Physical storage media, data formats, hardware, and software all become obsolete over time, posing significant threats to the survival of the content.[3] This process can be referred to as digital obsolescence.

In the case of born-digital content (e.g., institutional archives, Web sites, electronic audio and video content, born-digital photography and art, research data sets, observational data), the enormous and growing quantity of content presents significant scaling issues to digital preservation efforts. Rapidly changing technologies can also hinder digital preservationists' work, because the machines and software on which content depends become outdated. Preparing for this future obsolescence is a common problem and a constant concern for digital archivists.

Digital content can also present challenges to preservation because of its complex and dynamic nature, e.g., interactive Web pages, virtual reality and gaming environments,[5] learning objects, and social media sites.[6] With emergent technologies there are often substantial difficulties in maintaining the authenticity, fixity, and integrity of objects over time, because there is little experience with the particular digital storage medium. While particular technologies may prove to be more robust in terms of storage capacity, it remains difficult to secure a framework of measures that ensures the object remains fixed while in stewardship.[2]

For the preservation of software as digital content, a specific challenge is that the source code is typically unavailable, as commercial software is normally distributed only in compiled binary form. Without the source code, adapting (porting) the software to modern computing hardware or operating systems is most often impossible, so the original hardware and software context needs to be emulated. Another potential challenge for software preservation is copyright law, which often prohibits the bypassing of copy protection mechanisms (Digital Millennium Copyright Act) in cases where software has become an orphaned work (abandonware). An exemption from the United States Digital Millennium Copyright Act permitting the bypassing of copy protection was approved in 2003 for a period of three years for the Internet Archive, which created an archive of "vintage software" as a way to preserve it.[7][8] The exemption was renewed in 2006 and, as of 27 October 2009, has been indefinitely extended pending further rulemakings[9] "for the purpose of preservation or archival reproduction of published digital works by a library or archive."[10]

Another challenge surrounding preservation of digital content resides in the issue of scale. The amount of digital information being created, along with the "proliferation of format types",[2] makes creating trusted digital repositories with adequate and sustainable resources a challenge. The Web is only one example of what might be considered the "data deluge".[2] For example, the Library of Congress amassed 170 billion tweets between 2006 and 2010, totaling 133.2 terabytes,[11] and each tweet is composed of 23 fields of metadata.[12]
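As a rough check on the figures quoted above, the average stored size per tweet can be estimated by dividing the total archive size by the number of tweets. The sketch below assumes decimal terabytes (10^12 bytes), which the cited figure does not specify; the function name is hypothetical.

```python
# Figures quoted in the text; treating "terabyte" as 10**12 bytes is an assumption.
TOTAL_BYTES = 133.2 * 10**12   # 133.2 terabytes
TWEET_COUNT = 170 * 10**9      # 170 billion tweets


def average_bytes_per_tweet(total_bytes: float, count: int) -> float:
    """Return the mean number of bytes stored per item."""
    return total_bytes / count


if __name__ == "__main__":
    print(average_bytes_per_tweet(TOTAL_BYTES, TWEET_COUNT))
```

Under that assumption the result is a little under 800 bytes per tweet, metadata fields included.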

The economic challenges of digital preservation are also great. Preservation programs require significant up front investment to create, along with ongoing costs for data ingest, data management, data storage, and staffing. One of the key strategic challenges to such programs is the fact that, while they require significant current and ongoing funding, their benefits accrue largely to future generations.[13]

Intellectual foundations of digital preservation

"Preserving Digital Information (1996)"

The challenges of long-term preservation of digital information have been recognized by the archival community for years.[14] In December 1994, the Research Libraries Group (RLG) and Commission on Preservation and Access (CPA) formed a Task Force on Archiving of Digital Information with the main purpose of investigating what needed to be done to ensure long-term preservation and continued access to the digital records. The final report published by the Task Force (Garrett, J. and Waters, D., ed. (1996). “Preserving digital information: Report of the task force on archiving of digital information.”[15]) became a fundamental document in the field of digital preservation that helped set out key concepts, requirements, and challenges.[14][16]

The Task Force proposed development of a national system of digital archives that would take responsibility for long-term storage and access to digital information; introduced the concept of trusted digital repositories and defined their roles and responsibilities; identified five features of digital information integrity (content, fixity, reference, provenance, and context) that were subsequently incorporated into a definition of Preservation Description Information in the Open Archival Information System Reference Model; and defined migration as a crucial function of digital archives. The concepts and recommendations outlined in the report laid a foundation for subsequent research and digital preservation initiatives.[17][18]


To standardize digital preservation practice and provide a set of recommendations for preservation program implementation, the Reference Model for an Open Archival Information System (OAIS) was developed. OAIS is concerned with all technical aspects of a digital object’s life cycle: ingest, archival storage, data management, administration, access and preservation planning.[19] The model also addresses metadata issues and recommends that five types of metadata be attached to a digital object: reference (identification) information, provenance (including preservation history), context, fixity (authenticity indicators), and representation (formatting, file structure, and what "imparts meaning to an object’s bitstream").[20]
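The five metadata types listed above can be pictured as a single record attached to each object. The following is a minimal sketch, not part of the OAIS standard: the class name, field layout, and the choice of a SHA-256 digest as the fixity value are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class PreservationMetadata:
    """Hypothetical record holding the five metadata types named in the text."""
    reference: str          # identification information
    provenance: List[str]   # preservation history, oldest event first
    context: str            # relationship of the object to its environment
    fixity: str             # here: a SHA-256 digest (an assumption)
    representation: str     # formatting / file structure, e.g. a media type


def describe(content: bytes, identifier: str, media_type: str,
             context: str) -> PreservationMetadata:
    """Build a metadata record for a bitstream at the point of ingest."""
    digest = hashlib.sha256(content).hexdigest()
    return PreservationMetadata(
        reference=identifier,
        provenance=["ingested"],
        context=context,
        fixity="sha256:" + digest,
        representation=media_type,
    )


if __name__ == "__main__":
    record = describe(b"example bitstream", "obj-0001", "text/plain",
                      "demonstration collection")
    print(record)
```

Recomputing the digest later and comparing it with the stored fixity value is one common way to detect that a bitstream has changed.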

Trusted Digital Repository Model

In March 2000, the Research Libraries Group (RLG) and Online Computer Library Center (OCLC) began a collaboration to establish attributes of a digital repository for research organizations, building on and incorporating the emerging international standard of the Reference Model for an Open Archival Information System (OAIS). In 2002, they published “Trusted Digital Repositories: Attributes and Responsibilities.” In that document a “Trusted Digital Repository” (TDR) is defined as "one whose mission is to provide reliable, long-term access to managed digital resources to its designated community, now and in the future." The TDR must include the following seven attributes: compliance with the reference model for an Open Archival Information System (OAIS), administrative responsibility, organizational viability, financial sustainability, technological and procedural suitability, system security, and procedural accountability. The Trusted Digital Repository Model outlines relationships among these attributes. The report also recommended the collaborative development of digital repository certifications, models for cooperative networks, and sharing of research and information on digital preservation with regard to intellectual property rights.[21]

In 2004 Henry M. Gladney proposed another approach to digital object preservation that called for the creation of “Trustworthy Digital Objects” (TDOs). TDOs are digital objects that can speak to their own authenticity since they incorporate a record maintaining their use and change history, which allows the future users to verify that the contents of the object are valid.[22]


International Research on Permanent Authentic Records in Electronic Systems (InterPARES) is a collaborative research initiative led by the University of British Columbia that is focused on addressing issues of long-term preservation of authentic digital records. The research is being conducted by focus groups from various institutions in North America, Europe, Asia, and Australia, with an objective of developing theories and methodologies that provide the basis for strategies, standards, policies, and procedures necessary to ensure the trustworthiness, reliability, and accuracy of digital records over time.[23]

The project began in 1999 with the first phase, InterPARES 1, which ran to 2001 and focused on establishing requirements for authenticity of inactive records generated and maintained in large databases and document management systems created by government agencies.[24] InterPARES 2 (2002–2007) concentrated on issues of reliability, accuracy and authenticity of records throughout their whole life cycle, and examined records produced in dynamic environments in the course of artistic, scientific and online government activities.[25] The third five-year phase (InterPARES 3) was initiated in 2007. Its goal is to utilize theoretical and methodological knowledge generated by InterPARES and other preservation research projects for developing guidelines, action plans, and training programs on long-term preservation of authentic records for small and medium-sized archival organizations.[26]


In 2006, the Online Computer Library Center developed a four-point strategy for the long-term preservation of digital objects that consisted of:

  • Assessing the risks for loss of content posed by technology variables such as commonly used proprietary file formats and software applications.
  • Evaluating the digital content objects to determine what type and degree of format conversion or other preservation actions should be applied.
  • Determining the appropriate metadata needed for each object type and how it is associated with the objects.
  • Providing access to the content.[27]

There are several additional strategies that individuals and organizations may use to actively combat the loss of digital information.


Refreshing is the transfer of data between two instances of the same type of storage medium, so that the data is not altered and bit rot is avoided.[20] An example is transferring census data from an old preservation CD to a new one. This strategy may need to be combined with migration when the software or hardware required to read the data is no longer available or is unable to understand the format of the data. Refreshing will likely always be necessary due to the deterioration of physical media.
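Refreshing can be sketched as a bit-for-bit copy followed by a check that nothing changed in transit. The example below is illustrative only: the function names are hypothetical, ordinary files stand in for storage media, and SHA-256 is an assumed choice of checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def refresh(source: Path, destination: Path) -> str:
    """Copy source to destination and verify that the bits are unchanged."""
    before = sha256_of(source)
    shutil.copyfile(source, destination)
    after = sha256_of(destination)
    if before != after:
        raise IOError(f"checksum mismatch after copying {source}")
    return after
```

A caller would keep the returned digest alongside the new copy so that the next refresh cycle can be verified against it.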


Migration is the transferring of data to newer system environments (Garrett et al., 1996). This may include conversion of resources from one file format to another (e.g., conversion of Microsoft Word to PDF or OpenDocument) or from one operating system to another (e.g., Windows to GNU/Linux) so the resource remains fully accessible and functional. Two significant problems face migration as a plausible method of digital preservation in the long term. First, because digital objects are subject to a state of near-continuous change, migration may cause problems in relation to authenticity. Second, migration has proven to be time-consuming and expensive for "large collections of heterogeneous objects, which would need constant monitoring and intervention".[2]
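The authenticity concern can be made concrete with a small sketch in which every migration records what was changed. Here a character-encoding conversion (Latin-1 to UTF-8) stands in for a real format migration such as Word to PDF; the function name and the event fields are hypothetical, not taken from any standard.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, Tuple


def migrate_text(data: bytes,
                 source_encoding: str = "latin-1") -> Tuple[bytes, Dict[str, str]]:
    """Re-encode text as UTF-8 and return it with a provenance event."""
    text = data.decode(source_encoding)
    migrated = text.encode("utf-8")
    event = {
        "event": "migration",
        "from": source_encoding,
        "to": "utf-8",
        # Digests of both versions let a later auditor link them together.
        "source_sha256": hashlib.sha256(data).hexdigest(),
        "result_sha256": hashlib.sha256(migrated).hexdigest(),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return migrated, event


if __name__ == "__main__":
    new_bytes, log_entry = migrate_text("caf\u00e9".encode("latin-1"))
    print(new_bytes, log_entry["event"])
```

The bitstream changes even though the text it represents does not, which is why the original digest and the conversion event are kept together with the migrated copy.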


Creating duplicate copies of data on one or more systems is called replication. Data that exists as a single copy in only one location is highly vulnerable to software or hardware failure, intentional or accidental alteration, and environmental catastrophes like fire, flooding, etc. Digital data is more likely to survive if it is replicated in several locations. Replicated data may introduce difficulties in refreshing, migration, versioning, and access control since the data is located in multiple places.
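One way replication helps is that damaged copies can be detected by comparing replicas with each other. The sketch below assumes a simple majority vote over SHA-256 digests, which is an illustrative choice rather than a method described in the source; the location names are hypothetical and ties are not handled.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Tuple


def audit_replicas(replicas: Dict[str, bytes]) -> Tuple[str, List[str]]:
    """Return the majority digest and the locations that disagree with it."""
    digests = {location: hashlib.sha256(data).hexdigest()
               for location, data in replicas.items()}
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    damaged = sorted(location for location, digest in digests.items()
                     if digest != majority_digest)
    return majority_digest, damaged


if __name__ == "__main__":
    copies = {
        "site-a": b"record",
        "site-b": b"record",
        "site-c": b"recorc",  # simulated corruption
    }
    print(audit_replicas(copies)[1])
```

A location reported as damaged could then be repaired by copying the object again from one of the agreeing sites.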


Emulation is the replicating of functionality of an obsolete system. According to van der Hoeven, "Emulation does not focus on the digital object, but on the hard- and software environment in which the object is rendered. It aims at (re)creating the environment in which the digital object was originally created."[28] In practice this means having the ability to replicate or imitate another operating system.[29] Examples include emulating an Atari 2600 on a Windows system or emulating WordPerfect 1.0 on a Macintosh. Emulators may be built for applications, operating systems, or hardware platforms. Emulation has been a popular strategy for retaining the functionality of old video game systems, such as with the MAME project. The feasibility of emulation as a catch-all solution has been debated in the academic community (Granger, 2000).

Raymond A. Lorie has suggested a Universal Virtual Computer (UVC) could be used to run any software in the future on a yet unknown platform.[30] The UVC strategy uses a combination of emulation and migration. The UVC strategy has not yet been widely adopted by the digital preservation community.

Jeff Rothenberg, a major proponent of emulation for digital preservation in libraries, working in partnership with the Koninklijke Bibliotheek and Nationaal Archief of the Netherlands, developed a software program called Dioscuri, a modular emulator that succeeds in running MS-DOS, WordPerfect 5.1, DOS games, and more.[31]

Another example of emulation as a form of digital preservation is the case of Emory University and Salman Rushdie's papers. Rushdie donated an outdated computer to the Emory University library, which was so old that the library was unable to extract papers from the hard drive. In order to procure the papers, the library emulated the old software system and was able to take the papers off his old computer.[32]


Encapsulation maintains that preserved objects should be self-describing, virtually "linking content with all of the information required for it to be deciphered and understood".[2] The files associated with the digital object would have details of how to interpret that object by using logical structures called "containers" or "wrappers" to provide a relationship between all information components[33] that could be used in future development of emulators, viewers or converters through machine-readable specifications.[34] The method of encapsulation is usually applied to collections that will go unused for long periods of time.[34]
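The idea of a container that carries its own interpretation details can be sketched with a ZIP archive holding the content next to a descriptive manifest. This is only an illustration: the use of ZIP and JSON, the file layout, and the function names are assumptions, not a format described in the source.

```python
import io
import json
import zipfile


def encapsulate(name: str, content: bytes, description: dict) -> bytes:
    """Wrap content and a machine-readable description in one container."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as container:
        container.writestr("content/" + name, content)
        container.writestr("manifest.json", json.dumps(description, indent=2))
    return buffer.getvalue()


def read_manifest(package: bytes) -> dict:
    """Recover the description without touching the content itself."""
    with zipfile.ZipFile(io.BytesIO(package)) as container:
        return json.loads(container.read("manifest.json"))


if __name__ == "__main__":
    package = encapsulate("letter.txt", b"hello",
                          {"format": "plain text", "encoding": "ASCII"})
    print(read_manifest(package))
```

A future viewer or converter would read the manifest first and use it to decide how to render the files under content/.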

Persistent Archives Concept

Developed by the San Diego Supercomputer Center and funded by the National Archives and Records Administration, this method requires the development of comprehensive and extensive infrastructure that enables "the preservation of the organisation of collection as well as the objects that make up that collection, maintained in a platform independent form".[2] A persistent archive includes both the data constituting the digital object and the context that defines the provenance, authenticity, and structure of the digital entities.[35] This allows for the replacement of hardware or software components with minimal effect on the preservation system. This method can be based on virtual data grids and resembles the OAIS Information Model (specifically the Archival Information Package).

Metadata attachment

Metadata is data about a digital file that includes information on creation, access rights, restrictions, preservation history, and rights management.[36] Metadata attached to digital files may be affected by file format obsolescence. ASCII is considered to be the most durable format for metadata[37] because it is widespread, backwards compatible when used with Unicode, and uses human-readable characters rather than numeric codes. It retains the information, but not the structure in which the information is presented. For higher functionality, SGML or XML should be used. Both markup languages are stored in ASCII format, but contain tags that denote structure and format.
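The point that XML adds structure while still being stored as ASCII can be shown with Python's standard xml.etree.ElementTree module. The element names below are hypothetical and do not follow any particular metadata schema.

```python
import xml.etree.ElementTree as ET


def build_metadata(fields: dict) -> bytes:
    """Serialize flat metadata fields as an ASCII-only XML document."""
    root = ET.Element("metadata")
    for name, value in fields.items():
        child = ET.SubElement(root, name)
        child.text = value
    # With an ASCII encoding, characters outside ASCII are written as
    # numeric character references, so the stored bytes stay plain ASCII.
    return ET.tostring(root, encoding="us-ascii")


if __name__ == "__main__":
    document = build_metadata({"creator": "Zo\u00eb", "rights": "public"})
    print(document.decode("ascii"))
```

The tags carry the structure (which value is the creator, which is the rights statement) that a bare ASCII text file would lose.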

Digital sustainability

Digital sustainability encompasses a range of issues and concerns that contribute to the longevity of digital information.[38] Unlike traditional, temporary strategies and more permanent solutions, digital sustainability implies a more active and continuous process. Digital sustainability concentrates less on the solution and technology and more on building an infrastructure and approach that is flexible, with an emphasis on interoperability, continued maintenance and continuous development.[39] Digital sustainability incorporates activities in the present that will facilitate access and availability in the future.[40][41]

Preservation repository assessment and certification

A few of the major frameworks for digital preservation repository assessment and certification are described below. A more detailed list is maintained by the U.S. Center for Research Libraries.[42]

Specific tools and methodologies


In 2007, CRL/OCLC published Trustworthy Repositories Audit & Certification: Criteria & Checklist (TRAC), a document allowing digital repositories to assess their capability to reliably store, migrate, and provide access to digital content. TRAC is based upon existing standards and best practices for trustworthy digital repositories and incorporates a set of 84 audit and certification criteria arranged in three sections: Organizational Infrastructure; Digital Object Management; and Technologies, Technical Infrastructure, and Security.[43]

TRAC "provides tools for the audit, assessment, and potential certification of digital repositories, establishes the documentation requirements required for audit, delineates a process for certification, and establishes appropriate methodologies for determining the soundness and sustainability of digital repositories".[44]


Digital Repository Audit Method Based On Risk Assessment (DRAMBORA), introduced by the Digital Curation Centre (DCC) and Digital Preservation Europe (DPE) in 2007, offers a methodology and a toolkit for digital repository self-assessment.

The DRAMBORA process is arranged in six stages and concentrates on evaluation of likelihood and potential impact of risks on the repository. The auditor is required to describe and document the repository’s role, objectives, policies, activities and assets, in order to identify and assess the risks associated with these activities and assets and define appropriate measures to manage them.[45]
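The evaluation of likelihood and potential impact can be illustrated with a simple risk register. The sketch below is not the DRAMBORA toolkit: the 1 to 5 scales, the multiplication of likelihood by impact, and the example risks are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Risk:
    """One entry in a hypothetical repository risk register."""
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (negligible) to 5 (catastrophic)

    @property
    def severity(self) -> int:
        # Assumed scoring rule: likelihood multiplied by impact.
        return self.likelihood * self.impact


def rank_risks(risks: List[Risk]) -> List[Risk]:
    """Order risks from most to least severe."""
    return sorted(risks, key=lambda risk: risk.severity, reverse=True)


if __name__ == "__main__":
    register = [
        Risk("media failure", 4, 3),
        Risk("loss of funding", 2, 5),
        Risk("format obsolescence", 3, 5),
    ]
    for risk in rank_risks(register):
        print(risk.name, risk.severity)
```

Ranking the register this way suggests which risks need management measures first.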

European Framework for Audit and Certification of Digital Repositories

The European Framework for Audit and Certification of Digital Repositories was defined in a memorandum of understanding signed in July 2010 between Consultative Committee for Space Data Systems (CCSDS), Data Seal of Approval (DSA) Board and German Institute for Standardization (DIN) "Trustworthy Archives – Certification" Working Group.

The framework is intended to help organizations in obtaining appropriate certification as a trusted digital repository and establishes three increasingly demanding levels of assessment:

  1. Basic Certification: self-assessment using 16 criteria of the Data Seal of Approval (DSA).
  2. Extended Certification: Basic Certification and additional externally reviewed self-audit against ISO 16363 or DIN 31644 requirements.
  3. Formal Certification: validation of the self-certification with a third-party official audit based on ISO 16363 or DIN 31644.[46]

nestor Catalogue of Criteria

A German initiative, nestor (the Network of Expertise in Long-Term Storage of Digital Resources) sponsored by the German Ministry of Education and Research, developed a catalogue of criteria for trusted digital repositories in 2004. In 2008 the second version of the document was published. The catalogue, aiming primarily at German cultural heritage and higher education institutions, establishes guidelines for planning, implementing, and self-evaluation of trustworthy long-term digital repositories.[47]

The nestor catalogue of criteria conforms to the OAIS reference model terminology and consists of three sections covering topics related to Organizational Framework, Object Management, and Infrastructure and Security.[48]


In 2002 the Preservation and Long-term Access through Networked Services (PLANETS) project, part of the EU Framework Programmes for Research and Technological Development 6, addressed core digital preservation challenges. The primary goal for Planets was to build practical services and tools to help ensure long-term access to digital cultural and scientific assets. The outputs of the project are now sustained by the follow-on organisation, the Open Planets Foundation.[49][50]


Planning Tool for Trusted Electronic Repositories (PLATTER) is a tool released by DigitalPreservationEurope (DPE) to help digital repositories in identifying their self-defined goals and priorities in order to gain trust from the stakeholders.[51]

PLATTER is intended to be used as a complementary tool to DRAMBORA, NESTOR, and TRAC. It is based on ten core principles for trusted repositories and defines nine Strategic Objective Plans, covering such areas as acquisition, preservation and dissemination of content, finance, staffing, succession planning, technical infrastructure, data and metadata specifications, and disaster planning. The tool enables repositories to develop and maintain documentation required for an audit.[45]: 49 

Audit and Certification of Trustworthy Digital Repositories (ISO 16363)

Audit and Certification of Trustworthy Digital Repositories (ISO 16363:2012), developed by the Consultative Committee for Space Data Systems (CCSDS), was approved as a full international standard in March 2012. Extending the OAIS Reference Model and based largely on the TRAC checklist, the standard is designed for all types of digital repositories. It provides a detailed specification of criteria against which the trustworthiness of a digital repository should be evaluated.[52]

The CCSDS Repository Audit and Certification Working Group has also developed and submitted for approval a second standard, Requirements for Bodies Providing Audit and Certification of Candidate Trustworthy Digital Repositories (ISO 16919), that defines the external auditing process and requirements for organizations responsible for assessment and certification of digital repositories.[53]

Digital preservation best practices

Although preservation strategies vary for different types of materials and between institutions, adhering to nationally and internationally recognized standards and practices is a crucial part of digital preservation activities. Best or recommended practices define strategies and procedures that may help organizations to implement existing standards or provide guidance in areas where no formal standards have been developed.[54]

Best practices in digital preservation continue to evolve and may encompass processes that are performed on content prior to or at the point of ingest into a digital repository as well as processes performed on preserved files post-ingest over time. Best practices may also apply to the process of digitizing analog material and may include the creation of specialized metadata (such as technical, administrative and rights metadata) in addition to standard descriptive metadata. The preservation of born-digital content may include format transformations to facilitate long-term preservation or to provide better access.[55]

Audio preservation

Various best practices and guidelines for digital audio preservation have been developed, including:

  • Capturing Analog Sound for Digital Preservation: Report of a Roundtable Discussion of Best Practices for Transferring Analog Discs and Tapes (2006),[56] which defined procedures for reformatting sound from analog to digital and provided recommendations for best practices for digital preservation
  • Digital Audio Best Practices (2006) prepared by the Collaborative Digitization Program Digital Audio Working Group, which covers best practices and provides guidance both on digitizing existing analog content and on creating new digital audio resources[57]
  • Sound Directions: Best Practices for Audio Preservation (2007) published by the Sound Directions Project,[54] which describes the audio preservation workflows and recommended best practices and has been used as the basis for other projects and initiatives[58][59]
  • Documents developed by the International Association of Sound and Audiovisual Archives (IASA), the European Broadcasting Union (EBU), the Library of Congress, and the Digital Library Federation (DLF).

The Audio Engineering Society (AES) also issues a variety of standards and guidelines relating to the creation of archival audio content and technical metadata.[60]

Moving Image Preservation

The term “moving images” includes analog film and video and their born-digital forms: digital video, digital motion picture materials, and digital cinema. As analog videotape and film become obsolete, digitization has become a key preservation strategy, although many archives do continue to perform photochemical preservation of film stock.[61][62]

"Digital preservation" has a double meaning for audiovisual collections: analog originals are preserved through digital reformatting, with the resulting digital files preserved; and born-digital content is collected, most often in proprietary formats that pose problems for future digital preservation.

There is currently no broadly accepted standard target digital preservation format for analog moving images.[63]

The following resources offer information on analog to digital reformatting and preserving born-digital audiovisual content.

  • The Library of Congress tracks the sustainability of digital formats, including moving images.[64]
  • The Digital Dilemma 2: Perspectives from Independent Filmmakers, Documentarians and Nonprofit Audiovisual Archives (2012).[63] The section on nonprofit archives reviews common practices on digital reformatting, metadata, and storage. There are four case studies.
  • Federal Agencies Digitization Guidelines Initiative (FADGI). Started in 2007, this is a collaborative effort by federal agencies to define common guidelines, methods, and practices for digitizing historical content. As part of this, two working groups are studying issues specific to two major areas, Still Image and Audio Visual.[65]
  • PrestoCenter publishes general audiovisual information and advice at a European level. Its online library has research and white papers on digital preservation costs and formats.[66]
  • The Association of Moving Image Archivists (AMIA) sponsors conferences, symposia, and events on all aspects of moving image preservation, including digital. The AMIA Tech Review contains articles reflecting current thoughts and practices from the archivists’ perspectives. Video Preservation for the Millennia (2012), published in the AMIA Tech Review, details the various strategies and ideas behind the current state of video preservation.[67]

Email preservation

Email poses special challenges for preservation: email client software varies widely; there is no common structure for email messages; email often communicates sensitive information; individual email accounts may contain business and personal messages intermingled; and email may include attached documents in a variety of file formats. Email messages can also carry viruses or have spam content. While email transmission is standardized, there is no formal standard for the long-term preservation of email messages.[68]

Approaches to preserving email may vary according to the purpose for which it is being preserved. For businesses and government entities, email preservation may be driven by the need to meet retention and supervision requirements for regulatory compliance and to allow for legal discovery. (Additional information about email archiving approaches for business and institutional purposes may be found under the separate article, Email archiving.) For research libraries and archives, the preservation of email that is part of born-digital or hybrid archival collections has as its goal ensuring its long-term availability as part of the historical and cultural record.[69]

Several projects developing tools and methodologies for email preservation have been conducted based on various preservation strategies: normalizing email into XML format, migrating email to a new version of the software, and emulating email environments. These projects include Memories Using Email (MUSE), Collaborative Electronic Records Project (CERP), E-Mail Collection And Preservation (EMCAP), PeDALS Email Extractor Software (PeDALS), and XML Electronic Normalizing of Archives tool (XENA).
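Normalizing email into XML can be sketched with Python's standard email and xml.etree.ElementTree modules. This is a minimal illustration and does not reproduce the schema of any of the projects named above: the element names are hypothetical, only four headers and the plain-text body are kept, and attachments are ignored.

```python
import email
from email import policy
import xml.etree.ElementTree as ET


def email_to_xml(raw: bytes) -> bytes:
    """Convert a raw RFC 5322 message into a small XML document."""
    message = email.message_from_bytes(raw, policy=policy.default)
    root = ET.Element("email")
    for header in ("From", "To", "Subject", "Date"):
        value = message[header]
        if value is not None:
            ET.SubElement(root, header.lower()).text = str(value)
    # Prefer the plain-text part; attachments are not handled in this sketch.
    body = message.get_body(preferencelist=("plain",))
    if body is not None:
        ET.SubElement(root, "body").text = body.get_content()
    return ET.tostring(root, encoding="utf-8")


if __name__ == "__main__":
    sample = (b"From: a@example.org\n"
              b"To: b@example.org\n"
              b"Subject: Hello\n"
              b"\n"
              b"Body text\n")
    print(email_to_xml(sample).decode("utf-8"))
```

Once in XML, the message no longer depends on the mail client that originally stored it.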

Some best practices and guidelines for email preservation can be found in the following resources:

  • Curating E-Mails: A Life-cycle Approach to the Management and Preservation of E-mail Messages (2006) by Maureen Pennock.[70]
  • Technology Watch Report 11-01: Preserving Email (2011) by Christopher J Prom.[69]
  • Best Practices: Email Archiving by Jo Maitland.[71]

Video game preservation

In 2007 the Keeping Emulation Environments Portable (KEEP) project, part of the EU Framework Programmes for Research and Technological Development 7, developed tools and methodologies to keep digital software objects available in their original context. Digital software objects such as video games might be lost because of digital obsolescence and the non-availability of the required legacy hardware or operating system software; such software is referred to as abandonware. Because the source code is often no longer available,[5] emulation is the only preservation opportunity. KEEP provided an emulation framework to help in the creation of such emulators. KEEP was developed by Vincent Joguin, first launched in February 2009, and was coordinated by Elisabeth Freyre of the French National Library.[72]

Personal Archiving

There are many things consumers and artists can do themselves to help care for their collections at home.

  • "Resource Center: Caring For Your Treasures" by American Institute for Conservation of Historic and Artistic Works details simple strategies for artists and consumers to care for and preserve their work themselves.[73]

The Library of Congress also hosts a list for the self-preserver, which points toward programs and guidelines from other institutions that will help the user preserve social media and email, as well as general guidelines on formats and media (such as caring for CDs).[74] Some of the programs listed include:

  • HTTrack Website Copier: A tool that allows the user to download a World Wide Web site from the Internet to a local directory, recursively building all directories and getting HTML, images, and other files from the server onto their computer.
  • Muse: Muse (short for Memories Using Email) is a program that helps users revive memories, using their long-term email archives, run by Stanford University.

Education for digital preservation

Digital Preservation Outreach and Education (DPOE)

The Digital Preservation Outreach and Education (DPOE) program, part of the Library of Congress, fosters the preservation of digital content through a collaborative network of instructors and collection management professionals working in cultural heritage institutions. The DPOE is composed of Library of Congress staff, the National Trainer Network, the DPOE Steering Committee, and a community of Digital Preservation Education Advocates. As of 2013 the DPOE has 24 working trainers across the six regions of the United States.[75]

In 2010 the DPOE conducted an assessment, reaching out to archivists, librarians, and other information professionals around the country. A working group of DPOE instructors then developed a curriculum[76] based on the assessment results and on similar digital preservation curricula designed by other training programs, such as LYRASIS, Educopia Institute, MetaArchive Cooperative, University of North Carolina, DigCCurr (Digital Curation Curriculum) and Cornell University-ICPSR Digital Preservation Management Workshops. The resulting core principles are also modeled on the principles outlined in "A Framework of Guidance for Building Good Digital Collections" by the National Information Standards Organization (NISO).[77]

Examples of digital preservation initiatives

Digitization at the British Library of a Dunhuang manuscript for the International Dunhuang Project

A number of open-source products have been developed to assist with digital preservation, including DSpace, Fedora, EPrints and Research-Output Repository Platform. The commercial sector also offers digital preservation software tools, such as Ex Libris Ltd.'s Rosetta, Tessella Ltd.'s Safety Deposit Box and cloud-based Preservica, CONTENTdm, Digital Commons, Equella, intraLibrary, Open Repository and Vital.[78]

Large-scale digital preservation initiatives (LSDIs)

Many research libraries and archives have begun or are about to begin large-scale digital preservation initiatives (LSDIs). The main players in LSDIs are cultural institutions, commercial companies such as Google and Microsoft, and non-profit groups including the Open Content Alliance (OCA), the Million Book Project (MBP), and HathiTrust. The primary motivation of these groups is to expand access to scholarly resources.

LSDIs: library perspective

Approximately 30 cultural entities, including the 12-member Committee on Institutional Cooperation (CIC), have signed digitization agreements with either Google or Microsoft. Several of these cultural entities are participating in the Open Content Alliance (OCA) and the Million Book Project (MBP). Some libraries are involved in only one initiative, while others have diversified their digitization strategies by participating in several. The three main reasons for library participation in LSDIs are access, preservation, and research and development. It is hoped that digital preservation will ensure that library materials remain accessible for future generations. Libraries have a perpetual responsibility for their materials and a commitment to archive their digital materials. Libraries plan to use digitized copies as backups for works in case they go out of print, deteriorate, or are lost or damaged.

See also


  1. ^ Digital Preservation Coalition (2008). "Introduction: Definitions and Concepts". Digital Preservation Handbook. York, UK. Retrieved 24 February 2012. Digital preservation refers to the series of managed activities necessary to ensure continued access to digital information for as long as necessary.
  2. ^ a b c d e f g Day, Michael. “The long-term preservation of Web content”. Web archiving (Berlin: Springer, 2006), pp. 177-199. ISBN 3-540-23338-5.
  3. ^ a b Evans, Mark; Carter, Laura. (December 2008). The Challenges of Digital Preservation. Presentation at the Library of Parliament, Ottawa.
  4. ^ Becker, Christoph; Kulovits, Hannes; Guttenbrunner, Mark; Strodl, Stephan; Rauber, Andreas; Hofman, Hans; et al. (2009). "Systematic planning for digital preservation". International Journal on Digital Libraries. 10 (10): 133–157. doi:10.1007/s00799-009-0057-1.
  5. ^ a b Andersen, John (2011-01-27). "Where Games Go To Sleep: The Game Preservation Crisis, Part 1". Gamasutra. Retrieved 2013-01-10. The existence of decaying technology, disorganization, and poor storage could in theory put a video game to sleep permanently -- never to be played again. Troubling admissions have surfaced over the years concerning video game preservation. When questions concerning re-releases of certain game titles are brought up during interviews with developers, for example, these developers would reveal issues of game production material being lost or destroyed. Certain game titles could not see a re-release due to various issues. One story began to circulate of source code being lost altogether for a well-known RPG, preventing its re-release on a new console.
  6. ^ Arora, Jagdish (2009). "Digital Preservation, an Overview". Proceedings of the National Seminar on Open Access to Textual and Multimedia Content: Bridging the Digital Divide, January 29–30, 2009. p. 111.
  7. ^ "The Internet Archive Classic Software Preservation Project". Internet Archive. Archived from the original on 19 October 2007. Retrieved October 21, 2007.
  8. ^ "Internet Archive Gets DMCA Exemption To Help Archive Vintage Software". Archived from the original on 20 October 2007. Retrieved October 21, 2007.
  9. ^ Library of Congress Copyright Office (28 October 2009). "Exemption to Prohibition on Circumvention of Copyright Protection Systems for Access Control Technologies" (PDF). Federal Register. 27 (206): 55137–55139. Archived from the original (PDF) on 2 December 2009. Retrieved December 17, 2009.
  10. ^ Library of Congress Copyright Office (2006-11-27). "Exemption to Prohibition on Circumvention of Copyright Protection Systems for Access Control Technologies". Federal Register. 71 (227): 68472–68480. Archived from the original on 2007-11-01. Retrieved 2007-10-21. Computer programs and video games distributed in formats that have become obsolete and that require the original media or hardware as a condition of access, when circumvention is accomplished for the purpose of preservation or archival reproduction of published digital works by a library or archive. A format shall be considered obsolete if the machine or system necessary to render perceptible a work stored in that format is no longer manufactured or is no longer reasonably available in the commercial marketplace.
  11. ^ http://www.loc.gov:8081/today/pr/2013/files/twitter_report_2013jan.pdf
  12. ^ http://articles.forensicfocus.com/2012/04/25/key-twitter-and-facebook-metadata-fields-forensic-investigators-need-to-be-aware-of/
  13. ^ Blue Ribbon Task Force on Sustainable Digital Preservation and Access (2010). "Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Information, final report" (PDF). La Jolla, Calif. p. 35. Retrieved July 5, 2012.
  14. ^ a b Tibbo, Helen R. (2003). "On the Nature and Importance of Archiving in the Digital Age". Advances in Computers. 57: 26. doi:10.1016/S0065-2458(03)57001-2. ISBN 9780120121571.
  15. ^ Donald Waters (1996). Preserving digital information: Report of the task force on archiving of digital information. CLIR. ISBN 1-88733450-5. Retrieved November 15, 2012.
  16. ^ Inter-university Consortium for Political and Social Research (ICPSR) (2009). "Principles and Good Practice for Preserving Data". International Household Survey Network, IHSN Working Paper No 003. pp. 5–6. Retrieved September 17, 2012.
  17. ^ Harvey, Ross (2012). Preserving Digital Materials. Berlin, K. G. Saur. pp. 97, 156. ISBN 9783110253689.
  18. ^ Conway, Paul (2010). "Preservation in the Age of Google: Digitization, Digital Preservation, and Dilemmas". The Library Quarterly. 80 (1): 66–67. doi:10.1086/648463. JSTOR 648463.
  19. ^ Harvey, Ross (2010). Digital Curation. NY: Neal-Schuman Publishers. p. 39. ISBN 9781555706944.
  20. ^ a b Cornell University Library. (2005) Digital Preservation Management: Implementing Short-term Strategies for Long-term Problems
  21. ^ Research Libraries Group. (2002). Trusted Digital Repositories: Attributes and Responsibilities
  22. ^ Gladney, H. M. (2004). "Trustworthy 100-year digital objects: Evidence after every witness is dead". ACM Transactions on Information Systems. 22 (3): 406–436. doi:10.1145/1010614.1010617.
  23. ^ Suderman, Jim (2010). "Principle-based concepts for the long-term preservation of digital records". Proceedings of the 1st International Digital Preservation Interoperability Framework Symposium: 1. doi:10.1145/2039263.2039270. ISBN 9781450301107.
  24. ^ Duranti, Luciana (2001). "The Long-Term Preservation of Authentic Electronic Record" (PDF). Proceedings of the 27th VLDB Conference, Roma, Italy. Retrieved September 21, 2012.
  25. ^ Hackett, Yvette (2003). "InterPARES: The Search for Authenticity in Electronic Records". The Moving Image. 3 (2): 106.
  26. ^ Laszlo, Krisztina (2008). "The InterPARES 3 Project: Implementing Digital Records Preservation in a Contemporary Art Gallery and Ethnographic Museum" (PDF). Annual conference of the International Documentation Committee of the International Council of Museums (CIDOC), 15–18 September 2008, Athens, Greece. p. 4. Retrieved September 21, 2012.
  27. ^ Online Computer Library Center, Inc. (2006). OCLC Digital Archive Preservation Policy and Supporting Documentation, p. 5
  28. ^ http://www.ijdc.net/index.php/ijdc/article/view/50/35
  29. ^ Rothenberg, Jeff (1998). Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation. Washington, DC, USA: Council on Library and Information Resources. ISBN 1-887334-63-7.
  30. ^ Lorie, Raymond A. (2001). "Long Term Preservation of Digital Information". Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL '01). Roanoke, Virginia, USA. pp. 346–352.
  31. ^ Hoeven, J. (2007). "Dioscuri: emulator for digital preservation". D-Lib Magazine. 13 (11/12). doi:10.1045/november2007-inbrief.
  32. ^ http://marbl.library.emory.edu/innovations/salman-rushdie
  33. ^ Kapoor, Teena. Digital Preservation: Planning, Process and Approaches for Libraries. Jaypee Institute of Information Technology, A-10, Sector-62, Noida, UP.
  34. ^ a b Ferreira, José Miguel Araújo. Solutions Walkthrough Report. Department of Information Systems, University of Minho, 4800-058 Guimarães, Portugal.
  35. ^ Moore, Reagan W., Andre Merzky. Persistent Archive Research Group. Dec. 25, 2003.
  36. ^ NISO Framework Advisory Group. (2007). A Framework of Guidance for Building Good Digital Collections, 3rd edition, p. 57.
  37. ^ National Initiative for a Networked Cultural Heritage. (2002). NINCH Guide to Good Practice in the Digital Representation and Management of Cultural Heritage Materials
  38. ^ Bradley, K. (Summer 2007). Defining digital sustainability. Library Trends v. 56 no 1 p. 148-163.
  39. ^ Sustainability of Digital Resources. (2008). TASI: Technical Advisory Service for Images.
  40. ^ Towards a Theory of Digital Preservation. (2008). International Journal of Digital Curation
  41. ^ Electronic Archives Preservation Policy
  42. ^ "Center for Research Libraries – Other Assessment Tools". Retrieved Sep 6, 2012.
  43. ^ OCLC and CRL (2007). "Trustworthy Repository Audit & Certification: Criteria & Checklist" (PDF). Retrieved April 16, 2012.
  44. ^ Phillips, Stephen C (2010). "Service level agreements for storage and preservation, p.13". Retrieved May 1, 2012.
  45. ^ a b Ball, Alex (2010). "Preservation and Curation in Institutional Repositories (version 1.3)" (PDF). Edinburgh, UK: Digital Curation Centre. p. 48. Retrieved June 24, 2012.
  46. ^ APARSEN Project (2012). "Report on Peer Review of Digital Repositories" (PDF). p. 10. Retrieved October 8, 2012.
  47. ^ Dobratz, Susanne (2007). "Trustworthy Digital Long-Term Repositories: The Nestor Approach in the Context of International Developments". Research and Advanced Technology for Digital Libraries. Springer Berlin / Heidelberg. pp. 210–222. ISBN 978-3-540-74850-2.
  48. ^ Horstkemper, Gregor (2009). "Assessment of Trustworthiness of Digital Archives" (PDF). Proceedings of the Sino-German Symposium on Development of Library and Information Services. pp. 74–75. Retrieved October 2, 2012.
  49. ^ "Planets project". web site. 2009. Retrieved 7 December 2011.
  50. ^ "The Open Planets Foundation". web site. 2010. Retrieved 7 December 2011.
  51. ^ DigitalPreservationEurope (2008). "DPE Repository Planning Checklist and Guidance DPED3.2" (PDF). Retrieved 2012-06-24.
  52. ^ CCSDS (2011). "Audit and Certification of Trustworthy Digital Repositories, Recommended Practice" (PDF). CCSDS 652.1-M-1. Issue 1. Washington, DC: CCSDS, September 2011. pp. 1–1. Retrieved October 10, 2012.
  53. ^ Ruusalepp, Raivo (2012). "Standards Alignment". In McGovern, Nancy Y (ed.). Aligning National Approaches to Digital Preservation. Atlanta, GA: Educopia Institute. pp. 115–165 [124]. ISBN 978-0-9826653-1-2.
  54. ^ a b Casey, M. (2007). "Sound Directions: Best Practices for Audio Preservation" (PDF). Bloomington: Indiana University and Cambridge: Harvard University. p. 5. Retrieved 30 October 2012.
  55. ^ Verheul, I. (2006). "Networking for Digital Preservation: Current Practice in 15 National Libraries" (PDF). K.G. Saur, Munich. Retrieved 30 October 2012.
  56. ^ Council on Library and Information Resources (2006). "Publication 137: Capturing Analog Sound for Digital Preservation: Report of a Roundtable Discussion of Best Practices for Transferring Analog Discs and Tapes". Retrieved 6 September 2012.
  57. ^ Digital Audio Working Group. Collaborative Digitization Program (2006). "Digital Audio Best Practices (Version 2.1)" (PDF). Aurora, Colorado. p. 4. Retrieved 30 October 2012.
  58. ^ Columbia University Libraries (2010). "Preserving Historic Audio Content: Developing Infrastructures and Practices for Digital Conversion. Final Report to the Andrew W. Mellon Foundation" (PDF). p. 5. Retrieved 30 October 2012.
  59. ^ Beers, Shane (2011). "Hathi Trust and the Challenge of Digital Audio" (PDF). IASA Journal (36): 39. Retrieved 5 November 2012.
  60. ^ Audio Engineering Society. "Publications". Retrieved 5 November 2012.
  61. ^ The Digital Dilemma: Strategic Issues in Archiving and Accessing Digital Motion Picture Materials. Academy of Motion Picture Arts and Sciences Science and Technology Council. 2007. p. 19.
  62. ^ Commission Staff Working Document on the challenges for European film heritage from the analogue and the digital era: Third implementation report of the 2005 EP and Council Recommendation on Film Heritage (PDF). Brussels. 2012. pp. 11, 17, 93–114.
  63. ^ a b The Digital Dilemma 2: Perspectives from Independent Filmmakers, Documentarians and Nonprofit Audiovisual Archives : Nonprofit Audiovisual Archives section. Science and Technology Council, the Academy of Motion Picture Arts and Sciences. 2012.
  64. ^ http://www.digitalpreservation.gov/formats/content/video.shtml
  65. ^ Federal Agencies Digitization Guidelines Initiative (2013). "Federal Agencies Digitization Guidelines Initiative". Retrieved 5 March 2013.
  66. ^ https://www.prestocentre.org/library
  67. ^ Tadic, Linda (2012). "Video Preservation for the Millennia" (PDF). AMIA Tech Review Journal (4). Retrieved 21 March 2013.
  68. ^ Goethals, Andrea (2010). "Reshaping the Repository: The Challenge of Email Archiving" (PDF). 7th International Conference on Preservation of Digital Objects (iPRES2010).
  69. ^ a b Prom, Christopher J (2011). "Technology Watch Report 11-01: Preserving Email". p. 5. Retrieved 18 February 2013.
  70. ^ Pennock, Maureen (2006). "Curating E-Mails: A Life-cycle Approach to the Management and Preservation of E-mail Messages" (PDF). DCC Digital Curation Manual. Retrieved 18 February 2013.
  71. ^ Maitland, Jo (2008). Best Practices: Email Archiving (PDF). Forrester Research, Inc.
  72. ^ "7th Framework Programm [ICT-2007.4.3 Digital libraries and technology-enhanced learning]" (in English). 2009. Retrieved 2009-11-30.
  73. ^ American Institute for Conservation of Historic and Artistic Works (2013). "Resource Center: Caring For Your Treasures". Retrieved 5 March 2013.
  74. ^ http://www.digitalpreservation.gov/personalarchiving/padKit/resources.html
  75. ^ Library of Congress. "Digital Preservation Outreach & Education". Website. Library of Congress. Retrieved 6 March 2013.
  76. ^ DPOE Curriculum. 2013
  77. ^ DPOE Background. 2013
  78. ^ Fojtu, Andrea (2009). "Open Source versus Commercial Solutions for a Long-term Preservation in Digital Repositories" (PDF). CASLIN 2009. Institutional Online Repositories and Open Access. University of West Bohemia. pp. 79–80. Retrieved 25 October 2012.

