e-scriptorum: iPRES 2009: the Sixth International Conference on Preservation of Digital Objects

7 April 2010

On 5-6 October 2009, the Sixth International Conference on Preservation of Digital Objects took place in San Francisco. Below we present the abstracts of the 31 papers presented at the conference and link to their full text.

An Emergent Micro-Services Approach to Digital Curation Infrastructure

In order to better meet the needs of its diverse University of California constituencies, the California Digital Library UC Curation Center is re-envisioning its approach to digital curation infrastructure by devolving function into a set of granular, independent, but interoperable micro-services. Since each of these services is small and self-contained, they are more easily developed, deployed, maintained, and enhanced; at the same time, complex curation function can emerge from the strategic combination of atomistic services. The emergent approach emphasizes the persistence of content rather than the systems in which that management occurs, thus the paradigmatic archival culture is not unduly coupled to any particular technological context. This results in a curation environment that is comprehensive in scope, yet flexible with regard to local policies and practices and sustainable despite the inevitability of disruptive change in technology and user expectation.

e-Infrastructure and Digital Preservation: Challenges and Outlook

Undoubtedly, long-term preservation has raised a great deal of attention worldwide, but in a broader perspective the attention is out of proportion compared to the number of real operational solutions – not to mention the big picture of a comprehensive digital preservation infrastructure. Up to now, the existing digital preservation infrastructure mainly consists of a small number of scattered trusted long-term archiving data repositories, which have the control of and the responsibility for our digital heritage. The ongoing discussion on e-Infrastructure points out that we are still lacking reliable structures which support the expected integrated digital preservation infrastructure provided jointly by cultural heritage organizations, data centers and data producers. The slow development towards a global integrated digital preservation infrastructure, in combination with adjacent pressing questions for the information infrastructure in general, has led to an extended strategic discussion in recent years in Europe and the US: in studies, position papers and evaluation approaches we find elaborated building blocks for a roadmap and definitions for a more advanced landscape. With a focus especially on the German situation and on existing and ongoing practical experiences, the paper discusses reasons and strategic aspects which may illuminate the factors behind the slow progress.

The National Digital Stewardship Alliance Charter: Enabling Collaboration to Achieve National Digital Preservation

The Library of Congress proposes extending the success of the NDIIPP (National Digital Information Infrastructure and Preservation Program) network by forming a national stewardship alliance of committed digital preservation partners.

The Human Face of Digital Preservation: Organizational and Staff Challenges, and Initiatives at the Bibliothèque nationale de France

The process of setting up a digital preservation repository in compliance with the OAIS model is not only a technical challenge: libraries also need to develop and maintain appropriate skills and organizations. Digital activities, including digital preservation, are nowadays moving into the mainstream activity of the Library and are integrated in its workflows. The Bibliothèque nationale de France (BnF) has been working on the definition of digital preservation activities since 2003. This paper aims at presenting the organizational and human resources challenges that have been faced by the library in this context, and those that are still awaiting us. The library has been facing these challenges through a variety of actions at different levels: organizational changes, training sessions, dedicated working groups and task forces, analysis of skills and processes, etc. The results of these actions provide insights on how a national library is going digital, and what is needed to reach this longstanding goal.

Born Broken: Fonts and Information Loss in Legacy Digital Documents

For millions of legacy documents, correct rendering depends upon resources such as fonts that are not generally embedded within the document structure. Yet there is significant risk of information loss due to missing or incorrectly substituted fonts. In this paper we use a collection of 230,000 Word documents to assess the difficulty of matching font requirements with a database of fonts. We describe the identifying information contained in common font formats, font requirements stored in Word documents, the API provided by Windows to support font requests by applications, the documented substitution algorithms used by Windows when requested fonts are not available, and the ways in which support software might be used to control font substitution in a preservation environment.

Towards Interoperable Preservation Repositories (TIPR)

TIPR, Towards Interoperable Preservation Repositories, is a project funded by the Institute of Museum and Library Services to create and test the Repository eXchange Package (RXP). The package will make it possible to transfer complex digital objects between dissimilar preservation repositories. For reasons of redundancy, succession planning and software migration, such repositories must be able to exchange copies of archival information packages with each other. Every different repository design, however, describes and structures its archival packages differently. Therefore each type produces dissemination packages that are rarely understandable or usable as submission packages by other repositories. The RXP is an answer to that mismatch. Other solutions for transferring packages between repositories focus either on transfers between repositories of the same type, such as DSpace-to-DSpace transfers, or on processes that translate a specific dissemination format into a specific submission package. Rather than build translators between many dissimilar repository types, the TIPR project has defined a standards-based package of metadata files that can act as an intermediary information package, the RXP, a lingua franca all repositories can read and write. In this paper we present the assumptions and principles underlying the TIPR concept of repository-to-repository exchange, and proceed to describe three aspects of the TIPR project: the RXP format itself; the tests we are conducting to prove and improve the use of the RXP; and finally, issues that have arisen in the course of the project so far. 

Curating Scientific Research Data for the Long Term: A Preservation Analysis Method in Context

The challenge of digital preservation of scientific data lies in the need to preserve not only the dataset itself but also the ability it has to deliver knowledge to a future user community. A true scientific research asset allows future users to reanalyze the data within new contexts. Thus, in order to carry out meaningful preservation we need to ensure that future users are equipped with the necessary information to re-use the data. This paper presents an overview of a preservation analysis methodology which was developed in response to that need on the CASPAR and Digital Curation Centre SCARP projects. We intend to place it in relation to other digital preservation practices discussing how they can interact to provide archives caring for scientific data sets with the full arsenal of tools and techniques necessary to rise to this challenge.

Implementing Metadata that Guides Digital Preservation Services

Effective digital preservation depends on a set of preservation services that work together to ensure that digital objects can be preserved for the long-term. These services need digital preservation metadata, in particular, descriptions of the properties that digital objects may have and descriptions of the requirements that guide digital preservation services. This paper analyzes how these services interact and use this metadata and develops a data dictionary to support them.  

A Translation Layer to Convey Preservation Metadata

Long-term preservation is a responsibility to be shared with other organizations, even those adopting different preservation methods and tools. Overcoming interoperability issues, by achieving a flawless exchange of the digital assets to be preserved, makes it feasible to apply distributed digital preservation policies. The Archives Ready To AIP Transmission a PREMIS Based Project (ARTAT-PBP) aims to experiment with the adoption of a common preservation metadata standard as an interchange language in a network of cooperating organizations that need to exchange digital resources with the mutual objective of preserving them in the long term.

Significant Properties, Authenticity, Provenance, Representation Information and OAIS Information

The term "Significant Properties" has been given a variety of definitions and used in various ways over the past several years. The relationship between Significant Properties and the OAIS term Representation Information has been a puzzle. This paper proposes a definition of Significant Properties which provides a way to clarify this relationship and indicates how the concept can be used in a coherent way. We believe that this approach is consistent with the actual use of the concept and does not invalidate the previous pieces of work but rather provides a clear and consistent view of the concept. It also links together Authenticity and Provenance which are also key concepts in digital preservation. 

Tools for Preservation and Use of Complex and Diverse Digital Resources

This paper will describe the tools and infrastructure components which have been implemented by the CASPAR project to support repositories in their task of long term preservation of digital resources. We address also the capture and preservation of digital rights management and evidence of authenticity associated with digital objects. Moreover examples of ways to evaluate a variety of preservation strategies will be discussed as will examples of integrating the use of these infrastructure components and tools into existing repository systems. Examples will be given of a rich selection of digital objects which encode information from a variety of disciplines including science, cultural heritage and also contemporary performing arts. 

Integrating Metadata Standards to Support Long-Term Preservation of Digital Assets: Developing Best Practices for Expressing Preservation Metadata in a Container Format

This paper explores the purpose and development of best practice guidelines for the use of preservation metadata as detailed in the PREMIS Data Dictionary for Preservation Metadata within documents conforming to the Metadata Encoding and Transmission Standard (METS). METS is an XML schema that provides a container format integrating various forms of metadata with digital objects or links to digital objects. Because of the flexibility of METS to serve many different functions within digital systems and to support many different metadata structures, integration guidelines will facilitate common practices among institutions. There is constant tension between tighter control over the METS package to support object exchange versus each implementation's unique preservation metadata requirements given the different contexts and implementation models among PREMIS implementers. The PREMIS in METS Guidelines serve primarily as a standard for submission and dissemination information packages. This paper details the issues encountered in using the standards together, and how the METS document changes as events pertaining to the lifecycle of digital assets are recorded for future preservation purposes. The guidelines have enabled the implementation of an exchange format and creation/validation tools based on the PREMIS in METS guidelines.
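
As an illustration of how a METS document can grow as lifecycle events are recorded, here is a minimal Python sketch that wraps PREMIS events inside METS digiprovMD sections. It assumes the PREMIS version 2 namespace and the `PREMIS:EVENT` metadata type; the helper name `add_event`, the identifiers and the dates are hypothetical, and the output is a fragment that has not been validated against either schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlschema/premis-v2"  # assumption: PREMIS version 2
ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def q(namespace, tag):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{namespace}}}{tag}"


def add_event(amd_sec, md_id, event_id, event_type, when):
    """Append one PREMIS event, wrapped in a METS digiprovMD section.

    Hypothetical helper for illustration; it records only a few event fields.
    """
    digiprov = ET.SubElement(amd_sec, q(METS_NS, "digiprovMD"), ID=md_id)
    wrap = ET.SubElement(digiprov, q(METS_NS, "mdWrap"), MDTYPE="PREMIS:EVENT")
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    event = ET.SubElement(xml_data, q(PREMIS_NS, "event"))
    ident = ET.SubElement(event, q(PREMIS_NS, "eventIdentifier"))
    ET.SubElement(ident, q(PREMIS_NS, "eventIdentifierType")).text = "local"
    ET.SubElement(ident, q(PREMIS_NS, "eventIdentifierValue")).text = event_id
    ET.SubElement(event, q(PREMIS_NS, "eventType")).text = event_type
    ET.SubElement(event, q(PREMIS_NS, "eventDateTime")).text = when
    return digiprov


# Each recorded event adds one more digiprovMD section to the document.
mets = ET.Element(q(METS_NS, "mets"))
amd_sec = ET.SubElement(mets, q(METS_NS, "amdSec"))
add_event(amd_sec, "DP1", "evt-0001", "ingestion", "2009-10-05T09:00:00")
add_event(amd_sec, "DP2", "evt-0002", "migration", "2009-10-06T14:30:00")
mets_xml = ET.tostring(mets, encoding="unicode")
```

A complete METS document would also need a structural map and file section; they are left out here to keep the focus on the event metadata.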

Digital Archeology: Recovering Digital Objects from Audio Waveforms

Specimens of early computer systems stop working every day. One storage medium that was popular for home computers in the 1980s was the audio tape. The first home computer systems allowed the use of standard cassette players to record and replay data. Audio tapes are more durable than old home computers when properly stored. Devices playing this medium (i.e. tape recorders) can be found in working condition or can be repaired as they are made out of standard components. By re-engineering the format of the waveform the data on such media can then be extracted from a digitized audio stream. This work presents a case study of extracting data created on an early home computer system, the Philips G7400. Results show that with some error correction methods parts of the tapes are still readable, even without the original system. It also becomes clear that it is easier to build solutions now when the original systems are still available.
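
To give a feel for what extracting data from a digitized audio stream can involve, the following Python sketch decodes a generic two-tone (frequency-shift-keyed) signal by counting zero crossings per bit period. The abstract does not describe the G7400 tape format, so every parameter here (sample rate, baud rate, tone frequencies, bit order) is an assumption chosen for illustration, and the input is a clean synthetic signal rather than a real tape recording.

```python
import math

# Assumed parameters for a generic two-tone scheme, not the G7400 format.
SAMPLE_RATE = 44100
BAUD = 300
FREQ_ZERO = 1200   # tone used for a 0 bit
FREQ_ONE = 2400    # tone used for a 1 bit
SAMPLES_PER_BIT = SAMPLE_RATE // BAUD


def encode(data):
    """Synthesize a clean waveform: one tone burst per bit, least significant bit first."""
    samples = []
    for byte in data:
        for bit_index in range(8):
            freq = FREQ_ONE if (byte >> bit_index) & 1 else FREQ_ZERO
            for n in range(SAMPLES_PER_BIT):
                samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def count_zero_crossings(window):
    """Count sign changes between consecutive samples in one bit period."""
    return sum(1 for prev, cur in zip(window, window[1:]) if (prev < 0) != (cur < 0))


def decode(samples):
    """Recover bytes by classifying each bit period by its zero-crossing count."""
    # A tone of frequency f crosses zero about 2 * f / BAUD times per bit,
    # so the midpoint between the two tones is (FREQ_ZERO + FREQ_ONE) / BAUD.
    threshold = (FREQ_ZERO + FREQ_ONE) / BAUD
    bits = []
    for start in range(0, len(samples) - SAMPLES_PER_BIT + 1, SAMPLES_PER_BIT):
        window = samples[start:start + SAMPLES_PER_BIT]
        bits.append(1 if count_zero_crossings(window) > threshold else 0)
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        out.append(sum(bit << k for k, bit in enumerate(bits[i:i + 8])))
    return bytes(out)


recovered = decode(encode(b"G7400"))
```

A real tape would add noise, speed variation and framing (start and stop bits, checksums), which is where the error correction mentioned in the abstract would come in.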

Cost Model for Digital Curation: Cost of Digital Migration

The Danish Ministry of Culture is currently funding a project to set up a model for costing preservation of digital materials held by national cultural heritage institutions. The overall objective of the project is to provide a basis for comparing and estimating future financial requirements for digital preservation and to increase cost effectiveness of digital preservation activities. In this study we describe an activity based costing methodology for digital preservation based on the OAIS Reference Model. In order to estimate the cost of digital migrations we have identified cost critical activities by analysing the OAIS Model, and supplemented this analysis with findings from other models, literature and own experience. To verify the model it has been tested on two sets of data from a normalisation project and a migration project at the Danish National Archives. The study found that the OAIS model provides a sound overall framework for cost breakdown, but that some functions, especially when it comes to performing and evaluating the actual migration, need additional detailing in order to cost activities accurately.
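
The idea of activity based costing over OAIS functions can be sketched in a few lines of Python: each activity is costed as hours times an hourly rate plus any direct costs, and the totals are grouped by functional entity. This is a simplified illustration and not the project's model; the activities, their assignment to OAIS entities and all figures are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Activity:
    entity: str              # OAIS functional entity the activity is booked under
    name: str
    hours: float
    hourly_rate: float
    direct_cost: float = 0.0  # e.g. software or hardware purchased for the activity

    @property
    def cost(self):
        return self.hours * self.hourly_rate + self.direct_cost


def cost_by_entity(activities):
    """Sum activity costs per OAIS functional entity."""
    totals = defaultdict(float)
    for activity in activities:
        totals[activity.entity] += activity.cost
    return dict(totals)


# Hypothetical migration project; the figures are invented for illustration.
activities = [
    Activity("Preservation Planning", "Select target format", 10, 50),
    Activity("Administration", "Run migration", 20, 40, direct_cost=300),
    Activity("Administration", "Evaluate migration results", 5, 40),
]
totals = cost_by_entity(activities)
```

Splitting "Run migration" and "Evaluate migration results" into finer activities is the kind of additional detailing the study found necessary for accurate costing.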

Digital Materiality: Preserving Access to Computers as Complete Environments

This paper addresses a particular domain within the sphere of activity that is coming to be known as personal digital papers or personal digital archives. We are concerned with contemporary writers of belles-lettres (fiction, poetry, and drama), and the implications of the shift toward word processing and other forms of electronic text production for the future of the cultural record, in particular literary scholarship. The urgency of this topic is evidenced by the recent deaths of several high-profile authors, including David Foster Wallace and John Updike, both of whom are known to have left behind electronic records containing unpublished and incomplete work alongside of their more traditional manuscript materials. We argue that literary and other creatively-oriented originators offer unique challenges for the preservation enterprise, since the complete digital context for individual records is often of paramount importance—what Richard Ovenden, in a helpful phrase (in conversation) has termed “the digital materiality of digital culture.” We will therefore discuss preservation and access scenarios that account for the computer as a complete artifact and digital environment, drawing on examples from the born-digital materials in literary collections at Emory University, the Harry Ransom Center at The University of Texas at Austin, and the University of Maryland. 

Mainstreaming Preservation through Slicing and Dicing of Digital Repositories: Investigating Alternative Service and Resource Options for ContextMiner Using Data Grid Technology

A digital repository can be seen as a combination of services, resources, and policies. One of the fundamental design questions for digital repositories is how to break down the services and resources: who will have responsibility, where they will reside, and how they will interact. There is no single, optimal answer to this question. The most appropriate arrangement depends on many factors that vary across repository contexts and are very likely to change over time. This paper reports on our investigation and testing of various repository "slicing and dicing" scenarios, their potential benefits, and implications for implementation, administration, and service offerings. Vital considerations for each option include (1) efficiencies of resource use, (2) management of dependencies across entities, and (3) the repository business model most appropriate to the participating organizations.

Memento Mundi: Are Virtual Worlds History?

In this paper, I consider whether virtual worlds are history in two senses of the word. The first explores the implications of the life-cycle of virtual worlds, especially of their extinction, for thinking about the history of computer-based technologies, as well as their use. The moment when a virtual world “is history” – when it shuts down – reminds us that every virtual world has a history. Histories of individual virtual worlds are inextricably bound up with the intellectual and cultural history of virtual world technologies and communities. The second sense of the virtual world as history brings us directly to issues of historical documentation, digital preservation and curation of virtual worlds. I consider what will remain of virtual worlds after they close down, either individually or perhaps even collectively.

Into the Archive: Potential and Limits of Standardizing the Ingest

The ingest and its preparation are crucial steps and of strategic importance for digital preservation. If we want to move digital preservation into the mainstream we have to make them as easy as possible. The aim of the NESTOR guide "Into The Archive" is to help streamline the planning and execution of ingest projects. The main challenge for such a guide is to provide help for a broad audience with heterogeneous use cases and without detailed background knowledge on the producer side. This paper will introduce the guide, present first experiences and discuss the challenges.

Towards a Methodology for Software Preservation

Only a small part of the research which has been carried out to date on the preservation of digital objects has looked specifically at the preservation of software. This is because the preservation of software has been seen as a less urgent problem than the preservation of other digital objects, and also because the complexity of software artifacts makes the problem of preserving them a daunting one. Nevertheless, there are good reasons to want to preserve software. In this paper we consider some of the motivations behind software preservation, based on an analysis of software preservation practice. We then go on to consider what it means to preserve software, discussing preservation approaches, and developing a performance model which determines the adequacy of a software preservation method. Finally we discuss some implications for preservation analysis for the case of software artifacts.

Chronopolis: Preserving our Digital Heritage

The Chronopolis Digital Preservation Initiative, one of the Library of Congress' latest efforts to collect and preserve at-risk digital information, has completed its first year of service as a multi-member partnership to meet the archival needs of a wide range of cultural and social domains. In this paper we will explore the major themes within Chronopolis.

ArchivePress: A Really Simple Solution to Archiving Blog Content

Blog archiving and preservation is not a new challenge. Current solutions are commonly based on typical web archiving activities, whereby a crawler is configured to harvest a copy of the blog and return the copy to a web archive. Yet this is not the only solution, nor is it always the most appropriate. We propose that in some cases, an approach building on the functionality provided by web feeds offers more potential. This paper describes research to develop such an approach, suitable for organisations of varying size and which can be implemented with relatively little resource and technical know-how: the ArchivePress project.  
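
A feed-based approach of this kind can be sketched in Python with the standard library alone: parse the blog's RSS feed and store each new post under its unique identifier, so that repeated harvests only add what is new. This is an illustrative sketch, not the ArchivePress implementation; the sample feed, the function name `harvest_items` and the in-memory dictionary standing in for an archive are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed content; a real harvester would fetch this over HTTP.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>http://blog.example.org/first</link>
      <guid>http://blog.example.org/?p=1</guid>
      <pubDate>Mon, 05 Oct 2009 09:00:00 GMT</pubDate>
      <description>Hello, archive.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://blog.example.org/second</link>
      <guid>http://blog.example.org/?p=2</guid>
      <pubDate>Tue, 06 Oct 2009 09:00:00 GMT</pubDate>
      <description>More content.</description>
    </item>
  </channel>
</rss>"""


def harvest_items(feed_xml, archive):
    """Add feed items not yet in the archive; return how many were added."""
    root = ET.fromstring(feed_xml)
    added = 0
    for item in root.iter("item"):
        # Use the guid as the archive key, falling back to the link.
        key = item.findtext("guid") or item.findtext("link")
        if key is None or key in archive:
            continue
        archive[key] = {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "published": item.findtext("pubDate"),
            "content": item.findtext("description"),
        }
        added += 1
    return added


archive = {}
first_run = harvest_items(SAMPLE_FEED, archive)   # stores both posts
second_run = harvest_items(SAMPLE_FEED, archive)  # nothing new to store
```

Compared with a crawler, this captures the post content and its metadata as structured fields, at the cost of missing anything the feed does not carry, such as page layout.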

Novel Workflows for Abstract Handling of Complex Interaction Processes in Digital Preservation

The creation of most digital objects occurs solely in interactive graphical user interfaces which were available at the particular time period. Archiving and preservation organizations are faced with large amounts of such objects of various types. At some point they will need to, if possible, automatically process these to make them available to their users or convert them to a valid format. A substantial problem in creating an automated process is the availability of suitable tools. We are suggesting a new method, which uses an operating system and application independent interactive workflow for the migration of digital objects using an emulated environment. Success terms for the conception and functionality of emulation environments are therefore devised which should be applied to future long-term archiving methods.

A Framework for Distributed Preservation Workflows 

The Planets project is developing a service-oriented environment for the definition and evaluation of preservation strategies for human-centric data. It focuses on the question of logically preserving digital materials, as opposed to the physical preservation of content bit-streams. This includes the development of preservation tools for the automated characterization, migration, and comparison of different types of digital objects as well as the emulation of their original runtime environment in order to ensure longtime access and interpretability. The Planets integrated environment provides a number of end-user applications that allow data curators to execute and scientifically evaluate preservation experiments based on composable preservation services. In this paper, we focus on the middleware and programming model and show how it can be utilized in order to create complex preservation workflows.

Lessons Learned: Moving a Digital Preservation Network from Project Organization to Sustainability 

nestor, the German network of expertise in digital preservation, started as a time-limited project in 2003. Besides the establishment of a network of expertise with an information platform, working groups, and training opportunities, a central goal of the project phase was to prepare a sustainable organization model for the network's services. In July 2009, nestor transformed into a sustainable organization with 6 of the 7 project partners and 2 additional organizations entering into a consortium agreement. The preparation of the sustainable organization was a valuable experience for the project partners because vision and mission of the network were critically discussed and refined for the future organization. Some more aspects were identified that also need further refinement in order to make nestor fit for the future. These aspects shall be discussed in the paper.

Are You Ready? Assessing Whether Organisations are Prepared for Digital Preservation 

In the last few years digital preservation has started to transition from a theoretical discipline to one where real solutions are beginning to be used. The Planets project has analyzed the readiness of libraries, archives and related organizations to begin to use the outputs of various digital preservation initiatives (and, in particular, the outputs of the Planets project). This talk will discuss the outcomes of this exercise, which have revealed an increasing understanding of the problem. It has also shown concerns about its scale (in terms of data volumes and types of data) and about the maturity of existing solutions (most people are only aware of piecemeal solutions). It also shows that there is a new challenge emerging: moving from running digital preservation projects to embedding the processes within their organisations.

Preserving the Digital Memory of the Government of Canada: Influence and Collaboration with Records Creators 

Library and Archives Canada has a wide mandate to preserve and provide access to Canadian published heritage, records of national significance, as well as to acquire the records created by the Government of Canada, deemed to be of historical importance. To address this mandate, Library and Archives Canada has undertaken the development of a digital preservation infrastructure covering policy, standards and enterprise applications which will serve requirements for ingest, metadata management, preservation and access. The purpose of this paper is to focus on the efforts underway to engage digital recordkeeping activities in the Government of Canada and to influence and align those processes with LAC digital preservation requirements. The LAC strategy to implement preservation considerations early in the life cycle of the digital record is to establish a mandatory legislative and policy framework for recordkeeping in government. This includes a Directive on Recordkeeping, Core Digital Records Metadata Standard for archival records, Digital File Format Guidance, as well as Web 2.0 and Email Recordkeeping Guidelines. The expected success of these initiatives and the collaborative approach should provide a model for other digital heritage creators in Canada.

Where the Semantic Web and Web 2.0 Meet Format Risk Management: P2 Registry 

The Web is increasingly becoming a platform for linked data. This means making connections and adding value to data on the Web. As more data becomes openly available and more people are able to use the data, it becomes more powerful. An example is file format registries and the evaluation of format risks. Here the requirement for information is now greater than the effort that any single institution can put into gathering and collating this information. Recognising that more is better, the creators of PRONOM, JHOVE, GDFR and others are joining to lead a new initiative, the Unified Digital Format Registry. Ahead of this effort a new RDF-based framework for structuring and facilitating file format data from multiple sources including PRONOM has demonstrated it is able to produce more links, and thus provide more answers to digital preservation questions - about format risks, applications, viewers and transformations - than the native data alone. This paper will describe this registry, P2, and its services, show how it can be used, and provide examples where it delivers more answers than the contributing resources.

MIXED: Repository of Durable File Format Conversions

DANS (Data Archiving and Networked Services), the Dutch scientific data archive for the social sciences and humanities, is engaged in the MIXED project to develop open source software that implements the “smart migration” strategy concerning the long-term archiving of file formats. Smart migration concerns the conversion upon ingest of specific kinds of data formats, such as spreadsheets and databases, to an intermediate XML formatted file. It is assumed that the long-term curation of the XML files is much less problematic than the migration of binary source files and that the intermediate XML file can be converted in an efficient way to file formats that are common in the future. The features of the intermediate XML files are stored in the so-called SDFP schema (Standard Data Formats for Preservation). This XML schema can be considered as an umbrella as it contains existing formal descriptions of file formats developed by others. SDFP also contains schemas developed by DANS, e.g. a schema for file oriented databases. It can be used e.g. for the binary "DataPerfect" format that was used on a large scale about twenty years ago and for which no existing XML schema could be found. The software developed in the MIXED project has been set up as a generic framework, together with a number of plug-ins. It can be considered as a repository of durable file format conversions. The MIXED project is in its final phase and this paper contains an overview of the results.
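
The conversion-upon-ingest step can be illustrated with a small Python sketch that turns tabular data into an intermediate XML document. It uses CSV as a stand-in for a spreadsheet source, and the element names (`table`, `columns`, `row`, `cell`) are invented for this example; they are not taken from the SDFP schema, and the function is not part of the MIXED software.

```python
import csv
import io
import xml.etree.ElementTree as ET


def table_to_xml(csv_text, table_name):
    """Convert CSV text with a header row into a simple XML document.

    Element and attribute names are hypothetical, chosen for illustration.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)

    table = ET.Element("table", name=table_name)
    columns = ET.SubElement(table, "columns")
    for column_name in header:
        ET.SubElement(columns, "column", name=column_name)

    rows = ET.SubElement(table, "rows")
    for record in reader:
        row = ET.SubElement(rows, "row")
        for column_name, value in zip(header, record):
            cell = ET.SubElement(row, "cell", column=column_name)
            cell.text = value
    return ET.tostring(table, encoding="unicode")


intermediate = table_to_xml("id,title\n1,Alpha\n2,Beta\n", "works")
```

Because the structure (column names, row boundaries) is stated explicitly in the XML, a later converter can target whatever format is current without needing software that understands the original binary file.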

Distributed Digital Preservation: Technical, Sustainability, and Organizational Developments

Representatives from a variety of distributed digital preservation initiatives will serve as panel members and discuss the technical adaptability, economics, and functionally compelling benefits of using cooperative distributed digital preservation networks to preserve the vast array of at-risk digital content produced by our societies and their institutions.

LIFE3: Predicting Long Term Digital Preservation Costs 

This paper will provide an overview of developments from the two phases of the LIFE (Lifecycle Information for E-Literature) project, LIFE1 and LIFE2, before describing the aims and latest progress from the third phase. Emphasis will be placed on the various approaches to estimate preservation costs including the use of templates to facilitate user interaction with the costing tool. The paper will also explore how the results of the Project will help to inform preservation planning and collection management decisions with a discussion of scenarios in which the LIFE costing tool could be applied. This will be supported by a description of how adopting institutions are already utilising LIFE tools and techniques to analyse and refine their existing preservation activity as well as to enhance their collection management decision making. 

Towards Support for Long-Term Digital Preservation in Product Life Cycle Management

Important legal and economic motivations exist for the design and engineering industry to address and integrate digital long-term preservation into product life cycle management (PLM). Investigations revealed that it is not sufficient to archive only the product design data which is created in early PLM phases, but preservation is needed for data that is produced during the entire product lifecycle including early and late phases. Data that is relevant for preservation consists of requirements analysis documents, design rationale, data that reflects experiences during product operation and also metadata like social collaboration context. In addition, also the engineering environment itself that contains specific versions of all tools and services is a candidate for preservation. This paper takes a closer look at engineering preservation use case scenarios as well as PLM characteristics and workflows that are relevant for long-term preservation. Resulting requirements for a long-term preservation system lead to an OAIS (Open Archival Information System) based system architecture and a proposed preservation service interface that respects the needs of the engineering industry. 
