Semantic Model: Functional Requirements

Collections submitted to HOPE are as varied and complex as the institutional settings in which they have been managed and described. HOPE social history institutions tend to hold foreign language or multilingual collections, and materials collected—including publications, personal papers, organizational records, grey literature, paraphernalia, films, and visual materials—tend to cross traditional information domains. The HOPE Content Providers Survey uncovered the following general characteristics of social history metadata:

  • ''Domain specific:'' Metadata is domain specific both in the sense that descriptions include information specific to the type of material (e.g. work of art, book, series of letters) and in the sense that they reflect the institutional context in which they have been created (museum, library, archive).  Archival metadata tends to contain detailed information on the provenance of the document but little information about physical characteristics. Visual metadata, e.g. on photographs or posters, tends to contain extensive information about the materials and techniques used for the production of the work.
  • ''Idiosyncratic:'' Half of the metadata is encoded using published standards, but the other half is encoded idiosyncratically, using XML schemas specific to HOPE partners' local systems.
  • ''Hierarchical:'' Though only 7.3 percent of the metadata has been explicitly identified in the survey as having a multilevel structure, based on the initial inventory it is likely that 40 percent of the metadata has a multilevel structure. This particularly applies to bibliographic metadata on periodicals.
  • ''Multilingual:'' Metadata has been recorded in eight different languages. Almost half of the descriptions are in German and 14.2 percent are in English. More than 80 percent of the metadata records proposed for submission are unilingual. The remaining metadata is bilingual, mostly Dutch and English.
  • ''Compound objects:'' For more than half of the HOPE collections (68 out of 129 collections covering 41.2 percent of the metadata), the number of digital files exceeds the number of metadata records. This likely indicates that a significant part of the metadata relates to compound objects, i.e. digital objects that each consist of multiple digital files such as a multi-page document.

Corresponding Characteristics of the HOPE Common Metadata Structure

An analysis of the key characteristics of content providers' metadata (domain-specific, idiosyncratic, hierarchical, compound objects, multilingual) and the aim of providing a standardized interoperability framework (ensure interoperability, encourage standards, reduce dependencies) have led HOPE to define the HOPE Common Metadata Structure according to the following principles:

  • ''Use of open, well-established metadata standards.'' The HOPE Common Metadata Structure depends on well-established metadata standards: 1) relying on web standards for the exchange of data, specifically the protocols endorsed by the W3C; 2) supporting the use of encoding standards already in wide use by HOPE content providers but also verified against current best practices in the domain; 3) reinforcing the use of cataloging rules currently employed by content providers in an effort to improve the semantic and syntactic consistency of the HOPE metadata; and 4) building on a standardized but flexible data model that can easily accommodate metadata from the identified domains as well as from new domains of future HOPE collections.
  • ''Accommodate domain-specific metadata.'' The HOPE Common Metadata Structure supports specific metadata elements in a single information space for each of the following identified domains: archival, library, visual/museum, and audio-visual. The HOPE Common Metadata Structure enables the cross searching of the entire HOPE Metadata while retaining domain-specific context of submitted metadata.
  • ''Accommodate hierarchically structured descriptions.'' The HOPE Common Metadata Structure has been created to accommodate hierarchically structured descriptions for each of the four domains. Description levels include: fonds, series, files, documents for multilevel archival descriptions; titles and issues of periodicals; monographs and series of monographs; object groupings or single objects; broadcast (radio or TV) series and single episodes; films or documentaries.
  • ''Accommodate multilingual metadata.'' The HOPE Common Metadata Structure has been developed to capture the language of the metadata record as well as the language of translated metadata elements.
  • ''Accommodate compound objects.'' The HOPE Common Metadata Structure supports one-to-many relationships between metadata records and digital files. Collections with compound object(s) may include ''inter alia'': archival files with multiple digitized documents; publications digitized page by page; or cinematographic works that, for technical reasons, consist of multiple audio-visual files.

Discovery-to-Delivery Needs of Designated Community

Four target social history user groups were identified: social history researchers and curators, professional researchers, informed European citizens, and global users (see: Designated Community). Based on these groups, a range of discovery-to-delivery use cases was developed, which helped HOPE to identify the following basic dissemination, discovery, and delivery principles for social history metadata and content.

  • ''Access to content from multiple domains:'' The system must provide users with access to digital objects and metadata within the limits specified by HOPE content providers. Metadata for archival, bibliographic, audiovisual, and visual units must be supported and presented together in a meaningful way, unambiguously identifying the context of each item and the parts of each collection which have digitized objects and are accessible online. End users should be clearly informed of intellectual property rights (IPR) and access rights, including use conditions applicable for each item of the collections made available.
  • ''Search and browse functionality:'' Users must be provided with both simple and advanced search functions. The search should be diacritic-insensitive and support Boolean logic operators. Users must be able to filter result sets by date, language, and media type or format. Users must also be able to browse and filter content by institution, collection, and relevant themes. Added contextual information and other enhancements of base metadata, e.g. timelines, themes, or tags, should be presented as part of the search and discovery interface.
  • ''Access to and delivery of objects:'' The system should support common access formats, which allow the user to interact with digital objects or reuse them in another environment. Access formats may include digital objects in different media types for viewing/listening (via online reader/player), printing, and downloading. Users must be able to request high-resolution and, when applicable, print copies of digital objects from local content providers.
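
The search requirements above (diacritic-insensitive matching, Boolean operators, and filtering by language or media type) can be illustrated with a minimal sketch. It is not the HOPE Search API: the record fields, the `fold`, `matches`, and `search` helpers, and the sample records are all hypothetical and serve only to show how the requirements might interact.

```python
from __future__ import annotations

import re
import unicodedata


def fold(text: str) -> str:
    """Strip combining marks and case so that 'Österreich' matches 'osterreich'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def tokens(text: str) -> set[str]:
    """Split folded text into a set of word tokens."""
    return set(re.findall(r"\w+", fold(text)))


def matches(record: dict, all_of=(), any_of=(), none_of=()) -> bool:
    """Boolean matching: AND over all_of, OR over any_of, NOT over none_of."""
    words = tokens(record["title"])
    all_ok = all(fold(term) in words for term in all_of)
    any_ok = not any_of or any(fold(term) in words for term in any_of)
    none_ok = not any(fold(term) in words for term in none_of)
    return all_ok and any_ok and none_ok


def search(records, *, all_of=(), any_of=(), none_of=(),
           language=None, media_type=None) -> list[dict]:
    """Apply the controlled-field filters first, then the Boolean term match."""
    hits = []
    for record in records:
        if language is not None and record["language"] != language:
            continue
        if media_type is not None and record["media_type"] != media_type:
            continue
        if matches(record, all_of, any_of, none_of):
            hits.append(record)
    return hits


# Hypothetical sample records, invented for illustration only.
RECORDS = [
    {"id": "r1", "title": "Arbeiterbewegung in Österreich",
     "language": "de", "media_type": "text"},
    {"id": "r2", "title": "Affiches de la grève générale",
     "language": "fr", "media_type": "image"},
    {"id": "r3", "title": "Posters of the general strike",
     "language": "en", "media_type": "image"},
]

austria = search(RECORDS, all_of=["osterreich"])
strike_images = search(RECORDS, any_of=["greve", "strike"], media_type="image")
not_posters = search(RECORDS, any_of=["greve", "strike"], none_of=["posters"])
```

In this sketch the unaccented query `osterreich` finds the German record, and `greve` finds the French one, because both the query terms and the indexed titles pass through the same folding step. A production index would apply the folding at indexing time rather than per query.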

From these principles, base functional requirements for the HOPE Common Metadata Structure were defined. The structure needed to capture hierarchical relationships between metadata records as well as the level of description of each metadata record. It likewise had to capture the relationship between metadata records and the corresponding digital content. In addition, the structure had to contain metadata elements to record access restrictions on metadata and digital objects as well as use rights for the digital objects and physical material. To facilitate cross-domain search and discovery, the following elements were also included: associated dates/time spans; free-text fields for identifying and describing items (e.g. title, name, summary, physical description); and controlled fields on institution/depository, collection, media type, theme, and language of material. To support search across multilingual metadata, fields were needed to capture transliterated values and the language of particular metadata elements. Finally, to support the delivery of objects, the metadata structure needed to record the media type of digital content and the types of available derivatives.
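
The functional requirements just listed can be pictured as a small record model. The sketch below is an illustration only and does not reproduce the actual HOPE schema: the class names (`LangValue`, `DigitalFile`, `Record`), the field names, and the sample fonds are all assumptions chosen to mirror the requirements (parent links and description levels, one-to-many links to digital files, language-tagged values with transliterations, access and rights fields, media type and derivatives).

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LangValue:
    """A free-text value tagged with its language and an optional transliteration."""
    value: str
    language: str
    transliteration: Optional[str] = None


@dataclass
class DigitalFile:
    """One digital file, with its media type and the derivatives available for delivery."""
    file_id: str
    media_type: str
    derivatives: list[str] = field(default_factory=list)


@dataclass
class Record:
    """A metadata record at a given level of description (hypothetical field names)."""
    record_id: str
    domain: str                      # e.g. archival, library, visual, audio-visual
    level: str                       # e.g. fonds, series, file, document
    titles: list[LangValue]
    parent_id: Optional[str] = None  # hierarchical link to the parent description
    files: list[DigitalFile] = field(default_factory=list)
    access_restricted: bool = False
    use_rights: str = ""


def ancestors(record: Record, index: dict[str, Record]) -> list[Record]:
    """Walk parent links upward, nearest ancestor first."""
    chain = []
    current = record
    while current.parent_id is not None:
        current = index[current.parent_id]
        chain.append(current)
    return chain


def is_compound(record: Record) -> bool:
    """A record describes a compound object when it has more than one digital file."""
    return len(record.files) > 1


# Invented example: a fonds, a series, and an archival file digitized page by page.
fonds = Record("f1", "archival", "fonds", [LangValue("Union papers", "en")])
series = Record("s1", "archival", "series",
                [LangValue("Correspondence", "en")], parent_id="f1")
archival_file = Record(
    "d1", "archival", "file",
    [LangValue("Письма", "ru", transliteration="Pis'ma")],
    parent_id="s1",
    files=[DigitalFile(f"d1-p{n}", "image", ["thumbnail", "high-resolution"])
           for n in range(1, 4)],
)
index = {r.record_id: r for r in (fonds, series, archival_file)}
context = [r.level for r in ancestors(archival_file, index)]
```

Keeping the parent link and the level on every record is what lets a discovery interface show an item in the context of its series and fonds, while the list of files per record covers the compound-object case.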

These principles equally apply to the search and retrieval of the HOPE metadata and digital content via the HOPE Search API. Currently, the Social History Portal is the primary site dedicated specifically to the social history research community that has been identified by HOPE. At this point, it is likewise the only site for which the XML format used to store metadata in the HOPE Aggregator serves as an application profile (API). For this reason, the HOPE metadata supplied to the Social History Portal will be much larger and more robust than that supplied to Europeana and other discovery services; this will enable IALHI to take full advantage of the metadata, indexes, and authority lists specific to the HOPE Social History Resource. The portal will eventually include the metadata records of non-digitized collections belonging to the HOPE partners, which will also be aggregated using the HOPE Common Metadata Structure. HOPE has taken on the responsibility to develop the Social History Portal into a comprehensive discovery service with functionality and enhancements tailored to the needs of the designated community and to the specific nature of the HOPE Social History Resource. It is an added advantage that all HOPE content providers are members of IALHI and thus have a say in the development of the discovery and delivery interface.

As it currently stands, all other discovery services harvest or import more generic sets of metadata as laid out in HOPE dissemination profiles; dissemination profiles define which metadata records will be supplied as well as the specific elements to be supplied. The main service harvesting metadata from the HOPE Aggregator is the Europeana portal. The HOPE Aggregator transforms HOPE metadata into Europeana-compliant metadata using the Europeana Data Model (EDM) schema. The EDM format provides: a cross-domain encoding format; the possibility to query metadata using semantic search and browse tools developed for the Europeana portal; and some support for hierarchical descriptions. HOPE also aims to target a global audience through sites such as Flickr, YouTube, and Scribd. In this case, HOPE will focus on the dissemination of digital content with a concise set of metadata providing the basic context for the digital content. The HOPE Aggregator supports the export of metadata and digital objects to a range of social site APIs as well as the encoding of metadata in the formats specified by these sites. 
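
The idea of a dissemination profile, which defines both the records to be supplied and the elements supplied for each, can be sketched as a filter plus a projection. The `DisseminationProfile` class, the two example profiles, and the record fields below are hypothetical; they do not reproduce the actual HOPE profiles or the EDM mapping performed by the HOPE Aggregator.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DisseminationProfile:
    """Which records a target service receives, and which elements of each record."""
    name: str
    elements: tuple[str, ...]
    include: Callable[[dict], bool]


def apply_profile(records: list[dict], profile: DisseminationProfile) -> list[dict]:
    """Select the records the profile admits and keep only its listed elements."""
    supplied = []
    for record in records:
        if not profile.include(record):
            continue
        supplied.append({key: record[key] for key in profile.elements if key in record})
    return supplied


# Invented sample records; the field names are assumptions for illustration.
RECORDS = [
    {"id": "r1", "title": "Strike poster", "date": "1936", "media_type": "image",
     "provenance": "Donated by a trade union", "has_digital_content": True},
    {"id": "r2", "title": "Union minutes", "date": "1921", "media_type": "text",
     "provenance": "Transferred from a union archive", "has_digital_content": False},
]

# A generic profile: a concise element set, digitized material only.
GENERIC = DisseminationProfile(
    name="generic",
    elements=("id", "title", "date", "media_type"),
    include=lambda record: record["has_digital_content"],
)

# A fuller profile: all elements, including records without digital content.
FULL = DisseminationProfile(
    name="full",
    elements=("id", "title", "date", "media_type", "provenance"),
    include=lambda record: True,
)

generic_output = apply_profile(RECORDS, GENERIC)
full_output = apply_profile(RECORDS, FULL)
```

Separating the record filter from the element list keeps each profile declarative, so a new target service would only need a new profile rather than new export logic. Converting the selected elements into a target encoding such as EDM would be a further step that this sketch leaves out.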

In the end, the goal of the HOPE Common Metadata Structure is to provide a semantic framework that allows the Social History Portal, the Europeana portal, and other discovery services to harvest the HOPE Metadata in the required format. Beyond this, HOPE aims to serve its designated community by knitting together its disparate metadata and content into a coherent catalog of works, the HOPE Social History Resource, a key component of the discovery-to-delivery experience.

Related Resources

Europeana. ''Definition of the Europeana Data Model Elements, Version 5.2.3''. February 2012.

Europeana. ''Europeana Data Model Primer''. July 2013. (…)

This section last updated July 2013. Content is no longer maintained.