The different types of files used in Overlays Capture Architecture

October 6, 2023


Overlays Capture Architecture (OCA) is a structured way to describe data schemas. Because an OCA schema is a collection of separate functional parts bundled together, it brings the benefits of modular design. A modular schema lets multiple parties each contribute their own expertise to separate features of the schema. For example, someone with ontology experience could annotate the schema to connect it to ontologies, another person could contribute the data validation rules, and a subject matter expert could describe in detail how to interpret the data for each attribute. All these parts come together into a single, useful schema that can perform as many (or as few) functions as a researcher needs. A modular design also means you can reuse and recombine parts from other schemas.

An OCA schema can be remixed and redisplayed in multiple formats, each of which serves a different function.

OCA Excel Template – the first step in the development of the OCA standard, the Excel Template lets users write their schema using Excel. This Excel file is then read by the OCA parser to generate the other OCA formats. While it can be convenient to write a schema in Excel, doing so requires a separate tutorial to learn the Excel syntax. As the OCA Ecosystem evolves, we will be moving away from using the OCA Excel Template.

OCA Bundle – this format is machine-readable and packaged as a .zip file. If you open the zip file you can see the separate documents that together describe the capture base and all the overlays of a schema. Each file is written in JSON, an open standard file format for data exchange. The files can look tricky to read because they contain no line breaks to guide a human reader, but you can find lots of online tools for viewing the contents of a JSON file in a more human-readable way.
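If you would rather not paste a schema into an online viewer, the same inspection can be done locally. The sketch below is only an illustration, assuming nothing more than what is described above: the bundle is a zip file containing JSON documents. The function names, the demo file name capture_base.json, and its contents are made up for the example and are not taken from the OCA specification.

```python
import io
import json
import zipfile


def pretty_print_bundle(bundle):
    """Return {file name: indented JSON text} for every .json file in a zip.

    `bundle` may be a path to a .zip file or a file-like object.
    """
    formatted = {}
    with zipfile.ZipFile(bundle) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".json"):
                continue  # skip anything in the archive that is not JSON
            parsed = json.loads(archive.read(name).decode("utf-8"))
            # indent=2 adds the line breaks the compact files lack
            formatted[name] = json.dumps(parsed, indent=2, ensure_ascii=False)
    return formatted


def make_demo_bundle():
    """Build a stand-in bundle in memory (hypothetical name and content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Compact JSON on a single line, like the files described above
        archive.writestr("capture_base.json", '{"attributes":{"plot_id":"Text"}}')
    buffer.seek(0)
    return buffer


for file_name, text in pretty_print_bundle(make_demo_bundle()).items():
    print(file_name)
    print(text)
```

To inspect a real bundle, pass the path to your downloaded .zip file to pretty_print_bundle in place of the in-memory demo.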

OCA Readme – the readme format takes the schema content and puts it into a human-readable and archivable plain text format. OCA Readme represents the contents of a schema in a way that is accessible now and into the future because of its technological simplicity. It is a lengthy but complete record of the schema's features and is well suited to sharing and archiving alongside the machine-readable OCA Bundle.

OCA File – this is a somewhat hidden representation of OCA schema data. It is the core data format of an OCA schema ecosystem. It is the format stored in an OCA Repository, and it tracks the history of the schema, which is the key to enabling rich search. If OCA Bundle is like a compiled program, OCA File is like the source code.

These different representations of the content of an OCA Schema all serve different purposes, which highlights the flexibility of OCA for representing schemas. Information can be presented in either human- or machine-readable formats, or stored in a way that connects the history of the schema to enable better search. All are connected through the schema identifier (the SAID) and all represent the exact same content.


Written by Carly Huitema


University of Guelph