FAIR Data Point Documentation

About FAIR Data Point

FAIR Data Point is a REST API and web client for creating, storing, and serving FAIR metadata. The metadata contents are generated semi-automatically according to the FAIR Data Point software specification document.

Features

  • Store catalogs, datasets, and distributions

  • Manage users

  • Manage access rights to your catalogs, datasets, and distributions

Security

We have two levels of accessibility in FDP. All resources (e.g., catalogs, datasets,…) are publicly accessible; you don’t need to be logged in to browse them. If you want to upload your own resources, you need to be logged in. To get an account, you need to contact an administrator of the FDP. By default, all uploaded resources are publicly accessible by anyone, but if you want to allow someone else to manage your resources (edit/delete), you need to grant it in the resource settings.

We have two types of roles in FDP: an administrator and a user. The administrator is allowed to manage users and all resources. A user can manage only the resources they own.

Installation

FAIR Data Point

FAIR Data Point is distributed as a Docker image. For a basic setup, you only need to run a Mongo DB database alongside it. You can use Docker Compose to run FDP and Mongo DB together:

  1. Create a folder (e.g., /fdp) and enter it

  2. Copy docker-compose.yml provided below

  3. Run the FAIR Data Point with Docker Compose: docker-compose up -d

  4. After starting up, you will be able to open the FAIR Data Point in your browser at http://localhost

  5. You can use docker-compose logs to see the logs and docker-compose down to stop all the services

version: '3'
services:

  fdp:
    image: fairdata/fairdatapoint
    restart: always
    ports:
      - 80:80

  mongo:
    image: mongo:4.0.12
    restart: always
    ports:
      - 27017:27017
    command: mongod

Default users

Initially, migrations fill the database with the predefined data needed, including default users, all with the password “password”.

You can use those accounts for testing, or use them initially to make your own account an administrator and then delete them.

Danger

Having a public instance with default accounts is a security risk. If your FDP instance is public, delete or change the default accounts (mainly Albert Einstein) as soon as possible.

FDP Client

You can run FAIR Data Point without the client if you only need the API. If you want a user interface for browsing the metadata and for administering the FAIR Data Point metadata and users, you can deploy FAIR Data Point together with this client.

The FDP Client works as a proxy in front of the FAIR Data Point itself. It decides which requests should pass through to the FDP (e.g., API calls) and which should be handled by the client (requests from browsers). Therefore, you no longer need to expose the FAIR Data Point publicly.

FAIR Data Point Client is distributed as a Docker image. It runs together with the FAIR Data Point.

Configuration

The client is configured using these ENV variables:

  • FDP_HOST = A hostname of the FAIR Data Point (within the Docker network).

  • PUBLIC_PATH = Use only if FDP is not running at the root; in that case, you need to specify the path. For example, if you run FDP at https://example.com/fairdatapoint, PUBLIC_PATH should be /fairdatapoint.

Example

Here is an example Docker Compose configuration to run FDP and FDP client together:

version: '3'
services:
    server:
        image: fairdata/fairdatapoint
        # ... FDP configuration

    client:
        image: fairdata/fairdatapoint-client
        ports:
            - 80:80
        environment:
            - FDP_HOST=server  # using hostname within the Docker network

You can have a look at a complete example in the FAIR Data Point Example repository.

Tip

It is recommended to run FDP and the FDP client behind a reverse proxy with SSL certificates. See the examples further in the docs for how to do that.

Here is a diagram with an overview of the different components of the FAIR Data Point.

Overview

Setting up a reverse proxy

If you want to run a publicly available FDP, you should use the HTTPS protocol with valid certificates. It is easy to configure FDP to run behind a reverse proxy which takes care of the certificates. Here are some examples of how to configure nginx as a reverse proxy for FDP in different cases.

The proxy_pass directives below use a <client_host> placeholder; use the name of the client Docker container from your deployment instead. Also, you need to set up the Docker DNS resolver somewhere in the configuration:

resolver 127.0.0.11 valid=10s;

Running FDP on domain root

This is an example of running FDP as the root application on domain fairdatapoint.example.com.

server {
    listen 443 ssl;
    ssl_certificate /etc/letsencrypt/live/fairdatapoint.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/fairdatapoint.example.com/privkey.pem;

    server_name fairdatapoint.example.com;

    location / {
        proxy_pass http://<client_host>;
        proxy_set_header Host $host;
        proxy_pass_request_headers on;
    }
}

# Redirect to https
server {
    listen 80;
    server_name fairdatapoint.example.com;
    return 301 https://$host$request_uri;
}

Running FDP on a nested route

Sometimes, you might want to run FDP alongside other applications on the same domain. Here is an example of running FDP on example.com/fairdatapoint. If you run FDP in this configuration, you have to set the PUBLIC_PATH ENV variable, in this example to /fairdatapoint (see the Docker Compose sketch after the nginx configuration below).

server {
    # Configuration for the server, certificates, etc.

    # Define the location FDP runs on
    location ~ /fairdatapoint(/.*)?$ {
        rewrite /fairdatapoint(/.*) $1 break;
        rewrite /fairdatapoint / break;
        proxy_pass http://<client_host>;
        proxy_set_header Host $host;
        proxy_pass_request_headers on;
    }
}
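
On the Docker side, this setup corresponds to setting the PUBLIC_PATH variable on the client service. A minimal sketch based on the Docker Compose example shown earlier (only the environment entries differ):

version: '3'
services:
    # ... server configuration as in the earlier example

    client:
        image: fairdata/fairdatapoint-client
        environment:
            - FDP_HOST=server              # hostname of the FDP service within the Docker network
            - PUBLIC_PATH=/fairdatapoint   # must match the nginx location above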

When running on a nested route, don’t forget to change the paths to all custom assets referenced in the SCSS files.

Configuration

Application Configuration

You can override the default settings in an application-production.yml file. You can take inspiration from the default configuration and the default production configuration. Create application-production.yml in the /fdp folder and mount it into the Docker container using the volumes directive.

fdp:
    image: fairdata/fairdatapoint
    restart: always
    volumes:
      - ./application-production.yml:/fdp/application-production.yml
    ports:
      - 80:8080

Possible configuration

Here is a list of the possible configuration options. The options marked as required should be addressed if you intend to use the FDP professionally. An example override file is sketched after the list.

  • Application (instance) URL (Required) = Override property instance.url (e.g., http://fdp-staging.fair-dtls.surf-hosted.nl)

  • Server Port (Optional) = Override property server.port (e.g., 80)

  • JWT Token secret (Required) = Override property security.jwt.token.secret-key

  • Metadata Properties (Optional) = Override property metadataProperties with nested properties: rootSpecs, catalogSpecs, datasetSpecs, distributionSpecs, publisherURI, publisherName, language, license, accessRightsDescription

  • Metadata Metrics (Optional) = Override property metadataMetrics. Nested properties are captured as a Map with the metric URI as the key (e.g., https://purl.org/fair-metrics/FM_F1A) and its value (e.g., https://www.ietf.org/rfc/rfc3986.txt)

  • PID (Optional) = Override property pid. You can choose between 2 types of persistent identifiers: default PID System (1) and purl.org PID System (2). Select one of them and write the number of the type into the type property. To configure the concrete PID system, create a property named after the type of the PID system and include the required information for that system. For default, you don’t need to configure anything. For purl, you need to configure baseUrl.

  • Mongo DB (Required) = Override property spring.data.mongodb.uri with the connection string (e.g., mongodb://mongo:27017/fdp)

  • Triple Store (Required) = Override property repository. You can choose between 5 types of triple stores: inMemoryStore (1), NativeStore (2), AllegroGraph (3), graphDB (4), and blazegraph (5). Select one of them and write the number of the type into the type property. To configure the concrete repository, create a property named after the type of repository and include the required information for that repository. For agraph, you need to configure url, username, and password. For graphDb, you need to configure url and repository. For blazegraph, you need to configure url and repository. For native, you need to configure the store directory (e.g., /tmp/fdp-store).
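
For illustration, an application-production.yml that overrides the required properties might look roughly like this. This is only a sketch: all values are placeholders and the nesting simply follows the property paths listed above.

instance:
    url: https://fairdatapoint.example.com

security:
    jwt:
        token:
            secret-key: change-me-to-a-long-random-secret

spring:
    data:
        mongodb:
            uri: mongodb://mongo:27017/fdp

# Triple store: 1 = inMemoryStore (no further configuration needed)
repository:
    type: 1

# Optional: metadata metrics as a map of metric URI to its value
metadataMetrics:
    "https://purl.org/fair-metrics/FM_F1A": https://www.ietf.org/rfc/rfc3986.txt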

Customizations

You can customize the look and feel of the FDP Client using SCSS. There are three files you can mount to /src/scss/custom (a Docker Compose sketch for mounting them is shown at the end of this section). If there are any changes in these files, the styles will be regenerated when the FDP Client starts.

Customization files

_variables.scss

A lot of values related to styles are defined as variables. The easiest way to customize the FDP Client is to define new values for these variables. To do so, you create a file called _variables.scss where you define the values that you want to change.

Here is an example of changing the primary color.

// _variables.scss

$color-primary: #087d63;

Have a look in src/scss/_variables.scss to see all the variables you can change.

_extra.scss

This file is loaded before all other styles. You can use it, for example, to define new styles or import fonts.

_overrides.scss

This file is loaded after all other styles. You can use it to override existing styles.
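
When running the FDP Client with Docker Compose, these customization files could be mounted roughly like this (a sketch; the local ./custom directory is an assumption):

services:
    client:
        image: fairdata/fairdatapoint-client
        volumes:
            # mount the customization files into /src/scss/custom
            - ./custom/_variables.scss:/src/scss/custom/_variables.scss
            - ./custom/_extra.scss:/src/scss/custom/_extra.scss
            - ./custom/_overrides.scss:/src/scss/custom/_overrides.scss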

Usage

Here you can read how to use the metadata extension for OpenRefine to store FAIR data and create metadata in FAIR Data Point.

About metadata extension

The metadata extension for OpenRefine promotes FAIRness of the data through its integration with FAIR Data Point. With the extension, you can easily FAIRify the data you work on directly in OpenRefine in two steps:

  1. Store FAIR data in a configured storage.

  2. Create metadata in the selected FAIR Data Point.

It replaces the legacy project called FAIRifier.

Features

The extension provides its features through the FAIR Metadata extension menu located in the top right corner above the data table (typically next to Wikidata and other extensions).

Store FAIR data

  1. Open the dialog for storing the data by clicking FAIR Metadata > Store data to FAIR storage

  2. Select the desired storage (see Storages)

  3. Select the desired format (the selection changes based on storage)

  4. Press Preview (download) to download the file to verify the contents

  5. Press Store to store the data in the storage

  6. You will see the URL of the stored file, which you can easily copy to the clipboard by clicking a button

Create metadata in FAIR Data Point

  1. Open the dialog for creating the metadata by clicking FAIR Metadata > Create metadata in FAIR Data Point

  2. Select a pre-configured FAIR Data Point connection, or select Custom FDP connection and fill in the information (if allowed, see Settings)

  3. Press Connect to connect to the selected FAIR Data Point

  4. Select a catalog from available or create a new one

    • For a new one, fill in the metadata form (see also the optional fields) and press Create catalog

  5. Select a dataset from available or create a new one

    • For a new one, fill in the metadata form (see also the optional fields) and press Create dataset

  6. Create a new distribution

  7. Fill in the metadata form (see also the optional fields)

    • For the download URL, you can directly access the Store FAIR data feature, and the field will be filled in after storing the data

  8. Check that your new distribution (and/or other layers) is listed

Setup

This part describes how to set up your own OpenRefine with the metadata extension and how to configure it according to your needs.

Installation

There are two ways of using our metadata extension for OpenRefine. You can add the extension to an existing OpenRefine installation, or use Docker with our prepared image.

Installed OpenRefine

This option requires a compatible version of OpenRefine to be installed; please check Compatibility. In case you need to install OpenRefine first, visit their documentation.

  • Get the desired version of the metadata extension from our GitHub releases page by downloading the tgz or zip archive, e.g., metadata-1.0.0-OpenRefine-3.2.zip.

  • Extract the archive to the extensions folder of your OpenRefine (see OpenRefine documentation).

unzip metadata-1.0.0-OpenRefine-3.2.zip -d path/to/openrefine-3.2/webapp/extensions

With Docker

If you want to use Docker, we provide a Docker image fairdata/openrefine-metadata-extension that combines the extension with a supported version of OpenRefine. It is of course possible to use a volume for the data directory (and optionally data/extensions to include other extensions). All you need is Docker running, and then:

docker run -p 3333:3333 -v /home/me/openrefine-data:/data:z fairdata/openrefine-metadata-extension

This will run OpenRefine with the metadata extension exposed on port 3333 and mount your folder /home/me/openrefine-data as the OpenRefine data folder. You should be able to open OpenRefine in your browser at localhost:3333. If there are other extensions in /home/me/openrefine-data/extensions, those should be loaded as well. For more information, see the OpenRefine documentation.

For the configuration files, you need to mount /webapp/extensions/metadata/module/config; see Configuration for more details.
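
As a sketch, the same setup with the configuration directory mounted could also be written as a Docker Compose service (the local ./data and ./config directories and the service name are assumptions):

version: '3'
services:
    openrefine:
        image: fairdata/openrefine-metadata-extension
        ports:
            - 3333:3333
        volumes:
            - ./data:/data                                          # OpenRefine data folder (may contain extensions/)
            - ./config:/webapp/extensions/metadata/module/config    # metadata extension configuration files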

Configuration

Configuration files of the metadata extension use the YAML format and are stored in the extensions/metadata/module/config directory of the OpenRefine installation in use. The configuration files are loaded when OpenRefine starts. Therefore, you need to restart OpenRefine before changes in the configuration files take effect. We provide examples of the configuration files that you can (re)use.

Settings

The settings configuration file provides generic configuration options that adjust the behaviour of the extension. The structure of the file is as follows (a sketch of a complete file is shown after the list):

  • allowCustomFDP (boolean) = whether the user is allowed to enter a custom FAIR Data Point URI, username, and password (or may use only the pre-configured connections)

  • metadata (map) = key-value specification of instance-wide pre-defined metadata, e.g., set license to http://purl.org/NET/rdflicense/cc-by3.0 and that URI will be pre-set in all metadata forms in the license field (but can be overwritten by the user)

  • fdpConnections (list) = list of pre-configured FAIR Data Point connections that users can use, each being an object with the following attributes:

    • name (string) = custom name identifying the connection

    • baseURI (string) = base URI of FAIR Data Point

    • email (string) = email address identifying user of FAIR Data Point

    • password (string) = password for authenticating the user of FAIR Data Point

    • preselected (boolean, optional) = flag whether the connection should be pre-selected in the form (if more than one connection has this set to true, only the first one is applied)

    • metadata (map, optional) = similar to instance-wide but only for specific connection
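
A minimal sketch of a settings configuration file using only the attributes described above; all values are placeholders:

allowCustomFDP: true

# instance-wide pre-defined metadata
metadata:
    license: http://purl.org/NET/rdflicense/cc-by3.0

# pre-configured FAIR Data Point connections offered to users
fdpConnections:
    - name: Example FAIR Data Point
      baseURI: https://fairdatapoint.example.com
      email: user@example.com
      password: password
      preselected: true

Remember that OpenRefine has to be restarted before changes to this file take effect.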

Storages

The storages configuration file holds details about the storages that can be used for the Store FAIR data feature. The file is expected to contain a list of storage objects, each with the following attributes (a sketch of a complete file is shown after the list):

  • name (string) = custom name identifying the storage

  • type (string) = one of the allowed types (others are ignored): ftp, virtuoso, tripleStoreHTTP

  • enabled (string) = flag whether the storage should be offered to the user

  • username (string, optional) = username for authentication

  • password (string, optional) = password for authentication

  • host (string) = URI of the storage server

  • directory (string) = directory or other location for storing the data

For FTP and Virtuoso, directory should contain the absolute path where files should be stored. In the case of triple stores, the repository name is used to specify the target location.
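
A sketch of a storages configuration with one FTP storage and one triple store; names, hosts, and credentials are placeholders:

- name: Example FTP storage
  type: ftp
  enabled: true
  username: ftpuser
  password: ftppassword
  host: ftp://ftp.example.com
  directory: /fairdata

- name: Example triple store
  type: tripleStoreHTTP
  enabled: true
  host: https://triplestore.example.com
  directory: fairdata    # for triple stores, the repository name is used instead of a path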

Compatibility

metadata extension    OpenRefine        FAIR Data Point
vX.Y.Z                3.3-beta, 3.2     vX.Y.Z
