FAIR Data Point Reference Implementation Documentation¶
About FAIR Data Point¶
FAIRDataPoint is a REST API and Web Client for creating, storing, and serving FAIR metadata. The metadata contents are generated semi-automatically according to the FAIR Data Point software specification document.
Features¶
Store catalogs, datasets, and distributions
Manage users
Manage access rights to your catalogs, datasets, and distributions
Security¶
There are two levels of accessibility in FDP. All resources (e.g., catalogs, datasets, …) are publicly accessible: you don’t need to be logged in to browse them. If you want to upload your own resources, you need to be logged in; to get an account, contact an administrator of the FDP. By default, all uploaded resources are publicly accessible to anyone, but if you want to allow someone else to manage your resources (edit/delete), you need to grant it in the resource settings.
There are two types of roles in FDP - an administrator and a user. The administrator can manage users and all resources. A user can manage only the resources they own.
Users and Roles¶
There are different roles for different levels in the FAIR Data Point.
FAIR Data Point Roles¶
Admin¶
Admin can manage other user accounts and access everything in the FAIR Data Point.
User¶
A user can create new catalogs and access existing catalogs to which they have been added.
Components¶
The deployment of the FAIR Data Point consists of several components:
Triple Store¶
Every FAIR Data Point needs to store its semantic data somewhere; the triple store is where this data lives. It is possible to configure different stores, see the Triple Stores configuration for more details.
MongoDB¶
Besides semantic data, FDP needs information about user accounts and their roles. These data are stored in a MongoDB database.
FAIRDataPoint¶
FAIRDataPoint is distributed as the Docker image fairdata/fairdatapoint. It is the core component that handles all the business logic and operations with the semantic data. It also provides an API for working with the data in different formats.
FAIRDataPoint-client¶
The FDP client is distributed as the Docker image fairdata/fairdatapoint-client. It provides the user interface for humans. It works as a reverse proxy in front of the FAIR Data Point: requests for machine-readable data are passed to the FAIRDataPoint, while requests from a web browser are served the interface for humans.
Reverse Proxy¶
In a production deployment, there is usually a reverse proxy that handles HTTPS certificates, so the connection to the FAIR Data Point is secured. See production deployment to learn how to configure one.
Local Deployment¶
FAIR Data Point is distributed in Docker images. For a simple local deployment, you need to run the fairdatapoint, fairdatapoint-client, and mongo images. See the Components section to read more about what each image is for.
Here is an example of the simplest Docker Compose configuration to run FDP.
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.7.0
  fdp-client:
    image: fairdata/fairdatapoint-client:1.7.0
    ports:
      - 80:80
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
Then you can run it using docker-compose up -d. It might take a while to start. You can run docker-compose logs -f to follow the output log. Once you see a message that the application has started, the FAIR Data Point should be working, and you can open http://localhost.
There are two default user accounts. See the Users and Roles section to read more about users and roles. The default accounts are:

| User name | Role | Password |
|---|---|---|
| albert.einstein@example.com | admin | password |
| nikola.tesla@example.com | user | password |
Danger
Using the default accounts is alright if you run FDP on your machine, but you should change them if you want to run FDP publicly.
Running locally on a different port¶
If you want to run the FAIR Data Point locally on a port other than the default 80, additional configuration is necessary. First, we need to create a new file application.yml and set the client URL to the actual URL we want to use.
# application.yml
instance:
  clientUrl: http://localhost:8080
Then, we need to mount the application config into the FDP container and update the port which the FDP client runs on.
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.7.0
    volumes:
      - ./application.yml:/fdp/application.yml:ro
  fdp-client:
    image: fairdata/fairdatapoint-client:1.7.0
    ports:
      - 8080:80
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
Persistence¶
With the previous configuration, there is no data persistence: once we remove the containers, all the data is lost. To keep the data, we need to configure a MongoDB volume and a persistent triple store.
MongoDB volume¶
We use MongoDB to store information about user accounts and access permissions. We can configure a volume so that the data remains on our disk even if we delete the MongoDB container. We can also expose port 27017 so we can access MongoDB from our local computer using a client application like Robo 3T.
Here is the updated docker-compose file:
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.7.0
  fdp-client:
    image: fairdata/fairdatapoint-client:1.7.0
    ports:
      - 80:80
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
    ports:
      - 27017:27017
    volumes:
      - ./mongo/data:/data/db
Persistent Repository¶
FAIR Data Point uses repositories to store the metadata. By default, it uses the in-memory store, which means that the data is lost after the FDP is stopped.
In this example, we will configure Blazegraph as a triple store. See Triple Stores for other repository options.
If we don’t have it already, we need to create a new file application.yml. We will use this file to configure the repository and mount it as a read-only volume to the fdp container. This file can also be used for other configuration, see Advanced Configuration for more details.
# application.yml
# ... other configuration
repository:
  type: 5
  blazegraph:
    url: http://blazegraph:8080/blazegraph
We now need to update our docker-compose.yml file: we add a new volume for the fdp service and add a blazegraph service. We can also expose port 8080 for Blazegraph so we can access its user interface.
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.7.0
    volumes:
      - ./application.yml:/fdp/application.yml:ro
  fdp-client:
    image: fairdata/fairdatapoint-client:1.7.0
    ports:
      - 80:80
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
    ports:
      - 27017:27017
    volumes:
      - ./mongo/data:/data/db
  blazegraph:
    image: metaphacts/blazegraph-basic:2.2.0-20160908.003514-6
    ports:
      - 8080:8080
    volumes:
      - ./blazegraph:/blazegraph-data
Production Deployment¶
If you want to run the FAIR Data Point in production, it is recommended to use the HTTPS protocol with valid certificates. You can easily configure FDP to run behind a reverse proxy that takes care of the certificates.
In this example, we will configure FDP to run on https://fdp.example.com. We will configure the reverse proxy in the same Docker Compose file; however, this is not necessary, and the proxy can be configured elsewhere.
First of all, we need to generate the certificates on the server where we want to run the FDP. You can use Let’s Encrypt and create the certificates with certbot. The certificates are generated in a standard location, e.g., /etc/letsencrypt/live/fdp.example.com for the fdp.example.com domain. We will mount the whole letsencrypt folder to the reverse proxy container later so that it can use the certificates.
As a reverse proxy, we will use nginx. We need to prepare some configuration, so create a new folder called nginx with the following structure and files:
nginx/
├ nginx.conf
├ sites-available
│ └ fdp.conf
└ sites-enabled
└ fdp.conf -> ../sites-available/fdp.conf
The file nginx.conf is the main configuration of nginx, and it includes all the files from sites-enabled, which contains the configuration for individual servers (we can use one nginx, for example, to handle multiple servers on different domains). All available server configurations live in sites-available, but only those linked into sites-enabled are used.
Let’s look at the content of the configuration files.
# nginx/nginx.conf
# Main nginx config
user www-data www-data;
worker_processes 5;

events {
  worker_connections 4096;
}

http {
  # Docker DNS resolver
  # We can then use docker container names as hostnames in other configurations
  resolver 127.0.0.11 valid=10s;

  # Include all the configuration files from sites-enabled
  include /etc/nginx/sites-enabled/*.conf;
}
Then, we need to configure the FDP server.
# nginx/sites-available/fdp.conf
server {
  listen 443 ssl;

  # Certificates generated using certbot, mounted in docker-compose.yml
  ssl_certificate /etc/letsencrypt/live/fdp.example.com/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/fdp.example.com/privkey.pem;

  server_name fdp.example.com;

  # We pass all requests to the fdp-client container; we can use HTTP in the internal network
  # fdp-client_1 is the name of the client container in our configuration, so we can use it as the host
  location / {
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_pass_request_headers on;
    proxy_pass http://fdp-client_1;
  }
}

# Redirect all requests from HTTP to HTTPS
server {
  listen 80;
  server_name fdp.example.com;
  return 301 https://$host$request_uri;
}
Finally, we need to create a soft link from sites-enabled to sites-available for the FDP configuration.
$ cd nginx/sites-enabled && ln -s ../sites-available/fdp.conf
We now have the certificates generated and the proxy configuration ready. Next, we need to add the proxy to our docker-compose.yml file so we can run the whole FDP behind it.
# docker-compose.yml
version: '3'
services:
  proxy:
    image: nginx:1.17.3
    ports:
      - 80:80
      - 443:443
    volumes:
      # Mount the nginx folder with the configuration
      - ./nginx:/etc/nginx:ro
      # Mount the letsencrypt certificates
      - /etc/letsencrypt:/etc/letsencrypt:ro
  fdp:
    image: fairdata/fairdatapoint:1.7.0
    volumes:
      - ./application.yml:/fdp/application.yml:ro
  fdp-client:
    image: fairdata/fairdatapoint-client:1.7.0
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
    ports:
      - "127.0.0.1:27017:27017"
    volumes:
      - ./mongo/data:/data/db
  blazegraph:
    image: metaphacts/blazegraph-basic:2.2.0-20160908.003514-6
    volumes:
      - ./blazegraph:/blazegraph-data
The last thing to do is to update our application.yml file. We need to add clientUrl so that FDP knows its actual URL even when hidden behind the reverse proxy. It is also good practice to set up a persistent URL for the metadata; we recommend using https://purl.org. If you don’t specify persistentUrl, the clientUrl will be used instead. Finally, we need to set a random JWT token secret key for security.
# application.yml
instance:
  clientUrl: https://fdp.example.com
  persistentUrl: https://purl.org/fairdatapoint/example

security:
  jwt:
    token:
      secret-key: <random 128 characters string>

# repository settings (can be changed to different repository)
repository:
  type: 5
  blazegraph:
    url: http://blazegraph:8080/blazegraph
At this point, we should be able to run all the containers using docker-compose up -d, and after everything starts, we can access the FAIR Data Point at https://fdp.example.com. Of course, the domain you want to access the FDP on must point to the server where it runs.
Danger
Don’t forget to change the default user accounts as soon as your FAIR Data Point becomes publicly available.
Danger
Do not expose the Mongo port unless you have secured the database with a username and password.
Warning
To improve the findability of the FAIR Data Point and its content, it has a built-in feature that registers its URL with our server and pings it once a week. This facilitates indexing of the metadata of each registered and active FAIR Data Point. If you do not want your FAIR Data Point to be included in this registry, add these lines to your application configuration:
# application.yml
ping:
  enabled: false
Advanced Configuration¶
Triple Stores¶
FDP uses the In-Memory triple store by default, and the previous examples used Blazegraph. In total, there are five possible triple stores to choose from:
In-Memory Store
Native Store
Allegro Graph Repository
GraphDB Repository
Blazegraph Repository
1. In-Memory Store¶
There is no need to configure additional properties to run FDP with the In-Memory Store because it is the default option. If you want to state it explicitly in application.yml, add the following lines there:
# application.yml
repository:
  type: 1
2. Native Store¶
With this option, FDP simply saves the data to the file system. If you want to use the Native Store, make sure that you have these lines in your application.yml file:
# application.yml
repository:
  type: 2
  native:
    dir: /tmp/fdp-store
where /tmp/fdp-store is the path to the location where you want to keep your data stored.
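When running FDP in Docker, you will probably also want to mount this directory as a volume so that the native store survives container restarts. A minimal sketch, assuming the Docker Compose setup from the earlier examples and an illustrative local folder ./fdp-store:

# docker-compose.yml (fdp service excerpt)
fdp:
  image: fairdata/fairdatapoint:1.7.0
  volumes:
    - ./application.yml:/fdp/application.yml:ro
    # keep the native store on the host; the container path must match repository.native.dir
    - ./fdp-store:/tmp/fdp-store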
3. Allegro Graph¶
To run with Allegro Graph, you first need to set up your Allegro Graph instance. To configure the connection from FDP, add these lines to your application.yml file:
# application.yml
repository:
  type: 3
  agraph:
    url: http://agraph:10035/repositories/fdp
    username: user
    password: password
The url, username, and password should be configured according to your actual Allegro Graph setup.
4. GraphDB¶
To run with GraphDB, you first need to set up your GraphDB instance and create the repository. To configure the connection from FDP, add these lines to your application.yml file:
# application.yml
repository:
  type: 4
  graphDb:
    url: http://graphdb:7200
    repository: <repository-name>
The url and repository should be configured according to your actual GraphDB setup.
5. Blazegraph¶
To run with Blazegraph, you first need to set up your Blazegraph instance. To configure the connection from FDP, add these lines to your application.yml file:
# application.yml
repository:
  type: 5
  blazegraph:
    url: http://blazegraph:8080/blazegraph
    repository:
The url and repository should be configured according to your actual Blazegraph setup. The repository should be set only if you don’t use the default one.
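For example, if you created your own Blazegraph namespace (here hypothetically named fdp-metadata), the configuration could look like this:

# application.yml
repository:
  type: 5
  blazegraph:
    url: http://blazegraph:8080/blazegraph
    repository: fdp-metadata # only needed when not using the default namespace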
Mongo DB¶
We store users, permissions, etc. in the MongoDB database. The default connection string is mongodb://mongo:27017/fdp. If you want to modify it, add these lines to your application.yml file:
# application.yml
spring:
  data:
    mongodb:
      uri: mongodb://mongo:27017/fdp
The uri should be adjusted to your actual MongoDB setup.
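For instance, if your MongoDB instance requires authentication (recommended whenever its port is exposed), the URI can carry the credentials; the user name, password, and authSource below are illustrative:

# application.yml
spring:
  data:
    mongodb:
      uri: mongodb://fdp-user:secret-password@mongo:27017/fdp?authSource=admin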
Default attached metadata¶
There are several default values that are attached to each created metadata record. If you want to modify them, add the lines below to your application.yml file; they are shown with their default values:
# application.yml
metadataProperties:
  language: http://id.loc.gov/vocabulary/iso639-1/en
  license: http://rdflicense.appspot.com/rdflicense/cc-by-nc-nd3.0
  accessRightsDescription: This resource has no access restriction

metadataMetrics:
  https://purl.org/fair-metrics/FM_F1A: https://www.ietf.org/rfc/rfc3986.txt
  https://purl.org/fair-metrics/FM_A1.1: https://www.wikidata.org/wiki/Q8777
Customizations¶
You can customize the look and feel of the FDP Client using SCSS. There are three files you can mount to /src/scss/custom. If there are any changes in these files, the styles will be regenerated when the FDP Client starts.
Customization files¶
_variables.scss¶
A lot of values related to styles are defined as variables. The easiest way to customize the FDP Client is to define new values for these variables. To do so, create a file called _variables.scss where you define the values that you want to change.
Here is an example of changing the primary color.
// _variables.scss
$color-primary: #087d63;
Have a look in src/scss/_variables.scss to see all the variables you can change.
_extra.scss¶
This file is loaded before all other styles. You can use it, for example, to define new styles or import fonts.
_overrides.scss¶
This file is loaded after all other styles. You can use it to override existing styles.
Example of setting a custom logo¶
To change the logo, you need to do three steps:
Create _variables.scss with the correct logo file name and dimensions
Mount the new logo to the assets folder
Mount _variables.scss to the SCSS custom folder
// _variables.scss
$header-logo-url: '/assets/my-logo.png'; // new logo file
$header-logo-width: 80px; // width of the new logo
$header-logo-height: 40px; // height of the new logo
# docker-compose.yml
version: '3'
services:
  fdp:
    # ... FDP configuration
  fdp-client:
    # ... FDP Client configuration
    volumes:
      # Mount the new logo file to assets in the container
      - ./my-logo.png:/usr/share/nginx/html/assets/my-logo.png:ro
      # Mount _variables.scss so that styles are regenerated
      - ./_variables.scss:/src/scss/custom/_variables.scss:ro
Running FDP on a nested route¶
Sometimes, you might want to run FDP alongside other applications on the same domain. Here is an example of running FDP on https://example.com/fairdatapoint. If you run FDP in this configuration, you have to set the PUBLIC_PATH environment variable, in this example to /fairdatapoint. Also, don’t forget to set the correct client URL in the application config.
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.7.0
    volumes:
      - ./application.yml:/fdp/application.yml:ro
      # ... other volumes
  fdp-client:
    image: fairdata/fairdatapoint-client:1.7.0
    ports:
      - 80:80
    environment:
      - FDP_HOST=fdp
      - PUBLIC_PATH=/fairdatapoint
# application.yml
instance:
  clientUrl: https://example.com/fairdatapoint
# Snippet for nginx configuration
server {
  # Configuration for the server, certificates, etc.

  # Define the location FDP runs on
  location ~ /fairdatapoint(/.*)?$ {
    rewrite /fairdatapoint(/.*) $1 break;
    rewrite /fairdatapoint / break;
    proxy_pass http://<client_host>;
  }
}
When running on a nested route, don’t forget to change the paths to all custom assets referenced in the SCSS files.
Usage¶
Here you can read how to use the metadata extension for OpenRefine to store FAIR data and create metadata in FAIR Data Point.
About metadata extension¶
The metadata extension for OpenRefine promotes the FAIRness of data through its integration with the FAIR Data Point. With the extension, you can easily FAIRify the data you work on directly in OpenRefine in two steps:
Store the FAIR data in a configured storage.
Create metadata in a selected FAIR Data Point.
It replaces the legacy project called FAIRifier.
Features¶
The extension provides its features only through the FAIR Metadata extension menu located in the top right corner above the data table (typically next to Wikidata and others).
Store FAIR data¶
Open the dialog for storing the data by clicking FAIR Metadata > Store data to FAIR storage
Select the desired storage (see Storages)
Select the desired format (the selection changes based on the storage)
Press Preview (download) to download the file and verify its contents
Press Store to store the data in the storage
You will see the URL to the file, which you can easily copy to the clipboard by clicking a button
Create metadata in FAIR Data Point¶
Open the dialog for creating the metadata by clicking FAIR Metadata > Create metadata in FAIR Data Point
Select a pre-configured FAIR Data Point connection, or select Custom FDP connection and fill in the information (if allowed, see Settings)
Press Connect to connect to the selected FAIR Data Point
Select a catalog from those available or create a new one
For a new one, fill in the metadata form (see also the optional fields) and press Create catalog
Select a dataset from those available or create a new one
For a new one, fill in the metadata form (see also the optional fields) and press Create dataset
Create a new distribution
Fill in the metadata form (see also the optional fields)
For the download URL, you can easily access the Store FAIR data feature; the field will be filled in after storing the data
Check that your new distribution (and/or other layers) is listed
Setup¶
This part describes how to set up your own OpenRefine with the metadata extension and how to configure it according to your needs.
Installation¶
There are two ways of using our metadata extension for OpenRefine: you can add the extension to an existing OpenRefine installation, or use Docker with our prepared image.
Installed OpenRefine¶
This option requires a compatible version of OpenRefine to be installed; please check Compatibility. In case you need to install OpenRefine first, visit their documentation.
Get the desired version of the metadata extension from our GitHub releases page by downloading the tgz or zip archive, e.g., metadata-X.Y.Z-OpenRefine-X.Y.zip
Extract the archive to the extensions folder of your OpenRefine (see the OpenRefine documentation):

unzip metadata-X.Y.Z-OpenRefine-X.Y.zip -d path/to/openrefine-X.Y/webapp/extensions
With Docker¶
If you want to use Docker, we provide a Docker image fairdata/openrefine-metadata-extension that combines the extension with a supported version of OpenRefine. It is of course possible to use a volume for the data directory (or for data/extensions to include other extensions). All you need is Docker running, and then:
docker run -p 3333:3333 -v /home/me/openrefine-data:/data:z fairdata/openrefine-metadata-extension
This will run OpenRefine with the metadata extension on the exposed port 3333 and mount your folder /home/me/openrefine-data as the OpenRefine data folder. You should be able to open OpenRefine in a browser at localhost:3333. If there are other extensions in /home/me/openrefine-data/extensions, those should be loaded as well. For more information, see the OpenRefine documentation.
For configuration files, you need to mount /webapp/extensions/metadata/module/config; see Configuration for more details.
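If you prefer Docker Compose over plain docker run, a minimal sketch of an equivalent setup could look as follows; the service name and the local folders ./openrefine-data and ./metadata-config are illustrative, while the container paths are the ones mentioned above:

# docker-compose.yml
version: '3'
services:
  openrefine:
    image: fairdata/openrefine-metadata-extension
    ports:
      - 3333:3333
    volumes:
      # OpenRefine data folder (may also contain an extensions subfolder)
      - ./openrefine-data:/data:z
      # Configuration files of the metadata extension
      - ./metadata-config:/webapp/extensions/metadata/module/config:ro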
Configuration¶
Configuration files of the metadata extension use the YAML format and are stored in the extensions/metadata/module/config directory of the OpenRefine installation in use. The configuration files are loaded when OpenRefine starts; therefore, you need to restart OpenRefine before changes in the configuration files take effect. We provide examples of the configuration files that you can (re)use.
Settings¶
The Settings configuration file serves for generic configuration options that adjust the behaviour of the extension. The structure of the file is the following (a sketch follows the list):
allowCustomFDP (boolean) = whether the user is allowed to enter a custom FAIR Data Point URI, username, and password (or may use only the pre-configured connections)
metadata (map) = key-value specification of instance-wide pre-defined metadata, e.g., set license to http://purl.org/NET/rdflicense/cc-by3.0 and that URI will be pre-set in the license field of all metadata forms (but can be overwritten by the user)
fdpConnections (list) = list of pre-configured FAIR Data Point connections that users can use, each an object with the attributes:
  name (string) = custom name identifying the connection
  baseURI (string) = base URI of the FAIR Data Point
  email (string) = email address identifying the user of the FAIR Data Point
  password (string) = password for authenticating the user of the FAIR Data Point
  preselected (boolean, optional) = flag whether the connection should be pre-selected in the form (if more connections have this set to true, only the first one is applied)
  metadata (map, optional) = similar to the instance-wide metadata but only for this specific connection
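A minimal sketch of a settings file following this structure; the connection details and metadata values are illustrative:

# Settings configuration file (stored in extensions/metadata/module/config)
allowCustomFDP: true
metadata:
  license: http://purl.org/NET/rdflicense/cc-by3.0
fdpConnections:
  - name: Example FAIR Data Point
    baseURI: https://fdp.example.com
    email: user@example.com
    password: password
    preselected: true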
For more information and further configuration options, see examples.
Storages¶
The Storages configuration file holds details about the storages that can be used for the Store FAIR data feature. The file is expected to contain a list of storage objects, each of which has the following attributes (a sketch follows below):
name (string) = custom name identifying the storage
type (string) = one of the allowed types (others are ignored): ftp, virtuoso, tripleStoreHTTP
details (object) = configuration related to the specific type of storage (see examples)
For FTP and Virtuoso, directory should contain the absolute path where the files should be stored. For triple stores, the repository name is used to specify the target location.
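A sketch of such a storages file with one FTP storage and one triple store; the connection-related keys inside details are illustrative placeholders, see the provided examples for the exact options of each type:

# Storages configuration file (stored in extensions/metadata/module/config)
- name: Example FTP storage
  type: ftp
  details:
    # connection options (host, credentials, ...) go here, see examples
    directory: /fair/data          # absolute path where the files are stored
- name: Example triple store
  type: tripleStoreHTTP
  details:
    # connection options go here, see examples
    repository: fairdata           # repository name specifying the target location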
Compatibility¶
Check in-app “About” dialog for compatibility information.
Contributing¶
Development¶
Our projects are open source, and you can contribute via GitHub (fork and pull request).
Changelog¶
Overview¶
Here we summarize the key features and changes for each FAIR Data Point release. For details including bugfixes and minor changes, see Detailed changelog.
1.7.0¶
FDP Index functionality included in FAIR Data Point, with metadata harvesting
Metadata search including RDF types
Possibility to change profile and password for current user
1.6.0¶
API keys for making integrations with FDP easier
State “draft” for created metadata
1.5.0¶
Support for editable resource definitions
Possibility to specify custom storage in OpenRefine using frontend
1.4.0¶
Ping service for call home functionality
Suggesting prefixes for namespaces
1.3.0¶
Introduced DASH and dynamic SHACL shapes
Audit log in OpenRefine extension to keep track of actions performed
1.2.0¶
Option to customize metamodel (metadata layers)
Possibility to delete and create metadata entities
1.1.0¶
New monitoring and configuration for client application
Several further improvements in terms of technical debt
Enhanced connecting to FDP from OpenRefine extension and update to OpenRefine 3.3
1.0.0¶
User management, enhanced security, and ACL
Huge refactoring and upgrades of previously accumulated features and technical debt
Separate project for FAIR Data Point Client (frontend application using FDP API)
New OpenRefine Metadata Extension as a replacement for the deprecated FAIRifier
Detailed changelog¶
Each of the developed components has its own changelog based on Keep a Changelog, and our projects adhere to Semantic Versioning. It is recommended to use matching versions of all components.