FAIR Data Point Reference Implementation Documentation¶
About FAIR Data Point¶
FAIRDataPoint is a REST API and Web Client for creating, storing, and serving FAIR metadata. The metadata contents are generated semi-automatically according to the FAIR Data Point software specification document.
Features¶
Store catalogs, datasets, and distributions
Manage users
Manage access rights to your catalogs, datasets, and distributions
Security¶
We have two levels of accessibility in FDP. All resources (e.g., catalogs, datasets, …) are publicly accessible: you don’t need to be logged in to browse them. If you want to upload your own resources, you need to be logged in; to get an account, contact an administrator of the FDP. By default, all uploaded resources are publicly accessible to anyone, but if you want to allow someone else to manage your resources (edit/delete), you need to grant this in the resource settings.
We have two types of roles in FDP - an administrator and a user. The administrator is allowed to manage users and all resources. A user can manage only the resources they own.
Users and Roles¶
There are different roles for different levels in the FAIR Data Point.
FAIR Data Point Roles¶
Admin¶
Admin can manage other user accounts and access everything in the FAIR Data Point.
User¶
User can create new catalogs and access existing catalogs to which they have been added.
Catalog Roles¶
Owner¶
Owner can update catalog details, add other users and upload new datasets.
Data Provider¶
Data Provider can create new datasets in the catalog.
Dataset Roles¶
Owner¶
The owner of the dataset can update dataset details and add other users.
Components¶
The deployment of the FAIR Data Point consists of a couple of components, described below.
Triple Store¶
Every FAIR Data Point needs to store its semantic data somewhere; a triple store is where that data lives. It is possible to configure different stores; see the Triple Stores configuration for more details.
MongoDB¶
Besides the semantic data, FDP needs information about user accounts and their roles. These data are stored in a MongoDB database.
FAIRDataPoint¶
FAIRDataPoint is distributed in Docker image fairdata/fairdatapoint
. It is the core component which handles all the business logic and operations with the semantic data. It also provides API for working with data in different formats.
FAIRDataPoint-client¶
FDP client is distributed in Docker image fairdata/fairdatapoint-client
. It provides the user interface for humans. It works as a reverse proxy in front of the FAIR Data Point which decides whether the request is for machine-readable data and passes it to the FAIRDataPoint or from a web browser in which case it serves the interface for humans.
Reverse Proxy¶
In a production deployment, there is usually a reverse proxy that handles HTTPS certificates, so the connection to the FAIR Data Point is secured. See production deployment to learn how to configure one.
Local Deployment¶
FAIR Data Point is distributed in Docker images. For a simple local deployment, you need to run the fairdatapoint, fairdatapoint-client, and mongo images. See the Components section to read more about what each image is for.
Here is an example of the simplest Docker Compose configuration to run FDP.
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.12.0
  fdp-client:
    image: fairdata/fairdatapoint-client:1.12.0
    ports:
      - 80:80
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
Then you can run it using docker-compose up -d. It might take a while to start. You can run docker-compose logs -f to follow the output log. Once you see a message that the application has started, the FAIR Data Point should be working, and you can open http://localhost.
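To recap, the whole startup sequence from the directory containing docker-compose.yml is shown below; the final curl is a quick smoke test assuming the default port mapping above:
docker-compose up -d
docker-compose logs -f
# once started, the root metadata should be served:
curl -H "Accept: text/turtle" http://localhost/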
There are two default user accounts. See the Users and Roles section to read more about users and roles. The default accounts are:
User name | Role | Password
---|---|---
albert.einstein@example.com | admin | password
nikola.tesla@example.com | user | password
Danger
Using the default accounts is alright if you run FDP on your machine, but you should change them if you want to run FDP publicly.
Running locally on a different port¶
If you want to run the FAIR Data Point locally on a different port than the default 80, additional configuration is necessary. First, we need to create a new file application.yml and set the client URL to the actual URL we want to use.
# application.yml
instance:
  clientUrl: http://localhost:8080
Then, we need to mount the application config into the FDP container and update the port which the FDP client runs on.
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.12.0
    volumes:
      - ./application.yml:/fdp/application.yml:ro
  fdp-client:
    image: fairdata/fairdatapoint-client:1.12.0
    ports:
      - 8080:80
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
Persistence¶
We don’t have any data persistence with the previous configuration. Once we remove the containers, all the data will be lost. To keep the data, we need to configure a MongoDB volume and a persistent triple store.
MongoDB volume¶
We use MongoDB to store information about user accounts and access permissions. We can configure a volume so that the data stays on our disk even if we delete the MongoDB container. We can also expose port 27017 so we can access MongoDB from our local computer using a client application like Robo 3T.
Here is the updated docker-compose file:
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.12.0
  fdp-client:
    image: fairdata/fairdatapoint-client:1.12.0
    ports:
      - 80:80
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
    ports:
      - 27017:27017
    volumes:
      - ./mongo/data:/data/db
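With the port exposed, a quick sanity check from the host is possible with the mongo shell, if installed locally (the exact collection names depend on the FDP version):
mongo localhost:27017/fdp --eval 'db.getCollectionNames()'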
Persistent Repository¶
FAIR Data Point uses repositories to store the metadata. By default, it uses the in-memory store, which means that the data is lost after the FDP is stopped.
In this example, we will configure Blazegraph as a triple store. See Triple Stores for other repository options.
If we don’t have it already, we need to create a new file application.yml. We will use this file to configure the repository and mount it as a read-only volume to the fdp container. This file can be used for other configuration; see Advanced Configuration for more details.
# application.yml
# ... other configuration
repository:
  type: 5
  blazegraph:
    url: http://blazegraph:8080/blazegraph
We now need to update our docker-compose.yml file: we add a new volume for the fdp service and add a blazegraph service. We can also expose port 8080 for Blazegraph so we can access its user interface.
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.12.0
    volumes:
      - ./application.yml:/fdp/application.yml:ro
  fdp-client:
    image: fairdata/fairdatapoint-client:1.12.0
    ports:
      - 80:80
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
    ports:
      - 27017:27017
    volumes:
      - ./mongo/data:/data/db
  blazegraph:
    image: metaphacts/blazegraph-basic:2.2.0-20160908.003514-6
    ports:
      - 8080:8080
    volumes:
      - ./blazegraph:/blazegraph-data
Production Deployment¶
If you want to run the FAIR Data Point in production, it is recommended to use the HTTPS protocol with valid certificates. You can easily configure FDP to run behind a reverse proxy which takes care of the certificates.
In this example, we will configure FDP to run on https://fdp.example.com. We will see how to configure the reverse proxy in the same Docker Compose file; however, this is not necessary, and the proxy can be configured elsewhere.
First of all, we need to generate the certificates on the server where we want to run the FDP. You can use Let’s Encrypt and create the certificates with certbot. The certificates are generated in a standard location, e.g., /etc/letsencrypt/live/fdp.example.com for the fdp.example.com domain. We will mount the whole letsencrypt folder to the reverse proxy container later so that it can use the certificates.
As a reverse proxy, we will use nginx. We need to prepare some configuration, so create a new folder called nginx with the following structure and files:
nginx/
├ nginx.conf
├ sites-available
│ └ fdp.conf
└ sites-enabled
└ fdp.conf -> ../sites-available/fdp.conf
The file nginx.conf is the configuration of the whole nginx, and it includes all the files from sites-enabled, which contain the configuration for individual servers (we can use one nginx, for example, to handle multiple servers on different domains). All available configurations for different servers are in sites-available, but only those linked into sites-enabled are used.
Let’s see what should be the content of the configuration files.
# nginx/nginx.conf
# Main nginx config
user www-data www-data;
worker_processes 5;

events {
    worker_connections 4096;
}

http {
    # Docker DNS resolver
    # We can then use docker container names as hostnames in other configurations
    resolver 127.0.0.11 valid=10s;

    # Include all the configuration files from sites-enabled
    include /etc/nginx/sites-enabled/*.conf;
}
Then, we need to configure the FDP server.
# nginx/sites-available/fdp.conf
server {
    listen 443 ssl;

    # Certificates generated using certbot; we mount these in docker-compose.yml
    ssl_certificate /etc/letsencrypt/live/fdp.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/fdp.example.com/privkey.pem;

    server_name fdp.example.com;

    # We pass all requests to the fdp-client container; we can use HTTP in the internal network
    # fdp-client_1 is the name of the client container in our configuration, so we can use it as the host
    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_pass_request_headers on;
        proxy_pass http://fdp-client_1;
    }
}

# We redirect all requests from HTTP to HTTPS
server {
    listen 80;
    server_name fdp.example.com;
    return 301 https://$host$request_uri;
}
Finally, we need to create a soft link from sites-enabled to sites-available for the FDP configuration.
$ cd nginx/sites-enabled && ln -s ../sites-available/fdp.conf
We have the certificates generated and the configuration for the proxy ready. Now we need to add the proxy to our docker-compose.yml file so we can run the whole FDP behind it.
# docker-compose.yml
version: '3'
services:
  proxy:
    image: nginx:1.17.3
    ports:
      - 80:80
      - 443:443
    volumes:
      # Mount the nginx folder with the configuration
      - ./nginx:/etc/nginx:ro
      # Mount the letsencrypt certificates
      - /etc/letsencrypt:/etc/letsencrypt:ro
  fdp:
    image: fairdata/fairdatapoint:1.12.0
    volumes:
      - ./application.yml:/fdp/application.yml:ro
  fdp-client:
    image: fairdata/fairdatapoint-client:1.12.0
    environment:
      - FDP_HOST=fdp
  mongo:
    image: mongo:4.0.12
    ports:
      - "127.0.0.1:27017:27017"
    volumes:
      - ./mongo/data:/data/db
  blazegraph:
    image: metaphacts/blazegraph-basic:2.2.0-20160908.003514-6
    volumes:
      - ./blazegraph:/blazegraph-data
The last thing to do is to update our application.yml file. We need to add clientUrl so that FDP knows its actual URL even when hidden behind the reverse proxy. It’s good practice to set up a persistent URL for the metadata too; we recommend using https://purl.org. If you don’t specify persistentUrl, the clientUrl will be used instead. We also need to set a random JWT secret key for security.
# application.yml
instance:
  clientUrl: https://fdp.example.com
  persistentUrl: https://purl.org/fairdatapoint/example
security:
  jwt:
    token:
      secret-key: <random 128 characters string>
# repository settings (can be changed to a different repository)
repository:
  type: 5
  blazegraph:
    url: http://blazegraph:8080/blazegraph
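The secret-key placeholder must be replaced with a random string of the stated length; one simple way to generate 128 hexadecimal characters is OpenSSL:
openssl rand -hex 64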
At this point, we should be able to run all the containers using docker-compose up -d, and after everything starts, we can access the FAIR Data Point at https://fdp.example.com. Of course, the domain you want to access the FDP on must point to the server where it runs.
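Before announcing the instance, it is worth verifying the proxy and certificates from outside the server (assuming DNS for fdp.example.com already points at the machine):
# should redirect to HTTPS with status 301
curl -I http://fdp.example.com
# should return 200 with a valid certificate
curl -I https://fdp.example.com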
Danger
Don’t forget to change the default user accounts as soon as your FAIR Data Point becomes publicly available.
Danger
Do not expose the mongo port unless you have secured the database with a username and password.
Warning
In order to improve findability of itself and its content, the FAIR Data Point has a built-in feature that registers its URL with our server and pings it once a week. This facilitates the indexing of the metadata of each registered and active FAIR Data Point. If you do not want your FAIR Data Point to be included in this registry, add these lines to your application configuration:
# application.yml
ping:
  enabled: false
Advanced Configuration¶
Triple Stores¶
FDP uses the In-Memory Store by default, and the previous examples used Blazegraph; in total, you can choose from five options.
List of possible triple stores:
In-Memory Store
Native Store
Allegro Graph Repository
GraphDB Repository
Blazegraph Repository
1. In-Memory Store¶
There is no need to configure additional properties to run FDP with the In-Memory Store because it’s the default option. If you want to state it explicitly in your application.yml, add the following lines:
# application.yml
repository:
  type: 1
2. Native Store¶
With this option, FDP will simply save the data to the file system. If you want to use the Native Store, make sure that you have these lines in your application.yml file:
# application.yml
repository:
  type: 2
  native:
    dir: /tmp/fdp-store
where /tmp/fdp-store is the path to the location where you want to keep your data stored.
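Note that when FDP runs in Docker, /tmp/fdp-store lives inside the container, so to make the Native Store survive container re-creation you would typically mount it as a volume; a minimal sketch (the host path ./fdp-store is an assumption):
# docker-compose.yml (fdp service excerpt)
fdp:
  image: fairdata/fairdatapoint:1.12.0
  volumes:
    - ./application.yml:/fdp/application.yml:ro
    - ./fdp-store:/tmp/fdp-store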
3. Allegro Graph¶
For running Allegro Graph, you first need to set up your Allegro Graph instance. To configure the connection from FDP, add these lines to your application.yml file:
# application.yml
repository:
  type: 3
  agraph:
    url: http://agraph:10035/repositories/fdp
    username: user
    password: password
url, username, and password should be configured according to your actual Allegro Graph setup.
4. GraphDB¶
For running GraphDB, you first need to set up your GraphDB instance and create the repository. To configure the connection from FDP, add these lines to your application.yml file:
# application.yml
repository:
  type: 4
  graphDb:
    url: http://graphdb:7200
    repository: <repository-name>
url and repository should be configured according to your actual GraphDB setup.
5. Blazegraph¶
For running Blazegraph, you first need to set up your Blazegraph instance. To configure the connection from FDP, add these lines to your application.yml file:
# application.yml
repository:
  type: 5
  blazegraph:
    url: http://blazegraph:8080/blazegraph
    repository:
url and repository should be configured according to your actual Blazegraph setup. The repository should be set only if you don’t use the default one.
MongoDB¶
We store users, permissions, etc. in the MongoDB database. The default connection string is mongodb://mongo:27017/fdp. If you want to modify it, add these lines to your application.yml file:
# application.yml
spring:
  data:
    mongodb:
      uri: mongodb://mongo:27017/fdp
The uri should be adjusted to your actual MongoDB setup.
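If the database requires authentication, as recommended whenever its port is exposed (see the warning above), the credentials belong in the connection string; a sketch with a hypothetical user and password:
# application.yml
spring:
  data:
    mongodb:
      uri: mongodb://fdp-user:s3cret@mongo:27017/fdp?authSource=admin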
Default attached metadata¶
There are several default values that are attached to each created metadata record. If you want to modify them, add the lines below to your application.yml file. The default values are listed below, too:
# application.yml
metadataProperties:
  language: http://id.loc.gov/vocabulary/iso639-1/en
  license: http://rdflicense.appspot.com/rdflicense/cc-by-nc-nd3.0
  accessRightsDescription: This resource has no access restriction
metadataMetrics:
  https://purl.org/fair-metrics/FM_F1A: https://www.ietf.org/rfc/rfc3986.txt
  https://purl.org/fair-metrics/FM_A1.1: https://www.wikidata.org/wiki/Q8777
FDP Index¶
You can turn your FAIR Data Point instance into an FDP Index, which can be contacted by other FDPs and harvests metadata from them.
Hosting FDP Index¶
To enable the FDP Index mode on your FDP server, simply adjust your application.yml file:
# application.yml
fdp-index:
  enabled: true
Then, for the FDP client, you need to use the fairdata/fairdatapoint-index-client Docker image for browsing indexed FDPs and searching the harvested metadata. In case you want to use your deployment both as an FDP and an FDP Index, you can deploy both the FDP and FDP Index client applications. The configuration of both clients is identical.
# docker-compose.yml
version: '3'
services:
  # ...
  index_client:
    image: fairdata/fairdatapoint-index-client:1.12.0
    restart: always
  # ...
Connecting to FDP Index¶
By default, FDPs use https://home.fairdatapoint.org as their primary FDP Index, which they ping every 7 days. You can adjust that in your application.yml file if needed:
# application.yml
ping:
  endpoint: https://my-index.example.com
  interval: 86400000 # milliseconds
You can also set multiple endpoints if needed:
# application.yml
ping:
  endpoint: >
    https://my-index1.example.com
    https://my-index2.example.com
    https://home.fairdatapoint.org
FDP Index behind proxy¶
FDP Index uses IP-based rate limits to avoid excessive communication caused by bots or misconfigured FDPs. If the FDP Index is deployed behind a proxy, the proxy must correctly set a forwarding header, e.g., X-Forwarded-For. Furthermore, you need to add this to application.yml:
# application.yml
server:
  forward-headers-strategy: NATIVE
There may be differences based on your specific deployment. You should check in the logs which IP address is used when a ping is received.
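For example, with an nginx proxy similar to the production setup above, the forwarding headers could be set in the location block as follows (a sketch; <index_host> is a placeholder for your deployment):
# Snippet for nginx configuration
location / {
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_pass http://<index_host>;
}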
Customizations¶
You can customize the look and feel of the FDP Client using SCSS. There are three files you can mount to /src/scss/custom. If there are any changes in these files, the styles will be regenerated when the FDP Client starts.
Customization files¶
_variables.scss¶
A lot of values related to styles are defined as variables. The easiest way to customize the FDP Client is to define new values for these variables. To do so, you create a file called _variables.scss where you define the values that you want to change.
Here is an example of changing the primary color.
// _variables.scss
$color-primary: #087d63;
Have a look in src/scss/_variables.scss to see all the variables you can change.
_extra.scss¶
This file is loaded before all other styles. You can use it, for example, to define new styles or import fonts.
_overrides.scss¶
This file is loaded after all other styles. You can use it to override existing styles.
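As a small illustration, an _overrides.scss could adjust an existing rule after all other styles have loaded (the selector and value here are purely illustrative):
// _overrides.scss
// Illustrative override applied after all other styles
a {
  text-decoration: underline;
}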
Example of setting a custom logo¶
To change the logo, you need to do three steps:
1. Create _variables.scss with the correct logo file name and dimensions
2. Mount the new logo to the assets folder
3. Mount _variables.scss to the SCSS custom folder
// _variables.scss
$header-logo-url: '/assets/my-logo.png'; // new logo file
$header-logo-width: 80px; // width of the new logo
$header-logo-height: 40px; // height of the new logo
# docker-compose.yml
version: '3'
services:
  fdp:
    # ... FDP configuration
  fdp-client:
    # ... FDP Client configuration
    volumes:
      # Mount new logo file to assets in the container
      - ./my-logo.png:/usr/share/nginx/html/assets/my-logo.png:ro
      # Mount _variables.scss so that styles are regenerated
      - ./_variables.scss:/src/scss/custom/_variables.scss:ro
Running FDP on a nested route¶
Sometimes, you might want to run FDP alongside other applications on the
same domain. Here is an example of running FDP on
https://example.com/fairdatapoint
. If you run FDP in this configuration, you
have to set PUBLIC\_PATH
ENV variable, in this example to
/fairdatapoint
. Also, don’t forget to set correct client URL in the application config.
# docker-compose.yml
version: '3'
services:
  fdp:
    image: fairdata/fairdatapoint:1.12.0
    volumes:
      - ./application.yml:/fdp/application.yml:ro
      # ... other volumes
  fdp-client:
    image: fairdata/fairdatapoint-client:1.12.0
    ports:
      - 80:80
    environment:
      - FDP_HOST=fdp
      - PUBLIC_PATH=/fairdatapoint
# application.yml
instance:
  clientUrl: https://example.com/fairdatapoint
# Snippet for nginx configuration
server {
    # Configuration for the server, certificates, etc.

    # Define the location FDP runs on
    location ~ /fairdatapoint(/.*)?$ {
        rewrite /fairdatapoint(/.*) $1 break;
        rewrite /fairdatapoint / break;
        proxy_pass http://<client_host>;
    }
}
When running on a nested route, don’t forget to change the paths to all custom assets referenced in the SCSS files.
Usage¶
Resource definitions¶
The FAIR Data Point reference implementation introduces the concept of Resource Definitions. A resource definition captures housekeeping data about a metadata resource.
Resource definitions can be accessed from the reference implementation user interface, in the dashboard of an admin user, where they can be managed.
Managing resource definitions¶
Resource definitions can be created, modified, or deleted.
Creating resource definitions¶
When creating a new resource definition, the user interface presents a form where a resource can be defined through a number of properties.
Name defines the human-readable name for a resource definition. This name is used in the admin dashboard to identify the definitions and does not affect a definition on the functional level. For example, for the default dcat:Dataset resource, the human-readable name would be "Dataset".
URL Prefix defines the URL path for a resource. This context path should be a unique identifier within the scope of the other resources defined in the FAIR Data Point instance. For example, for the dcat:Dataset resource the prefix is dataset.
Target Class URIs links the shape definitions to a resource definition. In the current implementation, each shape that should be applied to the resource must be listed here. The expected URI value must match the sh:targetClass value of the shape definition. A common example is to list the dcat:Resource shape target class along with the specific subclass for the resource, like dcat:Dataset.
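For instance, for the default Dataset definition this means listing both full IRIs:
http://www.w3.org/ns/dcat#Resource
http://www.w3.org/ns/dcat#Dataset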
Children defines child resources, if any. This applies when the resource acts as a parent resource for other types of resources. Children are defined by the following properties to provide directives for both the server and the client side.
- Child Resource links to the child’s resource definition. The child resource must be defined beforehand.
- Child Relation URI defines the predicate IRI that links the parent to the child on the metadata instance level. A common example is the link from a dcat:Dataset to a dcat:Distribution; these resources are linked by the dcat:distribution predicate.
- Child List View Title defines a literal value to be displayed as a section header for the child resources in the user interface.
- Child List View Tags URI defines the predicate IRI for values that are displayed in the user interface whenever the child resources are listed. A common example is dcat:theme for dcat:Dataset resources.
- Child List View Metadata defines predicate IRIs for values that are listed in the child resource summary.
External Links defines predicate IRIs and literal values to be displayed in the user interface for primary interaction with a resource. A common example is the dcat:Distribution resource, where dcat:accessURL is mapped to an Access URL literal and displayed prominently in the user interface.
Modifying resource definitions¶
When modifying a resource definition, not all properties are writable. Some properties are write-protected to ensure the internal consistency of the system.
The URL Prefix and the Target Class URIs are not writable.
The other properties, like child resources and external links, are writable and can be expanded or modified after the initial creation of the resource definition.
Deleting resource definitions¶
Deleting a resource definition should be used with caution. Existing metadata instances are no longer accessible after the resource definition is deleted. This includes child resources, if those are not linked to other resources.
Shapes¶
The FAIR Data Point reference implementation uses SHACL to add validation constraints to the metadata model.
Creating a new shape¶
A typical resource shape contains the following key elements:
- sh:targetClass indicates the type of resource the shape applies to
- sh:property for each resource property
  - sh:path defines the predicate IRI
  - sh:nodeKind / sh:datatype defines the object type
  - sh:minCount / sh:maxCount defines the property cardinality
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix dash: <http://datashapes.org/dash#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex: <http://example.com/> .

ex:MyResourceShape a sh:NodeShape ;
    sh:targetClass ex:MyResourceType ;
    sh:property [
        sh:path ex:value ;
        sh:nodeKind sh:Literal ;
        sh:minCount 1 ;
        sh:maxCount 2 ;
    ] .
User interface directives¶
The DASH vocabulary introduces extensions to the core SHACL model. One of these extensions provides user interface hints for shape properties. Adding or removing a dash:viewer or dash:editor property on a sh:PropertyShape instance influences how the user interface displays the property value.
sh:property [
    sh:path ex:value ;
    sh:nodeKind sh:Literal ;
    dash:viewer dash:LiteralViewer ;
    dash:editor dash:TextFieldEditor ;
]
By adding a dash:viewer statement, the user interface is instructed to show the property value when the resource metadata is displayed. Removing a dash:viewer statement instructs the user interface not to render the property value at all; the value will still be present in the metadata model. The supported set of viewers:
dash:LabelViewer
dash:URIViewer
By adding a dash:editor statement, the editor form in the user interface will show an edit field for the property. Removing a dash:editor statement will prevent the property from being edited; this could be intended behaviour for properties that are generated server-side. The supported set of editors:
dash:TextFieldEditor
dash:TextAreaEditor
dash:URIEditor
dash:DatePickerEditor
Extending an existing shape¶
Extending an existing shape can be achieved by targeting the same sh:targetClass. For example, to extend the existing dcat:Dataset shape, an extension shape could look like the following:
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex: <http://example.com/> .

ex:MyExtension a sh:NodeShape ;
    sh:targetClass dcat:Dataset ;
    sh:property [
        sh:path <http://example.com/vocab#myProperty> ;
        sh:nodeKind sh:Literal ;
        sh:minCount 1 ;
    ] .
Limitations¶
- The current implementation does not provide proper support for overriding properties when an existing resource is extended.
- The set of supported dash:viewer and dash:editor types does not cover the full range specified in the DASH specs.
API Usage¶
The FAIR Data Point exposes API endpoints that allow consumers to interact with the metadata. Some of the endpoints are available for all users, while others require an API token for authorization.
Obtaining an API token¶
In order to obtain an API token, you invoke the /tokens endpoint with your user credentials.
curl -X POST -H "Accept: application/json" \
-H "Content-Type: application/json" \
-d '{ "email": "user@example.com", "password": "secret" }' \
https://fdp.example.com/tokens
A successful call will return a JSON object with a token.
{ "token": "efIobn394nvJJFJ30..." }
Interacting with metadata¶
The metadata layers as defined by the Resource definitions are exposed through their respective endpoints. The general approach is that each layer, defined by its prefix, supports a number of read and write HTTP methods.
Method | URL pattern | Functionality
---|---|---
GET | /<prefix>/<uuid> | Retrieve existing metadata
POST | /<prefix> | Create new metadata
PUT | /<prefix>/<uuid> | Update existing metadata
DELETE | /<prefix>/<uuid> | Delete existing metadata
Retrieving metadata¶
Retrieving metadata is open for GET requests without authorization. In the following example, we retrieve a Dataset resource by issuing a GET request to the /dataset prefix followed by its identifier (a UUID).
curl -H "Accept: text/turtle" https://fdp.example.com/dataset/58d7fbde-6c16-483e-b152-0f3ced131ca9
Creating metadata¶
New metadata can be created by POST-ing the content to the appropriate endpoint. First, we will create a file called metadata.ttl to store our new metadata.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<> a dcat:Dataset ;
    dct:title "test" ;
    dct:hasVersion "1.0" ;
    dct:publisher [ a foaf:Agent ; foaf:name "Example User" ] ;
    dcat:theme <http://www.wikidata.org/entity/Q14944328> ;
    dct:isPartOf <https://fdp.example.com/catalog/5f4a32c5-1f26-4657-9240-fc7ede7f1ce5> .
This metadata can be created by the following POST request.
curl -H "Authorization: Bearer efIobn394nvJJFJ30..." \
-H "Content-Type: text/turtle" \
-d @metadata.ttl https://fdp.example.com/dataset
When created, the metadata is initially in a DRAFT state. To publish the metadata using the API, you can issue the following PUT request to transition the metadata from the DRAFT state to the PUBLISHED state.
curl -X PUT -H "Authorization: Bearer efIobn394nvJJFJ30..." \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-d '{ "current": "PUBLISHED" }' \
https://fdp.example.com/dataset/79508287-a2a7-4ae2-95b3-3f595e3088cc/meta/state
Updating metadata¶
Existing metadata can be updated by issuing a PUT request with the updated metadata as the request body.
curl -X PUT -H "Authorization: Bearer efIobn394nvJJFJ30..." \
-H "Content-Type: text/turtle" \
-d @metadata.ttl https://fdp.example.com/dataset/79508287-a2a7-4ae2-95b3-3f595e3088cc
API endpoint listing¶
The available APIs are documented using OpenAPI. At the /swagger-ui.html endpoint, the APIs are visualized through Swagger UI.
Usage¶
Here you can read how to use the metadata extension for OpenRefine to store FAIR data and create metadata in FAIR Data Point.
About metadata extension¶
The metadata extension for OpenRefine promotes FAIRness of data through its integration with the FAIR Data Point. With the extension, you can easily FAIRify the data you work on directly in OpenRefine in two steps:
Store FAIR data in a configured storage.
Create metadata in a selected FAIR Data Point.
It replaces the legacy project called FAIRifier.
Features¶
The extension provides its features through the FAIR Metadata extension menu located in the top right corner above the data table (typically next to Wikidata and others).
Store FAIR data¶
Open the dialog for storing the data by clicking FAIR Metadata > Store data to FAIR storage
Select the desired storage (see Storages)
Select the desired format (the selection changes based on storage)
Press Preview (download) to download the file to verify the contents
Press Store to store the data in the storage
You will see the URL to the file, which you can easily copy to the clipboard by clicking a button
Create metadata in FAIR Data Point¶
Open the dialog for creating the metadata by clicking FAIR Metadata > Create metadata in FAIR Data Point
Select a pre-configured FAIR Data Point connection, or select Custom FDP connection and fill in the information (if allowed, see Settings)
Press Connect to connect to the selected FAIR Data Point
Select a catalog from available or create a new one
For a new one, fill in the metadata form (see also the optional fields) and press Create catalog
Select a dataset from available or create a new one
For a new one, fill in the metadata form (see also the optional fields) and press Create dataset
Create a new distribution
Fill in the metadata form (see also the optional fields)
For the download URL, you can easily access the Store FAIR data feature, and the field will be filled after storing the data
Check your new distribution (and/or other layers) listed
Setup¶
This part describes how to set up your own OpenRefine with the metadata extension and how to configure it according to your needs.
Installation¶
There are two ways of using our metadata extension for OpenRefine: you can add the extension to an installed OpenRefine, or use Docker with our prepared image.
Installed OpenRefine¶
This option requires a compatible version of OpenRefine to be installed; please check Compatibility. In case you need to install OpenRefine first, visit their documentation.
1. Get the desired version of the metadata extension from our GitHub releases page by downloading the tgz or zip archive, e.g., metadata-X.Y.Z-OpenRefine-X.Y.zip.
2. Extract the archive to the extensions folder of your OpenRefine (see OpenRefine documentation).
unzip metadata-X.Y.Z-OpenRefine-X.Y.zip -d path/to/openrefine-X.Y/webapp/extensions
With Docker¶
If you want to use Docker, we provide the Docker image fairdata/openrefine-metadata-extension that combines the extension with a supported version of OpenRefine. It is of course possible to use a volume for the data directory (or data/extensions to include other extensions). All you need is Docker running, and then:
docker run -p 3333:3333 -v /home/me/openrefine-data:/data:z fairdata/openrefine-metadata-extension
This will run OpenRefine with the metadata extension on port 3333, which will be exposed, and mount your folder /home/me/openrefine-data as the OpenRefine data folder. You should be able to open OpenRefine in a browser at localhost:3333. If there are other extensions in /home/me/openrefine-data/extensions, those should be loaded as well. For more information, see the OpenRefine documentation.
For configuration files, you need to mount /webapp/extensions/metadata/module/config; see Configuration for more details.
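Putting the data and configuration mounts together, a full run command might look like this (the host paths are assumptions):
docker run -p 3333:3333 \
  -v /home/me/openrefine-data:/data:z \
  -v /home/me/metadata-config:/webapp/extensions/metadata/module/config:z \
  fairdata/openrefine-metadata-extension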
Configuration¶
Configuration files of the metadata extension use the YAML format and are stored in the extensions/metadata/module/config directory of the OpenRefine installation in use. The configuration files are loaded when OpenRefine starts; therefore, you have to restart OpenRefine before changes in the configuration files take effect. We provide examples of the configuration files that you can (re)use.
Settings¶
The settings configuration file provides generic configuration options that adjust the behaviour of the extension. The structure of the file is the following:
- allowCustomFDP (boolean) = whether the user is allowed to enter a custom FAIR Data Point URI, username, and password (or may only use the pre-configured connections)
- metadata (map) = key-value specification of instance-wide pre-defined metadata, e.g., set license to http://purl.org/NET/rdflicense/cc-by3.0 and that URI will be pre-set in all metadata forms in the license field (but can be overwritten by the user)
- fdpConnections (list) = list of pre-configured FAIR Data Point connections that users can use; each is an object with the attributes:
  - name (string) = custom name identifying the connection
  - baseURI (string) = base URI of the FAIR Data Point
  - email (string) = email address identifying the user of the FAIR Data Point
  - password (string) = password for authenticating the user of the FAIR Data Point
  - preselected (boolean, optional) = flag whether the connection should be pre-selected in the form (in case more connections have this set to true, only the first one is applied)
  - metadata (map, optional) = similar to the instance-wide metadata, but only for this specific connection
For more information and further configuration options, see examples.
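As a concrete illustration, a minimal settings file following the structure above might look like this (all values are hypothetical):
# settings.yaml (sketch)
allowCustomFDP: true
metadata:
  license: http://purl.org/NET/rdflicense/cc-by3.0
fdpConnections:
  - name: Example FDP
    baseURI: https://fdp.example.com
    email: user@example.com
    password: secret
    preselected: true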
Storages¶
The storages configuration file holds details about the storages that can be used for the Store FAIR data feature. The file is expected to contain a list of storage objects, where each has:
- name (string) = custom name identifying the storage
- type (string) = one of the allowed types (others are ignored): ftp, virtuoso, tripleStoreHTTP
- details (object) = configuration related to the specific type of storage (see examples)
For FTP and Virtuoso, directory should contain the absolute path where files should be stored. In the case of triple stores, the repository name is used to specify the target location.
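To make the structure concrete, here is a sketch of a single FTP storage entry; only name, type, details, and directory are documented above, and the remaining keys under details are assumptions to be checked against the provided examples:
# storages.yaml (sketch; key names under details are assumptions)
- name: Main FTP
  type: ftp
  details:
    host: ftp.example.com    # assumed key
    username: user           # assumed key
    password: secret         # assumed key
    directory: /fair-data    # absolute path where files are stored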
Compatibility¶
Check in-app “About” dialog for compatibility information.
Contributing¶
Development¶
Our projects are open source, and you can contribute via GitHub (fork and pull request).
Changelog¶
Overview¶
Here we summarize the key features and changes for each FAIR Data Point release. For details including bugfixes and minor changes, see Detailed changelog.
1.12.0¶
Settings (metrics and ping) can be adjusted directly from the UI
Default values can be specified using sh:defaultValue
/expanded endpoint marked as deprecated (may be removed in a following version)
Fixed bugs related to resource definitions (same child relations, multiple parents)
Fixed computing cache on DB migration and reset to defaults, and ordering of resource definitions
1.11.0¶
All metadata have dct:conformsTo with a profile based on the resource definition
Resolving labels for RDF resources
Registration of standard namespaces in RDF output
Resource definitions are now related directly to shapes
Fixed metadata with empty keywords and pagination
1.10.0¶
Reset to factory defaults (users, metadata, resource definitions)
Improved UX for browsing child metadata
Allow to change internal shapes (and delete dataset and distribution)
Several dependencies updated (including Java 16)
1.9.0¶
Publishing and sharing SHACL shapes between FDPs
Metadata children pagination
Generating OpenAPI based on resource definitions
Several dependency updates, including Spring Boot 2.4.5
1.8.0¶
Added Admin UI to FDP Index with the possibility to trigger metadata retrieval, change settings, or delete entries
Several bug fixes and dependencies updated (including Java 15)
1.7.0¶
FDP Index functionality included in the FAIR Data Point, with metadata harvesting
Metadata search including RDF types
Possibility to change profile and password for current user
1.6.0¶
API keys for making integrations with FDP easier
State “draft” for created metadata
1.5.0¶
Support for editable resource definitions
Possibility to specify custom storage in OpenRefine using frontend
1.4.0¶
Ping service for call home functionality
Suggesting prefixes for namespaces
1.3.0¶
Introduced DASH and dynamic SHACL shapes
Audit log in OpenRefine extension to keep track of actions performed
1.2.0¶
Option to customize metamodel (metadata layers)
Possibility to delete and create metadata entities
1.1.0¶
New monitoring and configuration for client application
Several further improvements in terms of technical debt
Enhanced connecting to FDP from OpenRefine extension and update to OpenRefine 3.3
1.0.0¶
User management, enhanced security, and ACL
Huge refactoring and upgrades of previously accumulated features and technical debt
Separate project for FAIR Data Point Client (frontend application using FDP API)
New OpenRefine Metadata Extension as a replacement for the deprecated FAIRifier
Detailed changelog¶
Each of the components developed has its own changelog based on Keep a Changelog, and our projects adhere to Semantic Versioning. It is recommended to use matching versions of all components.