Data Sync Scripts

datasync

Provides subcommands for synchronizing different resources; see the list of subcommands below.

Usage

datasync [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

Name Description Required Default
--install-completion Install completion for the current shell. No -
--show-completion Show completion for the current shell, to copy it or customize the installation. No -

Commands

Name Description
nva Commands to handle NVA tasks
ubw Export UBW APIs to Parquet in an S3 bucket
dms
ninagen Commands to handle NINAGEN tasks
pit-registering-salmon
grass-gis
services Miljødata Infrastructure as Code pipelines
gbif-backbone Export GBIF Backbone data to a DuckDB database
ipt Provide commands to deal with IPT

Subcommands

nva

Commands to handle NVA tasks

Usage

datasync nva [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

run

Sync NVA data from the REST API to the target

Usage

datasync nva run [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--resources / --no-resources No no-resources
--projects / --no-projects No no-projects
--persons / --no-persons No no-persons
--categories / --no-categories No no-categories
--funding-sources / --no-funding-sources No no-funding-sources
--base-url No https://api.nva.unit.no/
--duckdb-name No nva_sync
--institution-code No 7511.0.0.0
--endpoint-url No -
--access-key No -
--secret-key No -
--bucket No -
--prefix No nva
--region No us-east-1
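
A hypothetical invocation might look like the following. The endpoint, bucket, and credentials are placeholders; the data-category flags are opt-in and default to their --no-* variants:

```shell
# Sync NVA resources and projects to an S3-compatible bucket
# (endpoint, bucket, and credentials below are placeholders)
datasync nva run \
  --resources --projects \
  --duckdb-name nva_sync \
  --endpoint-url https://s3.example.org \
  --access-key "$S3_ACCESS_KEY" \
  --secret-key "$S3_SECRET_KEY" \
  --bucket my-bucket \
  --prefix nva
```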
filter-data

Filter NVA data and write the resulting tables to Parquet

Usage

datasync nva filter-data [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--data-s3-path No -
--storage-s3-path No -
--storage-access-key No -
--storage-secret-key No -
--storage-bucket No -
--storage-prefix No nva-filtered
--storage-url-style No path
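
A hypothetical invocation is sketched below; the S3 paths, bucket, and credentials are placeholders, and the shape of --data-s3-path is an assumption:

```shell
# Filter synced NVA data and write the filtered tables to Parquet
# (paths, bucket, and credentials are placeholders)
datasync nva filter-data \
  --data-s3-path s3://my-bucket/nva \
  --storage-access-key "$S3_ACCESS_KEY" \
  --storage-secret-key "$S3_SECRET_KEY" \
  --storage-bucket my-bucket \
  --storage-prefix nva-filtered \
  --storage-url-style path
```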

ubw

Export UBW APIs to Parquet in an S3 bucket

Usage

datasync ubw [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

run

No description available

Usage

datasync ubw run [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--access-key No -
--secret-key No -
--endpoint-url No -
--bucket No -
--prefix No -
--base-url No -
--auth No -
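
A sketch of a typical invocation, with every value a placeholder (the format expected by --auth is an assumption):

```shell
# Export UBW API data to Parquet in an S3 bucket
# (all values below are placeholders)
datasync ubw run \
  --base-url https://ubw.example.org/api \
  --auth "$UBW_AUTH" \
  --endpoint-url https://s3.example.org \
  --access-key "$S3_ACCESS_KEY" \
  --secret-key "$S3_SECRET_KEY" \
  --bucket my-bucket \
  --prefix ubw
```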

dms

No description available

Usage

datasync dms [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

generate-csw-metadata

No description available

Usage

datasync dms generate-csw-metadata [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--base-url No -
--access-key No -
--secret-key No -
--endpoint No -
--bucket No -
--publish-url No -
--limit No -
--search Filter resources by title using a LIKE expression No -
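
A hypothetical invocation; URLs, bucket, and credentials are placeholders, and the `%...%` wildcard pattern is an assumption based on the LIKE semantics noted for --search:

```shell
# Generate CSW metadata for DMS resources whose title matches a pattern
# (URLs, bucket, and credentials are placeholders)
datasync dms generate-csw-metadata \
  --base-url https://dms.example.org \
  --access-key "$S3_ACCESS_KEY" \
  --secret-key "$S3_SECRET_KEY" \
  --endpoint https://s3.example.org \
  --bucket my-bucket \
  --publish-url https://csw.example.org \
  --search "%vegetation%"
```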
generate-geoapi-config

No description available

Usage

datasync dms generate-geoapi-config [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--base-url No -
--publish-url No -
--search Filter resources by title using a LIKE expression No -
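
A minimal sketch; both URLs are placeholders:

```shell
# Generate pygeoapi configuration for matching DMS resources
# (URLs are placeholders)
datasync dms generate-geoapi-config \
  --base-url https://dms.example.org \
  --publish-url https://geoapi.example.org \
  --search "%vegetation%"
```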
generate-maps-json

Generate a maps.json file from map resources in the DMS Parquet files. The output follows the format used by the NINA map-editor. The URL for each map is read from the uri field of the resource. The file is written to S3 as a publicly accessible file.

Usage

datasync dms generate-maps-json [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--base-url No -
--access-key No -
--secret-key No -
--endpoint No -
--where Provide an additional SQL filter No 1=1
--bucket S3 bucket for output (e.g., 'my-bucket') No -
--output S3 key path for output JSON file No /dms/maps/maps.json
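
A hypothetical invocation; the URLs, bucket, and credentials are placeholders, and the --where clause is purely illustrative (any SQL filter valid against the resource table should work):

```shell
# Build maps.json from DMS map resources and upload it to S3
# (values below are placeholders; the --where filter is illustrative)
datasync dms generate-maps-json \
  --base-url https://dms.example.org \
  --access-key "$S3_ACCESS_KEY" \
  --secret-key "$S3_SECRET_KEY" \
  --endpoint https://s3.example.org \
  --bucket my-bucket \
  --where "title LIKE '%kart%'" \
  --output /dms/maps/maps.json
```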

ninagen

Commands to handle NINAGEN tasks

Usage

datasync ninagen [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

snp-database-normalize

Convert an SNP Excel sheet to Parquet

Usage

datasync ninagen snp-database-normalize [OPTIONS] FILE [SHEET]

Arguments
Name Description Required
FILE Path to the Excel file Yes
SHEET Name of the Excel Sheet to use No
Options
Name Description Required Default
--header-row XLSX Row number that contains the header No 1
--allele-start-column XLSX column that contains the first allele No F
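
A sketch of an invocation using the defaults; the file and sheet names are placeholders:

```shell
# Normalize an SNP Excel sheet to Parquet
# (file and sheet names are placeholders)
datasync ninagen snp-database-normalize \
  --header-row 1 \
  --allele-start-column F \
  snp_database.xlsx "Sheet1"
```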
snp-analysis-to-parquet

Convert an SNP analysis CSV file to Parquet

Usage

datasync ninagen snp-analysis-to-parquet [OPTIONS] FILE

Arguments
Name Description Required
FILE Path to the CSV file Yes
Options

No options available
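
A minimal invocation; the file name is a placeholder:

```shell
# Convert an SNP analysis CSV to Parquet
datasync ninagen snp-analysis-to-parquet analysis.csv
```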

pit-registering-salmon

No description available

Usage

datasync pit-registering-salmon [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

run

Download PIT data from BioMark's API to a .duckdb file.

Usage

datasync pit-registering-salmon run [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--duckdb-path No biomark_pit_registering_salmon_v1.duckdb
--place Site location (kongsfjord, sylte, vigda, agdenes, vatne) No -
--begin-date Start date for data download in YYYY-MM-DD format No -
--end-date End date for data download in YYYY-MM-DD format No -
--tags / --no-tags Download tags data No no-tags
--readers / --no-readers Download readers voltage data No no-readers
--environment / --no-environment Download environment data No no-environment
--all-locations / --no-all-locations Download data from all accessible locations No no-all-locations
--base-url No https://data3.biomark.com/api/v1/
--yesterday / --no-yesterday Set date range to yesterday only No no-yesterday
--dataset-name No main
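
A hypothetical invocation that pulls yesterday's data for a single documented site:

```shell
# Download yesterday's tag and reader-voltage data for the vigda site
# into the default local DuckDB file
datasync pit-registering-salmon run \
  --place vigda \
  --yesterday \
  --tags --readers
```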
replicate

Upload data from .duckdb to S3 bucket.

Usage

datasync pit-registering-salmon replicate [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--bucket No -
--endpoint-url No -
--access-key No -
--secret-key No -
--duckdb-path No biomark_pit_registering_salmon_v1.duckdb
--region No us-east-1
--dataset-name No main
--tags / --no-tags Add tags data to S3 No no-tags
--readers / --no-readers Add readers voltage data to S3 No no-readers
--environment / --no-environment Add environment data to S3 No no-environment
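
A sketch of the follow-up upload step; the bucket, endpoint, and credentials are placeholders:

```shell
# Upload the tags and readers datasets from the local DuckDB file to S3
# (bucket, endpoint, and credentials are placeholders)
datasync pit-registering-salmon replicate \
  --bucket my-bucket \
  --endpoint-url https://s3.example.org \
  --access-key "$S3_ACCESS_KEY" \
  --secret-key "$S3_SECRET_KEY" \
  --tags --readers
```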

grass-gis

No description available

Usage

datasync grass-gis [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

register-layers

No description available

Usage

datasync grass-gis register-layers [OPTIONS] PARQUET_FILE_PATH PROJECT_NUMBER GISBASE

Arguments
Name Description Required
PARQUET_FILE_PATH Yes
PROJECT_NUMBER Yes
GISBASE Yes
Options

No options available
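
A hypothetical invocation; the file path, project number, and GISBASE location are placeholders, and their meaning is inferred from the argument names:

```shell
# Register layers from a Parquet file into GRASS GIS
# (all three arguments below are placeholders)
datasync grass-gis register-layers \
  layers.parquet 12345 /usr/lib/grass84
```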

services

Miljødata Infrastructure as Code pipelines

Usage

datasync services [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

services-to-parquet

Convert metadata.yml definitions to a set of Parquet files that can be imported into the DMS

Usage

datasync services services-to-parquet [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--org No ninanor
--repo No -
--bucket No -
--endpoint No -
--access-key No -
--secret-key No -
--prefix No /dms/tables
--git-username No -
--git-token No -
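
A hypothetical invocation; the repository name, bucket, endpoint, and credentials are placeholders:

```shell
# Convert metadata.yml definitions from a Git repository to DMS Parquet tables
# (repository, bucket, and credential values are placeholders)
datasync services services-to-parquet \
  --org ninanor \
  --repo my-iac-repo \
  --git-username "$GIT_USER" \
  --git-token "$GIT_TOKEN" \
  --bucket my-bucket \
  --endpoint https://s3.example.org \
  --access-key "$S3_ACCESS_KEY" \
  --secret-key "$S3_SECRET_KEY" \
  --prefix /dms/tables
```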
dashboard

Produce a Homer dashboard using the Miljødata Infrastructure as Code repository as the data source

Usage

datasync services dashboard [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--org No ninanor
--repo No -
--config-org No ninanor
--config-repo No -
--bucket No -
--endpoint No -
--access-key No -
--secret-key No -
--git-username No -
--git-token No -
--prefix No /dms/services
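
A sketch of an invocation; the repository names, bucket, endpoint, and credentials are placeholders:

```shell
# Generate a Homer dashboard from the IaC repository
# (repository names and credentials are placeholders)
datasync services dashboard \
  --org ninanor \
  --repo my-iac-repo \
  --config-repo my-config-repo \
  --git-username "$GIT_USER" \
  --git-token "$GIT_TOKEN" \
  --bucket my-bucket \
  --endpoint https://s3.example.org \
  --access-key "$S3_ACCESS_KEY" \
  --secret-key "$S3_SECRET_KEY"
```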

gbif-backbone

Export GBIF Backbone data to a DuckDB database

Usage

datasync gbif-backbone [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

import-all

Import GBIF Backbone data into a DuckDB database.

Usage

datasync gbif-backbone import-all [OPTIONS]

Arguments

No arguments available

Options

No options available

ipt

Provide commands to deal with IPT

Usage

datasync ipt [OPTIONS] COMMAND [ARGS]...

Arguments

No arguments available

Options

No options available

Subcommands

run

Convert IPT resources to GeoParquet, register them in the DMS, and publish metadata and configurations

Usage

datasync ipt run [OPTIONS]

Arguments

No arguments available

Options
Name Description Required Default
--skip-data / --no-skip-data Skip the data conversion step and process metadata only No no-skip-data
--skip-dms / --no-skip-dms Skip publishing to DMS No no-skip-dms
--skip-csw / --no-skip-csw Skip publishing to CSW No no-skip-csw
--skip-geoapi / --no-skip-geoapi Skip publishing to pygeoapi No no-skip-geoapi
--limit Only import a certain amount of records No -
--search Run only on resources that contain the given string No -
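
A hypothetical invocation that processes only metadata for a small, matching subset of resources; the search string is a placeholder:

```shell
# Publish only metadata for up to 10 matching IPT resources,
# skipping data conversion and the pygeoapi step
datasync ipt run \
  --skip-data \
  --skip-geoapi \
  --limit 10 \
  --search birds
```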
validate-iso

Validate an XML file against the ISO 19115 schema

Usage

datasync ipt validate-iso [OPTIONS] FILE

Arguments
Name Description Required
FILE Yes
Options

No options available
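
A minimal invocation; the file path is a placeholder:

```shell
# Validate a metadata file against the ISO 19115 schema
datasync ipt validate-iso metadata.xml
```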