CLI Open Endpoint Examples

The DataBiosphere CLI provides several ways for users of the data store to access and download data sets from the data store. This page covers how to access the data store using the dbio command line utility.

NOTE: The Data Biosphere CLI utility is compatible with Python 3.5+.

dbio create-version

Returns a timestamp in DSS_VERSION format (e.g., 1985-04-12T232050.520000Z), necessary for versioning bundles or files.

Note

A version is a timestamp in RFC3339 format that keeps track of the most recent iteration of a bundle or file. A bundle is a collection of many different data files, and both bundles and files have version numbers.

Example call to dbio create-version:

#!/usr/bin/env bash

dbio dss create-version

dbio download

Downloads a bundle to the local filesystem as a directory. By default, both data and metadata files are downloaded (flags can be added to download only the data or the metadata).

Implementation detail: All files are downloaded to a local cache directory called .dbio that is created in the directory where the download is initiated. The user should never need to interact directly with the .dbio directory.

See note above regarding version numbering.

Example call to dbio get-bundle:

#!/usr/bin/env bash

dbio dss download --replica aws --bundle-uuid ffffaf55-f19c-40e3-aa81-a6c69d357265 --version 2019-08-01T200147.836832Z --download-dir download_test

Example response:

{
  "bundle": {
    "creator_uid": 8008,
    "files": [
      {
        "content-type": "application/json; dcp-type=\"metadata/biomaterial\"",
        "crc32c": "5c084696",
        "indexed": true,
        "name": "cell_suspension_0.json",
        "s3_etag": "bd60da05055d1cd544855dd35cb12470",
        "sha1": "fdeb52d3caf0becce0575528c81bf0a06cb4a023",
        "sha256": "e0ff1c402a4d6c659937f90d00d9820a2ebf0ebc920260a2a2bddf0961c30de5",
        "size": 847,
        "uuid": "134c0f04-76ae-405d-aea4-b72c08a53dd9",
        "version": "2019-07-09T230754.589000Z"
      },
      {
        "content-type": "application/json; dcp-type=\"metadata/biomaterial\"",
        "crc32c": "39e6f9e1",
        "indexed": true,
        "name": "specimen_from_organism_0.json",
        "s3_etag": "f30917f841530d78e16223354049c8dc",
        "sha1": "98171c05647a3b771afb3bd61e65d0a25b0afe7f",
        "sha256": "35406f0b8fa1ece3e3589151978aefef28f358afa163874b286eab837fcabfca",
        "size": 864,
        "uuid": "577a91d8-e579-41b6-9353-7e4e774c161a",
        "version": "2019-07-09T222811.151000Z"
      },

      ...

      {
        "content-type": "application/gzip; dcp-type=data",
        "crc32c": "38f31e58",
        "indexed": false,
        "name": "SRR6579532_2.fastq.gz",
        "s3_etag": "ac67e10df687471f5808be96499836c6",
        "sha1": "8743feb4d1ce82328127d10e2b1dfa35e5ae4b5a",
        "sha256": "3d788e06b5ca4c8fc679b47c790b1e266f73d48818a1749743ec85f096d657ea",
        "size": 43810957,
        "uuid": "1330ef1a-7a21-40c6-84c5-5cec18204028",
        "version": "2019-08-03T150636.729022Z"
      }
    ],
    "uuid": "ffffaf55-f19c-40e3-aa81-a6c69d357265",
    "version": "2019-08-01T200147.836832Z"
  }
}

dbio download-manifest

Downloads a list of files specified in a user-provided manifest file.

The manifest file should be in TSV (tab-separated variable) format, with one line in the manifest per file to download. The manifest should contain information about files (one file per line). The information that must be provided for a given bundle is available from the get_bundle() method.

The header row must define the columns:

  • bundle_uuid - UUID of the requested bundle
  • bundle_version - the version of the requested bundle
  • file_name - the name of the file as specified in the bundle
  • file_uuid - the UUID of the file in the DSS
  • file_sha256 - the SHA-256 hash of the file
  • file_size - the size of the file

Example call to dbio download-manifest:

#!/usr/bin/env bash

MANIFEST="manifest.tsv"

# Make the manifest file
cat /dev/null > ${MANIFEST}
echo -e "bundle_uuid\tbundle_version\tfile_name\tfile_uuid\tfile_version\tfile_sha256\tfile_size\tfile_path\n" >> ${MANIFEST}
echo -e "ffffaf55-f19c-40e3-aa81-a6c69d357265\t2019-08-01T200147.836832Z\tlinks.json\tdbf7bd27-b58e-431d-ba05-6a48f29e7cef\t2019-08-03T150636.118831Z\tda4df14eb39cacdff01a08f27685534822c2d40adf534ea7b3e4adf261b9079a\t2081\t.dbio/v2/files_2_4/da/4df1/da4df14eb39cacdff01a08f27685534822c2d40adf534ea7b3e4adf261b9079a\n" >> ${MANIFEST}

echo "manifest.json file: ${MANIFEST}"

# Download files in the manifest
dbio dss download-manifest --replica aws --manifest ${MANIFEST}

Example manifest TSV file:

bundle_uuid                             bundle_version              file_name               file_uuid                               file_version                file_sha256                                                         file_size   file_path
002aeac5-4d74-462d-baea-88f5c620cb50    2019-08-01T200147.836900Z   cell_suspension_0.json  c14b99ea-d8e2-4c84-9dc2-ce2245d8a743    2019-07-09T231935.003000Z   b43cebcca9cd5213699acce7356d226de07edef5c5604510a697159af1a12149    847         .dbio/v2/files_2_4/b4/3ceb/b43cebcca9cd5213699acce7356d226de07edef5c5604510a697159af1a12149

dbio file-head

Returns the metadata for the latest version of a file with a given UUID. If the version is provided, the metadata for that specific version is returned instead. The metadata is returned in the headers.

Example call to dbio file-head:

#!/usr/bin/env bash

# Get the latest version
dbio dss head-file --replica aws --uuid 666ff3f0-67a1-4ead-82e9-3f96a8c0a9b1

# Get the specified version
dbio dss head-file --replica aws --uuid 6887bd52-8bea-47d9-bbd9-ff71e05faeee --version 2019-01-30T165057.189000Z

Example JSON header returned by API:

{
    "Date": "Tue, 22 Oct 2019 19:16:50 GMT",
    "Content-Type": "text/html; charset=utf-8",
    "Content-Length": "0",
    "Connection": "keep-alive",
    "x-amzn-RequestId": "bea3fd18-f373-4cb9-b0d2-0642c955eb5b",
    "X-DSS-SHA1": "ccac0f3fb16d1209ac88de8f293e61a115cfee38",
    "Access-Control-Allow-Origin": "*",
    "X-DSS-S3-ETAG": "d1634210a190ae78f6dd7a21f3c6ef1d",
    "X-DSS-SHA256": "24265fd0ebcdfe84eb1a09227c58c117ed03006b1de3f1e0694e50ed63b2f9e7",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains; preload",
    "Access-Control-Allow-Headers": "Authorization,Content-Type,X-Amz-Date,X-Amz-Security-Token,X-Api-Key",
    "X-DSS-CONTENT-TYPE": 'application/json; dcp-type="metadata/biomaterial"',
    "X-DSS-CRC32C": "ec41da6a",
    "X-DSS-CREATOR-UID": "8008",
    "x-amz-apigw-id": "B-pROGlIoAMFUwg=",
    "X-DSS-VERSION": "2019-01-30T165057.189000Z",
    "X-Amzn-Trace-Id": "Root=1-5daf55a1-132caa16297ffc40a4046739;Sampled=0",
    "X-AWS-REQUEST-ID": "eeeb46a0-61a2-4fb5-aae9-21fe6a01f277",
    "X-DSS-SIZE": "856",
}

dbio get-bundle

For a given bundle UUID and optionally a bundle version, returns information about the latest version of that bundle. Information returned includes the bundle creator, UUID, and version, as well as information about each file in the bundle, such as the file name, UUID, version, etc.

Example call to dbio get-bundle:

#!/usr/bin/env bash

dbio dss get-bundle --replica aws --uuid fff746b3-e3eb-496a-88a3-5fa1fa358392 --version 2019-08-01T200147.130156Z

Example JSON returned by dbio get-bundle:

{
  "bundle": {
    "creator_uid": 8008,
    "files": [
      {
        "name": "cell_suspension_0.json",
        "uuid": "c14b99ea-d8e2-4c84-9dc2-ce2245d8a743",
        "version": "2019-07-09T231935.003000Z"
        "content-type": "application/json; dcp-type=\"metadata/biomaterial\"",
        "crc32c": "892ad18b",
        "indexed": true,
        "s3_etag": "57814b3405165d975a6688dc8110dea0",
        "sha1": "849ebad4cff8f4fdf10ad25ad801ebb8aacc58b7",
        "sha256": "b43cebcca9cd5213699acce7356d226de07edef5c5604510a697159af1a12149",
        "size": 847,
      },
      {
        "name": "specimen_from_organism_0.json",
        "uuid": "05998af7-fa6f-44fe-bd16-ac8eafb42f28",
        "version": "2019-07-09T222953.739000Z"
        "content-type": "application/json; dcp-type=\"metadata/biomaterial\"",
        "crc32c": "8686eb38",
        "indexed": true,
        "s3_etag": "c3079914aa72f4aafa926594c756c978",
        "sha1": "885f0d6c524796116394fc4e60f0d9f65988765f",
        "sha256": "d0c8cc0d13e30b73241405035d98265eab891ea94fbccc3da4bb0ca10c3d0f24",
        "size": 872,
      },
      ...
    ],
    "uuid": "002aeac5-4d74-462d-baea-88f5c620cb50",
    "version": "2019-08-01T200147.836900Z"
  }
}

dbio get-bundles-checkout

Check the status and location of a checkout request.

Example call to dbio get-bundles-checkout:

#!/usr/bin/env bash

dbio dss get-bundles-checkout --replica aws --checkout-job-id 4de1c603-fa8b-4c07-af37-06159e6951e0
 

Example JSON returned by dbio get-bundles-checkout:

{
  "location": "s3://ucsc-cgp-dss-checkout-prod/bundles/fff54b87-26fe-42a9-be54-3f5a7ef8176e.2019-03-26T131455.775610Z",
  "status": "SUCCEEDED"
}

dbio get-file

Retrieves a file given a UUID, optionally a version, and displays the details of the file.

Example call to dbio get-file:

#!/usr/bin/env bash

dbio dss get-file --replica aws --uuid 666ff3f0-67a1-4ead-82e9-3f96a8c0a9b1
 

Example JSON returned by dbio get-file:

{
  "describedBy": "https://schema.humancellatlas.org/type/file/7.0.2/sequence_file",
  "schema_type": "file",
  "file_core": {
    "file_name": "SRR6546754_2.fastq.gz",
    "file_format": "fastq.gz"
  },
  "read_index": "read2",
  "insdc_run": [
    "SRR6546754"
  ],
  "technical_replicate_group": "Rep_id_7031",
  "provenance": {
    "document_id": "39a93f75-0db3-4ee2-ab22-3eaa9932cf67",
    "submission_date": "2019-01-30T11:15:21.403Z",
    "update_date": "2019-02-19T17:17:10.540Z"
  }
}

dbio login

Configures and saves authentication credentials.

Example call to dbio login:

#!/usr/bin/env bash

dbio dss login --access-token test

dbio logout

Clears authentication credentials previously configured with login.

Example call to dbio logout:

#!/usr/bin/env bash

dbio dss logout

dbio post-bundles-checkout

Returns a checkout-job-id (e.g., 4de1c603-fa8b-4c07-af37-06159e6951e0). This checkout-job-id can then be used with the get_bundles_checkout() method.

Example call to dbio post-bundles-checkout:

#!/usr/bin/env bash

dbio dss post-bundles-checkout --replica aws --uuid fff746b3-e3eb-496a-88a3-5fa1fa358392

dbio post-search

Find bundles by their bundle_fqid, which is the bundle’s UUID and version separated by a dot (.).

For example, the bundle FQID fff807ba-bc98-4247-a560-49fb90c9675c.2019-08-01T200147.111027Z is a bundle with the UUID fff807ba-bc98-4247-a560-49fb90c9675c and the version number 2019-08-01T200147.111027Z.

This method returns an FQID and URL for each matching bundle.

Example call to dbio post-search:

#!/usr/bin/env bash

dbio dss post-search --replica aws --es-query {} --no-paginate

Example output:

{
  ...
},
{
  "bundle_fqid": "fff807ba-bc98-4247-a560-49fb90c9675c.2019-08-01T200147.111027Z",
  "bundle_url": "https://dss.dev.ucsc-cgp-redwood.org/v1/bundles/fff807ba-bc98-4247-a560-49fb90c9675c?version=2019-08-01T200147.111027Z&replica=aws",
  "search_score": null
},
{
  ...
}

dbio get-subscription(s), dbio put-subscription, dbio delete-subscription

  • get_subscriptions(): Gets a list of users subscription.
  • put_subscription(): Create a collection for the user given a replica and a call-back url.
  • get_subscription(): Given the UUID of the subscription, show a subscription that the user created.
  • delete_subscription(): Given a UUID and rpelica or the subscription, delete the subscription the user created.

Example CLI calls:

#!/usr/bin/env bash

# Creates a sub based given a replica and a url
instance_info=$(dbio dss put-subscription --callback-url https://dcp-cli-tutorials-put-get-delete-sub.data.ucsc-cgp-redwood.org --replica aws) 

ID=`echo ${instance_info} | jq -r '.uuid'`

echo $ID

# Lists all of subs created
dbio dss get-subscriptions --replica aws

# List a sub
dbio dss get-subscription --replica aws --uuid $ID

# Deletes a sub based on a UUID
dbio dss delete-subscription --replica aws --uuid $ID

dbio refresh-swagger

Manually refresh the swagger document.

#!/usr/bin/env bash

dbio dss refresh-swagger

Links: Index / Module Index / Search Page