Skip to main content Link Search Menu Expand Document (external link)

Create a project in ingest

Purpose of this document

This document is intended to give an introduction to the HCA ingest service, specifically targeting data and metadata in the system and how they interact in the ingest ecosystem of data.

These documents will be coupled with a set of python notebooks, which will show examples of how to interact with the data.

This is the create a project doc, which will explain how to create, modify and delete a project in ingest.

Table of contents

  1. Step by step
    1. Static view
    2. Interactive view (Notebook)
  2. Understanding the metadata
    1. Schema fields
    2. Project metadata peculiarities (Non-schema fields)
      1. System fields
      2. Prioritisation fields
      3. Triage fields
  3. Creating your own project metadata
  4. API utilities
    1. Searching for/Retrieving your project
    2. Updating your project
    3. Deleting your project
    4. Accessing other entities from the project

Step by step

In this section, a rendering of a python notebook showing the steps to create a project, step by step, can be expanded from below.

Static view

Project notebook (Markdown render)

Create a project

This notebook is intended to give an insight into how to generate a submission by using the python tools available, the hca-ingest library. This library, amongst other utilities for interacting with the Ingest service, contains a wrapper for Ingest’s API, which lets you easily create, update and delete.

This section will be focused around Projects

Download example project

In order to have the files necessary for this guide, we’re going to download the template file for the project metadata

!wget https://raw.githubusercontent.com/ebi-ait/ingest-programmatic-submissions/main/_data/submission_example/project/example_project.json
--2022-11-09 17:32:55--  https://raw.githubusercontent.com/ebi-ait/ingest-programmatic-submissions/main/_data/submission_example/project/example_project.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.110.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1232 (1.2K) [text/plain]
Saving to: ‘example_project.json’

example_project.jso 100%[===================>]   1.20K  --.-KB/s    in 0s      

2022-11-09 17:32:55 (34.3 MB/s) - ‘example_project.json’ saved [1232/1232]

Set up libraries and dependencies

Install external libraries

!pip install hca-ingest
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting hca-ingest
  Downloading hca-ingest-2.6.0.tar.gz (57 kB)
     |████████████████████████████████| 57 kB 2.7 MB/s 
[?25hCollecting jsonref
  Downloading jsonref-1.0.1-py3-none-any.whl (9.5 kB)
Requirement already satisfied: openpyxl in /usr/local/lib/python3.7/dist-packages (from hca-ingest) (3.0.10)
Collecting polling
  Downloading polling-0.3.2.tar.gz (5.2 kB)
Requirement already satisfied: PyYAML>=5.3.1 in /usr/local/lib/python3.7/dist-packages (from hca-ingest) (6.0)
Requirement already satisfied: requests[security] in /usr/local/lib/python3.7/dist-packages (from hca-ingest) (2.23.0)
Collecting xlsxwriter
  Downloading XlsxWriter-3.0.3-py3-none-any.whl (149 kB)
     |████████████████████████████████| 149 kB 10.9 MB/s 
[?25hCollecting mergedeep
  Downloading mergedeep-1.3.4-py3-none-any.whl (6.4 kB)
Collecting cryptography
  Downloading cryptography-38.0.3-cp36-abi3-manylinux_2_24_x86_64.whl (4.1 MB)
     |████████████████████████████████| 4.1 MB 43.8 MB/s 
[?25hCollecting requests-cache
  Downloading requests_cache-0.9.7-py3-none-any.whl (48 kB)
     |████████████████████████████████| 48 kB 4.6 MB/s 
[?25hRequirement already satisfied: cffi>=1.12 in /usr/local/lib/python3.7/dist-packages (from cryptography->hca-ingest) (1.15.1)
Requirement already satisfied: pycparser in /usr/local/lib/python3.7/dist-packages (from cffi>=1.12->cryptography->hca-ingest) (2.21)
Requirement already satisfied: et-xmlfile in /usr/local/lib/python3.7/dist-packages (from openpyxl->hca-ingest) (1.1.0)
Collecting urllib3>=1.25.5
  Downloading urllib3-1.26.12-py2.py3-none-any.whl (140 kB)
     |████████████████████████████████| 140 kB 72.8 MB/s 
[?25hRequirement already satisfied: appdirs>=1.4.4 in /usr/local/lib/python3.7/dist-packages (from requests-cache->hca-ingest) (1.4.4)
Collecting cattrs>=22.2
  Downloading cattrs-22.2.0-py3-none-any.whl (35 kB)
Requirement already satisfied: attrs>=21.2 in /usr/local/lib/python3.7/dist-packages (from requests-cache->hca-ingest) (22.1.0)
Collecting url-normalize>=1.4
  Downloading url_normalize-1.4.3-py2.py3-none-any.whl (6.8 kB)
Collecting exceptiongroup
  Downloading exceptiongroup-1.0.1-py3-none-any.whl (12 kB)
Requirement already satisfied: typing_extensions in /usr/local/lib/python3.7/dist-packages (from cattrs>=22.2->requests-cache->hca-ingest) (4.1.1)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests[security]->hca-ingest) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests[security]->hca-ingest) (3.0.4)
Collecting urllib3>=1.25.5
  Downloading urllib3-1.25.11-py2.py3-none-any.whl (127 kB)
     |████████████████████████████████| 127 kB 37.2 MB/s 
[?25hRequirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests[security]->hca-ingest) (2022.9.24)
Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from url-normalize>=1.4->requests-cache->hca-ingest) (1.15.0)
Collecting pyOpenSSL>=0.14
  Downloading pyOpenSSL-22.1.0-py3-none-any.whl (57 kB)
     |████████████████████████████████| 57 kB 5.0 MB/s 
[?25hBuilding wheels for collected packages: hca-ingest, polling
  Building wheel for hca-ingest (setup.py) ... [?25l[?25hdone
  Created wheel for hca-ingest: filename=hca_ingest-2.6.0-py3-none-any.whl size=70596 sha256=114859a9f8d95b35d96ab061da4a47a702825e7a9a0015cfdcf7e06ac4300515
  Stored in directory: /root/.cache/pip/wheels/c2/8e/0c/4693dcbc7e2b550c87e3b89a41f0b2e4d98924bc7681c85846
  Building wheel for polling (setup.py) ... [?25l[?25hdone
  Created wheel for polling: filename=polling-0.3.2-py3-none-any.whl size=4129 sha256=3fb1d18a14a77ee5a6864cc4e989d673d80b8421e54621e759012f50d975858a
  Stored in directory: /root/.cache/pip/wheels/e5/3f/0c/54a03b715fce3176335c957ae94d7d0b2a918e89b1b195bace
Successfully built hca-ingest polling
Installing collected packages: urllib3, exceptiongroup, cryptography, url-normalize, pyOpenSSL, cattrs, xlsxwriter, requests-cache, polling, mergedeep, jsonref, hca-ingest
  Attempting uninstall: urllib3
    Found existing installation: urllib3 1.24.3
    Uninstalling urllib3-1.24.3:
      Successfully uninstalled urllib3-1.24.3
Successfully installed cattrs-22.2.0 cryptography-38.0.3 exceptiongroup-1.0.1 hca-ingest-2.6.0 jsonref-1.0.1 mergedeep-1.3.4 polling-0.3.2 pyOpenSSL-22.1.0 requests-cache-0.9.7 url-normalize-1.4.3 urllib3-1.25.11 xlsxwriter-3.0.3

Load libraries

import requests as rq
import json
from hca_ingest.api.ingestapi import IngestApi

Get a token

In order to get a token, you need to log in to the ingest UI: https://staging.contribute.data.humancellatlas.org/. For the purpose of this notebook, we will be using staging. However, that can be change to prod (by deleting the first part of the domain) or to dev (by changing staging to dev) at any point in the process.

If you are going to use any other environment, please remember to change the environment variable in the next section

The steps to obtain the token are detailed in this guide: API tokens

token = "Bearer <paste_token_here>"

Set up environment and global variables

# Environment-related set-up and global variables used across the notebook
accepted_environments = {
    'develop': '.dev',
    'staging': '.staging',
    'production': ''
}

environment = 'staging'  #staging environment by default

# Set up environment value for API's URL
try:
  env_for_url = accepted_environments[environment]
except KeyError:
  print(f"Environment {environment} not recognised. Defaulting to staging")
  env_for_url = accepted_environments['staging']

base_url = f'https://api.ingest{env_for_url}.archive.data.humancellatlas.org'

# Set up API object
api = IngestApi(url=base_url)
headers = api.set_token(token=token)

Create a project

This block of code will be dedicated to creating a project within ingest. The following will be assumed:

  • A JSON entity is available for use as the “content”

For the purpose of this notebook, everything will be performed in the staging environment. To perform this on other environments (e.g. prod), please update the environment variable to any of the values accepted in accepted_environments

# Load the project metadata entity
with open('example_project.json', 'r') as f:
  project_content = json.load(f)

ingest_project = api.create_project(submission_url='', content=project_content)

The returned object is the project as contained by ingest: this object contains the metadata that was submitted in the previous step, but also contains some extra, important metadata:

  • uuid: Unique identifier for your project, generated randomly
  • Management metadata: This metadata comprises metadata that will apply to your experiment, e.g. organs, species used, etc.

We’re going to print the object and take a look

ingest_project
{'content': {'describedBy': 'https://schema.staging.data.humancellatlas.org/type/project/17.0.0/project',
  'schema_type': 'project',
  'project_core': {'project_short_name': 'myCoolLabel',
   'project_title': 'Test_project_with_minimum_information',
   'project_description': 'This is a test project with minimum information for the programmatic submissions guide'},
  'contributors': [{'name': 'Enrique,,Ventura',
    'email': 'enrique@ebi.ac.uk',
    'institution': 'EMBL-EBI',
    'corresponding_contributor': True,
    'project_role': {'text': 'data curator',
     'ontology': 'EFO:0009737',
     'ontology_label': 'data curator'}}],
  'publications': [{'authors': ['Lorem IP', 'Sed UP'],
    'title': 'A combined approach for single-cell mRNA and intracellular protein expression analysis',
    'url': 'https://www.frontiersin.org/articles/10.3389/fcell.2020.00384/full',
    'official_hca_publication': False}],
  'funders': [{'grant_title': 'a cool grant',
    'grant_id': '000000000bp1',
    'organization': 'EMBL-EBI'}]},
 'submissionDate': '2022-11-09T17:36:29.253947Z',
 'updateDate': '2022-11-09T17:36:29.253947Z',
 'user': '5ece3464ec0680746267e784',
 'lastModifiedUser': '5ece3464ec0680746267e784',
 'type': 'Project',
 'uuid': {'uuid': 'e838a25f-8f52-4678-9744-6d650ca65374'},
 'events': [],
 'firstDcpVersion': '2022-11-09T17:36:29.253947Z',
 'dcpVersion': '2022-11-09T17:36:29.253947Z',
 'contentLastModified': '2022-11-09T17:36:29.246070Z',
 'accession': None,
 'validationState': 'Draft',
 'validationErrors': None,
 'graphValidationErrors': None,
 'isUpdate': False,
 'releaseDate': None,
 'accessionDate': None,
 'technology': None,
 'organ': None,
 'cellCount': None,
 'dataAccess': None,
 'identifyingOrganisms': None,
 'primaryWrangler': None,
 'secondaryWrangler': None,
 'wranglingState': None,
 'wranglingPriority': None,
 'wranglingNotes': None,
 'isInCatalogue': None,
 'cataloguedDate': None,
 'publicationsInfo': None,
 'dcpReleaseNumber': None,
 'projectLabels': None,
 'hasOpenSubmission': False,
 '_links': {'self': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786'},
  'project': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786',
   'title': 'A single project'},
  'validating': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/validatingEvent'},
  'bundleManifests': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/bundleManifests',
   'title': 'Access or create bundle manifests (describing which submitted contents went into which bundle in the datastore)'},
  'auditLogs': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/auditLogs'},
  'supplementaryFiles': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/supplementaryFiles'},
  'submissionEnvelopes': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/submissionEnvelopes',
   'title': 'Access or create new submission envelopes'},
  'submissionEnvelope': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/submissionEnvelope',
   'title': 'A single submission envelope'}}}

Everything looks correct, so we will save the identifier for our project (called the uuid) and store it in case we need to retrieve the project later.

# Store project uuid
ingest_project_uuid = ingest_project['uuid']['uuid']

Understanding the information on the project

After printing the resulting ingest_project, you probably have noticed that there is much more meatadata than what was sent; for most entities, this is just system-generated and you don’t need to worry about it.

However, for project metadata, we load some information regarding statuses and general-level metadata for different purposes (e.g. display in the project catalogue).

This project-level metadata is explained in more detail in the create a project guideline associated with this notebook. For now, we will focus on the metadata that we should fill out:

minimum_required_fields = {
    'releaseDate': None,          # Date that you want your data to be released. If the data is to be released as soon as possible, or if data has already been released (e.g. in GEO) input today's date in format: YYYY-MM-DDT00:00:00Z (e.g. 2021-11-29T00:00:00Z)
    'accessionDate': None,        # Same as above, but for accessioning in public archives.
    'technology': None,           # Library preparation technology(ies) used in the experiment, ontologised. More below.
    'organ': None,                # Organ(s) used in the experimnt, ontologised. More below
    'cellCount': None,            # Estimated number of cells generated by this project.
    'dataAccess': None,           # Type of data access, selected from a list of terms. For more detail, refer to readme.
    'identifyingOrganisms': None, # Organism that was used to generate the data, can be: Human, Mouse, or both.
    'primaryWrangler': None,      # Person that is in charge of the project/submission: associated with a user.
    'wranglingState': None,       # Status of the project. For a detailed list of accepted values, refer to readme.
    'wranglingPriority': None,    # 1, 2, or 3. 1 is highest priority and 3 is lowest. Refer to readme for more information.
    'wranglingNotes': None,       # Extra notes associated with the project; feel free to input your own notes here.
    'isInCatalogue': None,        # If the project is to be displayed in the catalogue, True, otherwise False
    }

Adding minimum information

Now, we will be modifying the information on the list above, to make sure we enter the minimum amount of metadata that the project should contain. We’re going to divide the fields in 2 types:

  • Ontologised: fields that are validated against the HCA ontology.
  • Other: Fields that have are not ontologised and that are validated against other premises.

We’re going to start with the ontologised fields.

Ontologised fields

These terms are called “ontologised” because they are validated against a set of restrictions defined both in our validation rules and enforced in the ontologies themselves; for example, organ validates that the term used as an input is validated as a child term, only with relationship subclassOf, of the term anatomical structure(UBERON:0000061). Detailed information on the restrictions can be found in the readme file.

In this category, we have 2 fields:

  • organ: A list of the organs used in this experiment; for this notebook, we’re going to use the terms “lung”(UBERON:0002048) and “heart”(UBERON:0000948).
  • technology: A list of the library preparation technologies used in this experiment; for this notebook, we’re going to use the terms 10x 3' v2(EFO:0009899) and 10x 3' v3(EFO:0009922). This field also accepts free text in case there is no ontology for the term just yet; we are also going to add an entry for this
# Set up organ
organ = {
    "ontologies": [
      {
        "text": "lung",                 # Text field, free string that allows the user to introduce a more exact definition of the term if not available in the ontology
        "ontology": "UBERON:0002048",   # Unique identifier for the ontology term, in the form of <ontology>:<ID>
        "ontology_label": "lung"        # Text field, must exactly match the label provided in the ontology, case sensitive.
      },
      {
        "text": "heart",
        "ontology": "UBERON:0000948",
        "ontology_label": "heart"
      }
    ]
}

# Set up technology
technology = {
    "ontologies": [
      {
        "text": "10x 3' v2",
        "ontology": "EFO:0009899",
        "ontology_label": "10x 3' v2"
      },
      {
        "text": "10x 3' v3",
        "ontology": "EFO:0009922",
        "ontology_label": "10x 3' v3"
      }
    ],
    "others": [
        "Mysupercoollibrarypreptechnology"  # Free text field to introduce as many terms as you want that couldn't be found in the ontology
    ]
}

# pass the values to our variable
minimum_required_fields['organ'] = organ
minimum_required_fields['technology'] = technology
Other fields
# Dates
# Dates must follow the following format: YYYY-MM-DDThh:mm:ssZ
minimum_required_fields['releaseDate'] = "2022-08-30T00:00:00Z"
minimum_required_fields['accessionDate'] = "2022-08-30T00:00:00Z"

# Enum values
# Set of values accepted are predetermined, depending on the field. 
# For the full list of values, please refer to the readme
minimum_required_fields['dataAccess'] = {
                                          "type": "All fully open",
                                          "notes": "Can be released publicly! :D"
                                        }
minimum_required_fields['identifyingOrganisms'] = ["Human", "Mouse", "Other"]
minimum_required_fields['wranglingPriority'] = 1 # Very important project!
minimum_required_fields['wranglingState'] = "Eligible"

# Simple values
# Set of fields that have a simple value; it may be a free string, an integer or a boolean
minimum_required_fields['cellCount'] = 17500
minimum_required_fields['primaryWrangler'] = ingest_project['user'] # User ID is required in this field.
minimum_required_fields['wranglingNotes'] = "This is an awesome project and I will finish it soon"
minimum_required_fields['isInCatalogue'] = True # We want the project to be displayed in the project catalogue

Updating project with missing information

Now that we understand the metadata that we are handling, and that we have filled in the missing bits necessary for a minimum information project, we will update the project with the values that we have been gathering.

Once we have the content that we have to update, the update itself is pretty easy!

# Retrieve project URL to update
ingest_project_url = ingest_project['_links']['self']['href']
response = api.patch(url=ingest_project_url, json=minimum_required_fields)

updated_ingest_project = response.json()

Let’s print the project and check if the changes have made it through!

updated_ingest_project
{'content': {'describedBy': 'https://schema.staging.data.humancellatlas.org/type/project/17.0.0/project',
  'schema_type': 'project',
  'project_core': {'project_short_name': 'myCoolLabel',
   'project_title': 'Test_project_with_minimum_information',
   'project_description': 'This is a test project with minimum information for the programmatic submissions guide'},
  'contributors': [{'name': 'Enrique,,Ventura',
    'email': 'enrique@ebi.ac.uk',
    'institution': 'EMBL-EBI',
    'corresponding_contributor': True,
    'project_role': {'text': 'data curator',
     'ontology': 'EFO:0009737',
     'ontology_label': 'data curator'}}],
  'publications': [{'authors': ['Lorem IP', 'Sed UP'],
    'title': 'A combined approach for single-cell mRNA and intracellular protein expression analysis',
    'url': 'https://www.frontiersin.org/articles/10.3389/fcell.2020.00384/full',
    'official_hca_publication': False}],
  'funders': [{'grant_title': 'a cool grant',
    'grant_id': '000000000bp1',
    'organization': 'EMBL-EBI'}]},
 'submissionDate': '2022-11-09T17:36:29.253Z',
 'updateDate': '2022-11-09T17:37:57.227484Z',
 'user': '5ece3464ec0680746267e784',
 'lastModifiedUser': '5ece3464ec0680746267e784',
 'type': 'Project',
 'uuid': {'uuid': 'e838a25f-8f52-4678-9744-6d650ca65374'},
 'events': [],
 'firstDcpVersion': '2022-11-09T17:36:29.253Z',
 'dcpVersion': '2022-11-09T17:36:29.253Z',
 'contentLastModified': '2022-11-09T17:36:29.246Z',
 'accession': None,
 'validationState': 'Valid',
 'validationErrors': [],
 'graphValidationErrors': None,
 'isUpdate': False,
 'releaseDate': '2022-08-30T00:00:00Z',
 'accessionDate': '2022-08-30T00:00:00Z',
 'technology': {'ontologies': [{'text': "10x 3' v2",
    'ontology': 'EFO:0009899',
    'ontology_label': "10x 3' v2"},
   {'text': "10x 3' v3",
    'ontology': 'EFO:0009922',
    'ontology_label': "10x 3' v3"}],
  'others': ['Mysupercoollibrarypreptechnology']},
 'organ': {'ontologies': [{'text': 'lung',
    'ontology': 'UBERON:0002048',
    'ontology_label': 'lung'},
   {'text': 'heart',
    'ontology': 'UBERON:0000948',
    'ontology_label': 'heart'}]},
 'cellCount': 17500,
 'dataAccess': {'type': 'All fully open',
  'notes': 'Can be released publicly! :D'},
 'identifyingOrganisms': ['Human', 'Mouse', 'Other'],
 'primaryWrangler': '5ece3464ec0680746267e784',
 'secondaryWrangler': None,
 'wranglingState': 'Eligible',
 'wranglingPriority': 1,
 'wranglingNotes': 'This is an awesome project and I will finish it soon',
 'isInCatalogue': True,
 'cataloguedDate': '2022-11-09T17:37:57.207027Z',
 'publicationsInfo': None,
 'dcpReleaseNumber': None,
 'projectLabels': None,
 'hasOpenSubmission': False,
 '_links': {'self': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786'},
  'project': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786',
   'title': 'A single project'},
  'processing': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/processingEvent'},
  'draft': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/draftEvent'},
  'bundleManifests': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/bundleManifests',
   'title': 'Access or create bundle manifests (describing which submitted contents went into which bundle in the datastore)'},
  'auditLogs': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/auditLogs'},
  'supplementaryFiles': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/supplementaryFiles'},
  'submissionEnvelopes': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/submissionEnvelopes',
   'title': 'Access or create new submission envelopes'},
  'submissionEnvelope': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/submissionEnvelope',
   'title': 'A single submission envelope'}}}

And we have our project, updated, with the minimum required metadata!

Retrieve a project

Once we have created a project with minimum information, we may want to retrieve the project to do further things with it (Add more metadata, check status, etc). In order to do this, we are going to use one of the many functions that we have available to retrieve a project:

  • IngestApi.get_project_by_uuid: Retrieves a single project with a UUID

But there are other functions available, in case you don’t have the UUID at hand or can’t remember, listed below:

Functions to search for projects
  • .get_user_projects: Retrieve all the projects associated with your user (Requires token to be set)
  • .get_project_by_id: Retrieve a project with the MongoDB ID provided
ingest_project = api.get_project_by_uuid(ingest_project_uuid)

Let’s ensure we have retrieved our project correctly:

ingest_project
{'content': {'describedBy': 'https://schema.staging.data.humancellatlas.org/type/project/17.0.0/project',
  'schema_type': 'project',
  'project_core': {'project_short_name': 'myCoolLabel',
   'project_title': 'Test_project_with_minimum_information',
   'project_description': 'This is a test project with minimum information for the programmatic submissions guide'},
  'contributors': [{'name': 'Enrique,,Ventura',
    'email': 'enrique@ebi.ac.uk',
    'institution': 'EMBL-EBI',
    'corresponding_contributor': True,
    'project_role': {'text': 'data curator',
     'ontology': 'EFO:0009737',
     'ontology_label': 'data curator'}}],
  'publications': [{'authors': ['Lorem IP', 'Sed UP'],
    'title': 'A combined approach for single-cell mRNA and intracellular protein expression analysis',
    'url': 'https://www.frontiersin.org/articles/10.3389/fcell.2020.00384/full',
    'official_hca_publication': False}],
  'funders': [{'grant_title': 'a cool grant',
    'grant_id': '000000000bp1',
    'organization': 'EMBL-EBI'}]},
 'submissionDate': '2022-11-09T17:36:29.253Z',
 'updateDate': '2022-11-09T17:37:57.227Z',
 'user': '5ece3464ec0680746267e784',
 'lastModifiedUser': '5ece3464ec0680746267e784',
 'type': 'Project',
 'uuid': {'uuid': 'e838a25f-8f52-4678-9744-6d650ca65374'},
 'events': [],
 'firstDcpVersion': '2022-11-09T17:36:29.253Z',
 'dcpVersion': '2022-11-09T17:36:29.253Z',
 'contentLastModified': '2022-11-09T17:36:29.246Z',
 'accession': None,
 'validationState': 'Valid',
 'validationErrors': [],
 'graphValidationErrors': None,
 'isUpdate': False,
 'releaseDate': '2022-08-30T00:00:00Z',
 'accessionDate': '2022-08-30T00:00:00Z',
 'technology': {'ontologies': [{'text': "10x 3' v2",
    'ontology': 'EFO:0009899',
    'ontology_label': "10x 3' v2"},
   {'text': "10x 3' v3",
    'ontology': 'EFO:0009922',
    'ontology_label': "10x 3' v3"}],
  'others': ['Mysupercoollibrarypreptechnology']},
 'organ': {'ontologies': [{'text': 'lung',
    'ontology': 'UBERON:0002048',
    'ontology_label': 'lung'},
   {'text': 'heart',
    'ontology': 'UBERON:0000948',
    'ontology_label': 'heart'}]},
 'cellCount': 17500,
 'dataAccess': {'type': 'All fully open',
  'notes': 'Can be released publicly! :D'},
 'identifyingOrganisms': ['Human', 'Mouse', 'Other'],
 'primaryWrangler': '5ece3464ec0680746267e784',
 'secondaryWrangler': None,
 'wranglingState': 'Eligible',
 'wranglingPriority': 1,
 'wranglingNotes': 'This is an awesome project and I will finish it soon',
 'isInCatalogue': True,
 'cataloguedDate': '2022-11-09T17:37:57.207Z',
 'publicationsInfo': None,
 'dcpReleaseNumber': None,
 'projectLabels': None,
 'hasOpenSubmission': False,
 '_links': {'self': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786'},
  'project': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786',
   'title': 'A single project'},
  'processing': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/processingEvent'},
  'draft': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/draftEvent'},
  'bundleManifests': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/bundleManifests',
   'title': 'Access or create bundle manifests (describing which submitted contents went into which bundle in the datastore)'},
  'auditLogs': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/auditLogs'},
  'supplementaryFiles': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/supplementaryFiles'},
  'submissionEnvelopes': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/submissionEnvelopes',
   'title': 'Access or create new submission envelopes'},
  'submissionEnvelope': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/submissionEnvelope',
   'title': 'A single submission envelope'}}}

Check status of a project

When a project (or any piece of metadata) is updated to ingest, it gets validated, the content being validated against the schema it is pointing to (on the describedBy field), and in the case of the project, the base fields validating against other set of rules.

The ingest service has the ability to provide with a full report of these validation events, including the status of the entity and the error messages.

On this section, we will focus on retrieving the errors (currently none) of the project we just uploaded and we will update the project to artificially produce a couple of errors. We will then retrieve the project again and check on the errors, but for a detailed explanation of each type of error, please refer to the Readme file.

# Print validation errors
validation_errors = ingest_project['validationErrors']
print(f"Validation errors: {validation_errors if validation_errors else None}")

# Print validation status
validation_status = ingest_project['validationState']
print(f"Validation status: {validation_status}")
Validation errors: None
Validation status: Valid
non_valid_content = ingest_project['content']
non_valid_content['estimated_cell_count'] = '17500'             # Cell count should always be an integer
non_valid_content['insdc_project_accessions'] =  [              # INSDC project accessions:
                                      'GSE7777777',   # SHOULD NOT be a GEO series accession
                                      'SRP000000',    # SHOULD follow SRPXXXXXX format
                                      '',             # SHOULD NOT be an empty string
                                      347289347       # SHOULD NOT be a numer
                                      ]

non_valid_values =  {    
                      "content" : non_valid_content   # Patching "content" field
                    } 

# Patch the non_valid content into the project content
response = api.patch(url=ingest_project_url, json=non_valid_values)
response.json()
{'content': {'describedBy': 'https://schema.staging.data.humancellatlas.org/type/project/17.0.0/project',
  'schema_type': 'project',
  'project_core': {'project_short_name': 'myCoolLabel',
   'project_title': 'Test_project_with_minimum_information',
   'project_description': 'This is a test project with minimum information for the programmatic submissions guide'},
  'contributors': [{'name': 'Enrique,,Ventura',
    'email': 'enrique@ebi.ac.uk',
    'institution': 'EMBL-EBI',
    'corresponding_contributor': True,
    'project_role': {'text': 'data curator',
     'ontology': 'EFO:0009737',
     'ontology_label': 'data curator'}}],
  'publications': [{'authors': ['Lorem IP', 'Sed UP'],
    'title': 'A combined approach for single-cell mRNA and intracellular protein expression analysis',
    'url': 'https://www.frontiersin.org/articles/10.3389/fcell.2020.00384/full',
    'official_hca_publication': False}],
  'funders': [{'grant_title': 'a cool grant',
    'grant_id': '000000000bp1',
    'organization': 'EMBL-EBI'}],
  'estimated_cell_count': '17500',
  'insdc_project_accessions': ['GSE7777777', 'SRP000000', '', 347289347]},
 'submissionDate': '2022-11-09T17:36:29.253Z',
 'updateDate': '2022-11-09T17:38:35.057769Z',
 'user': '5ece3464ec0680746267e784',
 'lastModifiedUser': '5ece3464ec0680746267e784',
 'type': 'Project',
 'uuid': {'uuid': 'e838a25f-8f52-4678-9744-6d650ca65374'},
 'events': [],
 'firstDcpVersion': '2022-11-09T17:36:29.253Z',
 'dcpVersion': '2022-11-09T17:38:35.057205Z',
 'contentLastModified': '2022-11-09T17:38:35.057205Z',
 'accession': None,
 'validationState': 'Valid',
 'validationErrors': [],
 'graphValidationErrors': None,
 'isUpdate': False,
 'releaseDate': '2022-08-30T00:00:00Z',
 'accessionDate': '2022-08-30T00:00:00Z',
 'technology': {'ontologies': [{'text': "10x 3' v2",
    'ontology': 'EFO:0009899',
    'ontology_label': "10x 3' v2"},
   {'text': "10x 3' v3",
    'ontology': 'EFO:0009922',
    'ontology_label': "10x 3' v3"}],
  'others': ['Mysupercoollibrarypreptechnology']},
 'organ': {'ontologies': [{'text': 'lung',
    'ontology': 'UBERON:0002048',
    'ontology_label': 'lung'},
   {'text': 'heart',
    'ontology': 'UBERON:0000948',
    'ontology_label': 'heart'}]},
 'cellCount': 17500,
 'dataAccess': {'type': 'All fully open',
  'notes': 'Can be released publicly! :D'},
 'identifyingOrganisms': ['Human', 'Mouse', 'Other'],
 'primaryWrangler': '5ece3464ec0680746267e784',
 'secondaryWrangler': None,
 'wranglingState': None,
 'wranglingPriority': 1,
 'wranglingNotes': 'This is an awesome project and I will finish it soon',
 'isInCatalogue': True,
 'cataloguedDate': '2022-11-09T17:37:57.207Z',
 'publicationsInfo': None,
 'dcpReleaseNumber': None,
 'projectLabels': None,
 'hasOpenSubmission': False,
 '_links': {'self': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786'},
  'project': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786',
   'title': 'A single project'},
  'processing': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/processingEvent'},
  'draft': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/draftEvent'},
  'bundleManifests': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/bundleManifests',
   'title': 'Access or create bundle manifests (describing which submitted contents went into which bundle in the datastore)'},
  'auditLogs': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/auditLogs'},
  'supplementaryFiles': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/supplementaryFiles'},
  'submissionEnvelopes': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/submissionEnvelopes',
   'title': 'Access or create new submission envelopes'},
  'submissionEnvelope': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/636be51d897ce65cad07c786/submissionEnvelope',
   'title': 'A single submission envelope'}}}

After patching the project with invalid values, let’s repeat the check we did previously.

ingest_project = api.get_project_by_uuid(ingest_project_uuid)
# Print validation errors
validation_errors = ingest_project['validationErrors']
newline = '\n'
print(f"Validation errors: {validation_errors if validation_errors else None}")

# Print validation status
validation_status = ingest_project['validationState']
print(f"Validation status: {validation_status}")
Validation errors: [{'errorType': 'METADATA_ERROR', 'message': 'should match pattern "^[D|E|S]RP[0-9]+$"', 'userFriendlyMessage': 'should match pattern "^[D|E|S]RP[0-9]+$" at .insdc_project_accessions[0]', 'absoluteDataPath': '.insdc_project_accessions[0]'}, {'errorType': 'METADATA_ERROR', 'message': 'should match pattern "^[D|E|S]RP[0-9]+$"', 'userFriendlyMessage': 'should match pattern "^[D|E|S]RP[0-9]+$" at .insdc_project_accessions[2]', 'absoluteDataPath': '.insdc_project_accessions[2]'}, {'errorType': 'METADATA_ERROR', 'message': 'should be string', 'userFriendlyMessage': 'should be string at .insdc_project_accessions[3]', 'absoluteDataPath': '.insdc_project_accessions[3]'}, {'errorType': 'METADATA_ERROR', 'message': 'should be integer', 'userFriendlyMessage': 'should be integer at .estimated_cell_count', 'absoluteDataPath': '.estimated_cell_count'}]
Validation status: Invalid

As we can see, this time it has returned 2 things:

  • A set of errors, comprised in a list that details the errors, from type to message.
  • Validation status: invalid, indicating that the validation went wrong.

For detailed information on how to understand the errors, please proceed to the “readme.md” file.

Delete a project

Projects in our database can be deleted. While we do not advise to delete projects once they have been published in the data portal (uuid identifiers are important for updates), at any point before finishing the submission (Later in the notebook), any metadata entity can be deleted, including projects.

# Delete ingest project and check everything went correctly
response = api.delete(ingest_project_url)

assert response.status_code == 204

If the status code of the response is 204, the project has been deleted!

Addendum

Updating projects

Deleting a field/Replacing all values

Deleting a field requires a slightly different sort of operation; up until now, we have used patch to address field modifications. However, if we want to delete a field or replace all values, we would need to delete the field from the content, and then PUT the whole content of the project entity to the project URL.

This operation will completely replace the older entry with the new one; using the old one as a template ensures critical fields (e.g. uuid) get preserved over this operation.

Adding new fields

When adding new fields, considering the type of field that is going to be added is essential; nested properties and arrays can’t be just modified through a patch operation, they need the document to be partially (or entirely) replaced

In this notebook, we are going to add 2 fields:

  • A completely new field, available in the schema, insdc_project_accessions
  • A new publication that we want associated to this project, without deleting the already existing one.
# Adding the INSDC project accession
response = api.patch(url=ingest_project_url, patch={"content": {"insdc_project_accessions": ["SRP000000"]}})
assert response.status_code == 200
updated_project = api.get_project_by_uuid(ingest_project_uuid)

# Let's print the project and ensure the modification has gone through!
updated_project
{'content': {'insdc_project_accessions': ['SRP000000']},
 'submissionDate': '2022-08-31T14:05:37.625Z',
 'updateDate': '2022-08-31T14:07:18.255Z',
 'user': '5ece3464ec0680746267e784',
 'lastModifiedUser': '5ece3464ec0680746267e784',
 'type': 'Project',
 'uuid': {'uuid': '019b3b05-903b-4b85-bdae-1e10589ccd06'},
 'events': [],
 'firstDcpVersion': '2022-08-31T14:05:37.625Z',
 'dcpVersion': '2022-08-31T14:07:18.252Z',
 'contentLastModified': '2022-08-31T14:07:18.252Z',
 'accession': None,
 'validationState': 'Draft',
 'validationErrors': [],
 'graphValidationErrors': None,
 'isUpdate': False,
 'releaseDate': '2022-08-30T00:00:00Z',
 'accessionDate': '2022-08-30T00:00:00Z',
 'technology': {'ontologies': [{'text': "10x 3' v2",
    'ontology': 'EFO:0009899',
    'ontology_label': "10x 3' v2"},
   {'text': "10x 3' v3",
    'ontology': 'EFO:0009922',
    'ontology_label': "10x 3' v3"}],
  'others': ['Mysupercoollibrarypreptechnology']},
 'organ': {'ontologies': [{'text': 'lung',
    'ontology': 'UBERON:0002048',
    'ontology_label': 'lung'},
   {'text': 'heart',
    'ontology': 'UBERON:0000948',
    'ontology_label': 'heart'}]},
 'cellCount': 17500,
 'dataAccess': {'type': 'All fully open',
  'notes': 'Can be released publicly! :D'},
 'identifyingOrganisms': ['Human', 'Mouse', 'Other'],
 'primaryWrangler': '5ece3464ec0680746267e784',
 'secondaryWrangler': None,
 'wranglingState': 'Eligible',
 'wranglingPriority': 1,
 'wranglingNotes': 'This is an awesome project and I will finish it soon',
 'isInCatalogue': True,
 'cataloguedDate': '2022-08-31T14:06:06.624Z',
 'publicationsInfo': None,
 'dcpReleaseNumber': None,
 'projectLabels': None,
 'hasOpenSubmission': False,
 '_links': {'self': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/630f6ab106f1711fccbe4d69'},
  'project': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/630f6ab106f1711fccbe4d69',
   'title': 'A single project'},
  'validating': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/630f6ab106f1711fccbe4d69/validatingEvent'},
  'bundleManifests': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/630f6ab106f1711fccbe4d69/bundleManifests',
   'title': 'Access or create bundle manifests (describing which submitted contents went into which bundle in the datastore)'},
  'auditLogs': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/630f6ab106f1711fccbe4d69/auditLogs'},
  'supplementaryFiles': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/630f6ab106f1711fccbe4d69/supplementaryFiles'},
  'submissionEnvelopes': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/630f6ab106f1711fccbe4d69/submissionEnvelopes',
   'title': 'Access or create new submission envelopes'},
  'submissionEnvelope': {'href': 'https://api.ingest.staging.archive.data.humancellatlas.org/projects/630f6ab106f1711fccbe4d69/submissionEnvelope',
   'title': 'A single submission envelope'}}}

Interactive view (Notebook)

Open In Colab

Download Notebook

Understanding the metadata

In the example above, this is the metadata that we have initially sent:

Project JSON metadata
{
  "describedBy": "https://schema.staging.data.humancellatlas.org/type/project/17.0.0/project",
  "schema_type": "project",
  "project_core": {
    "project_short_name": "myCoolLabel",
    "project_title": "Test_project_with_minimum_information",
    "project_description": "This is a test project with minimum information for the programmatic submissions guide"
  },
  "contributors": [
    {
      "name": "Enrique,,Ventura",
      "email": "enrique@ebi.ac.uk",
      "institution": "EMBL-EBI",
      "corresponding_contributor": true,
      "project_role": {
        "text": "data curator",
        "ontology": "EFO:0009737",
        "ontology_label": "data curator"
      }
    }
  ],
  "publications": [
    {
      "authors": [
        "Lorem IP",
        "Sed UP"
      ],
      "title": "A combined approach for single-cell mRNA and intracellular protein expression analysis",
      "url": "https://www.frontiersin.org/articles/10.3389/fcell.2020.00384/full",
      "official_hca_publication": false
    }
  ],
  "funders": [
    {
      "grant_title": "a cool grant",
      "grant_id": "000000000bp1",
      "organization": "EMBL-EBI"
    }
  ]
}

This piece of metadata, in JSON format, is what we call an instance of a schema. These instances point out to the schema it should be validated against, in this case, the project v17.0.0 schema.

This schema validates all the necessary information, and this information is what will be submitted downstream (e.g. to the HCA-DCP Data portal). However, the project JSON object has another type of metadata. This metadata is mainly used by the ingest system, both to filter datasets within the system (triage operations) and to expose minimal information about datasets that are of interest to the Human Cell Atlas (HCA Project catalogue)

In the following subsections, there will be an explanation on how to understand specifically the project schema and how to expand on it.

Schema fields

All the schema fields that are accepted can be found under the project type schema. Each field is self-describing, meaning that by looking at the schema, you should be able to figure out what it is and how to fill out the information for that field.

A very human-readable webpage is also built from the schemas and can be accessed here: metadata dictionary

Project metadata peculiarities (Non-schema fields)

As stated previously, the project metadata schema captures some extra information with the goal to:

  • Help the system understand and show the ownership/provenance/status of the metadata (System)
  • Help prioritisation of the work for the HCA wranglers (Prioritisation)
  • Understand the process the current project may go through (Triage)
  • Display current landscape of HCA data quickly into the HCA DCP Project Catalogue (Triage)

There are many fields available outside of the content field; here, we will go one by one, explaining them in detail.

System fields

These fields will be filled automatically. You do not need to worry about them. Most of the fields are shared amongst all entities in ingest; for more information, you can go back to System-specific fields. Nonetheless, the project has 4 exclusive system fields that are unique to the project and showcase provenance information for the project metadata:

  • cataloguedDate: Date at which the project was first exposed in the HCA DCP Project Catalogue.
  • publicationsInfo: Information about the publication related to this dataset. This information is surfaced automatically from the publication-related schema fields.
  • hasOpenSubmission: Boolean value, indicating if the project is linked to an open submission; more on submission statuses on the submissionEnvelopes document.
  • accession: UI-focused field; for programmatic submissions, it is not necessary.
Prioritisation fields

As a user, you do not need to worry about these fields. Feel free to skip filling this information; this will be used by the wranglers

  • primaryWrangler: ID of the user that is mainly responsible for wrangling this data.
  • secondaryWrangler: ID of another user related to this project; not necessarily the main person of contact.
  • wranglingState: Wrangling status, which may be one of the following values:
    • New
    • Eligible
    • Not eligible
    • In progress
    • Stalled
    • Submitted
    • Published in DCP
    • Deleted
    • New suggestion
  • wranglingPriority: Priority assigned to the dataset, from 1 (More important) to 3.
  • wranglingNotes: Free text string to include notes about the project.
Triage fields
  • dcpReleaseNumber: Number of DCP release this has been released to.
  • projectLabels: Label of the project. This field helps wranglers assess the route the data will go through.
  • isInCatalogue: Boolean value, indicating if this project is to be surfaced in the catalogue. If true, the information will appear immediately in the project catalogue.
  • releaseDate: Date of release of the data, in YYYY-MM-DDThh:mm:ssZ format
  • accessionDate: Date by which the data wants to be archived, in YYYY-MM-DDThh:mm:ssZ format
  • dataAccess: Type of access for the data from this dataset. Currently, we only accept raw data for full open access datasets or count matrices/processed data for datasets that require controlled access. Its value is an object, that contains:
    • type: Value extracted from the following enum:
      • All fully open
      • All managed access
      • A mixture of open and managed
      • it's complicated
    • notes: a free string value for the user to put the data access restrictions in context.

The next set of fields are especially important, since they get surfaced in the project catalogue:

  • cellCount: Number of cells that were newly generated in this dataset
  • identifyingOrganisms: Array of strings, that can accept any of these values:
    • Human
    • Mouse
    • Other
  • technology: Array of library preparation technologies that were used to generate the data for this project. It is an ontologised field, that only accepts terms under library preparation . However, since it’s outside of the schema, it is encapsulated with another option. The way to fill it:
    • ontologies: Array of objects, each following the ontologies format
    • others: Array of free strings, in case the ontology value is not available for the technology Example:
      {"technology": 
      {"ontologies": [
      {
        "text": "10x 3' v2",
        "ontology": "EFO:0009899",
        "ontology_label": "10x 3' v2"
      }
      ]}, 
      "others": [
        "CoolTech"
      ]
      }
      
  • organ: Very similar to technologies, but only allowing the ontologies field. The ontology value is evaluated to be a term under anatomical entity.

Creating your own project metadata

In the previous section, you have learnt about the project metadata schema, its allowed properties and how they are supposed to be filled. With this knowledge in mind, you can create your own entity, that could be used to create a project.

There are no limits to how many projects you can create, so try to add and experiment! (But don’t forget to delete them afterwards!).

The only thing that you need to remember is to add the out-of-schema information afterwards. The easiest way to do that is by retrieving the project (If not already retrieved) and modifying the information according to the guidelines of the project metadata peculiarities section.

API utilities

Searching for/Retrieving your project

If you have generated a project in ingest, but you have forgotten its UUID, there are a couple ways you can retrieve it back:

  • Using the IngestApi function: IngestApi.get_user_projects. This requires you to have set up your token beforehand, and it will retrieve all the projects that you have created.
  • Going to the /user/projects/ endpoint in the API. If you send a GET request to this endpoint, with the token set-up, you will be able to retrieve all the projects that are assigned to yourself.

If you remember the project UUID, you can also retrieve the project by calling IngestApi.get_project_by_uuid(uuid)

Updating your project

Once you have the project again, updating the project is very simple with the IngestApi object:

  • Updating metadata schema fields: Send a PATCH request with {'content': {<fields + values you want to update>}}, e.g. api.patch(url=ingest_project_url, json={"content": {"insdc_project_accessions": ["SRP000000"]}})
  • Updating non-metadata schema fields: Send a PUT request with an updated version of your project

When updating using the IngestApi.put/patch functions, remember to give the project._links.self.href url as the url parameter and the content as the json parameter.

The response, if successful, will be the updated version of the project. You can inspect it to ensure the patch has gone through.

The most common mistake is to not have your token set up; if you are getting 400 error responses, please check that the token has not expired

Deleting your project

If you want to delete your project, you just need to send a DELETE request to the URL of your project. Example: api.delete(ingest_project_url).

The returning value will always be a response object; to ensure that it has been deleted successfully, you can assert that it’s a 204 response code by checking the object.

Accessing other entities from the project

From the project’s _links field, you can access a number of other entities related to this project:

  • self: link to itself
  • project: link to the project; equivalent to self
  • draft: link to put the project in Draft status; requires an empty POST request.
  • supplementaryFiles: Supplementary files associated with this project
  • submissionEnvelopes: Submissions associated with this project
  • submissionEnvelope: OUTDATED, please ignore
  • bundleManifests: OUTDATED, please ignore
  • auditLogs: OUTDATED, please ignore