Submission variations based on request types

As of 2025, we might have to deal with 4 types of submissions, based on the state of the data/metadata:

1. Submission from published project
2. Submission from contributor or unpublished project
3. Tier 1 submission
4. Tier 2 submission

Tier-ed metadata

Integration teams use their own distinct metadata schemas to accommodate their needs. We can convert this Tier-ed schema into the HCA metadata schema in order to proceed with ingestion into the DCP2.0.

More information can be found here, or on the Confluence page.

In cases 1-2 we would need to gather the metadata (case 1), or help the contributor to gather the metadata (case 2).

On the other hand, Tier-ed metadata can be easily converted to the HCA schema, so only limited curation is required for the submission to be complete.
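To illustrate why the conversion needs little curation, here is a minimal sketch of a column-renaming step. It is not the actual converter: the `TIER1_TO_HCA` mapping, its column names and the `convert_row` helper are hypothetical placeholders chosen for illustration, and the real Tier-ed and HCA schemas should be taken from their own documentation.

```python
# Hypothetical mapping from Tier 1 column names to HCA-style field names.
# Both sides are illustrative placeholders, not the real schemas.
TIER1_TO_HCA = {
    "donor_id": "donor_organism.biomaterial_core.biomaterial_id",
    "tissue": "specimen_from_organism.organ.text",
    "sex": "donor_organism.sex",
}


def convert_row(tier1_row, mapping=TIER1_TO_HCA):
    """Rename mapped columns; return (converted, unmapped) dictionaries.

    Columns without a mapping are returned separately so that a wrangler
    can curate them by hand instead of losing them silently.
    """
    converted = {}
    unmapped = {}
    for column, value in tier1_row.items():
        if column in mapping:
            converted[mapping[column]] = value
        else:
            unmapped[column] = value
    return converted, unmapped
```

In this sketch, anything that ends up in `unmapped` is the "limited curation" left for the wrangler.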

1. Submission from published project

Why?

We might have to wrangle a dataset from a publication. One such case is when the contributor is not responding but the project is valuable for the atlas.

How?

flowchart LR
    P[/"Publication"/]
    A[("Archive")]
    DW[Data Wrangler]
    UA[["hca-util upload area"]]
    P --"fill in spreadsheet"--> DW
    A --"access files from archive"--> DW --"upload data"--> UA --> I
    DW --"review spreadsheet"--> I([Ingest]) --> DP[("Data Portal")]

No contact with the contributor is achieved
If no DCA can be acquired, use only Tier 1 equivalent metadata (see here).
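The "no DCA, Tier 1 equivalent metadata only" rule can be sketched as a simple filter over spreadsheet rows. The allow-list `TIER1_EQUIVALENT_FIELDS` and the function `restrict_to_tier1` are hypothetical names with placeholder field names; the authoritative list of Tier 1 equivalent fields is defined elsewhere and is not reproduced here.

```python
# Hypothetical allow-list of "Tier 1 equivalent" fields (placeholder names).
TIER1_EQUIVALENT_FIELDS = {"donor_id", "tissue", "assay"}


def restrict_to_tier1(rows, has_dca, allowed=TIER1_EQUIVALENT_FIELDS):
    """Return copies of the rows, dropping non-allowed fields when no DCA exists."""
    if has_dca:
        # With a DCA in place, nothing needs to be removed.
        return [dict(row) for row in rows]
    return [
        {field: value for field, value in row.items() if field in allowed}
        for row in rows
    ]
```

Returning copies rather than editing the rows in place keeps the full spreadsheet available in case a DCA is obtained later.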

2. Submission from contributor or unpublished project

Why?

We might have to wrangle a dataset that has not been published before, or help a contributor do so. One such case is when the project is an HCA publication.

How?

flowchart LR
    C[Contributor]
    DW[Data Wrangler]
    UA[["hca-util upload area"]]
    C --"fill in spreadsheet"--x DW
    C x--"upload data"--x UA
    UA x--x I
    DW --"share init spreadsheet"--> C
    DW x--"review spreadsheet"--x I([Ingest]) x--x DP[("Data Portal")]

No contact with contributor is achieved
A DCA might be required

3. Tier 1 submission

Why?

We only need to submit Tier 1 metadata, with the analysis file(s). This is useful if we want to add the source study to the Data Portal even though we don't have sequence data or Tier 2 metadata.

How?

flowchart LR
    C[Contributor]
    DW[Data Repository]
    IntT["Integration Teams/ Tracker"]
    C --"Tier 1 metadata"--> IntT
    IntT --"Tier 1 metadata"--> DW
    DW --"Convert to DCP"--> I([Ingest]) --> DP[("Data Portal")]

Contributor has already provided Tier 1 metadata to the Integration Teams
No need for Data Repository to communicate with contributor
No DCA is needed for a Tier 1-only metadata submission
Conversion of Tier-ed metadata can be done via this

4. Tier 2 submission

Why?

We need to submit Tier 2 metadata.

How?

flowchart LR
    C[Contributor]
    IntT["Integration Teams/ Tracker"]
    DW[Data Repository]
    UA[["hca-util upload area"]]
    C --"Tier 1 metadata"--> IntT
    C x--"Tier 2 metadata"--x DW
    C x--"upload data"--x UA
    DW x--"Merge & Convert to DCP"--x I([Ingest]) x--x DP[("Data Portal")]
    UA x--x I
    IntT --"Tier 1 metadata"--> DW

    subgraph legend
        direction LR
        A['encrypted route available'] x--x B['for MA submissions']
    end
    legend ~~~ DP

Contributor has already provided Tier 1 metadata to the Integration Teams
Data Repository needs to communicate with the contributor to sign a DCA and to access Tier 2 and/or sequence data
An MA or OA DCA has to be signed
Merging and conversion of Tier-ed metadata can be done via this
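The merge step, where Tier 1 metadata from the Integration Teams is combined with Tier 2 metadata from the contributor, can be sketched as a join on a shared identifier. This is an illustration only, not the actual merge tool: the function `merge_tiers`, the `donor_id` key and the rule that Tier 1 values win on conflict are all assumptions.

```python
def merge_tiers(tier1_rows, tier2_rows, key="donor_id"):
    """Join Tier 2 rows onto Tier 1 rows by `key`.

    Returns (merged_rows, orphans), where orphans are Tier 2 rows whose key
    has no Tier 1 counterpart and therefore need a wrangler's attention.
    """
    merged = {row[key]: dict(row) for row in tier1_rows}
    orphans = []
    for row in tier2_rows:
        target = merged.get(row[key])
        if target is None:
            orphans.append(row)
            continue
        for field, value in row.items():
            # Assumption: Tier 1 values win on conflict, so only fill
            # fields that are not already present.
            target.setdefault(field, value)
    return list(merged.values()), orphans
```

Reporting orphan rows instead of dropping them makes mismatched identifiers between the two tiers visible before conversion to the HCA schema.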