Public and Publishing: Difference between revisions

From Network for Advanced NMR

Revision as of 13:54, 28 May 2025

Overview

NAN distinguishes between "public" and "published" datasets to balance accessibility with data integrity and provenance.

Public Datasets

A dataset marked as public is viewable through the data browser by anyone, including users without a NAN account. Authorized users may continue to edit public datasets.

  • Users with proper permissions may manually mark datasets as public at any time.
  • By policy, datasets are automatically made public three years after archival.
  • Within six months of the scheduled public release date, users may opt to postpone the release by one additional year.
  • The public release date is displayed in the data browser and can be used for sorting and filtering.
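
The release-date arithmetic above can be sketched in a few lines of Python. This is only an illustration, not NAN's implementation: the function names are hypothetical, and the sketch assumes that "within six months of" means the six months leading up to the scheduled release date, which the policy text does not state explicitly.

```python
import calendar
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Shift a date by whole months, clamping the day to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def public_release_date(archived_on: date) -> date:
    """Datasets become public three years after archival."""
    return shift_months(archived_on, 36)


def can_extend(release_on: date, today: date) -> bool:
    """Assumption: the extension window is the six months before release."""
    return shift_months(release_on, -6) <= today < release_on


def extended_release_date(release_on: date) -> date:
    """An extension postpones the release by one additional year."""
    return shift_months(release_on, 12)


release = public_release_date(date(2022, 5, 28))
print(release)                                 # 2025-05-28
print(can_extend(release, date(2025, 1, 15)))  # True
print(extended_release_date(release))          # 2026-05-28
```

Clamping the day handles archival dates such as 29 February, which have no exact counterpart three years later.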

Published Datasets

Publishing a dataset performs the following:

  • Makes the dataset public
  • Creates an immutable, versioned snapshot of the dataset
  • Assigns an ARK (Archival Resource Key) persistent identifier

The published version includes:

  • All dataset files
  • Metadata and database records
  • Associated samples
  • Any supplemental data

The original (parent) dataset remains fully editable. A reference is maintained between the parent and all published versions. If a user requests to publish a dataset that has not changed since the last publication, a new version is not created.

Publishing Dataset Collections

Dataset collections may also be published. When a collection is published:

  • Each dataset within the collection is individually published following the same process
  • The collection itself receives a dedicated ARK identifier
  • The ARK resolves to the published collection view in the data browser
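
As a rough sketch of the collection workflow, the Python below publishes each member dataset individually and then gives the collection its own identifier. All names here (`mint_ark`, `publish_dataset`, `publish_collection`) are hypothetical, and the identifiers are placeholders produced by a local counter; a real system would obtain ARKs from its minting service.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import count

_ark_counter = count(1)


def mint_ark() -> str:
    """Return a placeholder ARK string (hypothetical; not a real minter)."""
    return f"ark:/99999/example{next(_ark_counter)}"


@dataclass(frozen=True)
class PublishedDataset:
    name: str
    ark: str


@dataclass(frozen=True)
class PublishedCollection:
    name: str
    ark: str
    datasets: tuple[PublishedDataset, ...]


def publish_dataset(name: str) -> PublishedDataset:
    """Stand-in for the single-dataset publishing process."""
    return PublishedDataset(name=name, ark=mint_ark())


def publish_collection(name: str, dataset_names: list[str]) -> PublishedCollection:
    """Publish every member dataset, then mint a dedicated collection ARK."""
    published = tuple(publish_dataset(n) for n in dataset_names)
    return PublishedCollection(name=name, ark=mint_ark(), datasets=published)


collection = publish_collection("titration-series", ["sample-a", "sample-b"])
print(collection.ark)
for ds in collection.datasets:
    print(ds.name, ds.ark)
```

The frozen dataclasses mirror the idea that a published record is not edited after the fact.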

Versioning and Provenance

  • When a dataset is re-published, a new version number is created (e.g., V1, V2, V3).
    • If no changes have been made to the dataset since it was last published, a new version is not created.
  • All changes to datasets are tracked in a provenance record.
  • All published versions and the original dataset are linked together so that users can see all versions of a given dataset.
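
The versioning rules above can be illustrated with a minimal Python sketch. It is not NAN's implementation: the class and method names are hypothetical, and detecting "no changes" by comparing a SHA-256 fingerprint of the dataset content is an assumption made for the example. The content is assumed to be JSON-serializable.

```python
from __future__ import annotations

import copy
import hashlib
import json
from dataclasses import dataclass, field


def fingerprint(content: dict) -> str:
    """Hash a canonical JSON form of the content (assumed change detector)."""
    canonical = json.dumps(content, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


@dataclass
class ParentDataset:
    content: dict
    versions: list[dict] = field(default_factory=list)   # linked published versions
    provenance: list[str] = field(default_factory=list)  # ordered change log

    def edit(self, key: str, value) -> None:
        """The parent stays editable; every change is logged."""
        self.content[key] = value
        self.provenance.append(f"edit {key}")

    def publish(self) -> str:
        """Create V1, V2, ... unless nothing changed since the last version."""
        digest = fingerprint(self.content)
        if self.versions and self.versions[-1]["digest"] == digest:
            return self.versions[-1]["label"]
        label = f"V{len(self.versions) + 1}"
        snapshot = copy.deepcopy(self.content)  # later edits cannot alter it
        self.versions.append({"label": label, "digest": digest, "snapshot": snapshot})
        self.provenance.append(f"publish {label}")
        return label


ds = ParentDataset({"files": ["fid"], "title": "draft"})
print(ds.publish())  # V1
print(ds.publish())  # V1
ds.edit("title", "final")
print(ds.publish())  # V2
```

Keeping the version list on the parent is one simple way to model the link between the original dataset and all of its published versions.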