NAN Archive
From Network for Advanced NMR
Latest revision as of 19:35, 1 August 2025
Overview
While no system can guarantee perfect security or trust, NAN goes to great lengths to safeguard data in the NAN Archive.
The NAN Archive consists of:
- A Postgres database holding all metadata records across the NAN portal and NDTS
- Network-attached storage (NAS) for all files associated with datasets
- Disaster recovery storage containing immutable dataset backups
NAN Postgres Database
- Hosted as a virtual machine (VM) with a virtual datastore
- Replicated in near real time to a second VM on a different physical server in a separate datacenter
- The secondary VM uses a distinct NAS system for added resilience
- In case of failure, the replica can be promoted to primary within minutes
- Hourly backups to a separate NAS system
- In the unlikely event both the primary and replica fail, the system can be restored from backups (recovery may take several hours)
- All NAS systems supporting the database and backups:
  - Are continuously monitored
  - Feature high data durability
  - Are under vendor hardware/software support
  - Employ daily (or more frequent) snapshots, retained for weeks, allowing recovery of VMs to recent states
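The recovery hierarchy above (replica promotion within minutes, backup restore taking hours) can be sketched as a simple decision function. This is an illustration of the described policy only; the function name and return strings are hypothetical, not NAN's actual tooling.

```python
def failover_action(primary_up: bool, replica_up: bool) -> str:
    """Recovery hierarchy for the metadata database: serve from the
    primary when healthy; promote the replica if the primary fails;
    fall back to restoring from hourly backups only if both the
    primary and the replica are down."""
    if primary_up:
        return "serve from primary"
    if replica_up:
        return "promote replica to primary"  # downtime of minutes
    return "restore from hourly backups"     # may take several hours

print(failover_action(primary_up=False, replica_up=True))
```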
Metadata Provenance Tracking
- Changes to dataset metadata are stored in immutable audit tables
- Complete change history is preserved to ensure traceability
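An append-only audit table of the kind described above can be enforced at the database level with triggers: changes to metadata are logged automatically, and the audit table itself rejects any UPDATE or DELETE. The sketch below uses SQLite for a self-contained illustration (NAN's database is Postgres); all table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dataset_metadata (
    dataset_id TEXT PRIMARY KEY,
    title      TEXT
);
-- Append-only audit table: every change is recorded here.
CREATE TABLE metadata_audit (
    audit_id   INTEGER PRIMARY KEY AUTOINCREMENT,
    dataset_id TEXT,
    old_title  TEXT,
    new_title  TEXT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- Log every metadata update into the audit table.
CREATE TRIGGER log_title_change AFTER UPDATE ON dataset_metadata
BEGIN
    INSERT INTO metadata_audit (dataset_id, old_title, new_title)
    VALUES (OLD.dataset_id, OLD.title, NEW.title);
END;
-- Make the audit table immutable: reject UPDATE and DELETE outright.
CREATE TRIGGER audit_no_update BEFORE UPDATE ON metadata_audit
BEGIN SELECT RAISE(ABORT, 'audit rows are immutable'); END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON metadata_audit
BEGIN SELECT RAISE(ABORT, 'audit rows are immutable'); END;
""")

conn.execute("INSERT INTO dataset_metadata VALUES ('ds-1', 'HSQC run 1')")
conn.execute(
    "UPDATE dataset_metadata SET title = 'HSQC run 1 (reprocessed)' "
    "WHERE dataset_id = 'ds-1'"
)
# The complete change history is preserved for traceability.
history = conn.execute(
    "SELECT old_title, new_title FROM metadata_audit"
).fetchall()
print(history)
```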
Data Storage
Primary Storage: Dell PowerScale (Isilon) A3000
- Four A3000 nodes, each with 400 TB raw capacity (1.6 PB total)
- Erasure coding reduces usable capacity to ~900 TB
- OneFS clustered architecture scales to 252 nodes (up to 100 PB)
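As a back-of-the-envelope check of the figures above: under erasure coding, each stripe holds data plus parity, so only the data fraction of raw capacity is usable. The exact OneFS protection level is not stated in this page, so the efficiency below is derived purely from the quoted numbers.

```python
# Figures from this section: four A3000 nodes at 400 TB raw each.
raw_tb = 4 * 400      # 1,600 TB (1.6 PB) raw
usable_tb = 900       # ~900 TB usable after erasure-coding overhead

# Usable fraction = data_units / (data_units + parity_units);
# here we recover it from the quoted capacities instead.
efficiency = usable_tb / raw_tb
print(f"storage efficiency: {efficiency:.1%}")  # ~56% of raw is usable
```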
Disaster Recovery Storage: HP Scality RING
- Distributed, peer-to-peer architecture across four datacenters (Farmington and Storrs, CT)
- Fourteen nines (99.999999999999%) data durability via erasure coding, replication, and self-healing
- WORM (Write-Once-Read-Many) S3 bucket ensures:
  - Protection from accidental/malicious deletion
  - Automatic lease renewal
  - No file deletions by users, admins, or vendors
  - Resilience against ransomware encryption attempts
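The WORM guarantees listed above amount to a simple policy: objects can be written once and read many times, but never overwritten or deleted. The toy in-memory model below illustrates that policy only; it is not Scality's actual API, and the class and key names are hypothetical.

```python
class WormBucket:
    """Toy model of a WORM (Write-Once-Read-Many) bucket: objects can
    be created and read, but never overwritten or deleted. The
    no-overwrite rule is also why ransomware cannot encrypt stored
    objects in place."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if key in self._objects:
            raise PermissionError(f"{key!r} exists; WORM forbids overwrite")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        # No deletions by users, admins, or vendors.
        raise PermissionError("WORM bucket: deletion is not permitted")

bucket = WormBucket()
bucket.put("ds-1/scan001.fid", b"raw data")
print(bucket.get("ds-1/scan001.fid"))
```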
Landing Zone: Qumulo NAS
- Data from NDTS Gateways arrives at the Landing Zone
- Hosted on Qumulo NAS and replicated in real time to a second NAS
- Data is only deleted after verified transfer to both primary and disaster recovery storage
- At minimum, two independent copies exist from the moment of data arrival
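The verified-transfer rule above — delete the landing-zone copy only after both the primary and disaster-recovery copies check out — can be sketched with checksums. Directory layout, file names, and the choice of SHA-256 are illustrative assumptions, not NAN's actual ingest pipeline.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def ingest(landing_file: Path, primary: Path, dr: Path) -> None:
    """Copy a landing-zone file to primary and disaster-recovery
    storage, verify both copies by checksum, and delete the
    landing-zone copy only after both verify -- so at least two
    independent copies exist at every point in time."""
    expected = sha256(landing_file)
    copies = [primary / landing_file.name, dr / landing_file.name]
    for dest in copies:
        dest.write_bytes(landing_file.read_bytes())
    if any(sha256(c) != expected for c in copies):
        raise IOError("checksum mismatch; landing-zone copy retained")
    landing_file.unlink()

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    landing, primary, dr = root / "landing", root / "primary", root / "dr"
    for d in (landing, primary, dr):
        d.mkdir()
    f = landing / "scan001.fid"
    f.write_bytes(b"raw NMR data")
    ingest(f, primary, dr)
    landing_cleared = not f.exists()
    copies_match = all(
        (d / "scan001.fid").read_bytes() == b"raw NMR data"
        for d in (primary, dr)
    )
print(landing_cleared, copies_match)
```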
Access Control & Monitoring
- Access is restricted using role-based Active Directory (AD) groups
- All logins require key-based SSH
- All code changes are Git-versioned with rollback support
- CrowdStrike Falcon agent monitors systems for suspicious activity
- Server logs are centrally aggregated and retained for audit
- These measures ensure operational integrity and compliance with security protocols
NSF Trusted CI Center of Excellence
- NAN engages with Trusted CI, the NSF Cybersecurity Center of Excellence
- Undergoes periodic third-party reviews
- Aligns policies with research cyberinfrastructure best practices