...
Ceph replicates data and makes it fault-tolerant, using commodity hardware and requiring no specific hardware support. As a result of its design, the system is both self-healing and self-managing, aiming to minimize administration time and other costs.
Ceph Storage at Lehigh
LTS Research Computing provides a Ceph-based storage resource, also called Ceph. In Fall 2018, a 768TB storage cluster was designed, built, and deployed to replace the original 1PB Ceph cluster. In Fall 2020, total storage was increased to 2019TB with the addition of 796TB from Hawk and a further 455TB investment from LTS. Total allocatable storage is 542.87TB, with an additional 29.38TB for short-term storage on Sol and Hawk.
How is Data Stored in Ceph?
Data is replicated across three disks on three nodes in three racks with distinct power feeds and network paths, securing it against the simultaneous failure of two full nodes in the primary data center. With current connectivity, the cluster supports an aggregate read/write speed of 3.75GB/s, with the capability to increase bandwidth as needed. The Ceph software performs daily and weekly data scrubbing to ensure replicas remain consistent.
NOTE: Ceph does not provide backups. If you need daily snapshots and a place to store them, you need to purchase an additional block of Ceph storage. If you need backups, one alternative is to mount the Ceph project as a network drive and use Crashplan to back up its contents.
...
Name of the project; the default is the PI's username followed by "group", e.g. alp514group
List of usernames of users who will have access to the storage. The list can be modified at any time during the 5-year duration.
Amount of Storage desired (minimum 1TB)
Banner Index to charge with authorization from Finance Manager
Ceph storage on Sol and Hawk
Ceph is the storage backend for Sol and Hawk. Each Principal Investigator is provided with 1TB of Ceph space for their research group. If additional space is needed, please include a justification in your compute allocation request or explicitly request a storage allocation. This storage exists as long as your allocation is active and will be deleted (no backups kept) a month after your allocation expires. If you purchase a Ceph project, your storage will exist for 5 years irrespective of your compute allocation status.
...
Ceph storage projects are shared via CIFS and can be mounted as a network drive on Windows, macOS, and Linux. Ceph projects are mounted on Sol and Hawk. Groups that use Ceph as their home directory have access to their projects when they log in to Sol. All others can access their Ceph projects at /share/ceph/projectname
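On Linux, mounting the share looks roughly like the sketch below, using the mount.cifs tool from the cifs-utils package. The server name, share path, mount point, and username here are placeholders, not the actual LTS values; contact LTS Research Computing for the real server address and credentials.

```shell
# Sketch only: server, share, mount point, and username are hypothetical.
sudo mkdir -p /mnt/cephproject
sudo mount -t cifs //ceph.example.lehigh.edu/projectname /mnt/cephproject \
    -o username=alp514,uid=$(id -u),gid=$(id -g)
```

Windows users can map the equivalent share as a network drive (e.g. \\ceph.example.lehigh.edu\projectname), and macOS users can connect via Finder's "Connect to Server" using an smb:// URL.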
...
CephFS is available at /share/ceph/scratch/username on the Sol login and compute nodes. Users should use CephFS storage for in-flight jobs only and are responsible for transferring simulation data from CephFS to their home directories or Ceph storage projects. The SLURM scheduler automatically creates a folder ${SLURM_JOB_ID} to store data generated by job ${SLURM_JOB_ID}. Users cannot create folders directly in CephFS, only subfolders within ${SLURM_JOB_ID}. All data older than 7 days in CephFS will be deleted; it is the user's responsibility to transfer data from CephFS to their home directory, Ceph project space, or an external storage resource before then.
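The staging pattern described above can be sketched as the tail end of a job script: write into the scheduler-created ${SLURM_JOB_ID} folder, then copy results out before the 7-day purge. The paths below are stand-ins so the sketch runs anywhere; in a real job they would be /share/ceph/scratch/username/${SLURM_JOB_ID} and your home or Ceph project directory.

```shell
#!/bin/bash
# Sketch of staging job output off CephFS scratch. SCRATCH_BASE and DEST
# are hypothetical stand-ins for /share/ceph/scratch/$USER and your
# home/Ceph project space.
SCRATCH_BASE="${SCRATCH_BASE:-/tmp/ceph-scratch-demo}"
JOB_ID="${SLURM_JOB_ID:-12345}"     # SLURM sets SLURM_JOB_ID in real jobs
JOB_DIR="$SCRATCH_BASE/$JOB_ID"     # the scheduler pre-creates this folder
DEST="${DEST:-/tmp/ceph-home-demo}"

# Users may only create subfolders inside $JOB_DIR, not elsewhere in CephFS
mkdir -p "$JOB_DIR/results"
echo "simulation output" > "$JOB_DIR/results/out.dat"

# Copy results to permanent storage before the 7-day purge removes them
mkdir -p "$DEST"
cp -r "$JOB_DIR/results" "$DEST/"
```

In practice these copy commands would sit at the end of the SLURM batch script, after the simulation finishes.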
...
home/ceph: permanent storage for the life of your account, limited to 150GB (home) or the size of your Ceph project (default 1TB, shared by all members of the research group).
cephfs: semi-permanent; data should be moved to permanent storage. This 29TB space is shared by all users, and data is deleted 7 days after job completion.
local scratch: temporary, available only for in-flight jobs. This is a 500GB space on each compute node, shared by all users assigned to that node.