Ceph is a free-software storage platform that implements object storage on a single distributed computer cluster and provides interfaces for object-, block-, and file-level storage. Ceph aims primarily for completely distributed operation without a single point of failure, scalability to the exabyte level, and free availability.
Ceph replicates data and makes it fault-tolerant, using commodity hardware and requiring no specific hardware support. As a result of its design, the system is both self-healing and self-managing, aiming to minimize administration time and other costs.
Ceph Storage at Lehigh
LTS Research Computing provides a Ceph-based storage resource, also called Ceph. In Fall 2018, a 768TB storage cluster was designed, built, and deployed to replace the original 1PB Ceph storage cluster. In Fall 2020, total storage was increased to 2019TB by the addition of 796TB from Hawk and a further 455TB investment from LTS. Total allocatable storage is 542.87TB, with an additional 29.38TB for short-term storage on Sol and Hawk.
How is Data Stored in Ceph?
Data is replicated across three disks on three nodes in three racks with distinct power feeds and network paths, securing it against the simultaneous failure of two full nodes in the primary data center. With current connectivity, the cluster supports an aggregate read/write speed of 3.75GB/s, with the capability to increase bandwidth as needed. The Ceph software performs daily and weekly data scrubbing to ensure replicas remain consistent.
NOTE: Ceph does not do backups. If you need daily snapshots and a place to store them, you need to purchase an additional block of Ceph storage. If you need backups, one alternative is to mount the Ceph project as a network drive and use Crashplan to back up its contents. Users are responsible for their data: Ceph will not lose data due to mechanical failure, but we cannot protect against user error.
System Configuration
7 storage nodes
One 16-core AMD EPYC 7351P, 2.4GHz
128GB 2666MHz DDR4 RAM
Three Micron 1.9TB SATA 2.5 IN Enterprise SSD
Total Raw Storage: 5.7TB for CephFS (Fast Tier)
Two Intel 240GB DC S4500 Enterprise SSD (OS only)
13 Seagate 8TB SATA HDD
Total Raw Storage: 104TB Ceph (Slow Tier)
11 storage nodes
One 16-core AMD EPYC 7302P, 3.0GHz
128GB 2666MHz DDR4 RAM
Three 1.9TB SATA SSD
Total Raw Storage: 5.7TB for CephFS (Fast Tier)
Two Intel 240GB DC S4510 Enterprise SSD (OS only)
9 12TB SATA HDD
Total Raw Storage: 108TB Ceph (Slow Tier)
- 10 GbE and 1 GbE network interfaces
- Debian 10
Raw Storage: 1916TB (Slow Tier) and 102.6TB (Fast Tier)
Available Storage: 543TB (Slow Tier) and 29TB (Fast Tier)
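The raw totals above follow directly from the per-node figures, and the available figures reflect three-way replication (usable capacity is at most a third of raw, minus operational headroom). A quick sanity check:

```python
# Slow tier: 7 nodes with 13 x 8TB HDDs, plus 11 nodes with 9 x 12TB HDDs
slow_raw = 7 * 13 * 8 + 11 * 9 * 12   # 728 + 1188 = 1916 TB

# Fast tier: all 18 nodes carry three 1.9TB SSDs
fast_raw = 18 * 3 * 1.9               # 102.6 TB

# With three-way replication, usable space is at most raw / 3
print(slow_raw, round(fast_raw, 1), slow_raw // 3)  # → 1916 102.6 638
```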
Why two tiers of storage?
...
To use Ceph as a storage device, faculty, staff, departments, and colleges need to purchase a storage project (minimum 1TB for a duration of 5 years) or request a storage allocation annually. If purchasing, the cost for 1TB of storage is $375 for the 5-year duration. The storage project can be shared with a named group of users, including students, at no charge. To request or purchase a Ceph storage project, please contact the Manager of Research Computing with the following information:
- Name of the project; the default name is the PI's username followed by "group", e.g. alp514group
- List of usernames of users who will have access to the storage. The list can be modified at any time during the 5-year duration.
- Amount of storage desired (minimum 1TB)
- Banner index to charge, with authorization from the Finance Manager
Ceph storage on Sol and Hawk
Ceph is the storage backend for Sol. All users are provided with 150GB of home storage as part of the $50/user/year account fee. Principal Investigators are encouraged to purchase a Ceph storage project for their research group's use on Sol. Such PIs have the option of using their Ceph storage project for their home directories instead of the 150GB quota and having their annual user fees waived for the duration of the Ceph project (i.e. 5 years from purchase) for current and future users.

On Hawk, each Principal Investigator is provided with 1TB of Ceph space for his/her research group. If additional space is required, please include a justification in your compute allocation request or explicitly request a storage allocation. This storage exists as long as your allocation is active and will be deleted (no backups kept) a month after your allocation expires. If you purchase a Ceph project, your storage will exist for 5 years irrespective of your compute allocation status.
Using Ceph for storage
Ceph storage projects are shared using CIFS utilities and can be mounted as a network drive on Windows, Mac OS X, and Linux. Ceph projects are also mounted on Sol and Hawk. Groups that use Ceph as their home directory have access to their projects when they log in to Sol. All others can access their Ceph projects at /share/ceph/projectname
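On Linux, mounting a project with the CIFS utilities might look like the sketch below. The server address, share name, and domain are placeholders, not the actual Lehigh export; check with LTS Research Computing for the real mount details.

```shell
# Requires the cifs-utils package on most Linux distributions.
# NOTE: ceph-server.example.edu, projectname, and AD are placeholders.
sudo mkdir -p /mnt/cephproject
sudo mount -t cifs //ceph-server.example.edu/projectname /mnt/cephproject \
    -o username=YOUR_LEHIGH_ID,domain=AD,uid=$(id -u),gid=$(id -g)

# To make the mount persistent, an /etc/fstab entry of this shape can be used:
# //ceph-server.example.edu/projectname /mnt/cephproject cifs credentials=/root/.cephcred,uid=1000,gid=1000 0 0
```

On Windows, the equivalent is mapping a network drive to the share path; on Mac OS X, Finder's "Connect to Server" with an smb:// URL.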
...
CephFS is available at /share/ceph/scratch/username on the Sol login and compute nodes. Users should use CephFS storage for in-flight jobs only and are responsible for transferring simulation data from CephFS to their home directories or Ceph storage projects. The SLURM scheduler automatically creates a folder, ${SLURM_JOB_ID}, to store data generated by job ${SLURM_JOB_ID}. Users cannot create folders at the top level of CephFS, only subfolders within ${SLURM_JOB_ID}. All data older than 7 days in CephFS will be deleted. It is the responsibility of the user to transfer data from CephFS to their home directories, Ceph project spaces, or an external storage resource.
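A job script following this pattern might look like the sketch below. The SBATCH options, workload, and destination directory are illustrative only; when run outside SLURM it falls back to a throwaway directory so the pattern can be tried interactively.

```shell
#!/bin/bash
#SBATCH --job-name=demo          # illustrative SBATCH options, adjust to your job
#SBATCH --time=01:00:00

# On Sol the job folder is /share/ceph/scratch/$USER/$SLURM_JOB_ID (created by
# the scheduler). Outside SLURM, fall back to a temporary directory.
if [ -n "$SLURM_JOB_ID" ]; then
    JOB_DIR="/share/ceph/scratch/$USER/$SLURM_JOB_ID"
else
    JOB_DIR="$(mktemp -d)"
fi

mkdir -p "$JOB_DIR/output"       # subfolders inside the job folder are allowed
cd "$JOB_DIR"

# ./my_simulation > output/results.txt   # hypothetical workload; stand-in below
echo "simulation results" > output/results.txt

# Copy results home before the 7-day purge; DEST is an example location.
DEST="${DEST:-$HOME/results}"
mkdir -p "$DEST"
cp -r output "$DEST/"
echo "results copied to $DEST/output"
```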
...
home/ceph: permanent storage for the life of your account, limited by the 150GB quota or the size of your Ceph project (default 1TB, shared by all members of the research group).
cephfs: semi-permanent; data should be moved to permanent storage. This 29TB space is shared by all users and is deleted 7 days after job completion.
local scratch: temporary, available only for in-flight jobs. This is a 500GB space on each compute node, shared by all users assigned to that node.