Ceph is a free-software storage platform that implements object storage on a single distributed computer cluster and provides interfaces for object-, block-, and file-level storage. Ceph aims primarily for completely distributed operation without a single point of failure, scalability to the exabyte level, and free availability.

Ceph replicates data and makes it fault-tolerant, using commodity hardware and requiring no specific hardware support. As a result of its design, the system is both self-healing and self-managing, aiming to minimize administration time and other costs.

Ceph Storage at Lehigh

LTS Research Computing provides a Ceph-based storage resource, also called Ceph. In Fall 2018, a 768TB storage cluster was designed, built, and deployed to replace the original 1PB Ceph cluster. In Fall 2020, total storage was increased to 2019TB by the addition of 796TB from Hawk and a further 455TB investment from LTS. Total allocatable storage is 542.87TB, with an additional 29.38TB for short-term storage on Sol and Hawk.

How is Data Stored in Ceph?

Data is replicated across three disks on three nodes in three racks, each with distinct power feeds and network paths, protecting against the simultaneous failure of two full nodes in the primary data center. With current connectivity, the cluster supports an aggregate read/write speed of 3.75GB/s, with the capability to increase bandwidth as needed. The Ceph software performs daily and weekly data scrubbing to ensure replicas remain consistent.

NOTE: Ceph does not provide backups. If you need daily snapshots and a place to store them, you must purchase an additional block of Ceph storage. Alternatively, mount the Ceph project as a network drive and use CrashPlan to back up its contents. Users are responsible for their data: Ceph will not lose data due to mechanical failure, but we cannot protect against user error.

System Configuration

  • 7 storage nodes

    • One 2.4GHz 16-core AMD EPYC 7351P

    • 128GB 2666MHz DDR4 RAM

    • Three Micron 1.9TB SATA 2.5-inch Enterprise SSD

      • Total Raw Storage: 5.7TB for CephFS (Fast Tier)

    • Two Intel 240GB DC S4500 Enterprise SSD (OS only)

    • Thirteen Seagate 8TB SATA HDD

      • Total Raw Storage: 104TB Ceph (Slow Tier)

  • 11 storage nodes

    • One 3.0GHz 16-core AMD EPYC 7302P

    • 128GB 2666MHz DDR4 RAM

    • Three 1.9TB SATA SSD

      • Total Raw Storage: 5.7TB for CephFS (Fast Tier)

    • Two Intel 240GB DC S4510 Enterprise SSD (OS only)

    • Nine 12TB SATA HDD

      • Total Raw Storage: 108TB Ceph (Slow Tier)

  • 10 GbE and 1 GbE network interfaces
  • Debian 10
  • Raw Storage: 1916TB Slow Tier (7 × 104TB + 11 × 108TB) and 102.6TB Fast Tier (18 × 5.7TB)

  • Available Storage: 543TB (Slow Tier) and 29TB (Fast Tier)

...

  • Name of the project; the default is the PI's username followed by "group", e.g. alp514group

  • List of usernames of users who will have access to the storage. The list can be modified at any time during the 5-year duration.

  • Amount of storage desired (minimum 1TB)

  • Banner Index to charge, with authorization from the Finance Manager

Ceph storage on Sol and Hawk

Ceph is the storage backend for Sol and Hawk. Each Principal Investigator is provided with 1TB of Ceph space for their research group. If additional space is needed, include a justification in your compute allocation request or explicitly request a storage allocation. This storage exists as long as your allocation is active and will be deleted (no backups kept) one month after your allocation expires. If you purchase a Ceph project, your storage will exist for 5 years irrespective of your compute allocation status.

...

Ceph storage projects are shared using cifs utilities and can be mounted as a network drive on Windows, macOS, and Linux. Ceph projects are mounted on Sol and Hawk. Groups that use Ceph as their home directory have access to their projects when they log in to Sol. All others can access their Ceph projects at /share/ceph/projectname.
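
As a rough illustration on Linux (the server hostname and mount point below are placeholders; use the connection details provided by LTS for your project), a Ceph project could be mounted with cifs-utils roughly as follows:

    # Minimal sketch: mount a Ceph project as a network drive using cifs-utils.
    # //ceph.example.lehigh.edu/alp514group and /mnt/alp514group are placeholders.
    sudo mkdir -p /mnt/alp514group
    sudo mount -t cifs //ceph.example.lehigh.edu/alp514group /mnt/alp514group \
        -o username=alp514,uid=$(id -u),gid=$(id -g)

    # Unmount when finished
    sudo umount /mnt/alp514group

On Windows and macOS, the same share can be mapped through "Map network drive" or Finder's "Connect to Server" using the corresponding network address.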

...

CephFS is available at /share/ceph/scratch/username on the Sol login and compute nodes. Users should use CephFS storage for in-flight jobs only and are responsible for transferring simulation data from CephFS to their home directories or Ceph storage projects. The SLURM scheduler automatically creates a folder, ${SLURM_JOB_ID}, to store data generated by job ${SLURM_JOB_ID}; users cannot create folders directly in CephFS, only subfolders within ${SLURM_JOB_ID}. All data older than 7 days in CephFS will be deleted, so it is the user's responsibility to transfer data from CephFS to their home directory, Ceph project space, or an external storage resource before then.
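
As a rough sketch of this workflow (the partition name and application are placeholders, and the per-job folder is assumed to be created under your user's scratch directory), a batch script might look like:

    #!/bin/bash
    #SBATCH --job-name=cephfs-scratch-demo
    #SBATCH --partition=lts          # placeholder partition name
    #SBATCH --time=01:00:00

    # Run from the per-job CephFS scratch folder created by the scheduler,
    # then copy results to permanent storage before the 7-day purge.
    SCRATCH=/share/ceph/scratch/${USER}/${SLURM_JOB_ID}
    cd "${SCRATCH}"

    srun ./my_simulation > output.log    # placeholder application

    # Copy results back to your home directory (or a Ceph project space)
    mkdir -p "${HOME}/results/${SLURM_JOB_ID}"
    cp -r "${SCRATCH}/." "${HOME}/results/${SLURM_JOB_ID}/"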

...

  • home/ceph: permanent storage for the life of your account, limited to 150GB for home or the size of your Ceph project (default 1TB, shared by all members of the research group).

  • cephfs: semi-permanent; data should be moved to permanent storage. This 29TB space is shared by all users, and data is deleted 7 days after job completion.

  • local scratch: temporary, available only for in-flight jobs. This is a 500GB space on each compute node, shared by all users assigned to that node.