Motivation and intended usage:

This system is designed for relatively light course usage at an introductory level. It spares instructors from setting up RStudio on dozens (or more) of student-owned machines with extremely varied configurations and capabilities, and from reserving resources in Lehigh's High Performance Computing clusters. It is specifically not intended for research usage.

This system provides a uniform experience where RStudio runs in the user's browser.  Everyone has the same version of the software and its libraries, and it is available for students to use at any time.

The trade-off is that resource usage is constrained. To support many simultaneous users, each RStudio instance is guaranteed only half a CPU and 512 MB of RAM. This is more than enough for most introductory work with data sets of a few thousand rows or fewer.
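As a rough illustration of why a few thousand rows fits comfortably, the sketch below estimates the in-memory size of an all-numeric table. It assumes every cell is an 8-byte double and treats "512 MB" as 512 MiB. The function names are hypothetical, and real R sessions use additional memory for character data, intermediate copies, and loaded packages.

```python
# Rough memory estimate for an all-numeric data set.
# Assumption: every cell is stored as an 8-byte double; text columns and
# temporary copies made during analysis can use considerably more.

BYTES_PER_DOUBLE = 8
RAM_LIMIT_BYTES = 512 * 1024 * 1024  # assumption: "512 MB" treated as 512 MiB


def estimate_numeric_bytes(rows: int, cols: int) -> int:
    """Return the approximate size in bytes of a rows x cols table of doubles."""
    return rows * cols * BYTES_PER_DOUBLE


def fraction_of_limit(rows: int, cols: int, limit: int = RAM_LIMIT_BYTES) -> float:
    """Return the estimated size as a fraction of the memory limit."""
    return estimate_numeric_bytes(rows, cols) / limit


if __name__ == "__main__":
    size = estimate_numeric_bytes(5_000, 20)
    print(f"5,000 x 20 numeric table: about {size / 1024:.0f} KiB")
    print(f"Fraction of the limit: {fraction_of_limit(5_000, 20):.4%}")
```

Under these assumptions, a 5,000-row, 20-column numeric table occupies 800,000 bytes, a small fraction of the guaranteed memory.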

For research usage, or for usage that requires more hardware resources than this, users will want to install RStudio on their own machines, or reserve High Performance Computing resources: https://lts.lehigh.edu/services/high-performance-computing-hpc-research-computing

System Architecture:

Each RStudio Server instance runs in a Docker container managed by a Kubernetes cluster in Lehigh's data center. The Open OnDemand application (OOD) is used to start a user's RStudio Server instance in the Kubernetes cluster.
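For readers unfamiliar with how Kubernetes expresses per-container guarantees, the sketch below builds the kind of "resources" mapping a container spec could carry for the half-CPU, 512 MB guarantee described above. This is only an assumption about the shape of the configuration: the cluster's actual manifests are not shown on this page, the function name is hypothetical, and "512 MB" is interpreted here as 512Mi.

```python
# Sketch of a Kubernetes-style "resources" section for one RStudio container.
# Assumption: the guarantee is expressed as resource requests; the real
# manifests used by this cluster may differ.

def rstudio_resources(cpu_millicores: int = 500, memory_mib: int = 512) -> dict:
    """Build a resources mapping using Kubernetes quantity notation."""
    return {
        "requests": {
            "cpu": f"{cpu_millicores}m",    # 500m means half of one CPU core
            "memory": f"{memory_mib}Mi",    # 512Mi means 512 mebibytes
        }
    }


if __name__ == "__main__":
    print(rstudio_resources())
```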

OOD is the same application that is used to reserve and schedule jobs on HPC resources, so that when users do need access to those resources, they are already familiar with the application used to get access to them.

Starting an RStudio Server instance:

To get an RStudio Server in the Kubernetes cluster, the user goes to https://hpcportal.cc.lehigh.edu, opens the "Class Apps" dropdown menu along the top of the page, and selects "RStudio server (kubernetes)" for the generic RStudio instance, or the entry for their course if one exists (a course can get its own entry here if it maps a network shared drive that is specific to the course). A form then lets them select the version of R that they want and click the "Launch" button. The instance takes a minute or so to start, after which a "Connect to RStudio Server" button appears that takes the user to their RStudio instance.

The RStudio Server instance will prompt the user for their Lehigh username and password the first time they open it in a browser session.

User home directories:

The home directory mounted inside the user's RStudio Server instance is stored on a remote NFS server, so any files the user creates or modifies will persist after their RStudio instance ends. The next time they log in and create a new RStudio Server, the same files will still be there.

To make getting data into and out of RStudio easier, this is the same home directory that the user sees on HPC nodes in the Sol and Hawk clusters.

Session management:

For as long as it is still running, the user's RStudio Server is available at https://rstudio.cc.lehigh.edu/<username>/.  The user can also get to it by going back to https://hpcportal.cc.lehigh.edu/ and clicking on the "My Interactive Sessions" link along the top menu bar.
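The per-user address above can be built programmatically, for example when preparing course handouts. The sketch below is a minimal illustration; the function name and the sample username are hypothetical, and it only formats the URL without checking whether an instance is actually running.

```python
from urllib.parse import quote

BASE_URL = "https://rstudio.cc.lehigh.edu"


def session_url(username: str) -> str:
    """Return the RStudio Server URL for a username, following the pattern above."""
    if not username:
        raise ValueError("username must not be empty")
    # Percent-encode the username so unexpected characters cannot alter the path.
    return f"{BASE_URL}/{quote(username, safe='')}/"


if __name__ == "__main__":
    print(session_url("abc123"))  # "abc123" is a made-up example username
```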

Users do not need to do anything to shut down their RStudio Server instance. It is cleaned up automatically after a period of inactivity, currently set to 1 hour.

Even so, when the user is finished using their RStudio Server, it is best to click the small red power-button icon in the upper-right corner of the browser window, which tells RStudio to end their session. If they don't do this, the next time they log in and get a new RStudio Server instance, they will see a red error message in the RStudio console saying "ERROR The previous R session terminated abnormally", surrounded by some debugging information. This error is unavoidable if the user doesn't click the button to end their session. It is harmless and everything will still work fine, but it can be alarming to see on session startup.

User management:

In order to use the RStudio Server in the Kubernetes cluster, users must exist in the OOD application. If you want to use this system with your course, please create a help request for the Systems Engineering Team (SET) here:
https://jira.cc.lehigh.edu/servicedesk/customer/portal/1
Please include a list of all users of the system, including all instructors, TAs, and students. SET will make sure the proper accounts are provisioned for everyone.

Known issues:

The following harmless warning appears when an RStudio session starts, and we have not yet found a way to prevent it:

2022-01-26T20:04:48.938685Z [rsession-<username>] WARNING No memory control group found in /proc/self/cgroup; LOGGED FROM: std::__cxx11::string rstudio::core::system::{anonymous}::getMemoryCgroup() src/cpp/core/system/LinuxResources.cpp:335

This warning is related to a version mismatch between Kubernetes, the container engine it uses to run RStudio, and the OS running both. It should go away once we are able to upgrade everything to use the same version of cgroups, the kernel feature used to control the resources granted to containers. Everything will still work fine in spite of this warning.
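To show what the message is plausibly complaining about, the sketch below checks the text of /proc/self/cgroup for a "memory" controller entry. Under cgroups v1 each line has the form hierarchy-ID:controller-list:path, while under cgroups v2 there is a single line of the form 0::path with no controller names. Whether RStudio performs its check exactly this way is an assumption, and the sample contents and function name are made up for illustration.

```python
def has_memory_cgroup(cgroup_text: str) -> bool:
    """Return True if any line lists the 'memory' controller (cgroups v1 style)."""
    for line in cgroup_text.splitlines():
        # Each line is "hierarchy-ID:controller-list:cgroup-path".
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue
        controllers = parts[1].split(",")
        if "memory" in controllers:
            return True
    return False


# Made-up sample contents for illustration only.
V1_SAMPLE = "5:memory:/kubepods/pod-example\n4:cpu,cpuacct:/kubepods/pod-example\n"
V2_SAMPLE = "0::/\n"

if __name__ == "__main__":
    print("v1 sample has memory controller:", has_memory_cgroup(V1_SAMPLE))
    print("v2 sample has memory controller:", has_memory_cgroup(V2_SAMPLE))
```

With the v2-style sample, no memory controller line is found, which is consistent with a warning of this kind appearing when software expects one cgroups format and the host provides the other.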


Upon using the "plot" function in RStudio, this warning may appear:

Warning message:
In grSoftVersion() :
  unable to load shared object '/usr/local/lib/R/modules//R_X11.so':
  libXt.so.6: cannot open shared object file: No such file or directory


This is also harmless: the plot function still works. The warning is caused by the way the container image was built. The RStudio images are based on the Rocker project (https://www.rocker-project.org/), and the missing library comes from the way they built the image. An update to the image would fix it, and we'll raise an issue on the Rocker project's GitHub to make sure they are aware of it.
