...
...
...
...
...
...
...
...
...
...
Motivation and intended usage:
This system is designed for relatively light course usage at an introductory level, so that instructors do not need to set up JupyterLab/Notebook or RStudio Server on dozens of student-owned machines with widely varying configurations and capabilities, or reserve resources in Lehigh's High Performance Computing clusters. It is specifically not intended for research usage.
This system provides a uniform experience where JupyterLab or RStudio runs in the user's browser. Everyone has the same version of the software and its libraries, and it is available for students to use at any time.
The trade-off is that resource usage is constrained. In order to support many simultaneous users, the containers that run both software packages are each guaranteed only half a CPU and 512 MB of RAM. This is more than enough for most introductory work that uses data sets of a few thousand rows or fewer.
For research usage, or for usage that requires more hardware resources than this, users will want to install JupyterLab or RStudio on their own machines, or reserve High Performance Computing resources: https://lts.lehigh.edu/services/high-performance-computing-hpc-research-computing
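As a rough way to judge whether a data set fits within the 512 MB limit described above, the following Python sketch estimates the in-memory size of a purely numeric table. The 8 bytes per value figure assumes double-precision numbers, and the function names and the 50% headroom factor are illustrative assumptions, not settings of this system.

```python
# Rough sketch: estimate whether a numeric table fits in a container's RAM.
# Assumptions (not taken from the system itself): every value is an 8-byte
# double, and only half of the RAM is treated as usable for the data set,
# leaving the rest for the R or Python process, libraries, and temporary copies.

BYTES_PER_VALUE = 8                    # double-precision float
CONTAINER_RAM_BYTES = 512 * 1024**2    # the 512 MB guaranteed per instance
HEADROOM_FRACTION = 0.5                # illustrative safety margin


def estimated_table_bytes(rows: int, columns: int) -> int:
    """Return the approximate size in bytes of a rows x columns numeric table."""
    return rows * columns * BYTES_PER_VALUE


def fits_in_container(rows: int, columns: int) -> bool:
    """Return True if the table fits within the assumed usable share of RAM."""
    usable = CONTAINER_RAM_BYTES * HEADROOM_FRACTION
    return estimated_table_bytes(rows, columns) <= usable


if __name__ == "__main__":
    # A few thousand rows by 20 columns is well under a megabyte.
    print(estimated_table_bytes(5_000, 20))    # 800000
    print(fits_in_container(5_000, 20))        # True
    print(fits_in_container(10_000_000, 20))   # False: about 1.6 GB
```

Real data sets with text columns, and operations that copy the data, can use considerably more memory than this estimate suggests, so treat the result as a lower bound.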
System Architecture:
JupyterLab and RStudio Server instances each run in containerd.io containers, managed by a Kubernetes cluster running in Lehigh's data center.
The JupyterLab instances are started by an application called JupyterHub running at jupyter.cc.lehigh.edu. This "hub" application starts JupyterLab instances for users and cleans them up when they become idle. Anyone at Lehigh can log in at jupyter.cc.lehigh.edu and start an instance.
RStudio Server does not come with an equivalent "hub" application for managing instances, so the Open OnDemand application (OOD) is used to start a user's RStudio Server instance in the Kubernetes cluster. It is running at kubeportal.cc.lehigh.edu. This application runs a series of custom-written scripts to start instances and clean up idle ones. OOD is the same application that is used to reserve and schedule jobs on HPC resources, so when users do need access to those resources, they are already familiar with the application used to get it.
Starting a JupyterLab instance:
Log in at jupyter.cc.lehigh.edu with your normal Lehigh username and password, then pick a profile from the "Server Options" menu and click "Start". Each profile describes what it includes: the profile determines which programming tools and libraries are available in your JupyterLab instance. When you're done, just close your browser. Any files you edit and save will still be there the next time you log in.
Starting an RStudio Server instance:
To get an RStudio Server in the Kubernetes cluster, go to https://kubeportal.cc.lehigh.edu, then click the "Class Apps" dropdown menu along the top of the page and select "RStudio server (kubernetes)" for the generic RStudio instance, or the particular application for your course if applicable (a course can get its own entry here if it maps a network shared drive that is specific to the course). You are brought to a form where you can select the version of R that you want, and then you can click the "Launch" button. It takes a minute or so for the instance to start, and then a "Connect to RStudio Server" button appears that takes you to your RStudio instance.
The RStudio Server instance will prompt you for your Lehigh username and password the first time you open it in a browser session.
User home directories:
The user's home directory that's mounted inside their RStudio Server instance is actually stored on a remote NFS server, so any files the user creates or modifies will persist after their RStudio instance ends. The next time they log in and create a new RStudio Server, the same files will still be there.
Session management:
For as long as it is still running, the user's RStudio Server is available at https://rstudio.cc.lehigh.edu/<username>/. The user can also get to it by going back to https://kubeportal.cc.lehigh.edu/ and clicking on the "My Interactive Sessions" link along the top menu bar.
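The address pattern above can be written as a small Python helper. This is only an illustration of how the URL is formed: the function name and the example username "abc123" are hypothetical, and percent-encoding the username is a defensive assumption rather than something the system is known to require.

```python
from urllib.parse import quote

RSTUDIO_BASE_URL = "https://rstudio.cc.lehigh.edu"  # host named on this page


def rstudio_session_url(username: str) -> str:
    """Build the per-user RStudio Server address: <base>/<username>/."""
    if not username:
        raise ValueError("username must not be empty")
    # quote() percent-encodes characters that are unsafe in a URL path;
    # safe="" also encodes "/" so the username stays a single path segment.
    return f"{RSTUDIO_BASE_URL}/{quote(username, safe='')}/"


if __name__ == "__main__":
    # "abc123" is a made-up username used only for illustration.
    print(rstudio_session_url("abc123"))  # https://rstudio.cc.lehigh.edu/abc123/
```

The address only responds while the user's RStudio Server instance is still running.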
...
When the user is finished using their RStudio Server, they should click the small red power-button icon in the upper-right corner of the browser window to tell RStudio to end their session. If they don't, the next time they log in and get a new RStudio Server instance, they will see a red error message in the RStudio console saying "ERROR The previous R session terminated abnormally" surrounded by some debugging information. This message cannot be avoided if the session is not ended with the button. It is harmless - everything will still work fine - but it can be alarming to see on session startup.
User management:
In order to use the RStudio Server in the kubernetes cluster, users must exist in the OOD application. If you want to use this system with your course, please create a help request for the Systems Engineering Team (SET) here:
https://jira.cc.lehigh.edu/servicedesk/customer/portal/1
Please include a list of all users of the system including all instructors, TAs, and students. SET will make sure the proper accounts are provisioned for everyone.
Known issues:
The following warning is harmless:
2022-01-26T20:04:48.938685Z [rsession-<username>] WARNING No memory control group found in /proc/self/cgroup; LOGGED FROM: std::__cxx11::string rstudio::core::system::{anonymous}::getMemoryCgroup() src/cpp/core/system/LinuxResources.cpp:335
It shows up when an RStudio session starts. We have not yet found a way to prevent it. The warning is related to a version mismatch between Kubernetes, the container engine it uses to run RStudio, and the OS running both. It should eventually go away when we are able to upgrade everything to use the same version of "cgroups", which are used to control the resources granted to containers. Everything will still work fine in spite of this warning.
...
When the "plot" function is used in RStudio, this warning may appear:
Warning message:
In grSoftVersion() :
unable to load shared object '/usr/local/lib/R/modules//R_X11.so':
libXt.so.6: cannot open shared object file: No such file or directory
This is also harmless - the plot function still works. The warning is caused by the way the container image was built: the RStudio images are based on the Rocker project (https://www.rocker-project.org/), and the issue comes from the way that project builds its image. An update to the image would fix it - we'll raise an issue in the Rocker project's GitHub to make sure they are aware of it.
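When scanning console or log output for real problems, the known messages in this section can be filtered out mechanically. The Python sketch below is one hedged way to do that. The patterns are substrings copied from the messages quoted above, while the function names and the sample "object 'x' not found" line are illustrative assumptions.

```python
import re

# Patterns built from the harmless messages quoted in this section.
HARMLESS_PATTERNS = [
    re.compile(r"No memory control group found in /proc/self/cgroup"),
    re.compile(r"In grSoftVersion\(\)"),
    re.compile(r"unable to load shared object '.*R_X11\.so'"),
    re.compile(r"libXt\.so\.6: cannot open shared object file"),
]


def is_known_harmless(line: str) -> bool:
    """Return True if the line matches one of the known harmless messages."""
    return any(pattern.search(line) for pattern in HARMLESS_PATTERNS)


def unexpected_lines(lines):
    """Return the non-blank lines that are not known harmless messages."""
    return [ln for ln in lines if ln.strip() and not is_known_harmless(ln)]


if __name__ == "__main__":
    sample = [
        "[rsession-<username>] WARNING No memory control group found in /proc/self/cgroup;",
        "unable to load shared object '/usr/local/lib/R/modules//R_X11.so':",
        "libXt.so.6: cannot open shared object file: No such file or directory",
        "Error: object 'x' not found",  # made-up line standing in for a real problem
    ]
    for line in unexpected_lines(sample):
        print(line)  # prints only the last sample line
```

The generic "Warning message:" header is deliberately not filtered, since it also precedes warnings that may matter.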