
  • All access to the SRC is currently through a Virtual Desktop Infrastructure (VDI) in AWS AppStream, just like Lehigh’s LUApps environment

    • This means you get a Windows or Linux desktop that you interact with through a browser window

    • We do it this way to restrict the ways in which data can be removed from the SRC

  • To access the SRC, users start out by going to either

  • After the authentication finishes, you’ll see a countdown while your session starts.

    • This will take up to 2 minutes

    • If your session doesn’t finish starting in 2 minutes, AWS will sometimes show an error and give up

      • If this happens, close the browser window and revisit the go.lehigh.edu link you used to start the session. It may ask you to re-authenticate, but your session should then start immediately: even though the browser showed an error, AWS kept creating your session in the background.

  • If you are using the Windows environment, you’ll get another prompt to log in with your Lehigh password. This is for the Windows Active Directory system that all Lehigh Windows PCs are a part of, and it is separate from the SSO login you did earlier to get into AWS AppStream and start a session. Enter your password again and click “Sign In”.

  • You’ll see a “Connecting as <your username>” window like the one pictured below - wait a minute or two for this to finish.

SRCVDI-Connecting.png

  • Once your session starts, when you click in the window you’ll see a prompt asking if you want to allow pasting into your session, as pictured below. Click the “Allow” or “Yes” button.

    • You’ll also see a long warning message about the sensitivity of the data in the SRC. This will open every time you start a new VDI session - please review this once. You can just close it from then on.

      SRCWarningPastePrompt.png

  • Now you are in your VDI session, but you still don’t have access to your SRC project’s data. For that, we need to obtain AWS credentials and paste them into a command window.

    • First go to go.lehigh.edu/awssso in your browser (not inside the VDI)

    • You should see the SSO authentication go by automatically since you are already logged in

    • Click on the “AWS Account” button in the browser. It will show a list of Lehigh-managed AWS accounts in which you have permissions.

      • This may be only the SRC account (AWS account number 419858278791).

    • Click on the SRC account to display a list of roles you can assume within that account

      • Each role corresponds to an SRC project. You may have only one, or you may have many.

    • Click on the “Command line or programmatic access” link next to the role for the SRC project you are working on.

    • A box will pop up in the browser containing some environment variable assignments that you can copy into your VDI session

      • If you are using the Linux VDI environment, you are already on the right tab in this box, as “mac OS and Linux” is the default.

        • Click on the “export AWS_ACCESS_KEY_ID=…” box to copy the code to export the environment variables

      • If you are using the Windows VDI environment, click on the “Windows” tab within the credentials pop-up box, as pictured below

    • SRCWindowsAWSCreds.png

  • This will copy the AWS credentials into your copy/paste buffer
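For orientation, here is a rough sketch of what the copied text from the “mac OS and Linux” tab looks like. The values below are placeholders, not real credentials; the actual values are long strings generated by AWS, and the exact layout in the pop-up box may differ slightly.

```shell
# Placeholder values only. The real values come from the
# "Command line or programmatic access" pop-up box.
export AWS_ACCESS_KEY_ID="<access key id>"
export AWS_SECRET_ACCESS_KEY="<secret access key>"
export AWS_SESSION_TOKEN="<session token>"
```

All three lines need to be pasted together; a session that has only some of these variables set will most likely be rejected by AWS.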

  • Now change back over to the SRC VDI browser tab

    • If you are in the Windows environment, go to the start menu and type “cmd” - you should see the following

  • SRCWindowsVDICMD.png
    • A “Command Prompt” window will pop up in the VDI

    • Paste your AWS credentials into it by clicking the “Command Prompt” icon in the upper left corner of the window, going down to “Edit”, and picking “Paste”, as shown here:

  • SRCWindowsVDIPasteCreds.png
    • Now type the following commands into the command prompt:

      • cd Desktop

      • setup_src_environment.bat

    • The setup_src_environment.bat script will run and prompt you for the name of your SRC project - enter it when asked

    • The script will take a minute or so to fetch data from AWS and initialize a bunch of environment variables

    • When it finishes, do NOT close the command prompt window

      • Doing so will close your connection to your SRC project’s S3 storage bucket

    • Click on the folder icon in the task bar to open up “This PC”

    • The result of running the script and opening up “This PC” in Explorer should look like this:

    • SRCWindowsVDIBucketMounted.png

    • Now you’re ready to work with the data in the S3 bucket.

    • The Windows environment has Jupyter Notebook, VS Code, Stata, SAS, and MS Office on it for data analysis.

      • A SQL client called “SquirrelSQL” is also installed, but it only works if the data in your project’s S3 bucket consists of CSV files that can be crawled by the AWS Glue service. This can be useful for getting a SQL-like interface to your data that supports long-running queries.

      • New tools can be added to the VDI image by requesting them from LTS

  • On the Linux VDI, the process is similar (screenshots to come):

    • Copy the credentials from the “mac OS and Linux” pane in the AWS credentials box as described above

    • Click on the “Applications” menu in the upper left part of the VDI window, then pick the “Terminal”

    • Paste the credentials into the terminal window in the VDI

    • Run the following script, which is right in your user’s home directory, to mount the S3 bucket:

      • ./mount_s3_bucket.sh

      • When prompted, enter the name of your SRC project

      • The project S3 bucket will then be mounted on a folder in your home directory named <project name>-s3-bucket
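If the Linux steps above do not seem to have worked, a quick sanity check in the same terminal can help narrow down where things went wrong. The sketch below is only an illustration: the helper functions `src_bucket_path`, `check_aws_creds`, and `check_src_bucket` are hypothetical and are not part of the SRC VDI image. They assume only what is described above, namely that three AWS environment variables were pasted and that the bucket appears under <project name>-s3-bucket in your home directory.

```shell
# Hypothetical helpers for a quick sanity check; not part of the SRC image.

# Print the folder where the bucket is expected to be mounted:
# <project name>-s3-bucket in the home directory.
src_bucket_path() {
    printf '%s/%s-s3-bucket\n' "$HOME" "$1"
}

# Report any of the three pasted AWS variables that are missing,
# without printing their values. Returns 0 only if all are set.
check_aws_creds() {
    missing=0
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN; do
        # Indirect lookup of the variable whose name is in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Check that the mount folder exists and contains at least one entry.
check_src_bucket() {
    dir=$(src_bucket_path "$1")
    if [ ! -d "$dir" ]; then
        echo "not found: $dir"
        return 1
    fi
    if [ -z "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "empty or not mounted: $dir"
        return 1
    fi
    echo "bucket folder looks usable: $dir"
}
```

After pasting the functions into the terminal, something like `check_aws_creds && check_src_bucket myproject` (with `myproject` replaced by your SRC project name) reports the first problem it finds. Note that a bucket with no files in it yet is reported the same way as one that failed to mount.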
