Several tools are available for transferring data from external sources to HPC-supported resources. All but one of them are available on Sol and Hawk and can be run from the command line. Some of these tools can download data from web servers, cloud providers, and remote *nix systems.
...
For more information, visit https://curl.se/ or view the cURL man page by typing man curl at the command prompt.
SCP
Secure copy protocol (SCP) is a means of securely transferring data between a local host and a remote host or between two remote hosts. It uses Secure Shell (SSH) for data transfer and uses the same mechanisms for authentication, thereby ensuring the authenticity and confidentiality of the data in transit. A client can send (upload) files to a server, optionally including their basic attributes (permissions, timestamps). Clients can also request files or directories from a server (download).
SCP is installed system wide and does not require a module to be loaded.
...
For more information, view the scp man page by typing man scp at the command prompt or visit https://linux.die.net/man/1/scp.
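As a rough illustration of the upload and download cases described above, the following sketch prints two typical scp commands. The user name, host name, and paths are hypothetical placeholders rather than actual Sol or Hawk values, and the small run helper is only an assumption made here so that the commands can be reviewed before anything is transferred.

```shell
# Sketch of common scp invocations. The user, host, and paths below are
# hypothetical placeholders, not real Sol or Hawk values.
remote="alice@hpc.example.edu"      # hypothetical [USER@]HOST
remote_dir="/home/alice/project"    # hypothetical directory on the remote host

# Print each command by default; set EXECUTE=1 to actually run it.
run() {
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Upload a single file, preserving modification times and permissions (-p).
run scp -p results.tar.gz "${remote}:${remote_dir}/"

# Download a directory recursively (-r) into the current directory.
run scp -r "${remote}:${remote_dir}/output" .
```

Running the script with EXECUTE=1 set in the environment performs the transfers instead of printing them.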
SFTP
SFTP is a command-line interface client program to transfer files using the SSH File Transfer Protocol (SFTP), which runs inside the encrypted Secure Shell connection. It provides an interactive interface similar to that of traditional command-line FTP clients.
...
At the sftp command prompt, use Linux commands to navigate the file system and the get and put commands to transfer files from and to the remote system. Some standard sftp commands:
Command | Description |
---|---|
get | Copy a file from the remote host to the local computer. |
put | Copy a file from the local computer to the remote host. |
help (or ?) | Get help on the use of SFTP commands |
exit (or quit) | Close the connection to the remote host, and exit SFTP |
For more information, view the sftp man page by typing man sftp at the command prompt.
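The get and put commands can also be collected in a batch file and passed to sftp with its -b option, which may be convenient for repeated transfers. The sketch below writes such a file and prints the command that would use it. The file names, directory, user, and host are hypothetical, and it assumes that non-interactive (for example, key-based) authentication is permitted, which this page does not confirm.

```shell
# Sketch: write an sftp batch file so transfers can run without the
# interactive prompt. All names below are hypothetical placeholders.
batch_file="$(mktemp)"

cat > "$batch_file" <<'EOF'
cd project
put input.dat
get results.tar.gz
quit
EOF

# sftp -b reads its commands from the batch file instead of the prompt.
# Batch mode cannot answer a password prompt, so this assumes that
# non-interactive authentication is already set up.
echo "sftp -b $batch_file alice@hpc.example.edu"
```

The script only prints the sftp command; copy the printed line to the prompt to run it.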
rsync
rsync is a fast and extraordinarily versatile file copying tool. It can copy locally, to/from another host over any remote shell, or to/from a remote rsync daemon. It offers a large number of options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. rsync is widely used for backups and mirroring and as an improved copy command for everyday use.
rsync is installed system wide and does not require a module to be loaded.
...
    rsync [OPTION] … SRC … [USER@]HOST:DEST
    rsync [OPTION] … [USER@]HOST:SRC [DEST]
where SRC is the file or directory (or a list of multiple files and directories) to copy from, DEST is the file or directory to copy to, and square brackets indicate optional parameters.
Common Options
Option | Description |
---|---|
-a | archive mode |
-r | recurse into directories |
-v | increase verbosity |
-z | compress file data during the transfer |
-u | skip files that are newer at the destination (DEST) |
-t | preserve modification times |
-n | dry-run, perform a trial run with no changes made |
For more information, view the rsync man page by typing man rsync at the command prompt.
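One common way to combine the options listed above is to preview a transfer with -n and then repeat the same command without it. The sketch below prints both commands. The source directory, user, host, and destination path are hypothetical placeholders, and the run helper is an assumption introduced here so that nothing is copied until EXECUTE=1 is set.

```shell
# Sketch: preview an rsync transfer with -n, then repeat it without -n.
# The paths, user, and host are hypothetical placeholders.
src="data/"    # trailing slash: copy the directory's contents, not the directory itself
dest="alice@hpc.example.edu:/home/alice/data/"    # hypothetical [USER@]HOST:DEST

# Print each command by default; set EXECUTE=1 to actually run it.
run() {
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# -a archive mode, -v verbose, -z compress in transit, -u skip files newer at DEST.
# Step 1: with -n, rsync lists what it would transfer and changes nothing.
run rsync -avzu -n "$src" "$dest"

# Step 2: the same command without -n performs the transfer.
run rsync -avzu "$src" "$dest"
```

Because of the delta-transfer algorithm, rerunning the second command after an interrupted or partial transfer sends only what is still missing or changed.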
Wget
GNU Wget (or simply Wget) is a free software package for retrieving files using HTTP, HTTPS, FTP, and FTPS, the most widely used Internet protocols. It is a non-interactive command-line tool, so it can easily be called from scripts, cron jobs, terminals without X-Windows support, etc.
Wget is installed system wide and does not require a module to be loaded.
    wget www.example.com
For more information, visit https://www.gnu.org/software/wget/ or view the Wget man page by typing man wget at the command prompt.
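Because Wget is non-interactive, it lends itself to downloading a list of files from a script. The following sketch writes a list of URLs to a file and prints a wget command that would fetch them. The URLs and the downloads directory name are hypothetical placeholders.

```shell
# Sketch: download several files listed in a text file.
# The URLs and the directory name are hypothetical placeholders.
url_list="$(mktemp)"

cat > "$url_list" <<'EOF'
https://www.example.com/dataset/part1.tar.gz
https://www.example.com/dataset/part2.tar.gz
EOF

# -i reads the URLs from a file, -c resumes partially downloaded files,
# and -P sets the directory in which the files are saved.
# The command is only printed here; remove the leading echo to run it.
echo wget -c -i "$url_list" -P downloads
```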
aria
axel
Rclone
Rclone is a command-line program to manage files on cloud storage. It is a feature-rich alternative to cloud vendors' web storage interfaces. More details are available on the Rclone page.
Globus
Globus is a third-party, web-based tool for managing data transfers between two GridFTP endpoints. Sol and Hawk do not have a GridFTP endpoint, so you cannot transfer data from external sources directly to your home directory. Instead, first transfer the data either to your local system or to Lehigh's Data Transfer Node (DTN), and then transfer it to Sol using scp, sftp, or rsync as described above. More details are available on the Globus page.
...