- Requesting accounts
- Accessing the cluster
- Data management
Accounts that need to be created by the administrator (Peng SUN, [email protected]) include:
- A Linux account on the login node (`login.lins.lab`)
- An account for the batch system (Determined AI, `gpu.lins.lab`)
Note:
- You can only access the links above after setting up the `hosts` file.
A Web entry point can be found here.
The cluster can currently be accessed only via secure protocols (ssh, scp, rsync), and only from inside the campus local area network. To connect from a computer outside the campus network, you must first establish a VPN connection.
Since our cluster is only accessible inside the campus LAN and we do not administer the DNS server, setting up the `hosts` file is the best way to translate human-friendly hostnames into IP addresses.
The way to modify the `hosts` file is as follows:
On Windows:
- Press `Win-Key + R`. A small window will pop up.
- Type in the following command and press `Ctrl+Shift+Enter` to run Notepad as administrator and edit the `hosts` file.
notepad C:\Windows\System32\drivers\etc\hosts
On Linux or macOS:
- Edit `/etc/hosts` with root privileges in your favorite way. For example:
sudo vim /etc/hosts
Append these lines to the end of the `hosts` file:
10.0.2.166 login.lins.lab
10.0.2.169 lins.lab
10.0.2.169 gpu.lins.lab
10.0.2.169 harbor.lins.lab
10.0.2.169 grafana.lins.lab
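As a quick sanity check after editing, the sketch below reports which of the names listed above still have no entry in a hosts file. The function name `missing_hosts` is made up for illustration, and the check assumes a Unix-like shell (Linux, macOS, or WSL).

```shell
# Print every expected hostname that has no entry in the given hosts file.
# Usage: missing_hosts [path-to-hosts-file]   (defaults to /etc/hosts)
missing_hosts() {
    local hosts_file="${1:-/etc/hosts}"
    local name
    for name in login.lins.lab lins.lab gpu.lins.lab harbor.lins.lab grafana.lins.lab; do
        # Skip comment lines, then look for the name as a whole field after the IP.
        if ! awk -v n="$name" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

Calling `missing_hosts` with no argument checks `/etc/hosts`. Empty output means all five names have an entry.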
Because we use a self-signed certificate, web browsers will show a security warning saying the certificate is not recognized when you access the services after modifying the `hosts` file. You can suppress this warning by making your system trust the certificate.
The certificate can be downloaded at: https://lins.lab/lins-lab.crt
- For Windows, right-click the CA certificate file and select 'Install Certificate'. Follow the prompts to add the certificate to the Trusted Root Certification Authorities store. If you are using Git for Windows, you will also need to configure Git to use the Windows native crypto backend:
git config --global http.sslbackend schannel
- For Linux (tested on Ubuntu), first make sure the `ca-certificates` package is installed, then copy the `.crt` file into the folder `/usr/local/share/ca-certificates` and update certificates system-wide with the command `sudo update-ca-certificates`. This works for most applications, but browsers like google-chrome and chromium on Linux have their own certificate storage. You need to go to `chrome://settings/certificates`, select "Authorities", and import the `.crt` file. To use our Harbor registry `harbor.lins.lab`, you need to create the folder `/etc/docker/certs.d/harbor.lins.lab/` and copy the certificate into it.
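On Ubuntu, the system-wide part of the Linux steps above can be collected into one small helper. This is only a sketch: the function name `install_lins_ca` is made up for illustration, it assumes the certificate has already been downloaded to a local file, and it assumes you have `sudo` rights. The target file name `ca.crt` follows Docker's usual convention for per-registry CA files.

```shell
# Install a downloaded CA certificate system-wide and for the Harbor registry.
# Usage: install_lins_ca ./lins-lab.crt
install_lins_ca() {
    local crt="$1"
    [ -f "$crt" ] || { echo "certificate not found: $crt" >&2; return 1; }

    # System trust store (requires the ca-certificates package).
    sudo cp "$crt" /usr/local/share/ca-certificates/lins-lab.crt
    sudo update-ca-certificates

    # Docker looks for registry CAs under /etc/docker/certs.d/<registry>/.
    sudo mkdir -p /etc/docker/certs.d/harbor.lins.lab
    sudo cp "$crt" /etc/docker/certs.d/harbor.lins.lab/ca.crt
}
```

Chrome and Chromium still need the manual import described above, since they keep their own certificate storage.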
You can connect to the cluster via the SSH protocol, which requires an SSH client to be installed. To connect, you need the hostname of the cluster (which resolves to an IP address) and your account credentials (username and password).
Since we set up the `hosts` file in the previous section, we can use the human-readable hostname to make the connection.
Hostname | IP Address | Port |
---|---|---|
login.lins.lab | 10.0.2.166 | 22332 |
Open a terminal and use the standard ssh command
ssh -p 22332 username@login.lins.lab
where username is your username and the hostname can be found in the table shown above. The parameter `-p 22332` declares the SSH port used on the server; for security, we changed it from the default. For instance, if user peter would like to access the cluster, the command would be
peter@laptop:~$ ssh -p 22332 peter@login.lins.lab
peter@login.lins.lab's password:
Welcome to Ubuntu 20.04.4 LTS (GNU/Linux 5.4.0-104-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Tue 15 Mar 2022 11:51:03 AM UTC
System load: 0.0 Users logged in: 1
Usage of /: 28.0% of 125.49GB IPv4 address for docker0: 172.17.0.1
Memory usage: 6% IPv4 address for enp1s0: 192.168.122.2
Swap usage: 0% IPv4 address for enp6s0: 10.0.2.166
Processes: 278
0 updates can be applied immediately.
Last login: Tue Mar 15 11:29:19 2022 from 172.16.29.72
Note that when it prompts you to enter the password:
peter@login.lins.lab's password:
there will not be any visual feedback (e.g. asterisks), so that the length of your password is not revealed.
Since Windows 10, an ssh client is also provided in the operating system, but it is more common to use third-party software to establish ssh connections. Widely used ssh clients are for instance MobaXterm, XShell, FinalShell, Terminus, PuTTY and Cygwin.
To use MobaXterm, you can either start a local terminal and use the same SSH command as for Linux and Mac OS X, or you can click on the session button, choose SSH and then enter the hostname and username. After clicking on OK, you will be asked to enter your password.
How to use MobaXterm: How to access the cluster with MobaXterm - ETHZ / Download and setup MobaXterm - CECI
How to use PuTTY: How to access the cluster with PuTTY - ETHZ
An alternative option: use WSL/WSL2 [CECI Doc]
It is recommended to create SSH keys. When the network connection is unstable, typing your password again and again is frustrating. With SSH keys, you no longer need to type your account password when logging in. Because key-based authentication relies on public-key cryptography, it also helps protect against attacks such as man-in-the-middle.
The links above demonstrate methods using GUI. You can also create the keys with CLI:
For security reasons, we recommend that you use a different key pair for every computer you want to connect to:
ssh-keygen -t ed25519 -f $HOME/.ssh/id_lins
It is recommended to set a passphrase for the private key.
Once this is done, copy the public key to the cluster:
ssh-copy-id -i $HOME/.ssh/id_lins.pub -p 22332 username@login.lins.lab
Finally, you can add the private key to the ssh-agent so that you don't need to enter the passphrase every time you connect. The key is only held temporarily, so you need to add it again after each reboot.
ssh-add ~/.ssh/id_lins
For Windows, third-party software ([PuTTYgen](https://www.puttygen.com/), MobaXterm) is commonly used to create SSH keys (demonstrated in the links above). However, since Windows 10, we can also follow similar steps in PowerShell:
- Step 1. On your PC, create the `.ssh` folder (if it does not exist yet) and go to it:
mkdir ~/.ssh; cd ~/.ssh
- Step 2. Create a public/private key pair:
ssh-keygen -t ed25519 -f id_lins
It is recommended to set a passphrase for the private key for added security.
- Step 3. The program `ssh-copy-id` is not available, so we copy the public key manually:
notepad ~/.ssh/id_lins.pub
(Copy)
- Step 4. On the remote server, create and edit the `authorized_keys` file, and paste the public key into it:
mkdir -p ~/.ssh && vim ~/.ssh/authorized_keys
(Paste the key and save)
- Step 5. Start the ssh-agent and add the private key so that you don't need to enter the passphrase every time (you need to do this every time after the system starts up):
ssh-agent
ssh-add ~/.ssh/id_lins
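As an alternative to pasting the key in an editor in Step 4, the append can be scripted on the login node. The sketch below is illustrative: `add_authorized_key` is a hypothetical helper name, and setting the file mode to 600 is a common convention rather than something this tutorial requires.

```shell
# Append a public key line to authorized_keys unless it is already present.
# Usage (on the login node): add_authorized_key "<contents of id_lins.pub>"
add_authorized_key() {
    local key="$1"
    local file="${2:-$HOME/.ssh/authorized_keys}"
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -Fxq -- "$key" "$file"; then
        printf '%s\n' "$key" >> "$file"
    fi
    # Keep the file private to its owner.
    chmod 600 "$file"
}
```

Running it twice with the same key leaves a single entry, so it is safe to repeat if you are unsure whether the key was already added.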
- Always use a (strong) passphrase to protect your SSH key. Do not leave it empty!
- Never share your private key with somebody else, or copy it to another computer. It must only be stored on your personal computer.
- Use a different key pair for each computer you want to connect to.
- Do not reuse the key pairs for other systems.
- Do not keep open SSH connections in detached `screen` sessions.
- Disable the `ForwardAgent` option in your SSH configuration and do not use `ssh -A` (or use `ssh -a` to disable agent forwarding).
If you use different key pairs for different computers (as recommended above), you need to specify the right key when you connect, for instance:
ssh -p 22332 -i $HOME/.ssh/id_lins username@login.lins.lab
To make your life easier, you can configure your ssh client to use these options automatically by adding the following lines in your $HOME/.ssh/config file:
Host cluster
    HostName login.lins.lab
    Port 22332
    User username
    IdentityFile ~/.ssh/id_lins
For Windows users, you need to use backslashes in `IdentityFile`:
IdentityFile ~\.ssh\id_lins
Then your ssh command simplifies as follows:
ssh cluster
Sometimes we need to run GUI applications on the login node. To directly run GUI applications in ssh terminals, you must open an SSH tunnel and redirect all X11 communication through that tunnel.
Xorg (X11) is normally installed by default as part of most Linux distributions. For Windows, tools such as vcxsrv or x410 can be used. For macOS, since X11 is no longer included, you must install XQuartz. You may want to check out the Troubleshooting section by ETHZ IT-Services.
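With OpenSSH, X11 forwarding is requested with the standard `-X` flag. The sketch below shows the login command as a comment and a small check to run on the login node afterwards. The helper name `check_x11_forwarding` is made up for illustration, and it is an assumption that the login node permits X11 forwarding.

```shell
# On your own computer: log in with X11 forwarding enabled (-X).
#   ssh -X -p 22332 username@login.lins.lab
#
# On the login node: confirm that a forwarded display was set up.
check_x11_forwarding() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 forwarding active, DISPLAY=$DISPLAY"
    else
        echo "DISPLAY is not set; reconnect with 'ssh -X' and check your local X server" >&2
        return 1
    fi
}
```

If the check passes, GUI programs started from that terminal should open their windows on your local X server.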
RDP (Remote Desktop Protocol) provides a more user-friendly remote desktop interface. To connect using RDP, you need an RDP client installed. On Windows, you can use the built-in remote desktop client `mstsc.exe`, or download the newer `Microsoft Remote Desktop` app from the Microsoft Store.
On Linux, it's recommended to install `Remmina` and `remmina-plugin-rdp`.
Using the RDP Clients is simple. Following the prompts, type in the server address, user name and password. Then, set the screen resolution and color depth you want.
For security, RDP is only allowed from SSH tunnels, and the default RDP port is also changed from 3389 to 23389. One can create the SSH tunnel and forward RDP connections to localhost:23389 by:
ssh -p 22332 -NL 23389:localhost:23389 username@login.lins.lab
Note: If you have added the `cluster` host entry to your SSH config as described earlier, you can shorten the command:
ssh -NL 23389:localhost:23389 cluster
Then connect to `localhost:23389` using `mstsc.exe` or the Remote Desktop app from the Microsoft Store.
We are currently using NFS to share filesystems between cluster nodes. The storage space of the login node is small (about 100GB), so it is recommended to store code and data in the NFS shared folders: `/datasets` for datasets and `/workspace` for workspaces. The two NFS folders are allocated on the storage server, which currently offers a capacity of 143TB, with data redundancy and snapshot capability powered by TrueNAS ZFS.
We can check the file systems with the command `df -H`:
[email protected]: ~ $ df -H
Filesystem                               Size  Used Avail Use% Mounted on
/dev/nvme0n1p2                           138G   25G  113G  19% /
nas.lins.lab:/mnt/Peter/Datasets         143T  4.2T  139T   3% /datasets
nas.lins.lab:/mnt/Peter/Workspace/peter  8.8T  136G  8.7T   2% /workspace/peter
You need to ask the system admin to create your workspace folder `/workspace/<username>`.
By default, other users have neither read nor write permissions on your folder.
When you transfer data from your personal computer to a storage server, it's called an upload.
We can use CLI tools like `scp` and `rsync`, or GUI tools like `MobaXterm`, `FinalShell`, `VSCode`, `xftp`, and `SSHFS`, to upload files from a personal computer to the data storage.
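For the CLI route, a minimal `rsync` sketch is shown below. The helper name `upload_to_workspace` and the `DRY_RUN` switch are made up for illustration. The sketch assumes that your account name matches your workspace folder name and that the workspace folder already exists.

```shell
# Upload a local file or directory into /workspace/<user>/ on the cluster.
# Usage: upload_to_workspace ./data peter
# Set DRY_RUN=1 to print the rsync command instead of running it.
upload_to_workspace() {
    local src="$1" user="$2"
    local dest="${user}@login.lins.lab:/workspace/${user}/"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "rsync -avP -e 'ssh -p 22332' $src $dest"
    else
        # -a: archive mode, -v: verbose, -P: show progress and keep partial files
        # -e: run ssh on the non-default port used by the login node
        rsync -avP -e 'ssh -p 22332' "$src" "$dest"
    fi
}
```

Because `-P` keeps partially transferred files, rerunning the same command after a dropped connection resumes the upload. The `scp` equivalent takes the port with a capital `-P`, for example `scp -P 22332 -r ./data username@login.lins.lab:/workspace/username/`.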
Here is an example of using FinalShell:
Here is an example of using SSHFS-win:
When you get data from a service provider such as Baidu Netdisk, Google Drive, Microsoft OneDrive, Amazon S3, etc., it's called a download. For example, you can use the Baidu Netdisk client (already installed). You can also download datasets directly from the source. For large datasets, it is recommended to use dedicated download software such as aria2 or Motrix (aria2 with a GUI).
Here is an example of using Baidu Netdisk:
We have configured both HTTP and SOCKS5 proxy services on the cluster:
- Osaka, central Japan
  - http://192.168.123.169:18889
  - socks5://192.168.123.169:10089
Project homepage: proxychains-ng
Example Usage:
proxychains curl google.com
proxychains -q curl google.com # Quiet mode
proxychains git clone https://github.com/LINs-lab/cluster_tutorial
Export these environment variables before program execution.
This is useful for programs that do not use `libc` and therefore cannot be hooked by `proxychains`, such as many programs written in `python` or `golang`.
export http_proxy=http://192.168.123.169:18889 &&\
export https_proxy=http://192.168.123.169:18889 &&\
export HTTP_PROXY=http://192.168.123.169:18889 &&\
export HTTPS_PROXY=http://192.168.123.169:18889 &&\
export NO_PROXY="localhost,127.0.0.0/8,::1,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,lins.lab,*.lins.lab,westlake.edu.cn,*.westlake.edu.cn,*.edu.cn" &&\
export no_proxy="localhost,127.0.0.0/8,::1,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,lins.lab,*.lins.lab,westlake.edu.cn,*.westlake.edu.cn,*.edu.cn"
curl google.com
Outputs:
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
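To avoid retyping the exports, they can be wrapped in a pair of shell functions, for example in your `~/.bashrc`. This is only a convenience sketch: the names `proxy_on`, `proxy_off`, `LINS_PROXY` and `LINS_NO_PROXY` are made up for illustration, while the values are the ones given above.

```shell
LINS_PROXY="http://192.168.123.169:18889"
LINS_NO_PROXY="localhost,127.0.0.0/8,::1,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,lins.lab,*.lins.lab,westlake.edu.cn,*.westlake.edu.cn,*.edu.cn"

# Route HTTP(S) traffic of programs started from this shell through the proxy.
proxy_on() {
    export http_proxy="$LINS_PROXY" https_proxy="$LINS_PROXY"
    export HTTP_PROXY="$LINS_PROXY" HTTPS_PROXY="$LINS_PROXY"
    export no_proxy="$LINS_NO_PROXY" NO_PROXY="$LINS_NO_PROXY"
}

# Remove the proxy settings again.
proxy_off() {
    unset http_proxy https_proxy HTTP_PROXY HTTPS_PROXY no_proxy NO_PROXY
}
```

Typical use is `proxy_on`, then the command that needs outside access, then `proxy_off` so that later traffic to campus hosts is not affected.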