DeepSeek via HPC in JupyterLab Setup

In my previous blog post I showed how to start DeepSeek R1 on the High Performance Cluster (HPC) of the TUD.

This gave us an interactive prompt, which is not very useful if you want to query DeepSeek automatically and process data with it.

A solution is to connect DeepSeek to JupyterLab. However, starting JupyterLab in the same HPC job is not possible, and the TUD's JupyterHub is quite limited: long startup times and restricted environments, since Python packages with C dependencies cannot easily be installed.

As an alternative, we can use a Jump Host, similar to what I used for Stable Diffusion.

HPC Setup

Let’s first set up the HPC environment.

  1. Connect to your jump host and open a byobu session
ssh your-vm-user@141.11.11.1
byobu
# press F2 to create a new window

Connect to the HPC alpha cluster from within your byobu session:

ssh -A s1111111@login1.alpha.hpc.tu-dresden.de

Start a GPU job with 1 GPU and 8 hours runtime:

srun --account=p_llm_mapping \
    --gres=gpu:1 \
    --ntasks=1 --cpus-per-task=2 --nodes=1 \
    --time=8:00:00 --mem-per-cpu=16000 \
    --pty bash -l

Ollama Setup

This needs to be done only once. Since Ollama models are quite big (a few models can easily take up 100+ GB), we need to reserve a workspace. We use the horse filesystem for this.

ws_allocate -F horse -r 7 -m s1111111@tu-dresden.de p_llm_mapping 90

Info: creating workspace.
/data/horse/ws/s1111111-p_llm_mapping
remaining extensions  : 10
remaining time in days: 90

Get the Ollama binary:

cd /data/horse/ws/s1111111-p_llm_mapping
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
mkdir -p ollama
tar -C ollama -xzf ollama-linux-amd64.tgz   # extracts bin/ and lib/ into ./ollama
rm ollama-linux-amd64.tgz

Create a models folder and symlink into your home folder:

cd /data/horse/ws/s1111111-p_llm_mapping
mkdir models
cd ~/
mkdir .ollama
cd ~/.ollama
rm -rf models
ln -s /data/horse/ws/s1111111-p_llm_mapping/models ~/.ollama/models

Fix permissions:

chown -R s1111111:p_llm_mapping /data/horse/ws/s1111111-p_llm_mapping
chmod -R g+rX /data/horse/ws/s1111111-p_llm_mapping   # group read + directory traversal

This is necessary if colleagues are working with you on the same HPC project.

Now you can start Ollama and connect your HPC job to the Jump Host:

  • first we add the ollama binary to PATH
  • then we open a reverse SSH tunnel to the Jump Host (exposing port 11434 there) and start ollama serve
PATH=$PATH:/data/horse/ws/s1111111-p_llm_mapping/ollama/bin
ssh service@141.76.18.72 -o ExitOnForwardFailure=yes \
    -o ServerAliveInterval=20 -f -R :11434:127.0.0.1:11434 -p 22 -N; \
    ollama serve
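Before moving on, it is worth checking from the Jump Host that the tunnel is up and Ollama answers. A minimal sketch using only the Python standard library; the /api/version endpoint is part of Ollama's REST API, and the port matches the tunnel above:

```python
import json
from urllib.request import urlopen


def ollama_version(base_url: str = "http://127.0.0.1:11434") -> str:
    """Return the version string reported by an Ollama server."""
    with urlopen(f"{base_url}/api/version", timeout=5) as resp:
        return json.loads(resp.read())["version"]


# Usage (on the Jump Host): print(ollama_version())
# prints the server's version string if the tunnel works,
# raises URLError if the tunnel or ollama serve is down
```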

VM Setup

We use Carto-Lab Docker. Follow the Setup guide or use your own JupyterLab docker container.

In order to be able to connect to port 11434 on the host from within the Docker Container, we need to enable host-networking.

Edit the Carto-Lab Docker docker-compose.yml:

services:

  jupyterlab:
    image: registry.gitlab.vgiscience.org/lbsn/tools/jupyterlab:${TAG:-latest}
    network_mode: "host"
    # networks:
    #  - lbsn-network

# networks:
#  lbsn-network:
#    name: ${NETWORK_NAME:-lbsn-network}
#    external: true

  • add network_mode: "host"
  • comment out the networks: entries

Now start Carto-Lab Docker and connect to it. See this Notebook, where I show how to connect from JupyterLab to DeepSeek via the REST API and requests.
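The basic shape of such a query looks as follows. This is a minimal sketch using only the standard library (the requests package works the same way); the model tag "deepseek-r1" is an assumption and must match the tag you actually pulled with ollama pull:

```python
import json
from urllib.request import Request, urlopen

OLLAMA_URL = "http://127.0.0.1:11434"


def build_payload(prompt: str, model: str = "deepseek-r1") -> dict:
    """Assemble the JSON body for Ollama's /api/generate endpoint."""
    # stream=False returns one complete JSON object instead of a chunk stream
    return {"model": model, "prompt": prompt, "stream": False}


def ask_deepseek(prompt: str, model: str = "deepseek-r1") -> str:
    """Send a prompt through the tunnel and return the full response text."""
    req = Request(
        f"{OLLAMA_URL}/api/generate",
        data=json.dumps(build_payload(prompt, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (inside a JupyterLab notebook):
# print(ask_deepseek("Summarize this text: ..."))
```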