# JupyterHub Configuration with OrbStack on Mac (all in Docker)
## Prerequisites
You must have Docker running on your computer.

- On macOS, [OrbStack](https://orbstack.dev/ "A Docker implementation optimised for macOS") is recommended.

## Installation Steps
### 1. Create Directory Structure
Clone the repository:

```bash
git clone https://forge.metabarcoding.org/MetabarcodingSchool/OBIJupyterHub.git
```

Enter the `OBIJupyterHub` directory:

```bash
cd OBIJupyterHub
```

#### File Structure
Your `OBIJupyterHub` directory should contain:

```
OBIJupyterHub
├── start-jupyterhub.sh - The script used to set up and start the server
├── obijupyterhub - The files describing the Docker images and the stack
│   ├── Caddyfile
│   ├── docker-compose.yml
│   ├── Dockerfile
│   ├── Dockerfile.hub
│   ├── jupyterhub_config.py
│   ├── sftpgo_config.json
│   └── start-notebook.sh
├── jupyterhub_volumes - The directory containing the Docker volumes
│   ├── caddy
│   ├── course - Read-only volume mounted in every student container
│   │   ├── bin
│   │   └── R_packages
│   ├── jupyterhub
│   ├── shared - Read-write volume shared by every student container
│   ├── users
│   └── web
│       ├── img
│       │   └── welcome_metabar.webp
│       ├── index.html
│       └── pages
├── Readme.md - This documentation
├── tools
│   ├── generate_pages_json.py
│   └── install_packages.sh
└── web_src - The Quarto document sources used to build the web site
    ├── _output
    ├── _quarto.yml
    ├── 00_home.qmd
    ├── lectures
    │   └── computers
    │       └── regex
    │           ├── lecture_regex.qmd
    │           ├── slides_regex.qmd
    │           └── slides.css
    └── scripts
        └── copy-to-web.sh
```

### 2. Start JupyterHub
From the terminal, in the `OBIJupyterHub` directory, run the following command:

``` bash
./start-jupyterhub.sh
```
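
Before launching the script, it can help to confirm that the Docker daemon is reachable. The following is a minimal sketch, not part of the repository: the function name `docker_ready` is hypothetical, and it relies only on `docker info` returning a non-zero exit status when the daemon cannot be contacted.

```bash
# Hypothetical helper (not shipped with OBIJupyterHub): check that the
# Docker daemon answers before starting the stack.
docker_ready() {
    # `docker info` exits with a non-zero status when the daemon is unreachable.
    if docker info >/dev/null 2>&1; then
        echo "Docker is running"
    else
        echo "Docker is not reachable: start OrbStack (or your Docker daemon) first" >&2
        return 1
    fi
}
```

A possible use is `docker_ready && ./start-jupyterhub.sh`, so the start script only runs when Docker responds.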
### 3. Access JupyterHub
Open your browser and go to: **http://localhost:8888**

You can log in as a student with any username and the password `metabar2025`.

## Useful Commands
### View JupyterHub logs
``` bash
cd obijupyterhub
docker-compose logs -f jupyterhub
```

### View all containers (hub + students)
``` bash
docker ps | grep jupyterhub
```

### Stop JupyterHub
``` bash
cd obijupyterhub
docker-compose down
```

### Restart JupyterHub (after config modification)
``` bash
cd obijupyterhub
docker-compose restart jupyterhub
```

### View logs for a specific student
``` bash
docker logs jupyter-<username>
```

Replace `<username>` with the student's actual username.

### Clean up after lab
``` bash
# Stop and remove all containers
cd obijupyterhub
docker-compose down

# Remove student containers
docker ps -a | grep jupyter- | awk '{print $1}' | xargs docker rm -f

# Remove volumes (WARNING: deletes student data)
docker volume ls | grep jupyterhub-user | awk '{print $2}' | xargs docker volume rm

# Clean everything (containers + volumes + network)
docker-compose down -v
docker ps -a | grep jupyter- | awk '{print $1}' | xargs docker rm -f
docker volume prune -f
```
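
The `grep | awk | xargs` pipelines above call `docker rm` even when no container matches, which prints an error. As a hedged alternative, the sketch below uses Docker's own `--filter` and `--format` options and removes containers one by one. The function name `remove_student_containers` is hypothetical; the `jupyter-` name prefix is the one used elsewhere in this README.

```bash
# Hypothetical helper: remove student containers selected by a name filter
# instead of grepping the full `docker ps` table.
remove_student_containers() {
    # --filter name=... keeps containers whose name contains the given string;
    # --format prints only the container names, one per line.
    docker ps -a --filter "name=jupyter-" --format '{{.Names}}' |
    while IFS= read -r name; do
        [ -n "$name" ] || continue
        echo "Removing $name"
        # stdin is redirected so the command cannot consume the list being read
        docker rm -f "$name" >/dev/null </dev/null
    done
}
```

When no student container exists, the loop body never runs and nothing is removed.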
## Managing Shared Data
### Directory Structure for Each Student
Each student will see this directory structure in their JupyterLab (everything under `work/` is persistent):

```
work/                      # Personal workspace root (persistent)
├── [student files]        # Their own files and notebooks
├── R_packages/            # Personal R packages (writable by student)
├── shared/                # Shared workspace (read/write, shared with all)
└── course/                # Course files (read-only, managed by admin)
    ├── R_packages/        # Shared R packages (read-only, installed by prof)
    ├── bin/               # Shared executables (in PATH)
    └── [course materials] # Your course files
```

**R Package Priority:**

1. R checks `work/R_packages/` first (personal, writable)
1. Then `work/course/R_packages/` (shared, read-only, installed by prof)
1. Then system libraries

**Important:** Everything is under `work/`, so all student files are automatically saved in their persistent volume.

### User Accounts
**Admin Account:**

- Username: `admin`
- Password: `admin2025` (change in `docker-compose.yml`: `JUPYTERHUB_ADMIN_PASSWORD`)
- Can write to `course/` directory

**Student Accounts:**

- Username: any name
- Password: `metabar2025` (change in `docker-compose.yml`: `JUPYTERHUB_PASSWORD`)
- Read-only access to `course/` directory

### Installing R Packages (Admin Only)
**From your Mac (recommended):**
``` bash
# Install packages
tools/install_packages.sh reshape2 plotly knitr
```

This script:

- Installs packages in the `course/R_packages/` directory
- Makes them available to all students (read-only)
- Does not require rebuilding the image

**Students can also install their own packages** in their personal `work/R_packages/`:

```r
# Install in personal library (each student has their own)
install.packages('mypackage') # Will install in work/R_packages/
```

### Using R Packages (Students)
Students simply load packages normally:
``` r
library(reshape2) # R checks: 1) work/R_packages/ 2) work/course/R_packages/ 3) system
library(plotly)
```

R automatically searches in this order:
1. Personal packages: `/home/jovyan/work/R_packages/` (R_LIBS_USER)
1. Prof packages: `/home/jovyan/work/course/R_packages/` (R_LIBS_SITE)
1. System packages
### List Available Packages
``` r
# List all available packages (personal + course + system)
installed.packages()[,"Package"]

# Check personal packages
list.files("/home/jovyan/work/R_packages")

# Check course packages (installed by prof)
list.files("/home/jovyan/work/course/R_packages")
```
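
From a terminal inside a student container (for example with the Bash kernel), the personal, course, system lookup order described in this section can be approximated with plain directory checks, since each installed R package is a directory inside its library. This is only an illustration of the priority, not how R itself resolves packages. The function name `which_r_library` and the `R_PERSONAL_LIB` / `R_COURSE_LIB` override variables are hypothetical; the two default paths are the ones given in this README.

```bash
# Illustration only: report which library would supply a package,
# following the personal -> course -> system priority.
which_r_library() {
    local pkg="$1"
    # Defaults come from this README; the override variables are hypothetical.
    local personal="${R_PERSONAL_LIB:-/home/jovyan/work/R_packages}"
    local course="${R_COURSE_LIB:-/home/jovyan/work/course/R_packages}"
    if [ -d "$personal/$pkg" ]; then
        echo "personal"
    elif [ -d "$course/$pkg" ]; then
        echo "course"
    else
        echo "system or not installed"
    fi
}
```

For instance, `which_r_library plotly` prints `personal` when a student has installed their own copy, even if the course library also provides one.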
### Deposit Files for Course
To put files in the `course/` directory (read-only for students):

``` bash
# Create a temporary directory
mkdir -p ~/jupyterhub-tp/course-files

# Copy your files into it
cp my_notebooks.ipynb ~/jupyterhub-tp/course-files/
cp my_data.csv ~/jupyterhub-tp/course-files/

# Copy into Docker volume
docker run --rm \
  -v jupyterhub-course:/target \
  -v ~/jupyterhub-tp/course-files:/source \
  alpine sh -c "cp -r /source/* /target/"
```

### Retrieve Student Work
``` bash
# List user volumes
docker volume ls | grep 'obijupyterhub_user-'

# Copy files from a specific student
# (the target subdirectory is created first; cp fails if it is missing)
docker run --rm \
  -v obijupyterhub_user-alice:/source \
  -v ~/submissions:/target \
  alpine sh -c "mkdir -p /target/alice && cp -r /source/* /target/alice/"

# Copy all shared work
docker run --rm \
  -v obijupyterhub_shared:/source \
  -v ~/submissions/shared:/target \
  alpine sh -c "cp -r /source/* /target/"
```
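
To collect every student's work in one pass, the per-student command can be wrapped in a loop. The sketch below is an assumption-laden example rather than a repository tool: the function name `collect_all_submissions` is hypothetical, and it assumes student volumes carry the `obijupyterhub_user-` prefix shown above. The source volume is mounted read-only so the copy cannot alter student data.

```bash
# Hypothetical helper: copy each student volume into <dest>/<username>/.
collect_all_submissions() {
    local dest="${1:-$HOME/submissions}"
    local prefix="obijupyterhub_user-"   # prefix assumed from this README
    mkdir -p "$dest"
    docker volume ls --format '{{.Name}}' |
    while IFS= read -r volume; do
        case "$volume" in
            "$prefix"*) ;;               # keep student volumes only
            *) continue ;;
        esac
        local user="${volume#"$prefix"}"
        echo "Collecting $user"
        # /source/. copies hidden files too; the username is passed as $1
        docker run --rm \
            -v "$volume":/source:ro \
            -v "$dest":/target \
            alpine sh -c 'mkdir -p "/target/$1" && cp -r /source/. "/target/$1/"' sh "$user" </dev/null
    done
}
```

Calling `collect_all_submissions ~/submissions` would leave one subdirectory per student under `~/submissions`.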
## User Management
### Option 1: Predefined User List
In `jupyterhub_config.py`, uncomment and modify:

``` python
c.Authenticator.allowed_users = {'student1', 'student2', 'student3'}
```
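
For a larger class, typing the set by hand is error-prone. As a sketch, the line can be generated from a roster file with one username per line. The function name `roster_to_allowed_users` and the roster format (blank lines and lines starting with `#` are ignored) are assumptions made for this example; usernames containing quotes are not escaped.

```bash
# Hypothetical helper: print the allowed_users line for jupyterhub_config.py
# from a roster file (one username per line).
roster_to_allowed_users() {
    awk '
        { gsub(/^[ \t]+|[ \t]+$/, "") }      # trim surrounding whitespace
        $0 == "" || /^#/ { next }            # skip blank lines and comments
        { names = names (n++ ? ", " : "") "\047" $0 "\047" }   # \047 is a single quote
        END {
            if (n) print "c.Authenticator.allowed_users = {" names "}"
            else   print "c.Authenticator.allowed_users = set()"   # {} would be a dict
        }
    ' "$1"
}
```

The printed line can then be pasted into `jupyterhub_config.py` in place of the example above.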
### Option 2: Allow Everyone (for testing)
By default, the configuration allows any user:

``` python
c.Authenticator.allow_all = True
```

⚠️ **Warning**: DummyAuthenticator is ONLY for local testing!

## Kernel Verification
Once logged in, create a new notebook and verify you have access to:

- **Python 3** (default kernel)
- **R** (R kernel)
- **Bash** (bash kernel)

## Customization for Your Labs
### Add Additional R Packages
Modify the `Dockerfile` (before `USER ${NB_UID}`):

``` dockerfile
RUN R -e "install.packages(c('your_package'), repos='http://cran.rstudio.com/')"
```

Then restart the server (it rebuilds the images if needed):

```bash
./start-jupyterhub.sh
```

### Add Python Packages
Add to the `Dockerfile` (before `USER ${NB_UID}`):

``` dockerfile
RUN pip install numpy pandas matplotlib seaborn
```

### Distribute Files to Students
Create a `files_lab/` directory and add to the `Dockerfile`:

``` dockerfile
COPY files_lab/ /home/${NB_USER}/lab/
RUN chown -R ${NB_UID}:${NB_GID} /home/${NB_USER}/lab
```

### Change Port (if 8000 is occupied)
Modify in `docker-compose.yml`:

``` yaml
ports:
  - "8001:8000" # Accessible on localhost:8001
```
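
Before editing the port mapping, it may be worth checking whether the port is really taken. The sketch below assumes `lsof` is available (it ships with macOS); the function name `port_in_use` is hypothetical.

```bash
# Hypothetical helper: succeed when a process already listens on a TCP port.
port_in_use() {
    local port="$1"
    # -nP skips host and port name lookups; -sTCP:LISTEN keeps listening sockets only.
    lsof -nP -iTCP:"$port" -sTCP:LISTEN >/dev/null 2>&1
}
```

For example, `port_in_use 8000 && echo "port 8000 is taken"` prints the message only when another service occupies the port.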
## Advantages of This Approach
**Everything in Docker**: No need to install Python/JupyterHub on your computer\
**Portable**: Easy to deploy on another server\
**Isolated**: No pollution of your system environment\
**Easy to Clean**: A simple `docker-compose down` is enough\
**Reproducible**: Students will have exactly the same environment

## Troubleshooting
**Error "Cannot connect to Docker daemon"**:

- Check that OrbStack is running
- Verify the socket exists: `ls -la /var/run/docker.sock`

**Student containers don't start**:

- Check logs: `docker-compose logs jupyterhub`
- Verify student image exists: `docker images | grep jupyterhub-student`

**Port 8000 already in use**:

- Change port in `docker-compose.yml`

**I want to start from scratch**:

``` bash
pushd obijupyterhub
docker-compose down -v
docker rmi jupyterhub-hub jupyterhub-student
popd

# Then rebuild everything
./start-jupyterhub.sh
```