# OBIJupyterHub - the DNA Metabarcoding Learning Server

## Intended use

This project packages the MetabarcodingSchool training lab into one reproducible bundle. You get Python, R, and Bash kernels, a Quarto-built course website, and preconfigured admin/student accounts, so onboarding a class is a single command instead of a day of setup. Everything runs locally on a single machine, student work persists between sessions, and `./start-jupyterhub.sh` takes care of building images, rendering the site, preparing volumes, and bringing JupyterHub up at `http://localhost:8888`. Defaults (accounts, passwords, volumes) live in the repo so instructors can tweak them quickly.

## Prerequisites (with quick checks)
You need Docker, Docker Compose, Quarto, and Python 3 available on the machine that will host the lab.

- macOS: install [OrbStack](https://orbstack.dev/) (recommended) or Docker Desktop; both ship Docker Engine and Compose.
- Linux: install Docker Engine and the Compose plugin from your distribution (e.g., `sudo apt install docker.io docker-compose-plugin`) or from Docker’s official packages.
- Windows: install Docker Desktop with the WSL2 backend enabled.
- Quarto CLI: get installers from <https://quarto.org/docs/get-started/>.
- Python 3: any recent version is fine (only the standard library is used).

Verify from a terminal; if a command is missing, install it before moving on:
```bash
docker --version
docker compose version # or: docker-compose --version
quarto --version
python3 --version
```
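
The same check can be scripted with the Python standard library. The sketch below is only an illustration: the `REQUIRED_TOOLS` tuple and the `missing_tools` helper are hypothetical names, and the Compose plugin is not covered because it is a `docker` subcommand rather than a separate executable on `PATH`.

```python
import shutil

# Executables named in the prerequisites above (Compose is checked separately
# with `docker compose version`, since it is a docker subcommand).
REQUIRED_TOOLS = ("docker", "quarto", "python3")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed executables were found on PATH.")
```

An empty result only means the executables exist; the version commands above remain the way to confirm that each one actually runs.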
## How the startup script works
`./start-jupyterhub.sh` is the single entry point. It builds the Docker images, renders the course website, prepares the volume folders, and starts the stack. Internally it:

- creates the `jupyterhub_volumes/` tree (caddy, course, shared, users, web…)
- builds `jupyterhub-student` and `jupyterhub-hub` images
- renders the Quarto site from `web_src/`, generates PDF galleries and `pages.json`, and copies everything into `jupyterhub_volumes/web/`
- runs `docker-compose up -d --remove-orphans`

You can tailor what it does with a few flags:

- `--no-build` (or `--offline`): skip Docker image builds and reuse existing images (useful when offline).
- `--force-rebuild`: rebuild images without cache.
- `--stop-server`: stop the stack and remove student containers, then exit.
- `--update-lectures`: rebuild the course website only (no Docker stop/start).
- `--build-obidoc`: force rebuilding the obidoc documentation (auto-built if empty; skipped in offline mode).
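
To picture the first internal step, here is a minimal Python sketch of an idempotent "create the volume tree" operation. It is not the script's actual code (the script is a shell script). `ensure_volume_tree` is a hypothetical helper, and only the five subfolders named above are included, since the list in the text is elided and the real tree may contain more.

```python
import tempfile
from pathlib import Path

# Only the subfolders named in the text; the real tree may contain more.
NAMED_SUBDIRS = ("caddy", "course", "shared", "users", "web")


def ensure_volume_tree(root, subdirs=NAMED_SUBDIRS):
    """Create root/<subdir> for each name and return the folders that were new."""
    root = Path(root)
    created = []
    for name in subdirs:
        path = root / name
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing in the repo is touched.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "jupyterhub_volumes"
        print("first run created:", len(ensure_volume_tree(root)))
        print("second run created:", len(ensure_volume_tree(root)))
```

Because existing folders are left alone, running the step again does not disturb student data already stored in the tree.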
## Installation and first run
1) Clone the project:
```bash
git clone https://forge.metabarcoding.org/MetabarcodingSchool/OBIJupyterHub.git
cd OBIJupyterHub
```
2) (Optional) glance at the structure you’ll populate:

```
OBIJupyterHub
├── start-jupyterhub.sh    - single entry point (build + render + start)
├── obijupyterhub          - Docker images and stack definitions
│   ├── docker-compose.yml
│   ├── Dockerfile
│   ├── Dockerfile.hub
│   └── jupyterhub_config.py
├── jupyterhub_volumes     - data persisted on the host
│   ├── course             - read-only for students (notebooks, data, bin, R packages)
│   ├── shared             - shared read/write space for everyone
│   ├── users              - per-user persistent data
│   └── web                - rendered course website
└── web_src                - Quarto sources for the course website
```
3) Prepare course materials (optional before first run):

- Put notebooks, datasets, scripts, binaries, or PDFs for students under `jupyterhub_volumes/course/`. They will appear read-only at `/home/jovyan/work/course/`.
- For collaborative work, drop files in `jupyterhub_volumes/shared/` (read/write for all at `/home/jovyan/work/shared/`).
- Edit or add Quarto sources in `web_src/` to update the course website; the script will render them.

4) Start everything (build + render + launch):

```bash
./start-jupyterhub.sh
```

5) Access JupyterHub in a browser at `http://localhost:8888`.

6) Stop the stack when you’re done (run from `obijupyterhub/`):

```bash
docker-compose down
```
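
If you script the first run, it can help to wait until the hub answers at the URL from step 5 before pointing students at it. The following is a minimal sketch using only the standard library. `wait_until_up` is a hypothetical helper, and the timeout and polling interval are arbitrary assumptions.

```python
import time
import urllib.error
import urllib.request

# Bypass any configured proxy: the hub runs on this machine.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def wait_until_up(url="http://localhost:8888", timeout=120.0, interval=2.0):
    """Poll `url` until any HTTP response arrives or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with _OPENER.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status: it is up.
            return True
        except OSError:
            pass  # Not reachable yet (refused, reset, or timed out).
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A `False` result means nothing answered within the timeout; the container logs are then the next place to look.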
### Operating the stack (one command, a few options)
- Start or rebuild: `./start-jupyterhub.sh` (rebuilds images, regenerates the website, starts the stack).
- Start without rebuilding images (offline): `./start-jupyterhub.sh --no-build`
- Force rebuild without cache: `./start-jupyterhub.sh --force-rebuild`
- Stop only: `./start-jupyterhub.sh --stop-server`
- Rebuild website only (no Docker stop/start): `./start-jupyterhub.sh --update-lectures`
- Rebuild obidoc docs: `./start-jupyterhub.sh --build-obidoc` (also builds automatically if `jupyterhub_volumes/web/obidoc` is empty; skipped in offline mode)
- Access at `http://localhost:8888` (students: any username / password `metabar2025`; admin: `admin` / `admin2025`).
- Check logs from `obijupyterhub/` with `docker-compose logs -f jupyterhub`.
- Stop with `docker-compose down` (from `obijupyterhub/`). Rerun `./start-jupyterhub.sh` to start again or after config changes.
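
For instructors who drive the lab from their own tooling, the options above can be kept in one lookup table. This is only a sketch: the task names on the left and the `command_for` helper are hypothetical, while the flags are the ones listed above.

```python
import shlex

SCRIPT = "./start-jupyterhub.sh"

# Task names are hypothetical labels; the flags come from the list above.
TASK_FLAGS = {
    "start": [],
    "start-offline": ["--no-build"],
    "rebuild-clean": ["--force-rebuild"],
    "stop": ["--stop-server"],
    "update-website": ["--update-lectures"],
    "rebuild-obidoc": ["--build-obidoc"],
}


def command_for(task):
    """Return the argument list for `task`; unknown tasks raise KeyError."""
    return [SCRIPT, *TASK_FLAGS[task]]


if __name__ == "__main__":
    for task in TASK_FLAGS:
        print(f"{task:>15}: {shlex.join(command_for(task))}")
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root, which avoids building shell strings by hand.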
## Managing shared data
Each student lands in `/home/jovyan/work/` with three key areas: their own files, a shared space, and a read-only course space. Everything under `work/` is persisted on the host in `jupyterhub_volumes`.

```
work/                         # Personal workspace root (persistent)
├── [student files]           # Their own files and notebooks
├── R_packages/               # Personal R packages (writable by student)
├── shared/                   # Shared workspace (read/write, shared with all)
└── course/                   # Course files (read-only, managed by admin)
    ├── R_packages/           # Shared R packages (read-only, installed by prof)
    ├── bin/                  # Shared executables (in PATH)
    └── [course materials]    # Your course files
```

R looks for packages in this order: personal `work/R_packages/`, then shared `work/course/R_packages/`, then system libraries. Because everything lives under `work/`, student files survive restarts.
### User Accounts
Defaults are defined in `obijupyterhub/docker-compose.yml`: admin (`admin` / `admin2025`) with write access to `course/`, and students (any username, password `metabar2025`) with read-only access to `course/`. Adjust `JUPYTERHUB_ADMIN_PASSWORD` and `JUPYTERHUB_PASSWORD` there, then rerun `./start-jupyterhub.sh`.
### Installing R Packages (Admin Only)
From the host, install shared R packages into `course/R_packages/`:

```bash
# Install packages
tools/install_packages.sh reshape2 plotly knitr
```

Students can install their own packages into their personal `work/R_packages/`:

```r
# Install in personal library (each student has their own)
install.packages('mypackage')  # Will install in work/R_packages/
```
### Using R Packages (Students)
Students simply load packages normally:
```r
library(reshape2)  # R checks: 1) work/R_packages/ 2) work/course/R_packages/ 3) system
library(plotly)
```

R automatically searches in this order:

1. Personal packages: `/home/jovyan/work/R_packages/` (R_LIBS_USER)
2. Prof packages: `/home/jovyan/work/course/R_packages/` (R_LIBS_SITE)
3. System packages
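
The practical consequence of this order is that a personal copy of a package shadows the course copy. The sketch below models that lookup in Python. It is a simplified illustration rather than R's own resolution code: `resolve_package` is a hypothetical helper, and it relies only on the fact that an installed R package is a folder named after the package inside a library directory.

```python
from pathlib import Path

# Library locations in the search order described above.
# The system library path is not given in the text, so it is not modelled here.
SEARCH_ORDER = (
    ("personal", Path("/home/jovyan/work/R_packages")),
    ("course", Path("/home/jovyan/work/course/R_packages")),
)


def resolve_package(name, search_order=SEARCH_ORDER):
    """Return the label of the first library that holds a folder called `name`."""
    for label, library in search_order:
        if (library / name).is_dir():
            return label
    return "system or not installed"
```

If a student's code behaves differently from the instructor's, an older personal install sitting ahead of the course library is one thing to check.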
### List Available Packages
```r
# List all available packages (personal + course + system)
installed.packages()[,"Package"]

# Check personal packages
list.files("/home/jovyan/work/R_packages")

# Check course packages (installed by prof)
list.files("/home/jovyan/work/course/R_packages")
```
### Deposit or retrieve course and student files
On the host, place course files in `jupyterhub_volumes/course/` (they appear read-only to students), shared files in `jupyterhub_volumes/shared/`, and collect student work from `jupyterhub_volumes/users/`.
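
Collecting student work at the end of a session can be scripted on the host. The sketch below assumes that `jupyterhub_volumes/users/` holds one subfolder per student, which the text implies but does not spell out. `archive_student_work` is a hypothetical helper.

```python
import shutil
from pathlib import Path


def archive_student_work(users_dir, output_dir):
    """Write one .zip per student folder in users_dir and return the archive paths."""
    users_dir, output_dir = Path(users_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for student in sorted(p for p in users_dir.iterdir() if p.is_dir()):
        # make_archive appends ".zip" to the base name and returns the full path.
        archive = shutil.make_archive(
            str(output_dir / student.name), "zip", root_dir=student
        )
        archives.append(Path(archive))
    return archives
```

For example, `archive_student_work("jupyterhub_volumes/users", "collected")` would leave one archive per student in a `collected/` folder (a hypothetical name) without modifying the originals.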
## User Management
### Option 1: Predefined User List
In `jupyterhub_config.py`, uncomment and modify:

```python
c.Authenticator.allowed_users = {'student1', 'student2', 'student3'}
```
### Option 2: Allow Everyone (for testing)
By default, the configuration allows any user:

```python
c.Authenticator.allow_all = True
```

⚠️ **Warning**: DummyAuthenticator is ONLY for local testing!
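
For a larger class, the predefined list from Option 1 can be read from a roster file instead of being typed into the config. This is a sketch under assumptions: `parse_roster` is a hypothetical helper, and the roster format (one username per line, `#` for comments) and the file name are invented for illustration.

```python
def parse_roster(text):
    """Return the set of usernames in `text`: one per line, '#' starts a comment."""
    users = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip()
        if name:
            users.add(name)
    return users


# Possible use inside jupyterhub_config.py, where JupyterHub provides `c`
# (the roster file name is hypothetical):
#
#     with open("roster.txt", encoding="utf-8") as handle:
#         c.Authenticator.allowed_users = parse_roster(handle.read())
```

Keeping the roster in a plain text file lets instructors update the class list without editing Python code.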
## Kernel Verification
Once logged in, create a new notebook and verify you have access to:
- **Python 3** (default kernel)
- **R** (R kernel)
- **Bash** (bash kernel)
## Customization for Your Labs
### Add Additional R Packages
Modify the `Dockerfile` (before `USER ${NB_UID}`):

```dockerfile
RUN R -e "install.packages(c('your_package'), repos='http://cran.rstudio.com/')"
```

Then rerun `./start-jupyterhub.sh` to rebuild and restart.
### Add Python Packages
Add to the `Dockerfile` (before `USER ${NB_UID}`):

```dockerfile
RUN pip install numpy pandas matplotlib seaborn
```

Then rerun `./start-jupyterhub.sh` to rebuild and restart.
### Change Port (if 8888 is occupied)

Modify the published port in `docker-compose.yml`. Only the host side (the left-hand number) needs to change:

```yaml
ports:
  - "8001:8000"  # Accessible on localhost:8001
```
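
Before editing the compose file, it can help to confirm which host ports are actually free. A minimal sketch with the standard library follows; `port_is_free` is a hypothetical helper, and it only probes the local loopback address.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts a TCP connection on host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is taken.
        return probe.connect_ex((host, port)) != 0
```

For example, `port_is_free(8888)` returning `False` while the stack is stopped suggests another program holds the port, and a different host port is needed.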
## Advantages of This Approach
✅ **Everything in Docker**: No need to install JupyterHub or its kernels on your computer\
✅ **Portable**: Easy to deploy on another server\
✅ **Isolated**: No pollution of your system environment\
✅ **Easy to Clean**: A simple `docker-compose down` is enough\
✅ **Reproducible**: Students will have exactly the same environment
## Troubleshooting
- Docker daemon unavailable: make sure OrbStack, Docker Desktop, or the Docker daemon is running; verify `/var/run/docker.sock` exists.
- Student containers do not start: check `docker-compose logs jupyterhub` and confirm the images exist with `docker images | grep jupyterhub-student`.
- Port conflict: change the published port in `docker-compose.yml`.

**I want to start from scratch**:

```bash
pushd obijupyterhub
docker-compose down -v
docker rmi jupyterhub-hub jupyterhub-student
popd
# Then rebuild everything
./start-jupyterhub.sh
```