Merge pull request '🔧 Add selective image rebuild flags and enhance R dependency scanning' (#9) from push-kwqtsoywszyy into master

Reviewed-on: #9
2026-04-30 17:36:16 +00:00
7 changed files with 614 additions and 334 deletions
+183 -178
@@ -2,14 +2,14 @@
## Intended use
This project packages the MetabarcodingSchool training lab into one reproducible bundle. You get Python, R, and Bash kernels, a Quarto-built course website, and preconfigured admin/student accounts, so onboarding a class is a single command instead of a day of setup. Everything runs locally on a single machine, student work persists between sessions, and `./start-jupyterhub.sh` takes care of pulling images, rendering the site, preparing volumes, and bringing JupyterHub up at `http://localhost:8888`.
## Prerequisites (with quick checks)
You only need **Docker and Docker Compose** on the machine that will host the lab. All other tools (Quarto, Hugo, Python, R) are provided via a builder Docker image and do not need to be installed on your system.
- macOS: install [OrbStack](https://orbstack.dev/) (recommended) or Docker Desktop; both ship Docker Engine and Compose.
- Linux: install Docker Engine and the Compose plugin from your distribution (e.g., `sudo apt install docker.io docker-compose-plugin`) or from Docker's official packages.
- Windows: install Docker Desktop with the WSL2 backend enabled.
Verify from a terminal:
@@ -19,263 +19,268 @@ docker --version
docker compose version # or: docker-compose --version
```
## Three operating modes
`./start-jupyterhub.sh` has three modes that control how Docker images are obtained:
| Mode | Flag | Description |
|------|------|-------------|
| **Pull** (default) | *(none)* | Pull pre-built images from the registry and start |
| **Local build** | `--local-build` | Build images locally on your machine and start (no push) |
| **Publish** | `--publish` | Build multi-arch images (amd64 + arm64), push to registry, then start |
### Pull mode — default, fastest
```bash
./start-jupyterhub.sh
```
Downloads the three pre-built images from `registry.metabarcoding.org/metabarschool/`:
- `obijupyterhub-builder:latest`
- `obijupyterhub-hub:latest`
- `obijupyterhub-student:latest`
This is what instructors should use in class. No compilation, no wait.
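Under the hood this amounts to three `docker pull`s. The fully qualified references can be sketched as follows (the registry prefix matches the one defined in `start-jupyterhub.sh`):

```shell
# Print the image references that pull mode fetches (names from the list above)
REGISTRY="registry.metabarcoding.org/metabarschool"
for name in builder hub student; do
  echo "${REGISTRY}/obijupyterhub-${name}:latest"
done
```

Running `docker pull` on each printed line by hand achieves the same result as pull mode's image step.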
### Local build mode — for development
```bash
./start-jupyterhub.sh --local-build
```
Builds all three images locally using the Dockerfiles in `obijupyterhub/`. Rebuilt images stay on your machine and are not pushed to the registry. Additional flags apply only in this mode:
| Flag | Effect |
|------|--------|
| `--no-build` / `--offline` | Skip all image operations, use whatever is already local |
| `--force-rebuild` | Rebuild all images without Docker cache |
| `--rebuild-builder` | Force rebuild the builder image only |
| `--rebuild-student` | Force rebuild the student image only |
| `--rebuild-hub` | Force rebuild the JupyterHub image only |
`--rebuild-*` and `--force-rebuild` imply `--local-build` automatically.
### Publish mode — for maintainers
```bash
./start-jupyterhub.sh --publish
```
Builds all three images for both `linux/amd64` and `linux/arm64` using `docker buildx`, then pushes them to the registry tagged with both `:latest` and the version from `version.txt`. Requires write access to the registry and `docker buildx` with a `docker-container` driver.
**Before publishing a new version**, bump `version.txt` at the project root:
```
0.2.0
```
## Actions (all modes)
These flags work alongside any mode:
| Flag | Effect |
|------|--------|
| `--stop-server` | Stop the stack and remove student containers, then exit |
| `--update-lectures` | Rebuild the course website only (no Docker stop/start) |
| `--build-obidoc` | Force rebuild of the obidoc documentation |
## Installation and first run
1. Clone the project:
```bash
git clone https://forge.metabarcoding.org/MetabarcodingSchool/OBIJupyterHub.git
cd OBIJupyterHub
```
2. Repository structure:
```
OBIJupyterHub/
├── start-jupyterhub.sh        single entry point
├── version.txt                current image version number
├── obijupyterhub/
│   ├── docker-compose.yml
│   ├── Dockerfile             student image
│   ├── Dockerfile.hub         JupyterHub image
│   ├── Dockerfile.builder     builder image (Quarto, Hugo, R, Python)
│   └── jupyterhub_config.py
├── jupyterhub_volumes/        data persisted on the host
│   ├── builder/R_packages/    R package cache for building lectures
│   ├── course/                read-only for students (notebooks, data, bin)
│   ├── shared/                shared read/write space for everyone
│   ├── users/                 per-user persistent data
│   └── web/                   rendered course website
├── tools/
│   ├── install_quarto_deps.R  automatic R dependency detection and install
│   └── install_packages.sh    install shared R packages into course/
└── web_src/                   Quarto sources for the course website
```
Note: The `tools/` directory contains utility scripts including `install_quarto_deps.R` for automatic R dependency detection.
3. (Optional) place course materials in `jupyterhub_volumes/course/` before first run.
4. Start everything:
```bash
./start-jupyterhub.sh # pulls images from registry (recommended)
# or
./start-jupyterhub.sh --local-build # builds locally
```
5. Access JupyterHub at `http://localhost:8888`.
6. Stop when done:
```bash
./start-jupyterhub.sh --stop-server
# or from obijupyterhub/
docker-compose down
```
## How the builder image works
The `obijupyterhub-builder` image contains Quarto, Hugo, R, and Python — you do not need any of these on your host. The script runs this image as a temporary container to:
- detect R package dependencies from your `.qmd` files (scans `library()`, `require()`, and `remotes::install_git/github()` calls using base R — no external package required)
- install missing R packages into `jupyterhub_volumes/builder/R_packages/` (cached between runs)
- render the Quarto website from `web_src/`
- generate PDF galleries and `pages.json`
- (optionally) build the obidoc documentation with Hugo
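The detection step can be approximated with base shell tools. A minimal sketch (the real implementation lives in `tools/install_quarto_deps.R`; the demo file and regex here are illustrative only):

```shell
# Sketch: collect unique package names from library()/require() calls in .qmd files
demo=$(mktemp -d)
cat > "$demo/lesson.qmd" <<'EOF'
library(ggplot2)
require(dplyr)
library(ggplot2)
EOF
# Extract call-site matches, strip the call wrapper, deduplicate
grep -hoE '(library|require)\([A-Za-z0-9._]+\)' "$demo"/*.qmd \
  | sed -E 's/^(library|require)\(//; s/\)$//' \
  | sort -u
# prints: dplyr, then ggplot2
```

Any package name not yet present in the cache would then be installed into `jupyterhub_volumes/builder/R_packages/`.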
### R package caching
Packages are cached in `jupyterhub_volumes/builder/R_packages/`:
- **First build**: all packages used in your `.qmd` files are detected and installed (may take a while).
- **Subsequent builds**: only new packages are installed, making builds much faster.
- **Non-CRAN packages**: packages installed via `remotes::install_git()` or `remotes::install_github()` in your `.qmd` files are detected and pre-installed automatically before rendering.
- **Clear the cache**: delete `jupyterhub_volumes/builder/R_packages/` to force a full reinstall.
## Managing course and student data
Each student lands in `/home/jovyan/work/` with three areas:
```
work/
├── [student files] personal workspace (persistent)
├── R_packages/ personal R packages (writable by student)
├── shared/ shared space (read/write, all students)
└── course/ course files (read-only)
├── R_packages/ shared R packages installed by the instructor
├── bin/ shared executables (added to PATH)
└── [course materials]
```
R looks for packages in this order: personal `work/R_packages/`, then shared `work/course/R_packages/`, then system libraries. Because everything lives under `work/`, student files survive restarts.
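That precedence is simple first-match-wins. A toy illustration (stand-in directories under a temp path, not the real volumes or R itself):

```shell
# Simulate the search order: the personal library wins over the shared course library
base=$(mktemp -d)
mkdir -p "$base/work/R_packages/demo" "$base/work/course/R_packages/demo"
for lib in "$base/work/R_packages" "$base/work/course/R_packages"; do
  if [ -d "$lib/demo" ]; then
    echo "demo resolved from: $lib"
    break
  fi
done
# prints the personal work/R_packages path
```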
On the host, place course files in `jupyterhub_volumes/course/`, collaborative files in `jupyterhub_volumes/shared/`, and collect student work from `jupyterhub_volumes/users/`.
### Installing shared R packages (instructor)
From the host, install shared R packages into `course/R_packages/`:
```bash
tools/install_packages.sh reshape2 plotly knitr
```
### Installing personal R packages (students)
Students can install their own packages into their personal `work/R_packages/`:
```r
install.packages('mypackage') # installs into work/R_packages/
```
### Loading packages (students)
Students simply load packages normally:
```r
library(reshape2) # searches: work/R_packages/ → work/course/R_packages/ → system
```
## User accounts
Defaults are set in `obijupyterhub/docker-compose.yml`:
| Account | Username | Password |
|---------|----------|----------|
| Admin | `admin` | `admin2025` |
| Students | any | `metabar2025` |
Change `JUPYTERHUB_ADMIN_PASSWORD` and `JUPYTERHUB_PASSWORD` in the compose file, then rerun `./start-jupyterhub.sh`.
To restrict access to a predefined list, edit `jupyterhub_config.py`:
```python
c.Authenticator.allowed_users = {'student1', 'student2', 'student3'}
```
## Customising the images
All image customisations require a rebuild. Use `--local-build` (or the targeted `--rebuild-*` flag) to apply changes locally, or `--publish` to push them to the registry.
### Add R packages baked into the student image
Edit `obijupyterhub/Dockerfile` (before `USER ${NB_UID}`):
```dockerfile
RUN R -e "install.packages(c('your_package'), repos='http://cran.rstudio.com/')"
```
Then rebuild:
```bash
./start-jupyterhub.sh --rebuild-student
```
### Add Python packages
Edit `obijupyterhub/Dockerfile` (before `USER ${NB_UID}`):
```dockerfile
RUN pip install numpy pandas matplotlib seaborn
```
Then rebuild:
```bash
./start-jupyterhub.sh --rebuild-student
```
### Change the listening port
In `obijupyterhub/docker-compose.yml`:
```yaml
ports:
- "8001:80" # accessible at http://localhost:8001
```
## Troubleshooting
**Docker daemon unavailable**: make sure OrbStack / Docker Desktop / the daemon is running.
**Student containers do not start**: run `docker-compose logs jupyterhub` from `obijupyterhub/` and confirm the student image is present:
```bash
docker images | grep obijupyterhub-student
```
**Port conflict**: change the published port in `docker-compose.yml`.
**Registry pull fails**: check your network, or fall back to a local build:
```bash
./start-jupyterhub.sh --local-build
```
**Start from scratch**:
```bash
./start-jupyterhub.sh --stop-server
cd obijupyterhub
docker-compose down -v
docker rmi jupyterhub-hub jupyterhub-student obijupyterhub-builder 2>/dev/null || true
docker rmi registry.metabarcoding.org/metabarschool/obijupyterhub-hub:latest \
registry.metabarcoding.org/metabarschool/obijupyterhub-student:latest \
registry.metabarcoding.org/metabarschool/obijupyterhub-builder:latest 2>/dev/null || true
cd ..
rm -rf jupyterhub_volumes/builder/R_packages # clear R package cache
./start-jupyterhub.sh # pull fresh images and start
```
+8 -4
@@ -32,6 +32,7 @@ RUN apt-get update \
libpng-dev \
libtiff5-dev \
libjpeg-dev \
libuv1-dev \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
@@ -54,12 +55,15 @@ RUN ARCH=$(dpkg --print-architecture) \
| tar -xz -C /usr/local/bin hugo \
&& chmod +x /usr/local/bin/hugo
# Install Quarto from the official tarball.
# Using tar.gz instead of .deb avoids dpkg and is more reliable in cross-arch
# (QEMU) builds where GitHub downloads are slower and more prone to transient errors.
ARG QUARTO_VERSION=1.6.42
RUN ARCH=$(dpkg --print-architecture) \
&& curl -fsSL --retry 5 --retry-delay 10 \
"https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_VERSION}/quarto-${QUARTO_VERSION}-linux-${ARCH}.tar.gz" \
| tar -xz -C /opt \
&& ln -s "/opt/quarto-${QUARTO_VERSION}/bin/quarto" /usr/local/bin/quarto
# Create working directory
WORKDIR /workspace
+3 -4
@@ -1,11 +1,8 @@
services:
jupyterhub:
container_name: jupyterhub
hostname: jupyterhub
image: ${HUB_IMAGE:-registry.metabarcoding.org/metabarschool/obijupyterhub-hub:latest}
expose:
- "8000"
volumes:
@@ -21,6 +18,8 @@ services:
- jupyterhub-network
restart: unless-stopped
environment:
# Docker image used for student containers (read by jupyterhub_config.py)
STUDENT_IMAGE: ${STUDENT_IMAGE:-registry.metabarcoding.org/metabarschool/obijupyterhub-student:latest}
# Shared password for all students
JUPYTERHUB_PASSWORD: metabar2025
# Admin password (for installing R packages)
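The `${HUB_IMAGE:-…}` / `${STUDENT_IMAGE:-…}` forms above are standard shell-style parameter substitution, which Compose resolves from the calling environment (here set by `start-jupyterhub.sh`). A minimal sketch of the fallback behaviour, with placeholder values:

```shell
# ${VAR:-default}: the default applies only when VAR is unset or empty
unset STUDENT_IMAGE
echo "${STUDENT_IMAGE:-registry.example.org/obijupyterhub-student:latest}"  # default wins
STUDENT_IMAGE="jupyterhub-student:latest"
echo "${STUDENT_IMAGE:-registry.example.org/obijupyterhub-student:latest}"  # override wins
```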
+4 -1
@@ -14,7 +14,10 @@ VOLUMES_BASE_PATH = '/volumes/users' # Path as seen from JupyterHub container (
HOST_VOLUMES_PATH = os.environ.get('HOST_VOLUMES_PATH', '/volumes') # Real path on host machine (parent dir)
# Docker image to use for student containers
c.DockerSpawner.image = os.environ.get(
'STUDENT_IMAGE',
'registry.metabarcoding.org/metabarschool/obijupyterhub-student:latest'
)
# Docker network (create with: docker network create jupyterhub-network)
c.DockerSpawner.network_name = 'jupyterhub-network'
+306 -121
@@ -1,33 +1,61 @@
#!/bin/bash
# JupyterHub startup script for labs
#
# Modes (mutually exclusive):
# (default) Pull images from registry and start
# --local-build Build images locally and start (no push)
# --publish Build multi-arch images, push to registry, and start
#
# Usage: ./start-jupyterhub.sh [mode] [options]
set -e
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
DOCKER_DIR="${SCRIPT_DIR}/obijupyterhub/"
BUILDER_IMAGE="obijupyterhub-builder:latest"
REGISTRY="registry.metabarcoding.org/metabarschool"
PLATFORMS="linux/amd64,linux/arm64"
BUILDX_BUILDER_NAME="obijupyterhub-buildx"
# Colors for display
GREEN='\033[0;32m'
BLUE='\033[0;34m'
YELLOW='\033[1;33m'
NC='\033[0m'
# Operating mode
LOCAL_BUILD=false
PUBLISH=false
# Build options (meaningful in --local-build mode)
NO_BUILD=false
FORCE_REBUILD=false
REBUILD_BUILDER=false
REBUILD_STUDENT=false
REBUILD_HUB=false
# Actions
STOP_SERVER=false
UPDATE_LECTURES=false
BUILD_OBIDOC=false
usage() {
cat <<EOF
Usage: ./start-jupyterhub.sh [mode] [options]
Modes (mutually exclusive, default is pull-from-registry):
--local-build Build images locally and start (no push to registry)
--publish Build multi-arch images, push to registry, and start
Build options (--local-build only):
--no-build | --offline Skip all image operations (use existing local images)
--force-rebuild Rebuild all local images without cache
--rebuild-builder Force rebuild the builder image only
--rebuild-student Force rebuild the student image only
--rebuild-hub Force rebuild the JupyterHub image only
Actions:
--stop-server Stop the stack and remove student containers, then exit
--update-lectures Rebuild the course website only (no Docker stop/start)
--build-obidoc Force rebuild of obidoc documentation
@@ -35,92 +63,114 @@ Options:
EOF
}
dockercompose=$(which docker-compose 2>/dev/null || echo 'docker compose')
while [[ $# -gt 0 ]]; do
case "$1" in
--local-build) LOCAL_BUILD=true ;;
--publish) PUBLISH=true ;;
--no-build|--offline) NO_BUILD=true ;;
--force-rebuild) FORCE_REBUILD=true; LOCAL_BUILD=true ;;
--rebuild-builder) REBUILD_BUILDER=true; LOCAL_BUILD=true ;;
--rebuild-student) REBUILD_STUDENT=true; LOCAL_BUILD=true ;;
--rebuild-hub) REBUILD_HUB=true; LOCAL_BUILD=true ;;
--stop-server) STOP_SERVER=true ;;
--update-lectures) UPDATE_LECTURES=true ;;
--build-obidoc) BUILD_OBIDOC=true ;;
-h|--help) usage; exit 0 ;;
*) echo "Unknown option: $1" >&2; usage; exit 1 ;;
esac
shift
done
if $LOCAL_BUILD && $PUBLISH; then
echo "Error: --local-build and --publish cannot be used together" >&2
exit 1
fi
if $STOP_SERVER && $UPDATE_LECTURES; then
echo "Error: --stop-server and --update-lectures cannot be used together" >&2
exit 1
fi
echo "Starting JupyterHub for Lab"
echo "=============================="
echo ""
# ---------------------------------------------------------------------------
# Image name helpers
# ---------------------------------------------------------------------------
local_image_name() {
case "$1" in
hub) echo "jupyterhub-hub:latest" ;;
student) echo "jupyterhub-student:latest" ;;
builder) echo "obijupyterhub-builder:latest" ;;
esac
}
registry_image_name() {
echo "${REGISTRY}/obijupyterhub-$1:${2:-latest}"
}
dockerfile_for() {
case "$1" in
hub) echo "Dockerfile.hub" ;;
student) echo "Dockerfile" ;;
builder) echo "Dockerfile.builder" ;;
esac
}
read_version() {
local vfile="${SCRIPT_DIR}/version.txt"
if [ ! -f "$vfile" ]; then
echo "Error: version.txt not found at ${vfile}" >&2
exit 1
fi
tr -d '[:space:]' < "$vfile"
}
# Set image names based on mode
if $LOCAL_BUILD; then
BUILDER_IMAGE=$(local_image_name builder)
HUB_IMAGE=$(local_image_name hub)
STUDENT_IMAGE=$(local_image_name student)
else
BUILDER_IMAGE=$(registry_image_name builder)
HUB_IMAGE=$(registry_image_name hub)
STUDENT_IMAGE=$(registry_image_name student)
fi
# ---------------------------------------------------------------------------
# Utility
# ---------------------------------------------------------------------------
get_file_timestamp() {
local file="$1"
case "$(uname -s)" in
Linux) stat -c %Y "$file" ;;
Darwin) stat -f %m "$file" ;;
*) echo "Unsupported system" >&2; return 1 ;;
esac
}
check_if_image_needs_rebuild() {
local image_name="$1"
local dockerfile="$2"
local force="${3:-false}"
echo -e "${BLUE}Checking image ${image_name}...${NC}"
# Check if image exists
if ! docker image inspect "$image_name" >/dev/null 2>&1; then
echo -e "${YELLOW}Docker image ${image_name} doesn't exist.${NC}"
return 0
fi
if $FORCE_REBUILD || $force; then
echo -e "${YELLOW}Docker image build is forced.${NC}"
return 0
fi
# Compare Dockerfile modification time with image creation time
if [ -f "$dockerfile" ]; then
local dockerfile_mtime
dockerfile_mtime=$(get_file_timestamp "$dockerfile" 2>/dev/null || echo 0)
local image_created
image_created=$(docker image inspect "$image_name" --format='{{.Created}}' 2>/dev/null \
| sed -E 's/\.[0-9]+//' \
| (read d; if [[ "$(uname -s)" == "Darwin" ]]; then date -ju -f "%Y-%m-%dT%H:%M:%S" "${d%Z}" +%s; else date -d "$d" +%s; fi) 2>/dev/null || echo 0)
@@ -129,31 +179,138 @@ check_if_image_needs_rebuild() {
if [ "$dockerfile_mtime" -gt "$image_created" ]; then
echo -e "${YELLOW}Dockerfile is newer than image, rebuild needed${NC}"
return 0
fi
fi
return 1
}
# ---------------------------------------------------------------------------
# Builder image (local-build mode)
# ---------------------------------------------------------------------------
build_builder_image() {
if check_if_image_needs_rebuild "$(local_image_name builder)" "Dockerfile.builder" "$REBUILD_BUILDER"; then
local build_flag=()
if $FORCE_REBUILD || $REBUILD_BUILDER; then build_flag+=(--no-cache); fi
echo ""
echo -e "${BLUE}Building builder image...${NC}"
docker build "${build_flag[@]}" -t "$(local_image_name builder)" -f Dockerfile.builder .
else
echo -e "${BLUE}Builder image is up to date, skipping build.${NC}"
fi
}
# ---------------------------------------------------------------------------
# Student + Hub images (local-build mode)
# ---------------------------------------------------------------------------
build_images() {
if $NO_BUILD; then
echo -e "${YELLOW}Skipping image builds (offline/no-build mode).${NC}"
return
fi
if check_if_image_needs_rebuild "$(local_image_name student)" "Dockerfile" "$REBUILD_STUDENT"; then
local student_flag=()
if $FORCE_REBUILD || $REBUILD_STUDENT; then student_flag+=(--no-cache); fi
echo ""
echo -e "${BLUE}Building student image...${NC}"
docker build "${student_flag[@]}" -t "$(local_image_name student)" -f Dockerfile .
else
echo -e "${BLUE}Student image is up to date, skipping build.${NC}"
fi
if check_if_image_needs_rebuild "$(local_image_name hub)" "Dockerfile.hub" "$REBUILD_HUB"; then
local hub_flag=()
if $FORCE_REBUILD || $REBUILD_HUB; then hub_flag+=(--no-cache); fi
echo ""
echo -e "${BLUE}Building JupyterHub image...${NC}"
docker build "${hub_flag[@]}" -t "$(local_image_name hub)" -f Dockerfile.hub .
else
echo -e "${BLUE}JupyterHub image is up to date, skipping build.${NC}"
fi
}
# ---------------------------------------------------------------------------
# Pull images from registry (default mode)
# ---------------------------------------------------------------------------
pull_images() {
if $NO_BUILD; then
echo -e "${YELLOW}Skipping image pull (offline/no-build mode).${NC}"
return
fi
echo ""
echo -e "${BLUE}Pulling images from registry...${NC}"
docker pull "$BUILDER_IMAGE"
docker pull "$HUB_IMAGE"
docker pull "$STUDENT_IMAGE"
}
# ---------------------------------------------------------------------------
# Multi-arch build + push to registry (--publish mode)
# ---------------------------------------------------------------------------
ensure_buildx_builder() {
docker buildx inspect "$BUILDX_BUILDER_NAME" >/dev/null 2>&1 \
|| docker buildx create --name "$BUILDX_BUILDER_NAME" --driver docker-container --bootstrap
}
publish_images() {
local version
version=$(read_version)
# docker buildx --push uses Docker's own credential store, independent of
# skopeo. Verify auth early to get a clear error before a long build.
echo -e "${BLUE}Checking registry authentication...${NC}"
local registry_host="${REGISTRY%%/*}"
if ! docker login "$registry_host" >/dev/null 2>&1; then
echo -e "${YELLOW}Not logged in to ${registry_host}. Running docker login...${NC}"
docker login "$registry_host" || {
echo "Error: authentication to ${registry_host} failed." >&2
echo "Run: docker login ${registry_host}" >&2
exit 1
}
fi
echo ""
echo -e "${BLUE}Publishing images (version ${version}) to ${REGISTRY}${NC}"
echo -e "${BLUE}Platforms: ${PLATFORMS}${NC}"
ensure_buildx_builder
local names=(builder student hub)
local dockerfiles=(Dockerfile.builder Dockerfile Dockerfile.hub)
for i in "${!names[@]}"; do
local name="${names[$i]}"
local df="${dockerfiles[$i]}"
local remote="${REGISTRY}/obijupyterhub-${name}"
echo ""
echo -e "${BLUE}Building and pushing ${name} image...${NC}"
docker buildx build \
--builder "$BUILDX_BUILDER_NAME" \
--platform "$PLATFORMS" \
--tag "${remote}:latest" \
--tag "${remote}:${version}" \
--file "${df}" \
--push \
.
echo -e "${GREEN} ${remote}:latest${NC}"
echo -e "${GREEN} ${remote}:${version}${NC}"
done
echo ""
echo -e "${GREEN}All images published (version ${version}).${NC}"
}
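The publish loop pairs two index-aligned arrays (image names and their Dockerfiles) via `"${!names[@]}"`. The pattern in isolation, with the same array contents as above:

```shell
#!/usr/bin/env bash
# Iterate two parallel arrays by shared index, as publish_images does.
names=(builder student hub)
dockerfiles=(Dockerfile.builder Dockerfile Dockerfile.hub)
for i in "${!names[@]}"; do
    # "${!names[@]}" expands to the indices 0 1 2, keeping both arrays in step.
    echo "${names[$i]} -> ${dockerfiles[$i]}"
done
```

This prints one `name -> Dockerfile` line per image, which is how each buildx invocation gets its matching `--file` argument.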
# ---------------------------------------------------------------------------
# Builder container (for website / docs)
# ---------------------------------------------------------------------------
run_in_builder() {
docker run --rm \
-v "${SCRIPT_DIR}:/workspace" \
@@ -164,42 +321,39 @@ run_in_builder() {
bash -c "$1"
}
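`run_in_builder` hands its single argument to `bash -c`, so callers supply a whole multi-line script as one quoted string. A Docker-free sketch of the same calling convention (`run_in_sandbox` is a hypothetical stand-in, not part of the script):

```shell
#!/usr/bin/env bash
# Same shape as run_in_builder: the entire command sequence is one quoted
# string, executed by a child bash.
run_in_sandbox() {
    bash -c "$1"
}

run_in_sandbox '
    set -e
    echo "step 1"
    echo "step 2"
'
```

Single quotes around the payload matter: they keep `$`-expansions and command substitutions from being evaluated by the outer shell before the container (or child shell) sees them.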
# ---------------------------------------------------------------------------
# Stack management
# ---------------------------------------------------------------------------
stop_stack() {
echo -e "${BLUE}Stopping existing containers...${NC}"
HUB_IMAGE="$HUB_IMAGE" STUDENT_IMAGE="$STUDENT_IMAGE" \
${dockercompose} down 2>/dev/null || true
echo -e "${BLUE}Cleaning up student containers...${NC}"
docker ps -aq --filter name=jupyter- | xargs -r docker rm -f 2>/dev/null || true
}
build_website() {
echo ""
echo -e "${BLUE}Building web site (in builder container)...${NC}"
run_in_builder '
set -e
echo "-> Detecting and installing R dependencies..."
Rscript /workspace/tools/install_quarto_deps.R /workspace/web_src
echo "-> Rendering Quarto site..."
cd /workspace/web_src
quarto render
find . -name "*.pdf" -print | while read pdfname; do
dest="/workspace/jupyterhub_volumes/web/pages/${pdfname}"
dirdest=$(dirname "$dest")
mkdir -p "$dirdest"
cp "$pdfname" "$dest"
done
python3 /workspace/tools/generate_pdf_galleries.py
python3 /workspace/tools/generate_pages_json.py
'
}
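The PDF loop in `build_website` recreates each file's relative directory under the destination before copying. The same technique on throwaway temp directories (paths here are illustrative, not the script's real volumes):

```shell
#!/usr/bin/env bash
# Mirror files matching a pattern into a destination, preserving relative paths.
src=$(mktemp -d); dst=$(mktemp -d)
mkdir -p "$src/a/b"
touch "$src/a/b/slides.pdf"
cd "$src"
find . -name "*.pdf" -print | while read -r name; do
    dest="$dst/$name"
    mkdir -p "$(dirname "$dest")"   # create the mirrored subdirectory first
    cp "$name" "$dest"
done
ls "$dst/a/b"   # prints: slides.pdf
```

Because `find` emits paths relative to the current directory (`./a/b/slides.pdf`), concatenating them onto the destination root reproduces the source tree exactly.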
build_obidoc() {
@@ -242,32 +396,11 @@ build_obidoc() {
'
}
start_stack() {
echo ""
echo -e "${BLUE}Starting JupyterHub...${NC}"
HUB_IMAGE="$HUB_IMAGE" STUDENT_IMAGE="$STUDENT_IMAGE" \
${dockercompose} up -d --remove-orphans
echo ""
echo -e "${YELLOW}Waiting for JupyterHub to start...${NC}"
@@ -276,13 +409,19 @@ start_stack() {
print_success() {
if docker ps | grep -q jupyterhub; then
local version
version=$(read_version 2>/dev/null || echo "?")
echo ""
echo -e "${GREEN}JupyterHub is running!${NC}"
echo -e "${GREEN}JupyterHub is running! (version ${version})${NC}"
echo ""
echo "-------------------------------------------"
echo -e "${GREEN}JupyterHub available at: http://localhost:8888${NC}"
echo "-------------------------------------------"
echo ""
echo "Images in use:"
echo " Hub: ${HUB_IMAGE}"
echo " Student: ${STUDENT_IMAGE}"
echo ""
echo "Password: metabar2025"
echo "Students can connect with any username"
echo ""
@@ -309,6 +448,38 @@ print_success() {
fi
}
# ---------------------------------------------------------------------------
# Setup volume directories
# ---------------------------------------------------------------------------
echo "Starting JupyterHub for Lab"
echo "=============================="
echo ""
echo -e "${BLUE}Building the volume directories...${NC}"
pushd "${SCRIPT_DIR}/jupyterhub_volumes" >/dev/null
mkdir -p caddy/data
mkdir -p caddy/config
mkdir -p course/bin
mkdir -p course/R_packages
mkdir -p jupyterhub
mkdir -p shared
mkdir -p users
mkdir -p web/obidoc
mkdir -p builder/R_packages
popd >/dev/null
pushd "${DOCKER_DIR}" >/dev/null
if [ ! -f "Dockerfile" ] || [ ! -f "docker-compose.yml" ]; then
echo "Error: Run this script from the OBIJupyterHub directory"
exit 1
fi
# ---------------------------------------------------------------------------
# Main flow
# ---------------------------------------------------------------------------
if $STOP_SERVER; then
stop_stack
popd >/dev/null
@@ -316,15 +487,29 @@ if $STOP_SERVER; then
fi
if $UPDATE_LECTURES; then
if $LOCAL_BUILD; then
build_builder_image
elif ! $NO_BUILD; then
docker pull "$BUILDER_IMAGE" 2>/dev/null \
|| echo -e "${YELLOW}Could not pull builder image, using local cache.${NC}"
fi
build_website
popd >/dev/null
exit 0
fi
stop_stack
if $PUBLISH; then
publish_images
pull_images # pull the freshly published images into the local daemon
elif $LOCAL_BUILD; then
build_builder_image
build_images
else
pull_images # default: pull from registry
fi
build_website
build_obidoc
start_stack
+109 -26
@@ -1,17 +1,15 @@
#!/usr/bin/env Rscript
# Script to dynamically detect and install R dependencies from Quarto files.
# Scans library()/require() calls and remotes::install_git/github() calls.
args <- commandArgs(trailingOnly = TRUE)
quarto_dir <- if (length(args) > 0) args[1] else "."
# Target library for installing packages (the mounted volume)
target_lib <- "/usr/local/lib/R/site-library"
cat("Scanning Quarto files in:", quarto_dir, "\n")
cat("Target library:", target_lib, "\n")
# Find all .qmd files
qmd_files <- list.files(
path = quarto_dir,
pattern = "\\.qmd$",
@@ -26,34 +24,119 @@ if (length(qmd_files) == 0) {
cat("Found", length(qmd_files), "Quarto files\n")
# Extract package names from library()/require() calls
extract_cran_packages <- function(files) {
pattern <- "(?:library|require)\\s*\\(\\s*['\"]?([A-Za-z0-9._]+)['\"]?"
pkgs <- character(0)
for (f in files) {
lines <- tryCatch(readLines(f, warn = FALSE), error = function(e) character(0))
m <- regmatches(lines, gregexpr(pattern, lines, perl = TRUE))
hits <- unlist(m)
if (length(hits) > 0) {
extracted <- sub(
"(?:library|require)\\s*\\(\\s*['\"]?([A-Za-z0-9._]+)['\"]?.*",
"\\1", hits, perl = TRUE
)
pkgs <- c(pkgs, extracted)
}
}
unique(pkgs)
}
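The capture pattern above can be sanity-checked outside R; this grep/sed approximation (not part of the script, and only roughly equivalent) shows what it extracts from typical lines:

```shell
#!/usr/bin/env bash
# Rough shell equivalent of extract_cran_packages' regex, for inspection only.
printf '%s\n' \
  'library(dplyr)' \
  'require("ggplot2")' \
  'x <- 1  # no call here' \
| grep -oE "(library|require)[[:space:]]*\([[:space:]]*['\"]?[A-Za-z0-9._]+" \
| sed -E "s/.*\([[:space:]]*['\"]?//"
# -> prints:
# dplyr
# ggplot2
```

Like the R version, it tolerates optional whitespace and an optional quote before the package name, and ignores lines with no `library()`/`require()` call.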
cat("\nDetected R packages:\n")
cat(paste(" -", deps, collapse = "\n"), "\n\n")
# Extract git/github URLs from remotes::install_git/github() calls
extract_git_packages <- function(files) {
# Matches remotes::install_git('url') or remotes::install_github('user/repo')
pattern <- "remotes::install_(git|github)\\s*\\(\\s*['\"]([^'\"]+)['\"]"
result <- list()
for (f in files) {
lines <- tryCatch(readLines(f, warn = FALSE), error = function(e) character(0))
text <- paste(lines, collapse = "\n")
m <- gregexpr(pattern, text, perl = TRUE)
hits <- regmatches(text, m)[[1]]
for (hit in hits) {
type <- sub("remotes::install_(git|github).*", "\\1", hit, perl = TRUE)
url <- sub("remotes::install_(?:git|github)\\s*\\(\\s*['\"]([^'\"]+)['\"].*",
"\\1", hit, perl = TRUE)
result[[length(result) + 1]] <- list(type = type, url = url)
}
}
result
}
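An analogous approximation for the git/github pattern (again outside R, for a quick check only; the URLs below are made up):

```shell
#!/usr/bin/env bash
# Rough shell equivalent of extract_git_packages' regex: pull out the quoted
# repo spec or URL from remotes::install_git/install_github calls.
printf '%s\n' \
  "remotes::install_github('user/repo')" \
  'remotes::install_git("https://example.org/pkg.git")' \
| grep -oE "remotes::install_(git|github)[[:space:]]*\([[:space:]]*['\"][^'\"]+" \
| sed -E "s/.*['\"]//"
# -> prints:
# user/repo
# https://example.org/pkg.git
```

The R function additionally records whether each hit was `git` or `github`, since the two `remotes::` installers take differently shaped arguments.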
cran_deps <- extract_cran_packages(qmd_files)
git_deps <- extract_git_packages(qmd_files)
# Quarto's implicit runtime dependencies — must be in target_lib (the persistent
# volume), not just somewhere in libPaths, because Quarto spawns its own R session.
quarto_required <- c("rmarkdown", "knitr")
if (length(git_deps) > 0) quarto_required <- c(quarto_required, "remotes")
cat("\nDetected CRAN packages:\n")
cat(paste(" -", unique(c(quarto_required, cran_deps)), collapse = "\n"), "\n")
if (length(git_deps) > 0) {
cat("\nDetected git/github packages:\n")
for (d in git_deps) cat(" -", d$type, ":", d$url, "\n")
}
cat("\n")
# --- Install CRAN packages ---
# Filter out base R packages that are always available
base_pkgs <- rownames(installed.packages(priority = "base"))
# quarto_required: check only in target_lib so they are guaranteed to be there
installed_in_target <- rownames(installed.packages(lib.loc = target_lib))
quarto_missing <- setdiff(quarto_required, c(base_pkgs, installed_in_target))
# other deps: check anywhere in libPaths (they just need to be loadable)
cran_deps <- setdiff(cran_deps, c(base_pkgs, quarto_required))
installed <- rownames(installed.packages())
to_install <- unique(c(quarto_missing, setdiff(cran_deps, installed)))
if (length(to_install) == 0) {
cat("All required packages are already installed.\n")
cat("All CRAN packages already installed.\n")
} else {
cat("Installing missing packages:", paste(to_install, collapse = ", "), "\n\n")
install.packages(
to_install,
lib = target_lib,
repos = "https://cloud.r-project.org/",
dependencies = TRUE
)
cat("\nPackage installation complete.\n")
cat("Installing CRAN packages:", paste(to_install, collapse = ", "), "\n\n")
failed <- character(0)
for (pkg in to_install) {
result <- tryCatch({
withCallingHandlers(
install.packages(pkg, lib = target_lib, repos = "https://cloud.r-project.org/",
dependencies = TRUE, quiet = FALSE),
warning = function(w) {
if (grepl("not available", conditionMessage(w))) invokeRestart("muffleWarning")
}
)
if (!requireNamespace(pkg, quietly = TRUE)) "unavailable" else "ok"
}, error = function(e) "error")
if (result %in% c("unavailable", "error")) {
cat(" [SKIP]", pkg, "- not available on CRAN\n")
failed <- c(failed, pkg)
} else {
cat(" [OK]", pkg, "\n")
}
}
if (length(failed) > 0)
cat("\nNot installed (not on CRAN):", paste(failed, collapse = ", "), "\n")
}
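Each package above is installed inside its own `tryCatch`, so a single failure cannot abort the whole loop, and failures are collected for a final summary. That control flow, reduced to a Bash sketch (`install_one` is a made-up stand-in for `install.packages`, and `badpkg` an invented failing input):

```shell
#!/usr/bin/env bash
# Per-item failure isolation: attempt each item, record failures, summarize.
install_one() { [ "$1" != "badpkg" ]; }   # stand-in: fails only for "badpkg"

failed=()
for pkg in jsonlite badpkg ggplot2; do
    if install_one "$pkg"; then
        echo " [OK] $pkg"
    else
        echo " [SKIP] $pkg"
        failed+=("$pkg")
    fi
done
[ "${#failed[@]}" -gt 0 ] && echo "Not installed: ${failed[*]}"
```

The point of the pattern, here as in the R script, is that one unavailable package degrades the run instead of killing it, while the summary still makes the failure visible.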
# --- Install git/github packages ---
if (length(git_deps) > 0) {
cat("\nInstalling git/github packages...\n")
for (d in git_deps) {
tryCatch({
if (d$type == "git") {
remotes::install_git(d$url, lib = target_lib, upgrade = "never")
} else {
remotes::install_github(d$url, lib = target_lib, upgrade = "never")
}
cat(" [OK]", d$url, "\n")
}, error = function(e) {
cat(" [FAIL]", d$url, "-", conditionMessage(e), "\n")
})
}
}
cat("\nDependency installation complete.\n")
+1
@@ -0,0 +1 @@
0.1.0