# JupyterHub Configuration with OrbStack on Mac (all in Docker)
## Prerequisites
- OrbStack installed and running
## File Structure
Your `~/jupyterhub-tp` directory should contain:
```
~/jupyterhub-tp/
├── Dockerfile              # Image for students (already created)
├── Dockerfile.hub          # Image for JupyterHub (new)
├── jupyterhub_config.py    # Configuration
├── docker-compose.yml      # Orchestration
└── start-jupyterhub.sh     # Startup script
```
## Installation Steps
### 1. Create Directory Structure
```bash
mkdir -p ~/jupyterhub-tp
cd ~/jupyterhub-tp
```
### 2. Create All Necessary Files
Create the following files with the content from artifacts:
- `Dockerfile` (artifact "Dockerfile for JupyterHub with R and Bash")
- `Dockerfile.hub` (artifact "Dockerfile for JupyterHub container")
- `jupyterhub_config.py` (artifact "JupyterHub Configuration")
- `docker-compose.yml` (artifact "docker-compose.yml")
- `start-jupyterhub.sh` (artifact "start-jupyterhub.sh")
### 3. Make Startup Script Executable
```bash
chmod +x start-jupyterhub.sh
```
### 4. Start JupyterHub
```bash
./start-jupyterhub.sh
```
### 5. Access JupyterHub
Open your browser and go to: **http://localhost:8000**
Log in with any username and the shared student password `metabar2025`.
## Useful Commands
### View JupyterHub logs
```bash
docker-compose logs -f jupyterhub
```
### View all containers (hub + students)
```bash
docker ps
```
### Stop JupyterHub
```bash
docker-compose down
```
### Restart JupyterHub (after config modification)
```bash
docker-compose restart jupyterhub
```
### Rebuild after Dockerfile modification
```bash
# For student image
docker build -t jupyterhub-student:latest -f Dockerfile .
docker-compose restart jupyterhub

# For hub image
docker-compose up -d --build
```
### View logs for a specific student
```bash
docker logs jupyter-username
```
### Clean up after lab
```bash
# Stop and remove all containers
docker-compose down

# Remove student containers
docker ps -a | grep jupyter- | awk '{print $1}' | xargs docker rm -f

# Remove volumes (WARNING: deletes student data)
docker volume ls | grep jupyterhub-user | awk '{print $2}' | xargs docker volume rm

# Clean everything (containers + volumes + network)
docker-compose down -v
docker ps -a | grep jupyter- | awk '{print $1}' | xargs docker rm -f
docker volume prune -f
```
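
The `grep | awk | xargs` pipelines above depend on the column layout of `docker ps`, and with GNU `xargs` they call `docker rm` with no arguments when nothing matches. A minimal sketch of an alternative, assuming student containers keep the `jupyter-` name prefix used in this README. The function name `remove_student_containers` is illustrative and not part of this repository:

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove every container whose name contains "jupyter-".
# Assumes student containers are named jupyter-<username>, as in this README.
remove_student_containers() {
  local ids
  # -a: include stopped containers, -q: print container IDs only
  ids="$(docker ps -aq --filter "name=jupyter-")"
  if [ -z "$ids" ]; then
    echo "No student containers found."
    return 0
  fi
  # Remove the containers one ID at a time
  printf '%s\n' "$ids" | while IFS= read -r id; do
    docker rm -f "$id"
  done
}
```

After pasting or sourcing the snippet, run `remove_student_containers`. It prints a message and does nothing when no student container exists.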
## Managing Shared Data
### Directory Structure for Each Student
Each student will see these directories in their JupyterLab:
- **`work/`** : Personal workspace (persistent, private)
- **`shared/`** : Shared workspace between all students (read/write)
- **`course/`** : Course files (read-only, you deposit files)
- **`course/R_packages/`** : Shared R packages (read-only for students, only admin can install)
### User Accounts

**Admin Account:**
- Username: `admin`
- Password: `admin2025` (change in docker-compose.yml: `JUPYTERHUB_ADMIN_PASSWORD`)
- Can write to `course/` directory

**Student Accounts:**
- Username: any name
- Password: `metabar2025` (change in docker-compose.yml: `JUPYTERHUB_PASSWORD`)
- Read-only access to `course/` directory
### Installing R Packages (Admin Only)
**From your Mac (recommended):**
```bash
chmod +x install-r-packages-admin.sh
# Install packages
./install-r-packages-admin.sh reshape2 plotly knitr
```
This script:
- Installs packages in the `course/R_packages/` directory
- Makes them available to all students (read-only)
- Requires no image rebuild

**From admin notebook:**

Log in as `admin` and create an R notebook:
```r
# Install packages in course directory (admin only)
course_lib <- "/home/jovyan/course/R_packages"
dir.create(course_lib, recursive = TRUE, showWarnings = FALSE)
install.packages(c('reshape2', 'plotly', 'knitr'),
                 lib = course_lib,
                 repos = 'http://cran.rstudio.com/')
```
Note: Admin account has write access to the course directory.
### Using R Packages (Students)
Students simply load packages normally:
```r
library(reshape2) # Loads from course/R_packages/ automatically
library(plotly)
```
R automatically finds packages in `/home/jovyan/course/R_packages/` thanks to the `R_LIBS_USER` environment variable.
### List Available Packages
```r
# List all available packages
installed.packages()[,"Package"]
# Or check course packages specifically
list.files("/home/jovyan/course/R_packages")
```
### Deposit Files for Course
To put files in the `course/` directory (accessible read-only):
```bash
# Create a temporary directory
mkdir -p ~/jupyterhub-tp/course-files

# Copy your files into it
cp my_notebooks.ipynb ~/jupyterhub-tp/course-files/
cp my_data.csv ~/jupyterhub-tp/course-files/

# Copy into Docker volume
docker run --rm \
  -v jupyterhub-course:/target \
  -v ~/jupyterhub-tp/course-files:/source \
  alpine sh -c "cp -r /source/* /target/"
```
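
The copy step can be wrapped in a small function that also lists the volume afterwards, so you can confirm what students will see. This is only a sketch: the name `deposit_course_files` is hypothetical, and it assumes the `jupyterhub-course` volume name used above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copy a host directory into the jupyterhub-course volume,
# then list the volume contents through a read-only mount.
deposit_course_files() {
  local src="$1"
  if [ ! -d "$src" ]; then
    echo "Not a directory: $src" >&2
    return 1
  fi
  # Bind mounts need an absolute path; a relative one would be read as a volume name
  src="$(cd "$src" && pwd)"
  # "/source/." also copies hidden files, which "/source/*" skips
  docker run --rm \
    -v jupyterhub-course:/target \
    -v "$src":/source:ro \
    alpine sh -c 'cp -r /source/. /target/' || return 1
  # Show the result as a check
  docker run --rm -v jupyterhub-course:/target:ro alpine ls -la /target
}
```

For example, `deposit_course_files ~/jupyterhub-tp/course-files` copies the staging directory and prints the resulting file list.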
### Access Shared Files Between Students
Students can collaborate via the `shared/` directory:
```python
# In a notebook, to read a shared file
import pandas as pd
df = pd.read_csv('/home/jovyan/shared/group_data.csv')

# To write a shared file
df.to_csv('/home/jovyan/shared/alice_results.csv')
```
### Retrieve Student Work
```bash
# List user volumes
docker volume ls | grep jupyterhub-user

# Copy files from a specific student
# (create the per-student directory first, otherwise cp has no target directory)
docker run --rm \
  -v jupyterhub-user-alice:/source \
  -v ~/submissions:/target \
  alpine sh -c "mkdir -p /target/alice && cp -r /source/* /target/alice/"

# Copy all shared work
docker run --rm \
  -v jupyterhub-shared:/source \
  -v ~/submissions/shared:/target \
  alpine sh -c "cp -r /source/* /target/"
```
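
To collect every student's work in one pass rather than one volume at a time, the per-student command can be looped over all matching volumes. A minimal sketch, assuming volumes follow the `jupyterhub-user-<name>` pattern shown above. The function name `collect_student_work` is illustrative:

```bash
#!/usr/bin/env bash
# Hypothetical helper: export every jupyterhub-user-<name> volume to <dest>/<name>.
collect_student_work() {
  local dest="$1" vol user
  if [ -z "$dest" ]; then
    echo "usage: collect_student_work <destination-directory>" >&2
    return 1
  fi
  mkdir -p "$dest" || return 1
  dest="$(cd "$dest" && pwd)"   # bind mounts need an absolute path
  # -q prints volume names only; the filter matches names containing the prefix
  docker volume ls -q --filter "name=jupyterhub-user-" | while IFS= read -r vol; do
    user="${vol#jupyterhub-user-}"   # strip the prefix to get the username
    mkdir -p "$dest/$user"
    # The volume is mounted read-only so the export cannot alter student data
    docker run --rm \
      -v "$vol":/source:ro \
      -v "$dest/$user":/target \
      alpine sh -c 'cp -r /source/. /target/' </dev/null
    echo "Exported $vol to $dest/$user"
  done
}
```

Running `collect_student_work ~/submissions` would produce one subdirectory per student under `~/submissions`.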
## User Management
### Option 1: Predefined User List
In `jupyterhub_config.py`, uncomment and modify:
```python
c.Authenticator.allowed_users = {'student1', 'student2', 'student3'}
```
### Option 2: Allow Everyone (for testing)
By default, the configuration allows any user:
```python
c.Authenticator.allow_all = True
```
⚠️ **Warning**: DummyAuthenticator is ONLY for local testing!
## Kernel Verification
Once logged in, create a new notebook and verify you have access to:
- **Python 3** (default kernel)
- **R** (R kernel)
- **Bash** (bash kernel)
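
The same check can be run from your Mac before any student logs in, by listing the kernelspecs inside the student image. This is a sketch under assumptions: the kernel names `python3`, `ir`, and `bash` are the usual names registered by ipykernel, IRkernel, and bash_kernel, but they are not confirmed by this repository, and `check_kernels` is a hypothetical function name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: verify that the student image registers the expected kernels.
# Assumed kernel names: python3 (ipykernel), ir (IRkernel), bash (bash_kernel).
check_kernels() {
  local image="${1:-jupyterhub-student:latest}" listing name missing=0
  listing="$(docker run --rm "$image" jupyter kernelspec list)" || return 1
  for name in python3 ir bash; do
    # Kernel names appear in the first column of the listing
    if printf '%s\n' "$listing" | awk '{print $1}' | grep -qx "$name"; then
      echo "OK      $name"
    else
      echo "MISSING $name"
      missing=1
    fi
  done
  return "$missing"
}
```

`check_kernels` returns a non-zero status when a kernel is missing, so it can gate a startup script.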
## Customization for Your Labs
### Add Additional R Packages
Modify the `Dockerfile` (before `USER ${NB_UID}`):
```dockerfile
RUN R -e "install.packages(c('your_package'), repos='http://cran.rstudio.com/')"
```
Then rebuild:
```bash
docker build -t jupyterhub-student:latest -f Dockerfile .
docker-compose restart jupyterhub
```
### Add Python Packages
Add to the `Dockerfile` (before `USER ${NB_UID}`):
```dockerfile
RUN pip install numpy pandas matplotlib seaborn
```
### Distribute Files to Students
Create a `files_lab/` directory and add to the `Dockerfile`:
```dockerfile
COPY files_lab/ /home/${NB_USER}/lab/
RUN chown -R ${NB_UID}:${NB_GID} /home/${NB_USER}/lab
```
### Change Port (if 8000 is occupied)
Modify in `docker-compose.yml`:
```yaml
ports:
- "8001:8000" # Accessible on localhost:8001
```
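
Before editing the mapping, it can help to confirm that port 8000 is really taken and to find a free one. A minimal sketch using `lsof`, which ships with macOS. The function names `port_in_use` and `first_free_port` are illustrative:

```bash
#!/usr/bin/env bash
# Hypothetical helpers for picking the host side of the "host:8000" mapping.

# Succeeds if some process is listening on the given TCP port.
port_in_use() {
  # -nP: skip name lookups, -iTCP:<port>: match this port, -sTCP:LISTEN: listeners only
  lsof -nP -iTCP:"$1" -sTCP:LISTEN >/dev/null 2>&1
}

# Prints the first port at or above the given one (default 8000) with no listener.
first_free_port() {
  local port="${1:-8000}"
  while port_in_use "$port"; do
    port=$((port + 1))
  done
  echo "$port"
}
```

The value printed by `first_free_port 8000` is the number to put on the left of the colon in `docker-compose.yml`. The right side stays `8000`, the port JupyterHub listens on inside the container.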
## Advantages of This Approach
- **Everything in Docker**: No need to install Python/JupyterHub on your Mac
- **Portable**: Easy to deploy on another Mac or server
- **Isolated**: No pollution of your system environment
- **Easy to Clean**: A simple `docker-compose down` is enough
- **Reproducible**: Students will have exactly the same environment
## Troubleshooting
**Error "Cannot connect to Docker daemon"**:
- Check that OrbStack is running
- Verify the socket exists: `ls -la /var/run/docker.sock`

**Student containers don't start**:
- Check logs: `docker-compose logs jupyterhub`
- Verify student image exists: `docker images | grep jupyterhub-student`

**Port 8000 already in use**:
- Change port in `docker-compose.yml`

**After config modification, changes are not applied**:
```bash
docker-compose restart jupyterhub
```
**I want to start from scratch**:
```bash
docker-compose down -v
docker rmi jupyterhub-hub jupyterhub-student
# Then rebuild everything
./start-jupyterhub.sh
```