Merge pull request 'First complete version' (#4) from push-qsttonrunzsp into master

Reviewed-on: #4
This commit was merged in pull request #4.
2025-10-16 18:49:32 +00:00
24 changed files with 1265 additions and 110 deletions

4
.gitignore vendored

@@ -1,4 +1,8 @@
/Affinity
/jupyterhub_volumes/users /jupyterhub_volumes/users
/jupyterhub_volumes/shared /jupyterhub_volumes/shared
/jupyterhub_volumes/jupyterhub /jupyterhub_volumes/jupyterhub
/jupyterhub_volumes/caddy /jupyterhub_volumes/caddy
/**/.DS_Store
/.luarc.json

3
.vscode/settings.json vendored Normal file

@@ -0,0 +1,3 @@
{
"git.enabled": false
}

170
Readme.md

@@ -6,8 +6,6 @@ You must have docker running on your computer
- On MacOS, [OrbStack](https://orbstack.dev/ "A Docker implementation optimised for MacOS") is recommended - On MacOS, [OrbStack](https://orbstack.dev/ "A Docker implementation optimised for MacOS") is recommended
##
## Installation Steps ## Installation Steps
### 1. Create Directory Structure ### 1. Create Directory Structure
@@ -16,21 +14,62 @@ You must have docker running on your computer
git clone https://forge.metabarcoding.org/MetabarcodingSchool/OBIJupyterHub.git git clone https://forge.metabarcoding.org/MetabarcodingSchool/OBIJupyterHub.git
``` ```
Enter into the `OBIJupyterHub` directory
```bash
cd OBIJupyterHub
```
#### File Structure #### File Structure
Your `~/OBIJupyterHub` directory should contain: Your `OBIJupyterHub` directory should contain:
``` ```
~/OBIJupyterHub/ OBIJupyterHub
├── Dockerfile # Image for students (already created) ├── start-jupyterhub.sh - The script used to setup and start the server
├── Dockerfile.hub # Image for JupyterHub (new) ├── obijupyterhub - The files describing the docker images and the stack
├── jupyterhub_config.py # Configuration │   ├── Caddyfile
├── docker-compose.yml # Orchestration │   ├── docker-compose.yml
└── start-jupyterhub.sh # Startup script │   ├── Dockerfile
│   ├── Dockerfile.hub
│   ├── jupyterhub_config.py
│   ├── sftpgo_config.json
│   └── start-notebook.sh
├── jupyterhub_volumes - The directory containing the docker volumes
│   ├── caddy
│   ├── course - Read only volume mounted on every student container
│   │   ├── bin
│   │   └── R_packages
│   ├── jupyterhub
│   ├── shared - Read write volume shared in every student container
│   ├── users
│   └── web
│   ├── img
│   │   └── welcome_metabar.webp
│   ├── index.html
│   └── pages
├── Readme.md - This documentation
├── tools
│   ├── generate_pages_json.py
│   └── install_packages.sh
└─── web_src - The quarto document sources used to build the web site
   ├── _output
   ├── _quarto.yml
   ├── 00_home.qmd
   ├── lectures
   │   └── computers
   │   └── regex
   │   ├── lecture_regex.qmd
   │   ├── slides_regex.qmd
   │   └── slides.css
   └── scripts
   └── copy-to-web.sh
``` ```
### 2. Start JupyterHub ### 2. Start JupyterHub
From the terminal, in the `OBIJupyterHub` directory, run the following command:
``` bash ``` bash
./start-jupyterhub.sh ./start-jupyterhub.sh
``` ```
@@ -39,13 +78,14 @@ Your `~/OBIJupyterHub` directory should contain:
Open your browser and go to: **http://localhost:8888** Open your browser and go to: **http://localhost:8888**
You can log in with any username and password: `metabar2025` You can log in as a student with any username and password: `metabar2025`
## Useful Commands ## Useful Commands
### View JupyterHub logs ### View JupyterHub logs
``` bash ``` bash
cd obijupyterhub
docker-compose logs -f jupyterhub docker-compose logs -f jupyterhub
``` ```
@@ -58,26 +98,17 @@ docker ps | grep jupyterhub
### Stop JupyterHub ### Stop JupyterHub
``` bash ``` bash
cd obijupyterhub
docker-compose down docker-compose down
``` ```
### Restart JupyterHub (after config modification) ### Restart JupyterHub (after config modification)
``` bash ``` bash
cd obijupyterhub
docker-compose restart jupyterhub docker-compose restart jupyterhub
``` ```
### Rebuild after Dockerfile modification
``` bash
# For student image
docker build -t jupyterhub-student:latest -f Dockerfile .
docker-compose restart jupyterhub
# For hub image
docker-compose up -d --build
```
### View logs for a specific student ### View logs for a specific student
``` bash ``` bash
@@ -90,6 +121,7 @@ Replace `<username>` by the actual user name of the student.
``` bash ``` bash
# Stop and remove all containers # Stop and remove all containers
cd obijupyterhub
docker-compose down docker-compose down
# Remove student containers # Remove student containers
@@ -121,52 +153,46 @@ work/ # Personal workspace root (persistent)
└── [course materials] # Your course files └── [course materials] # Your course files
``` ```
**R Package Priority:** 1. R checks `work/R_packages/` first (personal, writable) 2. Then `work/course/R_packages/` (shared, read-only, installed by prof) 3. Then system libraries **R Package Priority:**
1. R checks `work/R_packages/` first (personal, writable)
1. Then `work/course/R_packages/` (shared, read-only, installed by prof)
1. Then system libraries
**Important:** Everything is under `work/`, so all student files are automatically saved in their persistent volume. **Important:** Everything is under `work/`, so all student files are automatically saved in their persistent volume.
### User Accounts ### User Accounts
**Admin Account:** - Username: `admin` - Password: `admin2025` (change in docker-compose.yml: `JUPYTERHUB_ADMIN_PASSWORD`) - Can write to `course/` directory **Admin Account:**
**Student Accounts:** - Username: any name - Password: `metabar2025` (change in docker-compose.yml: `JUPYTERHUB_PASSWORD`) - Read-only access to `course/` directory - Username: `admin`
- Password: `admin2025` (change in docker-compose.yml: `JUPYTERHUB_ADMIN_PASSWORD`)
- Can write to `course/` directory
**Student Accounts:**
- Username: any name
- Password: `metabar2025` (change in docker-compose.yml: `JUPYTERHUB_PASSWORD`)
- Read-only access to `course/` directory
### Installing R Packages (Admin Only) ### Installing R Packages (Admin Only)
**From your Mac (recommended):** **From your Mac (recommended):**
``` bash ``` bash
chmod +x install-r-packages-admin.sh
# Install packages # Install packages
./install-r-packages-admin.sh reshape2 plotly knitr tools/install_packages.sh reshape2 plotly knitr
``` ```
This script: - Installs packages in the `course/R_packages/` directory - All students can use them (read-only) - No need to rebuild the image This script: - Installs packages in the `course/R_packages/` directory - All students can use them (read-only) - No need to rebuild the image
**From admin notebook:**
Login as `admin` and create an R notebook:
``` r
# Install packages in course/R_packages (admin only, available to all students)
course_lib <- "/home/jovyan/work/course/R_packages"
dir.create(course_lib, recursive = TRUE, showWarnings = FALSE)
install.packages(c('reshape2', 'plotly', 'knitr'),
lib = course_lib,
repos = 'http://cran.rstudio.com/')
```
Note: Admin account has write access to the course directory.
**Students can also install their own packages:** **Students can also install their own packages:**
Students can install packages in their personal `work/R_packages/`: Students can install packages in their personal `work/R_packages/`:
```r ```r
# Install in personal library (each student has their own) # Install in personal library (each student has their own)
install.packages(c('mypackage')) # Will install in work/R_packages/ install.packages('mypackage') # Will install in work/R_packages/
``` ```
### Using R Packages (Students) ### Using R Packages (Students)
@@ -178,7 +204,11 @@ library(reshape2) # R checks: 1) work/R_packages/ 2) work/course/R_packages/ 3)
library(plotly) library(plotly)
``` ```
R automatically searches in this order: 1. Personal packages: `/home/jovyan/work/R_packages/` (R_LIBS_USER) 2. Prof packages: `/home/jovyan/work/course/R_packages/` (R_LIBS_SITE) 3. System packages R automatically searches in this order:
1. Personal packages: `/home/jovyan/work/R_packages/` (R_LIBS_USER)
1. Prof packages: `/home/jovyan/work/course/R_packages/` (R_LIBS_SITE)
1. System packages
### List Available Packages ### List Available Packages
@@ -212,34 +242,21 @@ docker run --rm \
alpine sh -c "cp -r /source/* /target/" alpine sh -c "cp -r /source/* /target/"
``` ```
### Access Shared Files Between Students
Students can collaborate via the `shared/` directory:
``` python
# In a notebook, to read a shared file
import pandas as pd
df = pd.read_csv('/home/jovyan/work/shared/group_data.csv')
# To write a shared file
df.to_csv('/home/jovyan/work/shared/alice_results.csv')
```
### Retrieve Student Work ### Retrieve Student Work
``` bash ``` bash
# List user volumes # List user volumes
docker volume ls | grep jupyterhub-user docker volume ls | grep 'obijupyterhub_user-'
# Copy files from a specific student # Copy files from a specific student
docker run --rm \ docker run --rm \
-v jupyterhub-user-alice:/source \ -v obijupyterhub_user-alice:/source \
-v ~/submissions:/target \ -v ~/submissions:/target \
alpine sh -c "cp -r /source/* /target/alice/" alpine sh -c "cp -r /source/* /target/alice/"
# Copy all shared work # Copy all shared work
docker run --rm \ docker run --rm \
-v jupyterhub-shared:/source \ -v obijupyterhub_shared:/source \
-v ~/submissions/shared:/target \ -v ~/submissions/shared:/target \
alpine sh -c "cp -r /source/* /target/" alpine sh -c "cp -r /source/* /target/"
``` ```
@@ -266,7 +283,11 @@ c.Authenticator.allow_all = True
## Kernel Verification ## Kernel Verification
Once logged in, create a new notebook and verify you have access to: - **Python 3** (default kernel) - **R** (R kernel) - **Bash** (bash kernel) Once logged in, create a new notebook and verify you have access to:
- **Python 3** (default kernel)
- **R** (R kernel)
- **Bash** (bash kernel)
## Customization for Your Labs ## Customization for Your Labs
@@ -278,11 +299,10 @@ Modify the `Dockerfile` (before `USER ${NB_UID}`):
RUN R -e "install.packages(c('your_package'), repos='http://cran.rstudio.com/')" RUN R -e "install.packages(c('your_package'), repos='http://cran.rstudio.com/')"
``` ```
Then rebuild: Then restart the server (it rebuilds the images if needed):
```bash ```bash
docker build -t jupyterhub-student:latest -f Dockerfile . ./start-jupyterhub.sh
docker-compose restart jupyterhub
``` ```
### Add Python Packages ### Add Python Packages
@@ -321,23 +341,29 @@ ports:
## Troubleshooting ## Troubleshooting
**Error "Cannot connect to Docker daemon"**: - Check that OrbStack is running - Verify the socket exists: `ls -la /var/run/docker.sock` **Error "Cannot connect to Docker daemon"**:
**Student containers don't start**: - Check logs: `docker-compose logs jupyterhub` - Verify student image exists: `docker images | grep jupyterhub-student` - Check that OrbStack is running
- Verify the socket exists: `ls -la /var/run/docker.sock`
**Port 8000 already in use**: - Change port in `docker-compose.yml` **Student containers don't start**:
**After config modification, changes are not applied**: - Check logs: `docker-compose logs jupyterhub`
- Verify student image exists: `docker images | grep jupyterhub-student`
**Port 8000 already in use**:
- Change port in `docker-compose.yml`
``` bash
docker-compose restart jupyterhub
```
**I want to start from scratch**: **I want to start from scratch**:
``` bash ``` bash
pushd obijupyterhub
docker-compose down -v docker-compose down -v
docker rmi jupyterhub-hub jupyterhub-student docker rmi jupyterhub-hub jupyterhub-student
popd
# Then rebuild everything # Then rebuild everything
./start-jupyterhub.sh ./start-jupyterhub.sh
``` ```

3
jupyterhub_volumes/web/.gitignore vendored Normal file

@@ -0,0 +1,3 @@
/.quarto/
**/*.quarto_ipynb
/pages/

Binary file not shown (size: 217 KiB).


@@ -1,13 +1,205 @@
<!DOCTYPE html> <!DOCTYPE html>
<html> <html lang="en">
<head> <head>
<title>Course Portal</title> <meta charset="UTF-8">
<title>DNA Metabarcoding Learning Server</title>
<style>
body {
margin: 0;
font-family: sans-serif;
display: flex;
height: 100vh;
overflow: hidden;
}
/* Sidebar */
nav {
width: 250px;
background-color: #2c3e50;
color: white;
display: flex;
flex-direction: column;
justify-content: space-between;
padding: 20px 0;
overflow-y: auto;
}
nav ul {
list-style: none;
padding-left: 15px;
margin: 0;
}
nav li {
margin: 4px 0;
}
nav a {
color: white;
text-decoration: none;
display: block;
padding: 4px 8px;
border-radius: 4px;
font-weight: 500;
}
nav a:hover {
background-color: #34495e;
}
/* Toggle icons */
.folder {
cursor: pointer;
font-weight: bold;
display: flex;
align-items: center;
}
.folder::before {
content: "▸";
display: inline-block;
margin-right: 6px;
transition: transform 0.2s ease;
}
.folder.open::before {
transform: rotate(90deg);
}
ul.collapsed {
display: none;
}
/* Admin links */
nav .admin-links {
border-top: 1px solid #34495e;
margin-top: 10px;
padding-top: 10px;
}
/* Main content */
main {
flex: 1;
display: flex;
flex-direction: column;
overflow: hidden;
align-items: center;
background-color: #f7f7f7;
}
header img {
width: 100%;
max-width: 1000px;
height: auto;
display: block;
}
iframe#content-frame {
flex: 1;
width: 100%;
border: none;
max-width: 1000px;
background-color: white;
}
nav::-webkit-scrollbar {
width: 8px;
}
nav::-webkit-scrollbar-thumb {
background-color: #34495e;
border-radius: 4px;
}
</style>
</head> </head>
<body> <body>
<h1>Welcome</h1> <nav>
<ul> <ul id="nav-menu"></ul>
<li><a href="/jupyter/">JupyterHub</a></li> <ul class="admin-links">
<li><a href="/static/">Static Site</a></li> <li><a href="/jupyter/" target="_blank">JupyterHub</a></li>
<li><a href="/sftp/" target="_blank">Data Admin</a></li>
</ul> </ul>
</nav>
<main>
<header>
<img src="img/welcome_metabar.webp" alt="Welcome Banner">
</header>
<iframe id="content-frame" src=""></iframe>
</main>
<script>
const iframe = document.getElementById("content-frame");
const navMenu = document.getElementById("nav-menu");
/**
* Recursively builds the menu from the JSON tree
*/
function buildMenu(items, parent) {
items.forEach(item => {
const li = document.createElement("li");
if (item.children) {
const folder = document.createElement("div");
folder.className = "folder";
folder.textContent = item.label;
const subUl = document.createElement("ul");
subUl.classList.add("collapsed");
folder.addEventListener("click", () => {
folder.classList.toggle("open");
subUl.classList.toggle("collapsed");
});
li.appendChild(folder);
li.appendChild(subUl);
parent.appendChild(li);
buildMenu(item.children, subUl);
} else if (item.file) {
const a = document.createElement("a");
a.href = "#";
a.textContent = item.label;
a.addEventListener("click", e => {
e.preventDefault();
iframe.src = "pages/" + item.file;
history.replaceState(null, null, "#" + item.file);
});
li.appendChild(a);
parent.appendChild(li);
}
});
}
fetch('pages/pages.json')
.then(resp => resp.json())
.then(pages => {
buildMenu(pages, navMenu);
// Load the default page (the first entry without children)
let defaultPage = null;
function findFirstFile(items) {
for (const it of items) {
if (it.file) return it.file;
if (it.children) {
const child = findFirstFile(it.children);
if (child) return child;
}
}
return null;
}
defaultPage = findFirstFile(pages);
if (location.hash) {
iframe.src = "pages/" + location.hash.substring(1);
} else if (defaultPage) {
iframe.src = "pages/" + defaultPage;
}
})
.catch(err => {
console.error("Error loading pages.json", err);
iframe.srcdoc = "<p>Unable to load the content.</p>";
});
</script>
</body> </body>
</html> </html>


@@ -39,6 +39,9 @@ RUN cp $HOME/.cargo/bin/csvlens /usr/local/bin/
RUN mkdir -p /home/${NB_USER}/.local/share/jupyter && \ RUN mkdir -p /home/${NB_USER}/.local/share/jupyter && \
chown -R ${NB_UID}:${NB_GID} /home/${NB_USER} chown -R ${NB_UID}:${NB_GID} /home/${NB_USER}
COPY start-notebook.sh /usr/local/bin/start-notebook.sh
RUN chmod +x /usr/local/bin/start-notebook.sh
# Switch back to Jupyter user # Switch back to Jupyter user
USER ${NB_UID}:${NB_GID} USER ${NB_UID}:${NB_GID}
WORKDIR /home/${NB_USER}/work WORKDIR /home/${NB_USER}/work
@@ -47,3 +50,4 @@ WORKDIR /home/${NB_USER}/work
ENV PATH="/home/${NB_USER}/work/course/bin:${PATH}" ENV PATH="/home/${NB_USER}/work/course/bin:${PATH}"
ENV R_LIBS_USER="/home/${NB_USER}/work/R_packages" ENV R_LIBS_USER="/home/${NB_USER}/work/R_packages"
ENV R_LIBS_SITE="/home/${NB_USER}/work/course/R_packages:/usr/local/lib/R/site-library:/usr/lib/R/site-library" ENV R_LIBS_SITE="/home/${NB_USER}/work/course/R_packages:/usr/local/lib/R/site-library:/usr/lib/R/site-library"


@@ -12,7 +12,9 @@ services:
# Access to Docker socket to spawn student containers # Access to Docker socket to spawn student containers
- /var/run/docker.sock:/var/run/docker.sock - /var/run/docker.sock:/var/run/docker.sock
# JupyterHub database persistence # JupyterHub database persistence
- jupyterhub-data:/srv/jupyterhub - data:/srv/jupyterhub
# The Jupyter user volumes
- users:/volumes
# Mount config file directly (for easy modifications) # Mount config file directly (for easy modifications)
- ./jupyterhub_config.py:/srv/jupyterhub/jupyterhub_config.py:ro - ./jupyterhub_config.py:/srv/jupyterhub/jupyterhub_config.py:ro
networks: networks:
@@ -36,9 +38,9 @@ services:
- "8888:80" - "8888:80"
volumes: volumes:
- ./Caddyfile:/etc/caddy/Caddyfile - ./Caddyfile:/etc/caddy/Caddyfile
- jupyterhub-caddy-data:/data - caddy-data:/data
- jupyterhub-caddy-config:/config - caddy-config:/config
- jupyterhub-web:/srv # Your app - web:/srv # Your app
networks: networks:
- jupyterhub-network - jupyterhub-network
restart: unless-stopped restart: unless-stopped
@@ -58,9 +60,9 @@ services:
SFTPGO_DEFAULT_ADMIN_PASSWORD: admin2025 SFTPGO_DEFAULT_ADMIN_PASSWORD: admin2025
SFTPGO_HTTPD__BINDINGS__0__CLIENT_IP_PROXY_HEADER: X-Real-IP SFTPGO_HTTPD__BINDINGS__0__CLIENT_IP_PROXY_HEADER: X-Real-IP
volumes: volumes:
- jupyterhub-shared:/volumes/shared - shared:/volumes/shared
- jupyterhub-course:/volumes/course - course:/volumes/course
- jupyterhub-web:/volumes/web - web:/volumes/web
- ./sftpgo_config.json:/config/local_config.json:ro - ./sftpgo_config.json:/config/local_config.json:ro
networks: networks:
@@ -73,45 +75,45 @@ networks:
driver: bridge driver: bridge
volumes: volumes:
jupyterhub-data: data:
driver: local driver: local
driver_opts: driver_opts:
type: none type: none
o: bind o: bind
device: ./jupyterhub_volumes/jupyterhub device: ../jupyterhub_volumes/jupyterhub
jupyterhub-shared: shared:
driver: local driver: local
driver_opts: driver_opts:
type: none type: none
o: bind o: bind
device: ./jupyterhub_volumes/shared device: ../jupyterhub_volumes/shared
jupyterhub-course: course:
driver: local driver: local
driver_opts: driver_opts:
type: none type: none
o: bind o: bind
device: ./jupyterhub_volumes/course device: ../jupyterhub_volumes/course
jupyterhub-web: web:
driver: local driver: local
driver_opts: driver_opts:
type: none type: none
o: bind o: bind
device: ./jupyterhub_volumes/web device: ../jupyterhub_volumes/web
jupyterhub-caddy-data: caddy-data:
driver: local driver: local
driver_opts: driver_opts:
type: none type: none
o: bind o: bind
device: ./jupyterhub_volumes/caddy/data device: ../jupyterhub_volumes/caddy/data
jupyterhub-caddy-config: caddy-config:
driver: local driver: local
driver_opts: driver_opts:
type: none type: none
o: bind o: bind
device: ./jupyterhub_volumes/caddy/config device: ../jupyterhub_volumes/caddy/config
jupyterhub-users: users:
driver: local driver: local
driver_opts: driver_opts:
type: none type: none
o: bind o: bind
device: ./jupyterhub_volumes/users device: ../jupyterhub_volumes/users


@@ -1,4 +1,7 @@
import os import os
import logging
from pathlib import Path
# Base configuration # Base configuration
c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner' c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
@@ -7,6 +10,8 @@ c.JupyterHub.spawner_class = 'dockerspawner.DockerSpawner'
c.JupyterHub.log_level = 'DEBUG' c.JupyterHub.log_level = 'DEBUG'
c.Spawner.debug = True c.Spawner.debug = True
VOLUMES_BASE_PATH = '/volumes'
# Docker image to use for student containers # Docker image to use for student containers
c.DockerSpawner.image = 'jupyterhub-student:latest' c.DockerSpawner.image = 'jupyterhub-student:latest'
@@ -21,6 +26,11 @@ c.DockerSpawner.client_kwargs = {'base_url': 'unix:///var/run/docker.sock'}
c.JupyterHub.hub_ip = '0.0.0.0' c.JupyterHub.hub_ip = '0.0.0.0'
c.JupyterHub.hub_connect_ip = 'jupyterhub' c.JupyterHub.hub_connect_ip = 'jupyterhub'
c.DockerSpawner.environment = {
# R package library in read-only course directory under work/
'R_LIBS_USER': '/home/jovyan/work/R_packages',
'R_LIBS_SITE': '/home/jovyan/work/course/R_packages:/usr/local/lib/R/site-library:/usr/lib/R/site-library'
}
# Network configuration for student containers # Network configuration for student containers
c.DockerSpawner.use_internal_ip = True c.DockerSpawner.use_internal_ip = True
c.DockerSpawner.network_name = 'jupyterhub-network' c.DockerSpawner.network_name = 'jupyterhub-network'
@@ -38,18 +48,34 @@ notebook_dir = '/home/jovyan/work'
c.DockerSpawner.notebook_dir = notebook_dir c.DockerSpawner.notebook_dir = notebook_dir
# Personal volume for each student + shared volumes under work/ # Personal volume for each student + shared volumes under work/
# Pre-spawn hook to create user directory
async def create_user_dir(spawner):
    """Create user directory if it doesn't exist"""
    user_dir = os.path.join(VOLUMES_BASE_PATH, spawner.user.name)
    Path(user_dir).mkdir(parents=True, exist_ok=True)
    os.chmod(user_dir, 0o755)
    # Log only once the directory has actually been created
    spawner.log.info(f"Ensured user directory exists: {user_dir}")

c.Spawner.pre_spawn_hook = create_user_dir
c.DockerSpawner.volumes = { c.DockerSpawner.volumes = {
# Personal volume (persistent) - root directory # Personal volume (persistent) - root directory
'jupyterhub-user-{username}': '/home/jovyan/work', 'obijupyterhub_user-{username}': '/home/jovyan/work',
# Shared volume between all students - under work/ # Shared volume between all students - under work/
'jupyterhub-shared': '/home/jovyan/work/shared', 'obijupyterhub_shared': '/home/jovyan/work/shared',
# Shared read-only volume for course files - under work/ # Shared read-only volume for course files - under work/
'jupyterhub-course': { 'obijupyterhub_course': {
'bind': '/home/jovyan/work/course', 'bind': '/home/jovyan/work/course',
'mode': 'ro' # read-only 'mode': 'ro' # read-only
} }
} }
c.DockerSpawner.volume_driver = 'local'
c.DockerSpawner.volume_driver_opts = {
'type': 'none',
'device': '/volumes',
'o': 'bind'
}
# Memory and CPU configuration (adjust according to your needs) # Memory and CPU configuration (adjust according to your needs)
c.DockerSpawner.mem_limit = '2G' c.DockerSpawner.mem_limit = '2G'
c.DockerSpawner.cpu_limit = 1.0 c.DockerSpawner.cpu_limit = 1.0
@@ -93,3 +119,6 @@ c.JupyterHub.bind_url = 'http://0.0.0.0:8000/jupyter/'
# Timeout # Timeout
c.Spawner.start_timeout = 300 c.Spawner.start_timeout = 300
c.Spawner.http_timeout = 120 c.Spawner.http_timeout = 120
# Container start command: start-notebook.sh creates the R_packages directory
# once the volumes are mounted, then launches the single-user server
c.DockerSpawner.cmd = ['start-notebook.sh']


@@ -0,0 +1,28 @@
#!/bin/bash
set -e

# Function to create directory
create_r_packages_dir() {
    local max_attempts=10
    local attempt=1

    while [ $attempt -le $max_attempts ]; do
        if [ -d "/home/jovyan/work" ]; then
            mkdir -p /home/jovyan/work/R_packages
            echo "R_packages directory created successfully"
            return 0
        fi
        echo "Waiting for work directory to be mounted (attempt $attempt/$max_attempts)..."
        sleep 1
        attempt=$((attempt + 1))
    done

    echo "Warning: Could not verify work directory mount"
    return 1
}

# Create R_packages directory
create_r_packages_dir

# Start the single-user server
exec jupyterhub-singleuser "$@"


@@ -3,18 +3,35 @@
# JupyterHub startup script for labs # JupyterHub startup script for labs
# Usage: ./start-jupyterhub.sh # Usage: ./start-jupyterhub.sh
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
DOCKER_DIR="${SCRIPT_DIR}/obijupyterhub/"
set -e # Stop on error set -e # Stop on error
echo "🚀 Starting JupyterHub for Lab" echo "🚀 Starting JupyterHub for Lab"
echo "==============================" echo "=============================="
echo "" echo ""
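# Create the volume directories
# (the color variables are only defined further down, so this message is printed plain)
echo ""
echo "🔨 Building the volume directories..."
pushd "${SCRIPT_DIR}/jupyterhub_volumes"
mkdir -p caddy/data    # bind target of the caddy-data volume in docker-compose.yml
mkdir -p caddy/config  # bind target of the caddy-config volume in docker-compose.yml
mkdir -p course/bin
mkdir -p course/R_packages
mkdir -p jupyterhub
mkdir -p shared
mkdir -p users
popd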
# Colors for display # Colors for display
GREEN='\033[0;32m' GREEN='\033[0;32m'
BLUE='\033[0;34m' BLUE='\033[0;34m'
YELLOW='\033[1;33m' YELLOW='\033[1;33m'
NC='\033[0m' # No Color NC='\033[0m' # No Color
pushd "${DOCKER_DIR}"
# Check we're in the right directory # Check we're in the right directory
if [ ! -f "Dockerfile" ] || [ ! -f "docker-compose.yml" ]; then if [ ! -f "Dockerfile" ] || [ ! -f "docker-compose.yml" ]; then
echo "❌ Error: Run this script from the jupyterhub-tp/ directory" echo "❌ Error: Run this script from the jupyterhub-tp/ directory"
@@ -39,6 +56,14 @@ echo ""
echo -e "${BLUE}🔨 Building JupyterHub image...${NC}" echo -e "${BLUE}🔨 Building JupyterHub image...${NC}"
docker build -t jupyterhub-hub:latest -f Dockerfile.hub . docker build -t jupyterhub-hub:latest -f Dockerfile.hub .
# Compile the web site
echo ""
echo -e "${BLUE}🔨 Building web site...${NC}"
pushd ../web_src
quarto render
python3 ../tools/generate_pages_json.py
popd
# Start the stack # Start the stack
echo "" echo ""
echo -e "${BLUE}🚀 Starting JupyterHub...${NC}" echo -e "${BLUE}🚀 Starting JupyterHub...${NC}"
@@ -49,6 +74,8 @@ echo ""
echo -e "${YELLOW}⏳ Waiting for JupyterHub to start...${NC}" echo -e "${YELLOW}⏳ Waiting for JupyterHub to start...${NC}"
sleep 3 sleep 3
popd
# Check that container is running # Check that container is running
if docker ps | grep -q jupyterhub; then if docker ps | grep -q jupyterhub; then
echo "" echo ""


@@ -0,0 +1,60 @@
#!/usr/bin/env python3
import os
import re
import json

SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
PAGES_DIR = os.path.join(SCRIPT_DIR, "..", "jupyterhub_volumes", "web", "pages")


def clean_label(filename):
    """Strip numeric prefixes and format a readable label."""
    name = os.path.splitext(os.path.basename(filename))[0]
    name = re.sub(r'^\d+_', '', name)  # Remove "00_", "01_", etc.
    name = name.replace("_", " ")
    return name.capitalize()


def build_tree(directory):
    """Recursively build the menu structure from the given directory."""
    entries = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip hidden directories and those ending with 'libs'
        if os.path.isdir(path):
            if name.endswith("libs") or name.startswith("."):
                continue
            children = build_tree(path)
            if children:
                entries.append({
                    "label": clean_label(name),
                    "children": children
                })
        elif name.endswith(".html"):
            entries.append({
                "file": os.path.relpath(path, PAGES_DIR).replace("\\", "/"),
                "label": clean_label(name)
            })
    return entries


def count_pages(tree):
    """Count the total number of HTML files in the tree."""
    total = 0
    for node in tree:
        if "file" in node:
            total += 1
        if "children" in node:
            total += count_pages(node["children"])
    return total


if __name__ == "__main__":
    tree = build_tree(PAGES_DIR)
    output_path = os.path.join(PAGES_DIR, "pages.json")
    with open(output_path, "w", encoding="utf-8") as out:
        json.dump(tree, out, indent=2, ensure_ascii=False)
    print(f"✅ Generated {output_path} with {count_pages(tree)} HTML pages.")


@@ -5,6 +5,9 @@
set -e set -e
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
VOLUME="obijupyterhub_course"
GREEN='\033[0;32m' GREEN='\033[0;32m'
BLUE='\033[0;34m' BLUE='\033[0;34m'
YELLOW='\033[1;33m' YELLOW='\033[1;33m'
@@ -54,7 +57,7 @@ docker run --rm \
echo "" echo ""
echo -e "${BLUE}💾 Copying to course/R_packages...${NC}" echo -e "${BLUE}💾 Copying to course/R_packages...${NC}"
docker run --rm \ docker run --rm \
-v jupyterhub-course:/target \ -v ${VOLUME}:/target \
-v "${TEMP_DIR}:/source" \ -v "${TEMP_DIR}:/source" \
alpine sh -c "mkdir -p /target/R_packages && cp -r /source/* /target/R_packages/" alpine sh -c "mkdir -p /target/R_packages && cp -r /source/* /target/R_packages/"
@@ -67,7 +70,7 @@ echo -e "${GREEN}✅ Installation complete!${NC}"
echo "" echo ""
echo -e "${BLUE}📦 Installed packages in work/course/R_packages:${NC}" echo -e "${BLUE}📦 Installed packages in work/course/R_packages:${NC}"
docker run --rm \ docker run --rm \
-v jupyterhub-course:/course \ -v ${VOLUME}:/course \
alpine ls -1 /course/R_packages/ alpine ls -1 /course/R_packages/
echo "" echo ""

3
web_src/.gitignore vendored Normal file

@@ -0,0 +1,3 @@
/.quarto/
**/*.quarto_ipynb
/_output/

8
web_src/00_home.qmd Normal file

@@ -0,0 +1,8 @@
---
title: "Home"
format: html
---
# Welcome to the DNA Metabarcoding Learning Server
This is the **home page** of your training platform. Here you can access Jupyter notebooks, learn DNA metabarcoding analysis with OBITools and R, and explore example datasets.

10
web_src/_quarto.yml Normal file

@@ -0,0 +1,10 @@
project:
  type: default
  output-dir: _output
  post-render:
    - scripts/copy-to-web.sh

format:
  html:
    toc: false
    self-contained: true


@@ -0,0 +1,335 @@
---
title: "Regular Expressions"
format:
  html:
    embed-resources: true  # so that the SVGs are included
    self-contained: true   # optional: everything is embedded in the HTML
---
Regular expressions describe a fragment of text while allowing variations in that text. For example, `tot*o` describes a piece of text starting with a "t", then an "o", followed by any number of "t"s (possibly none) and a final "o". A regular expression can therefore be seen as a pattern for the actual text being searched. The rest of this text relies on the following definitions:
## Definitions
- **Alphabet**: The set of symbols we are allowed to use. For example, DNA is described using a four-letter alphabet $\{A, C, G, T\}$. Standard UNIX programs using regular expressions (`egrep`, `awk`, etc.) work on a much larger alphabet including all uppercase and lowercase letters, digits, punctuation marks, and other characters representing formatting actions such as line breaks.
- **Text**: The sequence of symbols corresponding to the analyzed document. A text is written over an alphabet. A text can therefore represent very diverse things: a chromosome or protein sequence, the output of another program, or a series of descriptions of biological objects such as those obtained by downloading "flat" files from biological databases.
- **Word**: A word is a sequence of consecutive symbols from a text. This is a more general definition than that of a word in a natural language such as French, where a word is a group of letters preceded and followed by a space or punctuation mark.
We'll say that a regular expression is a pattern representing one or more words in a text. Search engines use this pattern to find occurrences of words matching this pattern in a text.
## The Simplest Regular Expression
Any piece of text can be considered a regular expression that recognizes text identical to itself. For example, `ATG` recognizes the sequence of three letters A, T, G.
```{mermaid}
graph LR
D((D)) -->|A| 1((1))
1 -->|T| 2((2))
2 -->|G| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
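
The literal case can be tried out with a minimal sketch, assuming Python's standard `re` module as the search engine (the lecture itself refers to tools such as `egrep` and `awk`; the DNA fragment below is made up for illustration):

```python
import re

# Hypothetical DNA fragment, used only for illustration.
sequence = "CCATGGTATGA"

# A literal pattern recognizes only text identical to itself.
positions = [match.start() for match in re.finditer("ATG", sequence)]
print(positions)
```

Each reported position is the 0-based index at which an occurrence of `ATG` starts in the text.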
## Introducing Ambiguities
The main interest of regular expressions is their ability to describe words (text fragments) while allowing certain ambiguities. There are two main classes of ambiguities. The first describes alternatives for a symbol. The second describes the repetition of symbols. Ambiguities are introduced into the regular expression by using special characters.
### Symbol Ambiguities
#### Any Character
The first special character is the dot ".". It recognizes any single character. Sticking with the codon example, the regular expression `.TG` recognizes any character followed by a "T" then a "G".
```{mermaid}
graph LR
D((D)) -->|any| 1((1))
1 -->|T| 2((2))
2 -->|G| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
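A small sketch of the dot in action, assuming Python's `re` module (the list of candidate words is hypothetical):

```python
import re

# Hypothetical candidate words
candidates = ["ATG", "TTG", "CTG", "-TG", "ATC", "TG"]

# The dot stands for exactly one character, whatever it is
dot_matches = [w for w in candidates if re.fullmatch(r".TG", w)]
print(dot_matches)
```

Note that "-TG" is accepted because the dot is not restricted to the DNA alphabet, while "TG" is rejected because the dot must consume one character.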
#### A Specific Subset of Characters
The dot sometimes offers too much flexibility. There's another mechanism to list an authorized group of characters: just list the authorized characters between brackets "[" and "]". The expression `[ACGT]` recognizes one of the four letters "A", "C", "G", or "T".
In bacteria, there are several initiation codons. Most of the time, codons *ATG*, *TTG*, and *GTG* are recognized as translation initiation codons. These three codons only vary by their first letter, which can be an "A", "T", or "G". The regular expression `[ATG]TG` recognizes words of three letters starting with a symbol "A", "T", or "G" followed by a "T" and a "G".
```{mermaid}
graph LR
D((D)) -->|A/T/G| 1((1))
1 -->|T| 2((2))
2 -->|G| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
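The same restriction can be checked with a short sketch, assuming Python's `re` module (the candidate list is hypothetical):

```python
import re

start_codon = re.compile(r"[ATG]TG")

# Hypothetical candidate codons
candidates = ["ATG", "TTG", "GTG", "CTG", "ATC"]
class_matches = [w for w in candidates if start_codon.fullmatch(w)]
print(class_matches)
```

Unlike `.TG`, the bracketed class rejects "CTG".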
#### Any Character Except a Subgroup
Sometimes it's necessary to describe a set of characters as: "any character of the alphabet except for a particular group of symbols." To describe these negated groups, the same notation is used as for the character groups described previously. The only difference is that the group must start with the "^" character. The expression `[^A-Z]` therefore recognizes any character except an uppercase letter (a range such as `A-Z` is shorthand for all the characters from "A" to "Z"), and `[^b]` recognizes any character except "b".
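A possible use of negated groups, sketched with Python's `re` module: spotting symbols that fall outside an expected alphabet. Both sequences are invented for the example.

```python
import re

# Hypothetical soft-masked sequence: lowercase letters mark masked positions
masked = "ACGTnnACGaT"
non_upper = re.findall(r"[^A-Z]", masked)

# Hypothetical sequence containing ambiguity codes
ambiguous = "ACGTNACGRT"
non_acgt = re.findall(r"[^ACGT]", ambiguous)

print(non_upper, non_acgt)
```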
### Variations on Symbol Repetition
#### A Symbol Present Zero or One Time
The simplest alteration on the number of occurrences of a symbol is represented by "?". This character, added after the description of a symbol, indicates that the symbol can be present or absent in the recognized word, that is, present 0 or 1 times. For example, `ballons?` recognizes both "ballon" and "ballons".
```{mermaid}
graph LR
D((D)) -->|b| 1((1))
1 -->|a| 2((2))
2 -->|l| 3((3))
3 -->|l| 4((4))
4 -->|o| 5((5))
5 -->|n| 6((6))
6 -->|s| F((F))
6 --> F
style D fill:#90EE90
style F fill:#FFC0CB
```
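A quick check of the optional final "s" in `ballons?`, assuming Python's `re` module (the word list is hypothetical):

```python
import re

# Hypothetical candidate words
words = ["ballon", "ballons", "ballonss", "ballo"]

# "?" applies only to the symbol just before it, here the "s"
optional_s = [w for w in words if re.fullmatch(r"ballons?", w)]
print(optional_s)
```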
#### A Symbol Present an Undetermined Number of Times
A more flexible form regarding the presence or absence of a symbol in words recognized by a regular expression is provided by the "*" character, which indicates that the preceding symbol can be absent or present any number of times. For example, `TTA*TT` recognizes "TTTT", "TTATT", "TTAAATT", and so on.
```{mermaid}
graph LR
D((D)) -->|T| 1((1))
1 -->|T| 2((2))
2 -->|A| 2
2 -->|T| 3((3))
3 -->|T| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
#### A Symbol Present at Least Once
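Requiring that a symbol be present at least once can be written by repeating it: once on its own, then once more followed by "*". There's a syntax that simplifies writing such a constraint. It uses the "+" character as a marker. Thus, the regular expressions `TTA+TT` and `TTAA*TT` are strictly equivalent.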
```{mermaid}
graph LR
D((D)) -->|T| 1((1))
1 -->|T| 2((2))
2 -->|A| 3((3))
3 -->|A| 3
3 -->|T| 4((4))
4 -->|T| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
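The difference between "*" and "+", and the equivalence stated above, can be illustrated on a few words. This is only a sketch using Python's `re` module on a hypothetical candidate list, not a proof of equivalence.

```python
import re

# Hypothetical candidate words with 0, 1, 2 and 3 "A"s in the middle
candidates = ["TTTT", "TTATT", "TTAATT", "TTAAATT"]

star = [w for w in candidates if re.fullmatch(r"TTA*TT", w)]
plus = [w for w in candidates if re.fullmatch(r"TTA+TT", w)]
explicit = [w for w in candidates if re.fullmatch(r"TTAA*TT", w)]

print(star)
print(plus == explicit)
```

Only the "*" form accepts "TTTT"; the two other forms accept the same words.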
#### Describing a Repetition Interval
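A final notation allows giving a lower and an upper bound to the number of occurrences of a symbol. The format uses braces "{" and "}" to frame the two bounds, separated by a comma. For example, `A{3,5}` recognizes "AAA", "AAAA", and "AAAAA"; omitting the upper bound, as in `A{3,}`, means "at least 3 occurrences".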
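As a sketch of how the bounds behave, assuming Python's `re` module, the following lists the lengths of runs of "A" accepted by a bounded and by an open-ended interval:

```python
import re

# Try runs of 1 to 7 "A"s against each interval
accepted_lengths = [n for n in range(1, 8) if re.fullmatch(r"A{3,5}", "A" * n)]
at_least_three = [n for n in range(1, 8) if re.fullmatch(r"A{3,}", "A" * n)]

print(accepted_lengths, at_least_three)
```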
### Special Characters
#### Beginning and End of Line
By adding a "^" at the beginning of an expression or "$" at the end of an expression, you can force the recognized word to be at the beginning or end of a line.
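Line-oriented tools such as `egrep` apply "^" and "$" to each line they read. In Python's `re` module, which is assumed for this sketch, the `re.MULTILINE` flag is needed to get the same per-line behavior on a multi-line string. The FASTA-like text is invented for the example.

```python
import re

# Hypothetical two-record FASTA-like text
fasta = ">seq1 first record\nACGTACGT\n>seq2 second record\nTTGACA\n"

# Lines starting with ">" (the header lines)
headers = re.findall(r"^>.*$", fasta, flags=re.MULTILINE)

# Lines ending with "A"
ends_with_a = re.findall(r"^.*A$", fasta, flags=re.MULTILINE)

print(headers, ends_with_a)
```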
#### The Double Meaning of a Character
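Each character can have two meanings:
- A primary meaning: the symbol represented by the character itself
- A meta-meaning: a special role in the regular expression syntax
You switch from one meaning to the other by preceding the character with the backslash "\". For example, "." means "any character" whereas "\." means a literal dot; conversely, "n" is the letter n whereas "\n" is a line break.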
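A short sketch of the effect of the backslash on the dot, assuming Python's `re` module (the text is hypothetical):

```python
import re

text = "1.5 and 125"

# Unescaped dot: meta-meaning, any character between "1" and "5"
any_char = re.findall(r"1.5", text)

# Escaped dot: primary meaning, a literal "."
literal_dot = re.findall(r"1\.5", text)

print(any_char, literal_dot)
```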
### Combining Multiple Expressions
It is sometimes necessary to combine several regular expressions with a logical OR. Suppose we want to build an expression that recognizes the words "papa" or "mama". For this, combine the two simple expressions `papa` and `mama` using the "|" character to get the global expression `papa|mama`.
```{mermaid}
graph LR
D((D)) -->|p| 1((1))
1 -->|a| 2((2))
2 -->|p| 3((3))
3 -->|a| F1((F))
D -->|m| 4((4))
4 -->|a| 5((5))
5 -->|m| 6((6))
6 -->|a| F2((F))
style D fill:#90EE90
style F1 fill:#FFC0CB
style F2 fill:#FFC0CB
```
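The alternation can be tried on a small text. This sketch assumes Python's `re` module, and the sentence is made up:

```python
import re

text = "papa and mama, but not mapa"

# Either alternative is accepted wherever it occurs in the text
found = re.findall(r"papa|mama", text)
print(found)
```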
### Subexpressions
#### Subexpressions and Combination of Multiple Expressions
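It's possible to isolate a subpart of a regular expression using parentheses "(" and ")". This limits the scope of the "|" operator: `T(AA|AG|GA)` recognizes "TAA", "TAG", and "TGA", whereas without parentheses `TAA|AG|GA` recognizes "TAA", "AG", or "GA".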
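The effect of parentheses on the scope of "|" can be seen by comparing `T(AA|AG|GA)` with `TAA|AG|GA`. This is a sketch assuming Python's `re` module and a hypothetical list of candidate words:

```python
import re

# Hypothetical candidate words
candidates = ["TAA", "TAG", "TGA", "AG", "GA"]

# With parentheses, the leading "T" is shared by the three alternatives
grouped = [w for w in candidates if re.fullmatch(r"T(AA|AG|GA)", w)]

# Without parentheses, "|" separates three whole expressions: TAA, AG, GA
ungrouped = [w for w in candidates if re.fullmatch(r"TAA|AG|GA", w)]

print(grouped, ungrouped)
```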
#### Reusing a Subexpression
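Normally, each step of a regular expression is independent of what happened before in the automaton. Subexpressions make it possible to go against this principle: the text recognized by a parenthesized subexpression is memorized and can be required again later in the expression with a backslash followed by the number of the subexpression. For example, `([ACGT]{3})\1{9,}` recognizes any triplet repeated at least 10 times in a row, such as "CAGCAGCAG...".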
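A sketch of a backreference at work, assuming Python's `re` module, where `\1` refers to the text memorized by the first parenthesized subexpression. The sequence, a run of 12 "CAG" triplets with short flanks, is invented for the example.

```python
import re

# Hypothetical sequence: 12 consecutive "CAG" triplets between two flanks
sequence = "TT" + "CAG" * 12 + "GG"

# One triplet, then the very same triplet at least 9 more times
match = re.search(r"([ACGT]{3})\1{9,}", sequence)

repeat_unit = match.group(1)             # the memorized triplet
repeat_count = len(match.group(0)) // 3  # total number of copies matched

print(match.start(), repeat_unit, repeat_count)
```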
## Summary of Authorized Alteration Forms
### Symbol Ambiguity
| Symbol | Recognizes |
|--------|------------|
| . | Any character |
| [ ] | One of the characters listed between brackets |
| [^ ] | Any character except those listed between brackets |
### Repetition Ambiguity
| Symbol | Number of accepted occurrences |
|--------|--------------------------------|
| * | 0 to ∞ |
| ? | 0 or 1 |
| + | 1 or more |
| {x,y} | between x and y occurrences |
| {x,} | at least x occurrences |
### Special Characters
| Symbol | Meaning |
|--------|---------|
| ^ at expression start | beginning of line |
| $ at expression end | end of line |
| \n | line break |
| \t | a tabulation |
## Exercise: Identifying Genes with a Regular Expression
To identify a CDS, we need to combine three regular expressions: one for the initiation codon, the second for non-stop codons, and the last for stop codons.
### Start Codons
In bacteria, there are three start codons: ATG, TTG, and GTG. The corresponding regular expression is: `[ATG]TG`
```{mermaid}
graph LR
D((D)) -->|A/T/G| 1((1))
1 -->|T| 2((2))
2 -->|G| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
### Stop Codons
In most bacteria, there are three different termination codons: TAA (ochre), TAG (amber), and TGA (opal). The regular expression recognizing all stop codons is `T(A[AG]|GA)`
```{mermaid}
graph LR
D((D)) -->|T| 1((1))
1 -->|A| 2((2))
2 -->|A/G| F1((F))
1 -->|G| 3((3))
3 -->|A| F2((F))
style D fill:#90EE90
style F1 fill:#FFC0CB
style F2 fill:#FFC0CB
```
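One way to convince yourself that the stop codon expression is right is to try it against all 64 possible codons. A minimal sketch, assuming Python's `re` and `itertools` modules:

```python
import re
from itertools import product

# All 64 codons over the DNA alphabet, in alphabetical order
all_codons = ["".join(c) for c in product("ACGT", repeat=3)]

stop_codons = [c for c in all_codons if re.fullmatch(r"T(A[AG]|GA)", c)]
print(len(all_codons), stop_codons)
```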
### Non-stop Codons
The regular expression recognizing the 61 non-stop codons is:
`[ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT])`
```{mermaid}
graph LR
%% Initial state
D((D))
style D fill:#90EE90
%% Branch 1: [ACG][ACGT][ACGT]
D -->|A/C/G| A1((1))
A1 -->|A/C/G/T| A2((2))
A2 -->|A/C/G/T| F1((F))
%% Branch 2: T([CT][ACGT]|G[CGT]|A[CT])
D -->|T| B1((3))
%% Sub-branch 2.1: [CT][ACGT]
B1 -->|C/T| B2a((4))
B2a -->|A/C/G/T| F2((F))
%% Sub-branch 2.2: G[CGT]
B1 -->|G| B2b((5))
B2b -->|C/G/T| F3((F))
%% Sub-branch 2.3: A[CT]
B1 -->|A| B2c((6))
B2c -->|C/T| F4((F))
%% Final states
style F1 fill:#FFC0CB
style F2 fill:#FFC0CB
style F3 fill:#FFC0CB
style F4 fill:#FFC0CB
```
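The count of 61 can be verified by enumeration: the expression should accept every codon except TAA, TAG, and TGA. A sketch assuming Python's `re` and `itertools` modules:

```python
import re
from itertools import product

non_stop = re.compile(r"[ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT])")

# All 64 codons over the DNA alphabet
all_codons = ["".join(c) for c in product("ACGT", repeat=3)]

accepted = {c for c in all_codons if non_stop.fullmatch(c)}
rejected = set(all_codons) - accepted

print(len(accepted), sorted(rejected))
```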
### Recognizing a Complete CDS
Recognizing a complete CDS now comes down to assembling the regular expression for starts, that for non-stops (authorizing its repetition), then that recognizing stop codons.
`[ATG]TG([ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT]))+T(A[GA]|GA)`
```{mermaid}
graph LR
%% Initial state
D((D))
style D fill:#90EE90
%% Start: [ATG]TG
D -->|A/T/G| 1((1))
1 -->|T| 2((2))
2 -->|G| 3((3))
%% Main loop for the middle part (+ repetition)
3 --> 4((4))
%% Alternative 1: [ACG][ACGT][ACGT]
4 -->|A/C/G| 5((5))
5 -->|A/C/G/T| 6((6))
6 -->|A/C/G/T| 7((7))
%% Alternative 2: T([CT][ACGT]|G[CGT]|A[CT])
4 -->|T| 8((8))
%% Sub-alternative 2.1: [CT][ACGT]
8 -->|C/T| 9((9))
9 -->|A/C/G/T| 7((7))
%% Sub-alternative 2.2: G[CGT]
8 -->|G| 10((10))
10 -->|C/G/T| 7((7))
%% Sub-alternative 2.3: A[CT]
8 -->|A| 11((11))
11 -->|C/T| 7((7))
%% Repetition loop
7 --> 4
%% End: T(A[GA]|GA)
7 -->|T| 12((12))
%% Final alternative 1: A[GA]
12 -->|A| 13((13))
13 -->|A/G| F((F))
%% Final alternative 2: GA
12 -->|G| 14((14))
14 -->|A| F((F))
%% Final state
style F fill:#FFC0CB
```
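The assembled expression can be tried on a toy sequence. This is a sketch assuming Python's `re` module; the sequence is invented and far shorter than a real gene.

```python
import re

cds_pattern = re.compile(
    r"[ATG]TG"                                           # start codon
    r"([ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT]))+"   # 1+ non-stop codons
    r"T(A[GA]|GA)"                                       # stop codon
)

# Hypothetical sequence: CC | ATG AAA CCC TAA | GG
sequence = "CCATGAAACCCTAAGG"

match = cds_pattern.search(sequence)
cds = match.group(0)
print(match.start(), match.end(), cds)
```

Because the middle part only consumes whole codons, the stop codon found is necessarily in the same reading frame as the start codon.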
To require that the CDS code for a protein of at least 100 amino acids, replace the "+" sign with a lower bound of 99 on the number of repetitions of non-stop codons (the start codon itself codes for the first amino acid).
`[ATG]TG([ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT])){99,}T(A[GA]|GA)`
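The minimum-length constraint can be sketched by building the expression from a parameter. The helper name `cds_regex` and the two test sequences are hypothetical, and Python's `re` module is assumed.

```python
import re

def cds_regex(min_non_stop):
    """Build the CDS expression with at least min_non_stop non-stop codons."""
    non_stop = r"([ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT]))"
    # Doubled braces produce literal "{" and "}" in the f-string
    return re.compile(rf"[ATG]TG{non_stop}{{{min_non_stop},}}T(A[GA]|GA)")

pattern = cds_regex(99)

# Hypothetical sequences: start codon, N alanine codons (GCT), stop codon
long_cds = "ATG" + "GCT" * 99 + "TAA"    # 100 amino acids
short_cds = "ATG" + "GCT" * 98 + "TAA"   # 99 amino acids

long_ok = pattern.fullmatch(long_cds) is not None
short_ok = pattern.search(short_cds) is not None
print(long_ok, short_ok)
```

Only the first sequence is accepted, since the second one provides 98 non-stop codons after the start codon.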

@@ -0,0 +1,6 @@
/* Center all images in the slides */
.quarto-figure-default {
  display: block;
  margin-left: auto;
  margin-right: auto;
}

@@ -0,0 +1,369 @@
---
title: "Regular Expressions"
format:
  revealjs:
    theme: beige # slide theme
    transition: fade # transition effect between slides
---
## Regular Expressions
Pattern matching for text with variations
Example: `tot*o` matches:
- "to" + any number of "t" + "o"
- "toto", "totto", "totttto", etc.
---
## Basic Concepts
**Alphabet**: Set of allowed symbols
- DNA: {A, C, G, T}
- Text: {letters, digits, punctuation, ...}
**Text**: Sequence of symbols from alphabet
**Word**: Subsequence of consecutive symbols
---
## Simple Regular Expression
`ATG` matches exactly "ATG"
```{mermaid}
graph LR
D((D)) -->|A| 1((1))
1 -->|T| 2((2))
2 -->|G| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
---
## Symbol Ambiguities
### Any Character: `.`
`.TG` matches:
- "ATG", "TTG", "GTG", "CTG", ...
```{mermaid}
graph LR
D((D)) -->|any| 1((1))
1 -->|T| 2((2))
2 -->|G| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
---
## Character Classes
`[ATG]TG` matches only:
- "ATG", "TTG", "GTG"
```{mermaid}
graph LR
D((D)) -->|A/T/G| 1((1))
1 -->|T| 2((2))
2 -->|G| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
---
## Ranges and Negation
**Ranges**: `[A-Z]`, `[0-9]`, `[A-Za-z0-9]`
**Negation**: `[^A-Z]` (anything except uppercase)
---
## Repetition: Zero or One
`ballons?` matches:
- "ballon" (singular)
- "ballons" (plural)
```{mermaid}
graph LR
D((D)) -->|b| 1((1))
1 -->|a| 2((2))
2 -->|l| 3((3))
3 -->|l| 4((4))
4 -->|o| 5((5))
5 -->|n| 6((6))
6 -->|s| F((F))
6 --> F
style D fill:#90EE90
style F fill:#FFC0CB
```
---
## Repetition: Zero or More
`TTA*TT` matches:
- "TTTT", "TTATT", "TTAAATT", ...
```{mermaid}
graph LR
D((D)) -->|T| 1((1))
1 -->|T| 2((2))
2 -->|A| 2
2 -->|T| 3((3))
3 -->|T| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
---
## Repetition: One or More
`TTA+TT` matches:
- "TTATT", "TTAATT", "TTAAATT", ...
- But NOT "TTTT"
```{mermaid}
graph LR
D((D)) -->|T| 1((1))
1 -->|T| 2((2))
2 -->|A| 3((3))
3 -->|A| 3
3 -->|T| 4((4))
4 -->|T| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
---
## Exact Repetition
`A{3,5}` matches:
- "AAA", "AAAA", "AAAAA"
`A{3}` matches exactly "AAA"
---
## Special Characters
- `^` - Start of line
- `$` - End of line
- `\n` - Newline
- `\t` - Tab
Examples:
- `^start` - "start" at beginning of line
- `end$` - "end" at end of line
- `^exact$` - "exact" as entire line
---
## Alternation
`papa|mama` matches either:
- "papa" OR "mama"
```{mermaid}
graph LR
D((D)) -->|p| 1((1))
1 -->|a| 2((2))
2 -->|p| 3((3))
3 -->|a| F1((F))
D -->|m| 4((4))
4 -->|a| 5((5))
5 -->|m| 6((6))
6 -->|a| F2((F))
style D fill:#90EE90
style F1 fill:#FFC0CB
style F2 fill:#FFC0CB
```
---
## Grouping
`T(AA|AG|GA)` matches:
- "TAA", "TAG", "TGA"
Instead of incorrect: `TAA|AG|GA`
---
## Backreferences
`([ACGT]{3})\1{9,}` matches:
- Any triplet repeated 10+ times
- Example: "CAGCAGCAGCAG..."
---
## Quick Reference
### Symbol Ambiguity
| Pattern | Matches |
|---------|---------|
| `.` | Any character |
| `[abc]` | a, b, or c |
| `[^abc]` | Not a, b, or c |
### Repetition
| Pattern | Matches |
|---------|---------|
| `?` | 0 or 1 |
| `*` | 0 or more |
| `+` | 1 or more |
| `{n,m}` | n to m times |
---
## Biological Application: Gene Finding
Find Coding Sequences (CDS) in bacterial DNA:
1. Start codon
2. Multiple non-stop codons
3. Stop codon
---
## Start Codons
Bacterial start: ATG, TTG, GTG
Pattern: `[ATG]TG`
```{mermaid}
graph LR
D((D)) -->|A/T/G| 1((1))
1 -->|T| 2((2))
2 -->|G| F((F))
style D fill:#90EE90
style F fill:#FFC0CB
```
---
## Stop Codons
Bacterial stop: TAA, TAG, TGA
Pattern: `T(A[AG]|GA)`
```{mermaid}
graph LR
D((D)) -->|T| 1((1))
1 -->|A| 2((2))
2 -->|A/G| F1((F))
1 -->|G| 3((3))
3 -->|A| F2((F))
style D fill:#90EE90
style F1 fill:#FFC0CB
style F2 fill:#FFC0CB
```
---
## Non-Stop Codons
61 codons that aren't stop codons
Pattern: `[ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT])`
```{mermaid}
graph LR
D((D)) -->|A/C/G| A1((1))
A1 -->|A/C/G/T| A2((2))
A2 -->|A/C/G/T| F1((F))
D -->|T| B1((3))
B1 -->|C/T| B2a((4))
B2a -->|A/C/G/T| F2((F))
B1 -->|G| B2b((5))
B2b -->|C/G/T| F3((F))
B1 -->|A| B2c((6))
B2c -->|C/T| F4((F))
style D fill:#90EE90
style F1 fill:#FFC0CB
style F2 fill:#FFC0CB
style F3 fill:#FFC0CB
style F4 fill:#FFC0CB
```
---
## Complete CDS Pattern
`[ATG]TG([ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT]))+T(A[GA]|GA)`
- Start codon
- 1+ non-stop codons
- Stop codon
```{mermaid}
graph LR
D((D)) -->|A/T/G| 1((1))
1 -->|T| 2((2))
2 -->|G| 3((3))
3 --> 4((4))
4 -->|A/C/G| 5((5))
5 -->|A/C/G/T| 6((6))
6 -->|A/C/G/T| 7((7))
4 -->|T| 8((8))
8 -->|C/T| 9((9))
9 -->|A/C/G/T| 7
8 -->|G| 10((10))
10 -->|C/G/T| 7
8 -->|A| 11((11))
11 -->|C/T| 7
7 --> 4
7 -->|T| 12((12))
12 -->|A| 13((13))
13 -->|A/G| F((F))
12 -->|G| 14((14))
14 -->|A| F
style D fill:#90EE90
style F fill:#FFC0CB
```
---
## Minimum Length CDS
`[ATG]TG([ACG][ACGT][ACGT]|T([CT][ACGT]|G[CGT]|A[CT])){99,}T(A[GA]|GA)`
Requires at least 100 amino acids (start codon + 99 non-stop codons)
---
## Summary
- Regular expressions = text patterns
- Symbol ambiguity: `.`, `[]`, `[^]`
- Repetition: `?`, `*`, `+`, `{}`
- Special chars: `^`, `$`, `\n`, `\t`
- Powerful for biological sequence analysis

2
web_src/scripts/copy-to-web.sh Executable file

@@ -0,0 +1,2 @@
#!/bin/bash
rsync -av --delete _output/ ../jupyterhub_volumes/web/pages/

41
xxx.txt Normal file

@@ -0,0 +1,41 @@
.
├── start-jupyterhub.sh
├── docker
│   ├── Caddyfile
│   ├── docker-compose.yml
│   ├── Dockerfile
│   ├── Dockerfile.hub
│   ├── jupyterhub_config.py
│   ├── sftpgo_config.json
│   └── start-notebook.sh
├── jupyterhub_volumes
│   ├── caddy
│   ├── course
│   │   ├── bin
│   │   └── R_packages
│   ├── jupyterhub
│   ├── shared
│   ├── users
│   └── web
│   ├── img
│   │   └── welcome_metabar.webp
│   ├── index.html
│   └── pages
│   └── pages.json
├── Readme.md
├── tools
│   ├── generate_pages_json.py
│   └── install_packages.sh
├── users.json
└─── web_src
   ├── _output
   ├── _quarto.yml
   ├── 00_home.qmd
   ├── lectures
   │   └── computers
   │   └── regex
   │   ├── lecture_regex.qmd
   │   ├── slides_regex.qmd
   │   └── slides.css
   └── scripts
   └── copy-to-web.sh