Enhance documentation and automate R package management #7

.gitignore (vendored, 3 changes)
@@ -5,6 +5,7 @@
 /jupyterhub_volumes/caddy
 /jupyterhub_volumes/course/data/Genbank
 /jupyterhub_volumes/web/
+/jupyterhub_volumes/builder
 /**/.DS_Store
 /web_src/**/*.RData
 /web_src/**/*.pdf
@@ -16,4 +17,4 @@
 ncbitaxo_*
 Readme_files
 Readme.html
-tmp.*
+tmp.*
Readme.md (40 changes)
@@ -6,32 +6,51 @@ This project packages the MetabarcodingSchool training lab into one reproducible

 ## Prerequisites (with quick checks)

-You need Docker, Docker Compose, Quarto, and Python 3 available on the machine that will host the lab.
+You only need **Docker and Docker Compose** on the machine that will host the lab. All other tools (Quarto, Hugo, Python, R) are provided via a builder Docker image and do not need to be installed on your system.

 - macOS: install [OrbStack](https://orbstack.dev/) (recommended) or Docker Desktop; both ship Docker Engine and Compose.
 - Linux: install Docker Engine and the Compose plugin from your distribution (e.g., `sudo apt install docker.io docker-compose-plugin`) or from Docker's official packages.
 - Windows: install Docker Desktop with the WSL2 backend enabled.
-- Quarto CLI: get installers from <https://quarto.org/docs/get-started/>.
-- Python 3: any recent version is fine (only the standard library is used).

-Verify from a terminal; if a command is missing, install it before moving on:
+Verify from a terminal:

 ```bash
 docker --version
 docker compose version  # or: docker-compose --version
-quarto --version
-python3 --version
 ```

 ## How the startup script works

 `./start-jupyterhub.sh` is the single entry point. It builds the Docker images, renders the course website, prepares the volume folders, and starts the stack. Internally it:

-- creates the `jupyterhub_volumes/` tree (caddy, course, shared, users, web…)
+- creates the `jupyterhub_volumes/` tree (caddy, course, shared, users, web...)
+- builds the `obijupyterhub-builder` image (contains Quarto, Hugo, R, Python) if not already present
 - builds the `jupyterhub-student` and `jupyterhub-hub` images
+- detects R package dependencies from Quarto files using the `{attachment}` package and installs them automatically
 - renders the Quarto site from `web_src/`, generates PDF galleries and `pages.json`, and copies everything into `jupyterhub_volumes/web/`
 - runs `docker-compose up -d --remove-orphans`

+### Builder image
+
+The builder image (`obijupyterhub-builder`) contains all the tools needed to prepare the course materials:
+
+- **Quarto** for rendering the course website
+- **Hugo** for building the obidoc documentation
+- **R** with the `{attachment}` package for automatic dependency detection
+- **Python 3** for utility scripts
+
+This means you don't need to install any of these tools on your host system. The script builds this image automatically on first run and reuses it for subsequent builds. Use `--force-rebuild` to rebuild the builder image if needed.
+
+### R package caching for builds
+
+R packages required by your Quarto documents are automatically detected and installed during the build. They are cached in `jupyterhub_volumes/builder/R_packages/`, so they persist across builds. This means:
+
+- **First build**: all R packages used in your `.qmd` files are detected and installed (this may take some time)
+- **Subsequent builds**: only missing packages are installed, making builds much faster
+- **Adding new packages**: simply use `library(newpackage)` in your Quarto files; the build process will detect and install it automatically
+
+To clear the R package cache and force a fresh installation, delete the `jupyterhub_volumes/builder/R_packages/` directory.
+
 You can tailor what it does with a few flags:

 - `--no-build` (or `--offline`): skip Docker image builds and reuse existing images (useful when offline).
@@ -67,6 +86,8 @@ OBIJupyterHub
 └── web_src - Quarto sources for the course website
 ```

+Note: the `obijupyterhub/` directory also contains `Dockerfile.builder`, which provides the build environment; the `tools/` directory contains utility scripts, including `install_quarto_deps.R` for automatic R dependency detection; and `jupyterhub_volumes/builder/` stores cached R packages for faster builds.
+
 3) Prepare course materials (optional before first run):
 - Put notebooks, datasets, scripts, binaries, or PDFs for students under `jupyterhub_volumes/course/`. They will appear read-only at `/home/jovyan/work/course/`.
 - For collaborative work, drop files in `jupyterhub_volumes/shared/` (read/write for all at `/home/jovyan/work/shared/`).
@@ -246,9 +267,12 @@ ports:
 ```bash
 pushd obijupyterhub
 docker-compose down -v
-docker rmi jupyterhub-hub jupyterhub-student
+docker rmi jupyterhub-hub jupyterhub-student obijupyterhub-builder
 popd
+
+# Optionally clear the R package cache
+rm -rf jupyterhub_volumes/builder/R_packages

 # Then rebuild everything
 ./start-jupyterhub.sh
 ```
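The README's "verify from a terminal" step can be wrapped in a small guard. A minimal sketch — the `require_cmd` helper is hypothetical, not part of the repository, and `sh` stands in for `docker` so the example runs anywhere:

```shell
# Hypothetical helper mirroring the README's prerequisite check:
# report success when a command is on PATH, fail otherwise.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

require_cmd sh   # on a real host you would check: docker, docker compose
```

`command -v` is the POSIX-portable way to test for a binary without invoking it.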
obijupyterhub/Dockerfile.builder (new file, 68 lines)
@@ -0,0 +1,68 @@
# Dockerfile.builder
# Image containing all tools needed to prepare the OBIJupyterHub stack.
# This allows the host system to only require Docker to be installed.

FROM ubuntu:24.04

LABEL maintainer="OBIJupyterHub"
LABEL description="Builder image for OBIJupyterHub preparation tasks"

# Avoid interactive prompts during package installation
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=Etc/UTC

# Install base dependencies and R
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
        wget \
        git \
        rsync \
        python3 \
        r-base \
        r-base-dev \
        libcurl4-openssl-dev \
        libssl-dev \
        libxml2-dev \
        libfontconfig1-dev \
        libharfbuzz-dev \
        libfribidi-dev \
        libfreetype6-dev \
        libpng-dev \
        libtiff5-dev \
        libjpeg-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install the {attachment} package in a separate location (not overwritten by the volume mount).
# This ensures attachment is always available even when site-library is mounted as a volume.
ENV R_LIBS_BUILDER=/opt/R/builder-packages
RUN mkdir -p ${R_LIBS_BUILDER} \
    && R -e "install.packages('attachment', lib='${R_LIBS_BUILDER}', repos='https://cloud.r-project.org/')"

# Install Hugo (extended version, for SCSS support).
# Detect the architecture and download the appropriate binary.
ARG HUGO_VERSION=0.140.2
RUN ARCH=$(dpkg --print-architecture) \
    && case "$ARCH" in \
        amd64) HUGO_ARCH="amd64" ;; \
        arm64) HUGO_ARCH="arm64" ;; \
        *) echo "Unsupported architecture: $ARCH" && exit 1 ;; \
    esac \
    && curl -fsSL "https://github.com/gohugoio/hugo/releases/download/v${HUGO_VERSION}/hugo_extended_${HUGO_VERSION}_linux-${HUGO_ARCH}.tar.gz" \
        | tar -xz -C /usr/local/bin hugo \
    && chmod +x /usr/local/bin/hugo

# Install Quarto using the official .deb package (handles all dependencies properly)
ARG QUARTO_VERSION=1.6.42
RUN ARCH=$(dpkg --print-architecture) \
    && curl -fsSL -o /tmp/quarto.deb "https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_VERSION}/quarto-${QUARTO_VERSION}-linux-${ARCH}.deb" \
    && dpkg -i /tmp/quarto.deb \
    && rm /tmp/quarto.deb

# Create the working directory
WORKDIR /workspace

# Default command
CMD ["/bin/bash"]
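The architecture-detection pattern used above for the Hugo download can be exercised outside Docker. A sketch — `arch_suffix` is a hypothetical name, and `uname -m` style values are mapped in addition to the `dpkg` ones the Dockerfile sees:

```shell
# Map a machine identifier to the suffix used in Hugo release filenames,
# like the Dockerfile's case statement (which reads dpkg --print-architecture).
arch_suffix() {
    case "$1" in
        x86_64|amd64)  echo "amd64" ;;
        aarch64|arm64) echo "arm64" ;;
        *) echo "Unsupported architecture: $1" >&2; return 1 ;;
    esac
}

arch_suffix x86_64   # -> amd64
arch_suffix arm64    # -> arm64
```

Failing fast on unknown architectures (instead of downloading a wrong binary) is what makes the Docker build error out cleanly on unsupported hosts.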
start-jupyterhub.sh

@@ -7,6 +7,7 @@ set -e

 SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
 DOCKER_DIR="${SCRIPT_DIR}/obijupyterhub/"
+BUILDER_IMAGE="obijupyterhub-builder:latest"

 # Colors for display
 GREEN='\033[0;32m'
@@ -48,44 +49,102 @@ while [[ $# -gt 0 ]]; do
 done

 if $STOP_SERVER && $UPDATE_LECTURES; then
-    echo "❌ --stop-server and --update-lectures cannot be used together" >&2
+    echo "Error: --stop-server and --update-lectures cannot be used together" >&2
     exit 1
 fi

-echo "🚀 Starting JupyterHub for Lab"
+echo "Starting JupyterHub for Lab"
 echo "=============================="
 echo ""

-echo -e "${BLUE}🔨 Building the volume directories...${NC}"
+echo -e "${BLUE}Building the volume directories...${NC}"
 pushd "${SCRIPT_DIR}/jupyterhub_volumes" >/dev/null
 mkdir -p caddy
 mkdir -p caddy/data
 mkdir -p caddy/config
 mkdir -p course/bin
 mkdir -p course/R_packages
 mkdir -p jupyterhub
 mkdir -p shared
 mkdir -p users
 mkdir -p web/obidoc
+mkdir -p builder/R_packages
 popd >/dev/null

 pushd "${DOCKER_DIR}" >/dev/null

 # Check we're in the right directory
 if [ ! -f "Dockerfile" ] || [ ! -f "docker-compose.yml" ]; then
-    echo "❌ Error: Run this script from the jupyterhub-tp/ directory"
+    echo "Error: Run this script from the jupyterhub-tp/ directory"
     exit 1
 fi

+check_if_image_needs_rebuild() {
+    local image_name="$1"
+    local dockerfile="$2"
+
+    # Check if image exists
+    if ! docker image inspect "$image_name" >/dev/null 2>&1; then
+        return 0  # Need to build (image doesn't exist)
+    fi
+
+    # If force rebuild, always rebuild
+    if $FORCE_REBUILD; then
+        return 0  # Need to rebuild
+    fi
+
+    # Compare Dockerfile modification time with image creation time
+    if [ -f "$dockerfile" ]; then
+        local dockerfile_mtime=$(stat -c %Y "$dockerfile" 2>/dev/null || echo 0)
+        local image_created=$(docker image inspect "$image_name" --format='{{.Created}}' 2>/dev/null | sed 's/\.000000000//' | xargs -I {} date -d "{}" +%s 2>/dev/null || echo 0)
+
+        if [ "$dockerfile_mtime" -gt "$image_created" ]; then
+            echo -e "${YELLOW}Dockerfile is newer than image, rebuild needed${NC}"
+            return 0  # Need to rebuild
+        fi
+    fi
+
+    return 1  # No need to rebuild
+}
+
+build_builder_image() {
+    if check_if_image_needs_rebuild "$BUILDER_IMAGE" "Dockerfile.builder"; then
+        local build_flag=()
+        if $FORCE_REBUILD; then
+            build_flag+=(--no-cache)
+        fi
+
+        echo ""
+        echo -e "${BLUE}Building builder image...${NC}"
+        docker build "${build_flag[@]}" -t "$BUILDER_IMAGE" -f Dockerfile.builder .
+    else
+        echo -e "${BLUE}Builder image is up to date, skipping build.${NC}"
+    fi
+}
+
+# Run a command inside the builder container with the workspace mounted.
+# R packages are persisted in jupyterhub_volumes/builder/R_packages;
+# R_LIBS includes both the builder packages (attachment) and the mounted volume.
+run_in_builder() {
+    docker run --rm \
+        -v "${SCRIPT_DIR}:/workspace" \
+        -v "${SCRIPT_DIR}/jupyterhub_volumes/builder/R_packages:/usr/local/lib/R/site-library" \
+        -e "R_LIBS=/opt/R/builder-packages:/usr/local/lib/R/site-library" \
+        -w /workspace \
+        "$BUILDER_IMAGE" \
+        bash -c "$1"
+}
+
 stop_stack() {
-    echo -e "${BLUE}📦 Stopping existing containers...${NC}"
+    echo -e "${BLUE}Stopping existing containers...${NC}"
     docker-compose down 2>/dev/null || true

-    echo -e "${BLUE}🧹 Cleaning up student containers...${NC}"
+    echo -e "${BLUE}Cleaning up student containers...${NC}"
     docker ps -aq --filter name=jupyter- | xargs -r docker rm -f 2>/dev/null || true
 }

 build_images() {
     if $NO_BUILD; then
-        echo -e "${YELLOW}⏭️ Skipping image builds (offline/no-build mode).${NC}"
+        echo -e "${YELLOW}Skipping image builds (offline/no-build mode).${NC}"
         return
     fi

@@ -94,20 +153,30 @@ build_images() {
         build_flag+=(--no-cache)
     fi

-    echo ""
-    echo -e "${BLUE}🔨 Building student image...${NC}"
-    docker build "${build_flag[@]}" -t jupyterhub-student:latest -f Dockerfile .
+    # Check and build student image
+    if check_if_image_needs_rebuild "jupyterhub-student:latest" "Dockerfile"; then
+        echo ""
+        echo -e "${BLUE}Building student image...${NC}"
+        docker build "${build_flag[@]}" -t jupyterhub-student:latest -f Dockerfile .
+    else
+        echo -e "${BLUE}Student image is up to date, skipping build.${NC}"
+    fi

-    echo ""
-    echo -e "${BLUE}🔨 Building JupyterHub image...${NC}"
-    docker build "${build_flag[@]}" -t jupyterhub-hub:latest -f Dockerfile.hub .
+    # Check and build JupyterHub image
+    if check_if_image_needs_rebuild "jupyterhub-hub:latest" "Dockerfile.hub"; then
+        echo ""
+        echo -e "${BLUE}Building JupyterHub image...${NC}"
+        docker build "${build_flag[@]}" -t jupyterhub-hub:latest -f Dockerfile.hub .
+    else
+        echo -e "${BLUE}JupyterHub image is up to date, skipping build.${NC}"
+    fi
 }

 build_obidoc() {
     local dest="${SCRIPT_DIR}/jupyterhub_volumes/web/obidoc"

     if $NO_BUILD; then
-        echo -e "${YELLOW}⏭️ Skipping obidoc build in offline/no-build mode.${NC}"
+        echo -e "${YELLOW}Skipping obidoc build in offline/no-build mode.${NC}"
         return
     fi

@@ -119,73 +188,79 @@ build_obidoc() {
     fi

     if ! $needs_build; then
-        echo -e "${BLUE}ℹ️ obidoc already present; skipping rebuild (use --build-obidoc to force).${NC}"
+        echo -e "${BLUE}obidoc already present; skipping rebuild (use --build-obidoc to force).${NC}"
         return
     fi

     echo ""
-    echo -e "${BLUE}🔨 Building obidoc documentation...${NC}"
-    BUILD_DIR=$(mktemp -d -p .)
-    pushd "$BUILD_DIR" >/dev/null
-    git clone --recurse-submodules \
-        --remote-submodules \
-        -j 8 \
-        https://github.com/metabarcoding/obitools4-doc.git
-    pushd obitools4-doc >/dev/null
-    hugo -D build --baseURL "/obidoc/"
-    mkdir -p "$dest"
-    rm -rf "${dest:?}/"*
-    mv public/* "$dest"
-    popd >/dev/null
-    popd >/dev/null
-    rm -rf
+    echo -e "${BLUE}Building obidoc documentation (in builder container)...${NC}"
+    run_in_builder '
+        set -e
+        BUILD_DIR=$(mktemp -d)
+        cd "$BUILD_DIR"
+        git clone --recurse-submodules \
+            --remote-submodules \
+            -j 8 \
+            https://github.com/metabarcoding/obitools4-doc.git
+        cd obitools4-doc
+        hugo -D build --baseURL "/obidoc/"
+        mkdir -p /workspace/jupyterhub_volumes/web/obidoc
+        rm -rf /workspace/jupyterhub_volumes/web/obidoc/*
+        mv public/* /workspace/jupyterhub_volumes/web/obidoc/
+        cd /
+        rm -rf "$BUILD_DIR"
+    '
 }

 build_website() {
     echo ""
-    echo -e "${BLUE}🔨 Building web site...${NC}"
-    pushd ../web_src >/dev/null
-    quarto render
-    find . -name '*.pdf' -print \
-        | while read pdfname ; do
-            dest="../jupyterhub_volumes/web/pages/${pdfname}"
-            dirdest=$(dirname "$dest")
-            mkdir -p "$dirdest"
-            echo "cp '${pdfname}' '${dest}'"
-        done \
-        | bash
-    python3 ../tools/generate_pdf_galleries.py
-    python3 ../tools/generate_pages_json.py
-    popd >/dev/null
+    echo -e "${BLUE}Building web site (in builder container)...${NC}"
+    run_in_builder '
+        set -e
+        echo "-> Detecting and installing R dependencies..."
+        Rscript /workspace/tools/install_quarto_deps.R /workspace/web_src
+
+        echo "-> Rendering Quarto site..."
+        cd /workspace/web_src
+        quarto render
+        find . -name "*.pdf" -print | while read pdfname; do
+            dest="/workspace/jupyterhub_volumes/web/pages/${pdfname}"
+            dirdest=$(dirname "$dest")
+            mkdir -p "$dirdest"
+            cp "$pdfname" "$dest"
+        done
+        python3 /workspace/tools/generate_pdf_galleries.py
+        python3 /workspace/tools/generate_pages_json.py
+    '
 }

 start_stack() {
     echo ""
-    echo -e "${BLUE}🚀 Starting JupyterHub...${NC}"
+    echo -e "${BLUE}Starting JupyterHub...${NC}"
     docker-compose up -d --remove-orphans

     echo ""
-    echo -e "${YELLOW}⏳ Waiting for JupyterHub to start...${NC}"
+    echo -e "${YELLOW}Waiting for JupyterHub to start...${NC}"
     sleep 3
 }

 print_success() {
     if docker ps | grep -q jupyterhub; then
         echo ""
-        echo -e "${GREEN}✅ JupyterHub is running!${NC}"
+        echo -e "${GREEN}JupyterHub is running!${NC}"
         echo ""
-        echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
-        echo -e "${GREEN}🌐 JupyterHub available at: http://localhost:8888${NC}"
-        echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
+        echo "-------------------------------------------"
+        echo -e "${GREEN}JupyterHub available at: http://localhost:8888${NC}"
+        echo "-------------------------------------------"
         echo ""
-        echo "📝 Password: metabar2025"
-        echo "👥 Students can connect with any username"
+        echo "Password: metabar2025"
+        echo "Students can connect with any username"
         echo ""
-        echo "🔑 Admin account:"
+        echo "Admin account:"
         echo "   Username: admin"
         echo "   Password: admin2025"
         echo ""
-        echo "📂 Each student will have access to:"
+        echo "Each student will have access to:"
         echo "   - work/            : personal workspace (everything saved)"
         echo "   - work/R_packages/ : personal R packages (writable)"
         echo "   - work/shared/     : shared workspace"
@@ -193,12 +268,12 @@ print_success() {
         echo "   - work/course/R_packages/ : shared R packages by prof (read-only)"
         echo "   - work/course/bin/        : shared executables (in PATH)"
         echo ""
-        echo "🔍 To view logs: docker-compose logs -f jupyterhub"
-        echo "🛑 To stop:      docker-compose down"
+        echo "To view logs: docker-compose logs -f jupyterhub"
+        echo "To stop:      docker-compose down"
         echo ""
     else
         echo ""
-        echo -e "${YELLOW}⚠️ JupyterHub container doesn't seem to be starting${NC}"
+        echo -e "${YELLOW}JupyterHub container doesn't seem to be starting${NC}"
         echo "Check logs with: docker-compose logs jupyterhub"
         exit 1
     fi
@@ -211,12 +286,14 @@ if $STOP_SERVER; then
 fi

 if $UPDATE_LECTURES; then
+    build_builder_image
     build_website
     popd >/dev/null
     exit 0
 fi

 stop_stack
+build_builder_image
 build_images
 build_website
 build_obidoc
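The rebuild test in `check_if_image_needs_rebuild` boils down to comparing two timestamps. A self-contained sketch of that comparison — two temporary files stand in for the Dockerfile and the image's creation record, so no Docker daemon is needed:

```shell
# Recreate the mtime comparison: rebuild when the "Dockerfile" is newer
# than the "image". stat -c %Y is GNU; stat -f %m is the BSD/macOS fallback.
tmpdir=$(mktemp -d)
touch "$tmpdir/image_created"
sleep 1
touch "$tmpdir/Dockerfile"

file_mtime() {
    stat -c %Y "$1" 2>/dev/null || stat -f %m "$1"
}

dockerfile_mtime=$(file_mtime "$tmpdir/Dockerfile")
image_mtime=$(file_mtime "$tmpdir/image_created")

if [ "$dockerfile_mtime" -gt "$image_mtime" ]; then
    echo "rebuild needed"
else
    echo "image up to date"
fi
rm -rf "$tmpdir"
```

Note that the script's version substitutes 0 for either timestamp on error, so a failed probe leans toward "no rebuild" rather than triggering spurious builds.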
tools/install_quarto_deps.R (new file, 59 lines)

@@ -0,0 +1,59 @@
#!/usr/bin/env Rscript
# Script to dynamically detect and install R dependencies from Quarto files.
# Uses the {attachment} package to scan .qmd files for library()/require() calls.

args <- commandArgs(trailingOnly = TRUE)
quarto_dir <- if (length(args) > 0) args[1] else "."

# Target library for installing packages (the mounted volume)
target_lib <- "/usr/local/lib/R/site-library"

cat("Scanning Quarto files in:", quarto_dir, "\n")
cat("Target library:", target_lib, "\n")

# Find all .qmd files
qmd_files <- list.files(
  path = quarto_dir,
  pattern = "\\.qmd$",
  recursive = TRUE,
  full.names = TRUE
)

if (length(qmd_files) == 0) {
  cat("No .qmd files found.\n")
  quit(status = 0)
}

cat("Found", length(qmd_files), "Quarto files\n")

# Extract dependencies using {attachment}
deps <- attachment::att_from_rmds(qmd_files, inline = TRUE)

if (length(deps) == 0) {
  cat("No R package dependencies detected.\n")
  quit(status = 0)
}

cat("\nDetected R packages:\n")
cat(paste(" -", deps, collapse = "\n"), "\n\n")

# Filter out base R packages that are always available
base_pkgs <- rownames(installed.packages(priority = "base"))
deps <- setdiff(deps, base_pkgs)

# Check which packages are not installed
installed <- rownames(installed.packages())
to_install <- setdiff(deps, installed)

if (length(to_install) == 0) {
  cat("All required packages are already installed.\n")
} else {
  cat("Installing missing packages:", paste(to_install, collapse = ", "), "\n\n")
  install.packages(
    to_install,
    lib = target_lib,
    repos = "https://cloud.r-project.org/",
    dependencies = TRUE
  )
  cat("\nPackage installation complete.\n")
}
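As a quick host-side sanity check, the scan that `att_from_rmds()` performs can be roughly approximated with grep. A sketch only — `scan_deps` is a hypothetical helper, and unlike {attachment} it misses `pkg::fun()` calls and other attachment forms:

```shell
# Rough stand-in for the {attachment} scan: collect package names passed
# to library()/require() across .qmd files.
scan_deps() {
    grep -hoE '(library|require)\([A-Za-z0-9.]+\)' "$@" \
        | sed -E 's/^(library|require)\(//; s/\)$//' \
        | sort -u
}

tmp=$(mktemp -d)
printf 'library(knitr)\nrequire(gt)\n' > "$tmp/demo.qmd"
scan_deps "$tmp"/*.qmd   # prints: gt, knitr (one per line)
rm -rf "$tmp"
```

This is only a diagnostic; the real dependency detection and installation happens inside the builder container via `install_quarto_deps.R`.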
@@ -3,22 +3,31 @@ title: "Biodiversity metrics \ and metabarcoding"
 author: "Eric Coissac"
 date: "02/02/2024"
 bibliography: inst/REFERENCES.bib
-format:
+format:
   revealjs:
     css: ../../slides.css
     transition: slide
     scrollable: true
-    theme: beige
+    theme: beige
     html-math-method: mathjax
     embed-resources: true
 editor: visual
 ---

 ```{r setup, include=FALSE}
-library(knitr)
+library(knitr)
 library(Rdpack)
 library(tidyverse)
 library(kableExtra)
+library(gt)
 library(latex2exp)
+
+# Install MetabarSchool if not available
+if (!requireNamespace("MetabarSchool", quietly = TRUE)) {
+  if (!requireNamespace("remotes", quietly = TRUE)) {
+    install.packages("remotes", dependencies = TRUE)
+  }
+  remotes::install_git('https://forge.metabarcoding.org/MetabarcodingSchool/biodiversity-metrics.git')
+}
 library(MetabarSchool)

 opts_chunk$set(echo = FALSE,
@@ -49,7 +58,7 @@ install.packages("devtools",dependencies = TRUE)
 Then you can install *MetabarSchool*

 ```{r eval=FALSE, echo=TRUE}
-devtools::install_git("https://git.metabarcoding.org/MetabarcodingSchool/biodiversity-metrics.git")
+remotes::install_git('https://forge.metabarcoding.org/MetabarcodingSchool/biodiversity-metrics.git')
 ```

 You will also need the *vegan* package
@@ -68,11 +77,15 @@ A 16 plants mock community
 data("plants.16")
 x = cbind(` ` =seq_len(nrow(plants.16)),plants.16)
 x$`Relative aboundance`=paste0('1/',1/x$dilution)
-knitr::kable(x[,-(4:5)],
-             format = "html",
-             row.names = FALSE,
-             align = "rlrr") %>%
-  kable_styling(position = "center")
+x[,-(4:5)] %>%
+  gt() %>%
+  cols_align(align = "center", columns = 1) %>%
+  cols_align(align = "left", columns = 2) %>%
+  cols_align(align = "right", columns = c(3, 4)) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 ## The experiment {.flexbox .vcenter}
@@ -100,11 +113,14 @@ data("positive.motus")
 - `positive.count` read count matrix $`r nrow(positive.count)` \; PCRs \; \times \; `r ncol(positive.count)` \; MOTUs$

 ```{r}
-knitr::kable(positive.count[1:5,1:5],
-             format="html",
-             align = 'rc') %>%
-  kable_styling(position = "center") %>%
-  row_spec(0, angle = -45)
+as.data.frame(positive.count[1:5,1:5]) %>%
+  gt() %>%
+  cols_align(align = "right", columns = 1) %>%
+  cols_align(align = "center", columns = 2:ncol(positive.count[1:5,1:5])) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 <br>
@@ -126,10 +142,14 @@ data("positive.motus")
 - `positive.samples` a `r nrow(positive.samples)` rows `data.frame` of `r ncol(positive.samples)` columns describing each PCR

 ```{r}
-knitr::kable(head(positive.samples,n=3),
-             format="html",
-             align = 'rc') %>%
-  kable_styling(position = "center")
+head(positive.samples,n=3) %>%
+  gt() %>%
+  cols_align(align = "right", columns = 1) %>%
+  cols_align(align = "center", columns = 2:ncol(head(positive.samples,n=3))) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 <br>
@@ -151,10 +171,16 @@ data("positive.motus")
 - `positive.motus` : a `r nrow(positive.motus)` rows `data.frame` of `r ncol(positive.motus)` columns describing each MOTU

 ```{r}
-knitr::kable(head(positive.motus,n=3),
-             format = "html",
-             align = 'rlrc') %>%
-  kable_styling(position = "center")
+head(positive.motus,n=3) %>%
+  gt() %>%
+  cols_align(align = "right", columns = 1) %>%
+  cols_align(align = "left", columns = 2) %>%
+  cols_align(align = "right", columns = 3) %>%
+  cols_align(align = "center", columns = 4) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 <br>
@@ -172,10 +198,17 @@ table(colSums(positive.count) == 1)
 ```

 ```{r}
-kable(t(table(colSums(positive.count) == 1)),
-      format = "html") %>%
-  kable_styling(position = "center") %>%
-  row_spec(0, align = 'c')
+as.data.frame(t(table(colSums(positive.count) == 1))) %>%
+  gt() %>%
+  cols_align(align = "center", columns = everything()) %>%
+  tab_style(
+    style = cell_text(align = "center"),
+    locations = cells_column_labels()
+  ) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 <br>
@@ -195,7 +228,7 @@ positive.motus = positive.motus[are.not.singleton,]
 Despite all standardization efforts

 ```{r fig.height=3}
-par(bg=NA)
+par(bg=NA)
 hist(rowSums(positive.count),
      breaks = 15,
      xlab="Read counts",
@@ -209,7 +242,7 @@ Is it related to the amount of DNA in the extract ?
 ## What do the reading numbers per PCR mean? {.smaller}

 ```{r echo=TRUE, fig.height=4}
-par(bg=NA)
+par(bg=NA)
 boxplot(rowSums(positive.count) ~ positive.samples$dilution,log="y")
 abline(h = median(rowSums(positive.count)),lw=2,col="red",lty=2)
 ```
@@ -288,7 +321,7 @@ table(are.still.present)
 ## Rarefying read count (4) {.flexbox .vcenter}

 ```{r echo=TRUE, fig.height=3.5}
-par(bg=NA)
+par(bg=NA)
 boxplot(colSums(positive.count) ~ are.still.present, log="y")
 ```

@@ -360,10 +393,13 @@ knitr::include_graphics("figures/alpha_diversity.svg")
 E1 = c(A=0.25,B=0.25,C=0.25,D=0.25,E=0,F=0,G=0)
 E2 = c(A=0.55,B=0.07,C=0.02,D=0.17,E=0.07,F=0.07,G=0.03)
 environments = t(data.frame(`Environment 1` = E1,`Environment 2` = E2))
-kable(environments,
-      format="html",
-      align = 'rr') %>%
-  kable_styling(position = "center")
+as.data.frame(environments) %>%
+  gt() %>%
+  cols_align(align = "right", columns = everything()) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 ## Richness {.flexbox .vcenter}
@@ -379,10 +415,13 @@ S = rowSums(environments > 0)
 ```

 ```{r}
-kable(data.frame(S=S),
-      format="html",
-      align = 'rr') %>%
-  kable_styling(position = "center")
+data.frame(S=S) %>%
+  gt() %>%
+  cols_align(align = "right", columns = everything()) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 ## Gini-Simpson's index {.smaller}
@@ -414,10 +453,13 @@ GS = 1 - rowSums(environments^2)
 ```

 ```{r}
-kable(data.frame(`Gini-Simpson`=GS),
-      format="html",
-      align = 'rr') %>%
-  kable_styling(position = "center")
+data.frame(`Gini-Simpson`=GS) %>%
+  gt() %>%
+  cols_align(align = "right", columns = everything()) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 ## Shannon entropy {.smaller}
@@ -443,10 +485,13 @@ H = - rowSums(environments * log(environments),na.rm = TRUE)
 ```

 ```{r}
-kable(data.frame(`Shannon index`=H),
-      format="html",
-      align = 'rr') %>%
-  kable_styling(position = "center")
+data.frame(`Shannon index`=H) %>%
+  gt() %>%
+  cols_align(align = "right", columns = everything()) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 ## Hill's number {.smaller}
@@ -476,10 +521,17 @@ D2 = exp(- rowSums(environments * log(environments),na.rm = TRUE))
 ```

 ```{r}
-kable(data.frame(`Hill Numbers`=D2),
-      format="html",
-      align = 'rr') %>%
-  kable_styling(position = "center")
+data.frame(`Hill Numbers` = D2) %>%
+  gt() %>%
+  cols_align(align = "center") %>%
+  tab_style(
+    style = cell_text(weight = "bold"),
+    locations = cells_column_labels()
+  ) %>%
+  tab_options(
+    table.align = "center",
+    heading.align = "center"
+  )
 ```

 ## Generalized logaritmic function {.smaller}
@@ -493,7 +545,7 @@ $$
 The function is not defined for $q=1$ but when $q \longrightarrow 1\;,\; ^q\log(x) \longrightarrow \log(x)$

 $$
-^q\log(x) = \left\{
+^q\log(x) = \left\{
 \begin{align}
 \log(x),& \text{if } q = 1\\
 \frac{x^{(1-q)}-1}{1-q},& \text{otherwise}
@@ -505,7 +557,7 @@ $$
 log_q = function(x,q=1) {
   if (q==1)
     log(x)
-  else
+  else
     (x^(1-q)-1)/(1-q)
 }
 ```
@@ -535,7 +587,7 @@ legend("topleft",legend = qs,fill = seq_along(qs),cex=1.5)
 ## And its inverse function {.flexbox .vcenter}

 $$
-^qe^x = \left\{
+^qe^x = \left\{
 \begin{align}
 e^x,& \text{if } x = 1 \\
 (1 + x(1-q))^{(\frac{1}{1-q})},& \text{otherwise}
@@ -601,14 +653,14 @@ environments.dq = apply(environments,MARGIN = 1,D_spectrum,q=qs)

 ```{r}
 par(mfrow=c(1,2),bg=NA)
-plot(qs,environments.hq[,2],type="l",col="red",
+plot(qs,environments.hq[,2],type="l",col="red",
      xlab=TeX('$q$'),
      ylab=TeX('$^qH$'),
      xlim=c(-0.5,3.5),
      main="generalized entropy")
 points(qs,environments.hq[,1],type="l",col="blue")
 abline(v=c(0,1,2),lty=2,col=4:6)
-plot(qs,environments.dq[,2],type="l",col="red",
+plot(qs,environments.dq[,2],type="l",col="red",
      xlab=TeX('$q$'),
      ylab=TeX('$^qD$'),
      main="Hill's number")
@@ -657,7 +709,7 @@ plot(qs,H.mock,type="l",
      xlim=c(-0.5,3.5),
      main="generalized entropy")
 abline(v=c(0,1,2),lty=2,col=4:6)
-plot(qs,D.mock,type="l",
+plot(qs,D.mock,type="l",
      xlab=TeX('$q$'),
      ylab=TeX('$^qD$'),
      main="Hill's number")
@@ -674,7 +726,7 @@ positive.H = apply(positive.count.relfreq,
 ```

 ```{r}
-par(bg=NA)
+par(bg=NA)
 boxplot(t(positive.H),
         xlab=TeX('$q$'),
         ylab=TeX('$^qH$'),
@@ -685,7 +737,7 @@ points(H.mock,col="red",type="l")
 ## Biodiversity spectrum and metabarcoding (2) {.flexbox .vcenter .smaller}

 ```{r}
-par(bg=NA)
+par(bg=NA)
 boxplot(t(positive.H)[,11:31],
         xlab=TeX('$q$'),
         ylab=TeX('$^qH$'),
@@ -706,7 +758,7 @@ positive.D = apply(positive.count.relfreq,
 ```

 ```{r}
-par(bg=NA)
+par(bg=NA)
 boxplot(t(positive.D),
         xlab=TeX('$q$'),
         ylab=TeX('$^qD$'),
@@ -753,7 +805,7 @@ positive.clean.H = apply(positive.clean.count.relfreq,
 ```

 ```{r fig.height=3.5}
-par(bg=NA)
+par(bg=NA)
 boxplot(t(positive.clean.H),
         xlab=TeX('$q$'),
         ylab=TeX('$^qH$'),
@@ -771,7 +823,7 @@ positive.clean.D = apply(positive.clean.count.relfreq,
 ```

 ```{r}
-par(bg=NA)
+par(bg=NA)
 boxplot(t(positive.clean.D),
         xlab=TeX('$q$'),
         ylab=TeX('$^qD$'),
@@ -1069,7 +1121,7 @@ BC_{jk}=\frac{\sum _{i=1}^{p}(N_{ij} - min(N_{ij},N_{ik}) + (N_{ik} - min(N_{ij}
 $$

 $$
-BC_{jk}=\frac{\sum _{i=1}^{p}|N_{ij} - N_{ik}|}{\sum _{i=1}^{p}N_{ij}+\sum _{i=1}^{p}N_{ik}}
+BC_{jk}=\frac{\sum _{i=1}^{p}|N_{ij} - N_{ik}|}{\sum _{i=1}^{p}N_{ij}+\sum _{i=1}^{p}N_{ik}}
 $$

 $$
@@ -1159,7 +1211,7 @@ legend("topleft",legend = levels(samples.type),fill = 1:4,cex=1.2)

 ````{=html}
 <!---
-## Computation of norms
+## Computation of norms

 ```{r guiana_norm, echo=TRUE}
 guiana.n1.dist = norm(guiana.relfreq.final,l=1)
@@ -1168,7 +1220,7 @@ guiana.n3.dist = norm(guiana.relfreq.final^(1/3),l=3)
 guiana.n4.dist = norm(guiana.relfreq.final^(1/100),l=100)
 ```

-## pCoA on norms
+## pCoA on norms

 ```{r dependson="guiana_norm"}
 guiana.n1.pcoa = cmdscale(guiana.n1.dist,k=3,eig = TRUE)