---
title: "Introduction to Unix by Example"
subtitle: "A Modern Approach for Bioinformaticians"
author: "Updated Course Material"
date: today
format:
  html:
    toc: true
    toc-depth: 3
    number-sections: true
    code-fold: false
    theme: cosmo
---
# Why Unix?
The Unix operating system was born at AT&T's Bell Laboratories ("Bell Labs") in the United States. Created in the late 1960s, it derives from Multics, a system developed a few years earlier by MIT, General Electric, and Bell Labs. Unix spread rapidly because Bell Labs distributed its new system as freely modifiable source code. This led to the emergence of Unix families produced by the system's main users: research laboratories on one hand and major computer manufacturers on the other.
From the beginning, Unix development has been closely linked to scientific computing. Its multi-user, multitasking design and its small text-based tools that combine easily explain why this operating system is still widely used in many research fields today.
Today, Unix is a registered trademark of The Open Group, which standardizes all Unix systems. However, there is a broader definition that includes "Unix-like" systems such as GNU/Linux. Despite proclaiming in its name not to be Unix (GNU's Not Unix), this family of operating systems has such functional similarities with its ancestor that it's difficult to explain how it isn't Unix.
Nowadays, a Unix system can be installed on virtually any machine, from personal computers to large computing servers. Notably, for several years, Apple's standard operating system on Macintosh computers, macOS, has been a certified Unix system.
# Unix System Overview
Unix is a multitasking and multi-user operating system. This means it can manage the simultaneous use of the same computer by multiple people, and for each person, it allows parallel execution of multiple programs. The multiplicity of users and running programs on the same machine requires particular resource management, involving restricted rights for each user so that one person's work doesn't interfere with another's.
```{mermaid}
flowchart TB
U1["User 1"] --> S1["Shell<br/>(Command Interpreter)"]
U2["User 2"] --> S1
U3["User N"] --> S1
subgraph Unix ["Unix Operating System"]
S1["Shell<br/>(Command Interpreter)"] --> K1["Kernel<br/>(Core System)"]
K1 --> R1["CPU"]
K1 --> R2["Memory"]
K1 --> R3["Disk Storage"]
K1 --> R4["Network"]
end
```
## Users
Each Unix system user needs an account or "machine access right" to work. Each account is identified by a login name.
Associated with each login:
- A password that secures system access
- A user ID (UID) that identifies the user on the machine
- A location on the hard drive to store user files, called the home directory
- A user group, allowing collaborative work (see later)
Information about all users on a machine is typically stored in a text file: `/etc/passwd`
```bash
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
alice:x:1000:1000:Alice Smith:/home/alice:/bin/bash
bob:x:1001:1001:Bob Jones:/home/bob:/bin/bash
```
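Each line corresponds to a user. Fields are separated by `:` characters. In order: login, password field, UID, group ID, full name, home directory, and default shell. On modern systems the password field usually contains just `x`, meaning the hashed password is stored separately (typically in `/etc/shadow`).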
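As a minimal sketch of how these colon-separated fields can be read from the shell, the loop below splits each line by setting `IFS=:`. It works on a small sample file with illustrative data (created in a temporary location) rather than on the real `/etc/passwd`, so it is safe to run anywhere.

```bash
# Build a small sample in the /etc/passwd format (illustrative data only)
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice Smith:/home/alice:/bin/bash
bob:x:1001:1001:Bob Jones:/home/bob:/bin/bash
EOF

# Split every line on ':' into the seven fields described above
summary=$(while IFS=: read -r login password uid gid fullname home shell; do
    echo "$login (UID $uid) home=$home shell=$shell"
done < "$sample")
echo "$summary"

rm -f "$sample"
```

The first line printed should be `root (UID 0) home=/root shell=/bin/bash`.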
## The File System
The file system of an operating system encompasses all mechanisms for managing storage space (hard drives) on the computer. Data and programs are stored in files. A file can be thought of as a small part of a hard drive dedicated to storing a set of data.
### File Names
In a Unix system, a file name describes a path in a tree. A file name starts with a `/` character and consists of successive node labels describing the file's location in the name tree. Each label is separated from the preceding one by the `/` character.
```{mermaid}
graph TB
root["/"]
root --> bin["/bin"]
root --> etc["/etc"]
root --> home["/home"]
root --> usr["/usr"]
root --> var["/var"]
etc --> passwd["passwd"]
etc --> hosts["hosts"]
home --> alice["alice"]
home --> bob["bob"]
alice --> documents["documents"]
alice --> data["data.txt"]
usr --> local["/local"]
usr --> usrbin["/bin"]
style passwd fill:#e1f5ff
style root fill:#ffe1e1
```
For example, the name `/etc/passwd` indicates that the file `passwd` is located in a node (directory) named `etc`, which is itself located at the root `/` of the file name tree.
### Standard Directory Structure
Certain directories are found in many Unix systems:
- `/etc` - Contains system configuration files
- `/var` - Contains system operation information
- `/bin` - Contains basic system programs
- `/usr` - Contains a large part of the system
- `/usr/local` - Contains programs specific to a machine
- `/home` - Contains user home directories
- `/tmp` - Temporary files
### Lexical Rules for File Names
File name labels can contain:
- Alphabetic characters (a-z and A-Z)
- Numeric characters (0-9)
- Punctuation marks (& , $ , * , + , = , . , etc.)
However, using some of these signs can cause problems. It's recommended to use only: `. , % , - , _ , : , =`
**Important**: Unix is case-sensitive. `TODO`, `todo`, `Todo`, and `ToDo` are all different names.
File names starting with a dot `.` are hidden files and typically correspond to configuration files.
### Links
The concept of a link can be compared to a shortcut in other operating systems. A link is a special file that creates additional edges in the file name tree. From a computer science perspective, the name tree becomes a more general directed graph, which may even contain cycles.
```{mermaid}
graph LR
root["/"]
usr["/usr"]
bin["/bin"]
home["/home"]
alice["alice"]
programs["programs<br/>(link)"]
grep["grep"]
root --> usr
root --> home
usr --> bin
home --> alice
alice --> programs
bin --> grep
programs -.->|symbolic link| bin
style programs fill:#fff2cc
style grep fill:#e1f5ff
```
Creating a link in a Unix file system creates a synonym between the link name and the target file.
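A minimal sketch of creating such a synonym, assuming a symbolic link made with `ln -s` (the kind of link shown in the diagram). The file names `sequence.txt` and `shortcut.txt` are hypothetical and live in a scratch directory.

```bash
# Work in a scratch directory so nothing real is touched
scratch=$(mktemp -d)
echo "ACGT" > "$scratch/sequence.txt"

# Create a symbolic link named shortcut.txt that points to sequence.txt
ln -s sequence.txt "$scratch/shortcut.txt"

# Reading through the link returns the target's content
via_link=$(cat "$scratch/shortcut.txt")
echo "$via_link"

# ls -l displays the link together with the name it points to
ls -l "$scratch"

rm -rf "$scratch"
```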
### The `.` and `..` Directories
Unix uses links to facilitate navigation in the file name tree. When creating a directory node, the system automatically adds two links under this node named `.` and `..`:
- `.` links to the directory containing it
- `..` points to the parent directory
```{mermaid}
graph TB
root["/"]
home["/home"]
alice["alice"]
dot[". (alice)"]
dotdot[".. (home)"]
docs["documents"]
root --> home
home --> alice
alice --> dot
alice --> dotdot
alice --> docs
dot -.-> alice
dotdot -.-> home
style dot fill:#fff2cc
style dotdot fill:#fff2cc
```
These links mean that for each file, there isn't just one name but an infinite number of possible names. The file `/home/alice/myfile` can also be named:
- `/home/alice/./myfile`
- `/home/alice/../../home/alice/myfile`
- `/home/alice/./././myfile`
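These equivalences can be checked from the shell. The sketch below rebuilds a miniature `home/alice` tree inside a scratch directory (an assumption made so that the example runs anywhere) and uses the `-ef` test, which is true when two names designate the same file.

```bash
scratch=$(mktemp -d)
mkdir -p "$scratch/home/alice"
echo "data" > "$scratch/home/alice/myfile"

base="$scratch/home/alice"
same=yes
for name in "$base/./myfile" "$base/../../home/alice/myfile" "$base/./././myfile"; do
    # -ef is true when both names refer to the same file
    if [ "$name" -ef "$base/myfile" ]; then
        echo "same file: ${name#"$scratch"}"
    else
        same=no
    fi
done

rm -rf "$scratch"
```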
### Current Directory and Relative Paths
The hierarchical tree structure of Unix file names is powerful but often produces very long file names. To work around this problem, Unix offers the concept of current directory and relative paths.
**Current Directory**: When working on a machine, you typically work on a set of files located in the same region of the name tree. The common part of all these names is stored in an environment variable called `PWD` (the current working directory).
By default, when you log into your Unix account, this variable is initialized with your home directory name. You can change this variable's value using the `cd` command.
**Relative Paths**: Relative file names are expressed relative to the current directory. To know the true name corresponding to a relative name, you concatenate the current directory name and the relative name.
Example:
```bash
# If current directory is: /home/alice/experiment_1
# These files:
/home/alice/experiment_1/sequence.fasta
/home/alice/experiment_1/expression.dat
/home/alice/experiment_1/annotation.gff
# Can be named simply:
sequence.fasta
expression.dat
annotation.gff
```
A relative name is recognized by the fact it doesn't start with `/`. In contrast, complete file names are called absolute paths and always start with `/`.
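The concatenation rule can be sketched as follows. The directory `experiment_1` is recreated inside a scratch directory (an assumption for the sake of a runnable example), and the same file is read once through its relative name and once through the absolute name rebuilt from `$PWD`.

```bash
scratch=$(mktemp -d)
start_dir=$PWD
mkdir -p "$scratch/experiment_1"
echo ">seq1" > "$scratch/experiment_1/sequence.fasta"

cd "$scratch/experiment_1" || exit 1

# Relative name: interpreted from the current directory
relative_content=$(cat sequence.fasta)

# Absolute name rebuilt by concatenating $PWD and the relative name
absolute_name="$PWD/sequence.fasta"
absolute_content=$(cat "$absolute_name")

echo "$absolute_name"

# Go back where we started and clean up
cd "$start_dir" && rm -rf "$scratch"
```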
### Access Rights
Unix is a multi-user system. To protect each user's data from others, each file belongs to a specific user (usually its creator) and a user group. Additionally, each file has access rights concerning:
- The file owner
- The group to which the file belongs
- All other system users
For each of these three user categories, there are read, write, and execute rights:
- **Read right**: Allows reading the file
- **Write right**: Authorizes modifying the file's content (deleting or renaming a file depends on write permission on the directory that contains it)
- **Execute right**: Allows executing the file if it contains a program
For directories, execute right indicates permission to use it as an element of a file name.
```bash
# Example of file permissions
$ ls -l
-rw-r--r-- 1 alice staff 1024 Nov 03 10:30 data.txt
-rwxr-xr-x 1 alice staff 2048 Nov 03 10:31 script.sh
drwxr-xr-x 2 alice staff 512 Nov 03 10:32 results
```
Rights can be modified by the file owner using the `chmod` command.
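A short sketch of the effect of `chmod` on the permission string displayed by `ls -l`. The file `script.sh` is a hypothetical empty file in a scratch directory; only the first ten characters of the `ls -l` line (file type plus the three permission triplets) are kept.

```bash
scratch=$(mktemp -d)
touch "$scratch/script.sh"

# Start from read/write for the owner, read-only for everyone else
chmod 644 "$scratch/script.sh"
before=$(ls -l "$scratch/script.sh" | cut -c1-10)

# Grant execute permission to the owner only
chmod u+x "$scratch/script.sh"
after=$(ls -l "$scratch/script.sh" | cut -c1-10)

echo "before: $before"   # -rw-r--r--
echo "after:  $after"    # -rwxr--r--

rm -rf "$scratch"
```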
## Processes
A program corresponds to a sequence of calculation instructions that the computer must execute to perform a task. While it's important to store this instruction sequence for regular reuse, it's equally important to execute it. A process corresponds to the execution of a program.
Since Unix is multitasking and multi-user, the same program can be executed simultaneously by multiple processes. It's therefore important to distinguish between program and process.
### Process Anatomy
A process can be considered as part of the computer's memory dedicated to program execution. This memory chunk can be divided into three main parts: the environment, data area, and program area.
```{mermaid}
flowchart TB
subgraph Process["Process Memory Space"]
direction TB
Env["Environment<br/>- Variables<br/>- File descriptors<br/>- PID/PPID"]
Code["Code Area<br/>- Program instructions"]
Data["Data Area<br/>- Variables<br/>- Computation results"]
end
Parent["Parent Process"] -.->|fork| Process
style Env fill:#e1f5ff
style Code fill:#ffe1e1
style Data fill:#e1ffe1
```
### Process Environment
A process is an isolated memory area where a program executes. Isolation secures the computer by preventing a program from corrupting others' execution. However, during execution, a program must interact with the rest of the computer.
The process environment is dedicated to this interface task. It contains descriptions of system elements the process needs to know. Two main types of information are stored:
**Environment Variables**: Associate a name with a value describing certain system properties. Examples:
- `PWD`: Current Working Directory for interpreting relative paths
- `PATH`: List of directories where available programs are stored
- `HOME`: User's home directory
- `USER`: Current username
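A minimal sketch of how environment variables reach a child process. The variable names `scratch_note` and `SHARED_NOTE` are hypothetical; the child is simply another `bash` started with `bash -c`. Only the exported variable is part of the environment the child receives.

```bash
# A plain shell variable stays private to the current shell
scratch_note="not exported"

# An exported variable becomes part of the environment passed to children
export SHARED_NOTE="exported"

# Start a child bash process and ask it what it can see
child_view=$(bash -c 'echo "plain=[${scratch_note}] shared=[${SHARED_NOTE}]"')
echo "$child_view"   # plain=[] shared=[exported]
```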
**Streams**: Virtual pipes through which data transits. By default, three streams are associated with each process:
- `stdin` (standard input): How a Unix program normally receives data
- `stdout` (standard output): Used by the program to return results
- `stderr` (standard error): Used for error messages and information
```{mermaid}
flowchart LR
Input[("Input<br/>Source")] --> stdin["stdin<br/>(0)"]
stdin --> Process["Process"]
Process --> stdout["stdout<br/>(1)"]
Process --> stderr["stderr<br/>(2)"]
stdout --> Output[("Output<br/>Destination")]
stderr --> Error[("Error<br/>Log")]
style stdin fill:#e1f5ff
style stdout fill:#e1ffe1
style stderr fill:#ffe1e1
```
### Process Lifecycle
Every process has a parent (except the initial process) and inherits all its properties: environment, data area, and program code to execute.
```{mermaid}
stateDiagram-v2
[*] --> Init: System Boot
Init --> Parent: fork()
Parent --> Child1: fork()
Parent --> Child2: fork()
Child1 --> [*]: exit()
Child2 --> [*]: exit()
Parent --> [*]: All children terminated
note right of Parent
PID: 1234
Creates child processes
end note
note right of Child1
PID: 1235
Inherits parent environment
end note
```
Important points:
- Every process has a parent and inherits all its properties
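- A child process normally terminates before its parent; if the parent exits first, the orphaned child is adopted by another process (traditionally the initial process)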
- When you close your shell, all running programs are terminated unless detached
- A process is created by copying its parent, inheriting its properties except PID
The normal chronology for creating a new process:
1. Call the `fork()` function
2. Test which process continues execution
3. In the child process, call `exec()` to replace the program code
4. At execution end, notify the parent and wait for cleanup
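The shell performs these steps itself each time it launches a command, so they cannot be shown directly in a script. What can be observed is their result: the sketch below starts a child in the background, records its PID, then waits for it and collects its exit status (the value 3 is arbitrary, chosen only to make the status recognizable).

```bash
echo "parent PID: $$"

# Start a child process in the background; it exits with status 3
bash -c 'sleep 1; exit 3' &
child_pid=$!
echo "child PID: $child_pid"

# The parent waits for the child and collects its exit status
wait "$child_pid"
child_status=$?
echo "child exited with status $child_status"   # 3
```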
# The Unix Shell - A Working Environment
The Unix shell is the most important program for a Unix user. It's how they interact with their computer. There's a graphical window system under Unix similar to Windows or macOS, called X Window System (X11), which can operate in client/server mode across networks. However, we'll focus on interacting with Unix in "text" mode via the shell.
The Unix shell is a program capable of interpreting a command language. These commands allow users to launch program execution by specifying:
- Data to work on
- Parameters to adjust execution
- What to do with results
Several Unix shells exist, differing mainly in their command language syntax. The two most commonly used today are:
- **bash** (Bourne Again Shell): Modern version of the Bourne shell (sh)
- **zsh** (Z Shell): Enhanced version with additional features
This course focuses on **bash**, the default shell on most Linux systems. On macOS, zsh has been the default since macOS Catalina (10.15), but bash remains available.
## Basic Command Structure
A shell command describes how to trigger program execution with all necessary information. As a principle, every program installed on a Unix machine corresponds to a usable command from the shell bearing the program's name, and conversely, every Unix command is the name of an installed program.
```{mermaid}
flowchart LR
Command["Command<br/>(program name)"] --> Options["Options<br/>(flags)"]
Options --> Arguments["Arguments<br/>(input files)"]
Arguments --> Redirection["I/O Redirection<br/>(< > |)"]
style Command fill:#ffe1e1
style Options fill:#e1f5ff
style Arguments fill:#e1ffe1
style Redirection fill:#fff2cc
```
A Unix command line has four main parts:
1. **Command** (required): Program name
2. **Options** (optional): Adjust program behavior
3. **Arguments** (optional): Specify data to process
4. **Redirection** (optional): Control input/output
### The Unix Command
A Unix command is the name of a program installed on the machine. When you execute a command like `ls` or `grep`, you're actually launching execution of an eponymous program stored somewhere on your hard drives.
The machine searches for program files only in a subset of existing directories, described by a list stored in the `PATH` environment variable.
```bash
$ echo $PATH
/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
```
Directories are searched in order. If programs with the same name exist in different directories, the first one found is executed.
To execute a program in a directory not listed in `PATH`, specify its location:
```bash
# Using absolute path
$ /home/alice/myprograms/myscript.sh
# Using relative path (if in the directory)
$ ./myscript.sh
```
The `./` prefix is necessary to indicate the current directory location.
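The role of `PATH` can be sketched with a small program of our own. The script name `hello_unix` is hypothetical, and the example assumes that files in the temporary directory are allowed to be executed. Before its directory is added to `PATH` the name alone is not found; afterwards it is.

```bash
scratch=$(mktemp -d)

# A tiny program of our own (hypothetical name: hello_unix)
cat > "$scratch/hello_unix" <<'EOF'
#!/bin/sh
echo "hello from hello_unix"
EOF
chmod u+x "$scratch/hello_unix"

# Not in PATH yet: command -v finds nothing
before=$(command -v hello_unix || echo "not found")

# Put the directory first in PATH; the name alone is now enough
old_path=$PATH
PATH="$scratch:$PATH"
after=$(hello_unix)

echo "$before"
echo "$after"

# Restore the original PATH and clean up
PATH=$old_path
rm -rf "$scratch"
```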
### Command Options
Options alter command functionality by adjusting parameters. Options are recognizable as:
- Short form: Single character preceded by `-` (e.g., `-l`)
- Long form: Complete word preceded by `--` (e.g., `--list`)
Many programs offer both forms for the same option.
```bash
# Short option
$ grep -i root /etc/passwd
# Long option (equivalent)
$ grep --ignore-case root /etc/passwd
```
Some options require arguments:
```bash
# Short form with argument
$ grep -B 2 root /etc/passwd
$ grep -B2 root /etc/passwd # No space also works
# Long form with argument
$ grep --before-context=2 root /etc/passwd
```
Multiple short options can be combined:
```bash
# Separate options
$ grep -i -n root /etc/passwd
# Combined options
$ grep -in root /etc/passwd
```
If an option requires an argument, it must be placed last in the group.
### Command Arguments
Arguments indicate data the program should process, beyond data potentially transmitted through standard input. Depending on how it's programmed, a program can accept one or multiple arguments. Each argument may have a distinct role depending on the program.
To understand each argument's role, consult the program's manual page via the `man` command, or its built-in help, usually accessible with the `--help` or `-h` option.
```bash
# Example with multiple arguments
$ cp source.txt destination.txt
# Example with patterns
$ grep "pattern" file1.txt file2.txt file3.txt
```
### I/O Redirection Instructions
This fourth part of a Unix command line is crucial, allowing you to specify how your program should configure its standard inputs/outputs. This is one of the most important things to understand to fully benefit from the Unix system.
## File Name Patterns with Wildcards
It's very common in a Unix command to need to specify multiple file names. When the number of files becomes large, typing these names one by one can be tedious, especially if all file names share common characteristics.
To address this, there's a series of "wildcard" characters to indicate the form of desired file names:
| Wildcard | Matches |
|----------|---------|
| `*` | Zero, one, or more characters |
| `?` | Exactly one character |
| `[...]` | One character from the list |
| `[^...]` | One character NOT in the list |
| `[a-z]` | One character in the range |
Each word in a Unix command line using these characters is replaced during execution by the list of existing file names matching the pattern.
```bash
# List all text files
$ ls *.txt
# Files starting with 'data' and any single character
$ ls data?
# Files starting with uppercase letter
$ ls [A-Z]*
# Files NOT starting with lowercase letter
$ ls [^a-z]*
# Complex pattern
$ ls experiment_[0-9][0-9].dat
```
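If no file matches the pattern, bash by default leaves the word unchanged and passes it literally to the command. Other shells such as zsh, and bash itself when the `failglob` option is enabled, report a "no match" error instead.
```bash
$ echo *toto
*toto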
$ ls /
Applications Library bin home opt usr
Desktop Network cores sbin private var
Developer System dev etc tmp
$ echo /mach*
/mach.sym /mach_kernel /mach_kernel.ctfsys
$ echo /*.*
/atp.mol /mach.sym /mach_kernel.ctfsys /untitled.log
$ echo /[AD]*
/Applications /Desktop_DB /Desktop_DF /Developer
$ echo /[uv]??
/usr /var
```
These file name patterns are most often used with file manipulation commands like copying (`cp`), deletion (`rm`), or listing (`ls`). They're also frequently used in loops to launch the same command on an entire series of datasets.
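Since the listings above depend on the content of one particular machine, here is a reproducible sketch that can be run anywhere: it creates a handful of empty files (hypothetical names) in a scratch directory and lets the shell expand a few patterns against them.

```bash
scratch=$(mktemp -d)
start_dir=$PWD
cd "$scratch" || exit 1
touch data1 data2 data10 notes.txt experiment_01.dat experiment_7.dat

one_char=$(echo data?)                        # data1 data2
text_files=$(echo *.txt)                      # notes.txt
two_digits=$(echo experiment_[0-9][0-9].dat)  # experiment_01.dat
from_set=$(echo [en]*)                        # experiment_01.dat experiment_7.dat notes.txt

echo "$one_char"
echo "$text_files"
echo "$two_digits"
echo "$from_set"

cd "$start_dir" && rm -rf "$scratch"
```

Note that `data?` does not match `data10`: the `?` wildcard stands for exactly one character.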
## Standard I/O Redirection
The property that gives a Unix shell its full power is the standard input/output redirection system. Each process inherits three standard data streams from its parent:
- `stdin`: Standard input stream (file descriptor 0)
- `stdout`: Standard output stream (file descriptor 1)
- `stderr`: Standard error stream (file descriptor 2)
```{mermaid}
flowchart TB
subgraph Default["Default Configuration"]
Keyboard[("Keyboard")] --> stdin1["stdin"]
stdin1 --> Shell1["Shell Process"]
Shell1 --> stdout1["stdout"]
Shell1 --> stderr1["stderr"]
stdout1 --> Screen1[("Screen")]
stderr1 --> Screen1
end
style stdin1 fill:#e1f5ff
style stdout1 fill:#e1ffe1
style stderr1 fill:#ffe1e1
```
### Redirecting Standard Output
To save results generated by a program to a file, add an output redirection instruction at the end of the command line: `>` followed by a file name.
```bash
$ ls /
Applications Desktop Developer Library Network System
bin cores dev etc home usr var
$ ls / > my_listing
$ ls -l
total 8
drwxr-xr-x 2 alice staff 102 Nov 27 17:18 myprograms
-rw-r--r-- 1 alice staff 241 Dec 3 16:50 my_listing
$ cat my_listing
Applications
Desktop
Developer
Library
Network
System
bin
cores
dev
etc
home
usr
var
```
```{mermaid}
flowchart LR
stdin["stdin"] --> Process["ls /"]
Process --> stdout["stdout"]
Process --> stderr["stderr"]
stdout --> File[("my_listing")]
stderr --> Screen[("Screen")]
style stdout fill:#e1ffe1
style stderr fill:#ffe1e1
```
Important notes:
- If the file doesn't exist, it's created and filled with results
- If the file exists, it's erased and replaced with a new file
- **Be careful**: This can easily overwrite existing files
To append results to an existing file instead of replacing it, use `>>`:
```bash
$ echo "First line" > output.txt
$ echo "Second line" >> output.txt
$ cat output.txt
First line
Second line
```
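The `>` and `>>` operators only capture standard output; error messages travel on `stderr` (file descriptor 2) and still reach the screen. As a sketch, bash lets you redirect that stream separately by writing the descriptor number before the operator (`2>`). The path `no_such_dir` is deliberately nonexistent so that `ls` produces an error and no results.

```bash
scratch=$(mktemp -d)

# ls fails here: nothing is written to stdout, the message goes to stderr
ls "$scratch/no_such_dir" > "$scratch/out.txt" 2> "$scratch/err.txt"

# -s is true when a file exists and is not empty
if [ -s "$scratch/out.txt" ]; then captured_output=yes; else captured_output=no; fi
if [ -s "$scratch/err.txt" ]; then captured_error=yes; else captured_error=no; fi

echo "stdout captured something: $captured_output"   # no
echo "stderr captured something: $captured_error"    # yes

rm -rf "$scratch"
```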
### Redirecting Standard Input
Input redirection indicates where a program reading from standard input should find its data. Input redirection uses the `<` character.
```bash
$ grep or < my_listing
Network
cores
$ grep or < my_listing > my_selection
$ cat my_selection
Network
cores
```
The `grep` command selects lines of text containing a pattern (`or` in this example) and copies them to standard output. Input redirection tells the process to read from `my_listing`, and output redirection saves results to `my_selection`.
### Redirecting Output to Another Process (Pipes)
The most powerful redirection mode connects one process's standard output to another's standard input. The first program's results become the second's data. Data passes directly between processes without going through an intermediate file. This creates a "pipe" between processes.
```{mermaid}
flowchart LR
stdin1["stdin"] --> P1["ls /"]
P1 --> pipe["|<br/>pipe"]
pipe --> P2["grep or"]
P2 --> stdout2["stdout"]
P2 --> stderr2["stderr"]
stdout2 --> Screen[("Screen")]
stderr2 --> Screen
style pipe fill:#fff2cc
style stdout2 fill:#e1ffe1
style stderr2 fill:#ffe1e1
```
Syntactically, this is achieved by joining two or more commands with the `|` character:
```bash
$ ls / | grep or
Network
cores
$ ls / | grep or > my_selection
$ cat my_selection
Network
cores
```
In a complex command, a process is created for each command, and data simply transits from one to another.
**Important restrictions:**
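- A command before a pipe should not also redirect its stdout to a file: the output would go to the file and the next command would receive nothing
- A command after a pipe should not also redirect its stdin from a file: it would read the file and ignore the data coming from the previous command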
You can chain multiple pipes:
```bash
# Count lines containing "error" in log file
$ cat logfile.txt | grep error | wc -l
# Sort unique email addresses
$ cat emails.txt | sort | uniq
```
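A reproducible sketch of such a chain, using a small `emails.txt` built on the spot with made-up addresses. `sort` groups identical lines, `uniq` removes the repeats, and `wc -l` counts what is left; `uniq -c` is a variant that prefixes each line with its number of occurrences.

```bash
scratch=$(mktemp -d)
printf '%s\n' "bob@lab.org" "alice@lab.org" "bob@lab.org" \
              "carol@lab.org" "alice@lab.org" > "$scratch/emails.txt"

# Number of distinct addresses; $(( ... )) strips the padding some wc versions add
unique_count=$(( $(sort "$scratch/emails.txt" | uniq | wc -l) ))

# Same pipeline, but showing how many times each address occurs
sort "$scratch/emails.txt" | uniq -c

echo "unique addresses: $unique_count"   # 3

rm -rf "$scratch"
```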
## Building Execution Loops
A computer's value lies in its ability to automatically perform repetitive calculation tasks. Users often find themselves needing to launch the same Unix command for calculations on multiple datasets. If each dataset is saved in a different file with coherent naming (e.g., `gis_vercors.dat`, `gis_belledonne.dat`, `gis_chartreuse.dat`), it's possible to leverage loop structures offered by Unix shells.
### Shell Variables
Working automatically and repetitively requires using variables to store useful, changing information at each iteration. For example, if your Unix command must read data from different files for each execution, you cannot write the file name in your command since it won't always be the same.
You already know environment variables, set up by the `export` command, used to store system configuration information. There are also ordinary shell variables, which let you store any information you need during your Unix session. They're set up with a simple assignment (no spaces around `=`):
```bash
$ myvar="hello everyone"
$ echo myvar
myvar
$ echo $myvar
hello everyone
```
To retrieve the value contained in a variable, precede its name with the `$` character.
### The `for` Loop
To solve our problem of repeating the same Unix command multiple times while working on different data files, we'll create a variable that takes each element of a list as its value in turn. In our case, this list will be a list of file names constructed using the wildcard characters described above.
```bash
$ echo /[mnop]*
/mach.sym /mach_kernel /mach_kernel.ctfsys /net /opt /private
$ for f in /[mnop]*; do
> echo "Working with file $f"
> done
Working with file /mach.sym
Working with file /mach_kernel
Working with file /mach_kernel.ctfsys
Working with file /net
Working with file /opt
Working with file /private
```
```{mermaid}
flowchart TD
Start([Start]) --> Init["Initialize loop variable<br/>with first item"]
Init --> Check{More items<br/>in list?}
Check -->|Yes| Execute["Execute commands<br/>in loop body"]
Execute --> Next["Move to next item"]
Next --> Check
Check -->|No| End([End])
style Execute fill:#e1ffe1
```
The syntax is:
```bash
for variable in list; do
commands using $variable
done
```
All Unix commands inserted between `do` and `done` are executed once for each value taken by the variable.
Practical examples:
```bash
# Process multiple data files
$ for file in data*.txt; do
> echo "Processing $file"
> ./analyze.sh "$file" > "results_$file"
> done
# Rename multiple files
$ for file in *.jpeg; do
> mv "$file" "${file%.jpeg}.jpg"
> done
# Create numbered directories
$ for i in {1..10}; do
> mkdir experiment_$i
> done
```
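A self-contained sketch of the same idea, reusing the `gis_*.dat` naming scheme mentioned at the start of this section. The file contents are made up, and `wc -l` stands in for a real analysis program. The `${f%.dat}` expansion removes the `.dat` suffix so that each output file gets a name derived from its input.

```bash
scratch=$(mktemp -d)
start_dir=$PWD
cd "$scratch" || exit 1

# Three small datasets with coherent names (made-up content)
printf '1\n2\n3\n' > gis_vercors.dat
printf '4\n5\n'    > gis_belledonne.dat
printf '6\n'       > gis_chartreuse.dat

for f in gis_*.dat; do
    # ${f%.dat} strips the .dat suffix to build the output name
    wc -l < "$f" > "${f%.dat}.count"
done

produced=$(echo *.count)
echo "$produced"

cd "$start_dir" && rm -rf "$scratch"
```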
### Conditional Execution
Bash also provides conditional structures:
```bash
# if-then-else
$ if [ -f "data.txt" ]; then
> echo "File exists"
> else
> echo "File not found"
> fi
# Test file properties
$ for file in *.txt; do
> if [ -s "$file" ]; then
> echo "$file is not empty"
> fi
> done
```
Common test operators:
| Test | Meaning |
|------|---------|
| `-f file` | File exists and is regular file |
| `-d dir` | Directory exists |
| `-s file` | File exists and is not empty |
| `-r file` | File is readable |
| `-w file` | File is writable |
| `-x file` | File is executable |
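Several of these tests can be combined to classify a path. The sketch below defines a helper function, `describe` (a hypothetical name introduced for this example), and applies it to a directory, a non-empty file, an empty file, and a missing path, all inside a scratch directory.

```bash
scratch=$(mktemp -d)
mkdir "$scratch/results"
: > "$scratch/empty.txt"            # creates an empty file
echo "ACGT" > "$scratch/data.txt"

describe() {
    # Print a short description of the path given as first argument
    if [ -d "$1" ]; then
        echo "directory"
    elif [ -f "$1" ] && [ -s "$1" ]; then
        echo "non-empty file"
    elif [ -f "$1" ]; then
        echo "empty file"
    else
        echo "missing"
    fi
}

kind_dir=$(describe "$scratch/results")
kind_data=$(describe "$scratch/data.txt")
kind_empty=$(describe "$scratch/empty.txt")
kind_none=$(describe "$scratch/nothing_here")

echo "$kind_dir / $kind_data / $kind_empty / $kind_none"

rm -rf "$scratch"
```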
# Essential Unix Commands (Alphabetical)
The commands presented here are a subset of all commands available by default on a Unix system. They're presented with a subset of their options. For a complete description of their functionality, refer to online help accessible via the `man` command.
## `awk` - Pattern Scanning and Processing
Named after its authors (Aho, Weinberger, Kernighan), `awk` is a complete programming language. A full description is beyond this course's scope; the reference is "The AWK Programming Language", written by the same authors.
**Synopsis:**
```bash
awk [-F separator] 'program' [data_file]
```
**Main options:**
- `-F` - Specify column separator
**Examples:**
```bash
# Print second column
$ awk '{print $2}' file.txt
# Sum numbers in first column
$ awk '{sum += $1} END {print sum}' numbers.txt
# Process CSV file
$ awk -F',' '{print $1, $3}' data.csv
```
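A runnable sketch combining a separator, a condition, and an `END` block. The file `genes.csv` and its content are made up for the example: the program skips the header line (`NR > 1`), keeps the rows whose second column is `chr1`, and adds up their third column.

```bash
scratch=$(mktemp -d)
cat > "$scratch/genes.csv" <<'EOF'
gene,chromosome,length
geneA,chr1,1200
geneB,chr2,800
geneC,chr1,1500
EOF

# Skip the header, keep chr1 rows, sum the third column
chr1_total=$(awk -F',' 'NR > 1 && $2 == "chr1" {sum += $3} END {print sum}' \
    "$scratch/genes.csv")
echo "total length on chr1: $chr1_total"   # 2700

rm -rf "$scratch"
```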
## `bash` - Bourne-Again Shell
Launches a bash Unix shell. To exit this new shell, press `Ctrl-D` at a prompt.
**Synopsis:**
```bash
bash
```
**Example:**
```bash
$ bash
bash-5.1$ export test_var="hello"
bash-5.1$ exit
$
```
## `bg` - Send Process to Background
Resumes, in the background, execution of a process that was suspended with `Ctrl-Z`.
**Synopsis:**
```bash
bg [%job]
```
**Arguments:**
- `%job` - Job number (preceded by %). Get list with `jobs` command.
**Example:**
```bash
$ sleep 30
^Z
[1]+ Stopped sleep 30
$ jobs
[1]+ Stopped sleep 30
$ bg %1
[1]+ sleep 30 &
$ jobs
[1]+ Running sleep 30 &
```
## `cat` - Concatenate Files
Reads content from one or more data streams and copies it identically to standard output.
**Synopsis:**
```bash
cat [file ...]
```
**Arguments:**
- `file` - One or more file names. If none provided, reads from stdin.
**Examples:**
```bash
# Display file content
$ cat file.txt
# Concatenate multiple files
$ cat file1.txt file2.txt > combined.txt
# Number lines
$ cat -n file.txt
```
## `cd` - Change Directory
Changes the current working directory.
**Synopsis:**
```bash
cd [directory]
```
**Arguments:**
- `directory` - New working directory name. Without argument, returns to home.
**Examples:**
```bash
$ pwd
/home/alice
$ cd /usr/local
$ pwd
/usr/local
$ cd ../../home/alice
$ pwd
/home/alice
$ cd
$ pwd
/home/alice
```
## `chmod` - Change File Mode
Changes file access permissions.
**Synopsis:**
```bash
chmod [-R] mode file
```
**Main options:**
- `-R` - Recursive operation on directory contents
**Arguments:**
- `mode` - Permission change description (e.g., `u+x`, `go-w`, `755`)
- `file` - File(s) whose mode should be changed
**Examples:**
```bash
# Add execute permission for user
$ chmod u+x script.sh
# Remove write permission for group and others
$ chmod go-w data.txt
# Set specific permissions with octal
$ chmod 755 program
# Recursive permission change (X grants execute only to directories and already-executable files)
$ chmod -R u=rwX,go=rX documents/
```
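The octal form can be decoded digit by digit: each digit is the sum of 4 (read), 2 (write), and 1 (execute), for the owner, the group, and others, in that order. Below is a minimal sketch of that arithmetic, using two helper functions (`digit_to_rwx` and `mode_to_string`, hypothetical names introduced here for illustration; they are not part of `chmod`).

```bash
digit_to_rwx() {
    # Translate one octal digit (0-7) into its rwx triplet
    local d=$1 out=""
    if (( d & 4 )); then out+="r"; else out+="-"; fi
    if (( d & 2 )); then out+="w"; else out+="-"; fi
    if (( d & 1 )); then out+="x"; else out+="-"; fi
    printf '%s' "$out"
}

mode_to_string() {
    # Translate a three-digit octal mode such as 755 into rwxr-xr-x
    local mode=$1 result="" i
    for (( i = 0; i < ${#mode}; i++ )); do
        result+=$(digit_to_rwx "${mode:i:1}")
    done
    printf '%s\n' "$result"
}

mode_to_string 755   # rwxr-xr-x
mode_to_string 644   # rw-r--r--
```

For example, 7 = 4 + 2 + 1 gives `rwx`, while 5 = 4 + 1 gives `r-x`.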
## `cp` - Copy Files
Copies a file or directory.
**Synopsis:**
```bash
cp [-R] source destination
```
**Main options:**
- `-R` - Recursive copy for directories
**Arguments:**
- `source` - File(s) to be copied
- `destination` - Copy destination name or directory
**Examples:**
```bash
# Copy file
$ cp source.txt backup.txt
# Copy to directory
$ cp file.txt documents/
# Copy directory recursively
$ cp -R project/ project_backup/
# Copy multiple files to directory
$