newgrp: Changing the Current User’s Group

Overview:
newgrp changes the current group ID of the logged-in user for the duration of a new shell session. Switching groups affects group-based file permission checks and the group ownership of newly created files. The command operates only within the current login session; it cannot change the group of other users.

Syntax:

newgrp [group_name]

Details:
newgrp works similarly to the login command: it effectively logs the user in again under the same account, but with a different current group. Its primary effect is to start a new shell whose effective group ID is the specified group, which influences operations such as file-access permission checks and the group assigned to newly created files.
If no group is specified, newgrp switches back to the user’s default (login) group, as recorded in /etc/passwd.
To switch to a group with newgrp, the user must be a member of that group; otherwise access is denied. Because newgrp starts a new shell, running exit closes that shell and returns the user to their original group.

Parameters:

  • group_name: The name of the group to switch to.

Example:
To add a user to the docker group:

$ sudo usermod -aG docker username

Replace username with the actual username. To add the current user to the docker group, run:

$ sudo usermod -aG docker $USER

Group membership is evaluated at login, so after adding the user to the docker group, the change takes effect only after logging out and back in. Alternatively, start a new shell with the updated group active immediately:

$ newgrp docker
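
To confirm which group is active, compare the output of id before and after switching. A sketch (the docker group is assumed to exist and to include the user; the newgrp lines are commented out because newgrp starts a new interactive shell):

```shell
id -gn              # current (effective) group, e.g. "username"
groups              # all groups the user belongs to

# newgrp docker     # starts a new shell with docker as the current group
# id -gn            # inside that shell, prints "docker"
# exit              # returns to the original shell and group
```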

jq: Command-line Tool for Handling JSON Data

Overview
jq is a command-line utility for processing JSON data, similar in functionality to sed and awk but specifically designed for JSON, making it extremely powerful for parsing, filtering, and formatting JSON data.

Usage

jq [options] <jq filter> [file...]
jq [options] --args <jq filter> [strings...]
jq [options] --jsonargs <jq filter> [JSON_TEXTS...]

Description
jq processes JSON input by applying specified filters to JSON text, outputting the result in JSON format to standard output. The simplest filter is . (dot), which outputs the JSON input as-is (with formatting adjustments and IEEE 754 numeric representation). For more complex filters, refer to the jq(1) manual or jq’s official documentation.

Options

  • -c: Produces compact output, removing whitespace and newlines.
  • -r: Outputs raw strings without quotes, useful for string values.
  • -R: Reads raw strings instead of JSON text.
  • -s: Slurps all input JSON values into a single array.
  • -n: Uses null as the input, often used to create new JSON objects without reading input.
  • -e: Sets the exit status based on the output (non-zero if the last output value was null or false).
  • -S: Sorts keys in output JSON objects.
  • -C: Forces colored JSON output (color is the default when writing to a terminal).
  • -M: Monochrome mode, disables JSON coloring.
  • --tab: Indents using tabs.
  • --arg a v: Sets the jq variable $a to the string value <v>.
  • --argjson a v: Sets the jq variable $a to the JSON value <v>.
  • --slurpfile a f: Sets $a to an array of the JSON values in file <f>.
  • --rawfile a f: Sets $a to the raw content of file <f>.
  • --args: Treats remaining arguments as string parameters, not files.
  • --jsonargs: Treats remaining arguments as JSON parameters.
  • --: Stops processing options.

Arguments passed this way can also be accessed through $ARGS.named (an object of the named parameters) and $ARGS.positional (an array of the positional parameters).
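
As a sketch of both mechanisms (assuming jq is installed), the following passes one named and two positional parameters:

```shell
# --arg creates $user; --args turns the trailing words into $ARGS.positional
jq -nc --arg user alice --args '{user: $user, extras: $ARGS.positional}' one two
# → {"user":"alice","extras":["one","two"]}
```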

Parameters

  • file...: One or more files.
  • strings...: One or more strings.
  • JSON_TEXTS...: One or more JSON strings.

Filters

  • .: Refers to the current JSON object, used to access fields.
  • []: Accesses elements in an array.
  • |: Passes output from one expression as input to the next.
  • {}: Creates a new JSON object.
  • [] | .field: Traverses an array and extracts specific fields from each object.

Operators and Built-in Functions
Includes common operators (+, -, *, /, %) and functions like length, has, map, select, unique, min, and max. String functions include split, join, startswith, and endswith. Regex and mathematical functions are also available, such as test, match, sqrt, and range.
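
For instance, map, select, and length can be combined to filter and then count (illustrative data, assuming jq is installed):

```shell
# keep objects whose age is at least 18, then count them
echo '[{"name":"Alice","age":30},{"name":"Bob","age":17}]' |
  jq 'map(select(.age >= 18)) | length'
# → 1
```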

Examples

1 Basic Examples

Compact JSON output by removing whitespace and newlines:

echo '{"name": "Alice", "age": 30}' | jq -c .

Output:
{"name":"Alice","age":30}

Raw output without quotes, typically for string values:

echo '{"name": "Alice", "age": 30}' | jq -r .name

Output:
Alice

Combine all input JSON objects into an array:

$ echo '{"name": "Alice", "age": 30}{"name": "Bob", "age": 25}{"name": "Charlie", "age": 35}' | jq -s .

Output:

[
  { "name": "Alice", "age": 30 },
  { "name": "Bob", "age": 25 },
  { "name": "Charlie", "age": 35 }
]

Passing a shell variable to jq as a jq variable:

name="Alice"
echo '{}' | jq --arg name "$name" '{name: $name}'

Output:
{ "name": "Alice" }

2 Filter Examples

Access an element in a JSON array:

echo '[{"name": "Alice"}, {"name": "Bob"}, {"name": "Charlie"}]' | jq '.[1]'

Output:

{ "name": "Bob" }

Create a new JSON object with defined key-value pairs:

jq -n '{name: "Alice", age: 30}'

Output:

{ "name": "Alice", "age": 30 }

Traverse an array and extract a specific field from each object:

echo '[{"name": "Alice"}, {"name": "Bob"}, {"name": "Charlie"}]' | jq '.[].name'

Output:

"Alice"
"Bob"
"Charlie"

3 Operators, Control Statements, and Built-in Functions

Use conditional expressions to modify output:

echo '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}, {"name": "Charlie", "age": 35}]' | jq 'map(if .name=="Alice" then "yes" else "no" end)'

Output:

[ "yes", "no", "no" ]

4 jq Scripts

Save complex jq commands in a .jq file and execute them:

Create a file script.jq with:

map(select(.age > 28) | {name: .name, status: (if .age < 35 then "young" else "mature" end)})

Run the script:

echo '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}, {"name": "Charlie", "age": 35}]' | jq -f script.jq

Output:

[
  { "name": "Alice", "status": "young" },
  { "name": "Charlie", "status": "mature" }
]

Alternatively, save JSON data to a file and filter it:

$ echo '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}, {"name": "Charlie", "age": 35}]'>input.json
$ jq -f script.jq input.json

Output:

[
  { "name": "Alice", "status": "young" },
  { "name": "Charlie", "status": "mature" }
]

echo: Printing Strings or Variables in Terminal

Overview
The echo command is used to display strings or variables in the terminal.

Usage

echo [options] [args]

Description
echo is ideal for outputting text, displaying variables, and creating formatted outputs.

Options

  • -e: Enables escape characters such as newline \n and tab \t. Common escape sequences include:
    • \n: Newline
    • \t: Horizontal tab
    • \\: Backslash
    • \": Double quote
    • \a: Alert (beep)
    • \b: Backspace
    • \v: Vertical tab
  • -n: Prevents auto newline at the end of the output. By default, echo will automatically add a newline, which -n can disable.

Arguments

  • args: One or more strings or variables to be printed.

Examples

1 Simple Text Output

$ echo "Hello, World!"

Output:
Hello, World!

2 Display Variable Values
You can print variable values by prefixing the variable name with $:

$ name="Alice"
$ echo "Hello, $name"

Output:
Hello, Alice

To display environment variables:

$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

$ echo $HOME
/home/user

$ echo $USER
user

3 Using Command Substitution
echo can use command substitution with $(command) to insert the output of a command:

$ echo "Today is $(date)"

Output:
Today is Sun Oct 25 10:20:22 AM

4 Multi-line Output
Use the -e option to enable escape characters like \n for multi-line output:

$ echo -e "Line 1\nLine 2\nLine 3"

Output:

Line 1
Line 2
Line 3

5 Suppress Auto Newline
To suppress the newline, use -n:

$ echo -n "Hello, World!"

Output:
Hello, World! (without newline)

6 Using Escape Characters
Enable escape characters with -e for features like tabs (\t) and newlines (\n):

$ echo -e "Column1\tColumn2\nData1\tData2"

Output:

Column1    Column2
Data1      Data2

7 Redirect Output to File
Redirect echo output to a file. Use > to overwrite or >> to append:

$ echo "Hello, World!" > output.txt    # Overwrite
$ echo "Another Line" >> output.txt     # Append

8 Output Strings with Special Characters
For special characters (such as $, ", and \), use single quotes or escape them with \:

$ echo "This is a dollar sign: \$ and a quote: \""

Output:
This is a dollar sign: $ and a quote: "

Monitoring CIFS File System I/O Performance with cifsiostat

Description: cifsiostat is a tool used to monitor the I/O performance of CIFS file systems.

Syntax:

cifsiostat [ options ] [ <interval> [ <count> ] ] [ mount point ]

Overview: cifsiostat is a tool for monitoring I/O performance of CIFS (Common Internet File System) file systems, similar to iostat. It is part of the sysstat package and is specifically designed to display I/O statistics for CIFS client mount points. CIFS is a network file-sharing protocol based on SMB (Server Message Block), commonly used in Windows environments but also supported by Linux and other operating systems.

Options:

  • -k: Show I/O activity statistics in kilobytes per second instead of blocks per second.
  • -m: Show I/O activity statistics in megabytes per second.

Parameters:

  • interval: The time interval (in seconds) between each output.
  • count: The total number of outputs.
  • mount point: Show I/O activity statistics for a specific mount point.

Examples:

1 Display I/O activity statistics for all CIFS mount points:

    cifsiostat

2 Output statistics every 5 seconds for a total of 10 times:

    cifsiostat 5 10

3 Show statistics for a specific mount point:

    cifsiostat /mnt/cifs

4 View I/O statistics for two mount points:

    cifsiostat /mnt/shared /mnt/backup

The output of cifsiostat is similar to iostat, including the number of read and write requests per second, the amount of data read and written, and the average size of each operation. An example output is shown below:

Example Output:

    Filesystem: /mnt/cifs
    rMB/s    wMB/s    rIO/s    wIO/s   rSizeKB   wSizeKB
    0.000    0.012    1.00     10.00   0.00      4.00

Explanation of Fields:

  • rMB/s: MB read per second.
  • wMB/s: MB written per second.
  • rIO/s: Number of read I/O requests per second.
  • wIO/s: Number of write I/O requests per second.
  • rSizeKB: Average size (in KB) of each read operation.
  • wSizeKB: Average size (in KB) of each write operation.

NFS I/O Monitoring with nfsiostat: A Quick Guide

The nfsiostat command displays I/O statistics for each NFS (Network File System) mount point on the client. Similar to iostat, it helps monitor NFS performance by providing real-time data on the I/O activity of mounted NFS points.

Usage

nfsiostat [ interval [ count ] ] [ options ] [ <mount point> ]

Installation

On Ubuntu, nfsiostat is provided by the nfs-common package:

sudo apt install nfs-common

Key Options

  • -a, --attr: Displays statistics related to the attribute cache.
  • -d, --dir: Displays statistics related to directory operations.
  • -p, --page: Displays statistics related to the page cache.
  • -s, --sort: Sorts NFS mount points by operations per second (ops/second), useful for identifying the most active I/O points.
  • -l LIST, --list=LIST: Only displays statistics for the first LIST mount points, allowing a focused view of key NFS points.

Parameters

  • interval: Sets the time interval (in seconds) between reports. The command keeps running until manually stopped or until the specified count of reports is reached.
  • count: Specifies the total number of reports to generate before terminating.
  • mount point: The NFS mount point(s) to monitor. You can specify one or more mount points to view statistics for specific NFS mounts instead of all mounted NFS points (the default).

Examples

Report I/O statistics every 5 seconds, for a total of 10 times:

nfsiostat 5 10

Identify which mount points have the highest I/O activity:

nfsiostat -s

Print statistics for the first 2 mount points:

nfsiostat -l 2

View I/O statistics for the /mnt/nfs mount point:

nfsiostat /mnt/nfs

This tool provides a valuable way to monitor the performance of NFS mounts in real time, helping system administrators identify potential bottlenecks and optimize their NFS configurations.

Monitoring System Resources with pidstat: CPU, Memory, Threads, and Device I/O Usage

pidstat is a versatile command-line tool for monitoring system resource usage, including CPU, memory, threads, and device I/O, across all or selected processes.

Overview:

pidstat is used to monitor the usage of CPU, memory, threads, device I/O, and other system resources for all or specific processes.

Syntax:

pidstat [options] [<interval> [<count>]]

Description:

Run without an interval, pidstat reports statistics accumulated since system startup. When an interval is given, each subsequent report covers the time since the previous one, and users can specify how many reports to produce. pidstat is part of the sysstat package, a performance monitoring toolkit, and becomes available after installing sysstat.

Options:

  • -u – Display CPU usage of each process.
  • -r – Display memory usage of each process.
  • -d – Display I/O usage of each process.
  • -p <pid> – Specify a process ID to monitor.
  • -w – Display context-switch statistics for each process.
  • -t – Show additional thread statistics.
  • -V – Display the version of the tool.
  • -h – Display all statistics horizontally on a single line, a format easier for other programs to parse.
  • -I – On SMP systems, divide CPU usage by the total number of processors.
  • -l – Show the command name and all its arguments.
  • -T {TASK | CHILD | ALL} – Scope of reported statistics:
    • TASK: Report stats for the specified task (process).
    • CHILD: Report stats only for child processes, useful for performance monitoring of a process’s descendants.
    • ALL: Comprehensive stats for both the task and all its child processes.
  • -C <command> – Monitor processes whose command name matches the given string.

Parameters:

  • interval: Time between reports (in seconds).
  • count: Number of reports; the default is to run continuously.

Example Usage:

Basic Usage:

$ pidstat

Example output:

Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-19   _x86_64_   (2 CPU)

Time       UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
16:03:44   0         1      0.00    0.00    0.00    0.00    0.00     0  systemd
16:03:44   0         2      0.00    0.00    0.00    0.00    0.00     0  kthreadd
16:03:44   0        16      0.00    0.00    0.00    0.00    0.00     0  ksoftirqd/0

Display CPU Usage for All Processes:

$ pidstat -u -p ALL

Example output:

Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-19   _x86_64_   (2 CPU)

Time       UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
16:17:49   0         1      0.00    0.00    0.00    0.00    0.00     0  systemd
16:17:49   0         2      0.00    0.00    0.00    0.00    0.00     0  kthreadd

Display CPU Usage for a Specific Process:

$ pidstat -u -p 1

Example output:

Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-19   _x86_64_   (2 CPU)

Time       UID       PID    %usr %system  %guest   %wait    %CPU   CPU  Command
16:18:07   0         1      0.00    0.00    0.00    0.00    0.00     1  systemd

Display I/O Usage:

$ pidstat -d

Example output:

Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-19   _x86_64_   (2 CPU)

Time       UID       PID   kB_rd/s   kB_wr/s kB_ccwr/s iodelay  Command
16:26:23   1000      1606    0.23      0.02      0.02       0  systemd

Display Context Switches:

$ pidstat -w

Example output:

Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-19   _x86_64_   (2 CPU)

Time       UID       PID   cswch/s nvcswch/s  Command
16:29:11   0         1     0.19      0.04  systemd

Display Thread Statistics:

$ pidstat -t

Example output:

Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-19   _x86_64_   (2 CPU)

Time       UID      TGID       TID    %usr %system  %guest   %wait    %CPU   CPU  Command
16:31:15   0         1         -    0.00    0.00    0.00    0.00    0.00     0  systemd

Field Explanations:

  • PID: Process ID
  • %usr: CPU percentage used in user space
  • %system: CPU percentage used in kernel space
  • %guest: CPU percentage spent running a virtual CPU (guest)
  • %CPU: Total CPU percentage used by the process
  • CPU: CPU core number
  • Command: Command associated with the process

pidstat provides an in-depth look at resource usage, helping users understand how individual processes consume system resources, making it invaluable for performance monitoring and troubleshooting.

iostat: A Tool for Monitoring System I/O Statistics

Functionality:
iostat is a utility used to gather and report system I/O statistics, commonly employed for analyzing disk performance.

Syntax:
iostat [options] [<interval> [<count>]]

Overview:
In addition to I/O statistics, iostat can also display CPU usage information.

Common Options:

  • -c: Display CPU usage only.
  • -d: Show device utilization statistics only.
  • -k: Display statistics in kilobytes per second instead of blocks per second.
  • -m: Display statistics in megabytes per second.
  • -p: Display statistics for block devices and all utilized partitions.
  • -t: Display the time each report is generated.
  • -V: Display version information and exit.
  • -x: Display extended I/O statistics.

Examples:

To display the usage of all devices at the current time:

$ iostat -x
Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)  2024-10-18  _x86_64_  (2 CPU)

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.10   0.02    0.27     0.01    0.00   99.60

Device      r/s  rkB/s  rrqm/s  %rrqm  r_await  rareq-sz  w/s  wkB/s  wrqm/s  %wrqm  w_await  wareq-sz  %util
loop0      0.00   0.00    0.00   0.00    0.00     1.21   0.00    0.00   0.00   0.00    0.00     0.00    0.00
...
sda        0.30  10.96    0.10  24.42    0.35    36.60   0.51   23.23   0.64  55.59    0.41    45.13    0.02

Field Descriptions:

  • Device: Device name.
  • r/s: Number of read requests per second.
  • w/s: Number of write requests per second.
  • rkB/s: Kilobytes read per second.
  • wkB/s: Kilobytes written per second.
  • %util: Percentage of time the device was busy with I/O requests.

To display statistics for the device sda:

$ iostat -x /dev/sda
Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)  2024-10-18  _x86_64_  (2 CPU)

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.10   0.02    0.27     0.01    0.00   99.60

Device      r/s  rkB/s  rrqm/s  %rrqm  r_await  rareq-sz  w/s  wkB/s  wrqm/s  %wrqm  w_await  wareq-sz  %util
sda        0.30  10.95    0.10  24.42    0.35    36.60   0.51   23.21   0.64  55.57    0.41    45.09    0.02

To display only device usage without CPU statistics:

$ iostat -xd /dev/sda

For overall system I/O status:

$ iostat
Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)  2024-10-18  _x86_64_  (2 CPU)

avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           0.10   0.02    0.27     0.01    0.00   99.60

Device     tps  kB_read/s  kB_wrtn/s  kB_dscd/s  kB_read  kB_wrtn  kB_dscd
loop0      0.00      0.00      0.00      0.00        17        0        0
...
sda        0.81     10.95     23.21      0.00   4791860  10157237       0

To display CPU I/O statistics only:

$ iostat -c

To display statistics in megabytes per second:

$ iostat -m

mpstat: Displaying CPU Statistics for Each Available CPU

Functionality:
The mpstat command is used to display statistics for each available CPU.

Syntax:
mpstat [ options ]

Overview:
mpstat (Multi-Processor Statistics) is a tool for displaying statistics on the performance of individual CPUs, functioning as a real-time monitoring utility. While similar to vmstat, mpstat focuses solely on CPU performance statistics. To install mpstat on Ubuntu or CentOS, use the following commands:

For Ubuntu:

sudo apt install sysstat

For CentOS:

sudo yum install sysstat

Options:

  • -P: Specify a CPU number, or use ALL to display statistics for all CPUs.

Examples:

Running mpstat without any options displays overall performance statistics for all CPUs:

$ mpstat
Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-15   _x86_64_   (2 CPU)

13:48:03     CPU   %usr   %nice   %sys   %iowait   %irq   %soft   %steal   %guest   %gnice   %idle
13:48:03     all   0.11   0.02    0.10     0.01    0.00    0.17     0.00     0.00     0.00   99.60

Using the -P ALL option provides both overall CPU performance statistics as well as detailed statistics for each individual CPU:

$ mpstat -P ALL
Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-15   _x86_64_   (2 CPU)

13:59:40     CPU   %usr   %nice   %sys   %iowait   %irq   %soft   %steal   %guest   %gnice   %idle
13:59:40     all   0.11   0.02    0.10     0.01    0.00    0.17     0.00     0.00     0.00   99.60
13:59:40       0   0.11   0.02    0.10     0.01    0.00    0.31     0.00     0.00     0.00   99.45
13:59:40       1   0.10   0.02    0.10     0.01    0.00    0.03     0.00     0.00     0.00   99.75

You can also specify a particular CPU by using the -P n option, where n represents the CPU number starting from 0:

$ mpstat -P 0
Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-15   _x86_64_   (2 CPU)

14:18:14     CPU   %usr   %nice   %sys   %iowait   %irq   %soft   %steal   %guest   %gnice   %idle
14:18:14       0   0.11   0.02    0.10     0.01    0.00    0.31     0.00     0.00     0.00   99.45

$ mpstat -P 1
Linux 6.8.0-45-generic (Ubuntu22-VirtualBox)   2024-10-15   _x86_64_   (2 CPU)

14:18:17     CPU   %usr   %nice   %sys   %iowait   %irq   %soft   %steal   %guest   %gnice   %idle
14:18:17       1   0.10   0.02    0.10     0.01    0.00    0.03     0.00     0.00     0.00   99.75

Field Descriptions:

  • %usr: Percentage of CPU time spent in user mode (excluding time for processes with an altered nice priority). Calculation: (usr/total)*100
  • %nice: Percentage of CPU time spent on user-level processes with a positive nice value. Calculation: (nice/total)*100
  • %sys: Percentage of CPU time spent in kernel mode. Calculation: (system/total)*100
  • %iowait: Percentage of time spent waiting for I/O operations. Calculation: (iowait/total)*100
  • %irq: Percentage of CPU time spent handling hardware interrupts. Calculation: (irq/total)*100
  • %soft: Percentage of CPU time spent handling software interrupts. Calculation: (softirq/total)*100
  • %steal: Percentage of time the virtual CPU waited involuntarily while the hypervisor serviced another virtual processor. Calculation: (steal/total)*100
  • %guest: Percentage of CPU time spent running a virtual processor. Calculation: (guest/total)*100
  • %gnice: Percentage of CPU time spent running a niced guest. Calculation: (gnice/total)*100
  • %idle: Percentage of CPU time spent idle, excluding time waiting for I/O operations. Calculation: (idle/total)*100

AWK: A Powerful Text Processing Tool and Programming Language

Overview:
AWK is a robust text processing tool and programming language, primarily used for formatting, analyzing, and processing text on Unix and Linux systems. It excels at handling structured text like tables, CSV files, and logs.

Syntax:

awk -f scriptfile -v var=value filename
awk 'BEGIN{ print "start" } pattern{ commands } END{ print "end" }' filename

Explanation:
AWK reads files or input streams (including stdin) line by line, processing text data based on user-specified patterns and actions. It is particularly useful for structured text. While AWK can be used directly from the command line, it is more often employed through scripts. As a programming language, AWK shares many features with C, such as arrays and functions.

Options:

  • -F: Specifies the field separator (can be a string or regular expression).
  • -f scriptfile: Reads AWK commands from the file scriptfile.
  • -v var=value: Assigns a value to a variable, passing external variables to AWK.

AWK Script Structure:

  • pattern: Matches specific lines.
  • { commands }: Executes actions on matching lines.
  • filename: The file to be processed by AWK.

An AWK script typically consists of three optional parts: a BEGIN block, pattern matching, and an END block. The workflow proceeds as follows:

1. Execute the BEGIN statement.
2. Process each line from the file or standard input, executing the pattern matching.
3. Execute the END statement.

AWK Built-in Variables:

  • $0: The current record (line).
  • $n: The nth field (column) of the current record; $1 is the first column.
  • FS: The input field separator (default is a space or tab); can be customized with the -F option.
  • OFS: Output field separator (used for formatted output).
  • RS: Record separator (default is newline).
  • ORS: Output record separator (default is newline).
  • NR: Current line number (starting from 1).
  • NF: Number of fields (columns) in the current line.
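
A quick illustration of NR and NF on two lines of inline input:

```shell
# NR is the current line number, NF the number of fields on that line
printf 'a b c\nd e\n' | awk '{ print "line", NR, "has", NF, "fields" }'
```

Output:

line 1 has 3 fields
line 2 has 2 fields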

AWK Operators:

  • Arithmetic Operators:
    +, -, *, /, %, ^
    Increment and decrement operators (++, --) can be used as prefixes or suffixes.
  • Assignment Operators:
    =, +=, -=, *=, /=, %=, ^=
  • Regular Expression Operators:
    ~: Matches a regular expression.
    !~: Does not match a regular expression.
  • Logical Operators:
    ||: Logical OR
    &&: Logical AND
  • Relational Operators:
    <, <=, >, >=, !=, ==
  • Other Operators:
    $: Refers to a field by its number.
    Space: Concatenates strings.
    ?:: Ternary operator.
    in: Checks if a key exists in an array.
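
The regular-expression and ternary operators can be combined in a single rule, for example:

```shell
# ~ tests $1 against a regex; ?: chooses a label based on $2
printf 'alice 30\nbob 17\n' |
  awk '$1 ~ /^[ab]/ { print $1, ($2 >= 18 ? "adult" : "minor") }'
```

Output:

alice adult
bob minor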

AWK Regular Expression Syntax:

  • ^: Matches the start of a line.
  • $: Matches the end of a line.
  • .: Matches any single character.
  • *: Matches zero or more occurrences of the preceding character.
  • +: Matches one or more occurrences of the preceding character.
  • ?: Matches zero or one occurrence of the preceding character.
  • []: Matches any character in the specified range.
  • [^]: Matches any character not in the specified range.
  • () and |: Subexpressions and alternation.
  • \: Escape character.
  • {m}: Matches exactly m occurrences of a character.
  • {m,}: Matches at least m occurrences.
  • {m,n}: Matches between m and n occurrences.

AWK Built-in Functions:

  • toupper(): Converts all lowercase letters to uppercase.
  • length(): Returns the length of a string.
  • substr(): Extracts a substring.
  • split(): Splits a string into an array.
  • gsub(): Performs a global substitution on a string.

Custom Functions in AWK: AWK scripts can include user-defined functions. For example:

function square(x) {
  return x * x;
}

The definition must appear in the same program that calls it:

awk 'function square(x) { return x * x } { print square($1) }' file.txt

AWK Control Flow Statements:

  • if-else: Conditional statements.
  • while and do-while: Loops.
  • for: Standard loops, including array traversal with for-in.
  • break and continue: Loop control.
  • exit: Terminates the script execution.
  • next: Skips the remaining commands for the current line.
  • return: Returns a value from a function.
  • ?:: Ternary operator for conditional expressions.

AWK Arrays: AWK supports associative arrays, meaning array indexes can be strings as well as numbers. Arrays in AWK don’t need to be declared or sized; they are created as soon as you assign a value to an index.
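
A common use of associative arrays is counting occurrences. sort is added only to make the output order deterministic, since for-in iterates in an unspecified order:

```shell
# tally the first field of every line, then print the totals
printf 'apple\nbanana\napple\n' |
  awk '{ count[$1]++ } END { for (w in count) print w, count[w] }' | sort
```

Output:

apple 2
banana 1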

Examples:

1. Basic Example:

$ echo "hello" | awk 'BEGIN{ print "start" } END{ print "end" }'
start
end

2. Using Built-in Variables: To print the first and third columns of a file:

awk '{ print $1, $3 }' test.txt

3. Using External Variables:

$ a=100
$ b=100
$ echo | awk '{ print v1 * v2 }' v1=$a v2=$b
10000

4. Using Regular Expressions: To print the second column of lines starting with "a":

awk '/^a/ { print $2 }' test.txt

5. Using Built-in Functions: Convert all lowercase letters to uppercase:

awk '{ print toupper($0) }' test.txt

6. Handling Different Delimiters: For CSV files with comma-separated values:

awk -F ',' '{ print $1, $2 }' test.csv

7. Writing and Running AWK Scripts: Save an AWK script to a file (e.g., script.awk):

BEGIN { FS=","; OFS=" - " }
{ print $1, $3 }

Run the script:

awk -f script.awk test.csv

Conclusion:

AWK is a versatile and powerful tool for text processing, offering rich features like pattern matching, regular expressions, and scripting capabilities. From simple one-liners to complex data analysis scripts, AWK excels at processing structured text efficiently and flexibly.

A Guide to the sed Stream Editor

Function Overview:
sed is a stream editor that reads text from files or input streams line by line, edits the text according to user-specified patterns or commands, and then outputs the result to the screen or a file. When used in conjunction with regular expressions, it is incredibly powerful.

Syntax:

sed [options] 'command' file(s)
sed [options] -f scriptfile file(s)

Explanation:
sed first stores each line of the text in a temporary buffer called the “pattern space.” It then processes the content of this buffer according to the given sed commands. Once the processing is complete, the result is output to the terminal, and sed moves on to the next line. The content of the file itself is not altered unless the -i option is used. sed is mainly used to edit one or more text files, simplify repeated text file operations, or create text transformation scripts. Its functionality is similar to awk, but sed is simpler and less capable of column-oriented operations, where awk is more powerful.

Options:

  • -e script: Add script to the commands to be executed (may be given more than once to apply several edits).
  • -f scriptfile: Read commands from scriptfile.
  • -n: Suppress automatic output (only print lines explicitly output with the p command).
  • -i: Edit the file in place instead of writing to standard output.
  • -E (or -r): Use extended regular expressions.
  • --help: Display help information.
  • --version: Display version information.

Parameters:

  • command: The command to be executed.
  • file(s): One or more text files to be processed.
  • scriptfile: A file containing a list of commands to execute.

Common Actions:

  • a: Append text after the current line.
  • i: Insert text before the current line.
  • c: Replace the selected lines with new text.
  • d: Delete the selected lines.
  • D: Delete the first line of the pattern space.
  • s: Replace specified characters.
  • h: Copy the pattern space to the hold space (an internal buffer).
  • H: Append the pattern space to the hold space.
  • g: Replace the pattern space with the contents of the hold space.
  • G: Append the contents of the hold space to the pattern space.
  • l: Print the pattern space with non-printable characters shown in a visible form.
  • n: Read the next input line into the pattern space and apply the next command to it instead of reapplying the first command.
  • N: Append the next input line to the pattern space, separated by a newline, changing the current line number.
  • p: Print the matching lines.
  • P: Print the first line of the pattern space.
  • q: Quit sed.
  • b label: Branch to the location marked by label in the script; if the label doesn’t exist, the branch goes to the end of the script.
  • r file: Append the contents of file after the current line.
  • t label: Branch to label if a substitution has been made since the last input line was read or the last t/T command; without a label, branch to the end of the script.
  • T label: Branch to label if no substitution has been made since the last input line was read or the last t/T command.
  • w file: Append the pattern space to the end of file.
  • W file: Append the first line of the pattern space to the end of file.
  • !: Execute the following commands on all lines not selected by the current pattern.
  • =: Print the current line number.
  • #: Extend comments to the next newline character.
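
The hold-space commands are easiest to see in the classic line-reversal one-liner: h saves each line, G appends the saved lines after the current one, and $p prints the accumulated result on the last line:

```shell
# reverse the order of input lines (like tac)
printf '1\n2\n3\n' | sed -n '1!G;h;$p'
```

Output:

3
2
1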

Replacement Commands:

  • g: Global replacement within a line (used with the s command).
  • p: Print the line.
  • w: Write the line to a file.
  • x: Exchange the pattern space with the hold space.
  • y: Translate one character to another (not used with regular expressions).
  • &: Reference to the matched string.

Basic Regular Expression (BRE) Syntax in sed:

  • ^: Match the beginning of a line.
  • $: Match the end of a line.
  • .: Match any single character except a newline.
  • *: Match zero or more of the preceding characters.
  • []: Match a single character from a specified range.
  • [^]: Match a single character not in the specified range.
  • \(..\): Capture a substring (group) for back-references such as \1.
  • &: Save the matched text for later use in replacements.
  • \<: Match the start of a word.
  • \>: Match the end of a word.
  • x\{m\}: Match exactly m occurrences of x.
  • x\{m,\}: Match at least m occurrences of x.
  • x\{m,n\}: Match between m and n occurrences of x.

Extended Regular Expression (ERE) Syntax in sed:

With the -E (or -r) option, sed uses extended regular expressions, in which +, ?, (), and {} work without backslashes:

  • +: Match one or more occurrences of the preceding character.
  • \b: Match a word boundary (a GNU extension, available in both BRE and ERE).

Practical Examples:

1 Print specific lines:
To print only the first and last lines:

sed -n '1p;$p' test.txt

2 Delete lines:
To delete the second line:

sed '2d' filename

3 Basic match and replace:
Replace spaces with hyphens:

echo "hello world" | sed 's/ /-/g'

4 Advanced match and replace:
Reverse words in a string:

echo "abc def ghi" | sed 's/\([a-zA-Z]*\) \([a-zA-Z]*\) \([a-zA-Z]*\)/\3 \2 \1/'

5 Multiple edits:
Replace “Hello” with “Hi” and “Goodbye” with “Farewell” in one command:

sed 's/Hello/Hi/; s/Goodbye/Farewell/' example.txt

6 Read a file:
Insert content from an external file after lines matching a pattern:

sed '/Line 2/r extra.txt' data.txt

7 Write to a file:
Save processed content into a new file:

sed 's/World/Everyone/' input.txt > output.txt
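
8 Edit a file in place:
The -i option modifies the file itself rather than printing to standard output; with GNU sed, a suffix such as -i.bak keeps a backup copy. demo.txt here is a scratch file created for the example:

```shell
printf 'Hello World\n' > demo.txt
sed -i.bak 's/World/Everyone/' demo.txt
cat demo.txt        # now contains: Hello Everyone
```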

In summary, sed is a versatile and efficient tool for editing text in a stream, offering powerful pattern matching and text transformation capabilities when combined with regular expressions. From basic line printing to advanced text manipulation, sed serves a wide range of text processing needs.