gawk Command in Linux



gawk is a Linux command for text processing and data manipulation. It is the GNU implementation of the awk programming language, which scans input for patterns and processes the matching text.

With the gawk command, you can search for patterns in text files and perform actions on the matching lines. It supports variables, string functions, arithmetic operations, and logical operators, which makes it handy for tasks such as generating reports, transforming data files, and formatting output.

This guide covers how to install gawk, its syntax and options, and basic usage examples in Linux.

How to Install gawk Command in Linux?

Some Linux distributions ship gawk by default, while others provide a different awk implementation. If gawk is missing from your system, you can install it from your distribution's default repository.

On Ubuntu, Debian, and other systems that use the APT package manager, run the following command to install gawk −

sudo apt install gawk

For systems using the YUM or DNF package managers, such as Red Hat, CentOS, and Fedora, you can install gawk with the following commands −

sudo yum install gawk

Or,

sudo dnf install gawk

For Arch Linux, use the Pacman package manager to install gawk −

sudo pacman -S gawk

For openSUSE systems, use the Zypper package manager to install gawk −

sudo zypper install gawk

Each of the commands above downloads and installs the gawk package along with its dependencies, so the utility is ready to use on your system.

Syntax of gawk Command

The basic syntax to use the gawk command on Linux is as follows −

gawk [options] 'program' file

Here,

  • gawk is the command itself.
  • [options] are optional flags that modify the behavior of gawk.
  • 'program' is the gawk program or script, usually enclosed in single quotes.
  • file is the input file to be processed.

gawk Command Options

gawk supports a large number of options, which are described in the table below −

Option                                  Description
-b, --characters-as-bytes               Treats all input data as single-byte characters.
-c, --traditional                       Runs in compatibility mode, disabling GNU extensions so gawk behaves like traditional awk.
-C, --copyright                         Prints the short GNU copyright notice on standard output and exits successfully.
-d[file], --dump-variables[=file]       Writes a sorted list of global variables, with their types and final values, to file (awkvars.out by default).
-D[file], --debug[=file]                Enables debugging mode, optionally reading debugger commands from file.
-e program-text, --source program-text  Uses program-text as awk source code; can be combined with -f.
-E file, --exec file                    Like -f, but it is the last option processed; intended for #! scripts so that no further options or source code can be passed on the command line.
-F fs, --field-separator fs             Sets the input field separator to fs.
-f file, --file file                    Reads the awk program from file instead of the command line.
-i file, --include file                 Loads an awk source library before running the main program.
-I, --trace                             Traces execution by printing internal byte code names as the program runs.
-l library, --load library              Loads a dynamic extension library.
-L[value], --lint[=value]               Warns about constructs that are dubious or not portable to other awk implementations.
-M, --bignum                            Enables arbitrary-precision arithmetic using the GNU MPFR and GMP libraries.
-n, --non-decimal-data                  Recognizes octal and hexadecimal values in input data.
-N, --use-lc-numeric                    Uses the locale's decimal point character when parsing input data.
-o[file], --pretty-print[=file]         Writes a pretty-printed version of the program to file (awkprof.out by default).
-O, --optimize                          Enables gawk's default optimizations (on by default).
-p[profile], --profile[=profile]        Enables profiling and writes the profile to the given file (awkprof.out by default).
-P, --posix                             Enables strict POSIX compatibility mode.
-r, --re-interval                       Enables interval expressions in regular expressions (already the default in current gawk).
-s, --no-optimize                       Disables gawk's default optimizations.
-S, --sandbox                           Runs gawk in sandbox mode, disabling system(), input and output redirections, and dynamic extensions.
-t, --lint-old                          Warns about constructs that are not portable to the original version of UNIX awk.
-V, --version                           Displays version information and exits.

Examples of gawk Command in Linux

The following are a few basic examples of the gawk command on Linux systems −

  • Print All Lines
  • Print a Specific Column
  • Print Lines Matching a Pattern
  • Add Line Numbers
  • Calculate the Sum of a Column

Print All Lines

The print action with no arguments outputs each input line unchanged, so the following command prints every line of the specified file −

gawk '{print}' filename.txt

Print a Specific Column

The gawk command can print a specific column (field) from each line of a file. For example, to print the second column of a file named myfile.txt, use −

gawk '{print $2}' myfile.txt

Print Lines Matching a Pattern

You can also use the gawk command to print the lines of a file that contain a specific word. For example, to print every line that contains the word Users in a file called file.txt, run −

gawk '/Users/ {print}' file.txt

Add Line Numbers

Besides printing lines that match a pattern, you can also print each line prefixed with its line number. For example −

gawk '{print NR, $0}' filename.txt

In the above command, NR is a built-in variable that holds the current record number (line number), and $0 refers to the entire current record (line).

Calculate the Sum of a Column

With the gawk command, you can also calculate the sum of the values in a column. For example, to sum the values in the first column of a file named filename.txt, use −

gawk '{sum += $1} END {print sum}' filename.txt

In the above command, {sum += $1} tells gawk to add the value in the first column ($1) to the variable sum for each line it processes. The END {print sum} part runs after all lines have been processed and prints the total accumulated in sum.

That's how you can use the gawk command on your Linux system.

Conclusion

The gawk command is a widely used Linux utility for text processing and data manipulation. With it, you can search for patterns, perform actions on matching lines, and use logical operators, string functions, and arithmetic operations.

This tutorial covered installing gawk on various Linux distributions, explained its syntax, and described the available options. It also provided practical examples to help you use the command effectively on Linux systems.