Introduction
The comm command is a simple Linux utility for comparing files with a focus on their common content. The command compares two sorted files line by line and displays the results in three columns.
The following guide explains how to use the Linux comm command with examples.
Prerequisites
- A system running Linux.
- Access to the terminal.
comm Command Syntax
The comm command is run with two arguments stating the names of the files to be compared. Adding options before the file names customizes the output.
The basic comm syntax is:
comm [file_name_1] [file_name_2]
comm [options] [file_name_1] [file_name_2]
However, executing comm without any file names doesn't provide results. Instead, the command prints a missing operand error and suggests running comm --help.
comm Command Options
Using comm is simple, but appending options provides additional customization.
The table below includes all comm options:
Option | Description |
---|---|
-1 | Prints the output without the first column, hiding the lines unique to the first file. |
-2 | Hides the second column in the output (lines unique to the second file). |
-3 | Does not print the column containing the common lines. |
--check-order | Checks that the input files are correctly sorted, even if all input lines are pairable. |
--nocheck-order | Prints the result without checking whether the files are sorted. |
--output-delimiter=[string] | Replaces the default tab separator between columns with the specified string. |
--total | Prints a summary line with the total number of lines in each column. |
-z | Uses the NUL character instead of newline as the line delimiter. |
--help | Shows help information. |
--version | Displays version information. |
Linux comm Examples
The comm command works with two sorted files. To understand how comm works, set up a test environment:
- Create two test files.
- Name the files (for example, File_1 and File_2).
- Add different words or numbers to each file.
- Make sure some content overlaps.
Note: Linux offers numerous ways to create files from the terminal. The easiest way is to use the touch command.
Use the cat command to display the File_1 and File_2 content. The output shows that files overlap in three words (art, dog, and way):
The following sections use File_1 and File_2 to explain how comm works.
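A minimal sketch of such a setup, assuming a scratch directory is acceptable. Only art, dog, and way come from this guide; the remaining words (bee, sun, fox, map) are placeholders chosen for illustration.

```shell
# Work in a throwaway directory so no existing files are overwritten.
cd "$(mktemp -d)"

# One word per line, already in sorted order.
# art, dog, and way appear in both files; the other words are placeholders.
printf '%s\n' art bee dog sun way > File_1
printf '%s\n' art dog fox map way > File_2

# sort -c prints nothing and exits with status 0 when a file is sorted.
sorted=no
sort -c File_1 && sort -c File_2 && sorted=yes

cat File_1 File_2
```

Using printf instead of touch fills the files in one step, and the sort -c check confirms that the files meet the sorted-input requirement before they are passed to comm.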
Compare Two Files
Compare two sorted files line by line with:
comm File_1 File_2
The command prints results in three columns:
- Column 1 shows only values unique to the first file.
- Column 2 prints items present only in the second file.
- Column 3 displays content common for both files: art, dog, and way.
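The column layout can be hard to see because comm indents columns with tabs. The sketch below, which assumes two small test files with placeholder words (only art, dog, and way come from this guide), makes the tabs visible.

```shell
cd "$(mktemp -d)"
printf '%s\n' art bee dog sun way > File_1
printf '%s\n' art dog fox map way > File_2

# Column 1 has no indent, column 2 is preceded by one tab,
# and column 3 is preceded by two tabs.
result=$(comm File_1 File_2)

# Replace every tab with the literal text <TAB> to show the indentation.
visible=$(printf '%s\n' "$result" | sed "s/$(printf '\t')/<TAB>/g")
printf '%s\n' "$visible"
```

With these files, the common words art, dog, and way are each printed after two <TAB> markers, fox and map after one, and bee and sun with none.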
Hide Columns
Use the -1, -2, and -3 options, alone or combined, to display only particular columns. For instance, print only the lines common to both files with:
comm -12 File_1 File_2
Using -12 with comm hides the first and second columns, leaving only the one containing lines shared by both files.
On the other hand, -3 hides column three and displays lines unique to each file:
comm -3 File_1 File_2
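Combining two of the three options isolates a single column. The following sketch collects each column into its own variable; the file contents are placeholders apart from art, dog, and way.

```shell
cd "$(mktemp -d)"
printf '%s\n' art bee dog sun way > File_1
printf '%s\n' art dog fox map way > File_2

only_first=$(comm -23 File_1 File_2)   # hide columns 2 and 3
only_second=$(comm -13 File_1 File_2)  # hide columns 1 and 3
in_both=$(comm -12 File_1 File_2)      # hide columns 1 and 2

# Unquoted expansion joins the lines with spaces for a compact report.
echo "Only in File_1:" $only_first
echo "Only in File_2:" $only_second
echo "In both files: " $in_both
```

When the columns to the left are hidden, comm also drops their leading tabs, so each variable holds plain words, one per line.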
Ignore Case
Comparison with comm is case sensitive. All words in File_1 and File_2 are lowercase, so comm recognizes art, dog, and way as common for both files.
However, if, for example, the word art is capitalized as Art in File_1 but lowercase in File_2, the case difference prompts comm to register the word as unique to each file. While comm does not accept -i as an option to ignore case, the tr command provides a workaround.
Use tr on the two files to convert case and then redirect the output to temporary files (Temp_1, Temp_2):
tr A-Z a-z <File_1 > Temp_1
tr A-Z a-z <File_2 > Temp_2
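The tr command converts the content of both files to lowercase and saves the output to new files (Temp_1, Temp_2). The original files remain unchanged. Because changing case can also change the sort order, run sort on the temporary files if comm reports that they are not in sorted order.
Run comm on Temp_1 and Temp_2 to compare files while "ignoring" case: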
comm Temp_1 Temp_2
The command outputs lowercase art as the common word for both files.
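The conversion and the comparison can also be chained. The sketch below is one possible variant, not the only approach: it uses the POSIX character classes [:upper:] and [:lower:] and pipes the converted text through sort so the temporary files stay in sorted order. The file contents are placeholders apart from art, dog, and way.

```shell
cd "$(mktemp -d)"
printf '%s\n' Art bee dog sun way > File_1
printf '%s\n' art dog fox map way > File_2

# Lowercase each file, re-sort it, and save the result to a temporary file.
tr '[:upper:]' '[:lower:]' < File_1 | sort > Temp_1
tr '[:upper:]' '[:lower:]' < File_2 | sort > Temp_2

# Print only the lines that both lowercased files share.
common=$(comm -12 Temp_1 Temp_2)
printf '%s\n' "$common"
```

The original File_1 still contains Art; only Temp_1 holds the lowercase version.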
Compare Unsorted Files
The comm command only produces reliable output when sorted files are used as arguments. When comm works with unsorted files, the output is not usable, and the command usually prints an error message.
For example, the following two files are not sorted, as shown with cat:
When you use comm on unsorted files, the output prints:
While comm pairs certain lines and produces an output, the output is incomplete and unusable. The error message specifies that neither file is in sorted order.
To verify that the files are not sorted, use comm --check-order:
comm --check-order Not_Sorted_File_1 Not_Sorted_File_2
The --check-order option prints the error message and stops comm from comparing files at the first unsorted item.
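In a script, the exit status is usually more convenient than the error text. The sketch below assumes GNU comm, which stops with a non-zero exit status when --check-order finds unsorted input. The file contents are hypothetical.

```shell
cd "$(mktemp -d)"
printf '%s\n' dog art way > Not_Sorted_File_1
printf '%s\n' way art dog > Not_Sorted_File_2

# Discard the regular output and the error message; only the exit status matters.
if comm --check-order Not_Sorted_File_1 Not_Sorted_File_2 > /dev/null 2>&1; then
    status="sorted"
else
    status="not sorted"
fi
echo "Input is $status"
```

A guard like this lets a script sort its input first, or stop early, instead of working with unreliable columns.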
To force comm to print an output and hide the error message, use --nocheck-order:
comm --nocheck-order Not_Sorted_File_1 Not_Sorted_File_2
The output is not reliable. For instance, the word art is present in columns one and two, even though it's common for both files.
The reliable way to use comm with unsorted files is to sort them first. In a shell that supports process substitution, such as Bash, execute the following:
comm <(sort Not_Sorted_File_1) <(sort Not_Sorted_File_2)
The output shows that the words art, dog, and way are common for both files.
Note that sort without the -o option only writes to standard output and does not change the source files. The same approach works with other comm options.
For example, to print only words common for both unsorted files, use:
comm -12 <(sort Not_Sorted_File_1) <(sort Not_Sorted_File_2)
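The <( ) syntax is process substitution, which Bash and some other shells provide but plain POSIX sh does not. Where it is unavailable, a pipe combined with the - argument gives the same result. The sketch below assumes hypothetical file contents and a hypothetical temporary file named Sorted_2.

```shell
cd "$(mktemp -d)"
printf '%s\n' dog art way sun bee > Not_Sorted_File_1
printf '%s\n' way fox art map dog > Not_Sorted_File_2

# Sort the second file into a temporary file (Sorted_2 is a hypothetical name).
sort Not_Sorted_File_2 > Sorted_2

# Sort the first file on the fly and pass it to comm as standard input (-).
common=$(sort Not_Sorted_File_1 | comm -12 - Sorted_2)
printf '%s\n' "$common"
```

Only one input can come from the pipe, so the other file still needs a sorted copy on disk.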
To sort the source files and then execute comm, use sort -o. The -o option saves the sorted output to a specific file.
For example, sort Not_Sorted_File_1 and save the output to that same file with:
sort -o Not_Sorted_File_1 Not_Sorted_File_1
Running cat after sorting the file shows that the file is sorted now. Repeat the same process for the second file:
sort -o Not_Sorted_File_2 Not_Sorted_File_2
Run comm to compare files:
comm Not_Sorted_File_1 Not_Sorted_File_2
Compare Directories
Use comm with ls to compare file names in two directories. For example, compare Directory1 and Directory2:
comm <(ls Directory1) <(ls Directory2)
The first column represents file names unique to Directory1, the second those unique to Directory2, and the third one represents file names common for both folders.
When running comm with ls, the command only looks at file names, not the content. Files listed as common for both folders could still differ in content even though they have the same name.
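To find out whether files with the same name actually match, the list of common names can be passed to cmp. The following is a sketch under stated assumptions: the directory contents and the names_2 helper file are hypothetical, and file names are assumed to contain no whitespace.

```shell
cd "$(mktemp -d)"
mkdir Directory1 Directory2
echo "version 1" > Directory1/notes.txt
echo "version 2" > Directory2/notes.txt
echo "same text" > Directory1/readme.txt
echo "same text" > Directory2/readme.txt
echo "extra" > Directory1/draft.txt

# File names present in both directories (names_2 is a hypothetical helper file).
ls Directory2 > names_2
common=$(ls Directory1 | comm -12 - names_2)

# cmp -s is silent and exits with a non-zero status when two files differ.
changed=$(for name in $common; do
    cmp -s "Directory1/$name" "Directory2/$name" || echo "$name"
done)
echo "Same name, different content: $changed"
```

Here notes.txt and readme.txt exist in both directories, but only notes.txt is reported because its content differs.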
Use comm with STDIN
To compare a file with standard terminal input, use a hyphen as one of the arguments with comm.
For example, compare File_1 with the standard input using:
comm File_1 -
Type the text to compare against File_1 directly into the terminal, one item per line and in sorted order. Press Ctrl+D on an empty line to end the input.
The command prints the output in three columns, using STDIN instead of File_2.
The first column represents content unique to File_1, the second shows words found only in the standard input, and the third words common for both.
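Standard input does not have to be typed by hand; it can also come from a pipe. The sketch below feeds a few hypothetical words through sort and into comm, printing only the words that File_1 also contains.

```shell
cd "$(mktemp -d)"
printf '%s\n' art bee dog sun way > File_1

# The piped words are sorted first, then read by comm through the - argument.
matches=$(printf '%s\n' way zoo art | sort | comm -12 File_1 -)
printf '%s\n' "$matches"
```

The word zoo is missing from File_1, so only art and way are printed.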
Change the Default Separator
The comm output separates columns with tabs by default. To change the separator, use the --output-delimiter option.
For example, to use * instead of tabs, run:
comm --output-delimiter='*' File_1 File_2
The output shows that words in File_1 have no asterisk, those in File_2 have one asterisk, and items common for both files have two asterisks.
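A short sketch of this behavior, assuming GNU comm and placeholder file contents (only art, dog, and way come from this guide). The asterisk is quoted so the shell does not treat it as a wildcard.

```shell
cd "$(mktemp -d)"
printf '%s\n' art bee dog sun way > File_1
printf '%s\n' art dog fox map way > File_2

marked=$(comm --output-delimiter='*' File_1 File_2)
printf '%s\n' "$marked"
# Output with these files:
# **art
# bee
# **dog
# *fox
# *map
# sun
# **way
```

The delimiter takes the place of each tab, which is why column two starts with one asterisk and column three with two.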
Show Line Counts
Show the total number of lines in each column with the --total option:
comm --total File_1 File_2
The output ends with a summary line that lists the number of lines in each column, followed by the word total.
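Because the summary is the last line of the output, a script can pick it out with tail. The sketch below assumes a GNU comm version that supports --total and uses placeholder file contents; the variable names are illustrative.

```shell
cd "$(mktemp -d)"
printf '%s\n' art bee dog sun way > File_1
printf '%s\n' art dog fox map way > File_2

# Keep only the final summary line: three counts followed by the word "total".
summary=$(comm --total File_1 File_2 | tail -n 1)

# Split the summary on whitespace into positional parameters.
set -- $summary
unique_1=$1
unique_2=$2
shared=$3
echo "Only in File_1: $unique_1, only in File_2: $unique_2, in both: $shared"
```

With these files, two lines are unique to each file and three are shared.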
Conclusion
After following the steps from this tutorial, you know how to compare files line by line with the comm command.
Next, learn a different way to compare files with the diff command.