unix_commands:uniq

The uniq command in Linux and Unix-like systems filters out repeated lines. It compares adjacent lines only and collapses each run of identical lines into a single copy, so the input is usually sorted first to bring duplicates together.

Here is the basic syntax for the uniq command:

uniq [options] [file]

file is the name of the file to filter; if it is omitted, uniq reads from standard input. Here are some examples of the uniq command in action:

1. To filter out the duplicate lines from a file called “file1.txt”

$ sort file1.txt | uniq

Note: The sort command is run first because uniq only compares adjacent lines. On unsorted input, duplicates that are not next to each other are left in place.
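To see why the sort step matters, here is a minimal sketch using made-up sample data (the variable names are illustrative only). It runs uniq on the same input with and without sorting:

```shell
# Sample input: "apple" appears on line 1 and again on lines 3 and 4.
input='apple
banana
apple
apple'

# Without sorting, only the adjacent pair of "apple" lines is collapsed,
# so "apple" still appears twice in the result.
unsorted_result=$(printf '%s\n' "$input" | uniq)

# After sorting, all "apple" lines are adjacent and collapse to one.
sorted_result=$(printf '%s\n' "$input" | sort | uniq)

printf 'unsorted:\n%s\n\nsorted:\n%s\n' "$unsorted_result" "$sorted_result"
```

The unsorted run keeps three lines (apple, banana, apple), while the sorted run keeps two (apple, banana).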

2. To display the count of the occurrences of each line

$ sort file1.txt | uniq -c

Note: The -c option prefixes each output line with the number of times it occurred in the input.
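As a rough illustration of what the counts look like, the sketch below uses hypothetical sample data. The amount of padding in front of each count differs between uniq implementations, so awk is used here (as an assumption about what is convenient, not a requirement) to normalize the spacing:

```shell
# Hypothetical sample data: one word per line.
words='pear
apple
pear
fig
pear
apple'

# Sort so equal lines are adjacent, then count each run.
# awk reprints the count and the word separated by a single space.
counts=$(printf '%s\n' "$words" | sort | uniq -c | awk '{print $1, $2}')

printf '%s\n' "$counts"
```

Each output line is a count followed by the line it refers to, here 2 for apple, 1 for fig, and 3 for pear.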

3. To ignore the case when comparing lines

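$ sort -f file1.txt | uniq -i

Note: The -i option tells uniq to ignore case when comparing lines. It is available in the GNU and BSD versions of uniq but is not part of POSIX. The -f option of sort folds lowercase characters to their uppercase equivalents for comparison, so lines that differ only in case end up adjacent before uniq sees them. Without -i, uniq still treats such lines as different. In uniq itself, -f N has another meaning: skip the first N fields when comparing.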
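The following sketch shows case-insensitive deduplication on made-up names. It assumes a uniq that supports the -i option (GNU and BSD versions do; POSIX does not specify it). Which spelling of each name survives depends on how sort orders the tied lines, so the example only counts the results:

```shell
# Hypothetical sample data: two names in several spellings.
names='Alice
bob
ALICE
Bob
alice'

# sort -f groups lines that differ only in case; uniq -i then
# collapses each group to a single line.
folded=$(printf '%s\n' "$names" | sort -f | uniq -i)

# For contrast: without -i, uniq sees five different lines.
case_sensitive=$(printf '%s\n' "$names" | sort -f | uniq)

printf 'with -i: %s lines\n' "$(printf '%s\n' "$folded" | wc -l | tr -d ' ')"
printf 'without -i: %s lines\n' "$(printf '%s\n' "$case_sensitive" | wc -l | tr -d ' ')"
```

With -i the five input lines reduce to two; without it all five remain.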

4. To only display the unique lines

$ sort file1.txt | uniq -u

Note: The -u option displays only the lines that are not repeated, that is, lines that occur exactly once in the sorted input.

5. To only display the duplicate lines

$ sort file1.txt | uniq -d

Note: The -d option displays only the lines that are repeated, printing one copy of each.
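The difference between -u and -d is easiest to see side by side. This is a small sketch on made-up input, with illustrative variable names:

```shell
# Sample input: "a" and "d" occur once, "b" twice, "c" three times.
lines='a
b
b
c
c
c
d'

# -u keeps only lines that occur exactly once.
only_once=$(printf '%s\n' "$lines" | sort | uniq -u)

# -d keeps one copy of each line that occurs more than once.
repeated=$(printf '%s\n' "$lines" | sort | uniq -d)

printf 'uniq -u:\n%s\n\nuniq -d:\n%s\n' "$only_once" "$repeated"
```

Here uniq -u prints a and d, while uniq -d prints b and c. Together the two outputs cover every distinct line in the input.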

The uniq command collapses adjacent duplicate lines, which is why it is most often run after sort. It is useful when you have a large file and want to identify or remove duplicate lines, for example during a data-cleaning task. It can be combined with other commands such as sort and grep to further filter and manipulate output. With -c it also counts the occurrences of each line, which helps when analyzing data or troubleshooting.
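As one possible way to combine uniq with grep and sort for troubleshooting, the sketch below summarizes error messages by frequency. The log lines and their layout (date, level, message) are invented for illustration:

```shell
# Hypothetical log data: date, level, then a free-text message.
log='2024-01-01 ERROR disk full
2024-01-01 INFO started
2024-01-02 ERROR disk full
2024-01-02 ERROR timeout
2024-01-03 INFO stopped'

# 1. grep keeps only the ERROR lines.
# 2. cut drops the date so identical messages compare equal.
# 3. sort | uniq -c counts each distinct message.
# 4. sort -rn puts the most frequent message first.
# 5. awk squeezes the count padding down to single spaces.
summary=$(printf '%s\n' "$log" \
  | grep 'ERROR' \
  | cut -d' ' -f2- \
  | sort \
  | uniq -c \
  | sort -rn \
  | awk '{$1=$1; print}')

printf '%s\n' "$summary"
```

On this sample the summary lists "ERROR disk full" with a count of 2 ahead of "ERROR timeout" with a count of 1.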