The //**uniq**// command in Linux and Unix-like systems filters out duplicate lines from a sorted file. It compares adjacent lines and removes the duplicates, preserving a single copy of each repeated line.
Here is the basic syntax for the **uniq** command:
uniq [options] [file]
file is the name of the file whose duplicate lines you want to filter out; if no file is given, **uniq** reads from standard input.
Here are some examples of the uniq command in action:
1. To filter out the duplicate lines from a file called "file1.txt"
$ sort file1.txt | uniq
Note: The **sort** command is used to group identical lines together before running the **uniq** command, because **uniq** only compares adjacent lines and will miss duplicates that are not next to each other.
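For instance, assuming a hypothetical file1.txt with the following contents:

apple
banana
apple
cherry
banana

the pipeline prints each distinct line once:

$ sort file1.txt | uniq
apple
banana
cherry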
2. To display the count of the occurrences of each line
$ sort file1.txt | uniq -c
Note: The **-c** option prefixes each output line with the number of times it occurred in the input.
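Using the same hypothetical file1.txt as above (two apples, two bananas, one cherry), the output would look like this:

$ sort file1.txt | uniq -c
      2 apple
      2 banana
      1 cherry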
3. To ignore the case when comparing lines
$ sort -f file1.txt | uniq -i
Note: The **-f** option of **sort** performs case-insensitive (case-folded) sorting, so lines that differ only in letter case end up adjacent; the **-i** option of **uniq** then ignores case when comparing those adjacent lines. Both are needed for a truly case-independent result, since **uniq** without **-i** still treats "Apple" and "apple" as different lines.
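For example, if a hypothetical file1.txt contained only the lines Apple, apple, and APPLE, the pipeline would collapse them into a single output line:

$ sort -f file1.txt | uniq -i

Since **uniq** keeps the first line of each group, the exact case variant printed depends on how **sort** orders the tied lines.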
4. To display only the lines that are not repeated
$ sort file1.txt | uniq -u
Note: The **-u** option suppresses repeated lines, printing only the lines that occur exactly once.
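With the sample file1.txt used earlier (two apples, two bananas, one cherry), only the line that occurs exactly once is printed:

$ sort file1.txt | uniq -u
cherry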
5. To display only the duplicate lines
$ sort file1.txt | uniq -d
Note: The **-d** option prints one copy of each line that appears more than once.
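With the same sample file, one copy of each repeated line is printed:

$ sort file1.txt | uniq -d
apple
banana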
The **uniq** command is useful for filtering duplicate lines from a sorted file: it compares adjacent lines and keeps a single copy of each repeated line. This is handy when you have a large file and want to identify and remove duplicates, for example during a data-cleaning task. Additionally, **uniq** can be used in conjunction with other commands like [[unix_commands:sort|sort]] and [[unix_commands:grep|grep]] to further filter and manipulate output, and its counting mode can report how often each line occurs in a file, which is useful for analyzing data or for troubleshooting, as in the pipeline sketched below.
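As a closing sketch (again assuming the hypothetical file1.txt), a common pipeline counts occurrences with **uniq -c** and then sorts the result numerically to list the most frequent lines first:

$ sort file1.txt | uniq -c | sort -rn | head -5

Here **sort -rn** orders the counted lines in descending numeric order and **head -5** keeps the five most common lines.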