My Learnings: February 2014

Wednesday, 5 February 2014

File comparison

cmp filename1 filename2

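cmp -- one of the four principal UNIX file comparison utilities.

   It compares two files byte by byte and reports the byte offset and
   line number of the first difference. With -l it lists every
   differing byte, and with -s it prints nothing and only sets the
   exit status.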
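
A minimal sketch of the three common ways to call cmp. The file names a.txt and b.txt and their contents are made up for this illustration, and the files live in a throwaway directory.

```shell
# Hypothetical sample files, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmpdir/a.txt"
printf 'apple\nbanaNa\ncherry\n' > "$tmpdir/b.txt"

# Default mode: report the first differing byte and its line number.
# cmp exits with status 1 when the files differ, hence the "|| true".
cmp "$tmpdir/a.txt" "$tmpdir/b.txt" || true

# -s prints nothing; exit status 0 means identical, 1 means different.
if cmp -s "$tmpdir/a.txt" "$tmpdir/b.txt"; then
    cmp_result="identical"
else
    cmp_result="different"
fi
echo "$cmp_result"

# -l lists every differing byte: offset, then both byte values in octal.
cmp_list=$(cmp -l "$tmpdir/a.txt" "$tmpdir/b.txt" || true)
echo "$cmp_list"

rm -rf "$tmpdir"
```

The -s form is the one to use in scripts, since only the exit status matters there.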

comm -options filename1 filename2

comm -- compares two sorted files line by line and produces three
   columns of output. The first contains lines unique to the
   first file, the second, lines unique to the second, and the
   third, lines common to both files. Placing the digits 1, 2,
   and/or 3 in the options position (for example, -12)
   suppresses the corresponding columns.
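
The column suppression described above can be sketched as follows. The two input files are hypothetical and already sorted, which comm requires.

```shell
# Hypothetical sorted sample files, created only for this illustration.
tmpdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmpdir/a.txt"
printf 'banana\ncherry\ndate\n'  > "$tmpdir/b.txt"

# All three columns: unique to a.txt, unique to b.txt, common to both.
comm "$tmpdir/a.txt" "$tmpdir/b.txt"

# Suppress columns 2 and 3: lines found only in the first file.
only_first=$(comm -23 "$tmpdir/a.txt" "$tmpdir/b.txt")

# Suppress columns 1 and 3: lines found only in the second file.
only_second=$(comm -13 "$tmpdir/a.txt" "$tmpdir/b.txt")

# Suppress columns 1 and 2: lines common to both files.
common=$(comm -12 "$tmpdir/a.txt" "$tmpdir/b.txt")

echo "only in first:  $only_first"
echo "only in second: $only_second"
echo "common:"
echo "$common"

rm -rf "$tmpdir"
```

If the inputs are not sorted, run each through sort first, otherwise the columns are not reliable.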

To compare and scan large files

bdiff      Compares two large files.

bfs        Scans a large file to determine where to split
           it into smaller files.

To print out basic info on an account

finger

Format: finger username

To automatically correct mistakes in directory names

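Use shopt -s cdspell to have bash correct minor typos in the directory name given to cd (transposed characters, a missing character, or one extra character). The option only takes effect in interactive shells.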

# cd /etc/mall
-bash: cd: /etc/mall: No such file or directory

# shopt -s cdspell
# cd /etc/mall
# pwd
/etc/mail

Monday, 3 February 2014

To find duplicate and distinct records

1. To display the number of occurrences and the records.

sort f1.txt | uniq -c
1 bin
2 class
3 jar

2. To display only the duplicate records.

sort f1.txt | uniq -d
class
jar

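3. To display the distinct records.

sort f1.txt | uniq
bin
class
jar

Note that uniq -u is not the same thing: it prints only the records that occur exactly once.

sort f1.txt | uniq -u
bin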
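
The cases above can be checked on a small sample. The contents of f1.txt below are an assumption, reconstructed from the counts shown in this post (class twice, jar three times, bin once).

```shell
# Assumed contents of f1.txt, reconstructed from the counts shown above.
tmpdir=$(mktemp -d)
printf 'jar\nclass\nbin\njar\nclass\njar\n' > "$tmpdir/f1.txt"

# Occurrence count in front of each record (uniq pads the count with spaces).
sort "$tmpdir/f1.txt" | uniq -c

# One copy of every record, duplicated or not.
distinct=$(sort "$tmpdir/f1.txt" | uniq)

# One copy of each record that occurs more than once.
dups=$(sort "$tmpdir/f1.txt" | uniq -d)

# Only the records that occur exactly once.
singles=$(sort "$tmpdir/f1.txt" | uniq -u)

echo "distinct: $(echo "$distinct" | tr '\n' ' ')"
echo "dups:     $(echo "$dups" | tr '\n' ' ')"
echo "singles:  $singles"

rm -rf "$tmpdir"
```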

Sunday, 2 February 2014

To remove adjacent duplicate lines from the first file and write the result to a second file.

uniq myfile1.txt > myfile2.txt

Syntax:

uniq [option] filename

The options of the uniq command are:

c : Prefix each line with its number of occurrences.
d : Print only duplicated lines, one copy of each.
D : Print all duplicated lines (GNU uniq).
f : Skip the first N fields when comparing.
i : Ignore case when comparing.
s : Skip the first N characters when comparing.
u : Print only lines that are not repeated.
w : Compare no more than N characters per line (GNU uniq).
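
A short sketch of the comparison options -i, -f and -s. The input lines are invented for the illustration and are fed in with printf so that no files are needed.

```shell
# -i: ignore case, so Jar, jar and JAR count as the same line.
ignore_case=$(printf 'Jar\njar\nJAR\nbin\n' | uniq -i)

# -f 1: skip the first field (here a running number) when comparing.
skip_field=$(printf '1 jar\n2 jar\n3 bin\n' | uniq -f 1)

# -s 2: skip the first two characters when comparing.
skip_chars=$(printf 'xxjar\nyyjar\nzzbin\n' | uniq -s 2)

echo "$ignore_case"
echo "$skip_field"
echo "$skip_chars"
```

In each case uniq keeps the first line of a run of lines that compare equal.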

By default, uniq suppresses adjacent duplicate lines. Note that you
have to pass sorted input to uniq, as it compares only successive lines.

If the lines in the file are not in sorted order, use the
sort command and pipe its output to uniq.

> sort example.txt | uniq
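
To see why the sort step matters, here is a small sketch with made-up input in which the duplicate lines are not next to each other. The last command uses sort -u, which gives the same result as sort | uniq in a single step.

```shell
# Duplicates are not adjacent, so uniq alone removes nothing.
unsorted=$(printf 'jar\nbin\njar\n' | uniq)

# After sorting, the duplicates are adjacent and uniq collapses them.
sorted=$(printf 'jar\nbin\njar\n' | sort | uniq)

# sort -u sorts and removes duplicates in one step.
short=$(printf 'jar\nbin\njar\n' | sort -u)

echo "uniq only:   $(echo "$unsorted" | tr '\n' ' ')"
echo "sort | uniq: $(echo "$sorted" | tr '\n' ' ')"
echo "sort -u:     $(echo "$short" | tr '\n' ' ')"
```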