Wednesday 24 April 2013

Script to Compare Two Files In and Out

In day-to-day work you may come across a situation where you need to compare two files in one of the ways below.

Q:
1) List the entries which are in File A and not in File B
2) List the entries which are in File B and not in File A
3) List the entries which are common to File A and File B
4) List the entries from File A and File B merged together, without duplicates
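
As a rough sketch of where this is heading, the four cases map onto sort and comm as follows. The file names A and B and their contents are made up for illustration, and mktemp -d is assumed to be available (it is common but not part of POSIX).

```shell
#!/bin/sh
# Hypothetical demo files, created in a throwaway directory.
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/A"
printf 'banana\ncherry\ndate\n'  > "$dir/B"

# comm expects sorted input, so sort each file first.
sort "$dir/A" > "$dir/A.sorted"
sort "$dir/B" > "$dir/B.sorted"

# comm prints three columns: only in file 1, only in file 2, in both.
# -23 hides columns 2 and 3, -13 hides 1 and 3, -12 hides 1 and 2.
only_a=$(comm -23 "$dir/A.sorted" "$dir/B.sorted")   # 1) in A, not in B
only_b=$(comm -13 "$dir/A.sorted" "$dir/B.sorted")   # 2) in B, not in A
common=$(comm -12 "$dir/A.sorted" "$dir/B.sorted")   # 3) in both files
merged=$(sort -u "$dir/A" "$dir/B")                  # 4) merged, no duplicates

printf 'only in A:\n%s\n' "$only_a"
printf 'only in B:\n%s\n' "$only_b"
printf 'common:\n%s\n' "$common"
printf 'merged:\n%s\n' "$merged"

rm -rf "$dir"
```

With these sample files, case 1 yields apple, case 2 yields date, case 3 yields banana and cherry, and case 4 yields all four words once each.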

Solution Shell Script:

#!/bin/sh
echo "Enter an Option from the below"
echo "enter Option 1 : to choose records which are in first file not there in second file"
echo "enter Option 2 : to choose records which are not there in first file and there in second file"
echo "enter Option 3 : to choose records which are common in both the file"
echo "enter Option 4 : to merge two files with unique records"
read -r option
echo "you have chosen option $option"
echo "enter first file name"
read -r first_file
echo "$first_file"
echo "enter second file name"
read -r second_file
echo "$second_file"
echo "enter output file name"
read -r output_filename

#### option 1: records that are in the first file but not in the second
#### (comm needs sorted input; -23 hides columns 2 and 3)

fun_1(){
sort "$first_file" > "${first_file}_sorted"
sort "$second_file" > "${second_file}_sorted"
comm -23 "${first_file}_sorted" "${second_file}_sorted" > "${output_filename}"
}

#### option 2: records that are in the second file but not in the first
#### (-13 hides columns 1 and 3)

fun_2(){
sort "$first_file" > "${first_file}_sorted"
sort "$second_file" > "${second_file}_sorted"
comm -13 "${first_file}_sorted" "${second_file}_sorted" > "${output_filename}"
}

#### option 3: records that are common to both files
#### (-12 hides columns 1 and 2)

fun_3(){
sort "$first_file" > "${first_file}_sorted"
sort "$second_file" > "${second_file}_sorted"
comm -12 "${first_file}_sorted" "${second_file}_sorted" > "${output_filename}"
}

#### option 4: merge the two files, keeping each record only once

fun_4(){
sort "$first_file" "$second_file" | uniq > "${output_filename}"
}

#### proceed only if both input files exist and are regular files

if [ -f "$first_file" ] && [ -f "$second_file" ]; then
echo "both files are exists and regular files"

#### call the function that matches the chosen option;
#### case also copes with empty or non-numeric input

case "$option" in
1|2|3|4)
fun_${option}
rm -f "${first_file}_sorted" "${second_file}_sorted"
;;
*)
echo "choose proper option"
;;
esac

else
echo "mentioned input files are not correct"
fi
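
For use from other scripts, a non-interactive variant may be handier. The sketch below is only one possible shape: the function name compare_files and its argument order are made up here, it writes to standard output instead of asking for an output file, and it assumes mktemp -d is available so that the sorted copies do not land next to the input files.

```shell
#!/bin/sh
# compare_files OPTION FILE_A FILE_B   (hypothetical helper, not part of the script above)
# Plain sh has no local variables, so opt, a, b, tmp and status are global.
compare_files() {
    opt=$1
    a=$2
    b=$3

    # Same check as the interactive script: both inputs must be regular files.
    if [ ! -f "$a" ] || [ ! -f "$b" ]; then
        echo "mentioned input files are not correct" >&2
        return 1
    fi

    # Keep the sorted copies in a private temporary directory.
    tmp=$(mktemp -d) || return 1
    sort "$a" > "$tmp/a"
    sort "$b" > "$tmp/b"

    case "$opt" in
        1) comm -23 "$tmp/a" "$tmp/b" ;;   # in A, not in B
        2) comm -13 "$tmp/a" "$tmp/b" ;;   # in B, not in A
        3) comm -12 "$tmp/a" "$tmp/b" ;;   # common to both
        4) sort -u "$tmp/a" "$tmp/b" ;;    # merged, no duplicates
        *) echo "choose proper option" >&2
           rm -rf "$tmp"
           return 2 ;;
    esac
    status=$?

    rm -rf "$tmp"
    return "$status"
}
```

With the function loaded, "compare_files 3 a.txt b.txt > c.txt" would do the same job as choosing option 3 interactively.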

Execution & O/P

bash-4.2$ cat a.txt
ORANGE2009040910002|10|1
GREEN200903304001052|1101|2
GREEN200903304001053|1101|2
GREEN200903304001054|1101|2
GREEN200903304001055|1101|2
ORANGE2009040910001|10|3
ORANGE2009040910004|10|1
ORANGE2009040910006|10|1
ORANGE2009040910003|10|1
ORANGE2009040910008|10|1
ORANGE2009040910005|10|1
ORANGE2009040910010|10|1
ORANGE2009040910007|10|1
ORANGE2009040910012|10|1
GREEN200903304001052|1101|2
GREEN200903304001053|1101|2
GREEN200903304001054|1101|2
GREEN200903304001055|1101|2

bash-4.2$ cat b.txt
ORANGE2009040910002|10|1
ORANGE2009040910001|10|3
ORANGE2009040910004|10|1
ORANGE2009040910006|10|1
ORANGE2009040910003|10|1
ORANGE2009040910008|10|1
ORANGE2009040910005|10|1
ORANGE2009040910010|10|1
ORANGE2009040910007|10|1
ORANGE2009040910012|10|1
GREEN200903304001008|1101|2
GREEN200903304001007|1101|2
GREEN200903304001010|1101|2
GREEN200903304001009|1101|2
GREEN200903304001011|1101|2
GREEN200903304001012|1101|2
GREEN200903304001014|1101|2

Execution:


bash-4.2$ comm.sh
Enter an Option from the below
enter Option 1 : to choose records which are in first file not there in second file
enter Option 2 : to choose records which are not there in first file and there in second file
enter Option 3 : to choose records which are common in both the file
enter Option 4 : to merge two files with unique records
3
you have chosen option 3
enter first file name
a.txt
a.txt
enter second file name
b.txt
b.txt
enter output file name
c.txt
both files are exists and regular files

bash-4.2$ cat c.txt
ORANGE2009040910001|10|3
ORANGE2009040910002|10|1
ORANGE2009040910003|10|1
ORANGE2009040910004|10|1
ORANGE2009040910005|10|1
ORANGE2009040910006|10|1
ORANGE2009040910007|10|1
ORANGE2009040910008|10|1
ORANGE2009040910010|10|1
ORANGE2009040910012|10|1
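
One thing to watch for: comm works line by line and keeps repeated lines. In a.txt above the four GREEN records each appear twice, so options 1 to 3 can list a record more than once. If that is not wanted, sorting with sort -u first is one way around it. The sketch below uses a cut-down, made-up pair of files built from two of the records above, and again assumes mktemp -d is available.

```shell
#!/bin/sh
dir=$(mktemp -d)

# Miniature stand-ins for a.txt and b.txt: one GREEN record is repeated in a.
printf '%s\n' 'GREEN200903304001052|1101|2' \
              'ORANGE2009040910002|10|1' \
              'GREEN200903304001052|1101|2' > "$dir/a"
printf '%s\n' 'ORANGE2009040910002|10|1' > "$dir/b"

# Plain sort keeps the repeated line, so comm reports it twice.
sort "$dir/a" > "$dir/a.sorted"
sort "$dir/b" > "$dir/b.sorted"
with_dups=$(comm -23 "$dir/a.sorted" "$dir/b.sorted")

# sort -u drops repeats before comm sees them.
sort -u "$dir/a" > "$dir/a.uniq"
sort -u "$dir/b" > "$dir/b.uniq"
without_dups=$(comm -23 "$dir/a.uniq" "$dir/b.uniq")

printf 'plain sort:\n%s\n' "$with_dups"
printf 'sort -u:\n%s\n' "$without_dups"

rm -rf "$dir"
```

Here the plain-sort run prints the GREEN record twice, while the sort -u run prints it once.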
