Issue
I am running the diff command on the two files below, and the output is not displayed clearly. The files and the requirement are given below.
I sort the files before comparing them, because that gives accurate differences for files with larger content. Without sort, a few values that are present in both files are reported as differences; adding sort to the diff command fixed that.
However, I want the output in a more structured format, as demonstrated below.
file1.txt
[head]
a=1
b=2
c=3
[tail]
d=0
e=2
file2.txt
[head]
a=1
b=2
c=4
[tail]
d=2
e=2
[HT]
a=24
Command used:
diff <(sort file1.txt) <(sort file2.txt) | grep '>' | sed 's/> *//'
diff output:
a=24
c=4
d=2
[HT]
Need to achieve:
[HT]
a=24
[head]
c=4
[tail]
d=2
Executed the code provided by Nic3500:
I almost got the expected output, but when comparing configuration files I am getting the section [mysqld] multiple times, as shown below, and values that contain a . are not displayed properly. [mysqld] needs to be displayed only once instead of multiple times.
Example:
[mysqld]
max_connections=200
max_user_connections=300
[mysqld]
slow_query_log
query_cache_size=0
[mysqld]
ignore_db_dir
[mysqld].slow_query_log_file='/home/slow_query
log
In the output above, the line is printed with the [mysqld]. prefix and is split at the . in slow_query.log. Can it be printed like this instead:
slow_query_log_file='/home/slow_query.log'
The same applies to the following:
[mysqld].log-error='/home/mysqld
log
It needs to be displayed as log-error='/home/mysqld.log'
Solution
I got something working, it is not super elegant or full of bash wizardry, but it works.
EDIT May 11th: fixed the problem with the . in the values. I had not tested with values containing . characters. The change also fixed problems when a variable does not have a value (i.e. there is no = sign in the line).
#!/bin/bash
# Takes a file in ini format and flattens it for easy diff
flatten_file ()
{
    file="$1"
    tmpfile="$2"
    section=""
    while IFS= read -r line
    do
        # empty lines are ignored
        if [[ "$line" =~ ^$ ]]
        then
            continue
        else
            # Section lines are not written, but kept in a variable
            if [[ "$line" =~ \[.*\] ]]
            then
                section="$line"
            else
                # Other lines are written in the temp file
                echo "$section.$line" >>"$tmpfile"
            fi
        fi
    done < "$file"
}
################################################################################
# Check arguments
if [[ $# -ne 2 ]]
then
    echo "ERROR: must have 2 arguments, the files to compare." >&2
    exit 1
else
    file1="$1"
    if [[ ! -f "$file1" ]]
    then
        echo "ERROR: $file1 does not exist." >&2
        exit 2
    fi
    file2="$2"
    if [[ ! -f "$file2" ]]
    then
        echo "ERROR: $file2 does not exist." >&2
        exit 3
    fi
fi
# flatten file1
tmpfile1="/tmp/$$_file1.tmp"
>"$tmpfile1"
flatten_file "$file1" "$tmpfile1"
cat "$tmpfile1" >&2
echo "DEBUG ======================================" >&2
# flatten file2
tmpfile2="/tmp/$$_file2.tmp"
>"$tmpfile2"
flatten_file "$file2" "$tmpfile2"
cat "$tmpfile2" >&2
echo "DEBUG ======================================" >&2
tmpdiffout="/tmp/$$_diffout.tmp"
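# keep only the lines diff reports for the second file (anchored, so a '>' inside a value is not matched)
diff "$tmpfile1" "$tmpfile2" | grep '^>' | sed 's/^> *//' >"$tmpdiffout"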
cat "$tmpdiffout" >&2
echo "DEBUG ======================================" >&2
# Print the output of diff into the ini file format
sectionoutput=""
while IFS= read -r outputline
do
    # for every line, print the section IF that section has not been output to screen already
    extracted_section=$(echo "$outputline" | cut -d'.' -f1)
    if [[ "$extracted_section" != "$sectionoutput" ]]
    then
        echo "$extracted_section"
        sectionoutput="$extracted_section"
    fi
    # for every line, output the variable and value in that line
    echo "$outputline" | cut -d'.' -f2-
done < "$tmpdiffout"
# Cleanup
rm -f "$tmpfile1"
rm -f "$tmpfile2"
rm -f "$tmpdiffout"
- To run this script, do:
./script.bash file1 file2 2>/dev/null
- To see debug messages, do:
./script.bash file1 file2
Function flatten_file() transforms your file format (which looks like an ini file) into what @Charles Duffy proposed in the comments.
Your file1 becomes
[head].a=1
[head].b=2
[head].c=3
[tail].d=0
[tail].e=2
And similarly for file2. Then the same diff as you used is run to get the diffs.
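The flattening step can also write to standard output, which lets diff read both flattened files through process substitution instead of temporary files. The following is a minimal sketch under that assumption, not a replacement for the script above; the helper name flatten_to_stdout and the throwaway sample files are hypothetical and only there for illustration.

```shell
#!/bin/bash
# Sketch only: flatten_to_stdout is a hypothetical helper, not part of the script above.
# It prints every non-section line prefixed with its section, e.g. "[head].a=1".
flatten_to_stdout ()
{
    local section="" line
    # the "|| [[ -n ... ]]" part keeps a last line that has no trailing newline
    while IFS= read -r line || [[ -n "$line" ]]
    do
        # empty lines are ignored
        [[ -z "$line" ]] && continue
        if [[ "$line" =~ ^\[.*\]$ ]]
        then
            # section header: remember it, do not print it
            section="$line"
        else
            printf '%s.%s\n' "$section" "$line"
        fi
    done < "$1"
}

# Demo with throwaway copies of file1.txt and file2.txt from the question
demo_dir=$(mktemp -d)
printf '%s\n' '[head]' 'a=1' 'b=2' 'c=3' '[tail]' 'd=0' 'e=2' > "$demo_dir/file1.txt"
printf '%s\n' '[head]' 'a=1' 'b=2' 'c=4' '[tail]' 'd=2' 'e=2' '[HT]' 'a=24' > "$demo_dir/file2.txt"

# Lines that are new or changed in file2, still in flattened form
diff <(flatten_to_stdout "$demo_dir/file1.txt") \
     <(flatten_to_stdout "$demo_dir/file2.txt") | grep '^>' | sed 's/^> //'

rm -rf "$demo_dir"
```

Because nothing is written to /tmp by the comparison itself, there is no cleanup step to forget; the trade-off is that the flattened intermediate files are no longer available for debugging.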
Finally the diff output is formatted back into the desired format. The output is therefore:
[head]
c=4
[tail]
d=2
[HT]
a=24
Note that it preserves the order of sections, so [HT] is at the bottom. I do not see why you would want it on top, since that does not follow the diff output order.
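If each section should appear only once, and in a predictable order, one option is to sort the flattened lines before comparing them, since sorting places all lines of a section next to each other. The following is a minimal sketch under that assumption rather than a change to the script above. It expects already-flattened files in which every line carries a section prefix, uses comm -13 on sorted input in place of diff, grep and sed, and forces the C locale so that [HT] sorts before [head]. The function names new_lines_sorted and print_as_ini are hypothetical.

```shell
#!/bin/bash
# Sketch only: both function names are hypothetical, chosen for illustration.

# Print the flattened lines that occur only in the second file.
# comm needs sorted input; -13 hides lines unique to file 1 and lines common to both.
new_lines_sorted ()
{
    LC_ALL=C comm -13 <(LC_ALL=C sort "$1") <(LC_ALL=C sort "$2")
}

# Turn flattened "[section].key=value" lines from stdin back into ini form,
# printing a section header only when it differs from the previous line's section.
# Assumption: every input line starts with a "[section]." prefix.
print_as_ini ()
{
    local line section last=""
    while IFS= read -r line
    do
        section="${line%%].*}]"       # text up to the first "]." plus the closing bracket
        if [[ "$section" != "$last" ]]
        then
            printf '%s\n' "$section"
            last="$section"
        fi
        printf '%s\n' "${line#*].}"   # the rest of the line, dots in values included
    done
}

# Demo with the flattened form of file1.txt and file2.txt from the question
demo_dir=$(mktemp -d)
printf '%s\n' '[head].a=1' '[head].b=2' '[head].c=3' '[tail].d=0' '[tail].e=2' > "$demo_dir/flat1"
printf '%s\n' '[head].a=1' '[head].b=2' '[head].c=4' '[tail].d=2' '[tail].e=2' '[HT].a=24' > "$demo_dir/flat2"
new_lines_sorted "$demo_dir/flat1" "$demo_dir/flat2" | print_as_ini
rm -rf "$demo_dir"
```

With the sample data the sections come out in the order [HT], [head], [tail]. The cost of this approach is that the original order of keys inside a section is replaced by sorted order.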
FYI if you are comparing strictly ini files, check https://superuser.com/questions/28738/how-to-compare-two-or-more-ini-files. That is why I was asking about other languages - tools.
Answered By - Nic3500 Answer Checked By - David Marino (WPSolving Volunteer)