Issue
I have a file containing thousands of records. The records are grouped into sub-groups by the first six digits of the identity number they share, and some records are duplicates. I am trying to write a bash script that reads the file, finds the duplicate records together with the identity number they belong to, and prints each identity number followed by its duplicate records.
Current script:
#!/bin/bash
########## script to find duplicate records & their ID
INPUT="sourceFile.txt"
while read varName; do
echo "$varName"
if [ "$varName" = "NEXT" ]; then
sort $INPUT | uniq -d
echo "END OF ONE ID-NUMBER IN FILE"
fi
done < "$INPUT"
Sample INPUT_FILE:
NEXT
123456-
# requesting: displayName
displayName: Alpha Beta
displayName: Charly Delta Echo
displayName: Xerox Yingyang Zenox
displayName: Xerox Yingyang Zenox
NEXT
123999-
# requesting: displayName
displayName: Golf Harvey Indigo
displayName: Jaguar Kingston Lambda
displayName: Alma Nano Matter
displayName: Oxygen Pascal Queen
displayName: Romeo Saint Tropez Unicorn
displayName: Vauxhall Wellignton Woolwhich
displayName: Rodrigo Compton Hilside
displayName: Vauxhall Wellignton Woolwhich
NEXT
DESIRED / EXPECTED OUTPUT:
NEXT
123456-
displayName: Xerox Yingyang Zenox
displayName: Xerox Yingyang Zenox
END OF ONE ID-NUMBER IN FILE
NEXT
123999-
displayName: Vauxhall Wellignton Woolwhich
displayName: Vauxhall Wellignton Woolwhich
Thank you in advance for any ideas and clues.
Solution
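I have no idea why you want the duplicate lines twice, and I do not understand what the line "END OF ONE ID-NUMBER IN FILE" is doing in the middle of the output.
The following script prints just the duplicates, each one once per ID. It reads the records from standard input and, like the sample file, expects a closing NEXT line after the last group.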
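#! /bin/bash
# Skip the leading NEXT line.
read -r next; unset next
while true; do
    # Each group starts with its ID line; stop at end of input.
    read -r id || break
    # Skip the "# requesting: ..." comment line.
    read -r comment; unset comment
    dns=()
    while read -r dn; do
        if [[ $dn =~ ^NEXT$ ]]; then
            # End of group: print the ID, then each duplicated line once.
            printf 'NEXT\n'
            printf '%s\n' "$id"
            printf '%s\n' "${dns[@]}" | sort | uniq -d
            break
        else
            dns+=("$dn")
        fi
    done
done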
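Note that sort | uniq -d reports each repeated line only once, which is why the script above prints a single copy per duplicate. If the output from the question is really what is wanted (every copy of a duplicated record, plus an end marker after each ID), something along these lines might work. This is only a sketch and not part of the original answer: the function names all_copies, print_group and report_duplicates are made up for illustration, and it assumes the same layout as the sample file (a NEXT line, an ID line, one comment line, then the records). It collects the records of a group in a newline-separated string and, unlike the script above, also reports the last group when the file does not end with NEXT.

```shell
#!/bin/bash

# A single newline, used to join the records of one group.
newline='
'

# Hypothetical helper: print every copy of each line that occurs more than
# once on standard input (sorted order).
all_copies() {
    sort | awk '
        { count[$0]++; line[NR] = $0 }
        END { for (i = 1; i <= NR; i++) if (count[line[i]] > 1) print line[i] }
    '
}

# Hypothetical helper: print one group. $1 is the ID, $2 the group's records.
print_group() {
    printf 'NEXT\n%s\n' "$1"
    printf '%s' "$2" | all_copies
    printf 'END OF ONE ID-NUMBER IN FILE\n'
}

# Hypothetical helper: read the whole file from standard input, group by group.
report_duplicates() {
    read -r dn || return 0            # skip the leading NEXT line
    while read -r id; do
        read -r comment               # skip the "# requesting: ..." line
        records=""
        # Collect records until the next NEXT line or the end of the input.
        while read -r dn && [ "$dn" != "NEXT" ]; do
            records="${records}${dn}${newline}"
        done
        print_group "$id" "$records"
    done
}
```

It could then be run as report_duplicates < sourceFile.txt. The awk step keeps all lines of one group in memory, which should be fine for groups of the size shown in the sample.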
If you really want to hard-code the name of the input file, you can add the following line at the beginning of the script:
exec < sourceFile.txt
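If hard-coding the name feels too rigid, one alternative is to accept the file name as an optional argument and fall back to standard input otherwise. A minimal sketch, not from the original answer; the function name use_input_file is hypothetical:

```shell
#!/bin/bash

# Hypothetical helper: redirect standard input from the file named by the
# first argument, if there is one; otherwise leave standard input untouched.
use_input_file() {
    if [ "$#" -gt 0 ]; then
        exec < "$1"
    fi
}
```

Calling use_input_file "$@" near the top of the script would let it be started either with a file name as its first argument or with the file redirected to standard input.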
Answered By - ceving Answer Checked By - David Goodson (WPSolving Volunteer)