Issue
I have file1:
(1'a lot of singapore 1.2.3.4 'some other info',0,null, 12345),
(2,'a lot of brazil 4.2.3.1, 'some other info',0,null, 12345),
(3,'a lot of india 3.4.2.1, 'some other info',0,null, 12345),
(4,'a lot of laos 1.3.4.5, 'some other info',0,null, 12345),
(5,'a lot of china 1.2.3.5, 'some other info',0,null, 12345);
and file2:
(1'a lot of singapore A.B.C.D 'some other info',0,null, 12345),
(2,'a lot of brazil E.F.G.H, 'some other info',0,null, 12345),
(3,'a lot of india H.I.J.K, 'some other info',0,null, 12345),
(4,'a lot of laos L.M.N.O, 'some other info',0,null, 12345),
(5,'a lot of china P.Q.R.S, 'some other info',0,null, 12345);
I have created a script that copies and replaces by line number, but I need it to work by keyword instead: look for singapore in file1 and copy the next word (1.2.3.4), then look for singapore in file2 and replace the next word there (A.B.C.D) with 1.2.3.4, so that the line in the final file2 looks like this:
(1'a lot of singapore 1.2.3.4 'some other info',0,null, 12345),
A Python, Awk, or sed script, or any other script, would be helpful.
So far I have created this to copy and replace by line number:
sed -i '2d' File2.txt
awk 'NR==5380{a=$0}NR==FNR{next}FNR==2{print a}1' file1.txt file2.txt
Solution
Here is a simple Awk script which extracts the replacement text from the first input file and replaces the corresponding token in the second input file.
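awk -v country="singapore" '
  # First file: remember the token which follows the country keyword
  NR == FNR { for (i = 2; i <= NF; i++) if ($(i-1) == country) token = $i; next }
  # Second file: replace the token which follows the country keyword
  $0 ~ country { for (i = 2; i <= NF; i++) if ($(i-1) == country) $i = token }
  # Print every line of the second file, modified or not
  1' file1 file2 >newfile2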
When we are reading file1, NR == FNR is true. We loop over the input tokens and check for one which matches country; if we find one, we set token to the word which follows it. This means that if there are multiple matches on the country keyword, the last one in the first input file will be extracted. The next statement causes Awk to skip the rest of the script for this input file, so the lines from file1 are only read, and not processed further.
If we get past the first rule, we are now reading file2. If we see a line which contains the keyword, we replace the token which follows the country keyword. (This requires the keyword to be an isolated token, not a substring within a longer word etc.) The final 1 causes all lines which get this far to be printed back to standard output, thus generating a copy of the second file with any substitutions performed.
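Since the question also asks about Python, the same logic might be sketched roughly as follows. This is only an illustration, assuming both files are small enough to hold in memory as lists of lines. The function names find_token and replace_token are made up for this example. Unlike the Awk script, which would substitute an empty string, this sketch leaves the second file untouched when the keyword never appears in the first file.

```python
def find_token(lines, keyword):
    """Return the word following the last occurrence of keyword, or None."""
    token = None
    for line in lines:
        words = line.split()
        # Compare each word with its predecessor, like $(i-1) == country in Awk
        for prev, cur in zip(words, words[1:]):
            if prev == keyword:
                token = cur  # later matches overwrite earlier ones
    return token


def replace_token(lines, keyword, token):
    """Return a copy of lines with the word after keyword replaced by token."""
    if token is None:
        # Assumption for this sketch: nothing to substitute, so change nothing
        return list(lines)
    result = []
    for line in lines:
        words = line.split()
        changed = False
        for i in range(1, len(words)):
            if words[i - 1] == keyword:
                words[i] = token
                changed = True
        # A modified line is rejoined with single spaces, as Awk does;
        # other lines are passed through as they are
        result.append(" ".join(words) if changed else line)
    return result


# Sample rows taken from the question
file1 = [
    "(1'a lot of singapore 1.2.3.4 'some other info',0,null, 12345),",
    "(2,'a lot of brazil 4.2.3.1, 'some other info',0,null, 12345),",
]
file2 = [
    "(1'a lot of singapore A.B.C.D 'some other info',0,null, 12345),",
    "(2,'a lot of brazil E.F.G.H, 'some other info',0,null, 12345),",
]

token = find_token(file1, "singapore")
new_file2 = replace_token(file2, "singapore", token)
print("\n".join(new_file2))
```

To work on real files, the two lists could be read with open(name).read().splitlines() and the result written back out joined with newlines.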
If you have any control over the data format used here, perhaps try to figure out a way to get the input in a less haphazard ad-hoc format, like JSON.
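To illustrate why a structured format helps, here is a rough Python sketch in which each row is a JSON object. The layout and the field names id, country, and address are purely hypothetical, invented for this example; nothing in the question says the data looks like this. With a dedicated field there is no need to guess which word is "next".

```python
import json

# Hypothetical JSON versions of file1 and file2 (field names are assumptions)
file1_json = """[
  {"id": 1, "country": "singapore", "address": "1.2.3.4"},
  {"id": 2, "country": "brazil", "address": "4.2.3.1"}
]"""
file2_json = """[
  {"id": 1, "country": "singapore", "address": "A.B.C.D"},
  {"id": 2, "country": "brazil", "address": "E.F.G.H"}
]"""

country = "singapore"

# Map each country in the first file to its address
source = {row["country"]: row["address"] for row in json.loads(file1_json)}

rows = json.loads(file2_json)
for row in rows:
    # Only rows for the requested country are updated
    if row["country"] == country and country in source:
        row["address"] = source[country]

print(json.dumps(rows, indent=2))
```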
Answered By - tripleee Answer Checked By - Robin (WPSolving Admin)