Issue
I have this file:
m64071_220512_054244/12584899/ccs rev pet047-10055 ACGTGCGACCTTGTGA TTGAGGGTTCAAACGTGCGACCTTGTGA
m64071_220512_054244/128321000/ccs rev pet047-10055 ACGTGCGACCTTGTGA TTGAGGGTTCAAACGTGCGACCTTGTGA
m64071_220512_054244/132186699/ccs fwd pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
m64071_220512_054244/134874748/ccs fwd pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
I need to tr
and reverse
the fields $4 and $5 only if $2==rev
Expect:
m64071_220512_054244/12584899/ccs rev pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
m64071_220512_054244/128321000/ccs rev pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
m64071_220512_054244/132186699/ccs fwd pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
m64071_220512_054244/134874748/ccs fwd pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
I tried:
perl -lpe 'if(/rev/) {$rev=/rev/;next}; if ($rev) {$F[4,5]=~tr/ATGC/TACG/; $F[4,5]=reverse $F[4,5]; print "@F"}' file
I also tried to go with Awk (Execute bash command inside awk and print command output)
awk '{
if($2==rev)
{
cmd1="echo \047" $4 "\047 | rev | tr \047ATGC\047 \047TACG\047"
cmd2="echo \047" $5 "\047 | rev | tr \047ATGC\047 \047TACG\047"
newVar1=((cmd1 | getline line) > 0 ? line : "failed")
newVar2=((cmd2 | getline line) > 0 ? line : "failed")
close(cmd)
print $1, $2, $3, newVar1, newVar2
}
else {print}
}' file
Solution
To follow the question's attempt:
perl -w -lanE'
if ($F[1] eq "rev") {
for (@F[3,4]) { tr/ATGC/TACG/; $_ = reverse $_ }
}
say "@F"
' file
Can put this on one line, I spread it out for readability (or copy-paste it as is into most shells). Or, put it in a program file of course, specially if there is more to do.
Edit -- comments on the code in the question
To have it break up the input string into the
@F
array ("autosplit") need-a
flagSince you are explicitly printing what is needed use
-n
flag, not-p
Fields 4 and 5 in the line are array elements 3 and 4
I assume that by
$F[4,5]
you mean the two array elements (which should be 3,4). That, then, should be@F[3,4]
-- and with-w
flag, for warnings, we'd hear about itMore importantly, we cannot bind a regular expression or a
tr
pattern to a list, but only to a single scalar. In order to apply thattr
to multiple items need to iterate over them, like above†The code in the block of that loop over
@F[3,4]
elements runs on each element of the array and modifies it in place (since$_
is just an alias to array elements), first withtr
then by (reversing and) assigning. All that can also be written as$_ = reverse tr/ATGC/TACG/r for @F[3,4];
The
r
modifier makestr
return the changed string, which is thenreverse
-ed and then assigned back to the currently processed array element, via the$_
alias for it
† With @ary =~ /.../
the LHS is first evaluated so we get something like
$ary[0], ..., $ary[-1] =~ /.../;
All terms up to the last one are discarded (see the comma operator for the logic of it), one by one, and then $ary[-1] =~ /.../
evaluated and its return returned.
Answered By - zdim Answer Checked By - Katrina (WPSolving Volunteer)