Issue
On Linux server, I have pipe-delimited files that have pipe (|) characters in some string columns. The files are pipe-delimited but the delimiter pipe is always in double quotes while pipes in the text/string can have any characters, space, etc around them. I would like to replace the non-delimiter pipes within the strings with a dash sign (-)
Here is an example:
col1|col2|col3|col4
"1"|"This is my column 2 |Although there is pipe here, it is not a delimiter pipe."|"And this is my 3rd column '|" with a pipe followed by a double quote"|"|and finally this 4th column starts with a | that is not'|a delimiter
I have tried some SED and AWK commands but could not properly exclude the delimiters ("|") from the replacement in the strings.
I am trying to get an output like this:
col1|col2|col3|col4
"1"|"This is my column 2 -Although there is pipe here, it is not a delimiter pipe."|"And this is my 3rd column '-" with a pipe followed by a double quote"|"-and finally this 4th column starts with a - that is not'-a delimiter
Solution
One of the ways I can think of is to replace "|" with some special character (## in this case) and replace all occurances of | with - then replace back ## with "|"
cat file.txt | sed -e "s/\"|\"/##/g" | tr "|" - | sed -e "s/##/\"|\"/g"
Output -
col1-col2-col3
"1"|"This is my column 2 -Although there is pipe here, it is not a delimiter pipe."|"And this is my 2nd column '-" with a pipe followed by a double quote"|"-and finally this 3rd column starts with a - that is not'-a delimiter
Answered By - Ananth Sathvick Answer Checked By - Mildred Charles (WPSolving Admin)