Thursday, October 6, 2022

[SOLVED] How to convert a variable containing sed-arguments to an array?

Issue

I'm using

sed (GNU sed) 4.4
GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)

I have a complicated set of sed arguments in a bash script, about 20 different -e expressions. Here is a simple example as a one-liner. It converts aa bb cc to aaBBcc:

sed -e 's# ##g' -e 's#b#B#g' <<< "aa bb cc"

or

k=('-e' 's# ##g'    '-e' 's#b#B#g'); sed "${k[@]}" <<< "aa bb cc"

However, there are 20-ish -e expressions and most are complicated. The script is only for me, so it doesn't have to follow convention or policy. To make the arguments readable and editable (to me), I assign them to a variable with extra whitespace, aligned in columns and indented. Here is a simplified version of what I mean:

#!/bin/bash
k="-e s#    #    #g \
   -e s# b  # B  #g \
  "

That simplified example doesn't show how useful that approach is to me. Anyway, here is the "working" script:

#!/bin/bash
k="-e s#    #    #g \
   -e s# b  # B  #g \
  "
k=$(sed -e 's# ##g'         <<< "$k")  #1 remove all spaces
k=$(sed -e 's|###|# ##|g'   <<< "$k")  #2 put needed space back in
k=$(sed -e 's#-e#|-e #g'    <<< "$k")  #3 delimit the args with "|"
k=$(sed -e 's#|##'          <<< "$k")  #4 remove the leading "|"
z=$IFS; IFS="|"; k=($k); IFS=$z        #5 convert variable to array
sed "${k[@]}" <<< "aa bb cc"           #6 process the string

Output is:

aaBBcc
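
To check what step 5 actually hands to sed, the array can be printed one element per line. A minimal sketch: steps 1 to 4 are folded into a single sed call (as in the real-world script further down), and the printf line is an addition for illustration only.

```shell
#!/bin/bash
k="-e s#    #    #g \
   -e s# b  # B  #g \
  "
# Steps 1-4: strip spaces, restore the needed space, delimit with "|", drop the leading "|"
k=$(sed -e 's# ##g' -e 's|###|# ##|g' -e 's#-e#|-e #g' -e 's#|##' <<< "$k")
# Step 5: split on "|" into an array
z=$IFS; IFS="|"; k=($k); IFS=$z
# Added for illustration: show each array element between angle brackets
printf '<%s>\n' "${k[@]}"
```

Each element holds both the -e and its expression as one word (for example "-e s#b#B#g"), so sed receives every expression attached to its -e option rather than as a separate argument.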

It works and it is readable for me. But it is really complicated, and it took me quite a while to figure out how to massage k into a form that sed would take.

It fails to work if I quote the expressions, as in -e 's#b#B#g'

Is there a less complicated way, and/or a way to quote the expressions? Must work with k whitespaced as above, sed 4.4, bash 4.4.12(1).
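
On the quoting failure: quotes stored inside a string are ordinary characters. When the variable is expanded, bash splits it into words but does not parse the quotes again, so sed receives the quote marks as part of its script. A minimal sketch of the difference, where the variable names q, words and a are illustrative and not from the scripts above:

```shell
#!/bin/bash
# Quotes inside a string are data: expansion splits words but never re-parses quotes.
q="-e 's#b#B#g'"
words=($q)                       # unquoted on purpose, to mimic what: sed $q  would pass
printf '<%s>\n' "${words[@]}"    # the second word still carries its literal single quotes

# Quotes typed in the source are parsed once, at assignment, and then removed.
a=(-e 's#b#B#g')
printf '<%s>\n' "${a[@]}"        # the second element is the bare expression
sed "${a[@]}" <<< "aa bb cc"
```

In the first case the second word is 's#b#B#g' with the quote marks included, which sed cannot parse as a command. In the second case the array element is the bare expression, which is why the array form shown near the top works.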

#######################################################

added 2022-09-26 14:58 PST:

Here is a real-world script for converting a URL before bookmarking. The caveat is that I wrote it for my own use. I don't have to figure out what the code is trying to do because I already know the paradigm; I invented it, or reinvented it.

https://www.ebay.com/sch/i.html?_from=R40&_trksid=123456&_nkw=%28vintage%2Cvtg%29+%28polartec%2Cfleece%29+%28full%2Czip%2Czips%2Czipper%2Czippered%2Czipping%29+-%28hilfiger%2C%22old+navy%22%2Chooded%2Ccamo%2Ccamouflage%2Cvest%2Csmall%2Cmedium%2Cxl%2Cxxl%2Chalf%2Cquarter%2C%221%2F4%22%2C%221%2F2%22%2C+lined%2Cwinnie%2Ctoddler%2Ckids%2Cladies%2Cwomens%2Cwomen%29&_sacat=11450&LH_TitleDesc=0&_odkw=%28vintage%2Cvtg%29+fleece+%28full%2Czip%2Czips%2Czipper%2Czippered%2Czipping%29+-%28hilfiger%2C%22old+navy%22%2Chooded%2Ccamo%2Ccamouflage%2Cvest%2Csmall%2Cmedium%2Cxl%2Cxxl%2Chalf%2Cquarter%2C%221%2F4%22%2C%221%2F2%22%2C+lined%2Cwinnie%2Ctoddler%2Ckids%2Cladies%2Cwomens%2Cwomen%29&_osacat=11450&_sop=10&LH_PrefLoc=3&_ipg=240&_udhi=99

into

https://www.ebay.com/sch/i.html?&_nkw=(vintage,vtg)+(polartec,fleece)+(full,zip,zips,zipper,zippered,zipping)+-(hilfiger,old+navy,hooded,camo,camouflage,vest,small,medium,xl,xxl,half,quarter,1/4,1/2,+lined,winnie,toddler,kids,ladies,womens,women)&_sacat=1145011450&_sop=10&LH_PrefLoc=3&_udhi=99&_ipg=240

#!/bin/bash
echo
k="-e s# [&]*_from=R40      #            #   \
   -e s# [&]*_trk[^&]*      #            #   \
   -e s# [&]*_odkw[^&]*     #            #   \
   -e s# [&]*_osacat[^&]    #            #   \
   -e s# [&]*_sacat=0       #            #   \
   -e s# [&]*LH_TitleDesc=0 #            #   \
   -e s# ++                 # +          #g  \
   -e s# %2F                # /          #g  \
   -e s# %28                # (          #g  \
   -e s# %29                # )          #g  \
   -e s# %2C                # ,          #g  \
   -e s# %22                #            #g  \
   -e s# &_ipg=[0-9]*       #            #   \
   -e s# $                  # \&_ipg=240 #   \
   "
k=$(sed -e 's# ##g'         \
        -e 's|###|# ##|g'   \
        -e 's#-e#|-e #g'    \
        -e 's#|##'          \
        <<< "$k"            \
   )
z=$IFS; IFS="|"; k=($k); IFS=$z        
sed "${k[@]}" <<< "$1"


Solution

Why not just use ; to join multiple sed operations into a single parameter? Something like this:

k="s#    #    #g;
   s# b  # B  #g;
"
k=$(sed 's# ##g; s|###|# ##|g' <<<"$k") # Clean up spaces in $k
sed "$k" <<< "aa bb cc"

Result: "aaBBcc". Your big pattern would look like this:

k="s# [&]*_from=R40      #            #   ;
   s# [&]*_trk[^&]*      #            #   ;
   s# [&]*_odkw[^&]*     #            #   ;
   s# [&]*_osacat[^&]    #            #   ;
   s# [&]*_sacat=0       #            #   ;
   s# [&]*LH_TitleDesc=0 #            #   ;
   s# ++                 # +          #g  ;
   s# %2F                # /          #g  ;
   s# %28                # (          #g  ;
   s# %29                # )          #g  ;
   s# %2C                # ,          #g  ;
   s# %22                #            #g  ;
   s# &_ipg=[0-9]*       #            #   ;
   s# $                  # \&_ipg=240 #   ;
   "

You could also do the whitespace-mangling as part of the sed command with a command substitution:

sed "$(sed 's# ##g; s|###|# ##|g' <<<"$k")" <<< "$1"
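
Putting the pieces together, here is a minimal sketch of the whole thing as a small script. It uses a shortened three-rule version of the pattern, and both the helper name clean_url and the example.com URL are made up for illustration; they are not from the question.

```shell
#!/bin/bash
# Shortened, illustrative rule set in the same spaced-out layout.
k="s# [&]*_from=R40  #   #   ;
   s# %2C            # , #g  ;
   s# %22            #   #g  ;
   "

# clean_url is a hypothetical helper name used only in this sketch.
clean_url() {
    local script
    # Strip the alignment spaces, then restore the space in any "match a space" rule.
    script=$(sed 's# ##g; s|###|# ##|g' <<< "$k")
    # The whole rule set is passed to sed as one script argument.
    sed "$script" <<< "$1"
}

clean_url 'https://example.com/?a=1&_from=R40&q=%22x%2Cy%22'
```

Because the rules travel as one double-quoted argument, no word splitting or array conversion is involved, and the individual expressions need no quoting of their own.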


Answered By - Gordon Davisson
Answer Checked By - Timothy Miller (WPSolving Admin)