Thursday, April 7, 2022

[SOLVED] Deleting multiple words from a file using terminal

Issue

I have a list of words (word1, word2, word3) that I want to delete from a file, file.txt. How can I do that from the terminal?

Solution

Assuming that:

  • Replacements should only occur for whole words, not just any substrings.
  • Replacements should occur in-place - i.e., the results should be written back to the input file.

  • GNU sed (adapted from @jaypal's comment):

    sed -r -i 's/\b(word1|word2|word3)\b//g' file.txt
    
  • FreeBSD/OSX sed:

    sed -E -i '' 's/[[:<:]](word1|word2|word3)[[:>:]]//g' file.txt
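
Because -i overwrites the file, it can help to preview the result first. The following is a minimal sketch, assuming GNU sed (for \b and -r); the scratch directory and sample.txt are hypothetical names used only for illustration. It prints the result without touching the file, then applies the change while keeping a backup. The attached-suffix form -i.bak is accepted by both GNU and FreeBSD/OSX sed.

```shell
# Hypothetical sample file in a scratch directory, for illustration only.
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
printf '%s\n' 'word1 stays word2' 'word10 and sword3 are kept' > "$sample"

# Preview: without -i, sed writes to stdout and leaves the file untouched.
preview=$(sed -r 's/\b(word1|word2|word3)\b//g' "$sample")
printf '%s\n' "$preview"

# Apply in place, keeping the original content as sample.txt.bak.
sed -r -i.bak 's/\b(word1|word2|word3)\b//g' "$sample"

# Remove the scratch directory when done:
# rm -r "$workdir"
```

Only the words themselves are deleted, so the whitespace that surrounded them stays in the output. Words that merely contain a search word, such as word10 or sword3, are left alone because of the word-boundary anchors.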
    

Variant solution in case the search words can be substrings of each other:

    # Array of sample search words.
    words=( 'arrest' 'arrested' 'word3' )

    # Sort them in reverse order and build up a list of alternatives
    # for use with `sed` later ('word3|arrested|arrest').
    # Note how the longer words among words that are substrings of
    # each other come before the shorter ones.
    reverseSortedAlternativesList=$(printf '%s\n' "${words[@]}" | sort -r | tr '\n' '|')
    # Remove the trailing '|'.
    reverseSortedAlternativesList=${reverseSortedAlternativesList%|}

    # GNU sed:
    sed -r -i 's/\b('"$reverseSortedAlternativesList"')\b//g' file.txt

    # FreeBSD/OSX sed:
    sed -E -i '' 's/[[:<:]]('"$reverseSortedAlternativesList"')[[:>:]]//g' file.txt
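
To check the variant on sample data before running it on a real file, a sketch along these lines may help. It assumes GNU sed again; the scratch directory, sample.txt and the variable names are hypothetical. As a small variation on the code above, it joins the sorted words with paste -s instead of tr, which leaves no trailing '|' to strip.

```shell
# Hypothetical sample data in a scratch directory, for illustration only.
workdir=$(mktemp -d)
printf '%s\n' 'he was arrested after the arrest' 'arresting word3' > "$workdir/sample.txt"

# Reverse-sort the search words and join them with '|'.
# paste -s joins all input lines with the delimiter given by -d.
alternatives=$(printf '%s\n' arrest arrested word3 | sort -r | paste -sd '|' -)

# GNU sed assumed (\b, -r). There is no -i here, so sample.txt is not modified.
result=$(sed -r 's/\b('"$alternatives"')\b//g' "$workdir/sample.txt")

printf '%s\n' "$alternatives" "$result"
rm -r "$workdir"
```

In this sample, both arrested and arrest are removed as whole words, while arresting is kept because it only contains a search word as a substring.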


Answered By - mklement0
Answer Checked By - David Marino (WPSolving Volunteer)