Tuesday, January 4, 2022

[SOLVED] Find text strings that have changed in HTML files

Issue

I need to change some terminology used in my web app. I have a bunch of HTML files in my git repository, and I want to find all occurrences of certain strings in the text of the HTML. I don't want to match strings inside the tags, because changing those would require changes in my scripts as well. Maybe I'll change them too later for consistency, but for now I just want to focus on the UI.

Suppose I want to replace "Assessment" with something else. Given:

<li ng-repeat="item in assessments">
     <h4>Assessment {{item.title}}</h4>
</li>

I just want to know that there's something on the second line that needs to change.


Solution

Git has a feature called textconv that can do transformations on files before presenting them. Usually this is used to display diffs of binary files, but you can use it to convert anything.

  1. Install html2text.
  2. Add this to your ~/.gitconfig:

    [diff "html"]
        textconv=html2text
    
  3. Add this to your repository's .gitattributes file:

    *.html diff=html
    
  4. Run git grep with the --textconv option:

    # Quote the pathspec so Git, not the shell, expands it (this also matches subdirectories).
    git grep --textconv -i assessment -- '*.html'
    # * *** Assessment {{item.title}} ***
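
The steps above can be tried safely in a throwaway repository before touching a real one. The following is only a sketch: the scratch directory and the file name page.html are made up for illustration, and a simple sed command stands in for html2text so the script runs without it (the sed stand-in does not handle tags that span several lines). Swap in html2text if it is installed.

```shell
#!/bin/sh
# Sketch: try the textconv setup in a scratch repository.
set -eu

demo_dir=$(mktemp -d)   # hypothetical scratch location
cd "$demo_dir"
git init -q .

# Sample markup from the question.
cat > page.html <<'EOF'
<li ng-repeat="item in assessments">
     <h4>Assessment {{item.title}}</h4>
</li>
EOF
git add page.html       # git grep searches tracked files by default

# Assumption: a sed tag stripper stands in for html2text.
# This writes to the scratch repository's own config, not ~/.gitconfig.
git config diff.html.textconv "sed -e 's/<[^>]*>//g'"
echo '*.html diff=html' > .gitattributes

# Plain search: counts the attribute line and the heading line.
plain_hits=$(git grep -c -i assessment -- '*.html')

# Text-only search: counts lines that still match after conversion.
text_hits=$(git grep --textconv -c -i assessment -- '*.html')

echo "plain: $plain_hits"
echo "text:  $text_hits"
```

With this sample file, the plain search should count two matching lines and the text-only search one, which is the difference the question is after.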
    

Textconv is off by default for git grep, but on by default for git diff. So you'll probably want to remove that line from .gitattributes after you're done. Otherwise you'll have to run git diff --no-textconv to see changes to the tags (and to make patches).
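
One possible way to avoid the cleanup is to leave the converter out of ~/.gitconfig and pass it only to the grep command with git -c. This is a sketch, not part of the original answer: grep_html_text is a hypothetical helper name, the sed command is again a stand-in for html2text, and it assumes diff.html.textconv is not set in any other config file.

```shell
#!/bin/sh
# Sketch: supply the converter per command instead of storing it in config.
set -eu

# Hypothetical helper; the sed command is a stand-in for html2text.
grep_html_text() {
    git -c diff.html.textconv="sed -e 's/<[^>]*>//g'" \
        grep --textconv "$@" -- '*.html'
}

demo_dir=$(mktemp -d)   # hypothetical scratch location
cd "$demo_dir"
git init -q .
echo '*.html diff=html' > .gitattributes
printf '%s\n' '<li ng-repeat="item in assessments">' \
    '     <h4>Assessment {{item.title}}</h4>' '</li>' > page.html
git add page.html

# Search the converted text only.
text_hits=$(grep_html_text -c -i assessment)
echo "text: $text_hits"

# Change a tag attribute in the working tree, then run a plain git diff.
# No converter is configured outside the helper, so the tag change shows up.
printf '%s\n' '<li ng-repeat="item in exams">' \
    '     <h4>Assessment {{item.title}}</h4>' '</li>' > page.html
diff_out=$(git diff -- page.html)
echo "$diff_out"
```

The trade-off is a longer grep command in exchange for git diff behaving as usual, even while the .gitattributes line stays in place.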



Answered By - z0r