Thursday, April 7, 2022

[SOLVED] How to replace a string in a whole Git history?

Issue

I have one of my passwords commited in probably few files in my Git repo. Is there some way to replace this password with some other string in whole history automatically so that there is no trace of it? Ideally if I could write simple bash script receiving strings to find and replace by and doing whole work itself, something like:

./replaceStringInWholeGitHistory.sh "my_password" "xxxxxxxx"

Edit: this question is not a duplicate of that one, because I am asking about replacing strings without removing whole files.


Solution

At the beginning I'd like to thank ElpieKay, who posted core functions of my solutions, which I've only automatized.

So, finally I have script I wanted to have. I divided it into pieces which depend on each other and can serve as independent scripts. It looks like this:

censorStringsInWholeGitHistory.sh:

#!/bin/bash
#arguments are strings to censore

for string in "$@"
do
  echo ""
  echo "================ Censoring string "$string": ================"
  ~/replaceStringInWholeGitHistory.sh "$string" "********"
done

usage:

~/censorStringsInWholeGitHistory.sh "my_password1" "my_password2" "some_f_word"

replaceStringInWholeGitHistory.sh:

#!/bin/bash
# $1 - string to find
# $2 - string to replace with

for branch in $(git branch | cut -c 3-); do
  echo ""
  echo ">>> Replacing strings in branch $branch:"
  echo ""
  ~/replaceStringInBranch.sh "$branch" "$1" "$2"
done

usage:

~/replaceStringInWholeGitHistory.sh "my_password" "********"

replaceStringInBranch.sh:

#!/bin/bash
# $1 - branch
# $2 - string to find
# $3 - string to replace with

git checkout $1
for file in $(~/findFilesContainingStringInBranch.sh "$2"); do
  echo "          Filtering file $file:"
  ~/changeStringsInFileInCurrentBranch.sh "$file" "$2" "$3"
done

usage:

~/replaceStringInBranch.sh master "my_password" "********"

findFilesContainingStringInBranch.sh:

#!/bin/bash

# $1 - string to find
# $2 - branch name or nothing (current branch in that case)

git log -S "$1" $2 --name-only --pretty=format: -- | sort -u

usage:

~/findFilesContainingStringInBranch.sh "my_password" master

changeStringsInFileInCurrentBranch.sh:

#!/bin/bash

# $1 - file name
# $2 - string to find
# $3 - string to replace

git filter-branch -f --tree-filter "if [ -f $1 ];then sed -i s/$2/$3/g $1;fi"

usage:

~/changeStringsInFileInCurrentBranch.sh "abc.txt" "my_password" "********"

I have all those scripts located in my home folder, what is necessary for proper working in this version. I'm not sure that's the best option, but for now I cannot find better one. Of course every script has to be executable, what we can achieve with chmod +x ~/myscript.sh.

Probably my script is not optimal, for big repos it will process very long, but it works :)

And, at the very end, we can push our censored repo to any remote with:

git push <remote> -f --all

Edit: important hint from ElpieKay:

Don't forget to delete and recreate tags that you have pushed. They are still pointing to the old commits that may contain your password.

Maybe I'll improve my script in future to do this automatically.



Answered By - Karol Selak
Answer Checked By - Marie Seifert (WPSolving Admin)