Tuesday, January 4, 2022

[SOLVED] How to extract branch name using regex and sed?

Issue

How can I extract the branch name from a string using bash? For example, I have the following command:

branch=$(git branch -a --contains $sha)

This may return:

  1. * branch-1.0 (the prefix is always an asterisk)

  2. branch-2.0 remotes/origin/branch-2.0 (the separator may be a newline instead of a space)

  3. master remotes/origin/master (the separator may be a newline instead of a space)

I need only the branch name, and only once: master, branch-2.0 or branch-1.0. I know it can be done with the sed command, but I can't figure out how.

I use the following regex: (branch-[0-9].[0-9])|(master)


Solution

This is how it can be done in Bash, without using an external regex parser:

# Read the reference path into an array, splitting entries on /
# (read stops at the first newline, so only the first matching branch is used)
IFS=/ read -ra refname < <(
  # Get the full reference path of the branch that contains this sha
  git branch --format='%(refname)' --contains="$sha"
)

# Branch name is the last array element
branchname="${refname[-1]}"

printf 'The git branch name for sha: %s\nis: %s\n' "$sha" "$branchname"

Or using only POSIX shell syntax:

# Read reference path
refname=$(
  # Obtain full branch reference path that contains this sha
  git branch --format='%(refname)' --contains="$sha"
)

# Strip the leading path to keep only the branch name
branchname="${refname##*/}"

printf 'The git branch name for sha: %s\nis: %s\n' "$sha" "$branchname"
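
One caveat with the ##*/ expansion: it keeps only the text after the last slash, so a branch named feature/login would come out as login. The following is a minimal sketch of the difference, not part of the original answer. The helper names (last_component, strip_ref_prefix) and the sample refs are hypothetical, and it assumes the refs start with refs/heads/ or refs/remotes/<remote>/ where the remote name contains no slash:

```shell
#!/bin/sh
# Hypothetical helper: keep only the text after the last "/".
last_component() {
  printf '%s\n' "${1##*/}"
}

# Hypothetical helper: strip only the refs/heads/ or refs/remotes/<remote>/
# prefix, so branch names that contain "/" stay intact.
strip_ref_prefix() {
  ref=${1#refs/heads/}        # local branches
  ref=${ref#refs/remotes/*/}  # remote-tracking branches (shortest match)
  printf '%s\n' "$ref"
}

# Made-up sample refs, for illustration only.
a=$(last_component 'refs/heads/branch-2.0')
b=$(last_component 'refs/heads/feature/login')
c=$(strip_ref_prefix 'refs/heads/feature/login')
d=$(strip_ref_prefix 'refs/remotes/origin/master')

printf '%s\n' "$a" "$b" "$c" "$d"
```

If none of the branch names contain a slash, as in the question, both helpers give the same result.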

EDIT:

As Philippe mentioned, --format='%(refname:short)' returns the branch name directly, without the path, so no further processing is needed to extract it from the full reference path.

branchname=$(git branch --format='%(refname:short)' --contains="$sha")
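
Since the question asks for a sed approach and for each name to appear only once, here is a hedged sketch that works on the default git branch -a output instead. The function name unique_branch_names and the sample input are hypothetical; the sketch assumes each branch is on its own line, that remote-tracking branches appear as remotes/<remote>/<name> with no slash in the remote name, and that symbolic lines such as remotes/origin/HEAD -> origin/master should be dropped:

```shell
#!/bin/sh
# Hypothetical helper: read `git branch -a` style lines on stdin and
# print each branch name once.
unique_branch_names() {
  sed -e '/ -> /d' \
      -e 's/^[* ] *//' \
      -e 's#^remotes/[^/]*/##' |
    sort -u
}
# sed steps: drop symbolic refs (HEAD -> ...), strip the leading "* " or
# spaces, strip the remotes/<remote>/ prefix; sort -u removes duplicates.

# Made-up sample input that mimics the output shown in the question.
sample='* branch-1.0
  branch-2.0
  remotes/origin/branch-2.0
  master
  remotes/origin/master
  remotes/origin/HEAD -> origin/master'

result=$(printf '%s\n' "$sample" | unique_branch_names)
printf '%s\n' "$result"
```

In a real repository the same function could be fed directly, for example: git branch -a --contains "$sha" | unique_branch_names. If several different branches contain the commit, the output has one line per branch, so the caller still has to decide which one to use.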


Answered By - Léa Gris