Issue
I have a long string, which contains a filename somewhere in it. I want to return just the filename.
How can I do this in a shell script, e.g. using sed or awk?
The following works in Python, but I need it to work in a shell script.
import re

def find_filename(string, match):
    string_list = string.split()
    match_list = []
    for word in string_list:
        if match in word:
            match_list.append(word)
    # remove the trailing character after the file extension
    fullfilename = match_list[0][:-1]
    # get just the filename without the full directory
    justfilename = fullfilename.split("/")
    return justfilename[-1]

mystr = "the string contains a lot of irrelevant information and then a filename: /home/test/this_filename.txt: and then more irrelevant info"
file_ext = ".txt"
filename = find_filename(mystr, file_ext)
print(filename)
this_filename.txt
EDIT adding shell script requirement
I would call shell script like this:
./test.sh "the string contains a lot of irrelevant information and then a filename: /home/test/this_filename.txt: and then more irrelevant info" ".txt"
test.sh
#!/bin/bash
longstring=$1
fileext=$2
echo "$longstring"
echo "$fileext"
Solution
With bash and a regex:
#!/bin/bash
longstring="$1"
fileext="$2"
regex="[^/]+\\$fileext"
[[ "$longstring" =~ $regex ]] && echo "${BASH_REMATCH[0]}"
Output:
this_filename.txt
Tested only with your example.
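A possible refinement, offered as a sketch rather than a drop-in replacement: [^/]+ also matches spaces, so if the filename has no directory in front of it, the match would include the words before it as well. The variant below additionally excludes whitespace and treats every dot in the extension literally. The function name extract_filename is made up for this illustration, and the sketch assumes the filename itself contains no whitespace.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original answer): print the first
# path component in $1 that ends with the extension given in $2.
extract_filename() {
    local longstring="$1" fileext="$2"
    # Turn each "." of the extension into "[.]" so it matches a literal dot.
    local ext_literal="${fileext//./[.]}"
    # Exclude "/" and whitespace so the match is a single path component.
    local regex="[^/[:space:]]+${ext_literal}"
    # Print the match on success; return non-zero when nothing matches.
    [[ "$longstring" =~ $regex ]] && printf '%s\n' "${BASH_REMATCH[0]}"
}
```

Called from test.sh as extract_filename "$1" "$2", this should print this_filename.txt for the example above, and it returns a non-zero status when no match is found.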
See: The Stack Overflow Regular Expressions FAQ
Answered By - Cyrus