Issue
In a root folder, I have a subfolder structure containing some mp3 files. I want to copy all subfolders and the mp3 files to an SD card that can be read by an mp3 player. Simply copying the entire root folder to the card messes up the mp3 file order. This problem is known and documented on the internet. One solution is to use rsync, which does not work for me; the other is to copy the files one after the other, and with that the order is fine for my player. So I wrote a bash script to walk through my subfolder structure, create the same subfolders on the SD card, and copy the mp3 files into them. Since my subfolder structure is kind of random, I decided to write a recursive function to be called in each subfolder, like this:
ScanKo ()
{
    cd "$1"
    for i in *
    do
        if [[ -d "$i" ]]
        then
            mkdir -p "$2/$i"
            ScanKo "$1/$i" "$2/$i"
        elif [[ "${i##*.}" = "mp3" ]]
        then
            cp "$1/$i" "$2/$i"
        fi
    done
}
ScanKo "/home/monkey/source_root/" "/media/stick/destiny_root/"
The function ScanKo gets two parameters: the source and destination root folders. I cd into the source folder and use a for loop to scan through everything. If I hit a subfolder, I create it at the destination (the SD card) and call the function again with the two subfolders as the parameters. If I hit an mp3 file, I copy it to the destination. It only works in theory. It seems that bash loses its context as soon as a child call returns to the parent call that invoked it.
I resolved this by implementing the same approach with a recursive process. The corresponding bash script, named "sdMachen", looks like this:
cd "$1"
for i in *
do
    if [[ -d "$i" ]]
    then
        mkdir -p "$2/$i"
        bash "$3/${0##*/}" "$1/$i" "$2/$i" "$3"
    elif [[ "${i##*.}" = "mp3" ]]
    then
        cp "$1/$i" "$2/$i"
    fi
done
I call the script on the command line with three parameters:
bash sdMachen "/home/monkey/source_root/" "/media/stick/destiny_root/" "$(pwd)"
I need the third parameter only so that I can get back to the location where sdMachen is stored. The part
"$3/${0##*/}"
is needed just so the right bash script is found each time. Basically it is equivalent to
/folder-where-script-is/sdMachen
It works with the recursive process - but why does it not work with the recursive bash function?
Solution
There are many pitfalls associated with DIY recursive filesystem traversal in shell code. Apart from the cd issue that you encountered, other common problems include the fact that * doesn't expand to all of the entries in a directory by default (names that begin with a dot are skipped), and symbolic links can cause infinite loops in code that doesn't handle them carefully. Fortunately there are established mechanisms for doing recursive traversals safely. The globstar mechanism in Bash is one of them. It was introduced in Bash 4.0 but was vulnerable to crashes caused by circular symlinks until Bash 4.3.
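To make the cd issue concrete: ScanKo changes the working directory of the one shell that runs every level of the recursion, and never changes it back. When a recursive call returns, the parent's loop carries on with its remaining names, but [[ -d "$i" ]] now tests them relative to the child directory, so later sibling folders are not recognized. A separate bash process per level avoids this because each process has its own working directory. A minimal sketch of the same idea inside a single script, assuming both arguments are absolute paths as in the original call, is to write the function body in parentheses so that each call runs in a subshell:

```shell
#!/bin/bash
# Sketch only: the asker's ScanKo with the body in ( ... ) instead of { ... }.
# Each call then runs in a subshell, so its cd cannot affect the caller.
# Assumption: $1 and $2 are absolute paths, as in the original invocation.
ScanKo ()
(
    cd "$1" || exit            # leaves only this subshell if cd fails
    for i in *
    do
        if [[ -d "$i" ]]
        then
            mkdir -p "$2/$i"
            ScanKo "$1/$i" "$2/$i"   # the child's cd is undone on return
        elif [[ "${i##*.}" = "mp3" ]]
        then
            cp "$1/$i" "$2/$i"
        fi
    done
)
```

Called as ScanKo "/home/monkey/source_root/" "/media/stick/destiny_root/", this visits every subfolder. It still shares the other limitations mentioned above (hidden entries are skipped by *, and symlinked directories are followed), so treat it as an illustration of the cd fix rather than a replacement for find.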
The traditional way to do recursive traversal in shell code is to use find. This Shellcheck-clean code demonstrates one way to use find to solve your problem:
#! /bin/bash -p

srcroot=$1
destroot=$2

find "${srcroot/#-/.\/-}" -type f -name '*.mp3' -printf '%P\0' \
    | while IFS= read -r -d '' mp3path; do
        srcpath=$srcroot/$mp3path
        destpath=$destroot/$mp3path
        destdir=${destpath%/*}
        [[ -d $destdir ]] || mkdir -p -v -- "$destdir"
        cp -v -- "$srcpath" "$destpath"
    done
- I've presented the code as a standalone shell program, but it's easy to incorporate it into a function if you want to do that. Localize all of the variables that it uses.
- "${srcroot/#-/.\/-}" is normally the same as "$srcroot", but it expands strings like -dir1/dir2 to ./-dir1/dir2 to prevent strings that begin with - being treated as options, instead of directories, by find. See Substituting part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for an explanation of the syntax.
- See BashFAQ/001 (How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?) for an explanation of find ... | while IFS= read -r -d '' .... It ensures that the code can handle arbitrary file paths, including (unusual) ones that contain newline characters.
- See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for an explanation of ${destpath%/*}.
- [[ -d $destdir ]] || could be removed without breaking the code, but it might run significantly more slowly due to running many unnecessary mkdir processes.
- See Bash Pitfalls #2 (cp $file $target) for an explanation of the -- arguments in the mkdir and cp commands.
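The first note above suggests wrapping the code in a function with localized variables. A sketch of what that could look like follows. The function name copy_mp3_tree is hypothetical, and like the standalone program it assumes GNU find (for -printf):

```shell
#!/bin/bash
# Sketch only: the find loop from the answer wrapped in a function.
# copy_mp3_tree is a hypothetical name, not part of the original answer.
copy_mp3_tree ()
{
    # Declare every variable local so nothing leaks into the caller.
    local srcroot=$1 destroot=$2
    local mp3path srcpath destpath destdir

    # %P prints each path relative to srcroot; \0 separates the entries.
    find "${srcroot/#-/.\/-}" -type f -name '*.mp3' -printf '%P\0' \
        | while IFS= read -r -d '' mp3path; do
            srcpath=$srcroot/$mp3path
            destpath=$destroot/$mp3path
            destdir=${destpath%/*}
            [[ -d $destdir ]] || mkdir -p -v -- "$destdir"
            cp -v -- "$srcpath" "$destpath"
        done
}
```

It would be called as copy_mp3_tree "/home/monkey/source_root" "/media/stick/destiny_root". If the player depends on the order in which files are copied, note that find lists entries in whatever order the filesystem returns them. Piping its output through sort -z (GNU sort) before the while loop is one possible way to get a predictable order, though that is an addition that is not part of the code above.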
Answered By - pjh Answer Checked By - David Goodson (WPSolving Volunteer)