Thursday, November 18, 2021

[SOLVED] Loop over directories one level deep and execute script with directory name as argument in bash in that directory

Issue

I have a batch script that I want to run on all directories at a specific level. It calls a script (recon1.sh) that takes a directory name as an argument and stores result folders in each sub-directory. When pathway_name is given on the command line (bash batch_recon1.sh Pathway_X), I want it to cd into each protein folder (protein_A, protein_B, ... protein_Z) and execute recon1.sh there, for every protein under the pathway folder. Currently it stops after the first protein (protein_A) and never starts protein_B or the others. How can I fix this? A simpler script that loops over sub-folders one level deep and writes the file names in each directory to a text file works perfectly fine, but for some reason this code (batch_recon1.sh calling recon1.sh) doesn't. Can someone help?

folder structure:

[software folder]

${HOME}/ProjName/software/batch_recon1.sh

${HOME}/ProjName/software/recon1.sh

[Project folder]

${HOME}/ProjName/pathways/Pathway_X/protein_A/

${HOME}/ProjName/pathways/Pathway_X/protein_A/protein_A_id
${HOME}/ProjName/pathways/Pathway_X/protein_A/protein_A_searchRes
${HOME}/ProjName/pathways/Pathway_X/protein_A/protein_A_alignment

${HOME}/ProjName/pathways/Pathway_X/protein_B/

${HOME}/ProjName/pathways/Pathway_X/protein_B/protein_B_id
${HOME}/ProjName/pathways/Pathway_X/protein_B/protein_B_searchRes
${HOME}/ProjName/pathways/Pathway_X/protein_B/protein_B_alignment

recon1.sh takes protein_name as its argument (e.g. recon1.sh protein_A),

so ${dirname} should be just the folder name (e.g. "protein_A"), not the full path to the protein folder.

Used from the command line as: bash batch_recon1.sh Pathway_X

code for batch_recon1.sh:

    #!/bin/bash
    # -*- coding: utf-8 -*-
        
    set -e
    current_path=$(pwd)
    
    pathway_name=$1
    path_to_folder=${HOME}/ProjName/pathways/${pathway_name}
    path_to_software_folder=${HOME}/ProjName/software

    cd ${path_to_folder}
    echo '----running batch_reconcile1.sh on pathway:'$@
        
        


    for fol in "${path_to_folder}"/*/; do
      [ -d "${fol}" ] || continue ## if not a directory skip
      dirname="$(basename "${fol}")"
      (cd "${fol}" && bash ${path_to_software_folder}/recon1.sh ${dirname} )
      cd ..
    done

Solution

The script is probably exiting early because recon1.sh is failing, and you have set -e (exit if a command fails).

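The cd .. should not be there either: the cd inside the parentheses only affects the subshell, so the extra cd .. moves the script itself one level up on every iteration.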

It may also be better to keep attempting recon1.sh on all proteins, even if one fails (that's up to you).
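
To see how set -e ends the loop after the first failure, here is a minimal sketch. The function fake_recon is a hypothetical stand-in for recon1.sh that always fails; it is not part of the original scripts. The loop runs in a child bash so that set -e cannot terminate your own shell.

```bash
#!/bin/bash
# Sketch only: fake_recon is a made-up stand-in for recon1.sh and always fails.
status=0
output=$(bash -c '
  set -e
  fake_recon() { return 1; }
  for name in protein_A protein_B; do
    echo "start ${name}"
    ( fake_recon "${name}" )   # subshell fails, so set -e aborts the script here
    echo "done ${name}"
  done
' 2>&1) || status=$?

echo "${output}"
echo "exit status: ${status}"
```

Only "start protein_A" is printed, followed by "exit status: 1". The loop never reaches protein_B, which matches the behaviour described in the question.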

Replace

    (cd "${fol}" && bash ${path_to_software_folder}/recon1.sh ${dirname} )
    cd ..

With:

    cd "${fol}" || continue
    bash "${path_to_software_folder}/recon1.sh" "${dirname}" || echo "recon1.sh failed for ${dirname}" >&2

The working directory gets set correctly. If recon1.sh still fails for any reason, an error is printed, but the script won't exit, and the next protein is attempted.
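
Putting it together, the revised loop might look like the sketch below. To keep it runnable on its own, it builds a throwaway tree under a temporary directory and uses a fake recon1.sh that fails for protein_A. The names demo_root, start_dir, and failed, as well as the fake script's behaviour, are assumptions for illustration only; in the real batch_recon1.sh the two path variables would point at ${HOME}/ProjName as in the question.

```bash
#!/bin/bash
# Sketch only: a temporary tree and a fake recon1.sh stand in for the real project.
start_dir=$(pwd)
demo_root=$(mktemp -d)
path_to_folder="${demo_root}/pathways/Pathway_X"
path_to_software_folder="${demo_root}/software"
mkdir -p "${path_to_folder}/protein_A" "${path_to_folder}/protein_B" \
         "${path_to_software_folder}"

# Fake recon1.sh (hypothetical): fails for protein_A, otherwise writes a
# marker file into whatever directory it is run from.
cat > "${path_to_software_folder}/recon1.sh" <<'EOF'
#!/bin/bash
[ "$1" = "protein_A" ] && exit 1
echo "processed $1" > "$1_result.txt"
EOF

failed=""
for fol in "${path_to_folder}"/*/; do
  [ -d "${fol}" ] || continue            # skip anything that is not a directory
  dirname="$(basename "${fol}")"         # folder name only, e.g. protein_A
  cd "${fol}" || continue                # fol is an absolute path, so no cd .. is needed
  bash "${path_to_software_folder}/recon1.sh" "${dirname}" || {
    echo "recon1.sh failed for ${dirname}" >&2
    failed="${failed} ${dirname}"        # remember failures, keep looping
  }
done

cd "${start_dir}"                        # return to where the script was started
echo "failed:${failed}"
```

Because every command that can fail in the loop is followed by ||, the loop should behave the same with set -e enabled. In this sketch protein_A is reported as failed and protein_B is still processed.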



Answered By - dan