Issue
I've been trying to find an efficient way to rename lots of files, by removing a specific component of the filename, in bash shell in linux. Filenames are like:
DATA_X3.A2022086.40e50s.231.2022087023101.csv
I want to remove the 2nd to last element entirely, resulting in:
DATA_X3.A2022086.40e50s.231.csv
I've seen suggestions to use perl-rename, that might handle this (I'm not clear), but this system does not have perl-rename available. (Has GNU bash 4.2, and rename from util-linux 2.23)
Solution
I like extended globbing and parameter parsing for things like this.
$: shopt -s extglob
$: n=DATA_X3.A2022086.40e50s.231.2022087023101.csv
$: echo ${n/.+([0-9]).csv/.csv}
DATA_X3.A2022086.40e50s.231.csv
So ...
for f in *.csv; do mv "$f" "${f/.+([0-9]).csv/.csv}"; done
This assumes all the files in the local directory, and no other CSV files with similar formatting you don't want to rename, etc.
edit
In the more general case where the .csv is not immediately following the component to be removed, is there a way to drop the nth dot-separated component in the filename? (without a more complicated sequence to string-split in bash (always seems cumbersome) and rebuild the filename?
There is usually a way. If you know which field needs to be removed -
$: ( IFS=. read -ra line <<< "$n"; unset line[4]; IFS=".$IFS"; echo "${line[*]}" )
DATA_X3.A2022086.40e50s.231.csv
Breaking that out:
( # open a subshell to localize IFS
IFS=. read -ra line <<< "$n"; # inline set IFS to . to parse to fields
unset line[4]; # unset the desired field from the array
IFS=".$IFS"; # prepend . as the OUTPUT separator
echo "${line[*]}" # reference with * to reinsert
) # closing the subshell restores IFS
I will confess I am not certain why the inline setting of IFS doesn't work on the reassembly. /shrug
This is a simple split/drop-field/reassemble, but I think it may be an X/Y Problem
If what you are doing is dropping the one field that has the date/timestamp info, then as long as the format of that field is consistent and unique, it's probably easier to use a version of the first approach.
Is it possible you meant for DATA_X3.A2022086.40e50s.231.2022087023101.csv
's 5th field to be 20220807023101? i.e., August 7th of 2022 @ 02:31:01 AM? Because if that's what you mean, and it's supposed to be 14 digits instead of 13, and that is the only field that is always supposed to be exactly 14 digits, then you don't need shopt
and can leave the field position floating -
$: n=DATA_X3.A2022086.40e50s.231.20220807023101.csv
$: $: echo ${n/.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]./.}
DATA_X3.A2022086.40e50s.231.csv
Answered By - Paul Hodges Answer Checked By - Dawn Plyler (WPSolving Volunteer)