Issue
I'm trying to run awk in a subprocess and pass it a variable using f-strings:
    with open(file, 'w') as file:
        for file in glob.glob('*-ec2-*_dailyusage.json.csv'):
            command = f'awk "{{print {date} \",\" $0}}" "{file}" >> "{new_file}"'
            subprocess.run(command, shell=True)
I was expecting this to append each column from {file} to {new_file}.
Solution
Here is how to do this while still calling awk from Python, but without the security and correctness problems of the prior approach:
    import glob
    import subprocess

    # `date` and `new_file` are assumed to be defined earlier, as in the question

    # open output file only once, for write
    with open(new_file, 'w') as outfile:
        # loop over possible input files
        for infile_name in glob.glob('*-ec2-*_dailyusage.json.csv'):
            # open an input file
            with open(infile_name, 'r') as infile:
                # run awk with input from infile, appending to outfile
                subprocess.run(['awk',
                                '-v', f'date={date}',
                                '{print date "," $0}'],
                               stdin=infile, stdout=outfile)
(I wouldn't use awk for this use case myself, but this answer may be useful if you have a real-world use case with a more complex awk script that's tested/vetted and might not be trivial to rewrite in pure Python).
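For comparison, here is a minimal sketch of what a pure-Python version of this particular job might look like. The function name `prepend_to_lines` and its parameters are hypothetical, chosen only for illustration, and the files are processed in sorted order (an assumption; `glob.glob` on its own makes no ordering promise).

```python
import glob


def prepend_to_lines(date, pattern, new_file):
    """Copy every line of each file matching pattern into new_file,
    prefixed with date and a comma (like awk's print date "," $0)."""
    # open output file only once, for write
    with open(new_file, 'w') as outfile:
        # sorted() is an assumption made here for reproducible output order
        for infile_name in sorted(glob.glob(pattern)):
            with open(infile_name, 'r') as infile:
                for line in infile:
                    # drop the trailing newline, then add one back, so a
                    # final line without a newline is still terminated
                    text = line.rstrip('\n')
                    outfile.write(f'{date},{text}\n')
```

It could be called as `prepend_to_lines(date, '*-ec2-*_dailyusage.json.csv', new_file)`. No subprocess is involved, so there is no shell or awk parser for a value to be misinterpreted by.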
Note the pertinent changes:
- We aren't using `shell=True`. When using `shell=True`, your command (or, when the command is a list, the first element of that list) is parsed as code, which would allow a date of `$(rm -rf ~)` to delete your home directory; it also generally creates a new set of things to go wrong that you don't have any need for.
- We're substituting values only into awk as variables specified with `-v var=value`, not by injecting those values into code. Just as avoiding `shell=True` prevents values from having unexpected meanings to your shell, avoiding substituting into awk code prevents values from having unexpected meaning to awk itself.
- We're opening stdin and stdout from Python. Because we aren't using `shell=True`, there's no shell to recognize `>>file` and treat it as a directive telling it to open `file` for append; but `stdout=open('file', 'a')` has the same effect. Similarly, using `stdin=open('file', 'r')` ensures that the filename is treated exactly as a filename and can't be unexpectedly something with a different meaning to awk -- preventing still more potentially surprising behaviors.
- Because we're opening the output file only once, instead of re-opening it per copy of awk, we can open it for write, rather than for append. (This has the happy side effect that when you run your script twice, you don't get two copies of the output in a single file).
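To see the first point in action without involving awk at all, here is a small sketch that passes a hostile-looking value to a child process as a list element. The helper name `echo_argument` is hypothetical, and the Python interpreter stands in for awk purely so the example runs anywhere Python does.

```python
import subprocess
import sys


def echo_argument(value):
    """Run a child process (no shell) that prints its first argument."""
    result = subprocess.run(
        # list form, no shell=True: value is handed over as one literal argument
        [sys.executable, '-c', 'import sys; print(sys.argv[1])', value],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.rstrip('\n')


hostile = '$(echo injected)'
print(echo_argument(hostile))  # prints: $(echo injected)
```

Because no shell ever parses the command, the `$(...)` text arrives in the child unchanged instead of being run as a command substitution.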
Answered By - Charles Duffy Answer Checked By - Mildred Charles (WPSolving Admin)