Issue
I'm using the following Bash version in an up-to-date CentOS 7 VM:
GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)
The following code performs as expected (take note of -x -y):
set -- -x -y; OPTIND=1; while getopts xy opt; do echo $opt; OPTIND=$OPTIND; done
x
y
However, when I combine the two short options to -xy, an infinite loop happens:
set -- -xy; OPTIND=1; while getopts xy opt; do echo $opt; OPTIND=$OPTIND; done
x
x
x ... infinite output
The trigger is the OPTIND=$OPTIND assignment. If this is removed, the behavior doesn't happen. It feels like there is some hidden substring indexing that is going on: -xy isn't just index $1 that OPTIND could describe as a single integer - it feels like index ${1:1:1} for the x and ${1:2:1} for the y.
Perhaps these are indicated by other getopts parameters not described in the man page. Can anyone shed any light on this that might help me resolve this one issue when writing my wrapper to allow for partial argument/nested getopts handling?
For the curious: I've implemented some wrapper functions for nested getopts processing. These allow for the partial processing of arguments during which functions might be called that also use the wrapper functions to do getopts processing. I save the OPTIND values in a stack array variable and as one pops out of a nesting, the OPTIND needs to be reset. It all works quite nicely except for the case where one uses concatenated short flag arguments. (The implementation also makes specifying long options possible.)
Solution
"How can this be remedied in Bash?"
Well, by patching the sources. But I wouldn't do that; I believe the current behavior is the right one.
You could add an additional function to getopt.c and expose it via some special variable in variables.c that would allow manipulating getopts' internal state.
Or, even simpler, I see the getopts.def loadable builtin, which you could patch to add an option that serializes/deserializes the getopts state.
You can also provide your own implementation of getopts as a bash function, with your custom semantics and a custom state serializer/deserializer.
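As an illustration of the last option, here is a minimal sketch of a getopts-like function whose entire state lives in two ordinary variables, so it can be saved and restored around a nested parse. Everything here is hypothetical and not part of Bash: the names my_getopts, MY_OPTIND and MY_OPTPOS are made up for the example, and the sketch handles plain flags only (no option arguments, no silent error mode).

```shell
#!/usr/bin/env bash
# Hypothetical getopts replacement. Its whole state is two plain variables:
#   MY_OPTIND  1-based index of the argument being scanned
#   MY_OPTPOS  offset of the next option character inside that argument
MY_OPTIND=1
MY_OPTPOS=1

# Usage: my_getopts OPTSTRING VARNAME ARG...
# Flags only; unknown letters are reported as '?'.
# VARNAME must not collide with this function's own local names.
my_getopts() {
    local _optstring=$1 _varname=$2
    shift 2
    local _args=("$@") _arg _ch

    # Nothing left to scan.
    if (( MY_OPTIND > ${#_args[@]} )); then
        return 1
    fi
    _arg=${_args[MY_OPTIND-1]}

    if (( MY_OPTPOS == 1 )); then
        # Starting a new argument: "--" ends option parsing and is consumed;
        # anything that is not "-" plus at least one character stops the scan.
        if [[ $_arg == -- ]]; then
            MY_OPTIND=$((MY_OPTIND + 1))
            return 1
        fi
        if [[ $_arg != -?* ]]; then
            return 1
        fi
    fi

    _ch=${_arg:MY_OPTPOS:1}
    MY_OPTPOS=$((MY_OPTPOS + 1))

    # Move on to the next argument once the cluster is exhausted.
    if (( MY_OPTPOS >= ${#_arg} )); then
        MY_OPTIND=$((MY_OPTIND + 1))
        MY_OPTPOS=1
    fi

    if [[ $_optstring == *"$_ch"* ]]; then
        printf -v "$_varname" '%s' "$_ch"
    else
        printf -v "$_varname" '%s' '?'
    fi
    return 0
}

inner() {
    local opt
    MY_OPTIND=1 MY_OPTPOS=1
    while my_getopts ab opt "$@"; do
        echo "  inner: $opt"
    done
}

outer() {
    local opt saved_ind saved_pos
    MY_OPTIND=1 MY_OPTPOS=1
    while my_getopts xy opt "$@"; do
        echo "outer: $opt"
        # Save the state, run a nested parse, then restore the state.
        saved_ind=$MY_OPTIND saved_pos=$MY_OPTPOS
        inner -ab
        MY_OPTIND=$saved_ind MY_OPTPOS=$saved_pos
    done
}

outer -xy
```

Because the position inside a clustered argument such as -xy is an explicit variable (MY_OPTPOS) rather than hidden state, restoring both variables resumes the outer parse at y instead of restarting at x.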
"Can anyone shed any light on [...]"
From the POSIX specification of getopts:
If the application sets OPTIND to the value 1, a new set of parameters can be used: either the current positional parameters or new arg values. Any other attempt to invoke getopts multiple times in a single shell execution environment with parameters (positional parameters or arg operands) that are not the same in all invocations, or with an OPTIND value modified to be a value other than 1, produces unspecified results.
From that we know:
- setting OPTIND=1 resets the internal state of getopt()
- it is not specified what should happen if you modify OPTIND in any other way.
The behavior you are seeing is documented: $OPTIND is equal to 1 at that point, so OPTIND=$OPTIND sets OPTIND=1, which resets getopt() and results in an endless loop, as one would expect. Apart from that, the Bash documentation does not specify what should happen when you modify OPTIND. Do not do it. Your expectation that setting OPTIND to a custom value will affect getopts in specific ways is not based on anything. It will not.
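To see why the assignment restarts the scan, it can help to print OPTIND after each getopts call. The following is a minimal sketch; trace_optind is a hypothetical helper name, and the values in the comments are what Bash's getopts implementation is expected to report (other shells may number clustered options differently).

```shell
#!/usr/bin/env bash
# Print each option letter together with the value of OPTIND right after
# the getopts call that returned it.
trace_optind() {
    local opt
    OPTIND=1
    while getopts xy opt "$@"; do
        printf '%s OPTIND=%s\n' "$opt" "$OPTIND"
    done
}

trace_optind -x -y
# Expected under Bash:
#   x OPTIND=2
#   y OPTIND=3

trace_optind -xy
# Expected under Bash:
#   x OPTIND=1
#   y OPTIND=2
```

In the -xy case OPTIND is still 1 after x has been returned, because the argument has not been fully consumed yet. OPTIND=$OPTIND is therefore an assignment of 1, which is exactly the value that resets the scan.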
"[...] resolve this one issue when writing my wrapper to allow for partial argument/nested getopts handling?"
If you are writing your own argument parsing module, do not use getopts and do not depend on undefined, unspecified, or implementation-defined behavior. I suggest doing it the same way GNU getopt does: produce a shell-sourceable string in a separate subprocess, instead of relying on the global OPT* variables and contributing to spaghetti code.
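A minimal sketch of that pattern, assuming the enhanced getopt(1) from util-linux is installed (the BSD/macOS getopt takes different arguments and would not work here). The function name parse_flags and the flag set xy are made up for the example.

```shell
#!/usr/bin/env bash
# Assumption: getopt is the util-linux (enhanced) implementation.
parse_flags() {
    local parsed
    # getopt runs as a separate process and prints a normalized, quoted
    # argument list; clustered flags such as -xy come back as -x -y.
    parsed=$(getopt -o xy -n parse_flags -- "$@") || return 2

    # Replace this function's positional parameters with the normalized list.
    eval set -- "$parsed"

    while [ $# -gt 0 ]; do
        case $1 in
            -x|-y) echo "flag: ${1#-}" ;;
            --)    shift; break ;;
        esac
        shift
    done
    echo "remaining: $*"
}

parse_flags -xy file
```

Since all parsing state lives in the getopt subprocess and in the function's own positional parameters, a function called from inside the loop can parse its own arguments the same way without disturbing the caller.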
Do not nest getopts: it is not re-entrant, it uses global variables, and there is no way to affect its internal state. getopts sets OPTIND; it is not required to read it, except when OPTIND is reset to 1, in which case getopts is reset. Any other value may just be ignored. Just call one getopts after another.
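One way to read "one getopts after another" is to let the outer loop run to completion while only recording what it saw, and to start the second parse afterwards. A minimal sketch, with made-up argument lists and array names:

```shell
#!/usr/bin/env bash
# First pass: finish the outer parse completely, only recording the flags.
outer_flags=()
OPTIND=1
while getopts xy opt -xy; do
    outer_flags+=("$opt")
done

# Second pass: a separate parse over a different argument list, started only
# after the first loop has ended. OPTIND=1 is the reset that POSIX specifies.
inner_flags=()
OPTIND=1
while getopts ab opt -ab; do
    inner_flags+=("$opt")
done

echo "outer: ${outer_flags[*]}"
echo "inner: ${inner_flags[*]}"
```

Any work that depends on an outer flag can then be driven from the recorded array, so no parse is ever interrupted halfway through a clustered argument.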
Answered By - KamilCuk