Issue
Tying a script to a specific interpreter via a so-called shebang line is a well-known practice on POSIX operating systems. For example, if the following script is executed (given sufficient file-system permissions), the operating system will launch the /bin/sh interpreter with the file name of the script as its first argument. Subsequently, the shell will execute the commands in the script, skipping over the shebang line, which it treats as a comment.
#! /bin/sh
date -R
echo hello world
Possible output:
Sat, 01 Apr 2017 12:34:56 +0100
hello world
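The launch described above can be modeled in a few lines of Python. This is only a sketch of the argument vector the interpreter ends up with, not the kernel's actual code: the function name `shebang_argv` and the file names `./hello.sh` and `./tool` are made up for illustration, and the rule that everything after the interpreter name is passed as one single argument is the Linux behavior documented in execve(2); other systems may split it differently.

```python
def shebang_argv(first_line, script_path, args=()):
    """Model the argv an interpreter receives for a script (Linux semantics)."""
    if not first_line.startswith("#!"):
        raise ValueError("not an interpreter script")
    # Text after "#!" with surrounding whitespace (and the newline) removed.
    rest = first_line[2:].strip()
    if not rest:
        raise ValueError("shebang line names no interpreter")
    # Split off the interpreter path; whatever follows stays in one piece,
    # because Linux hands it to the interpreter as a single argument.
    parts = rest.split(None, 1)
    argv = [parts[0]]
    if len(parts) == 2:
        argv.append(parts[1])
    # The script's own path comes next, followed by the caller's arguments.
    argv.append(script_path)
    argv.extend(args)
    return argv


print(shebang_argv("#! /bin/sh\n", "./hello.sh"))
print(shebang_argv("#!/usr/bin/env python3 -u\n", "./tool", ["a", "b"]))
```

For the script above, the model yields `/bin/sh` followed by the script's path, which is why the shell sees the file name as its first argument.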
I used to believe that the interpreter (/bin/sh in this example) must be a native executable and cannot be a script itself that, in turn, would require yet another interpreter to be launched.
However, I went ahead and tried the following experiment nonetheless.
Using the following dumb shell saved as /tmp/interpreter.py, …
#! /usr/bin/python3
import sys
import subprocess
for script in sys.argv[1:]:
    with open(script) as istr:
        status = any(
            map(
                subprocess.call,
                map(
                    str.split,
                    filter(
                        lambda s : s and not s.startswith('#'),
                        map(str.strip, istr)
                    )
                )
            )
        )
    if status:
        sys.exit(status)
… and the following script saved as /tmp/script.xyz,
#! /tmp/interpreter.py
date -R
echo hello world
… I was able (after making both files executable) to execute script.xyz.
5gon12eder:/tmp> ls -l
total 8
-rwxr-x--- 1 5gon12eder 5gon12eder 493 Jun 19 01:01 interpreter.py
-rwxr-x--- 1 5gon12eder 5gon12eder  70 Jun 19 01:02 script.xyz
5gon12eder:/tmp> ./script.xyz
Mon, 19 Jun 2017 01:07:19 +0200
hello world
This surprised me. I was even able to launch script.xyz via another script.
So, what I am asking is this:
- Is the behavior observed by my experiment portable?
- Was the experiment even conducted correctly or are there situations where this doesn't work? How about different (Unix-like) operating systems?
- If this is supposed to work, is it true that there is no observable difference between a native executable and an interpreted script as far as invocation is concerned?
Solution
See boldfaced text below:
This mechanism allows scripts to be used in virtually any context normal compiled programs can be, including as full system programs, and even as interpreters of other scripts. As a caveat, though, some early versions of kernel support limited the length of the interpreter directive to roughly 32 characters (just 16 in its first implementation), would fail to split the interpreter name from any parameters in the directive, or had other quirks. Additionally, some modern systems allow the entire mechanism to be constrained or disabled for security purposes (for example, set-user-id support has been disabled for scripts on many systems). -- WP
And this output from
COLUMNS=75 man execve | grep -nA 23 " Interpreter scripts" | head -39
on an Ubuntu 17.04 box, particularly lines #186-#189, which tell us what works on Linux (i.e. scripts can be interpreters, up to four levels deep):
166:   Interpreter scripts
167-       An interpreter script is a text file that has execute permission
168-       enabled and whose first line is of the form:
169-
170-           #! interpreter [optional-arg]
171-
172-       The interpreter must be a valid pathname for an executable file.
173-       If the filename argument of execve() specifies an interpreter
174-       script, then interpreter will be invoked with the following argu‐
175-       ments:
176-
177-           interpreter [optional-arg] filename arg...
178-
179-       where arg... is the series of words pointed to by the argv argu‐
180-       ment of execve(), starting at argv[1].
181-
182-       For portable use, optional-arg should either be absent, or be
183-       specified as a single word (i.e., it should not contain white
184-       space); see NOTES below.
185-
186-       Since Linux 2.6.28, the kernel permits the interpreter of a script
187-       to itself be a script. This permission is recursive, up to a
188-       limit of four recursions, so that the interpreter may be a script
189-       which is interpreted by a script, and so on.
--
343:   Interpreter scripts
344-       A maximum line length of 127 characters is allowed for the first
345-       line in an interpreter scripts.
346-
347-       The semantics of the optional-arg argument of an interpreter
348-       script vary across implementations. On Linux, the entire string
349-       following the interpreter name is passed as a single argument to
350-       the interpreter, and this string can include white space. How‐
351-       ever, behavior differs on some other systems. Some systems use
352-       the first white space to terminate optional-arg. On some systems,
353-       an interpreter script can have multiple arguments, and white spa‐
354-       ces in optional-arg are used to delimit the arguments.
355-
356-       Linux ignores the set-user-ID and set-group-ID bits on scripts.
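To see how deep a given chain of interpreter scripts goes without executing anything, one can follow the shebang lines by hand. The sketch below does that in Python. The helper names `read_interpreter`, `interpreter_chain` and `demo` are made up for illustration, the files in `demo` are throwaway stand-ins for the ones in the question, and the `max_hops=5` cap is only loosely modeled on the four-recursion limit quoted above; it is an assumption, not the kernel's exact accounting.

```python
import os
import tempfile


def read_interpreter(path):
    """Return the interpreter named on the first line, or None for a non-script."""
    with open(path, "rb") as f:
        first = f.readline(256)
    if not first.startswith(b"#!"):
        return None
    fields = first[2:].split()
    return os.fsdecode(fields[0]) if fields else None


def interpreter_chain(path, max_hops=5):
    """Follow shebang lines from `path` until a file that is not a script."""
    chain = [path]
    for _ in range(max_hops):
        nxt = read_interpreter(chain[-1])
        if nxt is None:
            return chain
        chain.append(nxt)
    # Still looking at a script after max_hops steps: give up, roughly as
    # the kernel would (assumed cap, see the lead-in).
    if read_interpreter(chain[-1]) is not None:
        raise RuntimeError("too many levels of interpreter scripts")
    return chain


def demo():
    """Rebuild the question's layout with stand-in files and print the chain."""
    with tempfile.TemporaryDirectory() as tmp:
        native = os.path.join(tmp, "fake-native")
        interp = os.path.join(tmp, "interpreter.py")
        script = os.path.join(tmp, "script.xyz")
        with open(native, "wb") as f:
            f.write(b"\x7fELF")  # placeholder bytes, not a real executable
        with open(interp, "w") as f:
            f.write("#! " + native + "\nimport sys\n")
        with open(script, "w") as f:
            f.write("#! " + interp + "\ndate -R\necho hello world\n")
        names = [os.path.basename(p) for p in interpreter_chain(script)]
    print(names)
    return names


if __name__ == "__main__":
    demo()
```

Running `interpreter_chain` on a real script lists every file the kernel would have to open before reaching a native executable, which gives a quick check on whether a setup stays within the documented limit.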
Answered By - agc