Hashplings

Author: DevynCJohnson (@devyncjohnson)

    Most users of Unixoid systems (Unix and Unix-like systems) have probably seen or heard of the "shebang", also known as the "hashpling", "interpreter directive", "interpreter designator", "hash-exclaim", etc. These are all names for the single line at the beginning of a script. The hashpling comes in many different forms and has an important purpose.

    The hashpling is the first line of a script, and it begins with "#!". When a script is executed, the program-loader reads the first line and sees the hashpling. The program-loader then starts the interpreter named in the hashpling and passes it the path of the script as an argument. The interpreter then reads and executes the script. In languages where "#" starts a comment, the interpreter treats the hashpling like any other comment. The hashpling only matters to the program-loader and to software that checks magic numbers (discussed later).
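
    The flow described above can be tried with a small sketch. It assumes a POSIX shell at /bin/sh and a temporary directory that permits execution; the file name "hello" is only a placeholder.

```shell
# Assumptions: /bin/sh exists and the temp directory is not mounted noexec.
workdir=$(mktemp -d)
script="$workdir/hello"          # hypothetical file name

# Write a two-line script: the hashpling, then the code.
printf '%s\n' '#!/bin/sh' 'echo "run by: $0"' > "$script"
chmod +x "$script"

# Direct execution: the program-loader reads "#!/bin/sh" and, in effect,
# runs "/bin/sh $script".
direct=$("$script")

# Explicit execution: the shell is started by hand and simply skips the
# hashpling as a comment.
explicit=$(/bin/sh "$script")

echo "$direct"
echo "$explicit"
rm -rf "$workdir"
```

    Both commands should print the same line, because the interpreter receives the script's path as its argument in both cases.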

    NOTE: A script is an executable file written in plain text that begins with a hashpling.

    The hashpling has the format #!/PATH/TO/INTERPRETER [args]. A space between "#!" and the interpreter's path is optional. The program-loader passes any text after the interpreter's path to the interpreter, so the hashpling #!/bin/csh -f is equivalent to running "csh -f SCRIPT" on the command line. However, a hashpling does not always behave like the command line. Many systems, Linux among them, pass everything after the interpreter's path as one single argument rather than splitting it on spaces. For instance, running "python -O SCRIPT" enables Python's basic optimizations (such as removing assert statements), and #!/usr/bin/python -O has the same effect, but #!/usr/bin/env python -O does not on such systems, because "env" receives "python -O" as a single name to look up. It may be possible to make such a hashpling work, but a programmer would need to create a wrapper or some type of multi-line hashpling (also discussed later).
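
    As a rough illustration of how a hashpling line is divided, the sketch below splits the first line of a file into the interpreter path and one argument string. The function name hashpling_parts is hypothetical, and treating the rest of the line as a single argument is an assumption modelled on Linux; other systems may split it differently.

```shell
# Hypothetical helper: print the interpreter path and the argument string
# found in the hashpling of the file named by $1.
hashpling_parts() {
    first_line=
    IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
    case $first_line in
        '#!'*) ;;
        *) return 1 ;;                        # no hashpling on the first line
    esac
    rest=${first_line#'#!'}
    rest=${rest#"${rest%%[![:space:]]*}"}     # drop the optional leading space
    interp=${rest%%[[:space:]]*}              # first word: the interpreter path
    args=${rest#"$interp"}
    args=${args#"${args%%[![:space:]]*}"}     # the remainder stays one string
    printf 'interpreter: %s\n' "$interp"
    printf 'argument: %s\n' "$args"
}

demo=$(mktemp)
printf '%s\n' '#!/usr/bin/env python -O' 'print("hi")' > "$demo"
hashpling_parts "$demo"
rm -f "$demo"
```

    For the sample file, the argument is reported as the single string "python -O", which is why "env" would not find an interpreter by that name.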

    The hashpling specifies an interpreter by an absolute path to the executable or to a symbolic link (shortcut) that points to it. However, different systems may store the same interpreter in different locations. For example, Python may be under /usr/bin/, /usr/local/bin/, or some other location. Therefore, developers need a more portable (cross-platform) way of writing a hashpling, such as #!/usr/bin/env. "env" searches the directories listed in $PATH for the executable named after it. This means a script with #!/usr/bin/env python is passed to "env", which receives "python" as its argument and searches $PATH for it. Once found, "env" runs the Python interpreter and hands it the script. "env" is nearly always in /usr/bin/, so portability issues are minimal. Systems that place "env" somewhere else may create a symbolic link under /usr/bin/ that points to it.
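
    The $PATH search can be approximated in a few lines of shell. This is only a sketch of the idea, not how "env" is implemented; the function name find_in_path is hypothetical.

```shell
# Hypothetical helper: print the first executable file named $1 that is
# found in the directories of $PATH, roughly the lookup done on behalf of
# "#!/usr/bin/env NAME".
find_in_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        [ -n "$dir" ] || dir=.               # an empty entry means the current directory
        if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$1"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Show where a script starting with "#!/usr/bin/env sh" would find its interpreter.
find_in_path sh
```

    Because the first match wins, the order of the directories in $PATH decides which copy of an interpreter a script gets.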

    NOTE: OpenServer and Unicos are two examples of systems that store "env" under /bin/ rather than the usual /usr/bin/. On such systems, the admins should create a soft-link, or the program's installation script should create one.

    Shell scripts are a common exception to the use of #!/usr/bin/env hashplings. Nearly all scripts written for Bash, csh, the POSIX shell (sh), ksh, etc. use #!/bin/*, where "*" is the shell interpreter's filename. If that particular shell is not installed, the system often has a symbolic link of that name pointing to a compatible shell.

    Hashplings can also use relative paths, although this is strongly discouraged. For illustration, open a terminal in the home folder and type ln -s /bin/bash ./test (creates a soft-link/shortcut to Bash). Then, place the hashpling #!./test and the code echo "This works!" in a script named script.sh. Give the script executable permissions (chmod +x ./script.sh) and run it (./script.sh). The expected output appears (This works!). Note that the relative path is resolved against the current working directory at the time of execution, not against the script's location, so the same script fails when started from another directory. For this reason, programmers should avoid relative paths in any program meant to run on various systems.
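
    The same experiment can be collected into one sketch that runs in a temporary directory instead of the home folder. As an assumption for portability, it links to /bin/sh rather than /bin/bash, and it also assumes the root directory has no entry named "test".

```shell
# Assumptions: /bin/sh exists, the temp directory allows execution, and
# there is no /test on the system.
workdir=$(mktemp -d)
ln -s /bin/sh "$workdir/test"                 # soft-link to the shell

printf '%s\n' '#!./test' 'echo "This works!"' > "$workdir/script.sh"
chmod +x "$workdir/script.sh"

# Started from the directory that holds ./test, the hashpling resolves.
from_here=$(cd "$workdir" && ./script.sh)

# Started from another directory, "./test" points nowhere and the exec fails.
from_root=$(cd / && "$workdir/script.sh" 2>/dev/null) || from_root="failed"

echo "from the script's directory: $from_here"
echo "from /: $from_root"
rm -rf "$workdir"
```

    The second run fails even though the script and the link have not moved, which is the practical reason to avoid relative paths.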

    On systems that use magic numbers to identify files, the "#!" of the hashpling is the magic number. It consists of the ASCII codes (in hexadecimal) of the two characters, 23 and 21, giving the byte string "0x23 0x21". When an executable file is run, the program-loader sees "0x23 0x21" and knows that the executable is a script. Whether the script is encoded in ASCII or UTF-8, the magic number (#!) is still "0x23 0x21". However, scripts saved as UTF-8 with a BOM will not execute because the Byte-Order-Mark (BOM) is placed before "#!". The byte string then begins "0xEF 0xBB 0xBF 0x23 0x21", so the first two bytes read are "0xEF 0xBB" rather than the "0x23 0x21" expected of scripts. As a result, the program-loader does not recognize the file as a script.
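
    The byte check can be reproduced with standard tools. The sketch below uses od to read the first two bytes of a file; the function name has_hashpling_magic and the sample file names are hypothetical.

```shell
# Hypothetical helper: succeed only if the file named by $1 starts with
# the bytes 0x23 0x21.
has_hashpling_magic() {
    magic=$(od -An -tx1 -N2 "$1" | tr -d '[:space:]')
    [ "$magic" = "2321" ]
}

samples=$(mktemp -d)
printf '#!/bin/sh\n' > "$samples/plain"
printf '\357\273\277#!/bin/sh\n' > "$samples/bom"    # UTF-8 BOM (EF BB BF) comes first

for f in "$samples/plain" "$samples/bom"; do
    if has_hashpling_magic "$f"; then
        echo "${f##*/}: starts with 23 21"
    else
        echo "${f##*/}: no hashpling magic"
    fi
done
rm -rf "$samples"
```

    A check like this can be handy before shipping scripts that were edited in an editor that may add a BOM.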

    [Figure: hashpling magic number]

    [Figure: script using a BOM]

    NOTE: Many other encodings also disrupt the hashpling, including UTF-7 and the UTF-16 and UTF-32 variants.

    [Figure: script encoded as UTF-16LE]

    A special part of the Linux kernel called "binfmt_misc" permits additional formats of executable files to be recognized. (Hashpling scripts themselves are handled by a separate built-in kernel handler, binfmt_script.) Binfmt_misc allows jar files to be executed directly (if the Java Virtual Machine is installed). When WINE and Mono are installed, binfmt_misc can also pass Windows, DOS, and .NET executables to the proper program for execution. Under /proc/sys/fs/binfmt_misc/, users can view files that describe each format registered with binfmt_misc.

    TIP: Installing "binfmt-support" and "binfmtc" can increase the number of executable formats your system can support.

    Hashplings can span multiple lines, but such hashplings are rare. In a multi-line hashpling, the opening lines may be written in Bash while the main code in the script is in another language. Technically, scripts with multi-line hashplings are polyglots. A polyglot is a file that is valid in two (or more) programming languages at once. The "hashpling code" typically sets up an environment for the primary code or prepares a compiler to compile the primary code. For actual examples of this concept, visit http://rosettacode.org/wiki/Multiline_shebang, which shows examples for various programming languages.
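
    A minimal polyglot of this kind is sketched below. The choice of awk as the second language is an assumption made for illustration, as is the file name "polyglot"; it assumes awk is installed and found through $PATH.

```shell
# Assumptions: /bin/sh and awk are available and the temp directory
# allows execution.
workdir=$(mktemp -d)
poly="$workdir/polyglot"                      # hypothetical file name

# Line 2 is a shell command that replaces the shell with awk, and at the
# same time a harmless string expression when awk parses the same file.
cat > "$poly" <<'EOF'
#!/bin/sh
"exec" "awk" "-f" "$0" "$@"
BEGIN { print "Hello from awk"; exit }
EOF
chmod +x "$poly"

poly_out=$("$poly" < /dev/null)
echo "$poly_out"
rm -rf "$workdir"
```

    The shell never reaches line 3 because exec hands control to awk, while awk ignores line 1 as a comment and exits from its BEGIN block before reading any input.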
