Embedding Other Languages in BASH


Viewing 1 post (of 1 total)
  • Author
    Posts

  • DevynCJohnson
    Keymaster
    • Topics - 437
    • @devyncjohnson

    BASH is a commonly used language for shell scripts. A script is a short list of instructions for the computer to follow, and a scripting language is the programming language those instructions are written in. Shell scripts are scripts written in a shell language, which is the language understood by a command-line shell running in a terminal. A BASH script (a shell script written in BASH) is not limited to BASH alone: interpreters for other scripting languages can be called from within the script, which lets the programmer embed code written in those languages. This article explains how to embed several languages in one script.

    NOTE: this article may be easier to understand if you are familiar with BASH and with writing scripts.

    To embed a language into the script, the programmer writes the script in BASH up to the point where another language is needed. Assume the script requires saving the number pi (3.14159265...) to a variable. The programmer knows Lua well, but does not know how to write equivalent code in BASH; in other cases, a programmer may switch languages for efficiency or performance reasons. As a result, the programmer types the following:

    export PI=$(lua5.2 -e "print(string.format(\"%f\", math.pi))")

    The BASH part of the code saves the Lua script's output in the variable "PI", and "export" makes that variable available to child processes as well. Here is the trick to embedding other scripting languages: the script runs the Lua interpreter, "lua5.2". The parameter "-e" means "execute the following code". The Lua code inside the quotes is the embedded code in the shell script. In general, to embed any scripting language, execute its interpreter, whose command in most cases is the name of the language. For instance, the command for the Python scripting language is "python". To use a specific version of the language, the programmer can type a versioned command such as python3.3 for Python version 3.3.
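
    The capture-and-export pattern described above can be sketched as follows. This is only an illustration: python3 stands in for lua5.2 on the assumption that it is more likely to be installed, and CHILD_VIEW is a hypothetical variable name used to show that an exported variable is visible to child processes.

```shell
# Capture the embedded interpreter's output in a BASH variable.
# python3 is an assumption here; the article's own example uses lua5.2.
PI=$(python3 -c "import math; print('%f' % math.pi)")

# Exporting makes the variable visible to child processes.
export PI

# A child shell can read the exported value. The single quotes keep the
# outer shell from expanding $PI, so the child performs the expansion.
CHILD_VIEW=$(sh -c 'echo "$PI"')

echo "parent sees: $PI"
echo "child sees:  $CHILD_VIEW"
```

    Without the export line, the child shell would see an empty value, because plain shell variables are not passed on to child processes.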

    NOTE: an interpreter is a program that reads code and performs the commanded tasks.

    NOTE: only embed code if it is absolutely necessary. The main purpose for embedding code is if the primary programming language cannot perform the task as efficiently as the secondary (embedded) language. Every language has its strengths and weaknesses. For instance, Perl works well with text manipulation and Scilab works well with numbers and math.

    Below are the formats for various scripting languages and shells. Place the embedded code between the quotes.

    • python -c ""
    • python3 -c ""
    • jython -c ""
    • cython -c ""
    • ruby -e ""
    • jruby -e ""
    • rubyjs -e ""
    • perl -e ""
    • csh -c ""
    • tcsh -c ""
    • mksh -c ""
    • ksh -c ""
    • zsh -c ""
    • ash -c ""
    • dash -c ""
    • fish -c ""
    • coffee -e ""
    • lua5.2 -e ""
    • scilab -e ""

    The setup is the same for all these languages and interpreters. The parameter "-c" stands for "command" and serves the same purpose as the "-e" used by other interpreters. The programmer may need to look up in the interpreter's documentation whether it uses "-c" or "-e", or simply try the code each way to see which works. Some tools in the list may not follow this pattern exactly (Cython, for example, is primarily a compiler rather than an interpreter), so check the documentation when in doubt. Also note that some interpreters are installed under a command name that includes a version. This is seen with Lua (the example used in this article is lua5.2). On many systems the command "python" has traditionally run Python 2, so programmers must type "python3" to run Python 3 code. The two versions of Python differ in incompatible ways.
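
    Because the command name for an interpreter varies between systems, a script can check which name is available before using it. The following is a minimal sketch, assuming that only the names python3 and python are of interest; PYTHON_BIN and MAJOR are hypothetical variable names.

```shell
# Find the first Python command that exists on this system.
PYTHON_BIN=""
for candidate in python3 python; do
    if command -v "$candidate" >/dev/null 2>&1; then
        PYTHON_BIN=$candidate
        break
    fi
done

if [ -n "$PYTHON_BIN" ]; then
    # Ask the interpreter itself which major version it runs.
    # This one-liner is valid in both Python 2 and Python 3.
    MAJOR=$("$PYTHON_BIN" -c "import sys; print(sys.version_info[0])")
    echo "$PYTHON_BIN runs Python $MAJOR"
else
    echo "no Python interpreter found" >&2
fi
```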

    One disadvantage of embedding other languages is that the programmer must escape characters that BASH would otherwise interpret itself before the embedded code ever sees them. Here is an example of what should NOT be done:

    perl -e "print "Hello world.\n""

    BASH treats the second quote as the closing quote, so the string that was meant to be a single piece of Perl code is split apart: Perl receives only "print Hello" as its code, and the rest (world.\n"") is handled by BASH as a separate argument. Backslashes "\" fix this problem. A backslash tells BASH to treat the following character literally instead of giving it a special meaning, so the escaped character is passed on for the embedded code to handle. Once fixed, the code looks like this:

    perl -e "print \"Hello world.\n\""

    When embedding in BASH scripts, the special characters to escape inside double quotes include double quotes, dollar signs, backticks, and, in some cases, literal backslashes.
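
    As a point of comparison, the sketch below shows the escaped double-quote form next to a single-quoted form, which needs no escapes because BASH passes single-quoted text through unchanged. It uses python3 rather than Perl purely as an assumption about what is installed; A and B are hypothetical variable names.

```shell
# Double quotes: BASH processes the string first, so the inner
# double quotes must be escaped with backslashes.
A=$(python3 -c "print(\"Hello world.\")")

# Single quotes: BASH passes the text through untouched, so the
# embedded code can use double quotes freely.
B=$(python3 -c 'print("Hello world.")')

echo "$A"
echo "$B"
```

    The single-quoted form is easier to read, but it also prevents BASH from substituting its own variables into the embedded code, so the choice depends on whether that substitution is wanted.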

    The embedded code can send and receive output and input in many forms. The embedded code can send output through piping, writing, saving, and printing. Also, the code can receive input through BASH variables, files, user input, and pipes.

    To send output via pipe, a pattern like this will work: INTERPRETER -c "EMBEDDED" | BASH COMMAND OR EMBEDDED CODE. For illustration, this pattern would look like this:

    ksh -c "df" | cat > ./save_to_file

    In that example, ksh runs "df" to report free disk space, and the output is piped to "cat", whose output BASH redirects into a file.

    To write the output straight to a file without piping to a BASH command, use code of this form:

    dash -c "whereis firefox" > ./firefox_path

    This code uses dash to run "whereis", which locates the binary, source, and manual page files for an application called Firefox, and writes the output to a file. This pattern is more efficient than piping to another command because no extra process is started. As the saying goes, "cut out the middle man". This saying is an excellent tip in computer programming.
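
    The two ways of getting output into a file can be compared side by side. This sketch assumes that sh and mktemp are available and writes into a temporary directory so that no existing files are touched; DEMO_DIR, VIA_PIPE, and VIA_REDIRECT are hypothetical names.

```shell
# Work in a temporary directory so nothing in the current one is touched.
DEMO_DIR=$(mktemp -d)

# Pipe the embedded shell's output through cat, then redirect to a file.
sh -c "echo piped" | cat > "$DEMO_DIR/via_pipe"

# Redirect the embedded shell's output straight to a file
# (one process fewer than the piped form).
sh -c "echo direct" > "$DEMO_DIR/via_redirect"

# Read both files back to confirm that each form wrote its output.
VIA_PIPE=$(cat "$DEMO_DIR/via_pipe")
VIA_REDIRECT=$(cat "$DEMO_DIR/via_redirect")
echo "$VIA_PIPE / $VIA_REDIRECT"

# Clean up the temporary directory.
rm -r "$DEMO_DIR"
```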

    To save the output to a BASH variable, follow this pattern: VARIABLE_NAME=$(INTERPRETER -c "EMBEDDED CODE"). BASH will capture the output of the embedded code and save it to the variable. For example, DATA=$(python3 -c "print(7+3)") will use Python3 to print the value of seven plus three and save it to a variable in BASH.
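
    Once captured, the value behaves like any other BASH variable. The sketch below, which assumes python3 is installed, also checks the exit status of the embedded code, since an assignment from a command substitution reports the status of that command. DOUBLED is a hypothetical variable name.

```shell
# The "if" tests the exit status of the embedded interpreter,
# so a failure in the Python code is noticed by the BASH script.
if DATA=$(python3 -c "print(7+3)"); then
    # The captured text can be used in BASH arithmetic.
    DOUBLED=$((DATA * 2))
    echo "DATA=$DATA DOUBLED=$DOUBLED"
else
    echo "embedded code failed" >&2
fi
```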

    To have the output of the embedded code print to the screen without going through any other commands or variables, try this: INTERPRETER -c "EMBEDDED CODE". For instance, jruby -e "puts \"Hello world\"" prints the output to the screen.

    To input data to embedded code with a BASH variable, use this template: INTERPRETER -c "COMMANDS $BASH_VARIABLE MORE_COMMANDS". Actual code would look like this:

    export VAR="/dev"; ash -c "ls $VAR"

    In BASH code, the variable "VAR" is set to /dev. Then, the ash code executes "ls" with the variable. This works because BASH replaces variables inside double quotes with their values before running the command. So, when BASH sees the dollar sign, it knows that it marks a variable, but it does not know the variable is mixed with ash code. However, if the user escapes the dollar sign with a backslash, then BASH leaves it alone and the ash interpreter treats the variable as its own. This would allow a programmer to type this code:

    export VAR="/dev"; ash -c "VAR=\"/etc\"; ls \$VAR"

    The result is a list of files in /etc rather than /dev.
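
    The difference between the two forms can be made visible with echo instead of ls. This is a minimal sketch that uses sh in place of ash, on the assumption that ash may not be installed; the variable names are hypothetical.

```shell
VAR="outer"

# Unescaped: BASH substitutes its own value of VAR before sh starts,
# so sh actually runs: VAR=inner; echo outer
EXPANDED_BY_BASH=$(sh -c "VAR=inner; echo $VAR")

# Escaped: BASH passes a literal $VAR through, so sh expands its own
# variable and runs: VAR=inner; echo $VAR
EXPANDED_BY_CHILD=$(sh -c "VAR=inner; echo \$VAR")

echo "unescaped: $EXPANDED_BY_BASH"
echo "escaped:   $EXPANDED_BY_CHILD"
```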

    To use information from a file, write the embedded code and use the command that the language normally uses to open files. Here is a Python3 example:

    python3.3 -c "import sys; sys.stdout.write(open('./my/file', 'r').read())" | sed -e "s|FIND|REPLACE|gI" | less

    This code opens a file using Python3 code, writes its contents to standard output, and pipes them to sed. sed then changes the data according to the instructions given in sed code (s|FIND|REPLACE|gI; the "I" flag, a GNU sed extension, makes the match case-insensitive). After every FIND is replaced with REPLACE, the output is piped to the less command for the user to read.
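
    A self-contained variant of this idea is sketched below. It assumes python3 and mktemp are available, creates its own sample file, and passes the file name to the embedded code as an argument instead of hard-coding a path. The embedded code writes the file contents to standard output explicitly, since data must be printed before it can travel through a pipe. DEMO_FILE and RESULT are hypothetical names, and the portable "g" flag is used in place of "gI".

```shell
# Create a small sample file to read.
DEMO_FILE=$(mktemp)
printf 'FIND me\nfind me too\n' > "$DEMO_FILE"

# The embedded Python reads the file named in its first argument
# (sys.argv[1]) and writes the contents to standard output, which
# the pipe then hands to sed.
RESULT=$(python3 -c "import sys; sys.stdout.write(open(sys.argv[1]).read())" "$DEMO_FILE" \
    | sed -e "s|FIND|REPLACE|g")

echo "$RESULT"

# Remove the sample file.
rm "$DEMO_FILE"
```

    Only the uppercase FIND is replaced here because the case-insensitive flag was left out.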

    To send the user's typed input directly to the embedded code, use that language's command for reading typed input. In BASH, mksh, dash, and some other shell languages, that command is "read". So, a programmer could type:

    mksh -c "read INPUT; echo \"You typed '\$INPUT'\""

    This allows the code to accept input from users. The line waits for the user to type something and press enter. If the user types "I love Linux." and presses enter, the line prints "You typed 'I love Linux.'".
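
    The same line can be tried without typing anything by feeding the text through a pipe, since "read" takes a line from standard input wherever it comes from. This sketch uses sh in place of mksh as an assumption about what is installed, and REPLY_LINE is a hypothetical variable name.

```shell
# printf supplies the line that a user would otherwise type.
# "read -r" stores it in INPUT inside the embedded shell; the escaped
# dollar sign makes sure the embedded shell, not BASH, expands INPUT.
REPLY_LINE=$(printf 'I love Linux.\n' | sh -c "read -r INPUT; echo \"You typed '\$INPUT'\"")

echo "$REPLY_LINE"
```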

    Programmers can design their code so that BASH pipes data to the embedded code. The embedded code must use whatever its language provides for reading piped data. With Python3, a programmer would type:

    cat ./fix/my/file | python3.3 -c "import re, sys; DATA = re.sub('FIND', 'REPLACE', sys.stdin.read()); print(DATA)" > ./new/file

    This line uses BASH to open a file and pipe its contents to Python. Python then finds the specified text and replaces it with the text the programmer typed. The output is then written to a new file via BASH code. The essential call that enables the piping is "sys.stdin.read()". It returns the piped data, so the call itself can be used like a variable. All languages that handle pipes have some equivalent of this command.
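
    To illustrate that reading piped data looks different in each language, the sketch below pipes the same text into embedded Python and into an embedded shell. It assumes python3 and sh are installed; FROM_PYTHON and FROM_SHELL are hypothetical variable names.

```shell
# Python reads everything that was piped in with sys.stdin.read().
FROM_PYTHON=$(printf 'FIND this\n' \
    | python3 -c "import re, sys; sys.stdout.write(re.sub('FIND', 'REPLACE', sys.stdin.read()))")

# A shell reads piped data one line at a time with "read".
# Single quotes keep BASH from expanding $line before sh runs.
FROM_SHELL=$(printf 'FIND this\n' \
    | sh -c 'while read -r line; do echo "got: $line"; done')

echo "$FROM_PYTHON"
echo "$FROM_SHELL"
```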

    Many readers may be wondering, "Can I embed code in programming languages other than BASH?" The answer is yes. Many programming languages allow code written in another language to be used within their own code. However, embedding and mixing languages has advantages and disadvantages.

    Advantages

    • More possibilities
    • Better performance (when used correctly)
    • Helps in writing code when the programmer can only think of the code for the needed algorithm in a different language

    Disadvantages

    • Slower code (when used incorrectly)
    • Requires the programmers and editors to understand more than one programming language
    • More required interpreters (a BASH-Lua hybrid script would require users to have Lua installed on the system)
    • The source code appears fragmented, that is, programmers see many languages mixed together. This can make it difficult to find specific code.
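
    The extra-interpreter disadvantage can be softened by checking for the interpreter before using it. The following is a minimal sketch under the assumption that a hard-coded fallback value is acceptable when Lua is missing; a real script might instead stop with an error message.

```shell
# Use the embedded Lua code only if the interpreter is installed.
if command -v lua5.2 >/dev/null 2>&1; then
    PI=$(lua5.2 -e "print(string.format('%f', math.pi))")
else
    # Fallback chosen for this illustration only.
    PI="3.141593"
fi

echo "PI=$PI"
```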

    If hybrid coding is used correctly, a high-performance program can be made with ease. In general, keep all code as simple as possible.
