GNU-Toolchain


    Author: DevynCJohnson

    The GNU Toolchain is a popular set of programming tools commonly used on Linux systems. The toolchain contains GNU Make, the GNU Compiler Collection (GCC), GNU Binutils, GNU Bison, GNU m4, the GNU Debugger (GDB), and the GNU build system. Each of these tools helps programmers build their code into a program or library. Programmers who want to use Linux as their development environment may wish to become familiar with these tools.

    GNU Make is commonly known as the "make" command. "Make" reads special files called "makefiles", which describe how the code should be built. A makefile typically specifies where the source code is located, which compiler options to use, where to install the result (if desired), the program version, and so on. Users can type "make" on a command line, and "make" will search the current directory for a file named "GNUmakefile", "makefile", or "Makefile". Alternatively, users can name a specific makefile with the "-f" option, as in "make -f MyMakefile".

    The GNU Compiler Collection (GCC) is the compiler suite used by most Linux systems. "gcc" is the C compiler, and "g++" is the C++ compiler. Other compilers in this set include gccgo (Go), gfortran (Fortran), and GNAT (Ada). Older releases also included gcj (Java), which was removed in GCC 7.

    The GNU Binutils are a set of tools for creating and managing binary files such as libraries, object files (*.o), and executables. There are many formats for object and executable files, so the GNU Binutils use libbfd (the Binary File Descriptor library) to support them; the GNU Debugger uses this library as well. GCC invokes the assembler and linker from this package when it builds a program. The GNU Assembler ("as") and the GNU Linker ("ld") are part of this package. The assembler turns assembly source code into object files, while the linker creates executables by "linking" object files into one file. Other tools in the package include "ar", "nm", "objdump", "readelf", and "strip".

    GNU Bison is a parser generator. This means that it reads the specification of a programming language's grammar (written in terms of tokens) and creates a parser. Tokens are named categories for pieces of source text, such as numbers, variable names, or operators. A lexer generator (usually flex) reads the lex file (Lexer.l) and converts it to C source code, which can be compiled to yield a lexer. The lexer reads the source code of a program and produces tokenized code. The parser generator (in this case Bison) reads a specification of the programming language's syntax and makes a parser for that language. The parser reads the tokenized code (from the lexer) to see if the syntax is correct.

    NOTE: "flex" is not part of the GNU project.

    For illustration, flex reads a lex file that indicates what characters belong to which token. Such a file may have a line like this (PLUS "+"). This means the "+" character is represented by the "PLUS" token. Flex makes a lexer that will follow the given rules. Bison reads the specification in a parser file (Parser.y) and produces source code (perhaps "parser.c") that can be compiled. The parser is then created from the source code.

    NOTE: Various lexer file formats exist, so be sure you know which one your preferred lexer reads.

    Once the lexer and parser are made, the lexer will generate tokenized code by reading a project's source code. The parser receives the tokenized code and produces an abstract syntax tree (AST) based on the source code. With this, the parser can ensure the code is using correct syntax. If not, then an error is generated and the programmer can find and fix the error. Below is an example. Keep in mind that the examples are greatly simplified for better understanding. With real usage, there is a lot of code and programming inside the lexer and parser files.

    My source code

    sum = 7 + 3;

    The lexer makes tokens as seen below

    "sum" -> VAR
    [ \r\n\t]+ -> WS
    "+" -> PLUS
    "=" -> EQ
    ";" -> SCOLON
    [0-9]+ -> NUM

    The parser gets the tokenized code and ensures the right structure is used by doing something like this.

    Input -> VAR,WS,EQ,WS,NUM,WS,PLUS,WS,NUM,SCOLON
    Valid syntax -> VAR,WS,EQ,WS,NUM,WS,PLUS,WS,NUM,SCOLON
    Status -> VALID

    If the semicolon or a whitespace were missing, then an error would occur.

    In summary

    • Lex file maps characters to tokens: "+" => PLUS
    • Lexer generator + lex file => lexer
    • The parser file (syntax specification) defines how the code should be structured
    • Parser generator + syntax specification => parser
    • Lexer: character strings => tokens
    • Lexer: tokens + src => tokenized code
    • Parser: tokenized code + syntax specification => checks syntax and reports errors
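
    The summary above can be sketched as a small hand-written C program. This is only an illustration under stated assumptions, not what flex or Bison would generate: the names Token, lex, and is_valid_statement are made up for this sketch, any run of letters is treated as VAR, and the "parser" simply compares the token list against the one valid sequence from the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Token kinds, named as in the example above. */
typedef enum { VAR, WS, PLUS, EQ, SCOLON, NUM } Token;

/* Lexer: split src into tokens, writing at most max of them to out.
   Returns the number of tokens, or -1 on an unknown character or
   when out is too small. */
int lex(const char *src, Token *out, int max) {
    assert(src != NULL && out != NULL);
    int n = 0;
    size_t i = 0;
    while (src[i] != '\0') {
        unsigned char c = (unsigned char)src[i];
        Token t;
        if (isspace(c)) {                 /* run of whitespace -> WS */
            while (isspace((unsigned char)src[i])) i++;
            t = WS;
        } else if (isdigit(c)) {          /* [0-9]+ -> NUM */
            while (isdigit((unsigned char)src[i])) i++;
            t = NUM;
        } else if (isalpha(c)) {          /* assumption: any word is a VAR */
            while (isalpha((unsigned char)src[i])) i++;
            t = VAR;
        } else if (c == '+') { i++; t = PLUS; }
        else if (c == '=') { i++; t = EQ; }
        else if (c == ';') { i++; t = SCOLON; }
        else return -1;                   /* character matches no token */
        if (n == max) return -1;
        out[n++] = t;
    }
    return n;
}

/* "Parser": accept only the one token sequence shown in the example. */
int is_valid_statement(const char *src) {
    static const Token expected[] = {
        VAR, WS, EQ, WS, NUM, WS, PLUS, WS, NUM, SCOLON
    };
    enum { EXPECTED_LEN = sizeof expected / sizeof expected[0] };
    Token got[32];
    int n = lex(src, got, 32);
    if (n != EXPECTED_LEN) return 0;
    for (int i = 0; i < n; i++) {
        if (got[i] != expected[i]) return 0;
    }
    return 1;
}
```

    With this sketch, is_valid_statement("sum = 7 + 3;") returns 1, while dropping the semicolon or one of the spaces makes it return 0. A real Bison grammar would describe the structure with rules rather than one fixed sequence.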

    GNU m4 is a macro processor. In other words, before the project's source code is lexed, parsed, or compiled, GNU m4 expands the macro code. A macro is code inside the source code that performs some task on the source code itself before compiling. Within the GNU Toolchain, m4 is used mainly by Autoconf, but it works on any text. For instance, suppose a developer has a small calculator project whose source code is ready to be compiled. The configuration script is run and set to compile for a 64-bit system. The macros in the project's source code can then be used to remove platform-specific code, as seen in the pseudo-macro code below (this means the code below is fake code that does not follow any "real" specification).

    //Sample code
    #IF ARCH=64
    code()
    //this is code I would use if compiled for 64-bit systems
    #ELSE IF ARCH=32
    code()
    //my code for 32-bit platforms
    #ELSE
    generic_code()
    #ENDIF

    In my compiled code for a 64-bit system, the 32-bit code and generic code would not be included. Thus, my compiled program is smaller than it would be if I had used an ordinary run-time conditional in the programming language itself. Remember, macros change the source code itself. The compiled program does not include any macro code and never knows about the original source.

    Some programming languages natively support macros (like C/C++, through the C preprocessor). However, not all programming languages have native macro support (like Python). If macros are needed for such a language, find a third-party library/module/extension for it, or obtain a third-party macro preprocessor (such as m4) and learn its macro language. Alternatively, you could write your own macro preprocessor.
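
    For comparison with the pseudo-macro code above, here is a minimal sketch of the same idea using the real C preprocessor. ARCH_BITS, arch_description, and arch_bits are hypothetical names chosen for this example, and the sketch assumes the pointer width from <stdint.h> is a reasonable stand-in for the target architecture.

```c
#include <stdint.h>

/* ARCH_BITS is a hypothetical macro for this sketch. A build could set it
   explicitly (for example with -DARCH_BITS=32); otherwise it is derived
   from the pointer width reported by <stdint.h>. */
#ifndef ARCH_BITS
#  if UINTPTR_MAX > 0xFFFFFFFFu
#    define ARCH_BITS 64
#  else
#    define ARCH_BITS 32
#  endif
#endif

/* Only one of these definitions survives preprocessing. The other two
   are removed from the source before the compiler proper ever sees it. */
#if ARCH_BITS == 64
const char *arch_description(void) { return "64-bit code path"; }
#elif ARCH_BITS == 32
const char *arch_description(void) { return "32-bit code path"; }
#else
const char *arch_description(void) { return "generic code path"; }
#endif

/* Reports which value the preprocessor settled on. */
int arch_bits(void) { return ARCH_BITS; }
```

    Running "gcc -E" on a file like this prints the preprocessed source, which is one way to confirm that only the selected branch remains.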

    The GNU Debugger (GDB) is the debugging tool used in the GNU Toolchain. The executable/command is "gdb". Many CPU architectures and programming languages are supported. Also, many GUI front ends exist and some IDEs support gdb as a plugin. To debug a program, compile your source with the "-g" flag, which adds debugging information to the executable. After the debugging is done and fixes have been applied, compile your source code normally.

    Example of command-line usage

    gcc myapp.c -g -o MyApp #compile; the "-g" flag adds debugging information
    chmod +x MyApp #rarely needed; gcc normally marks its output as executable
    ./MyApp
    #the program crashes
    #time to investigate
    gdb ./MyApp #at the (gdb) prompt, type "run" to reproduce the crash, then "backtrace" to see where it happened
    gcc myapp.c -o MyApp #the programmer fixes errors and recompiles for normal use

    The GNU build system, also known as Autotools, is a set of utilities that allow programmers to make their source code packages portable to other Unixoid systems (Unix and Unix-like systems). This toolset includes Autoconf (which provides Autoheader), Automake, Libtool, and Gnulib.

    Autoconf is used with a "configure.ac" file to produce a "configure" script. When using a terminal in the same directory as the source code project, the user can type "./configure" and the configuration script will be executed. During execution, "configure" creates and runs the "config.status" script, which uses various input files (including "Makefile.in") to produce "Makefile".
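
    A configure script commonly records what it found in a generated header (conventionally "config.h") as HAVE_... macros, which the source code can then test. The following is a minimal sketch of that pattern, assuming configure.ac requests a config header and checks for the strlcpy function. copy_string is a hypothetical helper name, and the fallback branch is this sketch's own code, not something Autoconf or Gnulib supplies.

```c
/* HAVE_CONFIG_H is defined on the compiler command line when the package
   uses a generated config header. */
#ifdef HAVE_CONFIG_H
#  include "config.h"
#endif

#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity size bytes), always NUL-terminating when
   size > 0. Returns strlen(src), so a result >= size means truncation.
   copy_string is a hypothetical name used only in this sketch. */
size_t copy_string(char *dst, const char *src, size_t size) {
    assert(dst != NULL && src != NULL);
#ifdef HAVE_STRLCPY
    /* configure found strlcpy on this system, so use it directly. */
    return strlcpy(dst, src, size);
#else
    /* Fallback for systems where configure did not find strlcpy. */
    size_t len = strlen(src);
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
#endif
}
```

    The rest of the program calls copy_string everywhere and does not need to know which branch was compiled in, which is the kind of portability the configure step is meant to provide.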

    Automake uses "Makefile.am" to make portable makefiles. Automake creates "Makefile.in" which is used by the configuration script to generate the final "Makefile".

    Libtool is used to create static and shared libraries in a portable way.

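    Gnulib also deals with portability, but at the source level. It is a collection of source code modules that a package copies into its own tree to supply functions that are missing or broken on some systems, which helps both programs and libraries build on more platforms.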

    Other GNU development tools exist, but these are the ones that make up the official GNU Toolchain.
