Mico/Milang, a basic compiler

It’s always fascinating to be given a new box full of LEGO bricks you never knew about and you discover can build these great machineries you’ve been working with for years, right? Well, I had the same feeling when I was introduced to compilation…

While the domain goes way deeper than that, I would crudely sum up compilation as ‘the art of translating understandable user-language into computer binary jibberish’. Indeed, a compiler is a tool that takes a source text as input, your program, generally written in a high-level readable language, and it creates a matching output your computer can execute to actually perform the tasks in the program. This process works in several steps consisting in a series of translations that use various programming languages to slowly transform your code. However, the two main parts of the compilation are the analysis of the input text (to deconstruct it into atomic parts and store it as adapted data structures) and the synthesis of the output text (to rebuild an ordered list of machine-specific instructions corresponding to the program).

To try and wrap our heads around these notions, the compilation course I took at Polytech offered to create our own compiler, step by step. I developed the Mico/Milang (for ‘Mini-Compiler/Mini-Language’) duo that allows you to write some simple programs in my personal language, the Milang, and then compile and execute them with the Mico.

To have a better control over our set of instructions, we also developed a virtual machine to execute the code produced by the Mico.

The whole compiler/language can be downloaded here and the dossier over here (Fr).

The Milang: a very basic programming language…

Of course, in only 4 months, it was impossible to develop a truly useful language. But I tried and incorporate core functionalities so the Milang could be used for some simple problems (like printing a message on the screen, writing some loops or computing an element of the Fibonacci series with a recursive algorithm).

The Mini-Language draws from C (with static variable typing, instructions separated by the ; character and case-sensitive keywords) but also from Python (since I tried to choose easily readable keywords and some instructions can be written with or without parentheses). It defines two variable types: numeric and strings. We differentiate between global variables shared by the whole program and local variables only known inside the function’s context.

You are provided with 3 control structures: the if/else selector (for now, there is no else if though), and the for and the while loops. At the moment, loops conditions use numeric variables and the for loop only works with a positive increment – but this step can be specified by the programmer.

You can input variable values and print messages thanks to the input and print keywords. When outputting a constant string, you can either use simple quotes or double quotes; for string variables, the + and * operators act as they do in Python (the former concatenates two strings and the latter repeats the given string a certain amount of times).

I also added a comment system that allows users to write notes and details in their codes, by encapsulating them in <:: and ::> tokens.

… combined with a simple compiler

The Mico runs Milang programs. It handles basic codes with a main function, function calls (even recursive ones), I/Os and the other elements introduced above. It also warns the user in case of syntax or semantic errors: this will stop the compilation process and print out the line and the column where the error was found.

The compiler has a debug system that can show more details about the execution of the code to the programmer by printing info on the different compilation steps as they occur.

A little code snippet

Here is a simple example of Milang code that shows most of the implemented features:

If you run it, this program will output 2, 1 and 4, as predicted.

References
  1. S. Graillat, “Calculabilité – Décidabilité : Automates à pile, grammaires hors-contexte et langages hors-contexte,” 2017. [Course notes for MAIN4 – Polytech Sorbonne].
  2. I. Free Software Foundation, “The Bison Parser Algorithm.” (https://www.gnu.org/software/bison/manual/html_node/Algorithm.html), 2008.
  3. N. Jones, “Efficient C Tip #12 – Be wary of switch statements.” (https://embeddedgurus.com/stack-overflow/2010/04/efficient-c-tip-12-be-wary-of-switch-statements/), Avr. 2010.
  4. I. Wikimedia Foundation, “Shift-reduce parser.” (https://en.wikipedia.org/wiki/Shift-reduce_parser), Sept. 2017. [Online; last access on 08-Nov-2017].
  5. I. Wikimedia Foundation, “Assembleur.” (https://fr.wikipedia.org/wiki/Assembleur), Oct. 2017. [Online; last access on 09-Nov-2017].
  6. J. R. Levine, flex & bison (Unix Text Processing Tools). O’Reilly Media, Inc., Août 2009.
  7. F. Harrouet, “Générer un analyseur avec Flex & Bison.” (https://www.enib.fr/~harrouet/Data/Courses/Flex_Bison.pdf).
  8. R. S. Alfred V. Aho, Monica S. Lam and J. D. Ullman, Compilers: Principles, Techniques, and Tools. Pearson Education, Inc., 1986 (2006).

Leave a Reply

Your email address will not be published. Required fields are marked *