Writing a simple Compiler on my own - Function Semantics (part 1) [C][Flex][Bison]

[Custom Thumbnail]

All the Code of the series can be found at the Github repository:


    Hello it's a me again @drifter1! Today we continue with my Compiler Series, a series where we implement a complete compiler for a simple C-like language by using the C-tools Flex and Bison. In this article we will write the needed code for the parameter definition, function declaration and also parameter type-compatibility checking, which are some of the function semantics. Again all this will not be included in the "actual" compiler yet...

The topics that we will cover today are:

  1. Parameter definition code
  2. Function declaration code
  3. Function parameter type compatibility checking code
    In part 2 we will discuss how we can check this compatibility after the function is declared using a "revisit queue", cause functions are being called before they are declared!


    Actually you need to read and understand all the topics that I covered in the series as a whole, as these articles will give you access to knowledge about:
  • What Compiler Design is (mainly the steps)
  • For which exact Language the Compiler is build for (Tokens and Grammar)
  • How to use Flex and Bison
  • How to implement a lexer and parser for the language using those tools
  • What the Symbol Table is and how we implement it
  • How we combine Flex and Bison together
  • How we can pass information from the Lexer to the Parser
  • How we define operator priorities, precedencies and associativity
  • What Semantic Analysis is (Attributes, SDT etc.)
  • How we do the so called "Scope Resolution"
  • How we declare types and check the type inter-compatibility for different cases


Talking about the series in general this series can be rated:
  • Intermediate to Advanced
Today's topic(s) can be rated:
  • Medium
So, without further ado, let's now finally start with the actual Tutorial...

Actual Tutorial Content

Parameter Definition

    Knowing that a Parameter is a somewhat complex structure, I think that creating a function that gives back such a structure is not a bad idea. The parameter struct is defined as:
typedef struct Param{
    // parameter type and name
    int par_type;
    char param_name[MAXTOKENLEN];
// to store the value int ival; double fval; char *st_sval; int passing; // value or reference }Param;
    This shows us that a Parameter is mainly described by it's data type, name and passing way (by reference or value). Of course, when being in a function call a parameter is a constant, identifier or even expression, which gives us some "final" type. Let's call the function that manages this "def_param" and let this function return a Param struct variable. This means that the declaration of this function looks as following:

Param def_param(int par_type, char *param_name, int passing);

    What the function has to do is simply set the values of the corresponding entries. This means that the entry "par_type" will get the value of the parameter "par_type"etc. In the end we also have to return the created structure. So, the actual code looks as following:

Param def_param(int par_type, char *param_name, int passing){
    Param param; /* Parameter struct */
/* set the information */ param.par_type = par_type; strcpy(param.param_name, param_name); param.passing = passing;
/* return the structure */ return param; }

    This function will be used in the parser action code when we are in a rule that contains parameters. And so in both the function calls and function declarations. There is not much more to say really. You will see more when it's time! :)

Function Declaration

    We already declared a function for type declaration called "set_type", but this function is not enough for functions declarations. When declaring functions we also have to define the number of parameters and also the actual parameters that the function has. The actual parameters will come as an input of this function, meaning that we will manage them outside of this function. This management will be discussed when we get more into bison and how we integrate all these semantics. Let's call the function that does the function declaration "func_declare". The declaration of this function looks as following:

int func_declare(char *name, int ret_type, int num_of_pars, Param *parameters);

    This function of course has to lookup the entry from the symbol table. Afterwards we declare the function, a step where we can also check if the function was already declared! If no error occurs then we just give the corresponding values to the corresponding entries of the symbol table entry.
The code is:
int func_declare(char *name, int ret_type, int num_of_pars, Param *parameters){
    /* lookup entry */
    list_t *l = lookup(name);
/* if type is not defined yet */ if(l->st_type != UNDEF){ /* entry is of function type */ l->st_type = FUNCTION_TYPE;
/* return type is ret_type */ l->inf_type = ret_type;
/* parameter stuff */ l->num_of_pars = num_of_pars; l->parameters = parameters;
return 0; /* success */ } /* already declared error */ else{ fprintf(stderr, "Function %s already declared!\n", name); exit(1); } }

    This function will be called when declaring functions and more specifically in some part(s) of the "function_head" rule. I guess you already see the big problem here! Functions are declared after they are being used, meaning that we will have to come up with some trick that checks the function declaration and compatibility of the parameters (comes next) after we get to this point. This already shows us that our compiler can't be one-pass! We will have to do some "revisiting" of specific things, which include function semantics for example.

Function Parameter Type Compatibility Checking

    Last time we created a function called "get_result_type", which also had a special case for no operator (NONE). This case is useful for assignments and parameter compatibility checks. Using this function and the types of all the parameters of a function call and the function declaration, we can check if the parameters are compatible or not! Let's call such a function "func_param_check", which of course needs the name of the function, number of parameters and parameters of the function call. So, the declaration of this function looks as following:

int func_param_check(char *name, int num_of_pars, Param *parameters);

    Getting more in-depth this function has to lookup for the symbol table entry of the function, which actually is the entry that stores the function declaration details. Then, we first have to check if the number of parameters is right. The number of parameters of the function call and function declaration of course has to be the same. If it's not we will return an error! Lastly we then check if each function parameter is compatible with getting the function call parameter. Using the previously mentioned function "get_result_type", type errors get managed "automatically" from that function, without having to check the result of the function. Either way, the result would be '1' if they are compatible, as we saw last time.
The code for all that looks as following:
int func_param_check(char *name, int num_of_pars, Param *parameters){
    int i, type_1, type_2;
/* lookup entry */ list_t *l = lookup(name);
/* check number of parameters */ if(l->num_of_pars != num_of_pars){ fprintf(stderr, "Function call of %s has wrong num of parameters!\n", name); exit(1); }
/* check if parameters are compatible */ for(i = 0; i < num_of_pars; i++){ /* type of parameter in function declaration */ type_1 = l->parameters[i].par_type;
/* type of parameter in function call*/ type_2 = parameters[i].par_type;
/* check compatibility for function call */ get_result_type(type_1, type_2, NONE); /* error occurs automatically in the function */ }
return 0; /* success */ }

    It's easy to guess where exactly such a function will be used. Well, it's when calling a function and so in a "function_call" statement or expression, to check if the parameters of the function call are compatible with the types of the function declaration. The function declaration of course occurs much much later (as we already mentioned earlier) and so a "revisit queue" has to be created so that we can check the parameter compatibility and function declaration when we get to the point where a function get's declared. This is a so called "Backtracking" technique that we will discuss in part 2 and so next time!

Note: We might tweak today's code later on, when it actually gets into use!



    No actual references. Only stuff we discussed through-out the series, but now "implemented" somewhat!


All of the images are custom-made!

Previous parts of the series

Final words | Next up on the project

     And this is actually it for today's post! I hope that I explained everything as much as I needed to, meaning that you learned something out of it.
Next up on this series are:
  • Semantic analysis (creating and using even more semantic rules)
  • Intermediate Code generation (Abstract Syntax Tree)
  • Machine Code generation (MIPS Assembly)
     Which are all topics that will need more than one article to complete. Also, note that we might also get into Optimizations later on, or could even extend the Language by adding complex datatypes (structs and unions), more rules etc.
So, see ya next time!

GitHub Account:


Keep on drifting! ;)

3 columns
2 columns
1 column