
Implement function overloading in C

Origin

While learning the Linux system calls, I came across functions like these:

int open(const char *pathname, int flags);
int open(const char *pathname, int flags, mode_t mode);

And this:

int fcntl(int fd, int cmd);
int fcntl(int fd, int cmd, long arg);
int fcntl(int fd, int cmd, struct flock *lock);

They seem normal, right?
But C has no support for function overloading, so how are these system calls implemented?

Analysis

To find out how they are implemented, we first look at the header that declares them, taking fcntl as the example.
fcntl() is declared in fcntl.h, which is located under /usr/include/ with the other C header files.

$ cat fcntl.h | grep 'fcntl'

extern int fcntl (int __fd, int __cmd, ...);

# include <bits/fcntl2.h>

Seeing this, we understand that they are implemented as variadic functions. But a variadic function cannot tell by itself how many arguments it received, so this is not the end of the story.

So now we have two problems. First, we don't know how many arguments are passed, which means we can't tell the first form of fcntl from the other two. Second, we don't know the type of the optional argument. The fcntl() manual tells us:
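To make that limitation concrete, here is a minimal sketch of a variadic function. The name sum_ints and its count parameter are made up for this illustration; the point is only that the callee has to be told, through a fixed argument, how many values follow and what type they have.

```c
#include <stdarg.h>

/* Hypothetical helper: `count` must be passed by the caller, because the
 * callee has no other way to learn how many variadic arguments follow. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);             /* start after the last fixed parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);    /* the type is assumed, not checked */
    va_end(ap);

    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) works only because the first argument truthfully describes the rest. fcntl has to solve the same problem with the arguments it already has.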

man: fcntl() can take an optional third argument. Whether or not this argument
is required is determined by cmd. The required argument type is
indicated in parentheses after each cmd name (in most cases, the
required type is int, and we identify the argument using the name arg),
or void is specified if the argument is not required.
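As a rough illustration of the rule quoted above, the sketch below calls fcntl() in each of its three forms. The wrapper names (get_status_flags, set_nonblocking, try_write_lock) are invented for this example; F_GETFL, F_SETFL, F_SETLK and O_NONBLOCK are the standard constants from fcntl.h.

```c
#include <fcntl.h>
#include <unistd.h>

/* Form 1: F_GETFL takes no third argument. */
int get_status_flags(int fd)
{
    return fcntl(fd, F_GETFL);
}

/* Form 2: F_SETFL takes an integer third argument. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Form 3: F_SETLK takes a pointer to struct flock. */
int try_write_lock(int fd)
{
    struct flock lock = {0};

    lock.l_type = F_WRLCK;     /* exclusive (write) lock */
    lock.l_whence = SEEK_SET;  /* offsets are relative to the file start */
    lock.l_start = 0;
    lock.l_len = 0;            /* 0 means "to the end of the file" */
    return fcntl(fd, F_SETLK, &lock);
}
```

In all three wrappers the same function name is called; only the value of cmd tells fcntl whether to look for a third argument and how to read it.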

The manual makes it clear: cmd decides both the number of arguments and their types.
So it can be implemented like this:

// Pseudocode: THIRD_ARGUMENT and LONG stand for "cmd is one of the commands
// that take a third argument" and "that argument is a long".
int fcntl(int fd, int cmd, ...) {
  if (cmd == THIRD_ARGUMENT) {
    va_list v;
    va_start(v, cmd);
    if (cmd == LONG) {
      long p3 = va_arg(v, long);
      va_end(v);
      // handle fcntl(fd, cmd, arg) and return
    } else {
      struct flock *p3 = va_arg(v, struct flock *);
      va_end(v);
      // handle fcntl(fd, cmd, lock) and return
    }
  }
  // handle fcntl(fd, cmd)
}

Simple program to illustrate it

Here is a runnable example that shows this pattern:

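#include <stdio.h>
#include <stdarg.h>

void hello2(int p1, int p2)
{
    printf("hello2 %d %d\n", p1, p2);
}

void hello3(int p1, int p2, int p3)
{
    printf("hello3 %d %d %d\n", p1, p2, p3);
}

/* cmd decides whether a third argument is read: only cmd == 7 takes one */
static void hello(int p1, int cmd, ...) {
    if (cmd == 7)
    {
        va_list v;
        va_start(v, cmd);  /* must name the last fixed parameter */

        int p3 = va_arg(v, int);
        va_end(v);
        hello3(p1, cmd, p3);

        return;
    }
    hello2(p1, cmd);
}

int main(void)
{
    hello(1, 2);     /* two-argument form */
    hello(1, 7, 3);  /* three-argument form */
    return 0;
}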

Another trick – by macro

And this is not the end. You can also achieve a form of overloading with macros. How? C++ implements overloading by having the compiler encode information about the parameters into each function's name, and you can likewise produce differently named functions with a macro. Can you come up with it now?


The following calls are perfectly valid C, right?

hello1("a");
hello2("a", "b");

So if we can turn the following calls into the form above, we are done, right?

hello("a");
hello("a", "b");

And here is a pair of macros that concatenates two tokens:

#define glue(a, b)   a ## b
#define xglue(a, b)  glue(a, b)

We now need this:

xglue(hello, 1)("a")
xglue(hello, 2)("a", "b")
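Why two macros instead of one? The operands of ## are not macro-expanded before pasting, so the extra level gives the arguments a chance to expand first. The sketch below shows the difference; ARG_COUNT, STR and STR_ are hypothetical helpers added only for this demonstration.

```c
#define glue(a, b)   a ## b
#define xglue(a, b)  glue(a, b)

/* Hypothetical helpers: turn the pasted token into a string for inspection */
#define STR_(x) #x
#define STR(x)  STR_(x)

/* Hypothetical macro standing in for a computed argument count */
#define ARG_COUNT 2

/* glue pastes the name ARG_COUNT itself: helloARG_COUNT */
static const char *direct_result = STR(glue(hello, ARG_COUNT));

/* xglue expands ARG_COUNT to 2 first, then pastes: hello2 */
static const char *indirect_result = STR(xglue(hello, ARG_COUNT));
```

This matters as soon as the number comes from another macro rather than being typed in by hand.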

But how do we get the 1 and the 2? We definitely don't want to write them by hand. Aha, what about counting the arguments? (That is, after all, where the numbers come from.) Now the problem becomes how to count arguments with a macro. This is a hard problem and a solution is not easy to come up with, so let's look at one that a clever programmer devised.

#define PP_NARG(...) \
    PP_NARG_(__VA_ARGS__,PP_RSEQ_N())
#define PP_NARG_(...) \
    PP_ARG_N(__VA_ARGS__)
#define PP_ARG_N( \
     _1, _2, _3, _4, _5, _6, _7, _8, _9,_10, \
    _11,_12,_13,_14,_15,_16,_17,_18,_19,_20, \
    _21,_22,_23,_24,_25,_26,_27,_28,_29,_30, \
    _31,_32,_33,_34,_35,_36,_37,_38,_39,_40, \
    _41,_42,_43,_44,_45,_46,_47,_48,_49,_50, \
    _51,_52,_53,_54,_55,_56,_57,_58,_59,_60, \
    _61,_62,_63,  N, ...) N
#define PP_RSEQ_N() \
    63,62,61,60,                   \
    59,58,57,56,55,54,53,52,51,50, \
    49,48,47,46,45,44,43,42,41,40, \
    39,38,37,36,35,34,33,32,31,30, \
    29,28,27,26,25,24,23,22,21,20, \
    19,18,17,16,15,14,13,12,11,10, \
     9, 8, 7, 6, 5, 4, 3, 2, 1, 0

// use it like PP_NARG(__VA_ARGS__)

Let's analyse how it works.
The macro appends the sequence 63 down to 0 to the end of our original variable-length argument list (giving 64 + N arguments in total). Because PP_RSEQ_N() has to be expanded before the arguments are split, the list is passed through an intermediate macro, PP_NARG_, and then on to PP_ARG_N, which names the first 63 arguments _1 to _63 and returns the 64th as N. For example, if the original argument is "a", we get the following mapping of arguments to parameters:

"a", 63, 62, ...,   2, 1, 0
 _1, _2, _3, ..., _63, N, ...

So it's clear that we get the number 1, and longer argument lists work the same way. Note that this macro supports at most 63 arguments, which should be enough.
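To watch the shifting happen on a smaller scale, here is a shortened sketch of the same idea that counts up to 7 arguments. The names NARG7, NARG7_, ARG7_N and RSEQ7 are made up for this example and are not part of the original macro set.

```c
/* Shortened variant of the counting trick: handles 1 to 7 arguments */
#define NARG7(...)   NARG7_(__VA_ARGS__, RSEQ7())
#define NARG7_(...)  ARG7_N(__VA_ARGS__)
#define ARG7_N(_1, _2, _3, _4, _5, _6, _7, N, ...) N
#define RSEQ7()      7, 6, 5, 4, 3, 2, 1, 0

/* Each real argument pushes the reversed sequence one slot to the right,
 * so the value that lands in slot N equals the number of real arguments. */
enum {
    count_one   = NARG7(a),                   /* a,7,6,5,4,3,2,[1],0 */
    count_three = NARG7(a, b, c),             /* a,b,c,7,6,5,4,[3],2,1,0 */
    count_seven = NARG7(a, b, c, d, e, f, g)  /* a,...,g,[7],6,...,0 */
};
```

The arguments are never evaluated here; they are only counted by the preprocessor, which is why bare names like a and b are fine.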

Full program to illustrate it

part of var.h


#define PP_NARG(...) \
    PP_NARG_(__VA_ARGS__,PP_RSEQ_N())
#define PP_NARG_(...) \
    PP_ARG_N(__VA_ARGS__)
#define PP_ARG_N( \
     _1, _2, _3, _4, _5, _6, _7, _8, _9,_10, \
    _11,_12,_13,_14,_15,_16,_17,_18,_19,_20, \
    _21,_22,_23,_24,_25,_26,_27,_28,_29,_30, \
    _31,_32,_33,_34,_35,_36,_37,_38,_39,_40, \
    _41,_42,_43,_44,_45,_46,_47,_48,_49,_50, \
    _51,_52,_53,_54,_55,_56,_57,_58,_59,_60, \
    _61,_62,_63,  N, ...) N
#define PP_RSEQ_N() \
    63,62,61,60,                   \
    59,58,57,56,55,54,53,52,51,50, \
    49,48,47,46,45,44,43,42,41,40, \
    39,38,37,36,35,34,33,32,31,30, \
    29,28,27,26,25,24,23,22,21,20, \
    19,18,17,16,15,14,13,12,11,10, \
     9, 8, 7, 6, 5, 4, 3, 2, 1, 0


#define glue(a, b)   a ## b
#define xglue(a, b)  glue(a, b)

overload.c

#include <stdio.h>
#include "var.h"

void hello1(const char *s) {
    printf("hello1 %s\n", s);
}

void hello2(const char *s, const char* s2) {
    printf("hello2 %s and %s\n", s, s2);
}

#define hello(...) xglue(hello, PP_NARG(__VA_ARGS__))(__VA_ARGS__)

int main(int argc, char *argv[]){
    hello("a");
    hello("a", "b");
    return 0;
}

Application

We can also implement default function arguments with the tricks above if we like, which gives an effect much like what open does. Try it yourself before taking a look at the Gist sample.
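One possible way to do it is sketched below; it is not necessarily what the Gist sample does. The names scale, scale1, scale2, scale_impl and NARG2 are hypothetical, and the counter is cut down to handle only one or two arguments.

```c
#define glue(a, b)   a ## b
#define xglue(a, b)  glue(a, b)

/* Cut-down counter: yields 1 or 2 for one or two arguments */
#define NARG2(...)             NARG2_(__VA_ARGS__, 2, 1, 0)
#define NARG2_(_1, _2, N, ...) N

/* The real function always takes both parameters. */
static int scale_impl(int value, int factor)
{
    return value * factor;
}

/* One wrapper per argument count; the one-argument form supplies the default. */
#define scale1(value)          scale_impl(value, 10)   /* default factor: 10 */
#define scale2(value, factor)  scale_impl(value, factor)

/* Dispatch on the number of arguments, as in the hello example. */
#define scale(...) xglue(scale, NARG2(__VA_ARGS__))(__VA_ARGS__)
```

With these definitions, scale(3) expands to scale_impl(3, 10) and scale(3, 4) expands to scale_impl(3, 4). Unlike open, the choice is made at compile time by the preprocessor rather than at run time from a flag.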

More

So far we have only overloaded functions on the number of parameters, but how do we overload functions that differ only in parameter type?
This article provides a solution:

void sizeof_overload_float(float f)
{
    printf("Got float %f\n", f);
}

void sizeof_overload_double(double d)
{
    printf("Got double %f\n", d);
}

void sizeof_overload_longdouble(long double ld)
{
    printf("Got long double %Lf\n", ld);
}

#define sizeof_overload(A)\
    ((sizeof(A) == sizeof(float))?sizeof_overload_float(A):\
    (sizeof(A) == sizeof(double))?sizeof_overload_double(A):\
    (sizeof(A) == sizeof(long double))?sizeof_overload_longdouble(A):(void)0)

The trick above works, but it has a limitation: it can only distinguish argument types that have different sizes.
Can you design a better way?
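One candidate answer, assuming a C11 compiler, is the _Generic selection expression, which picks a branch by the type of its operand rather than by its size. The sketch below is an alternative to the sizeof approach, not something taken from the article; the names kind_of, kind_float, kind_double and kind_long_double are made up for this example.

```c
/* One function per type, as in the sizeof version */
static const char *kind_float(float f)             { (void)f; return "float"; }
static const char *kind_double(double d)           { (void)d; return "double"; }
static const char *kind_long_double(long double l) { (void)l; return "long double"; }

/* _Generic (C11) selects a function by the static type of x,
 * then the trailing (x) calls the selected function. */
#define kind_of(x) _Generic((x),          \
    float:       kind_float,              \
    double:      kind_double,             \
    long double: kind_long_double)(x)
```

Here kind_of(1.0f), kind_of(1.0) and kind_of(1.0L) select three different functions even on platforms where two of the types happen to share a size. An argument of any other type is a compile-time error, because no default branch is given.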
