c function-pointers

How do function pointers in C work?

1430

I had some experience lately with function pointers in C.

So, in keeping with the tradition of answering your own questions, I decided to write a short summary of the very basics for those who need a quick introduction to the subject.

3

1658

Function pointers in C

Let’s start with a basic function which we will be pointing to:

int addInt(int n, int m) {
    return n+m;
}

First thing, let’s define a pointer to a function which receives 2 ints and returns an int:

int (*functionPtr)(int,int);

Now we can safely point to our function:

functionPtr = &addInt;

Now that we have a pointer to the function, let’s use it:

int sum = (*functionPtr)(2, 3); // sum == 5
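Both the address-of operator in the assignment and the dereference in the call are optional, because a function name converts to a pointer to that function in these contexts. A minimal sketch, reusing addInt from above (the helper callBothWays and the pointer names are made up for illustration):

```c
#include <stdio.h>

int addInt(int n, int m) {
    return n + m;
}

/* Hypothetical helper: calls addInt through a pointer using both spellings. */
int callBothWays(void) {
    int (*explicitPtr)(int, int) = &addInt; /* explicit address-of */
    int (*implicitPtr)(int, int) = addInt;  /* function name converts to a pointer */

    int a = (*explicitPtr)(2, 3); /* explicit dereference before the call */
    int b = implicitPtr(2, 3);    /* call directly through the pointer */

    printf("%d %d\n", a, b);
    return a + b;
}
```

Which spelling to use is a matter of style; the explicit forms make it more obvious to a reader that a pointer is involved.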

Passing the pointer to another function is basically the same:

int add2to3(int (*functionPtr)(int, int)) {
    return (*functionPtr)(2, 3);
}
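Passing the pointer as a parameter is what lets a caller choose the behavior of a general-purpose routine. The following is only a sketch of that idea: foldInts and mulInt are hypothetical names introduced here, not part of the answer above.

```c
#include <stddef.h>

int addInt(int n, int m) {
    return n + m;
}

/* Hypothetical second operation with the same signature as addInt. */
int mulInt(int n, int m) {
    return n * m;
}

/* Hypothetical helper: combines every element of values with op,
   starting from initial. The caller decides what op does. */
int foldInts(const int *values, size_t count, int initial, int (*op)(int, int)) {
    int result = initial;
    for (size_t i = 0; i < count; i++) {
        result = op(result, values[i]);
    }
    return result;
}
```

Calling foldInts(values, count, 0, addInt) sums the array, while foldInts(values, count, 1, mulInt) multiplies its elements, without any change to foldInts itself.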

We can use function pointers in return values as well (try to keep up, it gets messy):

// this is a function called functionFactory which receives parameter n
// and returns a pointer to another function which receives two ints
// and it returns another int
int (*functionFactory(int n))(int, int) {
    printf("Got parameter %d", n);
    int (*functionPtr)(int,int) = &addInt;
    return functionPtr;
}

But it’s much nicer to use a typedef:

typedef int (*myFuncDef)(int, int);
// note that the typedef name is indeed myFuncDef

myFuncDef functionFactory(int n) {
    printf("Got parameter %d", n);
    myFuncDef functionPtr = &addInt;
    return functionPtr;
}
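The typedef also keeps arrays of function pointers readable. As a rough sketch (subInt, operations, and pickOperation are hypothetical names, and the table layout is just one possible choice), a lookup table can stand in for a switch statement:

```c
#include <stddef.h>

typedef int (*myFuncDef)(int, int);

int addInt(int n, int m) {
    return n + m;
}

/* Hypothetical second function with the same signature. */
int subInt(int n, int m) {
    return n - m;
}

/* Hypothetical table indexed by an operation code: 0 = add, 1 = subtract. */
static const myFuncDef operations[] = { addInt, subInt };

/* Returns the function for the given code, or NULL if the code is unknown. */
myFuncDef pickOperation(size_t code) {
    size_t count = sizeof operations / sizeof operations[0];
    return code < count ? operations[code] : NULL;
}
```

A caller would check the result against NULL before calling through it, for example: myFuncDef f = pickOperation(1); if (f != NULL) f(2, 3);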

11

  • 24

    Thanks for the great info. Could you add some insight on where function pointers are used or happen to be particularly useful?

    May 8, 2009 at 15:55

  • 362

    “functionPtr = &addInt;” can also be written (and often is) as “functionPtr = addInt;”, which is also valid since the standard says that a function name in this context is converted to the address of the function.

    – hlovdal

    May 9, 2009 at 14:39

  • 26

    hlovdal, in this context it’s interesting to explain that this is what enables one to write functionPtr = ******************addInt;

    May 10, 2009 at 17:54

  • 117

    @Rich.Carpenter I know this is 4 years too late, but I figure other people might benefit from this: Function pointers are useful for passing functions as parameters to other functions. It took a lot of searching for me to find that answer for some odd reason. So basically, it gives C pseudo first-class functionality.

    – giant91

    Oct 13, 2013 at 2:28


  • 27

    @Rich.Carpenter: function pointers are nice for runtime CPU detection. Have multiple versions of some functions to take advantage of SSE, popcnt, AVX, etc. At startup, set your function pointers to the best version of each function for the current CPU. In your other code, just call through the function pointer instead of having conditional branches on the CPU features everywhere. Then you can do complicated logic about deciding that well, even though this CPU supports pshufb, it’s slow, so the earlier implementation is still faster. x264/x265 use this extensively, and are open source.

    Aug 30, 2015 at 2:22
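The runtime-dispatch idea from the last comment can be sketched roughly as follows. This is not how x264 does it: the function names are hypothetical, and a plain boolean flag stands in for real CPU feature detection, which is platform-specific.

```c
#include <stdbool.h>

/* Two interchangeable implementations of the same operation. */
int countBitsSimple(unsigned int x) {
    int count = 0;
    while (x != 0) {
        count += (int)(x & 1u); /* test the lowest bit */
        x >>= 1;
    }
    return count;
}

int countBitsClearLowest(unsigned int x) {
    int count = 0;
    while (x != 0) {
        x &= x - 1; /* clears the lowest set bit */
        count++;
    }
    return count;
}

/* All callers go through this pointer; it starts at the safe default. */
int (*countBits)(unsigned int) = countBitsSimple;

/* Hypothetical startup hook. Real code would query the CPU here
   instead of taking a flag from the caller. */
void initDispatch(bool preferFastPath) {
    countBits = preferFastPath ? countBitsClearLowest : countBitsSimple;
}
```

After initDispatch runs once at startup, the rest of the program just calls countBits(x) and never branches on the feature check again.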


328

Function pointers in C can be used to perform object-oriented programming in C.

For example, the following lines are written in C:

String s1 = newString();
s1->set(s1, "hello");

Yes, the -> and the lack of a new operator are a dead giveaway, but it sure seems to imply that we’re setting the text of some String class to be "hello".

By using function pointers, it is possible to emulate methods in C.

How is this accomplished?

The String class is actually a struct with a bunch of function pointers which act as a way to simulate methods. The following is a partial declaration of the String class:

typedef struct String_Struct* String;

struct String_Struct
{
    char* (*get)(const void* self);
    void (*set)(const void* self, char* value);
    int (*length)(const void* self);
};

char* getString(const void* self);
void setString(const void* self, char* value);
int lengthString(const void* self);

String newString();

As can be seen, the methods of the String class are actually function pointers to the declared functions. To prepare an instance of String, the newString function is called to set up the function pointers to their respective functions:

String newString()
{
    String self = (String)malloc(sizeof(struct String_Struct));

    self->get = &getString;
    self->set = &setString;
    self->length = &lengthString;

    self->set(self, "");

    return self;
}

For example, the getString function that is called by invoking the get method is defined as the following:

char* getString(const void* self_obj)
{
    return ((String)self_obj)->internal->value;
}

One thing that can be noticed is that there is no concept of an instance of an object and having methods that are actually a part of an object, so a “self object” must be passed in on each invocation. (And the internal is just a hidden struct which was omitted from the code listing earlier — it is a way of performing information hiding, but that is not relevant to function pointers.)

So, rather than being able to do s1->set("hello");, one must pass in the object to perform the action on: s1->set(s1, "hello").
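Since the listings above omit the storage behind the string, here is a compact, compilable sketch of the same pattern. It makes several assumptions that differ from the original: the text is stored directly in the struct rather than behind a hidden internal struct, set takes a non-const self (it modifies the object) and returns a status code, and freeString is a hypothetical cleanup function.

```c
#include <stdlib.h>
#include <string.h>

typedef struct String_Struct *String;

struct String_Struct {
    char *value; /* assumption: stored inline, no hidden internal struct */
    const char *(*get)(const void *self);
    int (*set)(void *self, const char *value); /* 0 on success, -1 on failure */
    int (*length)(const void *self);
};

const char *getString(const void *self) {
    return ((const struct String_Struct *)self)->value;
}

int setString(void *self, const char *value) {
    String s = self;
    size_t size = strlen(value) + 1;
    char *copy = malloc(size);
    if (copy == NULL) {
        return -1; /* keep the old value if allocation fails */
    }
    memcpy(copy, value, size);
    free(s->value);
    s->value = copy;
    return 0;
}

int lengthString(const void *self) {
    return (int)strlen(((const struct String_Struct *)self)->value);
}

String newString(void) {
    String self = malloc(sizeof *self);
    if (self == NULL) {
        return NULL;
    }
    self->value = NULL;
    self->get = getString;
    self->set = setString;
    self->length = lengthString;
    if (self->set(self, "") != 0) { /* start out as the empty string */
        free(self);
        return NULL;
    }
    return self;
}

/* Hypothetical cleanup function, not part of the original listing. */
void freeString(String self) {
    if (self != NULL) {
        free(self->value);
        free(self);
    }
}
```

With these definitions, s1->set(s1, "hello") followed by s1->get(s1) yields "hello", and s1->length(s1) yields 5.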

With that minor explanation of having to pass in a reference to yourself out of the way, we’ll move to the next part, which is inheritance in C.

Let’s say we want to make a subclass of String, say an ImmutableString. To make the string immutable, the set method will not be accessible, while get and length remain available, and the “constructor” is forced to accept a char*:

typedef struct ImmutableString_Struct* ImmutableString;

struct ImmutableString_Struct
{
    String base;

    char* (*get)(const void* self);
    int (*length)(const void* self);
};

ImmutableString newImmutableString(const char* value);

Basically, for all subclasses, the available methods are once again function pointers. This time, the declaration for the set method is not present, so it cannot be called on an ImmutableString.

As for the implementation of the ImmutableString, the only relevant code is the “constructor” function, the newImmutableString:

ImmutableString newImmutableString(const char* value)
{
    ImmutableString self = (ImmutableString)malloc(sizeof(struct ImmutableString_Struct));

    self->base = newString();

    self->get = self->base->get;
    self->length = self->base->length;

    self->base->set(self->base, (char*)value);

    return self;
}

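In instantiating the ImmutableString, the function pointers to the get and length methods actually refer to the String.get and String.length methods, by going through the base variable, which is an internally stored String object. Because those functions cast their argument to a String, they must be called with the base object, for example s->get(s->base), not with the ImmutableString itself.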

The use of a function pointer can achieve inheritance of a method from a superclass.

We can further continue to polymorphism in C.

If for example we wanted to change the behavior of the length method to return 0 all the time in the ImmutableString class for some reason, all that would have to be done is to:

  1. Add a function that is going to serve as the overriding length method.
  2. Go to the “constructor” and set the function pointer to the overriding length method.

Adding an overriding length method in ImmutableString may be performed by adding a lengthOverrideMethod:

int lengthOverrideMethod(const void* self)
{
    return 0;
}

Then, the function pointer for the length method in the constructor is hooked up to the lengthOverrideMethod:

ImmutableString newImmutableString(const char* value)
{
    ImmutableString self = (ImmutableString)malloc(sizeof(struct ImmutableString_Struct));

    self->base = newString();

    self->get = self->base->get;
    self->length = &lengthOverrideMethod;

    self->base->set(self->base, (char*)value);

    return self;
}

Now, rather than behaving identically to the length method in the String class, the length method in the ImmutableString class follows the behavior defined in the lengthOverrideMethod function.
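Inheritance and overriding can be put together in one small compilable sketch. It deliberately departs from the listings above in a few labeled ways: the base struct is embedded by value as the first member (so a pointer to the subclass is also a valid pointer to the base), the base borrows the caller's text instead of copying it, and the overrideLength parameter is a hypothetical switch added only for demonstration.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified base "class". Assumption: borrows the text, does not copy it. */
struct String_Struct {
    const char *value;
    const char *(*get)(const void *self);
    int (*length)(const void *self);
};

const char *getString(const void *self) {
    return ((const struct String_Struct *)self)->value;
}

int lengthString(const void *self) {
    return (int)strlen(((const struct String_Struct *)self)->value);
}

/* "Subclass": the base comes first, so casting a pointer to this struct
   to a pointer to String_Struct is well defined. */
struct ImmutableString_Struct {
    struct String_Struct base;
};
typedef struct ImmutableString_Struct *ImmutableString;

/* Overriding method: ignores the object and always reports 0. */
int lengthOverrideMethod(const void *self) {
    (void)self;
    return 0;
}

/* overrideLength is a hypothetical flag used only for this demonstration. */
ImmutableString newImmutableString(const char *value, int overrideLength) {
    ImmutableString self = malloc(sizeof *self);
    if (self == NULL) {
        return NULL;
    }
    self->base.value = value;
    self->base.get = getString; /* inherited unchanged */
    self->base.length = overrideLength ? lengthOverrideMethod : lengthString;
    return self;
}
```

A call such as s->base.length(s) then runs either the inherited or the overriding function, depending only on which pointer the constructor installed.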

I must add a disclaimer that I am still learning how to write with an object-oriented programming style in C, so there probably are points that I didn’t explain well, or may just be off mark in terms of how best to implement OOP in C. But my purpose was to try to illustrate one of many uses of function pointers.

For more information on how to perform object-oriented programming in C, please refer to the following questions:

11

  • 29

    This answer is horrible! Not only does it imply that OO somehow depends on dot notation, it also encourages putting junk into your objects!

    Sep 16, 2012 at 14:30

  • 33

    This is OO all right, but not anywhere near the C-style OO. What you have brokenly implemented is Javascript-style prototype-based OO. To get C++/Pascal-style OO, you’d need to: 1. Have a const struct for a virtual table of each class with virtual members. 2. Have pointer to that struct in polymorphic objects. 3. Call virtual methods via the virtual table, and all other methods directly — usually by sticking to some ClassName_methodName function naming convention. Only then you get the same runtime and storage costs as you do in C++ and Pascal.

    Mar 18, 2013 at 21:53


  • 20

    Working OO with a language that is not intended to be OO is always a bad idea. If you want OO and still have C just work with C++.

    Jul 4, 2013 at 15:21

  • 27

    @rbaleksandar Tell that to the Linux kernel developers. “always a bad idea” is strictly your opinion, with which I firmly disagree.

    Apr 30, 2015 at 12:31

  • 7

    I like this answer but don’t cast malloc

    – cat

    Sep 29, 2016 at 14:41
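The virtual-table layout described in the comments can be sketched as follows. This is only an illustration of the three steps listed there; the Shape example and every name in it are hypothetical and do not come from the answer.

```c
struct Shape;

/* Step 1: one table of virtual methods per class, shared by all instances. */
struct ShapeVTable {
    int (*area)(const struct Shape *self);
    const char *(*name)(const struct Shape *self);
};

/* Step 2: each polymorphic object carries a pointer to its class's table. */
struct Shape {
    const struct ShapeVTable *vtable;
    int width;
    int height;
};

int rectangleArea(const struct Shape *self) {
    return self->width * self->height;
}

const char *rectangleName(const struct Shape *self) {
    (void)self;
    return "rectangle";
}

int triangleArea(const struct Shape *self) {
    return self->width * self->height / 2;
}

const char *triangleName(const struct Shape *self) {
    (void)self;
    return "triangle";
}

const struct ShapeVTable RECTANGLE_VTABLE = { rectangleArea, rectangleName };
const struct ShapeVTable TRIANGLE_VTABLE = { triangleArea, triangleName };

/* Step 3: virtual methods are called through the table... */
int Shape_area(const struct Shape *self) {
    return self->vtable->area(self);
}

/* ...while non-virtual methods are ordinary functions called directly. */
int Shape_boundingPerimeter(const struct Shape *self) {
    return 2 * (self->width + self->height);
}
```

Compared with storing every function pointer in every object, each instance here pays for a single pointer no matter how many virtual methods the class has.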

254

The guide to getting fired: How to abuse function pointers in GCC on x86 machines by compiling your code by hand:

These string literals are bytes of 32-bit x86 machine code. 0xC3 is an x86 ret instruction.

You wouldn’t normally write these by hand, you’d write in assembly language and then use an assembler like nasm to assemble it into a flat binary which you hexdump into a C string literal.

  1. Returns the current value in the EAX register

    int eax = ((int(*)())("\xc3 <- This returns the value of the EAX register"))();
    
  2. Write a swap function

    int a = 10, b = 20;
    ((void(*)(int*,int*))"\x8b\x44\x24\x04\x8b\x5c\x24\x08\x8b\x00\x8b\x1b\x31\xc3\x31\xd8\x31\xc3\x8b\x4c\x24\x04\x89\x01\x8b\x4c\x24\x08\x89\x19\xc3 <- This swaps the values of a and b")(&a,&b);
    
  3. Write a for-loop counter to 1000, calling some function each time

    ((int(*)())"\x66\x31\xc0\x8b\x5c\x24\x04\x66\x40\x50\xff\xd3\x58\x66\x3d\xe8\x03\x75\xf4\xc3")(&function); // calls function with 1->1000
    
  4. You can even write a recursive function that counts to 100

    const char* lol = "\x8b\x5c\x24\x4\x3d\xe8\x3\x0\x0\x7e\x2\x31\xc0\x83\xf8\x64\x7d\x6\x40\x53\xff\xd3\x5b\xc3\xc3 <- Recursively calls the function at address lol.";
    i = ((int(*)())(lol))(lol);
    

Note that compilers place string literals in the .rodata section (or .rdata on Windows), which is linked as part of the text segment (along with code for functions).

The text segment has Read+Exec permission, so casting string literals to function pointers works without needing mprotect() or VirtualProtect() system calls like you’d need for dynamically allocated memory. (Or gcc -z execstack links the program with stack + data segment + heap executable, as a quick hack.)


To disassemble these, you can compile this to put a label on the bytes, and use a disassembler.

// at global scope
const char swap[] = "\x8b\x44\x24\x04\x8b\x5c\x24\x08\x8b\x00\x8b\x1b\x31\xc3\x31\xd8\x31\xc3\x8b\x4c\x24\x04\x89\x01\x8b\x4c\x24\x08\x89\x19\xc3 <- This swaps the values of a and b";

Compiling with gcc -c -m32 foo.c and disassembling with objdump -D -rwC -Mintel, we can get the assembly, and find out that this code violates the ABI by clobbering EBX (a call-preserved register) and is generally inefficient.

00000000 <swap>:
   0:   8b 44 24 04             mov    eax,DWORD PTR [esp+0x4]   # load int *a arg from the stack
   4:   8b 5c 24 08             mov    ebx,DWORD PTR [esp+0x8]   # ebx = b
   8:   8b 00                   mov    eax,DWORD PTR [eax]       # dereference: eax = *a
   a:   8b 1b                   mov    ebx,DWORD PTR [ebx]
   c:   31 c3                   xor    ebx,eax                # pointless xor-swap
   e:   31 d8                   xor    eax,ebx                # instead of just storing with opposite registers
  10:   31 c3                   xor    ebx,eax
  12:   8b 4c 24 04             mov    ecx,DWORD PTR [esp+0x4]  # reload a from the stack
  16:   89 01                   mov    DWORD PTR [ecx],eax     # store to *a
  18:   8b 4c 24 08             mov    ecx,DWORD PTR [esp+0x8]
  1c:   89 19                   mov    DWORD PTR [ecx],ebx
  1e:   c3                      ret    

  not shown: the later bytes are ASCII text documentation
  they're not executed by the CPU because the ret instruction sends execution back to the caller

This machine code will (probably) work in 32-bit code on Windows, Linux, OS X, and so on: the default calling conventions on all those OSes pass args on the stack instead of more efficiently in registers. But EBX is call-preserved in all the normal calling conventions, so using it as a scratch register without saving/restoring it can easily make the caller crash.

19

  • 9

    Note: this doesn’t work if Data Execution Prevention is enabled (e.g. on Windows XP SP2+), because C strings are not normally marked as executable.

    Feb 12, 2013 at 5:53

  • 5

    Hi Matt! Depending on the optimization level, GCC will often inline string constants into the TEXT segment, so this will work even on newer versions of Windows provided that you don’t disallow this type of optimization. (IIRC, the MINGW version at the time of my post over two years ago inlines string literals at the default optimization level)

    – Lee

    Jan 2, 2014 at 6:20

  • 11

    could someone please explain what’s happening here? What are those weird looking string literals?

    – ajay

    Jan 20, 2014 at 10:17


  • 62

    @ajay It looks like he’s writing raw hexadecimal values (for instance ‘\x00’ is the same as ‘\0’, they’re both equal to 0) into a string, then casting the string into a C function pointer, then executing the C function pointer because he’s the devil.

    – ejk314

    Feb 21, 2014 at 21:27


  • 4

    hi FUZxxl, I think it might vary based on the compiler and the operating system version. The above code seems to run fine on codepad.org; codepad.org/FMSDQ3ME

    – Lee

    Mar 13, 2014 at 0:48