Correct usage of LD_PRELOAD for hooking libc functions

LD_PRELOAD is a very powerful feature supported by the dynamic linker on most Unixes that allows shared libraries to be loaded before others (including libc). This makes it very useful for hooking libc functions to observe or modify the behaviour of third-party applications whose source you do not control.

Unfortunately, a lot of what's been written on the subject online is subtly wrong: not wrong enough to fail outright, but just wrong enough to bite you when you least expect it. In this post I'll first go over the incorrect approach that is often described, analyze why it's wrong, and then describe the easy fix.

A simple program

Let's consider a simple C program that we'll use for testing. Our goal is to track which files it opens using LD_PRELOAD.

#include <stdio.h>

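int main(int argc, char **argv) {
  FILE *ptr = fopen("/etc/hosts", "r");
  if (ptr == NULL) {
    /* fclose(NULL) is undefined behaviour, so bail out early. */
    perror("fopen");
    return 1;
  }
  fclose(ptr);
  return 0;
}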

Nothing special going on here — we can save it to test.c and compile with:

$ gcc test.c -o test

(Incorrectly) using LD_PRELOAD to hook fopen

Strictly speaking, fopen is not the lowest level you can get for opening files. open(2) (and friends) is the syscall everything eventually trickles down to, but we can't intercept the syscall itself with an LD_PRELOAD hook; that's what ptrace(2) is for. At most, we could intercept its libc wrapper. Nonetheless, hooking fopen is enough for demonstration purposes.

#define _GNU_SOURCE

#include <stdio.h>
#include <dlfcn.h>

typedef FILE *(*fopen_t)(const char *pathname, const char *mode);
fopen_t real_fopen;

FILE *fopen(const char *pathname, const char *mode) {
  fprintf(stderr, "called fopen(%s, %s)\n", pathname, mode);
  return real_fopen(pathname, mode);
}

__attribute__((constructor)) static void setup(void) {
  real_fopen = dlsym(RTLD_NEXT, "fopen"); 
  fprintf(stderr, "called setup()\n");
}

We can compile this code as a position-independent shared library, linking libdl for dlsym. Then, by passing its full path in the LD_PRELOAD environment variable, we get it loaded before libc, so fopen resolves to our definition.

$ gcc -shared -fPIC preload_test.c -o preload_test.so -ldl
$ LD_PRELOAD=$PWD/preload_test.so ./test
called setup()
called fopen(/etc/hosts, r)

Let's provide a bit more background on what's going on. __attribute__((constructor)) is a GCC extension (also supported by Clang) which places a pointer to setup in preload_test.so's .ctors section (.init_array on newer toolchains). The dynamic loader then knows to execute the function when the library is initialized, before main is called. In our setup function, we ask libdl for the next (RTLD_NEXT) resolution of fopen, which should be libc's, and keep a pointer to it. When our test executable runs and opens /etc/hosts, our hooked fopen is called.
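To see the constructor mechanism in isolation, here is a minimal sketch that is separate from the hook itself. The flag setup_ran and the accessor setup_has_run are hypothetical names made up for this illustration; the only assumption is a GCC- or Clang-compatible compiler.

```c
/* Hypothetical flag for illustration: flipped by the constructor below. */
static int setup_ran = 0;

/* Runs when the containing program or library is initialized. */
__attribute__((constructor)) static void setup(void) {
  setup_ran = 1;
}

/* Returns 1 if setup() has already run at the time of the call. */
int setup_has_run(void) {
  return setup_ran;
}
```

Called from main, setup_has_run() should return 1, since this object's constructor has already run by then. The guarantee only covers main, though. It says nothing about code in other libraries that runs during their own initialization.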

Relying on a constructor like this is what a lot of articles online get wrong. Sure, it seems to work for our simple test, but let's try a "real" application, like ssh. If you're following along on your own, note that ssh may or may not exhibit this behaviour on your system, depending on how it was compiled and how your system is set up.

$ LD_PRELOAD=$PWD/preload_test.so ssh
called fopen(/proc/filesystems, r)
Segmentation fault

Oops!

So, what's going on? It's clear that our setup had not been called by the time fopen was invoked, which means that real_fopen was still a null pointer when we tried to call through it. Basic stuff, but why? We can use valgrind to get a better idea of what's going on (some valgrind output omitted for brevity).

$ LD_PRELOAD=$PWD/preload_test.so valgrind --tool=memcheck ssh
called setup()
called setup()
called fopen(/proc/filesystems, r)
==2108== Jump to the invalid address stated on the next line
==2108==    at 0x0: ???
==2108==    by 0x5048B0D: selinuxfs_exists (in /lib/x86_64-linux-gnu/libselinux.so.1)
==2108==    by 0x5040D97: ??? (in /lib/x86_64-linux-gnu/libselinux.so.1)
==2108==    by 0x400F859: call_init.part.0 (dl-init.c:72)
==2108==    by 0x400F96A: call_init (dl-init.c:30)
==2108==    by 0x400F96A: _dl_init (dl-init.c:120)
==2108==    by 0x4000C59: ??? (in /lib/x86_64-linux-gnu/ld-2.24.so)
==2108==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==2108== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

This paints a clear picture of what's happening. ssh depends on libselinux, which defines its own constructor that tries to fopen /proc/filesystems. At this point, our setup has not yet been called by the dynamic linker, but fopen has already been resolved to ours. Since real_fopen is a global that has not been assigned yet, it is still null, so we end up calling through a null function pointer and segfault.

(Correctly) using LD_PRELOAD to hook fopen

With our investigation over, the fix is very simple: don't depend on a constructor to resolve libc's fopen; instead, resolve it on demand the first time it's needed.

#define _GNU_SOURCE

#include <stdio.h>
#include <dlfcn.h>

typedef FILE *(*fopen_t)(const char *pathname, const char *mode);
fopen_t real_fopen;

FILE *fopen(const char *pathname, const char *mode) {
  fprintf(stderr, "called fopen(%s, %s)\n", pathname, mode);
  if (!real_fopen) {
    real_fopen = dlsym(RTLD_NEXT, "fopen");
  }

  return real_fopen(pathname, mode);
}

__attribute__((constructor)) static void setup(void) {
  fprintf(stderr, "called setup()\n");
}

And now, after recompiling, we can see that it works as expected:

$ gcc -shared -fPIC preload_test.c -o preload_test.so -ldl
$ LD_PRELOAD=$PWD/preload_test.so ssh
called fopen(/proc/filesystems, r)
called fopen(/proc/mounts, r)
called setup()
called fopen(/etc/passwd, rme)
usage: ssh [-1246AaCfGgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
           [-D [bind_address:]port] [-E log_file] [-e escape_char]
           [-F configfile] [-I pkcs11] [-i identity_file]
           [-J [user@]host[:port]] [-L address] [-l login_name] [-m mac_spec]
           [-O ctl_cmd] [-o option] [-p port] [-Q query_option] [-R address]
           [-S ctl_path] [-W host:port] [-w local_tun[:remote_tun]]
           [user@]hostname [command]

libselinux's constructor opens /proc/filesystems and /proc/mounts before the linker passes control to our setup, and /etc/passwd is read as part of ssh's initialization procedures.
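One optional hardening step, which is my addition and not part of the hook above, is to handle the case where dlsym finds nothing, so that the hook fails the call instead of jumping to a null pointer. The sketch below assumes the same build setup as preload_test.c. The helper name resolve_real_fopen and the choice of ENOSYS as the error code are both hypothetical.

```c
#define _GNU_SOURCE

#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>

typedef FILE *(*fopen_t)(const char *pathname, const char *mode);

/* Hypothetical helper: look up the next fopen once and cache the result.
 * There is no locking here; a multithreaded program might prefer
 * pthread_once or an atomic store. */
static fopen_t resolve_real_fopen(void) {
  static fopen_t cached;
  if (!cached) {
    cached = (fopen_t)dlsym(RTLD_NEXT, "fopen");
  }
  return cached;
}

FILE *fopen(const char *pathname, const char *mode) {
  fopen_t real_fopen = resolve_real_fopen();
  if (!real_fopen) {
    /* No next definition was found: report failure to the caller. */
    errno = ENOSYS;
    return NULL;
  }
  return real_fopen(pathname, mode);
}
```

The logging is left out to keep the sketch short. It would go at the top of fopen, exactly as in the version above.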

Overall, this is a simple fix to a problem that might otherwise go undetected during testing, but I hope the analysis of what can go wrong when relying on constructors to execute in a particular order was entertaining to read.
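As a postscript to the earlier remark about open(2)'s libc wrapper: the same on-demand lookup can be applied there. What follows is a rough sketch assuming glibc on Linux, not a complete solution. The log format and the ENOSYS fallback are my own choices. It also only sees calls that reach the dynamic symbol open, so variants such as open64 and openat, as well as calls made internally by libc, may not pass through it.

```c
#define _GNU_SOURCE

#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>

typedef int (*open_t)(const char *pathname, int flags, ...);

/* Hypothetical hook for libc's open wrapper, using lazy resolution. */
int open(const char *pathname, int flags, ...) {
  static open_t real_open; /* resolved on first use, not in a constructor */
  int needs_mode = (flags & O_CREAT) != 0;
  mode_t mode = 0;

#ifdef O_TMPFILE
  /* O_TMPFILE also takes a mode argument. */
  if ((flags & O_TMPFILE) == O_TMPFILE) {
    needs_mode = 1;
  }
#endif

  /* The third argument is only read when the flags call for it. */
  if (needs_mode) {
    va_list ap;
    va_start(ap, flags);
    mode = (mode_t)va_arg(ap, int); /* read the variadic mode as an int */
    va_end(ap);
  }

  if (!real_open) {
    real_open = (open_t)dlsym(RTLD_NEXT, "open");
  }
  if (!real_open) {
    errno = ENOSYS;
    return -1;
  }

  fprintf(stderr, "called open(%s, %#x)\n", pathname, (unsigned)flags);
  return real_open(pathname, flags, mode);
}
```

The variadic mode argument is the main wrinkle compared to fopen: the hook has to decide from the flags whether a third argument was passed before reading it.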