Sometimes, the kernel lies about process memory usage

Jul 5, 2021 • 9 minutes

Here's a short systems debugging story.

On dmoj.ca, we run user-submitted solutions to algorithmic programming problems against a set of input files, and judge their output for correctness. One metric by which solutions are ranked on our leaderboards is memory usage. A user recently pointed out that some code they had submitted was reported as having consumed 4 KiB of memory, despite allocating a 128 KiB array. How come?

This is a story about how sometimes, the kernel lies about memory usage — all in the name of performance.

Continue reading...

Peeking under the hood of GCC's `__builtin_expect`

Mar 23, 2020 • 5 minutes

If you've ever poked at high-performance C code, you've probably seen GCC's __builtin_expect extension being used to manually hint the likelihood of a branch being taken a particular way.

The Linux kernel famously contains macros for likely and unlikely branches, which perform the appropriate __builtin_expect incantations.

#define unlikely(expr) __builtin_expect(!!(expr), 0)
#define likely(expr)   __builtin_expect(!!(expr), 1)

…but, how does this all work? What does "hinting" mean, exactly, and how does __builtin_expect translate to generated assembly?
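As a rough illustration of how such macros are typically used (a sketch only, not taken from the kernel or from the full post; `sum_checked` and its error convention are hypothetical), the rare error path can be marked `unlikely` so that the compiler treats the common path as the expected one:

```c
#include <stddef.h>

/* Same definitions as above; requires GCC or a compatible compiler such as Clang. */
#define unlikely(expr) __builtin_expect(!!(expr), 0)
#define likely(expr)   __builtin_expect(!!(expr), 1)

/*
 * Hypothetical helper: sums n non-negative values into *out.
 * Returns 0 on success, or -1 if a negative value is encountered.
 */
int sum_checked(const int *values, size_t n, long *out)
{
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        /* Hint: invalid input is expected to be rare. */
        if (unlikely(values[i] < 0))
            return -1;
        total += values[i];
    }

    *out = total;
    return 0;
}
```

The hint only influences how the compiler arranges the generated code, not what the code computes: `sum_checked` returns the same results with or without the macros.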

Continue reading...

On online judging, part 5: optimizing `ptrace` filtering with `seccomp`

Jan 4, 2019 • 8 minutes

In part 1 of this series, I mentioned that the overhead of a pure ptrace-based sandbox is about 10%. In hindsight, this number is very optimistic — it can be as high as 50% for some workloads — but understanding why requires a bit of background on how the judge keeps track of submission time.

In this post, we'll discuss both submission time-keeping, and a simple but effective method to reduce sandboxing overhead using seccomp alongside ptrace.

Continue reading...

Emulating microprocessors with macros

Dec 11, 2018 • 7 minutes

Whenever I work on an emulator (having written several in the past), I try to make my life as interesting as possible. After all, implementing hundreds of opcodes can be a very dull task.

Most recently, I joked that C macros were powerful enough for it to be feasible to implement a simple architecture in them. One thing led to another, with the result being an Intel 8080 emulator core implemented purely with C macros.

In this post, I'll go over the awful hacks that helped make this monstrosity a reality… and why perhaps it's not such a bad idea to write an emulator in macros.

Continue reading...

Correct usage of `LD_PRELOAD` for hooking `libc` functions

Nov 18, 2018 • 7 minutes

LD_PRELOAD is a very powerful feature supported by the dynamic linker on most Unixes that allows shared libraries to be loaded before others (including libc). This makes it very useful for hooking libc functions to observe or modify the behaviour of third-party applications whose source you do not control.

Unfortunately, a lot of what's been written on the subject online is subtly wrong: not wrong enough to fail outright, but just wrong enough to bite you when you least expect it. In this post I'll first go over the incorrect approach often described, analyze why it's wrong, and then describe the easy fix.

Continue reading...

Low-latency static sites with Scaleway and Cloudflare

Sep 3, 2018 • 3 minutes

For a while now, I'd been searching for a cheap but reliable hosting solution for this website.

The option of hosting with GitHub Pages and similar services exists and has a minimal barrier to entry, but I like to be in control of my servers, so that I can occasionally use them for tasks other than hosting. For instance, the machine serving this page both runs a Tor relay and acts as a backup for my large but non-sensitive files.

Now, I think I've found a good solution: a €2.99/mo Scaleway plan coupled with Cloudflare for fast page load times worldwide.

Continue reading...

Mining for Tor v3 onions in the cloud

Mar 22, 2018 • 29 minutes

Tor has supported a new hidden service protocol since v0.3.2.1-alpha, released back in October 2017, and the protocol is now in stable branches. Dubbed the "v3" onion service protocol, among other changes, it replaces SHA1/DH/RSA1024 with SHA3/ed25519/curve25519 for much improved cryptographic security.

I already had a v2 onion site up at tbrindus6tjv6wpi.onion, so I thought it would be an interesting exercise to mine a v3 vanity domain prefixed with tbrindus. For this, I set up 15 servers to mine for a matching prefix — more on this below!

It took well over a week of mining, but as of today, this site can also be accessed through the v3 hidden service tbrindusxnnqwmzov5qof56hyion6usmciqwykffxqsawswhk73aq5yd.onion!

Continue reading...

Setting up an SSTP VPN on Windows Server with LetsEncrypt

Mar 8, 2018 • 5 minutes

Setting up a VPN on Windows Server for remote access to company resources comes up often enough, and a great deal has been written on the subject online.

However, back when I first went through the whole process, I found it time-consuming to sift through all the outdated information floating around, so I created this document for personal reference. I've had the opportunity to test these instructions on a number of fresh installs, and worked out a bunch of kinks that way.

These instructions assume a brand new install of Windows Server 2016, but they should be easily adaptable to other scenarios.

Continue reading...

Blazing-fast Java2D rendering

Oct 18, 2017 • 8 minutes

Anyone who has ever attempted to draw anything more than almost-static scenes with Java2D can attest that it sluggishly chugs along. Some will even say it's unusable for repainting at 60Hz or higher without taking a toll on the CPU.

Today, we'll look at what we can do to speed up rendering, in ways that (at the time of writing) I have not seen discussed anywhere online. Probably because it's a big hack.

Continue reading...

Java internals, or when `true != true`

Oct 14, 2017 • 4 minutes

Most programmers have heard jokes about inserting a Greek question mark (;, U+037E) into Java code in place of a semicolon to cause "inexplicable" compilation errors.

But, it's too easy to discover. What about something that manifests itself at runtime, but when inspected — either by printing to stdout or through a debugger — shows nothing amiss?

Continue reading...

On online judging, part 4: a Java-specific sandbox

Oct 8, 2017 • 5 minutes

If you've been following along this far in our creation of a sandbox for an online judging system, you may recall this requirement we set out with:

The sandbox must be easy to expand to support more runtimes. Language-specific sandboxes are unacceptable, simply due to the effort of maintaining them.

While this is true in the general case, exceptions do have to be made for popular runtimes that need a little extra flexibility in judging — which brings us to today's topic, a Java-specific sandbox.

Continue reading...

On online judging, part 3: the macOS sandbox

Oct 7, 2017 • 1 minute

In a previous installment, we talked about creating a Linux sandbox using the ptrace(2) system call. Today, we'll look at setting up a similar sandbox for macOS.

Continue reading...

On online judging, part 2: the Windows sandbox

Oct 6, 2017 • 5 minutes

In the last chapter of this topic, we talked about creating a sandbox for Linux-based systems.

On Linux, we have fine-grained control over untrusted code by intercepting all system calls via the ptrace(2) API. Since a similar API does not exist on Windows, alternative methods must be used.

Continue reading...

On online judging, part 1: the Linux sandbox

Oct 4, 2017 • 7 minutes

I've been operating a semi-large online judge for the better part of 4 years at the time of writing. People sometimes ask me how it works, so I figured it'd be a worthwhile task to document, in general, what goes into writing an online judge. So let's start with the basics: you need a sandbox to run user code in. Let's focus on Linux for now, since it's a significantly easier platform to target.

Continue reading...

Ricing the Windows Subsystem for Linux

Aug 5, 2017 • 3 minutes

One of the more exciting pieces of tech coming out of Microsoft these days is the Windows Subsystem for Linux (also sometimes referred to as "Bash on Ubuntu on Windows"). If you haven't heard of it yet, you should definitely check it out.

Today, we'll talk about setting up an aesthetically pleasing graphical development environment within it.

Continue reading...

Modern security on IIS

Jul 15, 2017 • 13 minutes

There are a lot of resources on IIS website security out there, and they generally tend to fall into three categories:

The Outdated
Hacks for IIS 6 aren't (usually) relevant when better solutions exist for IIS 10, but still appear higher in search rankings.
The Overly General
Yes, keeping your antivirus definitions up-to-date is key, but you should be doing that regardless of whether you're using IIS or something else. Many tips about IIS security are really about OS security, so they're not very useful here.
The Good
There are nonetheless many good resources on IIS security that are still relevant nowadays, and where appropriate I've linked them in this post.

I've written this resource in an attempt to distill large chunks of relevant information into a single document, with enough info to configure IIS, and links towards further reading on other sites.

Continue reading...