Monday, February 13, 2017

Notes on C++ Exception Handling and Stack Unwinding


Introduction

If you take a look at the LLVM documentation, you might be surprised to see that different processor architectures have their own exception handling ABI, and that there's an entire section on how things work on Windows. Why would processors care about C++ exceptions, and why would things work differently on Windows - what is it to the OS what a C++ program does in user space (as opposed to kernel space)?

There are interesting answers to these questions, but first let's take a closer look at the C++ exceptions.

Exceptions Demystified

C++ exceptions give the look and feel of magic. Based on throw and catch statements the control flow seems to jump over functions, doing proper cleanup and knowing when to stop. When you look under the hood it is no surprise that it feels like magic - especially the catching of an exception is really complicated. Breaking things down and making them more concrete helps us better understand things so let's try to break the life of an exception into more concrete albeit high level steps.

Everything starts with the compiler's transformation of the throw and catch keywords. A throw is replaced with a pair of __cxa_allocate_exception and __cxa_throw calls, and a catch is replaced with a pair of __cxa_begin_catch and __cxa_end_catch calls. These are functions in the C++ runtime. So at least we know that those keywords are just function calls under the hood. Now let's look at the high level flow of events when an exception is thrown during program execution:

  1. __cxa_allocate_exception allocates the exception to be thrown.
  2. __cxa_throw starts the chain of events involved in throwing and handling the exception.
    1. This call never returns. Either the exception is caught by some call frame on the stack and execution continues from that point, or no one catches it and std::terminate is called.
  3. _Unwind_RaiseException (called by __cxa_throw; it ships with the C++ runtime as part of the unwind library) walks back the function call frames on the stack one by one.
    1. To be able to walk back the stack, _Unwind_RaiseException uses the CFI (Call Frame Information) found in the .eh_frame section of the object/executable file (e.g. ELF).
  4. As _Unwind_RaiseException walks back the stack, it calls the personality function (__gxx_personality_v0 for GCC) for each frame. The personality function is a language specific function which
    1. Checks whether the current frame will be able to catch the exception.
    2. Does the proper cleanup as the current frame is unwound, like calling destructors for objects going out of scope.
  5. When the personality function finds a frame that can catch the exception, control is transferred to the corresponding landing pad, from where execution continues.
  6. If no call frame on the stack can catch the exception then std::terminate is called.

Unwinding the Stack - C++ ABI

In the previous section it might have caught your eye that we need some sort of metadata in the object file (e.g. ELF) to be able to find the previous stack frames. If you have looked at how the x86 function call stack works, this probably sounds weird: why can't I just follow the EBP register (which keeps the frame pointer) to find the previous stack frames?


Yes, on x86 processors you don't need any metadata to walk back the stack, because the frame pointers (EBP) are saved on the stack and practically form a linked list that you can traverse. However, x86 is an old architecture at this point, and newer architectures like x86-64, Itanium, and ARM don't necessarily save the frame pointer on the stack - they may not even have a dedicated register for the frame pointer. [1] The compiler generates code in such a way that it can always refer to local variables with varying offsets from the stack pointer (ESP on x86) [2].

This is where the C++ ABI and eh_frame/CFI (Call Frame Information) come in. The C++ ABIs for specific platforms (e.g. [3]) specify what kind of metadata needs to be generated by the compiler to enable stack unwinding. This metadata is used by
  1. Exception handling by the runtime
  2. Stack back trace calls in the program itself
  3. Debuggers
It's not surprising this ABI is usually specified per processor architecture - it directly relates to processor registers. 

You can stop your compiler from generating this metadata by using the -fno-unwind-tables flag, in which case I'd imagine exceptions won't work and you can't see the stack trace in a debugger.

Stack unwinding forms the language independent part of the exception handling. There's also a language specific part, which we'll look at next.

Personality Function and Call Frame Cleanup

So the unwinding process is able to walk back the stack without knowing anything about the programming language or the compiler - thanks to the ABI - which is great. But someone needs to handle two things:
  1. Understanding whether the current call frame (during the unwind process) can catch this exception
  2. Cleaning up the current frame, e.g. calling the destructors for the proper objects on the stack
These are very language specific, in fact compiler specific, actions. This is why the compiler provides something called a personality function, which is simply one function per compiler: __gxx_personality_v0 for GCC. This function is called by the unwinder [4] for each frame on the call stack and does the two things mentioned. To be able to do those, the personality function uses the LSDA (language specific data area), which is metadata generated for each function and placed at the end of the function's code. The LSDA contains information to determine what exceptions the function can catch [5] and, if it doesn't catch the exception, how to clean up this frame (e.g. destructors).

The cleanup metadata in the LSDA is called the exception table; GCC calls it gcc_except_table. The -fno-exceptions compiler flag tells the compiler not to generate this metadata. An exception thrown by such a program would result in a call to std::terminate.

Nothrow

After reading the last section it should be clear why a C++ program terminates when an exception escapes a nothrow (noexcept) function. For such a function the compiler wouldn't generate the exception table, which would make the cleanup of the call frames for that function impossible during unwinding. The runtime has no option but to terminate.

One Pass vs. Two Pass Exception Handling

The high level exception handling flow given in the second section is missing one aspect. Some runtimes decide not to do call frame cleanup (e.g. destructors) until they make sure someone will handle the exception. They take a first pass over the stack, trying to locate a frame that will catch the exception. If none is found, the runtime directly calls std::terminate without bothering with frame cleanups. If a frame catches the exception, the runtime does another pass over the stack, this time actually calling the personality function for frame cleanup. Apparently GCC works this way.

Exceptions in Windows

This is interesting. If you spent your life in the UNIX universe, you must be thinking: what the hell does an OS have to do with exceptions? You're right, UNIX systems don't have the notion of exceptions. They know about signals, and exceptions are entirely an application concept.

Windows is different - it has the concept of exceptions and it manages them closely. The architects of Windows made the interesting choice to unify hardware exceptions (e.g. page faults) with software exceptions. The OS defines what an exception looks like, how the exception handling frames should be set up, etc., and this is called Structured Exception Handling [6]. This has some cool implications: you can catch hardware exceptions like a page fault or divide by zero in your C++ catch block. You just need to compile with the SEH flag enabled. This is not possible on UNIX systems, where hardware exceptions are delivered as signals.

On Windows there's a language agnostic system API to throw exceptions - RaiseException. Thanks to the extensive OS support, Microsoft's C compiler also supports exceptions, through the __try and __except keywords.

First Chance / Second Chance Exceptions

If you have debugged a Windows application that throws and catches exceptions, you might have noticed something peculiar in the debugger. It prints messages about 'first chance exceptions' while the program executes. This is basically an OS provided debug port feature: the OS gives debuggers insight into the exceptions that are happening in the program while the program is running and the debugger is not broken into. A first chance exception means an exception has been thrown but the stack unwind hasn't happened yet, so the program code might still catch the exception. A second chance exception means the stack unwind has finished but no call frame has caught the exception, so the program is about to crash [7].

As far as I know this is a pretty unique Windows feature and it is possible because the OS is involved in the exception throwing/handling process.


Notes

[1] Some ARM compilers seem to reserve a register as the frame pointer, but this is more of a configuration/implementation detail for that given compiler - unlike the universal use of the EBP register on x86. So runtimes and tools still need to rely on eh_frame and CFI to be able to work for all compilers on the same ABI.
[2] "In common with the IA-64 runtime architecture, the ARM Exception ABI specifies separate, per-function unwinding tables indexed by program counter."
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0038b/IHI0038B_ehabi.pdf
[3] http://infocenter.arm.com/help/topic/com.arm.doc.ihi0038b/IHI0038B_ehabi.pdf
[4] The unwinding runtime code knows where to find the personality function thanks to the CFI (Call Frame Information).
[5] Obviously this depends on where the program counter (PC) was when that frame made the call to another function.
[6] Apparently this was later extended with Vectored Exception Handling, but the OS is still a key player in exception throwing and handling.
[7] https://support.microsoft.com/en-us/help/105675/first-and-second-chance-exception-handling


