Part 2: Reverse Engineering and Patching with Ghidra
This content is provided "as is" and is more than a year old. No representations are made that the content is up to date or error-free.
In the first installment of our three-part blog series, we learned how to root the Flashforge Finder 3D printer and acquire its firmware. In this post, we will delve into reverse engineering and patching that firmware using Ghidra, the new open source tool from the NSA, which rivals expensive competitors such as IDA Pro in value and ease of use.
Installation on Windows
- Go to Ghidra's site and click “Download Ghidra” in the middle of the page
- Extract the folder from the downloaded zip file to somewhere on your computer
- Navigate to the extracted folder, which as of this writing was named ghidra_9.0.2
- This folder contains two launchers, one named ghidraRun.bat and one named ghidraRun. Since we are installing on Windows, we will double-click ghidraRun.bat
- If a command prompt opens that starts with the message, “Java runtime not found,” you will need to install Java’s JDK and add it to your local path:
  - Go to Oracle's site and download the latest version of the JDK for your computer’s architecture
  - Install the downloaded file
  - Add the Java bin folder to your local user’s PATH environment variable
    - By default the JDK installs itself to C:\Program Files\Java\jdk-12.0.1\, making the bin path C:\Program Files\Java\jdk-12.0.1\bin
    - In your Windows search bar, type “path”
    - Click the best match result, “Edit the system environment variables”
    - Inside the box that says “User variables for <username>”, select Path, then click the Edit button
    - Click the New button on the right side of the window and type or paste in the JDK bin path, which would be C:\Program Files\Java\jdk-12.0.1\bin if you installed the 64-bit JDK to the default path
    - Click OK
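To confirm the PATH change took effect before launching ghidraRun.bat again, a quick check along the following lines can help. This is a minimal sketch that is not part of the original walkthrough; the helper names find_java and java_version are made up for illustration. Run it from a newly opened command prompt, since prompts that were already open keep the old PATH.

```python
import shutil
import subprocess


def find_java():
    """Return the full path of the java launcher found on PATH, or None."""
    return shutil.which("java")


def java_version(java_path):
    """Return the first line reported by `java -version` (it writes to stderr)."""
    result = subprocess.run([java_path, "-version"], capture_output=True, text=True)
    output = result.stderr or result.stdout
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    path = find_java()
    if path is None:
        print("java was not found on PATH; recheck the JDK bin entry")
    else:
        print(path)
        print(java_version(path))
```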
Creating the Project
Go to File > New project > Non-shared project, then give it a project name. Now you’re ready to import a file for disassembly. Click on your project directory that was just created in the Ghidra window, then click the green dragon head right above it. This will open a Code Browser window. Hit File > Import File and select the finder_plus.hex firmware we identified in the previous post. You should be at a screen similar to below:
There are two fields that need to be adjusted. First, since we’re importing a hex file and not a binary, change the Format field to “Intel Hex.” Second is the Language field, which is harder. We know the main board in the printer is ARMv5t little endian thanks to flashforge_init.sh. But this hex file is flashing a microcontroller on the main board and we don’t know its architecture. There are two ways of figuring this out:
- Open up the printer and try to guess which chip on the board is the chip we’re flashing
- By trial and error
One way to do this is to load the file under the architectures that seem most likely, then see how many functions Ghidra is able to sniff out and how many errors it reports in the decompiled C code. The more functions and the fewer errors, the more likely the architecture is accurate. Since we know the main board is ARMv5 little endian, we’ll start with that. We select ARM v5 little endian in the Language field and Ghidra asks us if we want to analyze the file. We select “yes” and keep all the analysis options at their defaults, as they’re sane values. Ghidra analyzes the file and the disassembled functions appear on the left side of the Code Browser window.
Although Ghidra found a decent number of functions with this setting, the C code on the right side of the window contained a lot of errors, such as the one seen below:
After loading and reloading finder_plus.hex into Ghidra and trying various architectures, we found that ARM Cortex seemed to work the best. We confirmed this later when we opened up the printer and identified the correct chip: a quick search of the chip name showed it was an ARM Cortex part.
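A cheap sanity check can complement the trial and error. Cortex-M images normally begin with a vector table whose first word is the initial stack pointer and whose second word is the reset handler address with its lowest bit set to mark Thumb code. The sketch below is not the method used in this post. It parses Intel HEX records and applies that heuristic. The SAMPLE data is invented for illustration and is not taken from finder_plus.hex, and the function names are hypothetical.

```python
import struct


def parse_intel_hex(text):
    """Return {absolute_address: byte} for the data records of an Intel HEX string."""
    memory = {}
    base = 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError("not an Intel HEX record: %r" % line)
        raw = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, sum to 0 modulo 256.
        if sum(raw) & 0xFF != 0:
            raise ValueError("bad checksum: %r" % line)
        count, offset, rtype = raw[0], (raw[1] << 8) | raw[2], raw[3]
        data = raw[4:4 + count]
        if rtype == 0x00:    # data record
            for i, value in enumerate(data):
                memory[base + offset + i] = value
        elif rtype == 0x01:  # end of file
            break
        elif rtype == 0x02:  # extended segment address
            base = ((data[0] << 8) | data[1]) << 4
        elif rtype == 0x04:  # extended linear address (upper 16 bits)
            base = ((data[0] << 8) | data[1]) << 16
        # Record types 03 and 05 only carry start addresses, so they are skipped.
    return memory


def looks_like_cortex_m(memory):
    """Heuristic: word 0 is an aligned stack pointer, word 1 an odd in-image address."""
    if not memory:
        return False
    low, high = min(memory), max(memory)
    try:
        table = bytes(memory[low + i] for i in range(8))
    except KeyError:
        return False
    initial_sp, reset = struct.unpack("<II", table)
    return initial_sp % 4 == 0 and reset & 1 == 1 and low <= (reset & ~1) <= high


# Invented example: image based at 0x08000000, SP = 0x20005000, reset = 0x08000009.
SAMPLE = """\
:020000040800F2
:0C000000005000200900000800BF00BFF5
:00000001FF
"""

if __name__ == "__main__":
    image = parse_intel_hex(SAMPLE)
    print(hex(min(image)), len(image), looks_like_cortex_m(image))
```

A passing check only suggests a Cortex-M style layout; it does not replace confirming the chip on the board.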
Ghidra Interface
Now we can get to work on reverse engineering. Let’s start by getting familiar with the parts of the Ghidra interface that are relevant to our interest in forcing the printer to never stop heating up. The Code Browser window is likely where you’ll be spending most of your time:
In the blue box on the left we have the Symbol Tree. The most important part of this section is the Functions list, where we can identify and jump to the various functions Ghidra was able to sniff out. The green box in the center shows the disassembled instructions, and the yellow box on the right shows the decompiled C code.
Variables and functions can be renamed by right-clicking them inside the yellow box above and selecting the Rename option. By renaming functions and variables as we begin to understand them, we slowly pull back the cloak of obfuscation inherent to a decompiled binary or hex file. The initial name we give a function or variable often won’t be accurate, but simply naming functions after what they appear to do, such as “related_to_temperature?”, dramatically helps our high-level understanding of the file. When unsure that a label is 100% accurate, it is good practice to append a marker such as a question mark.
JMPing Off Point
When beginning a reverse engineering project, the reverse engineer needs a starting point. There are a few options:
- Search the binary for strings and start at a recognized string
- Start at the “entry” function and look for the main function
- Start at a known variable
For most projects, that will be identifying the main function. The reverse engineer will start at the function labeled “entry” and step through the functions the entry function calls looking for a function that calls a lot of other code or enters a permanent while loop. The entry function is simply the first function called by the firmware and isn’t usually more than a few steps away from the main function of the program.
We click this function then start inspecting the decompiled C on the right side of the Code Browser window.
Now we simply double click any of the FUN_0800xxxx functions and check them out. FUN_08006d58 immediately stood out.
This function contained a while loop that called several other functions. Each of those functions called many more functions meaning this is the meat of the firmware. We can label this function “main?” and continue our analysis.
However, there is an easier way. Click File > Export Program and choose C/C++ as the format. This will export all the decompiled C code to a file. Now simply open the file and search for the temperature value. If looking for a number, as we were, try searching for both the decimal value, 240, and its hex representation, 0xf0.
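The search through the exported file can also be scripted. The sketch below is an illustration, not the tooling used in this post. It matches a value written either as a decimal or as a hex literal while skipping longer numbers that merely contain the same digits. The SAMPLE snippet and its FUN_/DAT_ names are made up to resemble decompiler output.

```python
import re


def constant_pattern(value):
    """Build a regex matching `value` as a standalone decimal or hex literal."""
    decimal = str(value)
    hexadecimal = format(value, "x")
    return re.compile(
        r"(?<![0-9A-Za-z_])(?:%s|0x0*%s)(?![0-9A-Za-z_])" % (decimal, hexadecimal),
        re.IGNORECASE,
    )


def find_constant(source, value):
    """Return (line_number, stripped_line) for every line mentioning the value."""
    pattern = constant_pattern(value)
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), 1)
        if pattern.search(line)
    ]


# Hypothetical excerpt standing in for the exported C file.
SAMPLE = """\
void FUN_08001234(void)
{
  if (0xf0 < DAT_20000100) {
    DAT_20000100 = 240;
  }
  uVar1 = 0x240;
  uVar2 = 1240;
}
"""

if __name__ == "__main__":
    for number, line in find_constant(SAMPLE, 240):
        print(number, line)
```

To run it against a real export, read the file into a string and pass it as source. Expect false positives, since 240 can appear for reasons unrelated to temperature.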
Bingo! Let’s go ahead and change that to… the temperature of the sun sounds fun.
Patching with Ghidra
In our experiment, we now had the max temperature variable, so we figured we’d patch that and be done. This turned out not to be the case, but let’s see how to go about it anyway. Patching binaries is easy in Ghidra. The reverse engineer only needs to right-click the assembly line they wish to change and click Patch Instruction.
Then just change the hex value to the desired value. Once the patches are done, go to File > Export Program, select the Intel Hex format, and the edited firmware file will be saved to the chosen location. However, a complication quickly arose. This instruction can only take a value up to 255 (0xff in hex), since the mov instruction’s immediate is a single byte and 255 is the largest unsigned value one byte can hold. There are two options to work around this: find a code cave or increase the size of the firmware.
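The one-byte limit is visible in the instruction encoding itself. The sketch below assumes the instruction is the 16-bit Thumb form MOVS Rd, #imm8. That is an assumption based on the ARM Cortex target, since the post does not show the exact encoding. In that form the upper byte holds the opcode and register, and the lower byte holds the immediate.

```python
def encode_thumb_movs_imm8(rd, value):
    """Encode the 16-bit Thumb instruction MOVS Rd, #imm8 as little-endian bytes."""
    if not 0 <= rd <= 7:
        raise ValueError("this encoding only reaches registers r0-r7")
    if not 0 <= value <= 0xFF:
        raise ValueError("immediate %d does not fit in 8 bits" % value)
    # Bit layout: 00100 ddd iiiiiiii (opcode, destination register, immediate).
    halfword = 0x2000 | (rd << 8) | value
    return halfword.to_bytes(2, "little")


if __name__ == "__main__":
    print(encode_thumb_movs_imm8(0, 240).hex())  # f020, i.e. halfword 0x20f0
    try:
        encode_thumb_movs_imm8(0, 300)
    except ValueError as error:
        print(error)
```

Anything above 255 needs a different instruction or a value loaded from memory, and either option takes more bytes than the original instruction occupies.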
The Mystery of Code Caves
A cave is a region of unused space in the target binary. Typically, you will find some extra space at the end of a section. You might get lucky and find some space between functions. Why these caves are in a given location can be a real mystery. In the FlashForge Finder’s firmware, the compiler and linker are probably aware of the storage constraints typical of embedded systems and have created a very small binary. Here we have a picture of Ghidra showing the data at the end of a section, full of potentially important junk.
These are not null bytes, meaning they are not empty space. Since we don’t know the importance of this data yet, it’s best not to touch anything. Unfortunately, an easy code cave isn’t an option, so we instead move to the second choice: increase the size of the firmware. We know we can do this, because we looked at the chip datasheets and have MEGABYTES of extra space!
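Hunting for caves by eye gets tedious on larger images, so a scan for long runs of a single filler byte can narrow the search. The sketch below is illustrative and not the authors' tooling. The function name, the minimum length, and the choice of filler bytes are assumptions; 0x00 and 0xFF are used because padding is often zeroed and erased flash commonly reads as 0xFF.

```python
def find_caves(image, min_len=64, fillers=(0x00, 0xFF)):
    """Return (offset, length, fill_byte) for each run of one filler byte >= min_len."""
    caves = []
    index = 0
    size = len(image)
    while index < size:
        fill = image[index]
        if fill in fillers:
            end = index
            while end < size and image[end] == fill:
                end += 1
            if end - index >= min_len:
                caves.append((index, end - index, fill))
            index = end
        else:
            index += 1
    return caves


if __name__ == "__main__":
    # Made-up image: two code bytes, a 100-byte 0xFF run, then a short zero run.
    demo = b"\x01\x02" + b"\xff" * 100 + b"\x03" + b"\x00" * 10
    print(find_caves(demo))
```

A run of filler bytes is only a candidate. As noted above, the bytes may still matter to the firmware, so check for references before overwriting them.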
Increasing the Size of the Firmware
First we press the “Display Memory Map” button.
Then we hit the “+” button and configure the details in the dialog. We specify that the start address is the first byte after the previous “ram” section’s end address. Of course, we request 0x1337 bytes of space. After we are done, we can shrink this back down if we remember.
Once we hit OK we can look at our new section.
Great, we have space now!
Now we can redirect execution here from where we want to start changing logic, run the code and redirect execution back. We will have to be mindful to preserve important registers and the stack.
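For reference, here is roughly what the redirect looks like at the byte level. The sketch assumes a Thumb-2 capable core and encodes the 32-bit B.W (or BL) instruction, which reaches targets within about 16 MB of the patch site. The addresses in the demo are hypothetical. A cave farther away than that would need a register-indirect jump instead, which this sketch does not cover.

```python
def encode_thumb2_branch(source, target, link=False):
    """Encode a 32-bit Thumb-2 B.W (or BL when link=True) located at `source`."""
    # In Thumb state the PC reads as the instruction address plus 4.
    offset = target - (source + 4)
    if offset % 2:
        raise ValueError("branch target must be halfword aligned")
    if not -(1 << 24) <= offset < (1 << 24):
        raise ValueError("target is out of range for a single branch")
    s = (offset >> 24) & 1
    i1 = (offset >> 23) & 1
    i2 = (offset >> 22) & 1
    imm10 = (offset >> 12) & 0x3FF
    imm11 = (offset >> 1) & 0x7FF
    # The encoding stores J1 = NOT(I1 XOR S) and J2 = NOT(I2 XOR S).
    j1 = (~(i1 ^ s)) & 1
    j2 = (~(i2 ^ s)) & 1
    first = 0xF000 | (s << 10) | imm10
    second = (0xD000 if link else 0x9000) | (j1 << 13) | (j2 << 11) | imm11
    return first.to_bytes(2, "little") + second.to_bytes(2, "little")


if __name__ == "__main__":
    patch_site = 0x08001000  # hypothetical address of the instruction to replace
    cave = 0x08020000        # hypothetical address of the added code
    print(encode_thumb2_branch(patch_site, cave).hex())
```

The result is four bytes long, so the patch can overwrite more than one original instruction.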
But wait! What if the unconditional jump itself has to overwrite bytes? Those far jump instructions can be pretty large. Code stealing is the answer! Yes, ladies and gentlemen, we will steal instructions and hide them in the code cave we created. When we branch to our cave, those stolen instructions will be run in the epilogue of our evil function before returning to the scene of the crime.
Here is an example:
The above code is invoked when the printer receives a G40 command. For some inexplicable reason, the function we renamed to “fun_gcode40_crash_thing?” is called and completely crashes the printer – and then it prints out a message saying “ok.” Not ok if you ask us.
Let’s modify this function so that instead it more sensibly prints out the word: “potato” and then calls crash thing to emulate one.
Now we are jumping to the segment we added earlier. We have also NOP’d out a bunch of the original code which just means it no longer performs any actions.
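NOP'ing out a range amounts to overwriting it with an instruction that does nothing. As a minimal sketch, assuming Thumb code where the 16-bit NOP is the halfword 0xBF00, the helper below fills an even-sized range of a firmware image with NOPs. The name nop_out and the offsets are illustrative. Ghidra's Patch Instruction achieves the same thing interactively.

```python
THUMB_NOP = bytes.fromhex("00bf")  # halfword 0xBF00 stored little-endian


def nop_out(image, start, end):
    """Return a copy of `image` with bytes [start, end) replaced by Thumb NOPs."""
    if start % 2 or end % 2:
        raise ValueError("Thumb instructions are halfword aligned")
    if not 0 <= start <= end <= len(image):
        raise ValueError("range lies outside the image")
    patched = bytearray(image)
    patched[start:end] = THUMB_NOP * ((end - start) // 2)
    return bytes(patched)


if __name__ == "__main__":
    original = bytes(range(8))  # stand-in bytes, not real firmware
    print(nop_out(original, 2, 6).hex())
```

The range should begin and end on instruction boundaries; cutting a 4-byte instruction in half leaves a stray halfword that will decode as something unintended.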
Let’s take a look at hax_code!
We are loading up r0 with the address to our nifty potato string, and then calling puts_serial. Once we have informed the user of our intent, we turn the printer into a potato until it is reset. Now that is some added value! As long as we take care to preserve the registers for the rest of the execution, and are mindful of the stack, we can code up anything we want!
Now that we have created a code cave by increasing the firmware size, we simply redirect the code that sets the max temperature, which was constrained to a one-byte value (at most 255), to this new, larger area, where we can use a much higher number to max out the temperature of the printer.
Excitedly, we turned on the printer after making this adjustment and watched with bated breath as the temperature rose to 243°C, then 244°C… until it hit about 260°C and cooled down. There was clearly some kind of thermal runaway code as a protection against errors. Tracking down this code was much trickier than finding the max temperature variable. We again started in the function that housed the temperature variable and continued renaming variables and functions until we thought we saw something to test. We would patch one value, reflash the firmware, and do it again day after day. Eventually we found the culprit.
In the third and last blog post, we will review the final safety mechanism in place on the printer’s firmware and get to see a demonstration of what happens when you do this to a live FlashForge Finder 3D printer. Stay tuned!