Solving the Unredacter Challenge
Overview
Serious security researchers are constantly monitoring industry happenings for interesting technical research in our field. An article published by Bishop Fox in February 2022 entitled "Never, Ever, Ever Use Pixelation for Redacting Text" by Dan Petro was one such article.
Shortly after the article was posted, I challenged some of my team of security consultants to study the author's research and think about their own use of redaction tools and techniques. Penetration testing and application assessment engagements can discover vulnerabilities that expose sensitive data. Redaction is often necessary to obfuscate sensitive information captured in proof-of-exploitation screenshots we embed in client deliverables. Ultimately, it's in everyone's best interests for Optiv to redact this information in a reliable and irreversible manner.
Fast-forward a few months, and Bishop Fox announced a CTF-style challenge related to their earlier research.
The Unredacter Challenge
The CTF was straightforward enough - crack the redacted text in the image below. The challenge ran from March 31, 2022, at 8 a.m. ET through August 14, 2022, at 11:59 a.m. ET. The first five successful submitters would receive recognition and a chance at a grand prize. Winners would be announced at the close of DEFCON 30 in August.
Figure 1: Source image of the challenge
My Approach to the Solution
My first step was to consider what clues the challenge image held for solving the puzzle. From my CTF experience over the years, I've learned to begin by thinking objectively about what can be observed and then jotting those observations down. I do this even before laying out an attack strategy. This helps me avoid making too many false assumptions (I still make some), which can result in many wasted hours of frustration. One approach I like is the OODA loop:
Observe - relax, see/listen for the facts of the situation / environment
Orient - look around, put observations in context of objectives/goals
Decide - pick a direction and communicate it out
Act - take ownership of our decisions/solutions, be default aggressive, and follow through on the outcome
With this in mind, I asked - what did the image tell me? Well, there is the obvious unredacted taunt. This actually provides the first potential clue - the font of the blurred text could be the same font as the unredacted text. The font size, weight, and other attributes may be slightly different, but I felt this was at least a good bet to move forward from.
For reversing pixelated or blurred text in an image, it is critical to estimate the source font as accurately as possible if there's any hope of leveraging a brute-force tool like Unredacter or Depix. So my next step was to consider how I could validate the source font. I could approximate the typeface, including weight and size, apply blurring effects to some sample text using an image editor, then visually compare the results to the challenge image.
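The render, blur, and compare check described above can also be scripted. The following is a minimal sketch using the Pillow library, not the workflow actually used here (which was manual, in an image editor). The function names, canvas size, and blur radius are illustrative assumptions, and the demo compares synthetic renderings with Pillow's built-in default font; a real attempt would load a cropped challenge image and a candidate typeface with `ImageFont.truetype`.

```python
from PIL import Image, ImageChops, ImageDraw, ImageFilter, ImageFont, ImageStat


def render_blurred(text, font, size=(400, 60), blur_radius=4.0):
    """Render black text on a white canvas, then apply a Gaussian blur."""
    img = Image.new("L", size, color=255)
    ImageDraw.Draw(img).text((10, 10), text, font=font, fill=0)
    return img.filter(ImageFilter.GaussianBlur(radius=blur_radius))


def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference of two same-size grayscale images."""
    return ImageStat.Stat(ImageChops.difference(a, b)).mean[0]


if __name__ == "__main__":
    # Assumption: the default font stands in for a candidate typeface.
    font = ImageFont.load_default()
    # Assumption: a synthetic target stands in for the cropped challenge image.
    target = render_blurred("45-3456", font)
    for guess in ["45-3456", "45-3466", "12-0000"]:
        score = mean_abs_diff(target, render_blurred(guess, font))
        print(guess, round(score, 3))
```

A lower score means the candidate rendering is closer to the target. The same loop could iterate over fonts, sizes, or blur radii instead of candidate strings.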
I had to assume Bishop Fox wasn't so cruel as to leverage a non-standard system font in this challenge. And honestly, I just had to get close enough. So I fired up various text editors on Mac and Windows, typed out the top-line text, and manually cycled through the available sans-serif fonts in quick succession. Through this process, I watched for typographic characteristics to match, especially in the apostrophe, "g" and "y" characters. I tried TextEdit, Microsoft Word, and Notepad, and determined that TextEdit made it easiest to step through the choices one by one and narrow down the possibilities. I landed on several close options: Arial, Lucida Sans, Verdana, and Yu Gothic UI. For the record, I never obtained a perfect match, but I selected Yu Gothic UI Regular 26 to move forward with.
My next main observation was that the effect applied to the obfuscated text was clearly not pixelation, but a blur. Thus, the Unredacter tool created by Bishop Fox would be of little help, at least in its current, unmodified state. My hypothesis was that one of the more popular blur effects, like Gaussian Blur, was used.
At this point, I considered overhauling the Unredacter or Depix tools to work with a blurred source. Based on my team's research experience with recreating the JumpSec challenge, I knew it would take considerable time to work through pixelation offset, block size, and other considerations to get the tool to function at all. This also raised questions about how Gaussian blur diffusion compared to pixelation diffusion, and what adjustments would be necessary. I also wondered whether there was a technique to reverse an "unreversible" blur filter.
I paused, took a breath, and considered other options.
My recent research in AI-based pattern matching led me to discover a few AI-based tools that could possibly reverse blurred text content. One tool was Image Upscaler. Their website offers a limited-use online deblurring service (https://imageupscaler.com/deblurring/). Keeping my expectations low, I uploaded a cropped version of the challenge image and downloaded the results.
Figure 2: Image processed with ImageUpscaler.com tools
Not terrible, but also not quite what I was looking for. While I wanted to explore this particular path more, I decided to pivot and try one of my trusty image editor tools called GIMP. This tool has come in handy on several CTFs in the past, so I thought, why not?
Using GIMP, I applied the Sharpen filter (obviously) with various values for Radius, Amount, and Threshold until it yielded decent font shapes. Most of the phrase was clear enough, at least through neighbor character inference, to obtain some of the plaintext.
Figure 3: Image sharpened through GIMP tools
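For readers who prefer scripting the parameter sweep, here is a rough sketch using Pillow's `ImageFilter.UnsharpMask`, which takes a radius, a percent (strength), and a threshold. This is an analogous filter and not GIMP's implementation, so the values will not map one-to-one onto GIMP's Radius, Amount, and Threshold. The helper names and the synthetic test image are assumptions for illustration only.

```python
from PIL import Image, ImageDraw, ImageFilter


def make_blurred_sample():
    """Synthetic stand-in for a blurred crop: a dark bar blurred on white."""
    img = Image.new("L", (200, 60), color=255)
    ImageDraw.Draw(img).rectangle((40, 20, 160, 40), fill=0)
    return img.filter(ImageFilter.GaussianBlur(radius=3))


def sharpen_variants(blurred, settings):
    """Return {(radius, percent, threshold): sharpened image} per setting."""
    results = {}
    for radius, percent, threshold in settings:
        unsharp = ImageFilter.UnsharpMask(
            radius=radius, percent=percent, threshold=threshold
        )
        results[(radius, percent, threshold)] = blurred.filter(unsharp)
    return results


if __name__ == "__main__":
    sample = make_blurred_sample()
    # Hypothetical settings to sweep; tune by eye as in the manual approach.
    sweep = [(2, 150, 0), (4, 300, 3), (8, 500, 3)]
    for key, image in sharpen_variants(sample, sweep).items():
        print(key, image.getextrema())
```

Each variant could be saved with `image.save(...)` and inspected side by side to pick the setting that gives the most legible letter shapes.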
At this point I had enough starting characters to run "45-3456-w-3453" through the Gaussian filter in GIMP, using my font mentioned above, to compare the results, and they were actually pretty close. I was fairly confident about the first half of the plaintext at this point.
The remaining portion of the flag proved more problematic. I leveraged Google to search for candidate terms for what I thought the next word might represent. Through trial and error, I discovered a candidate proper noun - "Transnet". After running it through blur/pixel filters in GIMP, this appeared to be a reasonable guess-word. And the last blurred character was either a "3" or "J". That left just one more blurred word to crack.
This final puzzle piece took me the longest to reverse. I ended up researching algorithms used to reverse Gaussian blurs, such as the Richardson-Lucy deconvolution. Since GIMP lacked this particular effect, I turned to an online alternative called G'MICol. This site offers various filter options, including a Richardson-Lucy deconvolution, Low-variance normalization, and many others. I found moderate success using high amplitude values and moderate levels of threshold and iterations with these filters.
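To illustrate what a Richardson-Lucy deconvolution does, here is a small NumPy sketch of the standard iteration. It is not the G'MICol implementation, and it makes simplifying assumptions: the blur is a Gaussian whose sigma is known, the image is noise-free, and convolution wraps around at the edges. In a real attempt the blur width would have to be guessed, which is why parameter tuning matters. All function names here are illustrative.

```python
import numpy as np


def gaussian_psf(shape, sigma):
    """Centered, normalized 2-D Gaussian point-spread function."""
    rows, cols = shape
    y = np.arange(rows) - rows // 2
    x = np.arange(cols) - cols // 2
    g = np.exp(-(y[:, None] ** 2 + x[None, :] ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()


def circular_convolve(image, kernel_ft):
    """Convolve via FFT; edges wrap around (acceptable for a padded crop)."""
    return np.fft.irfft2(np.fft.rfft2(image) * kernel_ft, s=image.shape)


def richardson_lucy(observed, psf, iterations=100, eps=1e-12):
    """Iteratively re-estimate the sharp image that explains `observed`."""
    otf = np.fft.rfft2(np.fft.ifftshift(psf))  # blur in the frequency domain
    estimate = np.maximum(observed.astype(float), eps)  # start from the input
    for _ in range(iterations):
        reblurred = circular_convolve(estimate, otf)
        ratio = observed / np.maximum(reblurred, eps)
        # Conjugating the transfer function applies the flipped PSF.
        correction = circular_convolve(ratio, np.conj(otf))
        estimate = np.maximum(estimate * correction, 0.0)
    return estimate


if __name__ == "__main__":
    # Synthetic "text-like" strokes on a light background.
    truth = np.full((32, 64), 0.1)
    truth[8:24, 10:14] = 1.0
    truth[8:24, 20:24] = 1.0
    truth[14:18, 30:54] = 1.0
    psf = gaussian_psf(truth.shape, sigma=2.0)
    blurred = circular_convolve(truth, np.fft.rfft2(np.fft.ifftshift(psf)))
    restored = richardson_lucy(blurred, psf)
    rmse = lambda img: float(np.sqrt(np.mean((img - truth) ** 2)))
    print("RMSE blurred:", rmse(blurred), "RMSE restored:", rmse(restored))
```

More iterations recover finer detail but also amplify any noise in a real image, which is consistent with having to balance amplitude, threshold, and iteration settings by trial and error.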
Going back to Google, I conducted OSINT on Transnet and their industry. I settled on "onrail" and "onerail" as the final candidate terms. After running both through GIMP blur filters, "onrail" was the word I chose.
I submitted my answer "45-3456-w-3453 Transnet onrail-3" to the challenge moderators on July 8. They confirmed a few days later that my guess was close enough to qualify as a winner. Time will tell what portion I flubbed, but nonetheless I had a fun couple of hours hacking on this challenge.
UPDATE (Sept. 1, 2022): Bishop Fox notified me this week that I was selected as the Grand Prize Winner of their Unredacter Challenge. I would like to extend a special thanks to Dan Petro for the brief yet fun distraction from my daily duties.
Take-aways
- OODA Loops are useful.
- Similar to how audio engineers leverage DAW software plugins like Unveil and De-Reverb/RX to "undo" reverb and other effects on audio tracks, attackers can leverage scripting tools and imaging software plugins to reverse pixelated and blurred content.
- Don't rely on blur effects to redact sensitive textual content on web pages, documents, and other media. Instead, apply opaque color block overlays in a permanent manner.
- These same considerations apply to images and videos on social media platforms, online services like Google Maps, and others where blur filters are incorporated to obfuscate identities. Think about all those faces, tattoos, body markings, security badges, and such on the Internet "protected" by "unreversible" blur filters.
- Tools leveraging artificial intelligence (AI) and machine learning (ML) technology will continue to emerge and evolve to reverse common obfuscation techniques in use today.
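The opaque-overlay advice in the take-aways above can be sketched with Pillow as follows. The `redact` helper and the box coordinates are hypothetical. The key assumption is that "permanent" means painting solid pixels into a flattened copy and saving that copy, so no layer, alpha channel, or blurred original remains underneath.

```python
from PIL import Image, ImageDraw


def redact(image, box, fill=(0, 0, 0)):
    """Return a flattened RGB copy with an opaque rectangle painted over box.

    box is (left, top, right, bottom) in pixels.
    """
    flattened = image.convert("RGB")  # new image; drops any alpha channel
    ImageDraw.Draw(flattened).rectangle(box, fill=fill)
    return flattened


if __name__ == "__main__":
    # Synthetic stand-in for a screenshot containing sensitive text.
    screenshot = Image.new("RGBA", (300, 80), (255, 255, 255, 255))
    ImageDraw.Draw(screenshot).text((20, 30), "password: hunter2", fill=(0, 0, 0, 255))
    safe = redact(screenshot, (15, 25, 200, 50))
    print(safe.mode, safe.getpixel((100, 35)))
```

Saving the returned image to a new file, rather than layering a shape over the original in a document editor, avoids leaving the sensitive pixels recoverable beneath the overlay.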
References
- Blog article posted Feb 15, 2022 https://bishopfox.com/blog/unredacter-tool-never-pixelation
- Unredacter tool https://github.com/bishopfox/unredacter
- Depix tool https://github.com/beurtschipper/Depix
- G'MICol tool https://gmicol.greyc.fr/index.php
- Richardson-Lucy Deconvolution function https://en.wikipedia.org/wiki/Richardson%E2%80%93Lucy_deconvolution
- Fonts and Typefaces https://en.wikipedia.org/wiki/Typeface_anatomy
Optiv, the cyber advisory and solutions leader, originally published this article on September 1, 2022. Reprinted with permission.
Watch our exclusive interview with Shawn to hear how he solved the challenge.