CVE-2023-27997 Vulnerability Scanner for FortiGate Firewalls
TL;DR
Bishop Fox has developed a tool to quickly check if a remote FortiGate firewall is affected by CVE-2023-27997.
Overview
CVE-2023-27997 is a heap buffer overflow caused by an incorrect length check in the FortiGate SSL VPN. The technical details of the vulnerability, including how to achieve remote code execution, are described in a blog post from LEXFO, who discovered the vulnerability. While LEXFO provides some indicators of compromise (IOCs) in their follow-up post, they do not provide a safe method to check if a device is vulnerable other than checking the version information.
Memory corruption vulnerabilities are often difficult to detect remotely without crashing the target application. We developed a stable, non-crashing vulnerability check to help Bishop Fox Cosmos customers continuously assess whether their FortiGate appliances are affected by this issue. Since FortiGate has a wide footprint across the public internet, we decided to release this tool publicly to help others ensure they’ve patched appropriately. This blog post describes the design of our vulnerability assessment tool.
Approach to Detection
As mentioned in the original blog post from LEXFO, CVE-2023-27997 can be triggered by sending a crafted GET or POST request with an enc= parameter to either the /remote/hostcheck_validate or /remote/logincheck endpoint of the FortiGate SSL VPN. The enc parameter is a hex-encoded string consisting of a 4-byte (8-character) seed used to derive the key, followed by an encrypted data payload.
sslvpnd decrypts the length field from the enc parameter, which is contained in the first 2 bytes of the encrypted data, then validates this length field against the size of the input data. If this validation check passes, it proceeds to decrypt the remainder of the data. The second step, validation of the size field, is implemented incorrectly in vulnerable FortiGate devices.
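To make the layout of the enc parameter concrete, here is a minimal Python sketch that splits a hex string into the seed and the encrypted payload, following the description above. The function names are hypothetical, and length_field_fits only illustrates what a sound bounds check could look like. It is an assumption for illustration, not the comparison that sslvpnd performs.

```python
from typing import Tuple

SEED_HEX_CHARS = 8  # 4-byte seed, hex-encoded as 8 characters


def split_enc(enc: str) -> Tuple[bytes, bytes]:
    """Split a hex-encoded enc value into (seed, encrypted payload)."""
    if len(enc) < SEED_HEX_CHARS or len(enc) % 2:
        raise ValueError("enc must be an even-length hex string starting with a seed")
    seed = bytes.fromhex(enc[:SEED_HEX_CHARS])
    payload = bytes.fromhex(enc[SEED_HEX_CHARS:])
    return seed, payload


def length_field_fits(declared_len: int, payload: bytes) -> bool:
    """Hypothetical sound check: the declared length must fit in the
    payload bytes that follow the 2-byte length field."""
    return 0 <= declared_len <= len(payload) - 2
```

With a check of this shape, a declared length larger than the data actually supplied is rejected before any further decryption takes place.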
The decryption process involves XORing an MD5 keystream with the input data to produce cleartext. A new MD5 operation is required to produce key material for every 16 bytes of input data.
For a large request, decryption can involve thousands of MD5 operations. If we provide a large request and size field, we can compare the response times of requests with valid and invalid lengths to determine if the full decryption step occurred. In our test environment, this difference was approximately 250 microseconds when lengths differed by 0x7f00 bytes.
FIGURE 1 - Target logic which validates the length and, if the check passes, loops through the data and performs the MD5 operations.
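The cost argument can be illustrated with a toy keystream cipher in Python. This is a sketch under assumptions: the way each keystream block is derived here (hashing the previous state) is invented for illustration and is not the FortiOS key schedule. The only property it shares with the description above is one MD5 operation per 16 bytes of data.

```python
import hashlib
import math
from typing import Tuple

BLOCK = 16  # size of an MD5 digest in bytes


def md5_ops_for(length: int) -> int:
    """Number of MD5 operations needed to cover `length` bytes."""
    return math.ceil(length / BLOCK)


def xor_with_md5_keystream(data: bytes, seed: bytes) -> Tuple[bytes, int]:
    """Toy cipher: XOR data with a keystream of chained MD5 digests.

    Returns the transformed bytes and the number of MD5 operations used.
    The chaining rule is an assumption, not the real derivation.
    """
    out = bytearray()
    state = seed
    ops = 0
    for offset in range(0, len(data), BLOCK):
        state = hashlib.md5(state).digest()  # one MD5 per 16-byte block
        ops += 1
        chunk = data[offset:offset + BLOCK]
        out.extend(c ^ k for c, k in zip(chunk, state))
    return bytes(out), ops
```

Under this model, md5_ops_for(0xfeff) is 4080 and md5_ops_for(0x7f00) is 2032, so the two length fields used later in this post differ by roughly two thousand MD5 operations.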
Avoiding a Crash
The technique we use to detect the vulnerability has a major downside: because we trigger the overflow, we write out of bounds. If we are not careful, this could cause an unintentional crash, which would disconnect online VPN users. We can avoid a crash by understanding the memory allocator used by FortiOS.
FortiOS uses the jemalloc allocator, which is designed for performance and fragmentation avoidance. jemalloc internally organizes memory into regions of contiguous, equally sized blocks. When the application requests a memory allocation, jemalloc calculates the smallest block size which is greater than or equal to the requested size and returns a block of that size. This means that if the application requests a size slightly smaller than the next-largest block size, there will be unused space between the end of the newly allocated data and the start of the next block.
FIGURE 2 - There is a gap after our allocated data but before the start of the next allocation.
Now, suppose we trigger an allocation of 0xfe00 bytes. The smallest block size which would fit this allocation is 0x10000 bytes. This means we have 0x10000-0xfe00=0x200 bytes after our allocation which will be unused, and thus safe to overflow into. This lets us safely trigger the overflow bug while guaranteeing that it will not crash the application.
We selected a request length of 0xfe00 bytes for all requests and a length field of either 0xfeff or 0x7f00 bytes.
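The slack arithmetic can be checked with a short Python sketch. It assumes a simplified model of jemalloc's size classes, with four classes per power-of-two doubling. That matches the spacing jemalloc documents for larger sizes but does not reproduce its handling of very small or very large allocations. The function name size_class is hypothetical.

```python
def size_class(request: int) -> int:
    """Round a request up to a jemalloc-style size class.

    Simplified model (assumption): four evenly spaced classes per
    power-of-two doubling, with a minimum spacing of 16 bytes.
    """
    if request <= 0:
        raise ValueError("request must be positive")
    upper = 1 << (request - 1).bit_length()  # smallest power of two >= request
    step = max(upper // 8, 16)               # spacing of the classes below `upper`
    return -(-request // step) * step        # round up to a multiple of `step`


REQUEST_LEN = 0xFE00
block = size_class(REQUEST_LEN)
slack = block - REQUEST_LEN
print(hex(block), hex(slack))
```

In this model a 0xfe00-byte request lands in the 0x10000-byte class, leaving 0x200 bytes of slack. The larger length field, 0xfeff, exceeds the request length by 0xff bytes, which is smaller than that slack.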
Statistical Analysis
The internet is noisy. The response time of a single request is primarily influenced by the latency of network traffic at a given moment, which can vary significantly. Network traffic also has occasional spikes in latency, and a few dropped packets can significantly affect the timing measurements. To counteract this, we collect many samples before processing the data.
We use two sample groups, one with a length field of 0xfeff and one with a length field of 0x7f00. If the first group takes longer than the second group, we know that sslvpnd is processing our data with an invalid length field and is therefore vulnerable to CVE-2023-27997. If the second group takes longer, we know that the length validation check detects the invalid length, and the device is therefore not vulnerable.
FIGURE 3 - On patched devices, requests with an invalid length (green) take less time than requests with a valid length (blue). The opposite is true on vulnerable devices (orange).
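Collecting the two sample groups might look like the following Python sketch. The send callable is a placeholder for whatever issues one request with a given length field, and it is deliberately left abstract here. Alternating between the two groups on every round is an assumption intended to spread slow network drift evenly across both groups, not a documented detail of the tool.

```python
import time
from typing import Callable, Dict, List, Sequence


def collect_samples(
    send: Callable[[int], None],
    length_fields: Sequence[int] = (0xFEFF, 0x7F00),
    rounds: int = 50,
) -> Dict[int, List[float]]:
    """Time `send(length_field)` for each group, alternating every round."""
    samples: Dict[int, List[float]] = {lf: [] for lf in length_fields}
    for _ in range(rounds):
        for lf in length_fields:
            start = time.perf_counter()
            send(lf)  # hypothetical: performs one request and waits for the reply
            samples[lf].append(time.perf_counter() - start)
    return samples
```

time.perf_counter is used because it is a monotonic, high-resolution clock, which matters when the effect being measured is in the range of hundreds of microseconds.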
We first perform basic preprocessing by filtering out high-latency packets. We then select a cutoff at the 75th percentile, which worked well in our testing, and discard any sample with a timing above this threshold.
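A percentile cutoff of this kind can be sketched in a few lines of Python with the standard library. Whether the cutoff is computed per sample group or over the pooled data is not stated above. This sketch assumes it is applied to one list of timings at a time, and the function name is hypothetical.

```python
import statistics
from typing import List, Sequence


def drop_slow_samples(samples: Sequence[float], percentile: int = 75) -> List[float]:
    """Discard every sample whose timing is above the given percentile."""
    # quantiles(n=100) returns the 99 cut points between percentiles 1..99.
    cutoff = statistics.quantiles(samples, n=100)[percentile - 1]
    return [s for s in samples if s <= cutoff]
```

Trimming only the slow tail makes sense here because network noise is one-sided: congestion and retransmissions add delay but never make a response arrive early.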
We then perform a Welch’s t-test on the filtered data. This is a statistical test which returns two results: a p-value and a t statistic. Put simply, a lower p-value means stronger evidence that the two groups have different means, and the t statistic tells us how large the difference in means is relative to the noise, and which group's mean is greater.
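For reference, Welch's t statistic and its degrees of freedom can be computed with the Python standard library alone, as in the sketch below. The p-value here uses a normal approximation to the t distribution, which is an assumption that is reasonable only for large sample groups. A statistics package with an exact t distribution would be the better choice in practice.

```python
import math
import statistics
from typing import Sequence, Tuple


def welch_t_test(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (t statistic, degrees of freedom, approximate two-sided p-value).

    Both groups need at least two samples and non-zero variance.
    """
    na, nb = len(a), len(b)
    va = statistics.variance(a) / na  # squared standard error of mean(a)
    vb = statistics.variance(b) / nb  # squared standard error of mean(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    # Normal approximation to the t distribution (assumption: large samples)
    p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
    return t, df, p
```

A positive t means the first group has the larger mean. Unlike Student's t-test, Welch's variant does not assume the two groups have equal variance, which suits timing data where one code path may be noisier than the other.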
We repeatedly perform measurements until the p-value is below 0.001, indicating the test was able to clearly pick up the timing differences between requests with different lengths. We then use the t statistic to determine which sample group took longer and report whether the device is vulnerable.
If we reach a certain threshold of requests per sample group and the p-value is still above 0.001, we still attempt to report whether the device is vulnerable, but this comes with a warning due to the low-confidence results. If the t statistic is too close to 0, we can’t detect a significant difference in means. Both outputs are usually due to poor connections, and most often occur when we attempt to scan a device which is physically located far away from the scanner.
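The decision rules described above can be summarized in a small Python sketch. The sample cap (max_per_group) and the "too close to 0" threshold (min_abs_t) are placeholder values chosen for illustration, since the actual thresholds are not given here. The sketch also assumes the t statistic was computed as the invalid-length group (0xfeff) minus the valid-length group (0x7f00).

```python
from typing import NamedTuple, Optional


class Verdict(NamedTuple):
    vulnerable: Optional[bool]  # None means no significant difference detected
    confident: bool             # False means the result carries a warning


def decide(
    t_stat: float,
    p_value: float,
    n_per_group: int,
    max_per_group: int = 1000,  # hypothetical sample cap
    alpha: float = 0.001,
    min_abs_t: float = 1.0,     # hypothetical "too close to 0" threshold
) -> Optional[Verdict]:
    """Return a verdict, or None if more samples should be collected."""
    if p_value < alpha:
        # Clear timing difference: the sign of t says which group was slower.
        return Verdict(vulnerable=t_stat > 0, confident=True)
    if n_per_group < max_per_group:
        return None  # keep sampling
    if abs(t_stat) < min_abs_t:
        return Verdict(vulnerable=None, confident=False)
    return Verdict(vulnerable=t_stat > 0, confident=False)
```

Returning None while below the sample cap lets a caller loop: collect another batch, re-run the test, and call decide again until it yields a verdict.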
Conclusion
CVE-2023-27997 is a very potent vulnerability and it is likely that threat actors will take advantage of it in the near future. We hope that this tool helps security teams and FortiGate administrators. To learn more about our research on vulnerabilities impacting FortiGate firewalls, check out: