Detecting web shells continues to be a common talking point in the world of cyber security. During our daily operations at ParaFlare, we often encounter web shells in client networks, which we then analyse and remediate.
In the larger community, organizations like the US National Security Agency (NSA) and the Australian Cyber Security Centre (ACSC) have released advisories for the detection and prevention of web shells.
The NSA/ACSC advisory covers numerous detection and prevention techniques, from host-based signature hunting to known good file comparison and network traffic analysis.
Over the past month, ParaFlare has endeavoured to cover the “host-based signatures” approach to web shell detection in more detail.
As noted in the ACSC advisory, host-based signature detection is potentially unreliable because attackers have limitless possibilities when creating their web shells. Different languages, functions, obfuscation methods and more allow even a slightly diligent attacker to defeat many host-based signature checks.
In this blog post, we hope to cover some potential approaches to detecting web shells through host-based signatures.
We will also be releasing a small PowerShell script that can automate these methods against a given web directory.
The script is geared towards Windows-based web servers as we found that a lot of released web shell scanners tended towards scanning Linux environments.
We also prioritized having no dependencies, as SOCs rarely have (or want) the approval to install dependencies on remote client infrastructure.
For more information on the web shell scanner, visit ParaFlare’s public GitHub.
Methods – String Match and Entropy
Our initial task was to brainstorm methods that would be able to detect web shells on a system, without providing too many false positive hits.
Our final list of detection methods includes some common detections used by most, if not all, web shell scanners, as well as some hopefully unique approaches that have proven to uncover some of the harder-to-detect web shells.
Let us jump into the specifics of each approach.
Yes, string matching has its flaws.
It is easily defeated and yet has often been the crutch that security software vendors lean on too heavily. However, string matching is a great way to capture a large portion of the low-hanging fruit, which in this case is large, non-obfuscated web shells.
String matching essentially means picking a word, or a list of words, and parsing the target to check whether those words exist. Take, for example, a web shell that places a Base64-encoded string inside a variable, which it then decodes and evals. This is a very common execution method in the land of web shells. In this scenario, it is easy enough to look for “eval” and “base64_decode” to find this web shell.
This method is not without its flaws. Whilst there are lists available of functions that are commonly used by attackers, merely finding one of these words in a file does not mean the file is malicious. Functions like eval, base64_decode, passthru, system and more are also used by developers in their legitimate scripts.
To reduce the large number of false positives that basic string matching returns, we can set a “count threshold”. If we find enough strings from our blacklist in a file to meet the threshold, we go ahead and call the file a web shell.
Pros:
- Simple idea to implement
- Already well documented online
- Finds non-obfuscated shells with low effort
Cons:
- Easily defeated through string obfuscation
- High false-positive rate
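The count-threshold idea described above can be sketched in a few lines. This is an illustration in Python, not the ParaFlare PowerShell script. The blacklist holds only the four function names mentioned in this post. The function names `count_suspicious` and `looks_like_web_shell`, the default threshold of 2, and the choice to count distinct blacklist entries rather than total occurrences are all assumptions made for the example.

```python
import re

# Hypothetical blacklist for illustration only: these are the function
# names mentioned in the text, not a complete or published list.
SUSPICIOUS_FUNCTIONS = ["eval", "base64_decode", "passthru", "system"]


def count_suspicious(text, blacklist=SUSPICIOUS_FUNCTIONS):
    """Return the set of blacklist entries that appear as calls in text."""
    hits = set()
    for name in blacklist:
        # Require a word boundary before the name and "(" after it, so
        # "system" does not match inside a longer name like "filesystem".
        pattern = r"\b" + re.escape(name) + r"\s*\("
        if re.search(pattern, text, re.IGNORECASE):
            hits.add(name)
    return hits


def looks_like_web_shell(text, threshold=2):
    """Flag text when at least `threshold` distinct entries are found.

    The threshold value is an assumption; tune it against known-good files.
    """
    return len(count_suspicious(text)) >= threshold


if __name__ == "__main__":
    sample = '<?php $p = "ZWNobyAxOw=="; eval(base64_decode($p)); ?>'
    print(sorted(count_suspicious(sample)), looks_like_web_shell(sample))
```

With the default threshold, a file that only calls `system` once is not flagged, while the eval-plus-base64_decode pattern is. An attacker who splits or encodes the function names still slips past this check, which is the weakness listed above.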
Entropy, as it relates to digital information, refers to the randomness of a given set of data.
It is often used to determine whether files contain compressed or encrypted information. We can calculate the entropy score of a web server's file contents to help determine whether a file contains compressed or encrypted data. As an example, consider a web shell that contains compressed data.
The compressed lines cause the entropy score for the file to be quite high, with a final entropy score of 7.87.
For comparison, a randomly selected file from a WordPress plugin has an entropy score of 5.5.
Using these scores, we can alert on files that exceed a threshold of our choosing.
Interestingly, some web shells can also be detected by alerting on very low entropy scores.
Over the course of developing our web shell hunter, we have found that most legitimate files tend to score between 4 and 5.5. Alerting on files that score 5.6 or higher (or maybe 6 or higher to reduce false positives) has uncovered a decent number of web shells in our scans.
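As a rough illustration of the scoring described above, the sketch below computes Shannon entropy in bits per byte, which ranges from 0 to 8 and is consistent with the scores quoted in this post. It is written in Python for readability and is not the ParaFlare script. The function names `shannon_entropy` and `classify` are hypothetical. The high threshold of 5.6 comes from the text, while the low threshold is left unset because no figure is given for it.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    # H = sum over byte values of p * log2(1 / p), where p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def classify(score, high=5.6, low=None):
    """Label a score as "high", "low" or "normal".

    high=5.6 follows the threshold mentioned in the text. low is an
    optional, user-chosen cut-off; the text gives no value for it.
    """
    if score >= high:
        return "high"
    if low is not None and score <= low:
        return "low"
    return "normal"


if __name__ == "__main__":
    # Hypothetical usage: score one file passed on the command line.
    import sys
    with open(sys.argv[1], "rb") as handle:
        score = shannon_entropy(handle.read())
    print(f"{score:.2f} {classify(score)}")
```

Using the figures from this post, a score of 7.87 is labelled "high" and a score of 5.5 is labelled "normal". Raising `high` to 6 trades some detections for fewer false positives, as noted above.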
Pros:
- Detects malicious code that is either compressed or encrypted, which would not be detected with string matching
Cons:
- Still prone to false positives unless the threshold is set quite high and/or quite low
- Misses files that only have small amounts of encrypted/compressed data
- Large files with lots of text can naturally have a higher entropy score and force you to push the alerting threshold higher