Draft White Paper on Stick

Fun with Packets:
Designing a Stick

By Coretez Giovanni

This paper outlines a denial-of-service attack aimed not at the computer network but at the human processes that support intrusion detection. It is a resource exhaustion attack, as outlined in the previous paper "Topology of denial of service".
It is written informally and expresses my opinion of the weaknesses that the tool "Stick" was written to exploit. Hopefully this tool clearly shows some IDS flaws that will soon be remedied by better IDS products. Arthur Money spoke at Blackhat '00 about quality. Too bad there must not have been any software developers there to listen.
I use Stick and other self-developed tools to evaluate how well IDS and firewalls stand up to stress. At this time I do not have a comprehensive list of IDS that are unaffected by the methodology described here, which can be implemented using the Stick code.
I am not endorsing any products in this paper. This paper and tool are opinion and should be treated as such. I found two IDS that check connection state before raising a payload alarm, which is essential for defending against Stick, but these systems did not detect other header attacks, nor were they robust enough to detect a good many other attacks.
If my home lab ever gets big enough I might be able to give a fair, unbiased opinion. Until then, you should weigh this opinion against your own judgment and testing (as you always should).

Designing of the Attack
People are the essential element in intrusion detection. Automated responses are rare because they invite self-induced denial of service. Alarms are sorted by priority, then reviewed and summarized for response. Organizations hire just enough people to handle a normal load of alarms.
Therefore, when a high number of false alarms is produced, finding the actual attack becomes impossible because there are not enough resources to separate actual attacks from spoofed ones. The sheer number of alarms also makes the alarm data useless for informing decision makers of the real status of the network. This is a form of information overload.
The key, of course, is to create a high number of alarms in the system and network intrusion logs.

Create An Alarm
The easiest systems to create alarms on are signature-based intrusion detection systems (IDS). When I refer to IDS, I mean network-based IDS in particular.
Signature-based IDS use predetermined criteria to tell bad traffic from good. The three most common attributes in signatures are IP packet header fields, transport layer header fields and packet data payload. If the attributes that satisfy the criteria for these three sections are known, a trigger packet can be created.
Stateless Analysis
For reasons of speed and processing power, a packet is often evaluated on its own individual merit, regardless of other packets on the network. This is seen in many early detectors like "Shadow" and exists in most commercial detectors. Speed is a primary concern for most IDS companies and a common measure of quality when products are evaluated by the media.
A design based on speed most likely means that a trigger packet needs no precursor event or post event in order to set off an alarm.
Triggering an alarm purposefully is not something the designers think about much. Designers do care about false-positives (bad alarms), but false-positives are considered in the context of normal traffic, on the assumption that these alarms can be filtered out over time.

Validity
The first weakness that an IDS has in dealing with information overload attacks is validity. An IDS should only care about scans because they mark a precursor to a possible attack. The scan itself causes no loss or damage to the network except a decrease in obscurity.
When an alarm signature is written, certain assumptions are made that are not always true. In the case of Snort, it is assumed that a packet is in its proper state, meaning that a data packet was preceded by a successful handshake.
Therefore, producing a TCP data packet that meets all the requirements of a given signature sets off an alarm. This occurs regardless of the fact that no handshake occurred prior to the packet.
The alarm is not valid. Yes, it is an anomaly. Yes, dropped packets occur and the IDS might have missed a handshake. Yes, it's a pain to manage the stack in an IDS system. The alarm has still not been validated.
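
The kind of validation argued for here can be sketched in a few lines. This is a minimal illustration in Python, not a description of how Snort or any product works. The names Packet, HandshakeTracker and classify_alarm are hypothetical, and the sketch ignores timeouts, resets, sequence numbers and dropped packets, all of which a real sensor would have to handle.

```python
from dataclasses import dataclass

# TCP flag bits as they appear in the TCP header.
SYN, PSH, ACK = 0x02, 0x08, 0x10


@dataclass(frozen=True)
class Packet:
    """Hypothetical, heavily simplified view of a TCP packet."""
    src: str
    sport: int
    dst: str
    dport: int
    flags: int


class HandshakeTracker:
    """Remembers which connections completed a three-way handshake."""

    def __init__(self):
        self._half_open = set()    # SYN seen, waiting for SYN-ACK
        self._syn_acked = set()    # SYN-ACK seen, waiting for final ACK
        self._established = set()  # handshake completed

    def observe(self, pkt):
        key = (pkt.src, pkt.sport, pkt.dst, pkt.dport)
        reverse = (pkt.dst, pkt.dport, pkt.src, pkt.sport)
        has_syn = bool(pkt.flags & SYN)
        has_ack = bool(pkt.flags & ACK)
        if has_syn and not has_ack:
            self._half_open.add(key)
        elif has_syn and has_ack and reverse in self._half_open:
            self._half_open.discard(reverse)
            self._syn_acked.add(reverse)
        elif has_ack and key in self._syn_acked:
            self._syn_acked.discard(key)
            self._established.add(key)

    def is_established(self, pkt):
        key = (pkt.src, pkt.sport, pkt.dst, pkt.dport)
        reverse = (pkt.dst, pkt.dport, pkt.src, pkt.sport)
        return key in self._established or reverse in self._established


def classify_alarm(tracker, pkt):
    """Label a payload alarm by whether a handshake preceded the packet."""
    return "validated" if tracker.is_established(pkt) else "unvalidated"
```

A data packet that arrives with no handshake behind it is still recorded as an anomaly, but it is labeled "unvalidated" and can be given a lower priority than an alarm raised inside an established connection.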

Managing the Alarms
A small number of alarms might be forgivable, but computers do one thing better than anything else. Computers repeat simple, mind-numbing tasks over and over again.
Stick happens. Within two seconds there are over 450 alarms. The CPU of the sensor is hitting 100%. The system is too busy to listen to a control-C to turn off the sensor software. Meanwhile, the database is filled with alarms from every possible (and impossible) IP address from the Internet. The attack can last for as long as the attacker wishes.
And if there was a real attack in the list of 60,000 attacks, which one is real? Did anything really happen? How many people do you have to validate all the attacks?
Computer response groups act like any other emergency coordination center. They estimate the average load and manage resources just above the line with auxiliary people. False alarms are a big deal because they take away from scarce resources.
If an attacker can generate a large number of false alarms, the resource planning becomes invalid. The structure fails and must fall back on handling only critical elements without the aid of the emergency system.
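
To put rough numbers on the resource problem, the figures quoted above (450 alarms in two seconds, a list of 60,000 alarms) can be combined with a staffing guess. The review time per alarm and the number of analysts below are assumptions made only for illustration, and the helper names are hypothetical. Since the text says "over 450", the rate is a lower bound.

```python
def alarm_rate(alarms, seconds):
    """Alarms per second observed at the sensor."""
    return alarms / seconds


def review_hours(backlog, seconds_per_alarm, analysts):
    """Wall-clock hours for a team to look at every alarm once."""
    return backlog * seconds_per_alarm / analysts / 3600


# Figures taken from the text.
rate = alarm_rate(450, 2)            # alarms per second
seconds_to_fill = 60_000 / rate      # time to accumulate 60,000 alarms

# Assumed staffing: 5 analysts, 30 seconds per alarm.
hours = review_hours(60_000, seconds_per_alarm=30, analysts=5)

print(f"{rate:.0f} alarms/s, {seconds_to_fill:.0f} s to reach 60,000 alarms")
print(f"{hours:.0f} hours of review for the assumed team")
```

Under these assumptions the alarms pile up in under five minutes, while reviewing them would take the team about a hundred hours.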

Designing the Tool, Stick
The design of the tool is centered on speed and flexibility. If the tool were based on a set number of alarm patterns, it could be removed from the noise using trivial filters. So the tool needed to be based on the current signatures of the IDS in question and to be upgradable without a rewrite to handle new configuration files when they are produced.
To ensure speed, the code is generated and avoids too much depth in function calls, comparisons and jumps. To meet these objectives an observation must be made. The rule structure for an IDS can be seen as a language. N-Code is a language. But so are the Snort rules. I took advantage of this fact and wrote a compiler for the Snort rules that would create a random packet generator. In the UNIX world there are tools for doing just this: Lex and Yacc.

lex is short for "lexical analyzer"
yacc is an acronym for "yet another compiler compiler"

If I still had the skill I had ten years ago I would have followed through with the proper way and used yacc, but my brain is not as good as it used to be, so I cheated and kept the state tables in the lex code. The result is close enough, as it produces most of the code needed. This lex-generated code is added to a collection of functions and a main loop to produce the resulting generator.
As an afterthought I created a command line that assigns function pointers to randomization functions, to allow for random IP zones for targeting and spoofing.
The final result is a quickly configurable packet generator.

Looking at Snort
Let's take a look at Snort's basic design from an IDS point of view. It has the components that the Common Intrusion Detection Framework outlines. So, it has a formal IDS structure.
Note: Please excuse out of date material as the Snort product has been evolving.
Notice that the rule language describes a "signature". It really is a language: it has a structured syntax that completely describes the purpose of each rule. Snort is a data-driven interpreter of the Snort rules.

ArachNIDS ruleset
Now arachNIDS, maintained by www.whitehats.com, is a list of high-profile signatures to be used by Snort. I use these signatures to help induce the Snort IDS to alarm. Note that the rules being used are "stateless".

alert ICMP $EXTERNAL any -> $INTERNAL any (msg: "IDS162/Ping Nmap 2.36BETA"; itype: 8; dsize: 0;)
alert TCP $EXTERNAL any -> $INTERNAL 21 (msg: "IDS2/mworm-ftp-retrieval"; content: "USER mw|0D0A|"; flags: AP;)
alert TCP $INTERNAL 5400 -> $EXTERNAL any (msg: "IDS110/trojan-active-bladerunner"; flags: SA;)

So, the code only needs to send these packets and does not need to complete the handshake required for a full connection. The last packet will make the IDS inform the system administrator that there is a Trojan on the network, but routing requires that this type of packet come from inside, which, if discovered, will hint at the location of the generator. To solve this we ignore the direction requested by the rules and treat every rule as an external IP address going to an internal one. If time permitted, the outgoing rules would be removed entirely.
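
To make the "rules are a language" observation concrete, the three rules above can be split into a header and a set of options with very little code. This is a minimal sketch in Python that handles only the simple rule shape shown in this paper. It is not Snort's parser, the parse_rule name is hypothetical, and splitting options on ";" would break on a content string that itself contains a semicolon.

```python
import re

# Header: action, protocol, source address/port, direction, destination
# address/port, followed by a parenthesized option list.
RULE_RE = re.compile(
    r"^(?P<action>\w+)\s+(?P<proto>\w+)\s+"
    r"(?P<src>\S+)\s+(?P<sport>\S+)\s+->\s+"
    r"(?P<dst>\S+)\s+(?P<dport>\S+)\s+"
    r"\((?P<options>.*)\)\s*$"
)


def parse_rule(line):
    """Split one rule into header fields and an options dictionary."""
    match = RULE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized rule: {line!r}")
    rule = match.groupdict()
    options = {}
    for part in rule.pop("options").split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(":")
        options[key.strip()] = value.strip().strip('"')
    rule["options"] = options
    return rule


rule = parse_rule(
    'alert TCP $EXTERNAL any -> $INTERNAL 21 '
    '(msg: "IDS2/mworm-ftp-retrieval"; content: "USER mw|0D0A|"; flags: AP;)'
)
print(rule["proto"], rule["dport"], rule["options"])
```

Every field that comes out of the parse describes a single packet: addresses, ports, flags, payload. Nothing in the rule refers to an earlier or later packet, which is what "stateless" means here.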

Record
Let's take a look at the recording of events in the Snort utility using arachNIDS. We are using arachNIDS because it is open source and you can go through the design to understand the flow, handling and manipulation of the IDS processes. Also, these rules are consistent with the approach commercial systems use in intrusion detection.
These are good rules. The problem is not with the work done on this signature base, but the granularity of the Snort language and the defects in the Snort design.
Notice that the rules do not record events that do not trigger an alarm (an attribute of granularity). This is common in IDS design. The overhead of recording all packets that are transmitted across the network is high compared to the apparent return on cost.
Without certain non-alarm data, it will be impossible to determine the validity or damage of most attacks.

IDS Implementation Weaknesses
Attack validation
This topic was covered before. To determine validity there must be marker events that are precursor events or post events (negative markers), or the lack of certain markers (positive markers). This is what data mining gains you, but to do so you must already know the markers that need to be recorded.
Weight of an event
One technique for deciding whether to respond to an event is to combine weighted values on the alarms. These weights can reflect:

the likelihood of the attack being real (false-positive weight),
the danger of the attack (if true, then how damaging is the attack) and
the number of times the event occurs compared to a threshold (threshold weighting, most commonly seen in determining floods and scans).

Weighting allows priorities to be assigned. It is important to remember that a priority says which events need to be responded to, not which events pose the greatest threat. These are not the same.
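
The three weights can be combined in many ways. The sketch below simply multiplies them, which is an assumption made for illustration and not a formula taken from any IDS. The Alarm fields, the example values and the priority function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    name: str
    p_real: float    # likelihood the attack is real, 0.0 to 1.0
    severity: float  # damage if the attack is real, 0.0 to 1.0
    count: int       # times the event has been seen
    threshold: int   # count at which the event is considered a flood or scan


def priority(alarm):
    """Combine the three weights into one score (assumed: a plain product)."""
    frequency = min(alarm.count / alarm.threshold, 1.0)
    return alarm.p_real * alarm.severity * frequency


alarms = [
    Alarm("port scan", p_real=0.9, severity=0.2, count=50, threshold=100),
    Alarm("trojan active", p_real=0.5, severity=0.9, count=1, threshold=1),
]
for alarm in sorted(alarms, key=priority, reverse=True):
    print(f"{alarm.name}: {priority(alarm):.2f}")
```

With these made-up values the scan is the more certain event, yet the Trojan alarm is responded to first. Changing any one weight can reverse the order, which is why the weights matter as much as the signatures.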
Also, threshold profiling, as in Spice (www.silconedefense.net), can be defeated by introducing anomalies that are not valid. The profiling algorithm will eventually reduce the priority of similar anomalous events. Once the anomalous events are accepted as normal, the actual attack can occur in the statistical space created, with a reduced chance of detection.
Statistical profiling was originally a feature in NID from Lawrence Livermore National Labs. Operators soon learned not to trust this assigned value once an intrusion began, because over time the attack came to be considered normal. It can be hypothesized that spoofed attacks would have had the same effect.

Lack of recording
The I/O time cost and the disk space cost can keep an IDS from handling higher speeds. Marketing people require the speed metric to compare their product favorably against the competition. IDS are marketed not on their quality but on their speed. This is a sociological flaw that leaves properly prepared attackers with little to fear from IDS.
Speed does come into play for the rare organization that is using OC-3 and Gigabit Ethernet. But techniques for handling these speeds via load balancing are no different from those of downstream monitoring. If your organization is this large, you had better have a good budget and control over your infrastructure.
There is a large amount of data that IDS tend not to collect. One is the MAC address. This tends to make it difficult to tell if packets are spoofed entering the system or leaving it.
Also, most IDS do not start recording an attack until an alarm is triggered. This means that the original flaw that allowed access will not be recorded. Some IDS buffer that data, so that the IDS will have the last X number of bytes before the alarm to see what occurred before it.
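
The buffering idea can be sketched with a fixed-size queue. This is a minimal illustration in Python with a hypothetical PreAlarmRecorder class. For simplicity it keeps the last X packets instead of the last X bytes, and it treats packets as opaque values.

```python
from collections import deque


class PreAlarmRecorder:
    """Keeps the most recent packets so an alarm can be saved with context."""

    def __init__(self, max_packets):
        # A deque with maxlen drops the oldest entry when it is full.
        self._recent = deque(maxlen=max_packets)
        self.captures = []

    def observe(self, packet, is_alarm):
        self._recent.append(packet)
        if is_alarm:
            # Snapshot the buffer: the trigger packet plus what preceded it.
            self.captures.append(list(self._recent))


recorder = PreAlarmRecorder(max_packets=3)
for i in range(5):
    recorder.observe(f"pkt{i}", is_alarm=(i == 4))
print(recorder.captures)
```

The cost is bounded by the buffer size, not by the traffic volume, but anything older than the buffer is still lost.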
Regardless, IDS do not usually record packets in great detail, due to the recording requirements on I/O and remote management.

Mis-Categorization as a weakness
Is there a danger in not categorizing an attack correctly? Often IDS quickly categorize attacks based on extremely general criteria. Port scans and network scans are the most commonly mis-categorized events.

BO2K scan versus port scan

A pure BO2K scan would appear as a series of SYN packets looking for a response on port 32767, as follows:

Source IP	Port	Dest. IP	Port
10.0.0.1	1055	10.0.1.1	32767
10.0.0.1	1055	10.0.1.2	32767
10.0.0.1	1055	10.0.1.3	32767
10.0.0.1	1055	10.0.1.4	32767
10.0.0.1	1055	10.0.1.5	32767
10.0.0.1	1055	10.0.1.6	32767

But is the following sequence a port scan or BO2K scan?

Source IP	Port	Dest. IP	Port
10.0.0.1	1055	10.0.1.1	10188
10.0.0.1	1055	10.0.1.2	32767
10.0.0.1	1055	10.0.1.3	32767
10.0.0.1	1055	10.0.1.3	9876
10.0.0.1	1055	10.0.1.2	14555
10.0.0.1	1055	10.0.1.1	32767

The scan is created by a BO2K scanner with a "noise maker" on it. A "noise maker" tricks an IDS into improperly categorizing an attack. A response team may never see the actual packet data, as they trust the IDS to inform them correctly. Mis-categorization means that the results will not be reviewed, so the team misses the intent, and possibly the existence, of a BO2K Trojan.
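
The difference between the two categorizations can be shown on the second table above. The sketch below is a detection-side illustration in Python. The function names and the min_hosts cutoff are hypothetical, and the probe list is copied from the table (destination IP and port only).

```python
from collections import defaultdict

# (destination IP, destination port) pairs from the noisy scan table.
noisy_scan = [
    ("10.0.1.1", 10188),
    ("10.0.1.2", 32767),
    ("10.0.1.3", 32767),
    ("10.0.1.3", 9876),
    ("10.0.1.2", 14555),
    ("10.0.1.1", 32767),
]


def naive_category(probes):
    """General criterion: more than one port means a port scan."""
    ports = {port for _, port in probes}
    return "port scan" if len(ports) > 1 else "single-port sweep"


def hosts_per_port(probes):
    """Count how many distinct hosts were probed on each port."""
    hosts = defaultdict(set)
    for dst_ip, dst_port in probes:
        hosts[dst_port].add(dst_ip)
    return {port: len(ips) for port, ips in hosts.items()}


def categorize(probes, min_hosts=3):
    """Report a sweep if one port was tried on at least min_hosts hosts."""
    counts = hosts_per_port(probes)
    port, n_hosts = max(counts.items(), key=lambda item: item[1])
    if n_hosts >= min_hosts:
        return f"sweep for port {port} on {n_hosts} hosts"
    return "generic port scan"


print(naive_category(noisy_scan))
print(categorize(noisy_scan))
```

The general criterion files the sequence under "port scan", while counting hosts per port shows that port 32767 was tried on every host and the other ports only once each.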
On a side note, snort can be programmed (not in current rule set) to catch the success of a scan by recording the SYN ACK and UDP responses of a scan, but the MAC address needs to be added for directional validity against insider threat.
The point of mis-categorization is that the response is based on the alarm received. If an IDS sees an attack as NMAP, the response will differ from the response to a Vetescan (which uses NMAP).
Therefore, if attackers wish their purpose to be hidden, using a larger signature (scan) that incorporates a smaller one (BO2K scan) will hide their intent.

Conclusion
At one point I had a really long reason for writing Stick. The best reason remains that open communication increases the knowledge base of the community. Stick succeeds because "script kiddies" are operating security. People are downloading IDS and buying IDS without knowing what or why.
First, an IDS must be able to validate that the alarm is correct. This means that the IDS needs to determine whether the precursor and post events occurred that confirm or deny that an attack is real.
Second, the IDS signature language needs to be precise enough to express the conditions that make an alarm accurate. Stateless analysis was a flaw in firewall design, and it is equally true that IDS cannot have signatures that are stateless.
Finally, the IDS should generate alarms that aid in the response. Not all scans are equal. Progress should be occurring in the methodologies by which intrusions are determined and responded to. Instead, methodologies similar to those of virus detection are entrenching themselves as the solution. Signature-based IDS is flawed in itself, but its implementation is so immature that common programming methodologies are ignored.
The most commonly ignored software development principle is that software should be built around a solution and not a marketing requirement. An IDS must first detect an attack accurately and lead to a response before speed, user interface, and being first to market become concerns. If the objectives of IDS had been placed first in development, this tool would more than likely be only a testing tool to separate the wheat from the chaff.
I started with Arthur Money's plea for quality software, and I end this paper with it.

The information contained in this paper is for education purposes only. This paper is the property of Endeavor Systems, Inc., and is not to be replicated for commercial advertisement or gain without the written permission of Endeavor Systems, Inc.