New cyber tool learns network behavior to sniff out malware
- By William Jackson
- Aug 20, 2013
Network administrators and security officials could soon have a new tool to help detect malicious traffic on their networks by sifting out the command and control traffic of infected computers from the background noise.
Researchers from the Georgia Institute of Technology tested a prototype of the tool, called ExecScent, on live networks and identified dozens of previously unknown command and control domains while discovering hundreds of infected hosts on the networks.
ExecScent spots the traffic by using templates of common command and control protocols used by malware. What sets it apart is that it also uses machine learning to understand the normal traffic patterns of the network on which it is working.
“It learns to adapt to normal background traffic,” said Mustaque Ahamad, professor at Georgia Tech’s College of Computing. By spotting traffic that is both similar to known examples of command and control (C&C) communication and different from normal traffic, it reduces the number of false positives and increases the value of the results.
The tool was developed over the past year by three researchers: Terry Nelms, a doctoral student at Georgia Tech and director of research at Damballa, Inc.; Roberto Perdisci, assistant professor in the Department of Computer Science at the University of Georgia and adjunct professor at the Georgia Tech School of Computer Science; and Ahamad.
The three presented the results of their work at the USENIX Security Symposium in Washington. ExecScent is being commercialized by Damballa.
Modern malware on an infected host typically communicates with a command and control server to send home stolen data and receive instructions. Tracing this traffic can be a way of spotting infections and identifying their source, but because attackers often use multiple servers on rapidly shifting domains, identifying the traffic is not always easy.
ExecScent takes advantage of the fact that command and control protocols often are reused in multiple variants of malware, which can make them easier to spot. Looking for known patterns and signatures is not new, but distinguishing them in real time in high volumes of network traffic can be a challenge. The ability to learn and adapt to network norms helps expose the malicious traffic.
“The idea at the high level sounds intuitive,” Ahamad said. “But making it work took a lot of effort. There is a lot of engineering in the system.”
The tool can automate much of the process of identifying malicious traffic to produce actionable results. “Overall, within the entire two-week test period ExecScent generated a quite manageable number of false positives, in that a professional threat analyst could analyze and filter out the false C&C domains in a matter of hours,” the authors wrote in their paper.
The developers used a commercial security service to gather HTTP network traces generated by known malware and used the data to create templates of the common command and control protocols. Before the templates go into operation on a network, normal traffic on that network is studied for several days. If a C&C template pattern closely matches legitimate traffic on that network, that pattern is given less weight in making a match.
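The weighting idea can be illustrated with a minimal Python sketch. It is not ExecScent's actual algorithm, which the article does not detail: the token-based request format, the Jaccard similarity measure, the 0.5 threshold, and every name and sample request below are assumptions made up for illustration. Each template starts at full weight and loses weight in proportion to the share of normal background requests it resembles.

```python
from dataclasses import dataclass


def tokens(request: str) -> frozenset:
    """Split a simplified 'METHOD /path?query' string into a set of tokens."""
    for sep in "/?&=":
        request = request.replace(sep, " ")
    return frozenset(request.split())


def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity between two token sets (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


@dataclass
class Template:
    name: str
    pattern: frozenset
    weight: float = 1.0  # lowered when the pattern resembles normal traffic


def learn_weights(templates, background, threshold=0.5):
    """Reduce each template's weight by the share of background requests it resembles."""
    bg = [tokens(r) for r in background]
    for t in templates:
        hits = sum(1 for r in bg if similarity(t.pattern, r) >= threshold)
        t.weight = 1.0 - hits / len(bg) if bg else 1.0


def score(request: str, template: Template) -> float:
    """Similarity to the template, scaled by the weight learned for this network."""
    return similarity(tokens(request), template.pattern) * template.weight


# Hypothetical background traffic observed during the learning period.
background = [
    "GET /index.html",
    "GET /images/logo.png",
    "GET /search?q=news",
    "GET /index.html?ref=home",
]

# Two hypothetical templates: one that looks like ordinary browsing,
# one that looks like a bot checking in with its server.
generic = Template("generic", tokens("GET /index.html"))
bot = Template("bot", tokens("POST /gate.php?id=1&ver=2"))

learn_weights([generic, bot], background)

print(generic.name, generic.weight)
print(bot.name, bot.weight)
print(score("POST /gate.php?id=7&ver=2", bot))
print(score("GET /index.html", generic))
```

In this toy data the generic template resembles half of the background requests, so even an exact match against it scores lower than a close match against the bot template, which resembles none of the normal traffic.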
The prototype was tested with more than 4,000 templates for two weeks on two university networks and one large financial institution. The networks generated from 35 million to more than 66 million HTTP requests a day.
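To put those daily volumes in perspective, a rough conversion to average requests per second is sketched below. It assumes traffic is spread evenly over 24 hours, which real networks are not, so peak rates would be higher; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def requests_per_second(requests_per_day: int) -> float:
    """Average request rate, assuming traffic is uniform across the day."""
    return requests_per_day / SECONDS_PER_DAY


low = requests_per_second(35_000_000)   # lower figure reported in the article
high = requests_per_second(66_000_000)  # "more than 66 million", so a floor

print(f"about {low:.0f} to {high:.0f} requests per second on average")
```

Under that assumption, a match engine would need to compare roughly 400 to 760 requests every second against the template set.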
In one university network, “we detected a total of 66 C&C domains, of which 34 are new, previously unknown C&C domains,” the authors wrote. They also detected 105 infected hosts, 90 of which were new infections related to the 34 previously unknown C&C domains. On the second university network, the prototype match engine shared hardware with higher-priority production software and could not keep up with the higher volume of traffic. An optimized version tested later ran eight times faster.
Only two new C&C domains were found on the financial institution’s network, which was not surprising because of its multiple layers of security. “However, our findings confirm that even well-guarded networks remain vulnerable,” the authors wrote.
Sixty-five of the newly discovered C&C domains were monitored on six ISP networks, and 25,584 IP addresses were found querying them.
ExecScent is not perfect, the authors said, but it produces actionable results. Also, efforts by attackers to hide from or mislead the tool probably would be time-consuming and impractical, and ExecScent would interfere with bot operations.