History of Technical Attribution Mistakes: My Notes
I thoroughly enjoyed watching Sarah Jones present on common attribution mistakes that have been made historically by cyber threat intelligence analysts. You can check out the full video here:
Overview
Sarah Jones, a Principal Analyst at FireEye [Mandiant (now part of Google Cloud)], examined the analytic mistakes that the information security community has made over the past ten years when attributing nation-state cyber attacks. This article contains my notes from the video, which I thought would be worth sharing as not everyone has the time to watch the full presentation.
In her presentation, Sarah notes that the most common attribution mistakes occur when analysts over-rely on certain components of the Diamond Model, primarily:
- Infrastructure centric analysis.
- Capability centric analysis.
- Victim centric analysis.
She also acknowledged that analysts can sometimes have cognitive biases which impact their adversary analysis, another component of the Diamond Model.
Over-reliance on Infrastructure Centric Analysis
The infrastructure component of the Diamond Model describes the physical and/or logical communication structures an adversary uses to deliver a capability (e.g. a malware staging server), maintain control of a capability (e.g. command and control/C2) and effect results from the victim (e.g. exfiltrate data). Examples include:
- IP addresses
- Domain names
- E-mail addresses
- Virtualised server infrastructure
- USB devices and physical workstations
This infrastructure can be fully controlled or owned by the adversary, or it can be controlled by an intermediary (for example an unknowing victim) or a service provider.
Sarah’s presentation explained that analysts have made mistakes in the past by over-relying on the following data points when determining threat attribution:
- Dynamic DNS: If dynamic DNS infrastructure is interpreted as actor-controlled without considering that it is shared with other users, the analyst can come to inaccurate conclusions.
- IP Egress Space: Just because one actor has used an IP egress point, it does not mean that all traffic from that point is malicious. For example, one piece of research associated 2,000 IP addresses related to malicious activity with a Chinese IP egress point. However, this doesn’t imply that all traffic egressing from that IP block is malicious.
- VirusTotal Timestamps: Analysts have interpreted an unusual file date as being deliberately set by actors for counterintelligence purposes, because some actors “timestomp” their malware, which removes or adjusts the creation date of files. However, when a user uploads a file to VirusTotal, some JavaScript checks the date and time on the uploading computer, and if that clock is wrong, the wrong value is recorded.
- Name Servers and Registrars: Just because an APT has preferred name servers and registrars, it doesn’t mean that every other actor who uses those name servers and registrars is connected to that specific APT. This was a mistake made by “root9b”, who saw an actor registering domains related to US financial markets with specific name servers and registrars. They inferred that the actor was APT28 and that an attack on US financial markets was going to occur. This was not correct, as name servers and registrars are not unique to one actor.
- Scans are not Attacks: A company called “Norse” collected a large volume of data from its global sensors and saw that a significant volume of traffic coming from Iran consisted of scans and pings. They inferred from this that an attack from Iran was imminent. This was not correct: although they had collected a large volume of data, they assigned too much significance to it when completing technical attribution.
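The dynamic DNS point above can be sketched in code. This is a minimal illustration and not a method from the presentation: it assumes a small, hand-picked and incomplete set of dynamic DNS zones (`SHARED_DDNS_SUFFIXES`) and two hypothetical helpers, `is_shared_ddns` and `weak_infrastructure_links`, that flag domains whose parent zone is shared by many unrelated users. Overlap on such a zone should not, on its own, be treated as evidence of a common actor.

```python
# Illustrative, incomplete set of dynamic DNS zones shared by many unrelated users.
# These entries are assumptions for the example; a real workflow needs a maintained list.
SHARED_DDNS_SUFFIXES = {"duckdns.org", "no-ip.org", "dyndns.org"}


def is_shared_ddns(domain: str) -> bool:
    """Return True if the domain is, or sits under, a known shared dynamic DNS zone."""
    name = domain.lower().rstrip(".")
    return any(name == s or name.endswith("." + s) for s in SHARED_DDNS_SUFFIXES)


def weak_infrastructure_links(domains: list) -> list:
    """Return the domains whose parent zone is shared, so zone overlap alone is weak evidence."""
    return [d for d in domains if is_shared_ddns(d)]
```

For example, `weak_infrastructure_links(["a.no-ip.org", "example.com"])` returns only `"a.no-ip.org"`, which an analyst would then corroborate with other evidence before linking it to any cluster.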
Over-reliance on Capability Centric Analysis
The capability component of the Diamond Model describes the tools and/or techniques of the adversary used in the event. These can range from unsophisticated methods (e.g. manual password guessing) to the most sophisticated and automated techniques. For example, Mandiant recently discovered that a suspected Chinese actor was utilising a 0-day vulnerability in the Barracuda Email Security Gateway.
An actor’s capability can include their sophistication in identifying 0-day vulnerabilities and developing exploits, creating or using off-the-shelf malware, and using Command and Control (C2) infrastructure. In this context, C2 refers to the channels, communication structures, signals, protocols and other content the adversary uses to cause an effect (e.g. gain access, deliberately remove access, exfiltrate data, send attacks), progressing the adversary towards achieving their goals.
Sarah’s presentation explained that analysts have made mistakes in the past by over-relying on the following data points when determining threat attribution:
- Malware: Analysing malware produces a generous amount of data, which can cause analysts to over-focus on certain unique components. Some researchers would assume that the relationship between the malware and the threat group is 1:1, and historically some researchers would use the name of the malware interchangeably with the name of the group. However, adversaries running very different operations could be sharing the same malware.
- Builders: Adversaries could be sharing a common set of “builders” to create their own malware variants. This doesn’t mean that the adversaries are related or the same.
- Exploits: 0-day exploits can be discovered by multiple actors and can be used at the same time. Just because one actor is using a 0-day exploit and another is using the same one, it doesn’t mean that they are linked.
- Build Environments: Metadata that documents inherit from a build environment could be mistaken for evidence that they come from the same actor. For example, Word documents with the author name “Grizli777” could be seen as a sign of a specific threat actor. In actuality, this author name is assigned by a pirated copy of Microsoft Office 2007. Therefore, documents that share this author name are not evidence of a link between threat actors.
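The build environment point above can be sketched as a filtering step that runs before any clustering on document metadata. This is a minimal illustration under stated assumptions: the set `NON_DISCRIMINATING_AUTHORS` and the helper `authors_usable_for_clustering` are hypothetical names, and the only entry taken from the talk is “Grizli777”.

```python
# Author values known to be artefacts of the build environment rather than actor fingerprints.
# "Grizli777" is the example from the notes above; any further entries would be assumptions.
NON_DISCRIMINATING_AUTHORS = {"grizli777"}


def authors_usable_for_clustering(doc_authors: dict) -> dict:
    """Drop documents whose author metadata is a known shared artefact.

    doc_authors maps a document identifier to its author metadata string.
    """
    return {
        doc: author
        for doc, author in doc_authors.items()
        if author.strip().lower() not in NON_DISCRIMINATING_AUTHORS
    }
```

Documents removed by this step are not discarded from the investigation. They simply cannot be linked to each other on the author field alone, so other evidence is needed.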
Over-reliance on Victim Centric Analysis
The victim component of the Diamond Model refers to the target of the adversary, against whom vulnerabilities and exposures are exploited and capabilities are used. A victim can be described by both its persona and its assets, as they serve different analytical functions:
- Victim Persona are the people and organisations being targeted whose assets are being exploited and attacked. These include organisation names, people’s names, industries, job roles and interests.
- Victim Assets are the attack surface and consist of the set of networks, systems, hosts, email addresses, IP addresses, social networking accounts and more against which the adversary directs their capabilities.
Sarah’s presentation explained that analysts have made mistakes in victim centric analysis due to cognitive analysis traps. These can result in, for example, collection biases in telemetry data. The following are examples of these cognitive analysis traps:
- Correlation does not equal causation: Just because one event follows another event, it does not mean it was caused by that event. For example, Ukraine was cyber attacked by Russia after oligarchs had their ships seized. However, this does not mean that the cyber attacks on Ukraine were caused by the seizures. In fact, regular cyber attacks by Russia were occurring at this time regardless.
- With this, therefore because of this: Just because something happened at the same time as another event, it does not mean that one caused the other.
- Anchoring (primary vs secondary targets): Analysts tend to place too much weight on one piece of evidence at the expense of all other pieces of evidence. This can happen because the first piece of evidence they discovered led them to believe that the attribution was correct, regardless of any other evidence that presented itself afterwards.