On this page I have documented resources, in various categories, that I’ve found helpful over the years.  The page will always be a work in progress, so if you have any recommended additions please let me know, and check back occasionally for updates.

CTI Fundamentals

The following represent what I consider to be fundamental concepts, models, terminology, and methods in our field today.  I expect analysts I work with, other than perhaps recent college graduates, to have a familiarity with each of these and their application to our field.

  • Psychology of Intelligence Analysis [PDF – for online version click here].  Richards J. Heuer was a career analyst at the US Central Intelligence Agency.  After he escaped the cone of silence, he wrote this seminal work on the analytical mindset and some basic models and methods, including the essential Analysis of Competing Hypotheses.  This is the first recommendation I have for any new analyst.  Heuer is an institution in the intelligence field because his approaches transcend any particular sub-discipline.
  • The Diamond Model of Intrusion Analysis.  This is the “Diamond Paper” by some of the most influential folks in our field who’ve since escaped the clutches of Ft. Meade and made their way into industry: Sergio Caltagirone, Christopher Betz, and Andy Pendergast.
  • Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains.  This is “the kill chain paper,” a product of Eric Hutchins and me, alongside our former manager Rohan Amin, wherein we captured some of the models and methods we were using at Lockheed Martin CIRT before many of these ideas had any formal terminology around them.  This paper was written to formalize our approach and encourage further research in what we saw as a potentially new domain.  We’ve been overwhelmed with the results.  This was truly the result of collaboration with all of our LM-CIRT colleagues from 2005-2011, without whom the paper never would have happened.

Other CTI Resources

Computational Analysis

These references and resources are helpful when working with objective data and calculated or derivative findings and observations.

  • Inferential Statistical Methods.  This is the most comprehensive introductory-level reference for statistical methodologies and applications that I have found.  Written by a Vassar College professor, it is how I learned statistics when I found my previous undergraduate studies were insufficient.
  • Random Forests.  This seminal paper by Tin Kam Ho describes what has been (in my experience) a highly valuable machine learning (ML) technique for building classifiers from ensembles of decision trees.  It deals well with high-dimensional data such as that found in our field, without the over-fitting that affects many other techniques.  I’ve seen many attempts at heuristic algorithms and ML applications fail miserably in our field; the ones that have been successful have often employed this classifier.  For more information on RF, I’ve found the corresponding Wikipedia article quite helpful.
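
For readers who have not seen the technique, here is a minimal sketch of the random forest idea using only the Python standard library.  It is not the code from Ho’s paper, nor from any of the tools mentioned on this page.  It assumes the common formulation in which each tree is trained on a bootstrap sample and considers only a random subset of features at each split, and the forest predicts by majority vote.  All function names and the synthetic data (five informative features, five noise features) are hypothetical and exist only for illustration.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, n_features, rng, depth=0, max_depth=5):
    """Grow one tree; each split sees only a random subset of the features."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    features = rng.sample(range(len(rows[0])), n_features)
    split = best_split(rows, labels, features)
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left],
                   n_features, rng, depth + 1, max_depth),
        build_tree([r for r, _ in right], [y for _, y in right],
                   n_features, rng, depth + 1, max_depth),
    )


def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_features, rng))
    return forest


def predict(forest, row):
    """Majority vote across all trees in the forest."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]


# Synthetic data (an assumption for illustration): features 0-4 separate the
# two classes, features 5-9 are pure noise.
data_rng = random.Random(42)
rows, labels = [], []
for i in range(40):
    label = i % 2
    informative = [data_rng.uniform(0, 1) + 2 * label for _ in range(5)]
    noise = [data_rng.uniform(0, 3) for _ in range(5)]
    rows.append(informative + noise)
    labels.append(label)

forest = train_forest(rows, labels)
print(predict(forest, [0.5] * 5 + [1.5] * 5))
print(predict(forest, [2.5] * 5 + [1.5] * 5))
```

Because each tree sees different rows and different features, no single noisy feature dominates the vote, which is one intuition for why the approach tolerates high-dimensional data.  For real work, a maintained library implementation is a better choice than a sketch like this.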

Research in Cyber Threat Intelligence

Below you will find some of the rare pieces of formal academic research that I’ve found to be useful and applicable in real enterprise environments.  In most cases, I’ve observed the implementations first-hand and can tell you that they are effective.

  • My friend, former boss, and close colleague Rohan Amin’s doctoral thesis, Detecting Targeted Malicious Email through Supervised Classification of Persistent Threat and Recipient Oriented Features (2011).  This was a rare and early example of 1) using machine learning, 2) demonstrated on enterprise data in a real environment, that 3) was provably implementable and an improvement over existing techniques used by AV companies at the time.  It inspired Charles Smutz’s later work on PDFrate, leading to the first true CTI ML implementation I observed or worked with in my career.  This research was sound and pivotal enough to make it into IEEE Security & Privacy, a well-deserved accomplishment, and rarer still in that it was acknowledged as useful and valid by academics and practitioners alike.
  • PDFrate (2012) – an ML random forest classifier written by my close colleague Charles Smutz while at LM-CIRT.  This was the first truly successful implementation of machine learning in an enterprise environment that I witnessed, and it appreciably and measurably improved detection, response, and defense.  Charles gets major kudos for this research, which he operationalized for public consumption here.

Blogs & Meta-Resources

The blogs below are ones I’ve found to be consistently useful for many years.  I recommend each of them highly.