User:Finkga/Cyber Analytics

This is an old revision of this page, as edited by Finkga (talk | contribs) at 22:32, 3 September 2009. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Cyber Analytics

Cyber analytics is a science that supports the analysis of cyber data.

Cyber analytics is the branch of analytics that applies the science of analysis to computers, computer networks, and related data. Analysis is decision-making based on observable facts (data), and a scientific approach compares those observations to hypothetical models. Thus, cyber analytics helps the analyst understand the behavior of computers, networks, and users from the data that computer systems use and generate. Cyber analytics tells the story behind cyber data. It can be used to support computer security, computer or network administration, auditing, and many other application areas.

Derivation

Cyber analytics assumes there is a unifying story behind the fractured set of available data. In fact, there are many different stories interwoven through the various streams of data, and which story is seen depends on the purposes and perspective of the analyst. The cyber analyst's job includes synthesis of these separate streams, abduction of hypotheses that may explain them, and analysis of the hypotheses by comparing them to the data. Thus, cyber analytics is the science of investigation into the meaning of computer data. A more accurate term might be cyber investigation, but this connotes law enforcement, which is only one possible application area. Analytics is more neutral and thus is preferred.

Distinctive Features of Cyber Analytics

All analytic sciences support analysts who must make sense of massive, streaming data. Cyber analytics differs from other analytics primarily because of the characteristics of the data and of the analysts who use it. Computer and network data is generated by simpler processes than most textual data. Thus, we would expect it to have lower entropy than data of human origin. However, cyber data is generated in extremely high volumes and at extremely high velocities. Thus, it is streaming data that cannot easily be stored for long periods of time for off-line analysis. Cyber analysts often come from system administration, programming, or other technical backgrounds rather than from statistics, where formal data analysis is taught. Thus, they often have their own approach to analysis based on subject matter expertise [1].
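The expectation about entropy can be checked empirically on any sample of data. The following Python sketch computes the zero-order (byte-frequency) Shannon entropy of a byte string. The function name `shannon_entropy` and the two sample strings are illustrative assumptions, not part of any cited work, and byte-frequency entropy is only a rough proxy because it ignores the ordering of bytes, which is where much of the regularity of machine-generated data lies.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the empirical zero-order Shannon entropy in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    # H = sum over symbols of -p * log2(p)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical samples: a repetitive machine-generated record and a sentence.
log_sample = b"GET /index.html 200\n" * 50
text_sample = b"Which story is seen depends on the purposes of the analyst."

print(f"log sample:  {shannon_entropy(log_sample):.2f} bits/byte")
print(f"text sample: {shannon_entropy(text_sample):.2f} bits/byte")
```

A compression ratio (for example from `zlib`) would be a reasonable alternative measure, since it also captures repetition across bytes.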

Cyber analytics supports analysis, but not necessarily by human analysts. Machine learning techniques can inform automated analysis and response facilities that might not require human intervention at all. In contrast, visual analytics is more human-centric, with the human user essential as the consumer of the visualizations. Cyber analytics can be applied to forensic investigations or used to predict future events. The latter use is similar to predictive analytics, which is used mostly in business.

The key difference between cyber analytics for computer security and other forms of analysis is the essentially adversarial nature of the analysis.

Cyber data

Cyber data is characterized by an extreme volume and velocity of highly structured data. For example, Fink cites the daily volume of security-related log events that DOE passes up the chain for central analysis to be 500 million events[2]. Most cyber data is not meant for humans to read, although many log formats (such as syslog [3]) contain human-readable content. The data is typically structured according to some machine-oriented protocol, but the protocols used may be non-standard, or may be proprietary implementations of standard protocols that do not fully interoperate.
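As an illustration of a structured format with human-readable content, the sketch below splits a syslog line in the RFC 5424 layout into its header fields. The regular expression and the function name `parse_syslog` are assumptions made for this example. The pattern is deliberately simplified: it does not handle escaped brackets inside structured data or a leading byte-order mark in the message, and it does not validate field lengths. The sample line is modeled on an example in RFC 5424.

```python
import re

# Simplified RFC 5424 layout:
# <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA [MSG]
SYSLOG_LINE = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    r"(?P<sd>-|(?:\[[^\]]*\])+)"   # "-" (nil) or one or more [...] elements
    r"(?: (?P<msg>.*))?$",
    re.DOTALL,
)


def parse_syslog(line: str) -> dict:
    """Split one syslog line into a dict of header fields plus the message."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        raise ValueError("line does not match the simplified RFC 5424 pattern")
    fields = match.groupdict()
    # PRI encodes facility * 8 + severity.
    fields["facility"], fields["severity"] = divmod(int(fields.pop("pri")), 8)
    return fields


sample = ("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - "
          "'su root' failed for lonvick on /dev/pts/8")
record = parse_syslog(sample)
print(record["hostname"], record["facility"], record["severity"], record["msg"])
```

The machine-oriented header fields are what automated tools usually key on, while the free-text message is the part an analyst would read.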

The high velocity of cyber log data makes it impractical to store in full for long periods[4].

Cyber analysts

The Need for Cyber Analytics

DOE cyber analysts must maintain near real-time situational awareness of a widely dispersed enterprise with over 100 sites, 500 thousand machines, and nearly 500 million events daily. The number of daily events is expected to soar into the billions in the near future. To maintain the safety of the DOE infrastructure, analysts must be able to gain a nation-wide perspective within seconds to minutes of a major event.
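A back-of-the-envelope conversion gives a feel for these figures. The sketch below treats the rounded numbers quoted above ("over 100 sites", "500 thousand machines", "nearly 500 million events") as exact and assumes that events are spread evenly over the day, which real traffic is not, so the results are averages rather than peak rates. The function name `average_rate` is an assumption made for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate(events_per_day: float) -> float:
    """Average events per second, assuming a uniform spread over the day."""
    return events_per_day / SECONDS_PER_DAY


events_per_day = 500_000_000  # "nearly 500 million events daily"
sites = 100                   # "over 100 sites"
machines = 500_000            # "500 thousand machines"

print(f"{average_rate(events_per_day):,.0f} events/s across the enterprise")
print(f"{events_per_day / sites:,.0f} events/day per site on average")
print(f"{events_per_day / machines:,.0f} events/day per machine on average")
```

Under these assumptions the enterprise averages roughly 5,800 events per second, so a perspective gained "within seconds" must already summarize tens of thousands of events.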

Analysis centers ask trending questions such as “Are attacks becoming more effective?”, “Are attackers becoming more sophisticated?”, and “Are defenders improving their defensive posture?” They must also answer key agency questions such as “What resources is this external IP address accessing?” and “Can you characterize the sites nation X is interested in?”

Cyber analysts need tools for automated pattern extraction and recognition to track and monitor interesting events and to show how bit patterns form indicators of behavioral patterns. They need predictive tools to support timely adaptation. For instance, they need the ability to detect the probes that form precursors of full-blown attacks. DOE cyber analysts need to be able to extend lessons learned at one site across the enterprise and to mitigate the effects of attacks before they happen.
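As a minimal illustration of probe detection, the sketch below flags source addresses that contact an unusually large number of distinct destination address and port pairs, a common signature of scanning. The record type `Connection`, the function `find_probable_scanners`, and the threshold of 10 are hypothetical choices for this example. A practical detector would also bound the time window, tolerate slow or distributed scans, and tune the threshold per network.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Connection(NamedTuple):
    """One observed connection attempt (hypothetical record layout)."""
    src_ip: str
    dst_ip: str
    dst_port: int


def find_probable_scanners(events: Iterable[Connection],
                           min_targets: int = 10) -> set:
    """Return sources that touched at least min_targets distinct targets."""
    targets = defaultdict(set)
    for event in events:
        targets[event.src_ip].add((event.dst_ip, event.dst_port))
    return {src for src, seen in targets.items() if len(seen) >= min_targets}


# Illustrative data using documentation-only address ranges.
events = [Connection("203.0.113.7", "192.0.2.10", port) for port in range(20, 32)]
events += [Connection("198.51.100.4", "192.0.2.10", 443)] * 50

print(find_probable_scanners(events))
```

Counting distinct targets rather than raw events keeps a busy but legitimate client, such as the second source in the sample, from being flagged.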

Challenges

Cyber analytics is a new science that needs the rigor of standard procedures for measurement, repeatability, and prediction. Reference data sets and test suites can provide fair comparison of competing methods. Unfortunately, realistic cyber data is typically highly sensitive. We need anonymization methods that preserve the security-relevant properties of collected data without compromising the privacy of the providers.
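One simple family of such methods is keyed pseudonymization, sketched below with an HMAC over each identifier. This is offered only as an example of the trade-off, not as the method the text has in mind. The function name `pseudonymize`, the `anon-` prefix, and the truncation to 16 hexadecimal digits are assumptions. The mapping preserves equality, so counts and correlations per address survive, but it discards subnet structure, and its protection depends entirely on keeping the key secret: an unkeyed hash of IPv4 addresses could be reversed by trying every address.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Map an identifier to a stable token that hides the original value."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncation keeps tokens short; it slightly raises the collision chance.
    return "anon-" + digest[:16]


secret_key = b"replace-with-a-random-secret"  # hypothetical key for the example

first = pseudonymize("192.0.2.10", secret_key)
again = pseudonymize("192.0.2.10", secret_key)
other = pseudonymize("192.0.2.11", secret_key)

print(first == again)  # the same address always maps to the same token
print(first == other)  # different addresses map to different tokens
```

Where subnet relationships matter to the analysis, a prefix-preserving scheme would be needed instead of this equality-only mapping.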

Cyber analytics spans multiple scales from processors and processes to computers, routers, and other devices to networks and internetworks.

Cyber analytics will enable predictive and adaptive approaches that improve defenders’ situational awareness and help analysts react in a timely manner. Human-guided automated response is needed for Internet-speed attacks. Large-scale collaboration in cyber defense requires very broad, nontraditional command and control strategies. Finally, defenders need to learn to use deception and to detect deception by attackers.

Tools

Cyber analysis tools and methods must be sensitive to the needs of the analyst so that they enable sense-making without forcing the analyst toward particular conclusions or uses of the data.

References

  1. ^ Fink GA, North CL, Endert A, and Rose SJ, “Visualizing Cyber Security: Usable Workspaces.” In Proceedings of the 2009 Workshop on Visualization for Computer Security (VizSEC 2009).
  2. ^ Fink GA, McKinnon AD, Clements S, and Frincke DA, "Tensions in security collaboration goals and how this affects incident detection and response," chapter three in Collaborative Cyber Security and Trust Management, IGI Global, to appear.
  3. ^ http://www.ietf.org/rfc/rfc5424.txt?number=5424
  4. ^ needed

--Finkga (talk) 20:08, 17 July 2009 (UTC)