Event monitoring

From Wikipedia, the free encyclopedia
Jump to: navigation, search

In computer science, event monitoring is the process of collecting, analyzing, and signalling event occurrences to subscribers such as operating system processes, active database rules as well as human operators. These event occurrences may stem from arbitrary sources in both software or hardware such as operating systems, database management systems, application software and processors.

Basic concepts[edit]

Event monitoring makes use of a logical bus to transport event occurrences from sources to subscribers, where event sources signal event occurrences to all event subscribers and event subscribers receive event occurrences. An event bus can be distributed over a set of physical nodes such as standalone computer systems. Typical examples of event buses are found in graphical systems such as X Window System, Microsoft Windows as well as development tools such as SDT.

Event collection is the process of collecting event occurrences in a filtered event log for analysis. A filtered event log is logged event occurrences that can be of meaningful use in the future; this implies that event occurrences can be removed from the filtered event log if they are useless in the future. Event log analysis is the process of analyzing the filtered event log to aggregate event occurrences or to decide whether or not an event occurrence should be signalled. Event signalling is the process of signalling event occurrences over the event bus.

Something that is monitored is denoted the monitored object; for example, an application, an operating system, a database, hardware etc. can be monitored objects. A monitored object must be properly conditioned with event sensors to enable event monitoring, that is, an object must be instrumented with event sensors to be a monitored object. Event sensors are sensors that signal event occurrences whenever an event occurs. Whenever something is monitored, the probe effect must be managed.

Monitored objects and the probe effect[edit]

As discussed by Gait,[1] when an object is monitored, its behavior is changed. In particular, in any concurrent system in which processes can run in parallel, this poses a particular problem. The reason is that whenever sensors are introduced in the system, processes may execute in a different order. This can cause a problem if, for example, we are trying to localize a fault, and by monitoring the system we change its behavior in such a way that the fault may not result in a failure; in essence, the fault can be masked by monitoring the system. The probe effect is the difference in behavior between a monitored object and its uninstrumented counterpart.

According to Schütz,[2] we can avoid, compensate for, or ignore the probe effect. In critical real-time system, in which timeliness (i.e., the ability of a system to meet time constraints such as deadlines) is significant, avoidance is the only option. If we, for example, instrument a system for testing and then remove the instrumentation before delivery, this invalidates the results of most testing based on the complete system. In less critical real-time system (e.g., media-based systems), compensation can be acceptable for, for example, performance testing. In non-concurrent systems, ignorance is acceptable, since the behavior with respect to the order of execution is left unchanged.

Event log analysis[edit]

Event log analysis is known as event composition in active databases, chronicle recognition in artificial intelligence and as real-time logic evaluation in real-time systems. Essentially, event log analysis is used for pattern matching, filtering of event occurrences, and aggregation of event occurrences into composite event occurrences. Commonly, dynamic programming strategies from algorithms are employed to save results of previous analyses for future use, since, for example, the same pattern may be match with the same event occurrences in several consecutive analysis processing. In contrast to general rule processing (employed to assert new facts from other facts, cf. inference engine) that is usually based on backtracking techniques, event log analysis algorithms are commonly greedy; for example, when a composite is said to have occurred, this fact is never revoked as may be done in a backtracking based algorithm.

Several mechanisms have been proposed for event log analysis: finite state automata, Petri nets, procedural (either based on an imperative programming language or an object-oriented programming languages), a modification of Boyer–Moore string search algorithm, and simple temporal networks.

Tools[edit]

Name Description
Apsolab-IT Log aggregation and monitoring from distributed systems. Web-based dashboard with control over threshold. Email alert with filters.

See also[edit]

References[edit]

  1. ^ J. Gait (1985). A debugger for concurrent programs. Software-Practice And Experience, 15(6)
  2. ^ W. Schütz (1994). Fundamental issues in testing distributed real-time systems. Real-Time Systems, 7(2):129–157