Windows Error Reporting
Windows Error Reporting (WER) (codenamed Watson) is a crash reporting technology introduced by Microsoft with Windows XP and included in later Windows versions and Windows Mobile 5.0 and 6.0. Not to be confused with the Dr. Watson debugging tool, which left the memory dump on the user's local machine, Windows Error Reporting collects and offers to send post-error debug information (a memory dump) over the Internet to Microsoft when an application crashes or stops responding on a user's desktop. No data is sent without the user's consent. When a dump (or other error signature information) reaches the Microsoft server, it is analyzed and a solution is sent back to the user when one is available. Solutions are served using Windows Error Reporting Responses. Windows Error Reporting runs as a Windows service and can optionally be entirely disabled. If Windows Error Reporting itself crashes, the error report for the original crashed process cannot be sent at all.
Windows Error Reporting was improved significantly in Windows Vista. Most importantly, a new set of public APIs was created for reporting failures other than application crashes and hangs. Developers can create custom reports and customize the reporting user interface. The new APIs are documented in MSDN. The architecture of Windows Error Reporting was revamped with a focus on reliability and user experience. WER can now report errors even when the process is in a very bad state, for example if the process has encountered stack exhaustion, PEB/TEB corruption, or heap corruption. In operating systems prior to Windows Vista, the process usually terminated silently under these conditions without generating an error report. A new Control Panel applet, "Problem Reports and Solutions", was also introduced; it keeps a record of system and application errors and issues and presents probable solutions to problems.
A new application, Problem Steps Recorder (PSR.exe), ships with all builds of Windows 7. It records the actions a user performs while encountering a crash so that testers and developers can reproduce the situation for analysis and debugging.
WER is a distributed system. Client-side software detects an error condition, generates an error report, labels the bucket, and reports the error to the WER service. The WER service records the error occurrence and then, depending on information known about the particular error, might request additional data from the client, or direct the client to a solution. Programmers access the WER service to retrieve data for specific error reports and for statistics-based debugging.
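The round trip described above can be sketched as a small in-memory simulation. This is an illustration only: the names `ErrorReport`, `WerServiceSketch` and `submit`, the bucket labels, and the rule for when the service asks for more data are all assumptions made for the example, not the real WER protocol or API.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorReport:
    """Minimal client-side report: a bucket label plus an optional dump."""
    bucket: str
    dump: Optional[bytes] = None


class WerServiceSketch:
    """Hypothetical stand-in for the service side; not the real WER service."""

    def __init__(self, solutions, dumps_wanted_per_bucket=1):
        self.solutions = dict(solutions)   # bucket label -> solution text
        self.occurrences = Counter()       # bucket label -> number of reports
        self.dumps = {}                    # bucket label -> collected dumps
        self.dumps_wanted = dumps_wanted_per_bucket

    def submit(self, report):
        # Always record the occurrence first.
        self.occurrences[report.bucket] += 1
        if report.dump is not None:
            self.dumps.setdefault(report.bucket, []).append(report.dump)
        # Known error: direct the client to a solution.
        if report.bucket in self.solutions:
            return ("solution", self.solutions[report.bucket])
        # Unknown error with too few dumps so far: request additional data.
        if len(self.dumps.get(report.bucket, [])) < self.dumps_wanted:
            return ("need_more_data", None)
        return ("recorded", None)


# Hypothetical bucket labels and solution text.
service = WerServiceSketch(solutions={"app.exe!known": "Install update 1.0.1"})
first = service.submit(ErrorReport("app.exe!known"))
second = service.submit(ErrorReport("app.exe!new"))
third = service.submit(ErrorReport("app.exe!new", dump=b"\x00\x01"))
print(first, second, third)
```

In this sketch the client sends only the bucket label at first and uploads a dump only when the service asks for one, which mirrors the "might request additional data" step in the description.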
Errors collected by WER clients are sent to the WER service. The WER service employs approximately 60 servers connected to a 65TB storage area network that stores the error report database and a 120TB storage area network that stores up to 6 months of raw CAB files. The service is provisioned to receive and process well over 100 million error reports per day, which is sufficient to survive correlated global events such as Internet worms.
In the Microsoft Windows Error Reporting (WER) system, crash reports are organized according to "buckets". Buckets classify issues by:
- Application Name,
- Application Version,
- Application Build Date,
- Module Name,
- Module Version,
- Module Build Date,
- OS Exception Code,
- and Module Code Offset.
Ideally, each bucket contains crash reports that are caused by the same bug. However, WER bucketing has two forms of weakness. The first is weakness in the condensing heuristics, which results in mapping reports from one bug into too many buckets. For example, if an application is recompiled without any changes, its Module Build Date still changes, so the same crash is placed in a different bucket. The second is weakness in the expanding heuristics, which results in mapping more than one bug into the same bucket. For example, if two different bugs crash inside the strlen function because each calls it with a corrupted string, both end up in a single bucket. This happens because the bucket is generated on the Windows client without performing any symbol analysis on the memory dump: the module picked by the Windows Error Reporting client is simply the module at the top of the stack. Investigation of many reports identifies a faulting module that differs from the one used in the original bucket determination.
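Both weaknesses can be reproduced with a toy bucket label built from the eight fields listed above. This is a minimal sketch: the `CrashSignature` type, the `bucket_label` function, the underscore-joined label format, and all sample values are assumptions for illustration, not the format the WER client actually produces.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CrashSignature:
    """The eight bucketing fields; all values used below are hypothetical."""
    app_name: str
    app_version: str
    app_build_date: str
    module_name: str
    module_version: str
    module_build_date: str
    exception_code: int
    module_offset: int


def bucket_label(sig):
    # Illustrative format only: join the fields, with the numbers in hex.
    return "_".join([
        sig.app_name, sig.app_version, sig.app_build_date,
        sig.module_name, sig.module_version, sig.module_build_date,
        f"{sig.exception_code:08x}", f"{sig.module_offset:08x}",
    ])


# One bug, two buckets: a rebuild changes only the build dates.
original = CrashSignature("editor.exe", "1.0", "2013-01-10",
                          "editor.exe", "1.0", "2013-01-10",
                          0xC0000005, 0x1A2B)
rebuilt = replace(original, app_build_date="2013-01-11",
                  module_build_date="2013-01-11")
split_across_buckets = bucket_label(original) != bucket_label(rebuilt)

# Two bugs, one bucket: both crash in the same runtime-library routine, so the
# module at the top of the stack and the offset inside it are identical.
bug_in_parser = CrashSignature("editor.exe", "1.0", "2013-01-10",
                               "msvcrt.dll", "7.0", "2012-06-01",
                               0xC0000005, 0x9F40)
bug_in_network_code = CrashSignature("editor.exe", "1.0", "2013-01-10",
                                     "msvcrt.dll", "7.0", "2012-06-01",
                                     0xC0000005, 0x9F40)
merged_into_one_bucket = (bucket_label(bug_in_parser)
                          == bucket_label(bug_in_network_code))

print(split_across_buckets, merged_into_one_bucket)
```

Because the label carries no information about the callers further down the stack, nothing in it can tell the two bugs in the second case apart.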
Software and hardware manufacturers may access their error reports using Microsoft's Windows Dev Center Hardware and Desktop Dashboard (formerly Winqual) program. To ensure that error reporting data goes only to the engineers responsible for the product, Microsoft requires that interested vendors obtain a VeriSign Class 3 Digital ID or a DigiCert certificate. Digital certificates from cheaper providers (such as Thawte, Comodo, GlobalSign, GeoTrust, Cybertrust, Entrust, GoDaddy, QuoVadis, Trustwave, SecureTrust, Wells Fargo) are not accepted.
Software and hardware manufacturers can also close the loop with their customers by linking error signatures to Windows Error Reporting Responses. This allows distributing solutions as well as collecting extra information from customers (such as reproducing the steps they took before the crash) and providing them with support links.
Impact on future software
Microsoft has reported that data collected from Windows Error Reporting has made a huge difference in the way software is developed internally. For instance, in 2002, Steve Ballmer noted that error reports enabled the Windows team to fix 29% of all Windows XP errors with Windows XP SP1. Over half of all Microsoft Office XP errors were fixed with Office XP SP2. Success is based in part on the 80/20 rule. Error reporting data reveals that there is a small set of bugs that is responsible for the vast majority of the problems users see. Fixing 20% of code defects can eliminate 80% or more of the problems users encounter. An article in the New York Times confirmed that error reporting data had been instrumental in fixing problems seen in the beta releases of Windows Vista and Microsoft Office 2007.
Privacy concerns and use by the NSA
In December 2013, an independent lab found that WER automatically sends information to Microsoft when a new USB device is plugged into the PC.
According to Der Spiegel, the Microsoft crash reporter has been exploited by NSA's TAO unit to hack into the computers of Mexico's Secretariat of Public Security. According to the same source, Microsoft crash reports are automatically harvested in NSA's XKeyscore database, in order to facilitate such operations.
While WER effectively collects crashes from all over the world, it is less effective at crash analysis and organization because it works without debugging symbols. In addition, as mentioned above, independent software vendors face some difficulties in getting access to WER data, especially small vendors and open source teams. Because of that, there are some third-party alternatives that also allow users to submit crash reports to the developers of the crashing software.
- Doctor Dump Crash Reporting System, a free crash reporting and memory dump analysis service that collects, organizes and stores crash reports from the Windows platform and provides users with a solution or workaround immediately after the crash.
- Google Breakpad, an open-source multi-platform crash reporting system.
- XCrashReport, a library that adds basic exception handling and crash reporting to Windows C++ applications.