Friday, 14 December 2012

Historical Debugging (Part 1) - The end of the "No repro" problem?

If you think about it, for all the advances the software engineering industry has made over the last 40 years, debugging tools and strategies have not really kept pace. Sure, parallelism has received a lot of focus in recent years, and as a result we have seen new debugging visualisers such as the "Parallel Stacks" and "Parallel Tasks" windows in Visual Studio 2010 and the new "Parallel Watch" window in Visual Studio 2012, but by and large debugging strategies have remained the same: examine a stack trace, place some breakpoints in the associated source and try to reproduce the issue.

How many times have we or our teams tried to reproduce an issue using the reproduction steps provided by QA or, worse still, the client, and not been able to do so? More times than I care to remember. The problem becomes even harder to diagnose when the issue is data-driven or caused by a data inconsistency, and requires setting up an environment that closely mirrors the one at fault. We shouldn't be asking our clients for copies of their production data, especially if it contains sensitive customer or patient information. So how do we debug issues we can't reproduce in a controlled development environment? The answer: "Historical Debugging".

Historical debugging is a strategy whereby a faulting application can be debugged after the fact, using data collected at the time the fault occurred. This strategy removes the need to spend time and money configuring an environment and reproducing the issue in order to perform traditional or "live" debugging. Instead, the collected information allows us to dive straight into the state of the application and debug the issue as if we had reproduced it ourselves in our own development environment.

There have been two major Visual Studio advancements in historical debugging over the last few years: "Visualised Dump File Analysis" and "IntelliTrace". Both have made debugging production issues simpler than ever for maintenance and development teams. Do they represent the end of the "No repro" issue? Well, between them, quite possibly, and I'll discuss each in turn in the following two posts.
