Sun 3 Sep 2006
Instrument your code young man…. and do it now (Part 1)
Posted by gremlin under Application Servers, Java, Tech
Do you know how well your application functions? Can you tell me the average time it takes to handle a request? The worst time? Can you tell me how your application behaves over time? If not, then you have missed one of the most common “unwritten” requirements: instrumenting your code.
Hang on, I hear you say! You can profile the application, and unit tests can give you this information. That may be true, but it is never a substitute for building instrumentation into your code as you go along. You never know when it will come in handy, and in any case testing your code on your development machine is rarely anything like running it in the real deployment environment.
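To make the questions above concrete, here is a minimal sketch of the kind of timing instrumentation I mean. It is not the code from the project described below; the class name `RequestStats` and its methods are hypothetical, and it assumes all you want is a request count, an average and a worst-case time.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical helper (not from any particular framework): accumulates
// request timings so the count, average and worst case can be read at any time.
final class RequestStats {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong worstNanos = new AtomicLong();

    // Record one completed request that took elapsedNanos nanoseconds.
    void record(long elapsedNanos) {
        count.incrementAndGet();
        totalNanos.addAndGet(elapsedNanos);
        worstNanos.accumulateAndGet(elapsedNanos, Math::max);
    }

    // Run a unit of work and record how long it took, even if it throws.
    <T> T time(Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            record(System.nanoTime() - start);
        }
    }

    long count() {
        return count.get();
    }

    long worstNanos() {
        return worstNanos.get();
    }

    // Average duration in nanoseconds; 0 when nothing has been recorded.
    // Under concurrent updates this is an approximate snapshot, because the
    // count and the total are read separately.
    double averageNanos() {
        long n = count.get();
        return n == 0 ? 0.0 : (double) totalNanos.get() / n;
    }
}
```

Wrapping a request handler is then a one-liner such as `stats.time(() -> handle(request))`, where `handle` stands in for whatever your application actually does. The counters cost almost nothing to keep, which is why they can stay switched on in production.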
For example, many years ago I developed some middle-ware for a large and very profitable international corporation. The piece of code was to mediate between the front-end and the back-end billing/resource system. Development went well, and as usual I inserted instrumentation as I went along, before the product was released into the production environment under the watchful eye of the application management group. To help roll out the product there was a massive marketing campaign: TV, billboards, the whole nine yards. It was the first time I had ever seen anything I developed up on TV.
About a month later my line manager (a really nice lady) came to me and said “Gary, there seems to be a problem with the system; I have just had my arse kicked by sales and marketing. The product is a great success but there are a lot of unhappy customers who get ‘time out’ errors from the GUI, and the GUI guys say it's because the connection to your middle-ware is timing out, and the applications support environment says the 3 machines in the deployment cluster are fine. Could you investigate and get back to me?” Sound familiar? Talk about a bombshell and a potential career limiting move.
With the manager there I said “let's look at the instrumentation and performance screens”. A couple of minutes later the situation became clear. Two of the three machines in the middle-ware cluster were “down” for the times and periods mentioned, but for different reasons: one machine was “up” and running but had no performance metrics; the other machine was not up at all! So in effect we were running at one third of capacity. No wonder we had problems.
We paid a quick visit to application support. To cut a long story short, it turned out the machine which was down was “being upgraded” (why that was done in the “busy period” and why it took a week would be another article); the machine that was “up” but had no statistics turned out not to be included in the cluster configuration at all!
These were soon fixed and everything went smoothly, especially once I wrote a script to monitor the instrumentation output and analyze the results. My manager would then know on a daily, weekly and monthly basis how many actual transactions went through the system, how long they took, when they took place, what the system throughput was, and what we were capable of supporting. As a result my manager did not need to ask sales and marketing how many sales went through, and did not need to contact application support unless the statistics told her that one or more of the servers was down, whereupon she went looking for application support.
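The original monitoring script is long gone, but the two checks that mattered in this story can be sketched in a few lines. This is only an illustration under assumptions: the class `ClusterReport`, its method names and the server names in the usage note are hypothetical, and it assumes the per-server transaction counts have already been collected from the instrumentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical reporting helper: works on per-server transaction counts
// that are assumed to have been gathered from the instrumentation already.
final class ClusterReport {

    // Returns the expected servers that reported no transactions at all,
    // either because they are missing from the statistics or reported zero.
    static List<String> silentServers(List<String> expectedServers,
                                      Map<String, Long> transactionsByServer) {
        List<String> silent = new ArrayList<>();
        for (String server : expectedServers) {
            if (transactionsByServer.getOrDefault(server, 0L) == 0L) {
                silent.add(server);
            }
        }
        return silent;
    }

    // Average transactions per second over a reporting window.
    static double throughputPerSecond(long transactions, long windowSeconds) {
        if (windowSeconds <= 0) {
            throw new IllegalArgumentException("windowSeconds must be positive");
        }
        return (double) transactions / windowSeconds;
    }
}
```

Given three expected servers, say `app1`, `app2` and `app3`, and statistics that only contain traffic for `app1`, `silentServers` returns the other two. That is exactly the question my manager needed answered without having to ask anyone.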
I suppose the moral of this little story is that if you don’t instrument your code, you will never know you have a problem until someone with more power and clout than you comes looking for you with a big club embedded with nails and a very bad attitude.
In the next blog I will go through the types of instrumentation and performance metrics that are useful to put in your code.