The Problem is Problem Management! …or rather, the lack of it!
It amazes me after all of these years how little progress we have made in the areas that matter. I have asked thousands of IT employees in our simulation and ABC workshops ‘How many of you are doing Problem management?’ Only about 15% of the hands go up, and those that are doing it are doing it half-heartedly and don’t really get the strategic importance of problem management.
In my mind, together with CSI, problem management is one of the CORE capabilities that IT organizations NEED to develop. Especially now that there is a growing demand for and importance of IT for the business, especially now with the pace and growth of emerging technologies, especially now with the trend for adopting emerging technologies, especially now with the trend for ‘agile’, fast deployment, especially now with the move to other sourcing models such as cloud, especially when you look at the definition of a Service, as in Service management. A Service is all about Value and Outcomes against Costs and Risks. Problem management is an excellent risk management capability.
I will try and explain what I mean.
Often in our business simulations I play the Customer role. It is a great role to play because I get to shout a lot, shake my head in disbelief and point my finger a lot at the teams when they fail to deliver ‘Service’ to my business, by not being able to translate all their best practice theory into practice. ‘I’m losing money!’ I moan. ‘We haven’t been productive this month!’ I moan. ‘Do you know what my strategy is? What goals I am trying to achieve? What the impact is on my business of what you have just done?’ ….there are usually blank stares and a shaking of heads. Shaking ‘NO!’.
What do you mean you don’t know?
I ask them during reflection ‘What is IT being paid for?’ In almost half of the sessions I get to hear the answer ‘To solve incidents’. ‘No!’ I bluster in my customer role…. ‘I pay you preferably to stop incidents from happening in the first place! If you did changes properly, and found out where all those incidents were coming from and removed the cause of all the incidents, I wouldn’t need all those fire-fighting hero types that are proud of their technical expertise in fixing things. I sometimes get the idea that IT’s core capability is putting things in and breaking them so that they can then show me how good they are at fixing the broken things…..that they broke!’ Like I said, it’s great fun playing the customer role, a good chance of getting rid of all of my frustrations.
Here lies for me the crux of Problem management. ‘Where are all those incidents coming from? What impact are they having on the business? How can we give adequate and effective work-arounds to staff to solve them as quickly as possible? How can we propose changes to stop them happening again? AND, PM’s vital role in CSI: which other processes could we improve to stop a similar situation occurring in the future, which would also cause a flood of incidents?’
‘We don’t have time to do Problem management like that! We are too busy solving incidents and making changes to repair changes that didn’t go right in the first place! We don’t have time for that because there is a massive demand for us to work on new solutions and projects….’ people declare.
‘And we don’t have the time or staff to work on all these new projects because….?’ I ask.
‘Er…….because we are busy trying to fix all the work that support is asking us to do, like solve incidents?’
‘And if you do problem management to stop or reduce the amount of stuff coming from support?’
‘Er…..we would have more time to work on the new solutions that the business is demanding…?’
‘I see… and what do you think the business would RATHER see you doing?’
‘Er….?…….working on new solutions to support and enable their business. Generating new value, increasing and improving business outcomes…’
‘I rest my case.’
Let’s go back to what I said before about problem management being important.
Especially now that there is a growing demand for and importance of IT for the business.
- When IT breaks it has a significant impact on business value and business outcomes.
- IT MUST understand the impact and urgency associated with the outages. The top chosen ABC cards world-wide reveal ‘IT doesn’t understand business impact and priority’ and ‘IT is too internally focused’.
- When IT breaks, the business loses productivity and certain functionality is no longer fit for purpose or use. What are the additional costs associated with this? What is the loss of revenue or value? What is the potential risk the business faces, and is this risk acceptable?
- These are questions problem management should be exploring and using to make a business case. Send Problem management staff into the business for one day to see and understand how critical solutions are being used and discover what the BUSINESS impact is of all the outages that occur.
Especially now with the pace and growth of emerging technologies
- As we roll out ever faster, ever more technology we are always behind the learning curve.
- Problem management can analyse weaknesses and capabilities that cause outages by analyzing the incidents arising from emerging technologies and help document work-arounds and make suggestions to develop new skills and capabilities.
Especially now with the trend for ‘agile’ and fast deployment.
- Agile and fast deployment always means cutting corners somewhere, not documenting something, not fully testing something, making something that may not be fully as designed which will cause downstream incidents or requests….
- Problem management can analyse incidents associated with an Agile deployment and help identify the gaps that still need plugging and signal risks in agile ways of working and suggest countermeasures.
- For example: ‘Each time we do an agile deployment of this range of applications we have a massive number of incidents relating to “usage”. The agile team is by then assigned to a new project, which means incidents stay open longer and the business users are less productive and are not generating the expected outcomes and value. This means X number of transactions are unavailable, with an X impact on revenue and on the costs of manual processing, rework….
- Therefore the suggestion is either to generate a set of FAQs on usage, or self-service features, for the most expected “incidents” (trends can show which sort), or, since trends show an X growth in incidents in the first 2 weeks, to have at least one person from the agile team assigned to incident/request support during the first go-live weeks.’
- These are the types of things Problem management can be doing. Once again you see the need to understand business priority and business impact in terms of value, outcomes, costs and risks to business operations and strategy.
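To make the ‘trends show an X growth in incidents’ idea a little more concrete, here is a minimal sketch in Python of how such a trend could be pulled out of an incident log. Everything in it is an assumption for illustration: the incident records, the category names, the go-live date and the `incident_growth` helper are all made up, and a real service desk tool will have its own export format.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical incident log: (date opened, category). Illustrative data only.
INCIDENTS = [
    (date(2024, 3, 4), "usage"),
    (date(2024, 3, 10), "access"),
    (date(2024, 3, 16), "usage"),
    (date(2024, 3, 17), "usage"),
    (date(2024, 3, 20), "usage"),
    (date(2024, 3, 22), "access"),
    (date(2024, 4, 5), "usage"),  # falls outside the 2-week window
]

GO_LIVE = date(2024, 3, 15)  # assumed go-live date of the agile deployment


def incident_growth(incidents, go_live, window_days=14):
    """Count incidents per category in the window before and after go-live.

    Returns {category: (count_before, count_after)}.
    """
    window = timedelta(days=window_days)
    before, after = Counter(), Counter()
    for opened, category in incidents:
        if go_live - window <= opened < go_live:
            before[category] += 1
        elif go_live <= opened < go_live + window:
            after[category] += 1
    # Counter returns 0 for categories missing on one side.
    return {cat: (before[cat], after[cat]) for cat in set(before) | set(after)}


if __name__ == "__main__":
    for category, (pre, post) in sorted(incident_growth(INCIDENTS, GO_LIVE).items()):
        print(f"{category}: {pre} before go-live, {post} in the first 2 weeks after")
```

A before/after count like this is only the starting point; the business case still needs the impact of those extra incidents expressed in value, outcomes, costs and risks.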
Especially now with the move to other sourcing models such as cloud.
- Problem management can analyse all incidents caused by or arising in cloud services and analyse possible problems. Once again, problem management can signal weaknesses, risks, threats, additional costs and the impact on Value and Outcomes.
- It can make suggestions to those managing cloud provider contracts on agreements and demands. Problem management can provide insights that help shape RFIs and RFPs for cloud services.
Especially when you look at the definition of a Service, as in Service management. A service is all about Value and Outcomes against Costs and Risks.
- Problem management is an excellent risk management capability. Yet I often hear this: problem management makes problem records and suggests changes, but problem management isn’t taken seriously and its changes are always seen as less urgent…. Problem management is often poor at making the business case for the change.
- We saw the need for Problem management to understand the business strategy, the goals they hope to achieve, how for example agile is a strategic instrument, which new solutions are seen as strategic and what the expected value generation is to be, or the expected increase in productivity and outcomes, or the reduction in business operating costs.
- Problem management can analyse the growth and type of incidents and relate these to the V,O,C,R, signaling risks, threats, loss of opportunities, impact on growth, etc.
- Problem management can signal which process areas could have prevented the rise in, or type of, incidents. For example: ‘many incidents of type X resulted from poor change validation and testing, therefore the change management process and the testing and validation scripts can be improved to prevent this……’ This is input for CSI.
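As a rough illustration of relating incident types to costs, the sketch below ranks repeating incident types by an estimated cost of lost business hours. It is a minimal example under loud assumptions: the incident records, the flat `HOURLY_COST` figure and the `top_recurring` helper are all hypothetical, and in practice the hours lost would have to come from asking the business users.

```python
from collections import defaultdict

# Hypothetical records: (incident type, business hours lost). Illustrative only.
INCIDENTS = [
    ("password reset", 0.5),
    ("failed change rollback", 6.0),
    ("report timeout", 2.0),
    ("password reset", 0.5),
    ("printer offline", 1.0),
    ("failed change rollback", 4.0),
    ("report timeout", 2.0),
    ("password reset", 0.5),
]

HOURLY_COST = 60  # assumed cost of one lost business hour


def top_recurring(incidents, hourly_cost, n=5):
    """Rank repeating incident types by estimated cost of lost hours.

    Returns up to n tuples of (incident type, occurrences, estimated cost).
    Types that occurred only once are left out, since they are not repeating.
    """
    totals = defaultdict(lambda: [0, 0.0])  # type -> [occurrences, hours lost]
    for kind, hours_lost in incidents:
        totals[kind][0] += 1
        totals[kind][1] += hours_lost
    ranked = sorted(
        (
            (kind, count, hours * hourly_cost)
            for kind, (count, hours) in totals.items()
            if count > 1
        ),
        key=lambda row: row[2],
        reverse=True,
    )
    return ranked[:n]


if __name__ == "__main__":
    for kind, count, cost in top_recurring(INCIDENTS, HOURLY_COST):
        print(f"{kind}: {count} incidents, estimated cost {cost:.0f}")
```

Cost is only one of the four letters; the same ranking could be repeated with whatever measure of value, outcomes or risk the business recognises.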
This was my rant on problem management. The ITIL experts and gurus will probably now shoot me down, saying ‘yes, but on page X in ITIL it says this….’, or ‘yes, but the ITIL process availability management….’. Yes, but…. ITIL and ITSM frameworks are guidelines, not something to be implemented to the letter.
One organization I know started this by getting 10 specialists together. Not as a project….as part of their daily work. Remember, according to them they get paid for fixing things. They got 1 from application support, 1 from network, 1 from server support, etc. Each was asked ‘What are the top 5 most common repeating incidents in your area that cause the most negative impact on V,O,C,R?’ This was new to the specialists and made them think….. ‘We don’t really know the impact!’ Go and find out. This exercise helped them better understand the business, the business use, the business priority, the business impact. They made work-arounds and in some cases made changes….they then gathered new incident stats and saw a significant reduction in the incident volume and the resolution times. This was then roughly equated back to the increase in business productivity and the reduction in wasted hours and costs. This was partly determined by asking the business users. It came to a large amount of money, hours and productivity improvements in business and IT. They documented this in a small report.
‘What did you do?’ asked the senior IT managers. ‘How did you do that?’ asked the business unit managers…… ‘It’s this stuff called ITSM best practices. What we did is called CSI, what we looked at is called problem management. There is more of this stuff, like validation and testing of changes, and stuff like managing workloads and demands. If we did some CSI to look at those areas we could make EVEN more savings.’
‘Tell me more,’ said the senior IT managers. ‘Tell me more,’ said the business. ‘This sounds interesting and it gives us value!’
‘Well if we get together and look at the V,O,C,R to your business…’
‘What’s VOCR?’ asked the business manager.
‘Well, it stands for Value, Outcomes, Costs and Risks….. Now we happen to know that your department has just adopted new IT to increase revenue growth, which is more value, and at the same time sell more products. We know that with each change that is made there is a flood of outages; these cause additional costs in man-hours and rework and pose a significant Risk that you won’t get the expected value from the IT…. We have noticed that validation and testing could be improved, which would reduce these outages. For example, after the last roll-out we saw 25 outages that cause…..’
This is more effective than saying ‘We need to IMPLEMENT ITIL, let’s start a big ITIL implementation project’.
I hope this has given you some food for thought on the need to look at Problem management differently. Good luck with your Problem management initiatives.