Follow the process…

Application has problems. Batch window (yes, it’s old school) projected to breach SLA. Local SME (me) sounds the klaxons, points the finger at the hardware. Yes, the application is crap, but everything was fine until we moved data centres… but the data centre guys just point at how crap the application is.

Management takes an interest. Understandably, they hire an acronym company to get a second opinion. In preliminary discussions all stakeholders agree that, while they’ll look at other stuff, the focus is the hardware, and figuring out the i/o bottleneck is the criterion for success or failure.

The acronym comes back with a mammoth report. Management is pleased. SME is not. There are two throwaway lines about the hardware and the i/o, and the rest is all about how crap the application is, plus some boilerplate stuff cut and pasted from a manual or the web. On close examination the report is mainly vague and non-specific, and even when it does drill down it’s not too hot on detail.

Management runs off with the report. Next thing, another external company is charged with raising a business case so we can get investment to implement the “recommendations” in the report.

I think I’m getting a Cassandra complex.

Posted in Uncategorized

There’s a reason why they don’t call it redundancy any more

People actually being fired for underperforming is quite rare in the corporate world. Mostly, in my experience, people are retrenched for reasons having nothing to do with their individual performance or circumstances – in mass serial waves, year after year – either as standalone budget issues, or as part of a corporate plan to offshore its resources.
(In these retrenchment waves, it is noticeable how few if any of managerial rank are retrenched, while the ranks are gutted.)
Any national laws meant to protect people from being retrenched unnecessarily/unfairly are treated with contempt by any company big enough to have lawyers on payroll, and retrenched individuals (who are supposedly redundant, i.e. performing a role that is no longer necessary) are almost always called upon in their final days/weeks/hours to hand over to or “train” their (cheaper, younger, offshore) replacements.

At the end of the day it’s metric driven.
The executives and managers are rewarded based on what’s measured. Employee costs are measured, to the last cent. Employee productivity/outputs are not adequately measured. For a while, this makes the exercise look successful on paper: costs are down, but no one really has a clue as to what it’s done to productivity (which is fortunate, as productivity has dropped by a bigger ratio than the costs).
The bubble bursts eventually, but it lasts long enough for all involved to be handsomely rewarded before moving on to another company.


Why bureaucracies fail

You have this bureaucracy, doesn’t matter but let’s say it’s a bank.
It’s making billions of dollars in profits, year in, year out.

Yet every year there are cuts. Cuts everywhere. “More, with less.” Retrenchments. Outsourcing, offshoring.

Why? What drives this? The business is not in trouble. But sooner or later those cuts will get it in trouble. Key-man risks emerge. Knowledge is lost. Systems fail. Processes are forgotten.

Take offshoring. It doesn’t save money.
Trade one experienced onshore resource for one experienced offshore resource, and you save money.
Trade one experienced onshore resource for one inexperienced offshore resource. You may save money, but you get less (or even nothing) done.
Scale up, and trade a group of experienced onshore resources for a group of largely inexperienced offshore resources. You save money, but get less done. So you get some more (offshore) resources.
My perception is that by the time you’re paying ~90% of what you used to pay, you’re achieving ~20% of what you used to. One in four of the resources are okay, and value for money. The rest are “ballast”. And they all cost the same.
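To put a number on that perception (these are my rough figures from above, not measurements, and the function name is just made up for illustration), the cost of each unit of work relative to before is simply the cost ratio divided by the output ratio:

```python
def relative_unit_cost(cost_pct, output_pct):
    """Cost per unit of work relative to the original team (1.0 = no change).

    cost_pct:   what you pay now, as a percentage of what you used to pay
    output_pct: what gets done now, as a percentage of what used to get done
    """
    return cost_pct / output_pct


# Paying ~90% of the old budget for ~20% of the old output.
print(relative_unit_cost(90, 20))  # 4.5
```

In other words, on those figures every unit of work now costs about four and a half times what it used to.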

But I digress.

What happens is this. Your CEO says to his minions, “cut your budgets by 10%”. This will earn him brownie points with the board. Perhaps his bonus depends on it. He makes sure his minions’ bonuses depend on it. The CEO’s minions pass the pain down the line. Line managers have to cut their budget by 10%, no matter what, but still do the same job as before.

Of course they can’t. Not after the last few cuts eliminated any fat that may have been in the system. So what they do is produce less. A 10% budget cut is virtually guaranteed to produce >10% drop in productivity (given there’s no fat / waste left to cut).

That doesn’t matter. The key thing is that (1) the budget cut is easily measured, and there’s no way round it, and (2) the output/productivity is often less easily measured, and there are many ways to juke the stats so as to make the output/productivity seem not as bad as it really is.

So at all levels from the executive down to the line management, everybody plays the same game:-
1) Cut the expenditure, or lose your job/promotion/bonus prospects.
2) Juke the stats to make it look (in the short term) like cutting the expenditure was no big deal.

Cut by cut, productivity falls – far faster than the budget has been cut, and faster with every subsequent cut.
This may indirectly put pressure on the budget next time round, leading to more cuts in a never ending vicious spiral.


SNAFU

Just a routine troubleshooting incident. A policy “disappeared” before a user’s eyes.

It didn’t really disappear, but the status was screwed up.

The user clicked a button, which was supposed to perform a two part operation: invalidate the old record, and add a new valid record. This is supposed to be in a transaction.

What actually happened was that the transaction handler was somehow screwed, and the user clicked the button twice in succession without realising it. The first click succeeded, then the operation repeated: the newly added record was invalidated, but the addition of another new valid record failed. With no transaction to roll back, and error handling not helping, we ended up with no valid records.

This is hardly unique to this patch of code. Most programmers wouldn’t think to program defensively against a double click. Nobody has the time to worry about things like that anyway, all the coding is seat of the pants. And the transaction handler has been broken for >5 years, but remains unfixed despite being a known source of problems.
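For what it’s worth, here is a minimal sketch of the defensive version, in Python with the standard `sqlite3` module. It is not our code: the `policy_record` table, its columns and the `replace_record` function are all invented for illustration. It assumes the button sends the id of the record the user was looking at, so a second click asks to invalidate a record that is already invalid and does nothing, and it assumes both steps run in one transaction, so a failed insert undoes the invalidation.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding one valid record (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE policy_record ("
        " id INTEGER PRIMARY KEY,"
        " policy_id INTEGER NOT NULL,"
        " payload TEXT NOT NULL,"
        " valid INTEGER NOT NULL)"
    )
    conn.execute(
        "INSERT INTO policy_record (policy_id, payload, valid) VALUES (1, 'v1', 1)"
    )
    conn.commit()
    return conn


def replace_record(conn, policy_id, expected_record_id, new_payload):
    """Invalidate the record the user saw and add a new valid one, atomically.

    Returns the new record id, or None if the expected record was no longer
    valid (for example, because this is the second click of a double click).
    """
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        cur = conn.execute(
            "UPDATE policy_record SET valid = 0"
            " WHERE id = ? AND policy_id = ? AND valid = 1",
            (expected_record_id, policy_id),
        )
        if cur.rowcount != 1:
            # Stale request: the record was already replaced. Change nothing.
            return None
        cur = conn.execute(
            "INSERT INTO policy_record (policy_id, payload, valid)"
            " VALUES (?, ?, 1)",
            (policy_id, new_payload),
        )
        return cur.lastrowid


conn = make_db()
print(replace_record(conn, 1, 1, "v2"))  # first click: new record id
print(replace_record(conn, 1, 1, "v2"))  # second click: None, nothing changed
```

Either guard on its own would have left the user with a valid record. Together they mean a double click is a non-event rather than a support call.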

In the good old days,
* management would have seen the transaction handler problem as a MAJOR issue, and fixed it
* lead developers would know about guarding against double-clicks (etc) and set standards accordingly
* mid-rank developers would be peer reviewing each other’s work looking for this kind of thing
* junior developers would be spoon fed baby work until they’d proven they weren’t menaces

But now,
* management take no interest in the application beyond high-level resourcing and schedule issues
* there are no lead developers, or at least none doing that kind of work
* mid-rank developers do their own thing, and no one mentors them or checks their work
* junior developers are just thrown in the deep end, without supervision

It’s not just this team, this application; if anything, it’s above average for the neighbourhood.

What happened in the industry, that this is normal, and only a few dinosaurs appreciate that this is a problem?


Do The Math

I recently overheard how much offshoring really costs.

We have 30 testers, 3 onshore (quite good ones), 27 offshore (not sure if all of them actually exist, and it seems not more than a handful are actually productive).

We pay a blended rate to a 3rd party outsourcer which is equivalent to an amount per tester which happens to be a middling tester’s salary in Sydney, according to industry surveys.

So the saving amounts to eliminating superannuation, insurance and other overheads. 15%? Still, substantial enough to be worth it.

However, offshore, you’ll be lucky if 1 out of 3 are good hires. But you’ll never know about the ones who stay offshore; you don’t know who’s doing what exactly, and you see a blended output that averages out amongst good hires, marginal hires, and waste-of-space hires in roughly equal proportions.

Seems there’s a good argument to be made for offshoring when done properly. Police the hires, make sure you only keep the good hires, and cycle through the rest as fast as possible until you end up with good hires only – and enjoy your 15%-20% saving over an equivalent local hire.
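A back-of-envelope version of that sum, as a sketch only. The figures are index numbers rather than real rates: a local hire costs 100 all-in, the blended offshore rate is 85 (my guessed 15% saving), and only the good hires are counted as productive, which is an assumption that ignores whatever the marginal ones contribute. The function name is made up for illustration.

```python
from fractions import Fraction


def cost_per_good_hire(rate, good_fraction):
    """What you effectively pay for each good hire when only a fraction are good."""
    return rate / good_fraction


ONSHORE_COST = 100   # index: local hire, salary plus overheads
OFFSHORE_RATE = 85   # index: blended rate, assuming a 15% saving

# Unpoliced: roughly 1 in 3 offshore hires is good.
unpoliced = cost_per_good_hire(OFFSHORE_RATE, Fraction(1, 3))
# Policed: you cycle through hires until only the good ones remain.
policed = cost_per_good_hire(OFFSHORE_RATE, Fraction(1, 1))

print(unpoliced, policed, ONSHORE_COST)  # 255 85 100
```

On those assumptions the unpoliced arrangement costs about two and a half times a local hire per productive tester, and the 15% saving only shows up once the hires are policed.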

Problem is, the price will go up if the outsourcer cottons on that you only want the better hires. It’s only cheap because your blended rate cross-subsidises currently unproductive hires – training them up for the future benefit of the outsourcer.

Anyone with a business brain wouldn’t enter into this kind of arrangement. However, this is a bureaucracy first and a business second.


Ah, Consultants

We had some consultants in recently. I don’t know how much they cost, but I’m guessing between $50k and $100k was spent on them.

They were supposed to discover why our backups were taking a variable amount of time, and generally slow at that – a SAN issue suspected.

What they didn’t do is make headway on the one actual problem we didn’t know how to solve ourselves.
Instead what they actually did was flag all the other known problems our application has (by interviewing the local SMEs), and put them in a report suitable for digestion by senior managers. (You know, lots of pictures, words of one syllable, primary colours.)

On the surface, a waste of time. They told us nothing we didn’t know already.

But on the other hand, it was essential. It depends on your point of view.

It’s clear we have problems with our hardware, even if we don’t know exactly what, and one possible solution is to buy better hardware. Our managers, however, can’t go and spend on the hardware based on the word of the local SMEs alone. Their solution is, engage some consultants to tell them what the problems are (even if that’s just documenting what the local SMEs are saying). Then they’ll have justification to buy the hardware.

The consultants are happy, they’ve been paid big bucks for doing bugger all. The SMEs are happy, the problems they’ve been bitching about for years might actually get some attention. The business are happy, they might actually see some results now. The managers are happy, they can now spend the problem away without fear of recriminations downstream, because in the worst case they can always blame the consultants for getting it wrong.

Everybody’s happy here in paradise.

Situation Normal.

Update:
The final report came out recently, and it was worse than expected. They didn’t crack the SAN problem, in fact they said it was all hunky dory and working as expected – while simultaneously recommending that we move to a dedicated storage array. But those little nuggets could be easily missed, a couple of lines buried in a long laundry list of things that could be done to improve the application, only some of which were accurate; some items were miscommunicated; many items were just plain wrong.

Maybe Weird Al can redo “Money For Nothing” with IT consultants instead of rock stars…


How hard can it be to raise an invoice?

When starting/renewing a contract at a certain organisation, at the best of times it takes up to 2 weeks for a work order to be raised and approved.   Mind you, the decision to hire/renew has already been taken, so it’s not entirely clear why any “approval” is involved.  There’s no prospect of the work order NOT being approved, because the person is already busy at work.

So anyway, up to 2 weeks is normal.  After my renewal work order ran over the 2 weeks by a couple of days, I escalated.   Turns out the software had broken because of a data integrity issue, so my work order was stuck in limbo.  Eventually that was sorted.   But it was a struggle.

Then the next problem: as soon as I realise my work order is ready to use, I find I have lost my access. Also a data integrity issue. By now, I’ve missed no fewer than 4 pay periods and I can see the next one ticking away.

At first sight, it seems no one wants to help or is able to help.   But it’s not true.   Help is at hand, it’s just in slow motion.   You log a call.  It sits in a queue.  You chase it, it gets assigned to someone.  You chase it, that someone might look at it, and pass it on to the next queue.   Rinse and repeat, over and over again, with at least a day between each 5 minutes of activity.   There are cycles: acknowledge/assign/diagnose/act.   Each step in the cycle could take a day.  The final action might be no more than pass on to another queue.  Or in some cases, get approval to ask someone else to raise another call to assign to another queue…

And meanwhile, I haven’t been paid for 4-5 weeks.   If I owed a bank a similar amount of money, and was being tardy and using excuses like “my software has a glitch” to avoid payment, you can bet they’d be adding penalties and interest and before long threatening lawyers and collections agencies.
