Friday, July 19, 2013

The Thing About Code

Code is amazing stuff. Good code puts people into space, runs super-colliders, and keeps the Internet ticking. Bad code, on the other hand, winds up on wireless controllers.

OK, just kidding.

Maybe.

For the life of me, I can't understand how vendors keep crappy code listed on their download pages, often at the top of the list, for customers to find. You know, the kind of half-baked stuff that everyone from sales engineers to tech support cringes at when you tell them what version you're running. Which often also happens to be the same code that others from the same company declare to be "the good code" and recommend you move to in order to get past some other problem with earlier buggy code. Ever been there? It pretty much sucks, yet this rhythm seems to have become an operational model for some vendors.

This is where we pause, and I read minds. Quiet please..... quiet..... shhhhhh. I'm picking something up..... ah yes, got it. The "testing" fallacy- I'll address that..... wait, one more coming.... what's that? Oh, sure- the release notes thing. Let's talk about both of those.

I hear an awful lot of "test, test, test!" from colleagues and respected industry folk. And I do agree that nothing, including code, should be rushed into production. But please tell me- beyond being a mantra, what does "test, test, test!" really mean? Does it mean load the code on a test box, configure it the way you'd run it in prod, throw clients at it, and then wait for smoke and screams? OK, that's acceptable. Or maybe it means you should take what I just mentioned, add whatever new features interest you into the mix, and make sure they don't create problems. Fine, yes- this too is arguably reasonable.
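If you want the "wait for smoke and screams" part to be less manual, even a dirt-simple soak loop goes a long way. Here's a rough sketch- Python, with a placeholder gateway address, probe interval, and log file name that you'd swap for whatever your test box actually serves- that just probes through the test WLAN on a timer and logs every failure:

```python
#!/usr/bin/env python3
"""Minimal soak-test loop for a lab WLAN running candidate code.

Sketch only: the gateway address, probe interval, and log file name
are placeholders -- substitute whatever your test box actually serves.
"""
import subprocess
import time
from datetime import datetime

GATEWAY = "10.0.0.1"   # placeholder: gateway reached through the test AP
INTERVAL_SEC = 60      # how often to probe
LOG_FILE = "soak_test.log"


def gateway_reachable(host: str) -> bool:
    """Return True if a single ping through the test WLAN succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],   # Linux ping flags
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def main() -> None:
    failures = 0
    while True:
        if not gateway_reachable(GATEWAY):
            failures += 1
            status = "FAIL"
        else:
            status = "OK"
        with open(LOG_FILE, "a") as log:
            log.write(f"{datetime.now().isoformat()} {status} "
                      f"(total failures: {failures})\n")
        time.sleep(INTERVAL_SEC)


if __name__ == "__main__":
    main()
```

Run it from a wireless client on the test SSID for a week and the log tells you whether the new code can even keep a session alive, before you go anywhere near the new features.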

But guess what, vendors? If you expect us (and evidently some of you do) to be your crowd-sourced QA departments, let's call it what it is and put warning labels on code:

"Caution: we either don't quite know WTF this code will do in many environments, or we have some inkling, and it ain't pretty. But we're putting it out there anyways so you can be our debug squad. Stuff that has always worked now may crash, but it's worth it because this is NEW code."

We buy the hardware and code, pay for support on it all, eat the pain and suffering that comes with the shaky code, and the vendor gets to say "you really need to test new code and let us know what you find". Everybody wins- except for the customer.

We don't know what modules and packages were added or changed, and we're not programmers with access to the source behind whatever is causing us pain. (Funny how we don't tend to have these problems in the mobile network world.)

Then there are the release notes. Hats off to vendors that are open and honest about their code's shortcomings. But... when the same bugs are listed for years, you start not to pay attention. And some unresolved issues sound minor but can bring the house down. Others sound apocalyptic, but actually happen so rarely, or have so little real impact, that they can be safely disregarded. Yet they are all listed in the same terse "you figure it out, and good luck with that" manner. The onus is unfairly on the customer to wade through it all, and that is wrong for COTS gear- it would be different if this were all open source.
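One coping strategy, for what it's worth: do your own triage. Score each listed bug by how bad it would be in your environment and how likely you are to hit it, then read the list in that order instead of the vendor's. A toy sketch- the bug IDs, summaries, and scores here are all invented for illustration:

```python
"""Toy triage of a release-notes known-issues list.

Everything here -- bug IDs, summaries, scores -- is invented for
illustration; in practice you'd score from your own environment.
"""
from dataclasses import dataclass


@dataclass
class KnownIssue:
    bug_id: str
    summary: str
    severity: int    # 1 = cosmetic .. 5 = controller down
    likelihood: int  # 1 = corner case .. 5 = hits every deployment

    @property
    def risk(self) -> int:
        # Simple severity-times-likelihood score
        return self.severity * self.likelihood


issues = [
    KnownIssue("BUG-0001", "sounds minor: client-table leak under roaming", 5, 4),
    KnownIssue("BUG-0002", "sounds apocalyptic: crash with one obscure NIC", 5, 1),
    KnownIssue("BUG-0003", "cosmetic: GUI mislabels a counter", 1, 5),
]

# Read the list in risk order, not the order the vendor printed it.
for issue in sorted(issues, key=lambda i: i.risk, reverse=True):
    print(f"risk={issue.risk:2d}  {issue.bug_id}: {issue.summary}")
```

Crude, yes- but it beats trusting the terseness of the notes to tell you which bugs actually matter.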

So how do "we" fix this?

  • Stop putting out shitty code. Plain and simple. Just stop. New features aren't worth instability- client access is the key mission of the WLAN, and if the WLAN is melting down from crappy code, that mission is compromised.

  • If code is found to be crappy on a catastrophic scale, PULL IT. Don't leave it up for others to find. And reach out to customers proactively, like an automotive recall, to let us know about it. Many WLANs these days carry million-dollar-plus price tags- we deserve better.


It's time to stop the code insanity.
