ARCHIVE NOTICE

My website can still be found at industrialcuriosity.com, but I have not been posting on this blog as I've been primarily focused on therightstuff.medium.com - please head over there and take a look!

Sunday 25 October 2020

Keeping Track of your Technical Debt

The impact of technical debt

Over the years the concept of "technical debt" has become a phrase that can generate anxiety and a lack of trust, as well as setting up developers and their managers for failure. The metaphor might not be perfect (Ralf Westphal makes a strong case for treating it like an addiction), but I feel it's pretty apt if you think of the story of The Pied Piper of Hamelin - if you don't pay the piper promptly for keeping you alive, he'll come back to steal your future away.

Maybe I'm being a bit dramatic, but I value my time on this planet and over the course of two decades as a software developer in various sectors of the industry I have (along with countless others) paid with so much of my own precious lifetime (sleep hours, in particular) for others' (and occasionally my own) quick fixes, rushed decisions and hacky workarounds that it pains me to even think about it.

It's almost always worthwhile investing in doing things right the first time, but we rarely have the resources to invest and we are often unable to accurately predict what's right in the first place.

So let's at least find a way to mitigate the harm.

Why TODOs don't help

The traditional method of initiating technical debt is the TODO. You write a well-meaning (and hopefully descriptive) comment starting with TODO, and "Hey, presto!" you have a nice, easy way to find all those little things you meant to fix. Right? Except that's usually not the case. We are rarely able to make time to go looking for more things to do, and even when we can, with modern software practices it's unlikely that searching across all of our different repositories will be effective.

What generally ends up happening, then, is that we only come across TODOs by coincidence, when we happen to be working with the code around it, and the chances are that it'll be written by a different developer from a different time period and be explained in... suboptimal language.

"Why hasn't this been done?", you may well ask.

"Leave that alone! We don't remember why it works," you may well be told.

Track your technical debt with this one simple trick!

No matter the size of your team (even a team of one), you should always be working with some kind of issue tracking software. There's decent, free software out there, so there's really no excuse.

The fix? All you need to do is log a ticket for each TODO. Every time you find yourself writing a TODO, and I mean every single time, create a ticket. In both the ticket and the TODO comment, explain what you're doing, what needs to be done, why it needs to be deferred, and how urgent completing the TODO task is.

Then reference the ticket in the TODO comment.

This way you'll have a ticket - which I like to think of as an IOU - which can be added to the backlog and remembered when grooming and planning. This also provides the developer who encounters the TODO in the code a way to review any details and subsequent conversations in the ticket.

One interesting side effect of this approach? I often find that it's more effort to create the TODO ticket than to do the right thing in the first place, which can be a great incentive for the juniors to avoid the practice.

Another?


After establishing with my team that I will not approve a TODO unless there's a ticket attached, I must admit... I do sleep just a little bit better at night.

Saturday 24 October 2020

Reading and writing regular expressions for sane people

Your regular expressions need love. Reviewers and future maintainers of your regular expressions need even more.

No matter how well you've mastered regex, regex is regex and is not designed with human-readability in mind. No matter how clear and obvious you think your regex is, in most cases it will be maintained by a developer who a) is not you and b) lacks context. Many years ago I developed a simple method for sanity checking regex with comments, and I'm constantly finding myself demonstrating its utility to new people.

There are some great guides out there, like this one, but what I'm proposing takes things a step or two further. It may take a minute or two of your time, but it almost invariably saves a lot more than it costs. I'm not even discussing flagrant abuse or performance considerations.


Traditional regex: the do-it-yourself pattern


The condescending regex. Here you're left to your own devices. Thoughts and prayers.

var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})(0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

(example taken from this question)


Kind regex: intention explained


It's the least you can do! A short line explaining what you're matching with an example or two (or three).

// parse a url, only capture the host part
// eg. protocol://host
//     protocol://host:port/path?querystring#anchor
//     host
//     host/path
//     host:port/path
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})(0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;


Careful regex: a human-readable breakdown


Here we ensure that each element of the regex pattern, no matter how simple, is explained in a way that makes it easy to verify that it's doing what we think it's doing and can modify it safely. If you're not an expert with regex, I recommend using one of the many available tools such as regexr.com.

// parse a url, only capture the host part
//     (?:([A-Za-z]+):)?
//         protocol - an optional alphabetic protocol followed by a colon
//     (\/{0,3})(0-9.\-A-Za-z]+)
//         host - 0-3 forward slashes followed by alphanumeric characters
//     (?::(\d+))?
//         port - an optional colon and a sequence of digits
//     (?:\/([^?#]*))?
//         path - an optional forward slash followed by any number of
//         characters not including ? or #
//     (?:\?([^#]*))?
//         query string - an optional ? followed by any number of
//         characters not including #
//     (?:#(.*))?
//         anchor - an optional # followed by any number of characters
var parse_url = /^(?:([A-Za-z]+):)?(\/{0,3})(0-9.\-A-Za-z]+)(?::(\d+))?(?:\/([^?#]*))?(?:\?([^#]*))?(?:#(.*))?$/;

Now that we've taken the time to break this down, we can identify the intention behind the patterns and ask better questions: why is the host the only matched group? Was this tested? (Because (0-9.\-A-Za-z] is clearly an error, and there are almost no restrictions on invalid characters)

Unless you're a sadist (or a masochist), this is definitely a better way to operate: be careful, and if you can't be careful then at least be kind.