Tuesday, December 4, 2012

A lonely dev and his users

The experience

I spent half of this year working on rolling out a new system to one of my company’s biggest internal customers.  To help ease this difficult, painful, miserable process, my workspace was reassigned so I could be co-located with our users.  So for six months, I did my development work in the middle of a 24x7 customer service center.  I’m writing this mostly to document what I gained from it for the future, but hopefully you will find it interesting as well.



A little bit of background to start.  I’m a software developer at a major American railroad.  We’re, as I like to describe it, seven years into a seven-year project, with about seven years to go, to rewrite the 40+ year old system that runs almost all of our operations (except for actual on-board systems).  There are 75 customer service reps (CSRs) who are supposed to be cut over from an old green-screen mainframe app to a brand new webapp by the end of this year (spoiler: we’re not going to make it).  These 75 CSRs are our heavy users, so we tackled them first in order to find and fix any issues that come up; later we’ll be cutting over approximately 200 more users who do simpler, much less critical tasks.  What do these users do?  They deal with anything from external customers’ questions about everything to employees in yards trying to assemble railcars into trains.  Nothing on the railroad’s allowed to move without a waybill*, and I’m helping roll out the new waybill system.

I learned three main things while working with them.  First and most obvious is what they actually do, how they do it, and more importantly, how they want to do it.  Second, and most important, is just how critical their work is, and just how quickly what we consider a small issue to look into becomes a big problem for them and, soon after, for the railroad.  And the third main thing I learned - and not to be overlooked - is their collective skill set and, more particularly, the lack thereof.


They do what?

What does the service center do?  They’re one of IT’s biggest customers, but they’re always this vague abstraction off somewhere.  “They work with the customers and stuff, right?”  When I have meetings with them, it’s with the CSRs’ management if I’m lucky.  More often it’s with other business people in the area who are experts at how things should work, but are nowhere near the actual work being done.  How can we be expected to make a great, easy-to-use, problem-free system for a user we’ve never met?

I always just assumed most of them dealt with external customer issues and had no idea most of them are working more with employees in the yards.  They’re fixing problems between our own systems and between our systems and the actual people doing the work.  They spend most of their time answering questions, researching issues, and getting things in the right status for other processes to work correctly.  They’ve been doing what they do over and over so many times, for so many years, on such an unchanging system, they don’t think about what they’re doing anymore.  It’s almost just recognition: when I’m asked this, type this and expect to see this.  So, as a developer, I can’t look at it and think “Wow, they don’t know what they’re actually doing.”  I need to look at it and say “Okay, so my replacement system needs to be just as mindlessly easy for them to use.”  And that’s a tremendous challenge to fight for.


Whoa, that’s important.

We all seem to say we know the service center is important.  They’re a big part of our operations and keep customers happy.  But without knowing that most of their interaction is with other employees (as I recently discovered), there’s no way to gauge the impact they have on the railroad’s operations.  During our rollout, there were many times a user would hit a problem and call me over.  And if I was stumped after five or twenty minutes of working with them, I’d pass it down to the rest of my team of analysts to figure out.  If something looked like it was going to take more than a few hours, we’d find a workaround for them or tell them to use the old system.  But even the times we’d get something fixed two hours after finding it (which seems blazing fast to learn about an issue, analyze it, fix it, and get it to the users), we’d still held something up by two hours.  Maybe it missed the train it was supposed to go on, so now it’ll get to a customer a day or two late.  That’s an immediate, direct impact on a customer – from something in IT we looked at as a small issue we quickly resolved.  When we had a problem working a certain type of bills one day, we held up several entire trains.  A manager came to me and said there was two million dollars of revenue being held up, and all I could do was apologize and try to speed up a workaround while others worked on the root problem.

Every new system has dozens and dozens of small issues, problems, bugs and quirks.  In IT, our job is to analyze them, figure out the best solutions, implement, test, verify, etc.  While all that’s going on though, someone else is stuck, and it’s affecting real train movement and real revenue.  When a CSR is on the phone with a customer trying to fix their waybill and my new system doesn’t allow what the old system used to, I’d better have a good answer and fast.  Because no one’s looking for an explanation of some tedious issue, they’re looking to get back a success message.  Anything less is failure and anything slower or harder than the old system may as well be unacceptable.


They can’t do what?

The thing I learned during this adventure that will probably stay with me the most – and lead me to always fight on behalf of usability – is the CSRs’ lack of basic computer skills.  I try to keep perspective, knowing that everyone I’m surrounded with at work and in my personal life (including family with the annoying “can you fix my email” questions) is toward the very top of the computer skill spectrum, but a few things really caught me off guard.  They’ve spent their entire careers (20+ years for many of them) using green-screen mainframe terminals.  But I figured they used a computer like a normal person at home.  I must be wrong.  Things like basic use of a web browser caused problems.  Despite my efforts, some still seem confused about the Back button.  If something opens in a new window, it’s as if the previous window is lost forever.  After a few hours, they’ll have a dozen browser windows open and have to close them all to start fresh because they’re getting confused, then blame me and my system for being too complicated.  Forget about tabbed browsing.  I thought it would clear up the issues with too many windows open – I was sorely mistaken.  I watched a user repeatedly right-click next to (next to – not in) a text field, get frustrated when paste wasn’t an option in the context menu, and decide apparently she wasn’t allowed to paste into that field.  Opening a link that’s saved as a favorite is a chore.  And a continuous problem (that in this case is fully the fault of IT) was a field that auto-completed to save a user from typing as much, but had too slow of an Ajax call, which would fubar the data field.

After recovering from what seemed to be a third-world experience, I spent a lot of time thinking about broader usability issues.  If the Back button causes stress, can we make sure there’s always something on the screen to help navigate, even if it’s to a previous screen?  Instead of choosing between new windows, modals, popouts, etc. based on what can or can’t be worked on while one is open, let’s focus on what’s less confusing to a user.  Features like auto-completion and data lookups are great, but must work seamlessly.  Decisions that make sense in a conference room (it’ll take the same time, but be more accurate) become slower and frustrating when they require a process change.
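
One way a slow auto-complete lookup can wreck a field is when an old response arrives late and overwrites what the user has typed since.  That is only my guess at the mechanism; the details of our actual bug aren’t described here.  Under that assumption, a rough sketch of a guard might look like the following, where `LatestRequestGuard` and `applySuggestion` are made-up names for illustration and not part of our system:

```typescript
// Hypothetical helper: remembers which lookup request was sent most recently.
class LatestRequestGuard {
  private latest = 0;

  // Call when a lookup is sent; returns a token identifying that request.
  begin(): number {
    this.latest += 1;
    return this.latest;
  }

  // True only when the token belongs to the most recently sent request.
  isCurrent(token: number): boolean {
    return token === this.latest;
  }
}

// Writes a suggestion into the field only if it answers the latest request.
// Returns false (and leaves the field untouched) for a stale response.
function applySuggestion(
  guard: LatestRequestGuard,
  token: number,
  field: { value: string },
  suggestion: string
): boolean {
  if (!guard.isCurrent(token)) {
    return false;
  }
  field.value = suggestion;
  return true;
}
```

The idea is to call `begin()` each time a lookup goes out and pass the token along to whatever handles the response, so a reply to an earlier keystroke is simply dropped instead of clobbering the field.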

Going from a terminal session to a web browser is a huge paradigm shift.  As IT professionals, we’re used to change; we embrace it and love seeing what’s new.  Our users couldn’t care less though – they just want to do their work as similarly as possible to how they did it yesterday.  If the old system put the cursor on the field with an error, we shouldn’t just display a message with the error – let’s put the focus on the same field too.  If we’re going to change a workflow, we can’t just say we’re changing it – we need to actually explain the new process to the users, demonstrate the corresponding parts, and explain how the inputs and outputs match with what they’re used to seeing.
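
The cursor-on-the-error-field behavior is simple enough to sketch.  This is a minimal illustration and not our actual code: `FormField` and `focusFirstError` are hypothetical names, the field and error names in the usage comment are invented, and in a browser the `focus()` method would presumably be the element’s own `focus()`.

```typescript
// Hypothetical stand-in for an input on the screen.
interface FormField {
  name: string;
  focus(): void;
}

// Puts the cursor on the first field, in screen order, that has an error.
// Returns that field's name, or null when no listed field has an error.
function focusFirstError(
  fields: FormField[],
  errors: Record<string, string>
): string | null {
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(errors, field.name)) {
      field.focus();
      return field.name;
    }
  }
  return null;
}

// Example call (field names invented for illustration):
//   focusFirstError(fields, { destination: "Unknown station" });
```

Walking the fields in screen order, and not in whatever order the validation errors come back, keeps the cursor landing where a user coming from the old system would expect it.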


The face of IT

Working beside my users, I became the face for my system, related systems, systems I’ve never seen before, and IT as a whole.  Sometimes it was a stressful role.  You can only listen to someone call your project – the thing you’ve been slaving on for a year; been debating features and implementations with teammates, devs, and managers; coddling through infrastructure changes; and pushing through Change Management to Production – worthless so many times in a day.  There were days I struggled not to take it personally.  “This thing doesn’t work,” they say.  I reply, “You need to click the highlighted button that the message told you to click.”  Why did I give up a month of my evenings working second shift with them to teach them to do what I’d programmed the app to tell them?

I tell them “I spent a month making it 3 times faster than before and am still trying to speed it up,” only to still hear “It’s too slow.”  “I’m sorry, a distributed system is inherently slower than a mainframe, and I can only do so much!”  It became my responsibility to explain strategic IT decisions to users who only focus on doing their jobs quickly and accurately.

I was rolling out a system they didn’t like before they’d even seen it.  I was begging the few evangelists we had to spread the word, but settling for being glad they weren’t complaining.



So now my time on this project is coming to a close.  I spent a year on this project and six months embedded with our customer, including a month working second shift (third shift will come after I’m likely on a new project).  So what am I taking away?  Just complaints about users not knowing how to use a web browser and hating anything new?  No.  I’ve learned what our biggest internal customer actually does: how they work and what they find important.  I’ve learned just how quickly IT issues can start to affect real business operations.  It’s not just major outages and production issues that cause problems – routine bugs kill too.  I’ve learned the incredible focus needed on usability, and gained an entirely new perspective on the normal tradeoffs we make during development.

Did the working environment drive me crazy?  Did some of my users make me contemplate quitting?  Yes and yes.  But in the end, I’m focusing on the experience I was granted.  Six months of minimal coding, instead focusing on learning the business and our customers.  That’s an experience worth its weight in gold.

* A waybill (or just bill) is a document that says who’s shipping what, how much of it, from where to where, how it’s getting there, who’s paying, who’s responsible, any hazardous details, and any other relevant piece of information.  We only have a few thousand business rules to process and validate a waybill.


  1. Great post. Thanks for sharing your experience. The company I work for still uses a mainframe system with green screens. Every time they discuss a replacement, I cringe at the thought of the "veteran" processors who will have to learn a new system.

    I wholeheartedly agree that usability needs to be a major focus, except for one point. While the functionality needs to remain, I don't believe a replacement system should be designed to mimic the same user steps, unless that truly is the best design. Future generations will be using this system and they will have more experience with Back buttons, etc., so why limit yourself to bad design that appeases folks who will be retiring shortly?

  2. This is a great case study to a 'best practice' of learning your customer. We are constantly trying new mobile apps in agriculture and are constantly amazed at how WRONG our assumptions are in regards to the user expectations. We constantly test, test, test and discover that only 10% of our features are worth keeping.

    I have read this over many times and am wondering: is it okay to keep the older UI style? On the one hand we need to get adoption by our customer or we do not get funded for the next phase. On the other hand it's important to target the heavy user who tends to prefer very modern UI.

    Thanks for sharing!

  3. How to balance new v old is one area I'm still most uncertain about. I wholeheartedly agree that we shouldn't make bad design decisions to appease the soon-to-retire. So maybe we shouldn't add extraneous navigation, and if a new window or popout makes sense we should be able to use it. I guess what I meant more (and this is definitely an area where I’m struggling to match philosophy to practice) is that we need to at least keep the current users in mind more when making these decisions. Some of the users who are retiring “soon” will still be around for several years.

    We gave light-hearted consideration early in the process to making an “old-style” and “new-style” UI the users could choose between that shared a common back-end, but vetoed that for any of the many reasons it’s a bad idea.

    I’d love to hear more about your experiences as you go through a replacement.

    And Buzz, I’ve thought about (but never implemented) a log to track what features get used or ignored. I’d bet good money it doesn’t match all that well with what we expect (in any system).