Tuesday, December 4, 2012

A lonely dev and his users

The experience

I spent half of this year working on rolling out a new system to one of my company’s biggest internal customers.  To help ease this difficult, painful, miserable process, my workspace was reassigned so I could be co-located with our users.  So for six months, I did my development work in the middle of a 24x7 customer service center.  I’m writing this mostly to document what I gained from it for the future, but hopefully you will find it interesting as well.



A little bit of background to start.  I’m a software developer at a major American railroad.  We’re, as I like to describe it, seven years into a seven-year project, with about seven years to go, to rewrite the 40+ year old system that runs almost all of our operations (except for actual on-board systems).  There are 75 customer service reps (CSRs) that are supposed to be cut over from an old green-screen mainframe app to a brand new webapp by the end of this year (spoiler: we’re not going to make it).  These 75 CSRs are our heavy users, so we tackled them first in order to find and fix any issues that come up; later we’ll be cutting over approximately 200 more users who do simpler tasks with much less criticality.  What do these users do?  They deal with anything from external customers’ questions about everything to employees in yards trying to assemble railcars into trains.  Nothing on the railroad’s allowed to move without a waybill*, and I’m helping roll out the new waybill system.

I learned three main things while working with them.  First and most obvious is what they actually are doing, how they do it, and more importantly, how they want to do it.  Second, and most importantly, is just how critical their work is, and just how quickly what we consider a small issue to look into becomes a big problem for them and, soon after, for the railroad.  And the third main thing I learned - and not to be overlooked - is their collective skill sets and more particularly, lack thereof.


They do what?

What does the service center do?  They’re one of IT’s biggest customers, but they’re always this vague abstraction off somewhere.  “They work with the customers and stuff, right?”  When I have meetings with them, it’s with the CSRs’ management if I’m lucky.  More often it’s with other business people in the area who are experts at how things should work, but are nowhere near the actual work being done.  How can we be expected to make a great, easy-to-use, problem-free system for a user we’ve never met?

I always just assumed most of them dealt with external customer issues and had no idea most of them are working more with employees in the yards.  They’re fixing problems between our own systems and between our systems and the actual people doing the work.  They spend most of their time answering questions, researching issues, and getting things in the right status for other processes to work correctly.  They’ve been doing what they do over and over so many times, for so many years, on such an unchanging system, they don’t think about what they’re doing anymore.  It’s almost just recognition: when I’m asked this, type this and expect to see this.  So, as a developer, I can’t look at it and think “Wow, they don’t know what they’re actually doing.”  I need to look at it and say “Okay, so my replacement system needs to be just as mindlessly easy for them to use.”  And that’s a tremendous challenge to fight for.


Whoa, that’s important.

We all seem to say we know the service center is important.  They’re a big part of our operations and keep customers happy.  And without knowing most of their interaction is with other employees (as recently discovered), there’s no way to gauge the impact they have on the railroad’s operations.  During our rollout, there were many times a user would hit a problem and call me over.  And if I was stumped after five or twenty minutes of working with them, I’d pass it down to the rest of my team of analysts to figure out.  If something looked like it was going to take more than a few hours, we’d find a workaround for them or tell them to use the old system.  But even the times we’d get something fixed two hours after finding it (which seems blazing fast to learn about an issue, analyze it, fix it, and get it to the users), we’ve still held something up by two hours.  Maybe it missed the train it was supposed to go on, so now it’ll get to a customer a day or two late.  That’s an immediate, direct impact on a customer – from something in IT we looked at as a small issue we quickly resolved.  When we had a problem working a certain type of bills one day, we held up several entire trains.  A manager came to me and said there was two million dollars of revenue being held up, and all I could do was apologize and try to speed up a workaround while others worked on the root problem.

Every new system has dozens and dozens of small issues, problems, bugs and quirks.  In IT, our jobs are to analyze them, figure out the best solutions, implement, test, verify, etc.  While all that’s going on though, someone else is stuck, and it’s affecting real train movement and real revenue.  When a CSR is on the phone with a customer trying to fix their waybill and my new system doesn’t allow what the old system used to, I’d better have a good answer and fast.  Because no one’s looking for an explanation of some tedious issue, they’re looking to get back a success message.  Anything less is failure and anything slower or harder than the old system may as well be unacceptable.


They can’t do what?

The thing I learned during this adventure that will probably stay with me the most – and lead me to always fight on behalf of usability – is the CSRs’ lack of basic computer skills.  I try to stay in perspective knowing that everyone I’m surrounded with at work and in my personal life (including family with the annoying “can you fix my email” questions) is toward the very top of the computer skill spectrum, but a few things really caught me off guard.  They’ve spent their entire careers (20+ years for many of them) using green-screen mainframe terminals.  But I figured they used a computer like a normal person at home.  I must be wrong.  Things like basic use of a web browser caused problems.  Despite my efforts, some still seem confused about the Back button.  If something opens in a new window, it’s as if the previous window is lost forever.  After a few hours, they’ll have a dozen browser windows open and have to close them all to start fresh because they’re getting confused, then blame me and my system for being too complicated.  Forget about tabbed browsing.  I thought it would clear up the issues with too many windows open – I was sorely mistaken.  I watched a user repeatedly right-click next to (next to – not in) a text field, get frustrated when paste wasn’t an option in the context menu, and decide apparently she wasn’t allowed to paste into that field.  Opening a link that’s saved as a favorite is a chore.  And a continuous problem (that in this case is fully the fault of IT) was a field that auto-completed to save a user from typing as much, but had too slow of an Ajax call which would fubar the data field.

After recovering from what seemed to be a third-world experience, I spent a lot of time thinking about broader usability issues.  If the Back button causes stress, can we make sure there’s always something on the screen to help navigate, even if it’s to a previous screen?  Instead of balancing new windows, modals, popouts, etc. based on what can or can’t be worked on while one is open, let’s focus on what’s less confusing to a user.  Features like auto-completion and data-lookups are great, but must work seamlessly.  Decisions that make sense in a conference room (it’ll take the same time, but be more accurate) become slower and frustrating when they require a process change.
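
To make the slow auto-complete problem concrete, here is a minimal sketch of one common guard: tag every lookup with a token and ignore any response that comes back after the user has kept typing.  Everything in it (the `AutoCompleteField` class, its methods, the sample values) is hypothetical and made up for illustration; it is not the code from the system described here, just one way the idea could look.

```python
class AutoCompleteField:
    """Hypothetical model of a text field backed by a slow lookup call."""

    def __init__(self):
        self.value = ""
        self._latest_request = 0  # token of the most recent lookup sent

    def type_text(self, text):
        """The user typed; return the token to send along with the lookup."""
        self.value = text
        self._latest_request += 1
        return self._latest_request

    def on_lookup_response(self, token, suggestion):
        """Apply a suggestion only if nothing was typed since it was requested."""
        if token != self._latest_request:
            # Stale response: the user kept typing, so leave the field alone.
            return False
        self.value = suggestion
        return True


field = AutoCompleteField()
first = field.type_text("CHI")              # lookup 1 goes out (and is slow)
second = field.type_text("CHICAGO YARD 7")  # user keeps typing, lookup 2 goes out

# The slow answer to lookup 1 finally arrives; it must not clobber the field.
applied_stale = field.on_lookup_response(first, "CHICAGO")
```

The point of the design is that the user's latest keystrokes always win; a late answer is simply dropped instead of overwriting what they typed.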

Going from a terminal session to a web-browser is a huge paradigm shift.  As IT professionals, we’re used to change; we embrace it and love seeing what’s new.  Our users couldn’t care less though – they just want to do their work as similarly as possible to how they did it yesterday.  If the old system put the cursor on the field with an error, we shouldn’t just display a message with the error – let’s put the focus on the same field too.  If we’re going to change a workflow, we can’t just say we’re changing it – we need to actually explain the new process to the users, demonstrate the corresponding parts, and explain how the inputs and outputs match with what they’re used to seeing.
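
The "put the cursor on the field with the error" behavior boils down to a small rule: after validating, pick the first failing field in on-screen order and focus it, in addition to showing the message.  The sketch below shows just that rule.  The field names and the toy "required" check are assumptions invented for the example, not the real waybill rules, and the actual focusing would be done by whatever UI layer is in use.

```python
# Hypothetical on-screen order of fields on a waybill-like form.
FIELD_ORDER = ["shipper", "origin", "destination", "commodity", "weight"]


def validate(form):
    """Return {field_name: message} for each problem found (toy rule: required)."""
    errors = {}
    for name in FIELD_ORDER:
        if not str(form.get(name, "")).strip():
            errors[name] = f"{name} is required"
    return errors


def field_to_focus(errors, field_order=FIELD_ORDER):
    """First field, in on-screen order, that has an error; None if the form is clean."""
    for name in field_order:
        if name in errors:
            return name
    return None


form = {"shipper": "ACME", "origin": "", "destination": "OMAHA",
        "commodity": "", "weight": "1200"}
errors = validate(form)
focus = field_to_focus(errors)  # the UI would move the cursor to this field
```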


The face of IT

Working beside my users, I became the face for my system, related systems, systems I’ve never seen before, and IT as a whole.  Sometimes it was a stressful role.  You can only listen to someone call your project – the thing you’ve been slaving on for a year; been debating features and implementations with teammates, devs, and managers; coddling through infrastructure changes; and pushing through Change Management to Production – worthless so many times in a day.  There were days I struggled not to take it personally.  “This thing doesn’t work,” they say.  I reply, “You need to click the highlighted button that the message told you to click.”  Why did I give up a month of my evenings working second shift with them to teach them to do what I’d programmed the app to tell them?

I tell them “I spent a month making it 3 times faster than before and am still trying to speed it up,” only to still hear “It’s too slow.”  “I’m sorry, a distributed system is inherently slower than a mainframe, and I can only do so much!”  It became my responsibility to explain strategic IT decisions to users who only focus on doing their jobs quickly and accurately.

I was rolling out a system they didn’t like before they’d even seen it.  Trying to beg the few evangelists we had to spread the word, but settling for being glad they weren’t complaining.



So now my time on this project is coming to a close.  I spent a year on this project and six months embedded with our customer, including a month working second shift (third shift will come after I’m likely on a new project).  So what am I taking away?  Just complaints about users not knowing how to use a web browser and hating anything new?  I’ve learned what our biggest internal customer actually does.  How they work and what they find important.  I’ve learned just how quickly IT issues can start to affect real business operations.  It’s not just major outages and production issues that cause problems – routine bugs kill too.  I’ve learned the incredible focus needed on usability and an entirely new perspective on the normal tradeoffs we make during development.

Did the working environment drive me crazy?  Did some of my users make me contemplate quitting?  Yes and yes.  But in the end, I’m focusing on the experience I was granted.  Six months of minimal coding, instead focusing on learning the business and our customers.  That’s an experience worth its weight in gold.

* A waybill (or just bill) is a document that says who’s shipping what, how much of it, from where to where, how it’s getting there, who’s paying, who’s responsible, any hazardous details, and any other relevant piece of information.  We only have a few thousand business rules to process and validate a waybill.

Wednesday, August 29, 2012

Didn't survive - Meetings and Theater

This blog's tagline is "Thoughts from the shower that survive to the keyboard."  What this means though, is that inevitably some great (okay, that's being generous) ideas probably won't quite make it to the keyboard.

This will become a recurring theme as I try to process half-lost ideas.

Leaving work today I had this sudden realization about how preparing for a meeting was exactly like running the sound board for my high school's theater.  I was thinking about how I had to get in early enough tomorrow to be ready for any of those last minute pre-meeting meetings or to get something set up or run some last data and it felt exactly like the rush of activity leading up to the curtain.  Except, they're not at all alike.  A stoplight or a gust of wind or a loud car grabbed my attention, and the next thing I knew my mind was off wandering, thinking about how the booth's CD player broke my senior year and trying to remember if I ever got reimbursed for its replacement.  I've got no idea.

So nope, no idea exactly why preparing for a meeting is like running a theater's sound board, but somewhere in my mind they're related.

Friday, June 8, 2012

Legacy refactor jitters

I just committed my first complete refactor of legacy code in a long while.  Broke it apart, added tests (based on the original code), rewrote it (based on the new tests), and now have very little way to verify it worked.

It’s kind of weird.  If the rewrite works correctly, there’s no way to tell anything changed.  That’s kind of anticlimactic.  If there’s a small problem, we’ll see a small difference – at some point (hopefully sooner than later).  If there’s a big problem, we’ll see a big difference – probably soon.  It’s an unsatisfying feeling – waiting for nothing to happen.  Is it not being used?  Is it working great?  Or is it working subtly incorrectly but no one’s noticed yet?

This section of code followed the book (Working Effectively with Legacy Code, Michael Feathers) for refactoring legacy code almost perfectly.  There were thankfully convenient, simple seams to break into.  I was able to isolate the code, smash it apart, and rebuild it in small individual parts using TDD.  It all went exactly how the books say it should. 

Except of course for not being able to put any tests around it beforehand to make sure I didn’t totally FUBAR it.

Is it always this unnerving?  Why doesn’t anyone (besides this half-attempt) blog about the psychological side of swdev?  My QA’s about to ask me how to verify it worked, and the best I can tell him is “if it works the same as before, it’s good.  Except for that little part that didn’t quite work right before that prompted this change – that part should work a little better now.”
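
For what it’s worth, the “if it works the same as before, it’s good” check can at least be written down as code.  Here is a minimal sketch of a characterization-style comparison: run the old and new versions over the same inputs and report any difference.  The two charge functions are hypothetical stand-ins invented for the example (the real code isn’t shown here), and this is only one way to do it.

```python
def legacy_charge(weight_lbs, hazardous):
    """Stand-in for the tangled original (hypothetical business rule)."""
    charge = weight_lbs * 2
    if hazardous:
        charge = charge + 50
    if weight_lbs > 1000:
        charge = charge - 10
    return charge


# The rewrite, broken into small pieces that can each be tested on their own.
def base_charge(weight_lbs):
    return weight_lbs * 2


def hazmat_surcharge(hazardous):
    return 50 if hazardous else 0


def heavy_discount(weight_lbs):
    return 10 if weight_lbs > 1000 else 0


def refactored_charge(weight_lbs, hazardous):
    return (base_charge(weight_lbs)
            + hazmat_surcharge(hazardous)
            - heavy_discount(weight_lbs))


def find_mismatches(old, new, cases):
    """Return (args, old_result, new_result) for every case where they differ."""
    return [(args, old(*args), new(*args))
            for args in cases
            if old(*args) != new(*args)]


# Inputs chosen around the boundaries the old code branches on.
CASES = [(w, h) for w in (0, 1, 999, 1000, 1001, 5000) for h in (False, True)]
mismatches = find_mismatches(legacy_charge, refactored_charge, CASES)
```

An empty `mismatches` list only says the two versions agree on these inputs; it says nothing about inputs nobody thought to try, which is exactly the unnerving part.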

I’m sure I’ll be revisiting this topic (and this section of code) in the future.

Tuesday, May 22, 2012

Two weeks from last Tuesday to a three-quarters left-turn next quarter: My take on time and measuring progress

This may be an unfortunately Millennial perspective, but I’ve been out of school and at work for four years now, and there are some things that continue to faze me.

Mostly, I’m still a little amazed that there is such vagueness in time.  In school, there is a clear First Day of School and a Last Day.  There are semester breaks, finals, exam days, due dates, etc. ad infinitum.  Now, any due dates are arbitrary.  Why does something need to be done by next Wednesday?  Because your manager heard from his manager who misunderstood a customer saying it’s needed next week.

Despite some days varying wildly from others, many are mere copies of yesterday, which is a copy of the day before, which is a copy of last week, which was the same as next week and next month will be.  As I write this, I’m on mute, 43 minutes into a daily morning conference call that lasts between an hour and an hour and a half.  I say 3 sentences, about 20 minutes in, explaining what I did yesterday and will do today.  Yesterday I got in, wrote some code, and gave a project demo to some new users (who aren’t actually using the new system yet anyway).  Today is unfortunately the exact same day.  But if today instead a server in the basement explodes, 40 new users all try to use our system simultaneously, and my company selects me as the new Acting CEO, tomorrow I’d still get on the call at 7, at about 7:20 I’d summarize that into 2 sentences (okay, that would probably get a third or fourth sentence), then go on mute while everyone else talks.

How do you mark time in this kind of environment?  By the constant, never-ending change.  What?  What change?  I thought you just said things are never changing?  Managers move around, teams shift projects, people move to new teams.  There are promotions and lateral moves.  Reorgs are announced, then changed, then announced again – the dates people start doing new work have no correlation to when changes are effective.  You think you know who to talk to about something in another department, but when you talk to them 6 months later they’ve moved to a new team.  Did something happen three months ago or six?  It was last year already, wow.  Was I working for my old manager then or the current one?  I can’t remember, there was a long slow transition period.  I remember we were sitting in a different set of cubes or on a different floor, but when did that move happen?

Projects start.  Well, projects don’t start, they wind up.  By the time a team is put together to work on it, management has it all figured out, even though there’s nothing solid to know anything about.  Users have been talked to, but not in any orderly way.  Architects have been designing it to fit their Grand Plans.  By the time there’s a team in place to do it, it’s already heading down the wrong path and too much has been done for management to want to start it over.  If you do start over, you begin a cycle of continuous restarts.  Projects get going and start rolling down the hill.  Slowly picking up speed until there’s enough there it actually can do a simple function!  There’s a first production deployment!  …it fails…  Then there’s the hot production deployment!  …it fails…  Managers get together, architects redesign, team leads look busy, developers ignore direction and do whatever they think’s needed.  Weeks of intense testing…another deployment…and it works!  Well, 90% works, so let’s have a team lunch to celebrate!  Oh wait, we still don’t have any users.  So when the first user tries it… looks like lunch was premature.

By this point, you’re 8 months in.  What do you have to show for 2/3 of a year?  A system that in very carefully controlled situations can mostly but not authoritatively do part of what a small group of people want it to do.  How long until it does everything it’s supposed to do?  Right about the time a total re-write is due.

Nothing’s ever done.  Nothing’s ever complete.  How do you measure success when there’s no completion?  There’s too big a gap between “done” and “done done”.  I find a defect and make a change.  It takes a few hours or a day and I call it done.  When’s it get deployed – actually used?  Do I get to feel accomplished that I wrote code not being used yet?  Or should I feel successful instead when something I did a month (or three) ago and have moved on from since finally got turned on last week?

I spent my first nine months here working on a project.  It never got finished – my manager told me to work on something else instead.  What was I supposed to feel about that effort?  A year later another teammate picked up my work and continued, to meet the same fate.  Currently, another teammate is finally, slowly, completing it.  Its completion is built on the core of my work, but I have no satisfaction from it.  There’s a mixture of disappointment it took so long; amusement it’s finally being finished; more disappointment because I would have liked to be the one to complete it; a feeling of incompetence that my work couldn’t get finished until this much later; and a few other generally downer thoughts.  A good early success in my career would have been nice.

A few years ago my team had a big infrastructure change.  We worked hard for over a month, and when it was “done” our architect bought us lunch to celebrate.  Over 100 days later, we finally finished by my standard – and by the time I’d finished this part of the project, the rest of the team was working on something else.  That’s it – that’s the last and only time I’ve felt like a specific thing was distinctly finished.

My current project?  Training 120 users on a new system.  Two at a time.  Okay, 60 demos, that’s easy to track, right?  Easy to mark progress and know when we’re done?  Except we don’t actually know how many users need to be trained.  And the real purpose isn’t to train them on the new system – it’s to have them stop using the old one and start using the new one.  We’ve trained about 30 people so far – but only maybe 4 are actually using it.  How do I mark this progress?  When do I get my “we finished” lunch for this?  When 100% of users are trained, but not using it?  When 100% of them are using it?  Define “using it” first.  When we turn off the old system?  That’s not going to happen – it does too much other stuff that’s not being replaced by the new one, so it’ll live on in a lobotomized fashion.  We could celebrate the lobotomy because that means we’re in charge now, but celebrating our completion by another team’s intended mutilation isn’t very satisfying.

And then, people leave.  This is the part that continues to confound me the most.  We hire people mostly matching with the academic calendar – we have a large influx of college hires in June and January.  But then people decide to leave.  And I’m not talking about the constant roar of upcoming retirements (once you’ve been here 35-40+ years, I’ll dearly miss your knowledge but your leaving is anticipated).  Just out of the blue one day someone will email some friends or send a Tweet, or word will come through the grapevine, that someone’s leaving.  Two weeks from yesterday someone you’ve worked with, eaten lunch with, kvetched about work and life with and battled production problems and management with is gone.  What made yesterday worse than the day before to prompt you to leave?  I get that it’s a natural cycle and not usually a specific event, but it’s still so ingrained that there should be natural endings – the end of the project, the final exam, the semester or year ends.  But next Thursday seems so ethereal.

How are you supposed to keep up with all this constant change?  More emails flooding your inbox every time someone you’ve never heard of changes teams?  Org charts that show management hierarchies without corresponding project information?

How do we measure time, measure success and failure, how do we measure growth and progress or lack thereof in such a world?  It’s the middle of the 2012 2nd quarter – are things as they should be?  Better?  Worse?  Better or worse than what, than when?  Friends and mentors who have taught and trained me have left and continue to leave, and new employees I teach and train continue to come.  Who’s going to stay?  Who’ll leave soon?  Who will be working on what, and what will my relationship with them be?  What will I be working on?  What’s more important, less important?  When is something complete, when is there something to call a success?  When is something a failure and not just a mistake or setback or delay?

What are the answers to these questions?  Are there answers?  When will I start figuring them out, and who will help get me there?  How will it happen?  When will it end, and how will I  know?

Tuesday, May 15, 2012

Primary day

Hello on the evening of Nebraska's Democratic and Republican primaries.  And aren't they exciting, with Pres. Obama being unopposed and Gov. Romney running against no one who's still running.  But that's nothing of note.

What's irritated me for a while - and increasingly so - is that primaries are state funded.  Why is this so?  They are purely political functions; they are a mechanism for national (and state and local) political committees to choose who will represent them in the general election.  While personally I appreciate being able to vote for this, I don't understand why my tax dollars are being spent on this.  Each party could put candidates' names on pieces of paper and whichever one gets eaten by a donkey or elephant last could be the winner.  What I mean is, how each party determines who will represent them is their own decision.

Further, why does the state have to spend tax-dollars on this process?  I'm not sure how much it costs to hold today's election, but on one side there was an unopposed race and on the other there was a race with only one candidate still running.  Yes, I know the down ballot issues are important, but the only one people really care about was essentially meaningless.

I hate to sound crotchety and say "and another thing," but my other problem with this involves registering my party affiliation with the state.  Why's it any of their business?  I know that who votes is public record, but why is my party affiliation?  And more importantly, why do I have to tell the state?  How can they tell me I can't vote in either primary?  Can they make other decisions based on my affiliation?  Tax rates?  Unemployment benefits?  Hiring or firing?  Timeliness of EMTs or garbage collection?  I know (well, hope) these decisions aren't being affected by it (and trust in the courts to protect that), but would rather not be required to register my political decisions with the state.

I hope everybody in Arkansas, Kentucky, Texas, California, Montana, New Jersey, New Mexico, South Dakota, and Utah finds their primaries more meaningful than I did.  But it's not likely.

Edit: I forgot the last part of my rant.
I have no problem with the state holding elections that are used for primaries.  There are several (mostly unopposed, entirely unimportant) non-partisan elections.  So since the state's already going through the expense of the election, if a party chooses to use it to select their representatives, they can feel free to.
For a fee.
This isn't just a gimmick to help the states balance the budget.  These elections are expensive and are primarily for the parties' purposes, so let's make them pay for at least part of the expense.  Otherwise, privatize the whole partisan part of the election and make each party hold and pay for its own.

Monday, May 7, 2012

Hello world!

Hello, world.  It's bugged me for a while that "hello world" is missing the comma.  Greetings like "hello" are supposed to have a comma after them.
I promise this blog won't be entirely about grammar.  Although it may be a periodic topic.