Category Archives: Operations

I’m a Data Professional (and I’m Okay)

I recently attended SciDataCon (part of International Data Week) in Denver, a conference focused on all aspects of the science data workflow – from planning to long-term archival, and all the people, processes, and other things surrounding it. One talk in particular that resonated with me was a panel discussion called “Defining the Data Professional,” in which the panel discussed the difference between a data scientist and the myriad other roles involved in the data lifecycle that broadly define the professional field.

Production Software Support

At work, I’ve been involved in a new effort to overhaul how we handle the development support of our production software.  As the Ops lead, that is something I am very passionate about, so I am excited to be involved in this effort.  I have to admit that I’m somewhat of an idealist when it comes to projects at work, but I have this vision that production software live in production…

Interfacing with Remote Developers

One of the interesting things about my job is that I get to work with a lot of developers. I recently spent a week in Riverdale, MD with a group of developers that we usually only work with remotely – via phone or e-mail, or when we enter a trouble ticket against a system issue or software bug. Working with this team face-to-face was a really rewarding experience.

One of the things that I have learned is that it’s very easy for our team to reach out to developers when something isn’t working. From a professional standpoint, we have all the things necessary to strike up a conversation – something in common (the software), a catalyst (the bug), and a goal (making it work again). With our local development team, whom we get to see every day, it’s easy to say, “hey – everything is working swell again, thanks.” With our remote developers, this is a lot harder; it doesn’t really make sense to call them up and say, “Hey – I just wanted to call and say everything is a-ok!” – there is no catalyst.

Talking to Computer Support

Jessa recently bought a new computer from HP, and, unfortunately, the graphics card was defective on arrival. After doing a fair amount of troubleshooting ourselves (and, truthfully, correctly diagnosing the problem), it was time to call up HP support to get the ball rolling on getting a replacement. Listening to Jessa talk to support reminded me why talking to tech support is never a fun proposition. At the core, I think, is that neither person on the call trusts that the other knows what they are doing.

As technical customers with years of training and job experience in technical troubleshooting, Jessa and I did a lot of legwork before we even called technical support. As such, our hope was to be able to start somewhere closer to what we had narrowed down. The tech’s early questions, such as “what have you tried so far?”, made it seem like we might. But the hours we spent afterward doing things like turning it off and on again, reinstalling software, reinstalling drivers, resetting to factory settings, and, finally, getting the computer into a state where it wouldn’t stay booted made us lose trust in our tech. He asked what we had done, then did it all over again, and got no closer to the problem (and, in fact, made it worse).

Good Operational Software – Part 2

This is the second part of my thoughts on good operational software. For part one, go back one post.

The second is separating configuration management from software development. Like logging, this one is easy on the surface; it’s pretty trivial to move the main configuration parameters into a configuration file rather than hard-coding everything. The most interesting thing about this, however, is that it exposes dependencies in an explicit way. For example, if an application is dependent on a database connection, then there must be configuration related to that database – the host, port, database name, user, and (hopefully encrypted or obscured) password.
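
To make the database example concrete, here is a minimal sketch in Python of what pulling those settings out of the code might look like. The section name, key names, and sample values are all hypothetical, made up for illustration; this is not how any particular NSIDC application does it. The point is only that a missing setting fails loudly at startup, which is one way the dependency becomes explicit.

```python
import configparser

# Hypothetical key names for the settings listed above: host, port,
# database name, user, and password.
REQUIRED_DB_KEYS = ("host", "port", "name", "user", "password")


def load_db_config(text):
    """Parse INI-style configuration text and return the database settings."""
    # interpolation=None so a '%' character in a password is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("database"):
        raise ValueError("missing [database] section")
    section = parser["database"]
    # Report every missing setting at once instead of failing on first use.
    missing = [key for key in REQUIRED_DB_KEYS if key not in section]
    if missing:
        raise ValueError("missing database settings: " + ", ".join(missing))
    settings = dict(section)
    settings["port"] = int(settings["port"])
    return settings


# Made-up sample values. A real password would be encrypted or obscured
# rather than stored in plain text like this placeholder.
EXAMPLE = """
[database]
host = db.example.org
port = 5432
name = archive
user = ops
password = change-me
"""

if __name__ == "__main__":
    print(load_db_config(EXAMPLE))
```

In practice the text would come from a file kept outside the source tree, so that Operations can change the host or credentials without touching the code.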

Good Operational Software – Part 1

As the Operations lead, I find myself pondering what the difference is between good software and good operational software. The software development team here at NSIDC is a sharp group; they know good software when they see it. Further, many of them know software that is not good operational software when they see it. But, as a technical group, I don’t think we’ve all nailed down what set of features we can use as a benchmark for “this makes the software operational.” As far as I can tell, these things vary depending on the project, and, like Science Fiction, “good operational software is best described by pointing at it.”

That being said, by way of getting some ideas out there, there are three things that I have noticed I consistently point at and say, “that makes this software operational.”

New Operations White Board

This year, during the holiday season, I hatched an idea for a new white board for NSIDC Operations. Our old white board was serviceable, but getting long in the tooth and not as functional as we have needed more recently. So, I ordered a new magnetic white board and started putting together something a bit more flexible that could change as our duties evolve.

Below are pictures of the old and new white boards:

It Actually Paid Off

This past Sunday, I had one of those experiences that pretty much everyone dreads – I had a hard drive fail. However, it was also one of those days where all the prior preparations actually paid off. When I set up the archive disk on my server, I used a software RAID (level 1), so the data was duplicated across a second drive. Also, I had purchased a couple of other drives beforehand as “cold spares”. Even with all the preparations, it was still non-trivial to replace everything and get it running, so I thought I would document my experiences here.
