
2018 DevOps Days Minneapolis: Recap (part 2)

So much stuff to write about, I broke it up into two parts. Part 1 is available here.

Keynote #4: Staying Alive: Patterns for Failure Management from the Bottom of the Ocean
Deep-water scuba diving is both complex and deadly. Systems and practices are therefore put in place to ensure that failures are contained rather than allowed to cascade.

We build safety into complex systems, but failures still cascade from one system to the next (system integrations, anyone?). How do we contain them? Ronnie makes three solid points here:

  1. Unused safety systems don't exist.
  2. Untested safety systems don't exist either.
  3. Unused and untested safety systems are more dangerous than nothing at all.
If failure is inevitable, a safety system built to contain a failure but never tested is an unknown. So what is risk?

Risk for divers is based on two things:
  • the chance of occurrence
  • the degree of regret if it does occur
For a diver, the risk of an inaccurate air gauge is low. But if it is inaccurate, you would certainly regret running out of air 300 ft below the surface. That's why you carry two gauges. For an IT service organization, the risk of a failure might be low, but you may regret delivering a substandard experience to your customers.
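The two-factor risk model above can be sketched as a tiny score: weight how likely a failure is by how much you would regret it. This is a minimal illustration of the talk's idea, not a formula from the presentation; the function name and the specific weights are my own.

```python
def risk(likelihood, regret):
    """Combine chance of occurrence (0-1) with regret if it occurs (0-10)."""
    return likelihood * regret

# A diver's air gauge: unlikely to be wrong, catastrophic if it is.
gauge_risk = risk(likelihood=0.01, regret=10)

# A cosmetic UI glitch: fairly likely, barely regrettable.
glitch_risk = risk(likelihood=0.4, regret=0.2)

# High-regret items justify redundancy (a second gauge) even when
# the likelihood of failure is low.
print(gauge_risk, glitch_risk)
```

The point of the model is that a low-probability, high-regret failure can still outrank a common but trivial one, which is exactly why the diver carries a second gauge.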

Some other quick bullets and take-aways: 
  • Like a marching army, a team can only go as fast as the slowest person. Let the least experienced person lead and let everyone help. Everyone will be invested in the success. 
  • The goal of any post mortem is to better understand the complex system. 
  • Pre mortem or "What If" analysis. Include the highly likely and highly regrettable situations. 
  • Survivorship bias is assuming that just because you lived, you made the right decision. Revisit past decisions and review operations to confirm they were the best ones.
  • Celebrate the success of non-failure: boring launches, uptime metrics. Don't reward heroic acts.
Workshop #1: Kubernetes 101
I skipped two open spaces sessions after lunch to learn about Kubernetes. The link above contains the full slide deck from the session, and it was totally great. I honestly knew nothing about Kubernetes at the start of the session, and now I know how to deploy containers with it.
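For anyone curious what "deploying containers" with Kubernetes actually looks like, it boils down to a short manifest like the sketch below. This is a minimal example of my own, not from the session's slides; the names and the nginx image are placeholders.

```yaml
# Minimal Deployment manifest: runs one nginx container.
# Apply with: kubectl apply -f deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-deploy        # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello
  template:
    metadata:
      labels:
        app: hello
    spec:
      containers:
        - name: hello
          image: nginx:1.25  # placeholder image
          ports:
            - containerPort: 80
```

Kubernetes then works to keep the declared state (one running replica) true, restarting or rescheduling the container if it dies.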


Seems like a good idea. I didn't get much out of this session.

Keynote #6: Ship of Fools: Securing Kubernetes
I funneled all my excitement from Kubernetes 101 mostly into terror about how attackers will use it to end my career. Ian Coldwater gave a great presentation that not only covered some of Kubernetes's glaring insecure defaults but also gave insight into the thinking of an attacker who might want to gain access to your systems.

Biggest takeaway was Shodan, a search engine for the Internet of Things. Want to look for open ports in a building near you? Check it out.

Ian also discussed building a threat model to answer the following questions: What are you protecting? How do other systems access what you are protecting? Just like safety systems, security systems should be layered to prevent cascading security failures.

Keynote #7: Iterative Technical Design
You should have a technical design. It should be iterative. Refer back to it to make sure you aren't building something that's not needed.

Amy Chen gave an interesting talk that really could have been called "Applied Product Thinking". I translated her presentation into a formula and an outline for a design document.

Product = code (0.25) + everything else (0.75)

where code is the manifestation of ideas, discussions, and collaboration, yet represents only a quarter of the total effort of a product; and where everything else includes deadlines, handoffs, testing, requirements gathering, road mapping, etc.
A Product Design Document should contain all of the following:

  • Overview
  • Goals + Non-Goals (aka scope)
  • Background
  • Minimum Viable Product requirements
  • the Design
Tim Gross concluded the keynotes with a discussion of the ethical concerns of developing software and services. Our industry moves too fast for strong governmental regulation or even meaningful self-regulation. Therefore it's up to us to consider the impacts of the technology we are building. My mind was wandering towards lunch and I didn't take many notes here. Sorry.

OpenSpace #1: DevOps KPIs
I was so intrigued by Amy Patton's discussion of DevOps KPIs that I went to another discussion on the topic. In fact, there is a lot more here that I'd like to cover, so I'm going to break it out into a future post.
