Automated DevOps Orchestration on Cloud


(Disclaimer: the blog posts here represent only the author’s own views. The OneOps project does not guarantee support or warranty for any code, tutorial, or documentation discussed here.)

Application Lifecycle Management (ALM), sometimes used interchangeably with “DevOps”, is a broad area, but it never lacks excellent open-source or free tools. Some popular tools are listed by category. Please note that (1) this list does not aim to be complete, and (2) some tools may cover more than one category.

 

 

In theory, the existing open-source tools should cover most of these aspects and could be assembled into a DevOps workflow, such as:

submit code change -> integration tests -> release -> deployment -> monitoring -> alerting.
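
As a rough illustration of the workflow above, the stages can be modeled as an ordered chain in which each stage runs only if the previous one succeeded. This is a minimal sketch that is not tied to any particular tool: the stage names mirror the arrows above, while `run_pipeline` and the stage callables are hypothetical placeholders for whatever real tools an organization plugs in.

```python
from typing import Callable, List, Tuple

# Stage names mirror the workflow above; in practice each would call a real tool.
STAGES = [
    "submit code change",
    "integration tests",
    "release",
    "deployment",
    "monitoring",
    "alerting",
]


def run_pipeline(stages: List[Tuple[str, Callable[[], bool]]]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, action in stages:
        if not action():
            break
        completed.append(name)
    return completed


# Hypothetical run in which the "release" stage fails.
demo = [(name, (lambda n=name: n != "release")) for name in STAGES]
print(run_pipeline(demo))
```

In this hypothetical run only the first two stages complete, because the chain stops as soon as "release" reports a failure.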

What is the current status of building a DevOps workflow?

Based on my preliminary survey of various types of organizations, my personal observations are as follows:

  • Tech start-ups are more likely to hire engineers to do the fundamental DevOps work (sometimes still not fully automated) than to pay vendors. The reasons are: (1) tight budgets, (2) a reluctance to be “locked in” to a vendor at a very early stage, and (3) the typically high cost of switching between vendors.
  • Mid-size tech companies with strong engineering teams are able to customize the open-source DevOps tools to better fit their use cases. Meanwhile, they can stay up to date with the latest commits while maintaining their own internal branch.
  • Large companies fall into the following three categories:
  1. They use vendors’ products or outsource (fully or partially), simply because they have enough budget and this is the quickest way to get things built.
  2. If they have a strong engineering team, they may code up their own entire DevOps workflow at the organization level, possibly with some advantages over open-source tools. They do this either because they have a long-term goal that justifies the investment in this space, or because they foresee that the existing open-source tools will never solve their problems at scale.
  3. Every engineering team builds and maintains its own customized workflow based on the open-source tools. (This is much like the tech start-up approach, but a large organization could easily have over 50 engineering teams, so this practice looks more like “snowflakes”, where each team runs DevOps separately.)

What are the pain points of building a DevOps workflow based on open-source?

Clearly, there are at least three issues that need to be solved:

  1. Identify the appropriate tools that fit into the particular organization’s use cases.
  2. Integrate or assemble them, and keep them compatible with each other through software upgrades and bug patches.
  3. Most of them are self-contained, so managing and monitoring them separately does not scale, leading to higher costs and more errors.

Ideally, a DevOps solution would support the most popular open-source or free tools, handle their integration seamlessly, and orchestrate the entire DevOps workflow and its tasks in one central portal.

A potentially low-cost or free solution for DevOps orchestration?

In Feb 2016, WalmartLabs released a new open-source PaaS project called OneOps, a multi-cloud DevOps orchestration platform that is driving the technology transformation of Walmart Global eCommerce (walmart.com), one of the largest online retailers worldwide.


After trying and testing OneOps for a few months, I think its main benefits are:

  • DevOps orchestration: it integrates major open-source or free DevOps tools and orchestrates them in one single UI.
  • Model-driven, “best-practice”-based application templates: create once, deploy an unlimited number of times.
  • Support for deployment on major public and private clouds.
  • Operational excellence: it auto-scales the application by dynamically provisioning machines according to workload, auto-repairs unhealthy machines and slow applications, and auto-replaces bad machines and dead applications.
  • Promotion of the DevOps culture: the dependencies among developers, QA, and operations (SRE) are weakened, which accelerates product delivery and reduces engineering costs.
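
To make the “operational excellence” bullet more concrete, the sketch below shows one way the auto-repair, auto-replace, and auto-scale decisions could be expressed. It is an illustration only and does not describe how OneOps actually implements these features: the `Machine` type, the function names, and every threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    healthy: bool        # result of the latest health check
    failed_repairs: int  # consecutive repair attempts that did not help


def next_action(machine: Machine, max_repairs: int = 3) -> str:
    """Pick an operational action for one machine (illustrative policy only)."""
    if machine.healthy:
        return "none"
    if machine.failed_repairs < max_repairs:
        return "repair"   # e.g. restart the application or service
    return "replace"      # stop repairing and provision a new machine


def desired_count(current: int, avg_load: float,
                  scale_up_at: float = 0.8, scale_down_at: float = 0.3,
                  minimum: int = 1) -> int:
    """Return the machine count after one scaling step.

    avg_load is the average utilization across machines, from 0.0 to 1.0.
    The thresholds are assumed values, not recommendations.
    """
    if avg_load > scale_up_at:
        return current + 1
    if avg_load < scale_down_at and current > minimum:
        return current - 1
    return current
```

The point of the sketch is the ordering: repair is tried first, replacement is the fallback after repeated failures, and scaling reacts to load independently of the health of any single machine.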

What major problems could OneOps solve? Who may need OneOps?

(1) Organizations that want to move their shared physical infrastructure and applications to cloud-based hosts, because of the following issues that arise when resources are shared:

  • It is difficult to guarantee performance under an SLA: when one team suddenly takes on a workload “spike”, other teams are affected.
  • Security/compliance issues: one team could delete data owned by another team by mistake.
  • “One-size-fits-all” configuration: there are very few cases where “one-size-fits-all” really makes sense for software and hardware configurations. Each team wants them optimized for its own specialized workload.
  • As tenants check in and out, the physical hardware and software resources can become over-subscribed or under-utilized.
  • Installing new hardware/software or replacing existing ones requires following complex IT procedures, and the wait for approval can be long.

How could cloud-based infrastructure solve the above problems?

  • Elasticity is the key to the cloud; each team or user can create dedicated infrastructure and applications for their own (performance) goals.
  • It is easy to apply security/compliance rules to prevent unrecognized logins.
  • Infrastructure and application configuration can be fine-tuned independently.
  • Teams can choose among various virtual machine types (CPU-optimized, memory-optimized) and request particular computing resources when needed.
  • Spinning up a virtual machine takes just a matter of minutes.

In particular, OneOps is a cloud-oriented platform that supports major public IaaS providers (such as AWS, Azure, and Rackspace), private clouds (OpenStack), and even bare-metal provisioning (OpenStack Ironic).

A post by a VP of Engineering at LinkedIn also discusses their OneOps-like solution, which is shifting their database provisioning from physical boxes to the cloud: https://www.linkedin.com/pulse/invisible-infrastructure-alex-vauthey

(2) Organizations that want to reduce their operational costs. The motivation is that most teams hire dedicated operations staff and use diverse technologies (Puppet, Chef, Ansible…) solely to operate their own team’s services. In many cases, the operations staff know little about how the application is programmed, so the only way to “save” the application in an emergency is to hit the “restart button”.

OneOps adopts a DevOps culture; it orchestrates all DevOps tasks in a single UI, which makes operational tasks much easier and more accessible. Besides writing the code, the engineers also deploy, monitor, and scale it, truly taking ownership of the code throughout its entire life cycle.

(3) Organizations that want to promote engineering best practices across multiple engineering teams. The motivation comes from the “reinventing the wheel” problem: core and common technologies (e.g. messaging, cache, data store…) are developed and operated separately by each individual team, without much sharing of knowledge and best practices.

OneOps can model and abstract the core and common technologies as a reusable “Design” (or “Cookbook”, “Template”), which contains the “best practices” for deployment and operation. Teams that need a certain technology only need to use the application template on OneOps and deploy it to their preferred cloud, so they can focus fully on their own unique application-level development without worrying about how to implement and operate the common technologies with best practices.
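
The “create once, deploy many times” idea behind such a template can be sketched roughly as follows. This is not the OneOps data model or API: the `Design` class, its fields, and the `deploy` method are hypothetical names invented for illustration, and a real template would carry far more than a flat dictionary of settings.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Design:
    """A reusable application template (hypothetical structure)."""
    name: str
    defaults: Dict[str, str] = field(default_factory=dict)

    def deploy(self, cloud: str, **overrides: str) -> Dict[str, str]:
        """Build the configuration for one environment on the given cloud.

        The shared best-practice defaults are copied, then per-environment
        overrides are applied, so the template itself is never modified.
        """
        config = dict(self.defaults)
        config.update(overrides)
        config["design"] = self.name
        config["cloud"] = cloud
        return config


# One shared template, two environments on different clouds.
cache = Design("cache", {"port": "8080", "replicas": "2"})
staging = cache.deploy("openstack")
production = cache.deploy("aws", replicas="6")
```

Here the staging environment inherits the defaults unchanged, while production overrides only the replica count; the team that owns the template maintains the defaults in one place.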

(4) Organizations that want faster product releases. Many engineering teams have rigid product release schedules, ranging from weeks to months, because of the dependencies from developers to QA, from local testing to cluster-wide staging, and from one version to another.

OneOps uses cloud-based infrastructure and best-practice-based application templates, so rolling out a testing or staging environment that resembles production is much faster (on the order of minutes) and easier. Plus, because OneOps follows the DevOps model, the dependencies among developers, QA, and Ops/SRE are weakened. Furthermore, OneOps can integrate with Maven/Jenkins/Nexus to give the team’s software development lifecycle (SDLC) Continuous Delivery capabilities, which speeds up the release process.

(5) Organizations that have diverse, “snowflake” DevOps practices (high costs, repeated investment, incompatibility) but are looking for a DevOps “standard” across the organization, so that all teams follow the same best practices on a generic DevOps orchestration platform.

OneOps orchestrates the entire DevOps workflow by utilizing and integrating existing open-source tools (e.g. Chef, Puppet, Nagios, Git, Jenkins, Nexus…), so existing users of these tools face a minimal learning curve when adopting OneOps. Moreover, OneOps itself is an open-source project with a rich set of upstream, best-practice-based application templates that organizations can “grab-n-use” (and modify if needed), which prevents vendor “lock-in”. Last, OneOps has been proven in production at Walmart.com, where it supports hundreds of applications on the website and millions of transactions per day.

What’s Next?

In upcoming posts, I plan to introduce how an application template can be deployed on a cloud-based infrastructure (for example, OpenStack) through OneOps, followed by a “road show” of deploying and using the application template on OneOps.
