(Disclaimer: the blogs posted here currently neither represent Walmart’s official perspectives regarding the technologies, nor are they meant to provide any support or warranty of the code, tutorials, documentation, etc.)
Apache Cassandra, a top-level Apache project born at Facebook and built on ideas from Amazon’s Dynamo and Google’s BigTable, is a distributed database for managing large amounts of structured or unstructured data across commodity servers. Cassandra provides a highly available, scalable, and redundant service with no single point of failure.
Cassandra’s architecture is by nature designed for scaling and continuous uptime. Instead of using a common master-slave or sharded architecture, Cassandra has a “ring” design similar to Dynamo’s that is elegant and easy to set up and maintain.
In this blog, I would like to describe the following three items:
- Cassandra deployment on cloud via OneOps
- Easy scaling and automatic failure handling feature by OneOps
- Cassandra performance monitoring on OneOps
In OneOps “Design” phase, choose “Cassandra” pack:
After creating the Cassandra design, you may click the “cassandra” component to change the default configurations of Cassandra.
In particular, in the “topology” section, the Cassandra pack uses GossipingPropertyFileSnitch by default, since that is what the community recommends for production. Cassandra is also datacenter- and rack-aware: the “Map of Cloud to DC:Rack” attribute is where you specify the mapping between OneOps clouds and <Data Center:Rack> pairs, which takes effect for GossipingPropertyFileSnitch and PropertyFileSnitch. For experimental purposes, it is totally fine to leave it blank. For more details about snitches and Cassandra topology, please refer to this.
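With GossipingPropertyFileSnitch, each node advertises its own datacenter and rack, which it reads from its local cassandra-rackdc.properties file; the “Map of Cloud to DC:Rack” attribute is what OneOps uses to generate this file on each node. A minimal hand-written example, with values matching the “dal” datacenter and rack “2” seen later in the `nodetool status` output:

```properties
# cassandra-rackdc.properties -- read by GossipingPropertyFileSnitch.
# "dal" and "2" are the datacenter and rack names from this example.
dc=dal
rack=2
```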
In the Design phase, add a new “user” component and attach your local SSH key to it, so that you can log directly into the Cassandra VMs after the deployment.
After saving the design, create a new environment with “Availability Mode” = redundant and choose one cloud as the “primary cloud” (fine for experimental purposes). For setting up a cloud in OneOps, please refer to one of my previous blogs.
By default, a Cassandra cluster with 3 VMs will be created. The deployment plan will look like the following (the number of compute instances is 3, meaning 3 VMs will be created):
After the deployment finishes, we can SSH into the VMs. To check a VM’s IP address, go to “Operate” -> your_cassandra_platform_name -> “compute”. There we will see the details of the compute instances, where the IP addresses are shown.
From the local machine, SSH into one of the VMs, then run nodetool, a Cassandra utility, to show the status of the cluster:
```
[nzhang2@cas-238213-1-23632914 ~]$ nodetool status
Datacenter: dal
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.65.227.135  51.6 KB   256     67.3%             4d99d6ad-889a-42d5-b288-b79c02dd1c9a  2
UN  10.65.227.121  66.08 KB  256     63.3%             1e3e77b4-c39f-40f8-85ed-830a22d61afc  2
UN  10.65.227.12   99.01 KB  256     69.4%             e4867580-d77c-47ce-9c8d-1b8ddc8f781d  2
```
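The first column of each row combines status (U = up, D = down) with state (N = normal, L = leaving, J = joining, M = moving). If you ever want to script against this output, a quick sketch (the sample lines below are hard-coded from the output above, not fetched live):

```shell
# Count nodes reported Up/Normal ("UN") in a captured `nodetool status` dump.
# The sample text is pasted from the example output above for illustration.
sample='UN  10.65.227.135  51.6 KB   256  67.3%  4d99d6ad-889a-42d5-b288-b79c02dd1c9a  2
UN  10.65.227.121  66.08 KB  256  63.3%  1e3e77b4-c39f-40f8-85ed-830a22d61afc  2
UN  10.65.227.12   99.01 KB  256  69.4%  e4867580-d77c-47ce-9c8d-1b8ddc8f781d  2'
up=$(printf '%s\n' "$sample" | grep -c '^UN')
echo "up nodes: $up"   # prints "up nodes: 3"
```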
Easy Scaling and Automatic Failure Handling by OneOps
It is typical for a Cassandra admin to scale up cluster capacity as the workload grows. Without help from automation software like OneOps, the admin has to manually follow several steps to safely add new nodes to an existing cluster. OneOps hides those manual and error-prone details behind a couple of mouse clicks. Let’s see how this happens:
In the OneOps “Transition” phase, choose the Cassandra environment, then on the right, click your_cassandra_platform_name. You will then see the “Scaling” section:
Change the value of “Current” from 3 to 4 to add one more node into the existing cluster. Save and deploy.
After the deployment is done, let’s run “nodetool status” command to see the output:
```
[nzhang2@cas-238213-1-23632914 ~]$ nodetool status
Datacenter: dal
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.65.227.135  109.28 KB  256     51.6%             4d99d6ad-889a-42d5-b288-b79c02dd1c9a  2
UN  10.65.227.121  123.55 KB  256     51.5%             1e3e77b4-c39f-40f8-85ed-830a22d61afc  2
UN  10.65.227.12   156.38 KB  256     48.3%             e4867580-d77c-47ce-9c8d-1b8ddc8f781d  2
UN  10.65.227.14   66.57 KB   256     48.7%             60812954-0529-45bd-8f39-12a15a02ac01  2
```
Now we see that the cluster has one more node and the load can be evenly distributed over all 4 nodes, and we did not have to type a single command to make this happen!
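For reference, the manual procedure OneOps just saved us from looks roughly like the sketch below. It is a dry run that only prints the commands (exact package names and config edits vary by environment, and are assumptions here):

```shell
# Dry-run sketch of manually adding a Cassandra node; set DRY_RUN=0 on a
# real node. These are the usual manual steps, not OneOps internals.
DRY_RUN=1
run() { if [ "$DRY_RUN" -eq 1 ]; then echo "would run: $*"; else "$@"; fi; }

run yum install -y cassandra          # 1. install the same Cassandra version
# 2. edit cassandra.yaml: cluster_name, seeds, listen_address (not shown)
run service cassandra start           # 3. start; the node bootstraps and joins
run nodetool status                   # 4. verify the new node shows up as UN
run nodetool cleanup                  # 5. on each OLD node, drop data now
                                      #    owned by the new node
```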
Decommissioning a node likewise takes just a couple of mouse clicks, without touching the CLI (command-line interface): in the “Scaling” section, decrease the value of “Current” by 1. After the deployment, one node has been safely removed from the cluster without service interruption.
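Under the hood, safe removal corresponds to running `nodetool decommission` on the leaving node, which streams its data to the remaining replicas before the node exits the ring. A print-only sketch:

```shell
# Print-only sketch of the manual decommission OneOps performs for you.
DRY_RUN=1
run() { if [ "$DRY_RUN" -eq 1 ]; then echo "would run: $*"; else "$@"; fi; }

run nodetool decommission   # on the leaving node: stream data out, leave ring
run nodetool status         # on any other node: confirm the node is gone
```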
Auto-scaling is also available in OneOps, as it is in other automation software and services such as AWS, Ansible, and SaltStack. What is really unique to OneOps is AUTO-REPLACE: when a VM and the service on it (together, an “instance”) are truly down, OneOps automatically replaces the bad instance with a new one and makes it join the existing service or cluster.
To take advantage of the auto-replace feature, go to the OneOps “Operate” page, click your_cassandra_platform_name, and in the “summary” tab find the “auto-replace” section:
Click “Enable autoreplace”, then “Change conditions” to set up the thresholds that trigger auto-replace:
Before auto-replace is triggered, OneOps will first try to repair the failed instance. As shown in the picture above:
- Replace unhealthy after repairs #: how many times OneOps will attempt to repair the instance before running auto-replace.
- Replace unhealthy after minutes: how long to wait after all repair attempts have been made before executing auto-replace.
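The interplay between the two thresholds can be sketched as follows (illustrative logic only, not OneOps source code; the threshold values are assumptions):

```shell
# Illustrative repair-then-replace decision loop (NOT OneOps internals).
max_repairs=3        # "Replace unhealthy after repairs #"
wait_minutes=10      # "Replace unhealthy after minutes"
healthy=0            # pretend the instance never recovers
attempts=0
while [ "$healthy" -eq 0 ] && [ "$attempts" -lt "$max_repairs" ]; do
  # one repair attempt; a real health check would update $healthy here
  attempts=$((attempts + 1))
done
if [ "$healthy" -eq 0 ]; then
  echo "auto-replace after ${wait_minutes} more minutes"
fi
# prints: auto-replace after 10 more minutes
```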
For experimental purposes, we could define the auto-replace feature like this:
Now let’s simulate an instance going down hard by blocking a wide range of Linux ports via iptables. As far as I know, this is by far the simplest and most effective way to simulate a VM going down in the cloud. (On a side note, just shutting down the applications, like the cassandra daemon, does not have the same effect as blocking Linux ports. Correctly “simulating a machine going down” is very important for debugging and testing a distributed system.)
```
# Turn iptables on at boot, then drop all inbound TCP on ports 22-65535
# (including SSH) so the VM looks dead from the outside; saving the rule
# makes it survive a reboot:
chkconfig --level 345 iptables on
iptables -A INPUT -p tcp --match multiport --dport 22:65535 -j DROP
service iptables save
```
Then, on a healthy VM, run “nodetool status” to see that one node is in the Down (“DN”) status (the last node in the following example output):
```
[nzhang2@cas-238213-1-23632914 ~]$ nodetool status
Datacenter: dal
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.65.227.135  197.28 KB  256     51.6%             4d99d6ad-889a-42d5-b288-b79c02dd1c9a  2
UN  10.65.227.121  211.52 KB  256     51.5%             1e3e77b4-c39f-40f8-85ed-830a22d61afc  2
UN  10.65.227.12   244.12 KB  256     48.3%             e4867580-d77c-47ce-9c8d-1b8ddc8f781d  2
DN  10.65.227.14   173.96 KB  256     48.7%             60812954-0529-45bd-8f39-12a15a02ac01  2
```
After a few minutes, we should receive alerting notifications (via email) about “Heartbeat is violated” and “repair starting”:
Since the port-blocking rule persists across reboots, OneOps will not be able to repair the VM, so after several minutes it triggers the auto-replace:
On the OneOps “Operate” page, you should see that a deployment triggered by the auto-replace is underway.
After the deployment is done, run “nodetool status” again: it shows that all nodes are up and working, with a brand-new node (10.65.227.90) in place of the dead one.
```
[nzhang2@cas-238213-1-23632914 ~]$ nodetool status
Datacenter: dal
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.65.227.135  157.05 KB  256     48.1%             4d99d6ad-889a-42d5-b288-b79c02dd1c9a  2
UN  10.65.227.121  189.84 KB  256     49.8%             1e3e77b4-c39f-40f8-85ed-830a22d61afc  2
UN  10.65.227.12   189.53 KB  256     51.7%             e4867580-d77c-47ce-9c8d-1b8ddc8f781d  2
UN  10.65.227.90   83.09 KB   256     50.3%             503e39c5-3b79-4a50-b74f-de17dc3e4f46  2
```
In general, replacing a node in a distributed system requires two steps: (1) rotate the bad node out of the cluster, and (2) add the replacement node into it. Both involve multiple manual steps, especially for Cassandra. When running on cloud-based infrastructure, there is a relatively high chance that a VM becomes “dead” or inaccessible for many reasons (e.g. hypervisor error, overloaded server), so replacing a bad node can become a “routine” job for the operations folks.
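For comparison, the usual manual Cassandra procedure for step (2) is to start the fresh node with the dead node’s address passed via the `cassandra.replace_address` JVM option, typically appended to cassandra-env.sh (the exact file path varies by install; the IP below is the dead node from this example):

```shell
# Build the JVM option that tells a fresh node to take over the dead
# node's token ranges; 10.65.227.14 is the dead node from the example.
dead_ip="10.65.227.14"
opt="-Dcassandra.replace_address=${dead_ip}"
# This line would be appended to cassandra-env.sh before the first start:
echo "JVM_OPTS=\"\$JVM_OPTS ${opt}\""
# prints: JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.65.227.14"
```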
As the above example shows, OneOps can auto-detect bad instances, try to repair them, and, after several attempts without luck, replace them and bootstrap the new instances to join the existing service. All of this happens automatically: OneOps “auto-pilots” the services and applications orchestrated on it!
Cassandra Performance Monitoring on OneOps
Some Cassandra performance monitors can also be visualized in the OneOps UI, for example read operations, write operations, and logs.
Go to OneOps “Operate” page, choose the Cassandra environment, click your_cassandra_platform_name, then click the “cassandra” component:
Select any one of Cassandra instances:
Then the details of the selected instance are shown. Click the “monitors” tab, then “Write Operations”. On the right side, the write operations for that Cassandra instance (e.g. cassandra-238213-1) are visualized over time.
Click “Read Operations” to visualize the monitoring graph about read:
The “Log” tab shows the number of different types of log lines (total, error, critical, …).
For production or other serious use cases, we can even set up alerts on the monitored statistics, so that when read/write operations drop below a threshold, or the number of error log lines exceeds a threshold, push notifications are sent to the subscribed email, mobile, or collaboration tools. In future blogs, I plan to introduce how to add customized monitoring and alerting for more production-driven cases.
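As a rough illustration of such a threshold check (the log lines and threshold are made up for the sketch; real alerting is configured in the OneOps UI, not scripted like this):

```shell
# Toy threshold check: alert when error-log lines exceed a limit.
threshold=2
logs='INFO  starting up
ERROR read timeout
ERROR write timeout
ERROR gossip failure'
errors=$(printf '%s\n' "$logs" | grep -c '^ERROR')
if [ "$errors" -gt "$threshold" ]; then
  echo "alert: ${errors} error lines (threshold ${threshold})"
else
  echo "ok"
fi
# prints: alert: 3 error lines (threshold 2)
```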
What is Next?
Cassandra admins will have a better experience and higher productivity if Cassandra can be managed and operated through a GUI such as OpsCenter. There is an effort underway to develop an OpsCenter pack for OneOps, and I plan to post a blog about it when it is ready.