Sunday, November 23, 2014

Build a Mesos cluster using vSphere Big Data Extensions 2.x


I've been helping customers virtualize Hadoop and scale-out applications for a while, starting with just the creation and tuning of VMs through the old C# client and scripts.  The open-source project Serengeti simplified virtual deployment and is an integral part of vSphere's Big Data Extensions (BDE).  Under the covers, BDE has an open-source version of Chef installed to hold the Big Data cluster blueprints and recipes.  Part of my work in VMware's Office of the CTO, in support of our virtualized HPC customers, is to take some of those lessons learned and apply them to automating the creation and management of HPC clusters.

In essence, most if not all scale-out applications follow a simple model of master(s) and workers.  BDE defines a cluster model, or blueprint, in a JSON file that is then used to clone virtual machines from a base template and assign each one a master, worker, client or other type of role. In BDE's case, these roles are declared in Chef, along with the corresponding application cookbooks and recipes. I won't be going through an in-depth tutorial of Chef in this post. If you're not familiar with it, some of the steps below won't make much sense, so start with the basic lessons here: http://learn.getchef.com/.  In this example, I'll be using Apache Mesos.

  • Download the BDE OVA, at least version 2.0.  It's not a top-level download but you can find it under the VMware vSphere product heading, enterprise and higher: https://my.vmware.com/web/vmware/details?downloadGroup=BDE_210_GA&productId=353&rPId=6997
  • Deploy the OVA using either the vCenter Web Client or the C# client. This creates a vApp containing both the "Management Server" and a "template node" CentOS VM, which is the template for all node deployments. If you prefer another distro such as RHEL or Ubuntu, now is the time to prep that image: http://pubs.vmware.com/bde-2/topic/com.vmware.bigdataextensions.admin.doc/GUID-CAD01F1F-F915-42C9-A1C1-A6093C2564D3.html
  • Since you'll be using Chef, you'll want to get some Mesos recipes from the Chef Supermarket (community.opscode.com) or from some other appropriate place on GitHub.
  • Log into the Management Server as root or as the serengeti user and upload your Mesos cookbooks. The default Chef directory is /opt/serengeti/chef.


  • Make a copy of the basic cluster JSON file, /opt/serengeti/samples/basic_cluster.json, and give it a name that describes your new blueprint, such as mesos_cluster.json.
  • From the basic example, you can see the classes of node groups you can deploy (master or worker) and the role that BDE will assign after cloning and customization of that node. Right now, each node will only receive the "basic" role.  Adjust each nodeGroup type to get an appropriate role like "mesos_master" for master nodes and "mesos_worker" for worker nodes.
  • You can write your node roles in the /opt/serengeti/chef/roles directory and then upload them as you would with any other Chef server.  For example: knife role from file /path/to/file/mesos_master.json
  • For a sample format, you can copy the basic.rb to a mesos_master.rb and mesos_worker.rb.  Make sure to adjust the name parameter on line 1 and the run_list to match your Mesos recipes.
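The blueprint and role edits above can also be scripted. Below is a minimal Python sketch of the idea, not BDE's own tooling. It assumes the cluster spec keeps its node groups in a "nodeGroups" list with "name" and "roles" fields, and that the groups are called "master" and "worker"; check your copy of basic_cluster.json before relying on that. The recipe names in the run_list (mesos::master) are hypothetical and depend on the cookbook you chose. The role dictionary follows Chef's standard JSON role layout, which is one of the formats knife can load from a file.

```python
import json

# Assumed mapping from node group name to Chef role. The group names
# "master" and "worker" are guesses at what basic_cluster.json contains.
GROUP_ROLES = {"master": "mesos_master", "worker": "mesos_worker"}


def retarget_spec(spec, group_roles):
    """Return a copy of a cluster spec with each known node group's roles replaced."""
    new_spec = json.loads(json.dumps(spec))  # simple deep copy of JSON data
    for group in new_spec.get("nodeGroups", []):
        role = group_roles.get(group.get("name"))
        if role is not None:
            group["roles"] = [role]
    return new_spec


def chef_role(name, run_list, description=""):
    """Build a Chef role as a dictionary in Chef's JSON role layout."""
    return {
        "name": name,
        "description": description,
        "json_class": "Chef::Role",
        "chef_type": "role",
        "default_attributes": {},
        "override_attributes": {},
        "run_list": list(run_list),
    }


# Stand-in for the contents of basic_cluster.json (field names assumed).
basic = {
    "nodeGroups": [
        {"name": "master", "roles": ["basic"]},
        {"name": "worker", "roles": ["basic"]},
    ]
}

# Print the retargeted spec and a sample role; redirect these to
# mesos_cluster.json and mesos_master.json if they look right.
print(json.dumps(retarget_spec(basic, GROUP_ROLES), indent=2))
print(json.dumps(chef_role("mesos_master", ["recipe[mesos::master]"]), indent=2))
```

Node groups that are not in the mapping keep whatever roles they already had, so a client group, for example, would stay on the "basic" role until you add it.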


  • Also make sure you've uploaded your cookbooks to the Chef server with knife cookbook upload -a or knife cookbook upload mycookbookname
  • The quickest way to test this is to run the serengeti CLI by entering the command serengeti.
  • Connect to the BDE server and log in by entering connect --host mybdehostname:8443.  This actually authenticates via SSO, so you can use administrator@vsphere.local or an equivalent user with administrator privileges on that vCenter instance.
  • To create a cluster, enter at a minimum: cluster create --name myclustername --specFile /path/to/my/mesos_cluster.json --networkName defaultNetwork
  • You can add the optional argument --password yes to the cluster create command to set the root password for each VM, which I typically recommend.
  • At this point, you can see the cluster creation occurring within the CLI window as well as under "All Tasks" in the vSphere client. The time that it takes mostly depends on the speed of cloning in your environment.
  • When the cluster creation completes, you will have your own Mesos cluster to log into and check out. Or you may have a broken cluster, caused by any of a variety of issues such as missing hostnames, missing IP addresses, recipes failing to complete successfully, and so on. For testing purposes, I would recommend making small clusters of a single master and a single worker to check that the Mesos recipes work. If (when?) you need to completely delete a cluster, the command for that is: cluster delete --name myclustername which will power off all the VMs and destroy their VM files.  Don't expect any data to remain from that cluster after this step.  At this point, you'll be in an experimentation mode where you can update recipes, upload to Chef, and recreate the cluster.
There's a lot more to learn if you plan on extending these recipes or writing your own, so I'm planning to post more work here. In particular, dependencies between services will cause issues, as will getting the setup of static IP/FQDN or DHCP and DNS wrong. Hope this was helpful to get started!
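Since you will probably retype the connect and create lines many times while experimenting, here is a small Python sketch that assembles them from a few parameters so you can paste them into the serengeti CLI. It only builds strings from the options shown in the steps above; it does not talk to BDE. The function name and its defaults are my own for illustration, and it assumes none of the values contain spaces.

```python
def create_commands(host, name, spec_file, network, set_password=True, port=8443):
    """Return the serengeti CLI lines to connect and create a cluster."""
    connect = f"connect --host {host}:{port}"
    create = (
        f"cluster create --name {name} "
        f"--specFile {spec_file} --networkName {network}"
    )
    if set_password:
        # Prompt for a root password on each VM, as recommended above.
        create += " --password yes"
    return [connect, create]


# Example values taken from the steps above.
for line in create_commands(
    "mybdehostname",
    "myclustername",
    "/path/to/my/mesos_cluster.json",
    "defaultNetwork",
):
    print(line)
```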

Additional links:
http://community.opscode.com
http://www.vmware.com/products/vsphere/features/big-data
https://mesosphere.com/