… well, I’m not sure what, exactly. But this is very early days in the progress of this little project and I’ve already learned a fair bit.
“What project,” you ask? Nice segue!
Why You So Crazy?
No job I have held since finishing graduate school has offered a lab environment in which curious geeks can get their hands dirty tinkering with and breaking things in order to educate themselves (one employer wanted to make it a project, but we didn’t end up having the resources or time, and eventually I got distracted). To some extent this is understandable: who wants to spend money on stuff that doesn’t contribute directly to the bottom line? But really, it’s kind of stupid. Professional development of employees is key to any business, especially for knowledge workers whose definitions of “cutting edge” change on the regular.
So, lacking something provided to me, and having a little extra cash (and a little extra hardware) on hand, I decided to build my own.
I’m trying to learn a number of things here. First, I want to get a better picture of how digital infrastructure with VMware ESXi is done. My experience with it in the past has been mostly as a “user,” i.e., not the guy setting things up but rather the guy spinning up VMs and attending to their care and feeding. I want to learn how, for example, to team two NICs together, make them appear to be a single interface, assign them as an iSCSI channel, and then use them as a 2Gbps link to an iSCSI datastore. I’d also like to get an idea of how ESXi’s HA provisions and manageability features (all the stuff that goes into vCenter, for example) work, and use the host to create an “enterprisey” Windows networking environment to train myself on Windows Server 2012.
This is not a new idea. Googling for “esxi whitebox” or “esxi build” or similar things will return lots of hits, some of which make more sense than others for a weirdo like me who wants to run an enterprise(ish)-class VM host in his living room. Noise can be a problem for these machines, as can power consumption, so a lot of thought goes into the tradeoffs you need to make in order to have a machine with good performance and not too many environmental issues (as in, the environment of your living space).
There were a couple other guides I consulted to get this project off the ground — I have David Seidl of Notre Dame’s InfoSec department, as well as David Sloane, formerly of Obama for America’s engineering team, to thank for their suggestions. They pointed me to a number of good resources, including two blogs that were also dedicated to this sort of thing. So, I did a little research, and came to a number of conclusions.
- Server-class hardware, mostly because it depends on expensive server-class memory, was not going to be an option. That eliminated the Xeon and the motherboard in the second link from contention pretty quickly.
- Maxing the RAM on the motherboard I ended up choosing (the one in the Shuttle XPC case) gave me 32GB of RAM, and that was going to have to be enough.
- Hey, hard drives are still cheap!
- Good tip from David Seidl: DO NOT, under any circumstances, buy an Intel CPU for this project that is a -K series processor (this one, for example: http://www.amazon.com/Intel-i7-3770K-Quad-Core-Processor-Cache/dp/B007SZ0EOW/ref=sr_1_1?ie=UTF8&qid=1367531007&sr=8-1&keywords=intel+cpu+i7+3770k). The K series chips of this generation do support VT-x, but they leave out some of Intel’s other virtualization features, notably VT-d (directed I/O). That means some things, like passing a PCI device straight through to a VM, may not work at all. Caveat emptor.
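If you want to sanity-check a CPU before committing to it, here is a minimal Python sketch. It assumes you can boot a Linux live image on the machine (it will not run on ESXi itself) and reads the feature flags from /proc/cpuinfo, where `vmx` indicates Intel VT-x and `svm` the AMD equivalent. The function name is my own invention. As far as I know, VT-d does not show up in these flags at all, so for that you still need to check Intel’s spec sheet for the exact model.

```python
def virtualization_flags(cpuinfo_text):
    """Return the hardware-virtualization flags found in /proc/cpuinfo text."""
    wanted = {"vmx", "svm"}  # vmx = Intel VT-x, svm = AMD-V
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Each "flags" line is a space-separated list of CPU features.
            found |= wanted & set(value.split())
    return found


if __name__ == "__main__":
    # Only meaningful on Linux, where /proc/cpuinfo exists.
    try:
        with open("/proc/cpuinfo") as handle:
            flags = virtualization_flags(handle.read())
    except OSError:
        print("No readable /proc/cpuinfo here; try a Linux live image.")
    else:
        print("Virtualization flags:", sorted(flags) or "none found")
```

An empty result on a chip that should support VT-x usually just means the feature is switched off in the BIOS, so check there before blaming the CPU.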
What I wanted out of the machine was a largely self-contained (for now) lab environment that I could use right from the start to stand up an entire “digital infrastructure” environment. I mostly got that, with one notable exception I’ll discuss below.
I ended up cleaving to Robert Novak’s example pretty closely, with the exception that at the time introducing a solid state disk to this build didn’t make a whole lot of sense to me. So I ended up with these parts:
The case is rather handsome when you pull it out of the box.
Front-side USB3! Who knows, maybe one day I’ll octopus a bunch of USB3 drives off these as additional datastores.
Here’s what the guts looked like after adding everything.
Note the Ethernet card which does not, apparently, have enough lanes to function at max speed. Also the RAID card behind it which was not part of the initial build.
Honestly, this is the easy bit. Physically assembling a computer is not tough (though the HSF on this particular machine is a little tricky). As for software, if you are dedicating a machine to ESXi, you should install the OS on an SD card or USB drive and let your server boot only from that, then devote any other bigger/faster/meaner storage to the exclusive use of ESXi as a datastore for virtual machines.
Operation and Testing
Ah, the trial-and-error bit! So, some things I learned:
You Need Better I/O
No matter how important you THINK disk I/O is for a project like this where you’re virtualizing several machines at once, it is more important. My first thought was “Hey, EVERY SINGLE piece of hardware on this Shuttle board is being detected and working perfectly with ESXi 5.1 with no modification whatsoever! Awesome! So, I’ll just use the on-board SATA controllers to drive my disks, and each one of those will be a datastore. Perfect. I can spread the VMs around the disks, and that should keep it from bottlenecking on disk IO.”
It turns out that you can sort of get by this way, but the results will drive you nuts. SATA (even 6Gb/s SATA3), at least as implemented on the built-in motherboard controllers on this machine, is not going to cut it. Creating a VM (i.e. formatting a VMDK, the longest phase of setting one up) takes forever. Anything involving a prolonged write to disk takes forever. I feel like this is a recipe for disaster long-term, so instead, I did this:
Buy a Cheap RAID Controller and Put Your Drives in RAID0
Enter the IBM ServeRAID M1015 controller, which is actually a rebadged LSI 9220-8i card. This is a very popular, simple hardware RAID card with, sadly, no write buffer. Even without one, its write speeds are MUCH faster than the controller on the Shuttle motherboard, at least according to my anecdata. More usefully, someone has made a bit of a study of the performance of this card. Spoiler alert: It’s pretty great, especially for the price. My two drives are now in RAID0 and performance is quite manageable.
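If you would rather have a number than anecdata, a rough before-and-after comparison can be sketched in Python. This is not what I ran, just a minimal example: it assumes you execute it inside a guest VM whose virtual disk lives on the datastore you care about, and the function name and default sizes are arbitrary choices of mine. It times one sequential write and forces the data to disk with `fsync`, which is crude but good enough to tell “painful” from “fine.”

```python
import os
import tempfile
import time


def sequential_write_mb_per_s(directory, total_mb=64, chunk_mb=4):
    """Write roughly total_mb of data to a temp file in directory; return MB/s."""
    # Random bytes, generated before timing starts, so compression or
    # zero-detection on the storage side does not flatter the result.
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = max(1, total_mb // chunk_mb)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(chunks):
                handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # wait until the data has reached the disk
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (chunks * chunk_mb) / elapsed


if __name__ == "__main__":
    rate = sequential_write_mb_per_s(tempfile.gettempdir())
    print(f"Sequential write: {rate:.1f} MB/s")
```

Run it a few times and with a larger `total_mb` before trusting it, since caches in the guest and on the host can make a single short run look better than reality.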
VMware’s Free ESXi License Is Pretty Crappy
Having worked with VMware ESXi pretty extensively at previous jobs, I was rather spoiled by the feature set that you get with a “full” deployment of ESXi. Cloning, templates, vCenter management: all of these are missing in the “free” license you get from VMware. You can sort of clone, but it involves shutting down the VM you want to clone, copying the VMDK for its hard drive, and then copying it again and attaching it to new VMs you want to create. Which is less “cloning” and more “annoying copying process,” but you get what you pay for, I guess? (Honestly, I think not getting support for the system should be enough of an incentive for most people who are going to use this software in a production environment to actually go out and pay for it, and crippleware is really not cool, but whatever. Maybe eventually VMware will resurrect their developer/training program.)
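For what it’s worth, the copying step can at least be scripted. The sketch below is in Python and only builds the command line; it does not run anything. It assumes you do the copy with `vmkfstools -i` from the ESXi shell, with the source VM powered off and the target folder already created. The datastore and VM names are hypothetical, and the helper function is mine, not part of any VMware tooling.

```python
import shlex


def build_clone_command(source_vmdk, target_vmdk, disk_format="thin"):
    """Build the vmkfstools argument list that copies one virtual disk."""
    allowed = {"thin", "zeroedthick", "eagerzeroedthick"}
    if disk_format not in allowed:
        raise ValueError(f"unsupported disk format: {disk_format}")
    # -i clones the source disk to the target; -d picks the target's format.
    return ["vmkfstools", "-i", source_vmdk, target_vmdk, "-d", disk_format]


if __name__ == "__main__":
    # Hypothetical datastore and VM names, purely for illustration.
    command = build_clone_command(
        "/vmfs/volumes/datastore1/template/template.vmdk",
        "/vmfs/volumes/datastore1/newvm/newvm.vmdk",
    )
    print(shlex.join(command))
```

You still have to create the new VM and attach the copied disk to it by hand, so this trims the annoyance rather than removing it.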
Windows Server 2012 Really Is Better Than Its Predecessors
As I mentioned, part of the point of this was to teach myself not just about enterprise-scale virtualization, but about Windows and Linux tech as well. What I’ve done so far with Windows Server 2012:
These were all comparatively easy. Managing Windows 2012 is a lot easier than 2008 R2 as well: the Server Manager application is now designed to make managing multiple machines much easier. You can assign groups based on roles (or anything else that strikes your fancy) and perform tasks on multiple machines at once. It’s pretty sweet.
I would really like to figure out how to do RADIUS wireless authentication and VPN just using Windows. I’ve tried following a number of guides for this, but haven’t found one that actually does the thing yet. I may have to resort to actually buying a book. (!!!) After the networking stuff is all set up, I am going to take a crack at installing and running Exchange, even though I already have Google Apps for Domains here at Reticulum. If anyone has some advice as to where to find a good guide to running RADIUS and VPN with Windows Server 2012, I am ALL EARS, by the way.
On the virtualization side, someone has gifted me an HP DL380 G5 server, which now has 24 GB RAM (another gift) and a number of NICs in it. I think I am going to try and work out how to make that an iSCSI datastore, and then connect it via a dedicated LAN (probably just with crossover cables, if the NICs even need that) to the virtual host. I am working with FreeNAS, but it’s been slow going thus far.
After all this is set up, I am going to go even more insane (it’s nice when you can plan out the slow degradation of your own mental faculties) and attempt to integrate Amazon EC2 instances into this setup via a point-to-point VPN. I have precisely zero idea how to do this right now, but I know it can be done and it’s valuable tech to know. So we shall see! This should be fun.
If anyone has suggestions as to what to do next, etc., I’m all ears. Comment here or on Twitter/FB.