Private Cloud Project – Ch02 – The Design Concept

This is a follow-up to my first post in this series, Private Cloud Project – Ch01 – The Mission.

With the mission set, I had to find resources on how to move forward with the components that would make up our private cloud:

  • Hyper-V cluster
  • File Server cluster
  • Scale Out File Server
  • Storage Spaces

When I started gathering information about these components, especially those related to storage, I first had to learn the nomenclature used in this area. “Storage Spaces”, for example, which was introduced in Windows Server 2012, is fairly new to the tech world, and usable resources beyond the “marketing stuff” are hard to find. To this date (September 2014) only a few people seem to be using Storage Spaces, or actively writing about it, although it’s a great technology from Microsoft. A few management gadgets that we all know from “standard” RAID controllers like the HP Smart Array and Dell PERC are still missing. For example, you do not see the rebuild status for a drive after it has faulted and been replaced.

Also pretty hard to find was a hardware vendor whose whole stack of server -> controller -> shelf -> drives would be suitable for Storage Spaces and also supported by Microsoft. After fiddling with Intel shelves connected to some DELL servers, talking to DELL representatives and a well-known local MVP, we settled on a design consisting purely of DELL equipment. A few of the certifications were still to be completed by DELL, but as that process was obviously already under way, we decided to go that way.

The result should look like this:
[Image: Microsoft private cloud stack]

The components of the environment are:

  1. The Hypervisor machines, based on Microsoft Hyper-V
  2. A central storage, based on Microsoft Windows Server 2012 R2 “Scale Out File Server” (SOFS)
     • SOFS provides storage
  3. Microsoft System Center Virtual Machine Manager (VMM)
     • VM creation and management, workload deployment and distribution
  4. Microsoft System Center Data Protection Manager (DPM)
     • Backup
  5. Microsoft System Center Orchestrator (ORC)
     • Process automation of creation, deployment and monitoring
  6. Microsoft System Center App Controller (APC)
     • Self-service platform for internal customers

    As a more modern and more flexible alternative to ORC+APC, Microsoft has released the free “Azure Pack”, a port of their commercial Azure platform software adapted for use with on-premises private cloud environments.

  7. Azure Pack (AzP)
     • Self-service platform for admins and internal customers

Hypervisor Nodes

Hypervisors are the workhorses of the private cloud environment. They host the virtual machines by presenting virtualized slices of their physical hardware resources to the guest operating systems. To be able to host a significant number of virtual machines, the Hypervisors need powerful processors, a large amount of memory and very fast network interfaces. The Hypervisor hardware setup consists of:

  • DELL PowerEdge R620 (1U)
  • 2x E5-2650 v2
  • 16x 16GB DIMM (256GB)
  • NDC 4x I350 1GBit Ethernet
  • 2x X520 SFP LP PCIe Dual 10Gbit Network Card
  • 2x 300GB 10k SAS for System RAID1

The Hypervisor nodes will have

  • Microsoft Windows Server 2012 R2 Datacenter installed
  • The Hyper-V role activated (a small setup sketch follows below)
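
As a minimal sketch of how such a node could be prepared (the virtual switch and teamed adapter names are placeholders, not our actual configuration), the role can be activated from PowerShell:

    # Install the Hyper-V role plus management tools, then reboot
    Install-WindowsFeature -Name Hyper-V -IncludeManagementTools -Restart

    # After the reboot: attach an external virtual switch to the (hypothetical)
    # teamed adapter "VM-Team", keeping the management OS off that switch
    New-VMSwitch -Name "VM-Switch" -NetAdapterName "VM-Team" -AllowManagementOS $false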

Scale Out File Server (SOFS)

The SOFS replaces traditional enterprise storage systems. It will contain the virtual disk container files (VHDs) in which the VMs hosted on the Hypervisor nodes store their data. The storage is made available to the Hypervisors through SMB 3.0, the well-known protocol for accessing file shares in the Windows world, which has been optimised for serving applications like MSSQL or – as in our case – VMs.
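
To illustrate what that means in practice, here is a small sketch of creating a VM whose configuration files and VHDX live directly on an SMB 3.0 share (the VM name, sizes and switch name are made up; the share path matches the example used further below):

    # Create a VM that keeps its configuration and a new 60 GB VHDX on the SOFS share
    New-VM -Name "VM01" `
           -Generation 2 `
           -MemoryStartupBytes 4GB `
           -Path "\\SOFS\vmshare\VM01" `
           -NewVHDPath "\\SOFS\vmshare\VM01\VM01.vhdx" `
           -NewVHDSizeBytes 60GB `
           -SwitchName "VM-Switch"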

The SOFS is basically built on the DELL PowerEdge R720xd, but contains some extras (see the quick inventory sketch after the list). These are:

  • Three DELL SAS Controllers (without RAID functionality)
  • An Intel X520 10Gbit network card
  • JBOD Hard Disk Shelves
  • SAS SSD Drives for storing “hot data”
  • SAS Hard Disks
  • 64GB of RAM
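
A quick way to check whether Windows actually sees the JBOD enclosures and both media types could look like this (just a sketch; the output obviously depends on the hardware):

    # List the SAS enclosures (the JBOD shelves) as seen by the Storage subsystem
    Get-StorageEnclosure | Format-Table FriendlyName, NumberOfSlots, HealthStatus

    # Count the poolable disks per media type (SSD vs. HDD)
    Get-PhysicalDisk -CanPool $true | Group-Object MediaType |
        Format-Table Name, Count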

The SOFS will be built as a cluster, with building blocks of initially two physical servers. Both servers will be attached to three JBOD shelves, each of which exports all of its hard drives to both servers. The SAS devices are managed by Windows Storage Spaces and are exported to the Hypervisors through the SOFS role as special “file shares”, e.g. “\\SOFS\vmshare”.
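
What that looks like on the Storage Spaces side could be sketched roughly as follows (pool, tier and disk names as well as the sizes are invented for the example; this is not our exact layout):

    # Grab the Storage Spaces subsystem (on a SOFS cluster this is the clustered
    # subsystem; its exact friendly name varies, hence the wildcard)
    $ss = Get-StorageSubSystem -FriendlyName "*Storage Spaces*"

    # Pool every JBOD disk that is eligible for pooling
    New-StoragePool -FriendlyName "Pool01" `
                    -StorageSubSystemUniqueId $ss.UniqueId `
                    -PhysicalDisks (Get-PhysicalDisk -CanPool $true)

    # Define an SSD tier for the "hot data" and an HDD tier for capacity
    $ssdTier = New-StorageTier -StoragePoolFriendlyName "Pool01" -FriendlyName "SSDTier" -MediaType SSD
    $hddTier = New-StorageTier -StoragePoolFriendlyName "Pool01" -FriendlyName "HDDTier" -MediaType HDD

    # Create a mirrored, tiered virtual disk with a small write-back cache;
    # it would then be formatted and added as a Cluster Shared Volume
    New-VirtualDisk -StoragePoolFriendlyName "Pool01" `
                    -FriendlyName "vDisk01" `
                    -ResiliencySettingName Mirror `
                    -StorageTiers $ssdTier, $hddTier `
                    -StorageTierSizes 200GB, 2TB `
                    -WriteCacheSize 1GB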

The SOFS nodes will have

  • Microsoft Windows Server 2012 R2 Standard installed
  • The Failover Clustering feature installed
  • The SOFS role activated (a small setup sketch follows below)
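
A minimal sketch of bringing such a two-node SOFS building block to life (node names, cluster name, IP address, account names and the share path are placeholders):

    # Install the file server and clustering bits on both nodes
    Invoke-Command -ComputerName "SOFS-N1", "SOFS-N2" -ScriptBlock {
        Install-WindowsFeature -Name FS-FileServer, Failover-Clustering -IncludeManagementTools
    }

    # Form the failover cluster and put the Scale-Out File Server role on top of it
    New-Cluster -Name "SOFS-CLU" -Node "SOFS-N1", "SOFS-N2" -StaticAddress 192.168.1.50
    Add-ClusterScaleOutFileServerRole -Name "SOFS"

    # Publish a continuously available share on a Cluster Shared Volume and grant the
    # Hyper-V hosts' computer accounts full access
    New-SmbShare -Name "vmshare" `
                 -Path "C:\ClusterStorage\Volume1\Shares\vmshare" `
                 -FullAccess 'DOMAIN\HV01$', 'DOMAIN\HV02$' `
                 -ContinuouslyAvailable $true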

Switch Fabric

Our network department advised us to use some really fancy but also pretty expensive network hardware for setting up our environment. The 10 gigabit connections for the building block we were building would be provided by two Cisco N5K-C5596UP-FA switches. The 1 gigabit connections would be provided by two Cisco N2K-C2248TP-1GE units, so-called “fabric extenders”, connected to the 10Gb switches. These are switches with only minimal logic of their own; the “real work” is done by the 10Gb switch.

Here’s how the hypervisors and file servers are connected:

[Image: hy-net]

We decided against converging the network connections on the hypervisors, as we would lose other functionality like RSS. Instead we would use the first two onboard NICs (1Gb) to build the “Management Team”, then have a team of two of the 10Gb ports – each one from a different 10Gb NIC – for VM traffic, and the remaining two 10Gb ports for storage traffic to the file servers via SMB3.

[Image: sofs-net]

We used the same scheme for the file servers, just no VM team here.

All connections are distributed among the two switches for redundancy. The teams are all LACP teams, so the switches need to run in a virtual chassis mode that allows an LACP team to be spread across the two physical units. Don’t ask for details, though, I do not know them 🙂
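
As a rough sketch of what that looks like on a single node (the adapter names are placeholders; the real names depend on how Windows enumerates the I350 and X520 ports):

    # Management team from the first two onboard 1Gb ports
    New-NetLbfoTeam -Name "Management-Team" `
                    -TeamMembers "Onboard-1", "Onboard-2" `
                    -TeamingMode Lacp `
                    -LoadBalancingAlgorithm Dynamic

    # VM traffic team built from one port of each dual-port 10Gb card,
    # so the loss of a whole card is survivable
    New-NetLbfoTeam -Name "VM-Team" `
                    -TeamMembers "X520-A-Port1", "X520-B-Port1" `
                    -TeamingMode Lacp `
                    -LoadBalancingAlgorithm Dynamic

    # The remaining two 10Gb ports (one per card) are left outside any team
    # and carry the SMB3 storage traffic to the file servers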

The rack space we intended to use would be set up like this:
[Image: dell-racks]

So much for part 2 of the series.
In the next part: The management cluster (for the System Center machines).