Installing OpenStack for Federation with Aristotle

Software Recommendations:

New adopters are encouraged to use the most current supported version of OpenStack (release details). Several federation members run Red Hat's OpenStack, but this is not a requirement; any supported OpenStack cloud can join the federation.

Hardware Recommendations:

OpenStack Management and Compute

Controller Cluster

For sites desiring high availability, a 3-node controller cluster is recommended. All OpenStack components except Nova compute run on all 3 nodes of the controller cluster. Cinder (volume) runs in active/passive (A/P) mode, whereas all other components run in active/active (A/A) mode.

Each controller node needs two Ethernet interfaces: one for the internal/management network and one for the public/provider network. It also needs enough disk space for Horizon to stage uploaded images temporarily before they are transferred to Glance.

Compute Elements

Compute nodes can have a broad range of specifications, but each node must have sufficient hardware to support the instance flavors the site will offer, at the CPU/RAM oversubscription rates the site chooses.
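
Oversubscription is set per compute node in nova.conf via the cpu_allocation_ratio and ram_allocation_ratio options. As a sizing aid, the sketch below defines an example flavor with the openstacksdk Python library; the cloud name "federation" and the flavor name/sizes are placeholders for illustration, not prescribed values.

    # Minimal sketch: define an instance flavor with openstacksdk.
    # The cloud name "federation" (from clouds.yaml) and the flavor
    # sizes are placeholders; adjust them to the flavors your site
    # plans to offer.
    import openstack

    conn = openstack.connect(cloud="federation")

    flavor = conn.compute.create_flavor(
        name="m1.medium",   # example flavor name
        vcpus=2,
        ram=4096,           # MB
        disk=40,            # GB root disk
    )
    print(flavor.id, flavor.name)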

Each compute node requires two Ethernet interfaces: one for the internal/management network and one for the public/provider network.

Storage: Ceph

A Ceph cluster provides storage for the following (a brief verification sketch follows this list):

  • Volumes
  • Images
  • Boot Volumes for VMs
  • Object Storage
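
Volumes, images, and boot volumes are stored as RBD images in Ceph pools backing Cinder, Glance, and Nova. The hedged sketch below uses the python-rados and python-rbd bindings to create and remove a small test image; the pool name "volumes", the image name, and the paths are assumptions for illustration, and a readable /etc/ceph/ceph.conf plus client keyring are required.

    # Minimal sketch (assumes python-rados/python-rbd and a readable
    # /etc/ceph/ceph.conf with a client keyring): create a small test
    # image in a "volumes" pool, much as Cinder would for a block
    # volume, then clean it up. Pool and image names are placeholders.
    import rados
    import rbd

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx("volumes")           # pool backing Cinder (assumed name)
        rbd.RBD().create(ioctx, "smoke-test", 1 << 30)  # 1 GiB test image
        print(rbd.RBD().list(ioctx))                    # list images in the pool
        rbd.RBD().remove(ioctx, "smoke-test")           # clean up
        ioctx.close()
    finally:
        cluster.shutdown()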

Monitor Servers

A minimum of 3 monitors per cluster, each on its own physical server, is needed. If the cluster has more than 400 OSDs, use 5 monitors. The monitors must establish a quorum to update cluster maps. Each monitor node should have:

  • 2 GB of RAM
  • SSD or fast SAS drives in RAID5

OSD Nodes

A minimum of 3 OSD nodes is required for high availability and fault tolerance. Objects are distributed across OSDs according to CRUSH rules. Adding OSD nodes increases both aggregate bandwidth to the data and fault tolerance.
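
To see CRUSH placement in action, the hedged sketch below asks the monitors which OSDs an object name maps to in a given pool, the same information the "ceph osd map <pool> <object>" command prints. The pool and object names are placeholders, and python-rados with a readable /etc/ceph/ceph.conf is assumed.

    # Minimal sketch: query the monitors for the CRUSH mapping of an
    # object name. Pool/object names are placeholders.
    import json
    import rados

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    cmd = json.dumps({"prefix": "osd map", "pool": "volumes",
                      "object": "smoke-test", "format": "json"})
    ret, out, errs = cluster.mon_command(cmd, b"")
    print(json.loads(out))   # includes the PG and the acting set of OSDs
    cluster.shutdown()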

For good block storage (volumes, images, VM boot volumes) performance, each OSD node should have at least:

  • 1 GHz of hyperthreaded CPU core per OSD
  • 1 GB of RAM per raw TB hosted
  • 2 × 10 Gb/s NICs on separate PCI cards: 1 NIC on the public network for client-facing traffic; 1 NIC on the cluster network for replication and recovery traffic
  • (optional) SSD for hosting OSD write journals:
    • SATA SSD: 6 OSD write journals/SSD
    • NVMe SSD: 12 OSD write journals/SSD

Sample OSD Node Configuration

The Ceph cluster at the Cornell University Center for Advanced Computing has 12 OSD nodes, each configured as follows:

  • Dell PowerEdge R730 Server
    • Dual Intel Xeon E5-2630 v3 CPUs (16 cores/32 threads total)
    • 128 GB RAM
    • 2 mirrored SAS drives for server OS
    • 12 OSDs: each OSD is hosted on an 8 TB 7200 RPM SAS drive
    • 2 × 200 GB SATA SSDs; each SSD hosts 6 OSD write journals
    • 2 × 10 Gb/s NICs: 1 on the motherboard; 1 on a PCI card

RADOS Gateway

For maximum bandwidth, the server should have as many CPU cores/threads as possible. For high availability, install multiple RADOS Gateway servers behind a load balancer. The RADOS Gateway exposes RESTful Swift and S3 APIs for object storage.
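
Because the Swift and S3 APIs are plain HTTP, clients need nothing Ceph-specific. The hedged sketch below exercises the S3 API of an assumed RADOS Gateway (or load balancer) endpoint with boto3; the endpoint URL, access/secret keys, and bucket name are placeholders, with RGW credentials normally created via "radosgw-admin user create".

    # Minimal sketch: talk to the RADOS Gateway S3 API with boto3.
    # Endpoint URL, credentials, and bucket name are placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="http://rgw.example.org:7480",  # RGW or load balancer VIP
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    s3.create_bucket(Bucket="federation-test")
    s3.put_object(Bucket="federation-test", Key="hello.txt", Body=b"hello from rgw")
    listing = s3.list_objects_v2(Bucket="federation-test")
    print([obj["Key"] for obj in listing.get("Contents", [])])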