Overview

RackHD serves as an abstraction layer between other management and orchestration (M&O) layers and the underlying physical hardware. Developers can use the RackHD API to create a user interface that serves as a single point of access for managing hardware services, regardless of the specific hardware in place.

RackHD can discover the existing hardware resources, catalog each component, and retrieve detailed telemetry information from each resource. The retrieved information can then be used to perform low-level hardware management tasks, such as BIOS configuration, OS installation, and firmware management.

RackHD sits between the other M&O layers and the underlying physical hardware devices. User interfaces at the higher M&O layers can request hardware services from RackHD. RackHD handles the details of connecting to and managing the hardware devices.
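
As a rough illustration of how a higher-layer tool might request hardware information from RackHD over HTTP, here is a minimal Python sketch. The base address, the `/api/2.0/` prefix, the `nodes` resource name, and the `id` and `type` fields are assumptions made for the example and should be checked against the API doc for your deployment. The sample payload is invented so that the sketch runs without a server.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: a RackHD instance is reachable here; adjust host and port.
BASE_URL = "http://localhost:8080"


def api_url(base_url, resource):
    """Build a URL for a resource under the assumed /api/2.0/ prefix."""
    return urljoin(base_url.rstrip("/") + "/", "api/2.0/" + resource.lstrip("/"))


def summarize_nodes(payload):
    """Group node ids by their 'type' field, given a JSON array of node objects."""
    summary = {}
    for node in json.loads(payload):
        summary.setdefault(node.get("type", "unknown"), []).append(node.get("id"))
    return summary


def fetch_nodes(base_url=BASE_URL, timeout=10):
    """Request the node list from a live server (needs network access)."""
    with urlopen(api_url(base_url, "nodes"), timeout=timeout) as response:
        return summarize_nodes(response.read().decode("utf-8"))


# Hypothetical sample response, so the sketch runs without a RackHD server.
SAMPLE = (
    '[{"id": "n1", "type": "compute"},'
    ' {"id": "n2", "type": "switch"},'
    ' {"id": "n3", "type": "compute"}]'
)

print(api_url(BASE_URL, "nodes"))
print(summarize_nodes(SAMPLE))
```

Against a running instance, `fetch_nodes()` would replace the sample payload. Keeping URL construction and response parsing in separate functions lets the parsing logic be exercised without network access.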

Feature List

  • Discovery and Cataloging: Discovers the compute, network, and storage resources and catalogs their attributes and capabilities.
  • Telemetry and Genealogy: Telemetry data includes genealogical details, such as hardware, revisions, serial numbers, and date of manufacture.
  • Device Management: Powers devices on and off. Manages the firmware, power, OS installation, and base configuration of the resources.
  • Configuration: Configures the hardware per application requirements. This can range from the BIOS configuration on compute devices to the port configurations in a network switch.
  • Provisioning: Provisions a node to support the intended application workflow, for example lays down ESXi from an image repository. Reprovisions a node to support a different workload, for example changes the ESXi platform to Bare Metal CentOS.
  • Firmware Management: Manages all infrastructure firmware versioning.
  • Logging: Log information can be retrieved for particular elements or collated into a single timeline for multiple elements within the management neighborhood.
  • Environmental Monitoring: Aggregates environmental data from hardware resources. The data to monitor is configurable and can include power information, component status, fan performance, and other information provided by the resource.
  • Fault Detection: Monitors compute and storage devices for both hard and soft faults. Performs suitable responses based on pre-defined policies.
  • Analytics Data: Data generated by environmental and fault monitoring can be provided to analytic tools for analysis, particularly around predictive failure.

Major Service Components

  • ISC DHCP

This DHCP server provides IP addresses dynamically using the DHCP protocol. It is a critical component of a standard Preboot Execution Environment (PXE) process.

  • MongoDB

MongoDB provides the database for RackHD. RackHD plans to support other kinds of databases.

  • RabbitMQ

RabbitMQ provides the pub/sub interface for communication between the different RackHD services. It also provides a means for users to subscribe to RackHD events.

  • on-dhcp-proxy

The DHCP protocol supports getting additional data specifically for the PXE process from a secondary service that also responds on the same network as the DHCP server. The DHCP proxy service provides that information, generated dynamically from the workflow engine.

  • on-tftp

TFTP is the common protocol used to initiate a PXE process. on-tftp is tied into the workflow engine to be able to dynamically provide responses based on the state of the workflow engine and to provide events to the workflow engine when servers request files via TFTP.

  • on-http

on-http provides both the REST interface to the workflow engine and data model APIs as well as a communication channel and potential proxy for hosting and serving files to support dynamic PXE responses. RackHD commonly uses iPXE as its initial bootloader, loading remaining files for PXE booting via HTTP and using that communications path as a mechanism to control what a remote server will do when rebooting.

on-http also serves as the communication channel for the microkernel to support deep hardware interrogation, firmware updates, and other actions that can only be invoked directly on the hardware (not through an out of band management channel).

  • on-syslog

on-syslog is a syslog receiver endpoint that provides annotated and structured logging from the hosts under management. It channels all syslog data sent to the host into the workflow engine.

  • on-taskgraph

on-taskgraph is the workflow engine, driving actions on remote systems and processing workflows for machines being managed. Additionally, the workflow engine provides the engine for polling and monitoring.
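
The pub/sub role described for RabbitMQ above can be pictured with a small in-memory sketch. This is not a RabbitMQ client and not RackHD code. It is a stand-in that applies AMQP topic-exchange wildcard rules (`*` matches exactly one dot-separated word, `#` matches zero or more words) to decide which subscribers receive an event. Whether RackHD events are routed this way is an assumption here, and the routing keys and message fields below are invented for illustration.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style topic pattern matches a routing key.

    Words are separated by dots; '*' matches exactly one word and
    '#' matches zero or more words.
    """
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may absorb zero words (drop it) or one more word (keep it).
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if k and (p[0] == "*" or p[0] == k[0]):
            return match(p[1:], k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


class EventBus:
    """In-memory stand-in for a broker: delivers events to matching subscribers."""

    def __init__(self):
        self._subscriptions = []  # list of (pattern, callback) pairs

    def subscribe(self, pattern, callback):
        self._subscriptions.append((pattern, callback))

    def publish(self, routing_key, message):
        """Call every matching subscriber; return how many were notified."""
        delivered = 0
        for pattern, callback in self._subscriptions:
            if topic_matches(pattern, routing_key):
                callback(routing_key, message)
                delivered += 1
        return delivered


bus = EventBus()
received = []
bus.subscribe("node.discovered.#", lambda key, msg: received.append((key, msg)))

# Hypothetical routing keys and payloads, for illustration only.
bus.publish("node.discovered.information.abc123", {"nodeId": "abc123"})
bus.publish("node.removed.information.abc123", {"nodeId": "abc123"})

print(received)
```

Only the first published event matches the subscription pattern, so the subscriber's list ends up with a single entry. With a real broker, the same pattern would be expressed as a queue binding rather than evaluated in application code.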

RackHD Homepage

The RackHD homepage is hosted on github.io at https://rackhd.github.io/. Nearly all RackHD news and public resources are available from it.

Public Documentation

Several kinds of public documentation can help you understand RackHD.

  • Read the Docs

Read the Docs (http://rackhd.readthedocs.io/en/latest/index.html) provides extensive technical detail and the user manual. It is actively maintained by the developers and nearly always reflects the latest RackHD design. This is the most recommended documentation for understanding RackHD.

The source for this documentation is at https://github.com/RackHD/docs. If you find a bug, you can submit a pull request to fix it.

  • Confluence

Confluence (https://rackhd.atlassian.net/wiki/display/RAC1/RackHD) hosts the documents that do not need to be kept in sync with the latest RackHD source code changes, including release notes, meeting minutes, and others.

  • API Doc

The API doc (https://bintray.com/rackhd/docs/apidoc#files) is the reference manual for both RackHD APIs. It is updated automatically whenever the APIs change.

  • Task Doc

This is automatically generated documentation for all RackHD built-in tasks. It helps you understand and use the RackHD tasks without touching the code.

You need to generate the task doc yourself by following this guide: http://rackhd.readthedocs.io/en/latest/rackhd/tasks.html#task-annotation

RackHD JIRA

RackHD development follows the Scrum model, so RackHD uses JIRA (https://rackhd.atlassian.net/secure/Dashboard.jspa) to track the roadmap, issues, and the status of every team and even every developer. The following boards are important:

  • RackHD All Issues (RAC): Records all open RackHD issues. If you want to submit an issue, you can create a story on this board.
  • RackHD Initiatives (RI): Records RackHD roadmap features.
  • RackHD Team Boards: Multiple teams across the world work together on the RackHD project. Each team has its own board, which records the team's working status. Usually each team picks up stories from RAC or RI and adds them to the team's backlog.

Community

  • Slack

RackHD has set up several channels in Slack for instant communication between RackHD users and developers, and questions usually get a quick response. The most widely used channel is #rackhd. If you prefer to talk in Chinese, use the #rackhd-chinese channel.

To join the Slack channels, request an invite at http://community.codedellemc.com

  • Google Group

Compared with Slack, the Google Group is better suited to discussion of a specific topic. If you have a new feature design and want to collect review comments from other developers, you can create a topic in the Google Group.

The group page is at https://groups.google.com/d/forum/rackhd

RackHD Meetings

Various meetings are held to discuss different areas of RackHD. If you are interested in any of them, you can read their meeting minutes on Confluence.

RackHD Release

RackHD's current release cycle is 2 weeks. Release information is available on this page: https://rackhd.atlassian.net/wiki/display/RAC1/RackHD+Release+Page

Each release includes several artifacts, and you can choose whichever you prefer:

  • Debian package
  • OVA
  • Vagrant box
  • Docker
  • NPM package

RackHD Glossary

RackHD maintains an official glossary on this Confluence page: https://rackhd.atlassian.net/wiki/display/RAC1/RackHD+Glossary

Exercise

  1. Go through the RackHD homepage and see what resources it offers.
  2. Register all the necessary accounts: GitHub, Confluence, JIRA, Slack, and Google Groups.
  3. Set up a plan to go through the documentation hosted on Read the Docs.
