Why this document is free gold

NASA has been applying RCM to its facilities since 1996, and the 2008 guide is the mature distillation of that program. It was written for the people who actually do the work – facility planners, designers, commissioning agents, maintenance and operations crews – which is why it reads like an operations manual rather than a theory book.

What sets it apart is the span. Commercial books cover one slice each: Moubray covers the decision logic, vendors cover their monitoring technology, consultants cover the program management. The NASA guide covers the whole chain in one free volume: the RCM philosophy, system selection and criticality, FMEA, the task-selection logic trees, PM optimization, a genuinely good handbook of Predictive Testing & Inspection (PT&I) technologies with alert criteria, and even reliability-centered acceptance of new equipment – catching defects at commissioning before they become your maintenance backlog.

Core idea

RCM is not a maintenance type – it's the logic for mixing them. NASA defines RCM as employing the full range of strategies, from deliberate run-to-failure to streamlined FMEA combined with predictive testing, chosen per asset based on consequences and economics. No single approach is "best"; the mix is.

The four maintenance strategies

The guide's foundation chapter lays out the four approaches every plant uses – knowingly or not – and what each is legitimately for.

RCM: the right mix, per asset

  1. Reactive (run-to-failure): fix it when it breaks. Legitimate for small, redundant, non-critical items where repair is cheap and harmless. Wrong as a default, right as a decision.
  2. Preventive (interval-based): service on a schedule. For failure modes with a real age signature (wear, fouling) and for statutory tasks. Over-used everywhere; see PM optimization.
  3. PT&I (predictive / condition-based): measure, then act. Vibration, thermography, oil, ultrasonics: intervene on evidence, not on the calendar. NASA's preferred approach where applicable.
  4. Proactive (eliminate the cause): stop failures being born. Root cause analysis, precision rebuild specs, acceptance testing, design improvement. The highest-return quadrant, least funded.
Fig. 1 — The four maintenance strategies blended by the NASA RCM approach, drawn by Rob Reliability after the NASA RCM Guide (2008, ch. 3). Criticality and consequence decide which mix each asset gets.

Two NASA emphases stand out even today. First, the guide is unusually honest that reactive maintenance is a valid strategy when chosen deliberately for the right assets – most plants either pretend they don't run anything to failure, or run everything to failure by accident. Second, its proactive chapter goes beyond maintenance into precision specifications, root-cause failure analysis and reliability-centered acceptance testing of new installations – the same territory as Plucknette's D-I-P-F proactive domain, written by a facilities organization two decades ago.

The RCM analysis process

The guide's working core: a repeatable pipeline from "list of equipment" to "defensible maintenance program". This is the same skeleton we still use on client sites – only the tooling has changed.

  Step 1. System selection & criticality ranking: boundaries, functions, rank by mission/safety/cost impact.
  Step 2. FMEA on the critical systems: failure modes, effects, causes, streamlined where justified.
  Step 3. Consequence assessment & logic tree: evident or hidden? Safety, operational or economic?
  Step 4. Task selection: pick the lowest-cost task that handles the consequence. The options are a PT&I / CBM condition-based task, an interval PM (restore / replace), a failure-finding test (hidden functions), run-to-failure (deliberate, documented), or redesign (when no task works).
  Step 5. Implement, measure, feed back: package into PMs & routes, track results, adjust with age exploration.
  Continuous improvement: failures & data re-enter the analysis.
Fig. 2 — The RCM analysis pipeline, drawn by Rob Reliability after the NASA RCM Guide (2008). NASA's pragmatism shows in step 2: full FMEA rigor for the critical few, streamlined templates for the rest – analysis effort should follow criticality.
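The consequence and task-selection steps of the pipeline can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the guide's actual logic tree: the `FailureMode` fields, the order of the checks and the cost comparison are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Task(Enum):
    CONDITION_BASED = "PT&I / condition-based task"
    INTERVAL_PM = "interval PM (restore / replace)"
    FAILURE_FINDING = "failure-finding test"
    RUN_TO_FAILURE = "run-to-failure (deliberate, documented)"
    REDESIGN = "redesign"


@dataclass
class FailureMode:
    # All fields are hypothetical worksheet columns for this sketch.
    name: str
    hidden: bool                # not evident to the crew in normal operation
    safety_or_mission: bool     # consequence too serious to simply accept
    detectable_condition: bool  # a PT&I technique gives usable warning
    age_related: bool           # genuine wear-out pattern
    task_cost: float            # annualised cost of the candidate task
    failure_cost: float         # annualised cost of letting it fail


def select_task(fm: FailureMode) -> Task:
    """Pick a task type for one failure mode (simplified assumption)."""
    candidate: Optional[Task]
    if fm.detectable_condition:
        candidate = Task.CONDITION_BASED
    elif fm.age_related:
        candidate = Task.INTERVAL_PM
    elif fm.hidden:
        candidate = Task.FAILURE_FINDING
    else:
        candidate = None

    if fm.safety_or_mission:
        # Serious consequences: a workable task is mandatory, else redesign.
        return candidate if candidate is not None else Task.REDESIGN

    # Purely economic consequences: the task must pay for itself.
    if candidate is not None and fm.task_cost < fm.failure_cost:
        return candidate
    return Task.RUN_TO_FAILURE


if __name__ == "__main__":
    bearing = FailureMode("pump bearing wear", False, False, True, True, 500, 5000)
    print(bearing.name, "->", select_task(bearing).value)
```

The point of the sketch is the shape of the decision: applicability is checked first, then consequence decides whether an uneconomic task is still required or whether run-to-failure is the documented choice.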

That proportionality is the guide's quiet masterstroke. A classical RCM analysis of everything is unaffordable; no analysis at all is negligent. NASA's answer – rank first, then scale the analysis depth to criticality – is exactly how we run programs today, with one update: AI now drafts the FMEAs and criticality worksheets from manuals and CMMS history, so the engineers' time goes into validation (see our Maintenance Strategy Optimization solution).
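To make "rank first, then scale the depth" concrete, here is a minimal Python sketch. The 1 to 5 scoring scale, the weights, the thresholds and the safety override are illustrative assumptions, not values taken from the NASA guide; adapt them to your site's consequences.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical integer weights for the three impact dimensions.
WEIGHTS = {"mission": 2, "safety": 2, "cost": 1}


@dataclass(frozen=True)
class System:
    name: str
    mission: int  # 1 (negligible) .. 5 (mission-stopping), assumed scale
    safety: int   # 1 .. 5
    cost: int     # 1 .. 5


def criticality_score(s: System) -> int:
    """Weighted sum of the three impact scores (maximum 25)."""
    return (WEIGHTS["mission"] * s.mission
            + WEIGHTS["safety"] * s.safety
            + WEIGHTS["cost"] * s.cost)


def analysis_depth(s: System) -> str:
    """Map criticality to analysis effort (thresholds are assumptions)."""
    score = criticality_score(s)
    if s.safety == 5 or score >= 20:
        return "full FMEA"
    if score >= 12:
        return "streamlined FMEA template"
    return "basic review"


def rank(systems: List[System]) -> List[System]:
    """Most critical systems first."""
    return sorted(systems, key=criticality_score, reverse=True)


if __name__ == "__main__":
    plant = [
        System("office exhaust fan", 1, 1, 2),
        System("chilled water plant", 5, 3, 4),
        System("fire pump", 3, 5, 3),
    ]
    for s in rank(plant):
        print(f"{s.name}: score {criticality_score(s)}, {analysis_depth(s)}")
```

The safety override is one possible design choice: it stops a high-safety, low-cost system from being averaged down into a lighter analysis tier.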

The PT&I toolbox

Nearly half the guide is a practical handbook of Predictive Testing & Inspection technologies – what each one detects, on which equipment, with starting-point alert criteria. The main entries:

Technology | Detects | Typical targets
Vibration analysis | Imbalance, misalignment, bearing defects, looseness, gear faults | All rotating machinery
Infrared thermography | Hot connections, overloads, refractory loss, steam trap failure, moisture | Electrical gear, switchboards, roofs, insulation
Lubricant & wear particle analysis | Wear metals, contamination, degraded lubricant properties | Gearboxes, hydraulics, engines, large bearings
Ultrasonics (airborne & structural) | Leaks, electrical discharge, early bearing/lube distress | Compressed air/gas systems, valves, bearings
Motor circuit analysis | Insulation degradation, rotor/stator faults, connection issues | Motors and motor circuits
Process parameter trending | Efficiency loss, fouling, degradation hiding in normal operation | Pumps, heat exchangers, chillers, compressors

The guide's framing has aged well into the IIoT era: each technology is just a way of finding the P point earlier on the P-F curve, where P is the point at which a developing failure first becomes detectable and F is the functional failure itself. Wireless sensors and ML anomaly detection didn't replace this chapter – they made it cheaper to apply continuously. The discipline NASA insists on (know the failure mode, know the alert criteria, act on the result) is precisely what most "smart sensor" rollouts still skip.
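A small Python sketch of that discipline, under stated assumptions: the inspection interval is set to a fraction of the P-F interval (half is a common RCM rule of thumb, used here as the default), and a reading is classified against alert and alarm limits. The function names and the example numbers are hypothetical; real limits should come from the guide's tables, the OEM or your own baseline data.

```python
def inspection_interval(pf_interval_days: float, checks_within_pf: int = 2) -> float:
    """Interval that yields `checks_within_pf` inspections inside one P-F interval.

    The default of 2 reflects the common "half the P-F interval" rule of thumb.
    """
    if pf_interval_days <= 0:
        raise ValueError("P-F interval must be positive")
    if checks_within_pf < 1:
        raise ValueError("need at least one check within the P-F interval")
    return pf_interval_days / checks_within_pf


def classify(reading: float, alert: float, alarm: float) -> str:
    """Compare one measurement with its alert and alarm limits."""
    if alert >= alarm:
        raise ValueError("alert limit must be below alarm limit")
    if reading >= alarm:
        return "alarm"
    if reading >= alert:
        return "alert"
    return "normal"


if __name__ == "__main__":
    # Placeholder values: a 90-day P-F interval and vibration limits in mm/s.
    print("inspect every", inspection_interval(90), "days")
    print("status:", classify(4.6, alert=4.5, alarm=7.1))
```

A continuous sensor simply shrinks the inspection interval toward zero; the alert criteria and the named failure mode behind them are still required for the reading to mean anything.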

Using it in 2026

How we'd actually exploit a free 472-page guide today.

  1. Use it as the free textbook for your team. New reliability engineers get more practical RCM education from this PDF than from most paid courses. Assign chapters 3–5 first.
  2. Steal the criticality and FMEA templates. They're public, proven, and better than most blank-page attempts. Adapt the scoring to your site's consequences.
  3. Benchmark your PdM program against the PT&I chapters. For each technology you pay for: does every route point at a named failure mode with defined alert criteria? NASA's tables make the gaps obvious.
  4. Adopt reliability-centered acceptance. The guide's most under-used idea: PT&I testing at commissioning (vibration signature, alignment records, thermal scans at handover) so new equipment starts life defect-free – the cheapest reliability you'll ever buy.
  5. Modernize the delivery, not the logic. Pair the guide's process with current tooling: AI-drafted FMEAs, automated bad-actor screening, continuous monitoring. Same skeleton, 10× the speed.
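The benchmarking check in step 3 is easy to automate once the routes are exported from the CMMS. A minimal Python sketch, assuming a hypothetical `RoutePoint` record with optional failure-mode and alert-criteria fields (real export formats will differ):

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RoutePoint:
    # Hypothetical shape of one PdM route entry.
    asset: str
    technology: str
    failure_mode: Optional[str] = None
    alert_criteria: Optional[str] = None


def audit_routes(points: List[RoutePoint]) -> List[Tuple[str, str, List[str]]]:
    """Return (asset, technology, missing items) for every incomplete point."""
    gaps = []
    for p in points:
        missing = []
        if not p.failure_mode:
            missing.append("failure mode")
        if not p.alert_criteria:
            missing.append("alert criteria")
        if missing:
            gaps.append((p.asset, p.technology, missing))
    return gaps


if __name__ == "__main__":
    route = [
        RoutePoint("P-101 motor", "vibration", "bearing defect", "velocity alert limit set"),
        RoutePoint("MCC-3", "thermography", "hot connection"),
        RoutePoint("AHU-7 fan", "ultrasonics"),
    ]
    for asset, tech, missing in audit_routes(route):
        print(f"{asset} ({tech}): missing {', '.join(missing)}")
```

Any point that shows up in the output is a measurement being paid for without a defined failure mode or a defined trigger to act on.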

Bottom line

Moubray tells you why (read our RCM II summary); NASA shows you how, with templates, logic trees and a monitoring handbook – for free. Together they're a complete RCM education for the price of one book.

References & further reading

This summary is original explanatory writing. All concepts belong to their authors – go to the sources.

  1. NASA. Reliability-Centered Maintenance Guide for Facilities and Collateral Equipment. Final, September 2008. Official free PDF (nasa.gov)
  2. NASA. Reliability Centered Building and Equipment Acceptance Guide, July 2004 – the companion on commissioning acceptance. Official free PDF (nasa.gov)
  3. WBDG. Reference page for the NASA RCM Guide. wbdg.org
  4. Moubray, J. RCM II – the decision logic in depth. Our summary
  5. Nowlan, F.S. & Heap, H.F. Reliability-Centered Maintenance, 1978 – the origin. DTIC record (free)

Disclaimer. This page is an independent educational summary written entirely in Rob Reliability's own words. The NASA RCM Guide is a work of the U.S. Government and is in the public domain; this summary is nonetheless our own original writing and our own diagrams, not a reproduction of the guide. NASA does not endorse Rob Reliability, and the NASA name is used solely to identify the publication being discussed. If you have any concern about this page, contact us at hello@robreliability.com and we will address it promptly.
