
GRC ai hub


Technical

Reinforcement Learning from Human Feedback (RLHF)

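A training technique in which an AI model is fine-tuned using feedback from human evaluators so that its outputs align with human preferences. In a typical setup, evaluators compare or rank model outputs, a reward model is trained on those judgments, and the model is then optimized against that reward model with reinforcement learning. RLHF is used in training modern LLMs, including ChatGPT, Claude, and Gemini, to make them more helpful and less likely to produce harmful outputs. The quality of RLHF depends heavily on the diversity and representativeness of the human feedback.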
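To make the role of human feedback concrete, here is a minimal sketch of one common way preference data is turned into a training signal: a pairwise (Bradley-Terry style) loss for a reward model. This is an illustration under stated assumptions, not the method of any particular vendor. The function names, the two-feature representation (a "helpfulness" score and a "harmfulness" score), and the toy data are all hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred response wins.

    Equals -log(sigmoid(reward_chosen - reward_rejected)).
    """
    margin = reward_chosen - reward_rejected
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def fit_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit a linear reward model r(x) = w . x on (chosen, rejected) feature pairs.

    `pairs` is a list of (chosen_features, rejected_features) tuples, where
    each element is a list of `dim` floats (hypothetical hand-made features).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient of -log(sigmoid(margin)) w.r.t. w is -(1 - sigmoid(margin)) * diff
            step = lr * (1.0 - sigmoid(margin))
            w = [wi + step * di for wi, di in zip(w, diff)]
    return w


# Toy preference data: features are [helpfulness, harmfulness] (assumed, for illustration).
toy_pairs = [
    ([0.9, 0.1], [0.4, 0.1]),  # evaluator preferred the more helpful answer
    ([0.7, 0.0], [0.7, 0.8]),  # evaluator preferred the less harmful answer
    ([0.8, 0.2], [0.3, 0.6]),
]
weights = fit_reward_model(toy_pairs, dim=2)
```

In this sketch the fitted reward model can only reflect the comparisons it was given, which is one way to see why the diversity and representativeness of the feedback matter. In a full RLHF pipeline the learned reward would then drive a reinforcement learning step on the language model itself, which is omitted here.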

Referenced in frameworks

NIST AI RMF

Related terms

Fine-tuning, Alignment, Constitutional AI
