
AI Safety for Product Teams

Evaluate any AI system's safety properties, explain alignment to stakeholders, and make informed decisions about what to ship.

InstructGPT, fine-tuned with human feedback, produces outputs that human raters prefer over those of the much larger GPT-3, showing that model size isn't everything.

Why this paper, here

The paper that brought RLHF to large language models at scale. Training on human preferences is now a standard industry approach to alignment.
