Research by Harvard students on catastrophic risks from advanced AI.

Managing risks from advanced artificial intelligence is one of the most important problems of our time.¹ We are a community of technical and policy researchers at Harvard working to reduce these risks and to steer the trajectory of AI development for the better.

We run a semester-long introductory technical reading group on AI safety research, covering topics such as neural network interpretability,¹ learning from human feedback,² goal misgeneralization in reinforcement learning agents,³ and eliciting latent knowledge.

We also run an introductory AI policy reading group, where we discuss core strategic issues posed by the development of transformative AI systems.

Join our mailing list →

Our members have worked with:

[Organization logos]

Note: Use of organizational logos does not imply current affiliation with or endorsement by these organizations.