Impactful research. At scale.
- Matias Zabaljauregui
- Dec 26, 2024
- 1 min read
Apart Research is a non-profit AI safety lab. We host open-to-all research sprints, publish papers, and incubate talented researchers to make AI safe and beneficial for humanity.

Safe AI
Publishing rigorous empirical work for safe AI: evaluations, interpretability and more.
Novel approaches
Our research is underpinned by novel approaches focused on neglected topics.
Improving Llama-3-8B-Instruct Hallucination Robustness in Medical Q&A Using Feature Steering
This paper addresses the risks of hallucinations in LLMs within critical domains like medicine. It proposes methods to (a) reduce hallucination probability in responses, (b) inform users of hallucination risks and model accuracy for specific queries, and (c) display hallucination risk through a user interface. Steered model variants demonstrate reduced hallucinations and improved accuracy on medical queries. The work bridges interpretability research with practical AI safety, offering a scalable solution for the healthcare industry. Future efforts will focus on identifying and removing distractor features in classifier activations to enhance performance.
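The "steered model variants" mentioned above rely on activation steering: nudging a model's hidden activations along an interpretable feature direction at inference time. A minimal sketch of that idea follows, assuming the common recipe from interpretability work (a forward hook that adds a scaled feature vector to one layer's output). The tiny model, the `steer_dir` vector, and the strength `alpha` are illustrative stand-ins, not the paper's actual features or architecture.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a transformer: one "block" followed by a readout layer.
model = nn.Sequential(
    nn.Linear(8, 8),   # stand-in for a transformer block's output
    nn.Linear(8, 4),   # stand-in for the unembedding / readout
)

# Hypothetical feature direction (e.g. one found via a sparse autoencoder).
steer_dir = torch.randn(8)
steer_dir = steer_dir / steer_dir.norm()  # unit-norm direction
alpha = 3.0                               # steering strength

def steering_hook(module, inputs, output):
    # Shift this layer's activations along the feature direction.
    return output + alpha * steer_dir

handle = model[0].register_forward_hook(steering_hook)

x = torch.randn(2, 8)
with torch.no_grad():
    steered = model(x)
handle.remove()
with torch.no_grad():
    baseline = model(x)

# Because the shift is applied before a linear layer, every input's
# output moves by the same fixed offset:
delta = steered - baseline
print(torch.allclose(delta[0], delta[1], atol=1e-5))
```

In a real LLM the hook would target a specific residual-stream layer, and `steer_dir` would be a feature learned to correlate with (for example) hallucination behavior, so subtracting or adding it shifts the model's tendency at generation time.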


