We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference modeling and alignment problem.
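To make "preference modeling" concrete, here is a minimal sketch of one common formulation: a Bradley-Terry pairwise loss over items ranked by a community signal. This is an illustration under stated assumptions, not the repository's actual training code. The function names, the dictionary of signal values, and the choice of Bradley-Terry are all hypothetical.

```python
import math


def pairs_from_signal(signals):
    """Turn per-item community signal values into (preferred, other) pairs.

    `signals` is a hypothetical mapping from item name to a numeric signal;
    the item with the higher value is treated as preferred. Ties are skipped.
    """
    pairs = []
    names = list(signals)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if signals[a] > signals[b]:
                pairs.append((a, b))
            elif signals[b] > signals[a]:
                pairs.append((b, a))
    return pairs


def bradley_terry_loss(score_preferred, score_other):
    """Negative log-likelihood that the preferred item wins.

    Under a Bradley-Terry model, P(preferred wins) = sigmoid(margin),
    so the loss is -log(sigmoid(margin)) = log(1 + exp(-margin)).
    """
    margin = score_preferred - score_other
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Illustrative values only (not real data).
signals = {"item_a": 3, "item_b": 1}
model_scores = {"item_a": 0.2, "item_b": 0.5}
for preferred, other in pairs_from_signal(signals):
    loss = bradley_terry_loss(model_scores[preferred], model_scores[other])
    print(preferred, other, round(loss, 4))
```

The loss shrinks as the model scores the community-preferred item higher than the other one, which is the sense in which a preference model is fit to pairwise comparisons.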
This repository is tracked by Trending Repos. The badge upgrades automatically if it ever cracks the top 100.
<img src="https://trending-repos.com/badge/tongjingqi/AI-Can-Learn-Scientific-Taste.svg" alt="Trending Repos" />