Humans are Reinforcement-Learners too
Reinforcement Learning from Human Feedback (RLHF) is all the rage in AI today. Now that I have two nieces I see regularly, I am reminded that humans are extremely sensitive reinforcement learners too… and that incredible amounts of resources are dedicated to deciding who gets to provide the feedback during the initial training run.