Reinforcement Learning
Introduction
- ZipRecruiter on Classifying Job Titles With Noisy Labels Using REINFORCE: fine-grained job title classification with noisy labels, using the REINFORCE algorithm and multi-task learning. The article has a very nice trick: it adds a reward component to the loss function to mitigate the unbalanced class label problem, instead of the usual class balancing.
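To make the reward-in-the-loss idea concrete, here is a minimal sketch of a REINFORCE-style classification loss in plain Python. It is not the article's actual formulation: the reward scheme (inverse class frequency, zero reward for a wrong prediction) and the names `class_rewards` and `reinforce_loss` are assumptions made for illustration only.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_rewards(label_counts):
    """Assumed reward scheme: rarer classes earn a larger reward when predicted correctly."""
    total = sum(label_counts)
    n_classes = len(label_counts)
    return [total / (n_classes * count) for count in label_counts]

def reinforce_loss(logits, true_label, rewards, rng):
    """Sample a class from the policy and return (loss, action, reward).

    The loss is -reward * log pi(action), the usual REINFORCE surrogate.
    """
    probs = softmax(logits)
    action = rng.choices(range(len(probs)), weights=probs)[0]
    # Assumption: a wrong prediction simply gets zero reward.
    reward = rewards[true_label] if action == true_label else 0.0
    loss = -reward * math.log(probs[action])
    return loss, action, reward

rng = random.Random(0)
rewards = class_rewards([90, 10])  # hypothetical counts: 90 common, 10 rare
loss, action, reward = reinforce_loss([2.0, 0.5], true_label=1, rewards=rewards, rng=rng)
```

With these hypothetical counts, a correct prediction of the rare class is rewarded nine times more than a correct prediction of the common class, so the rare class contributes more to the gradient without resampling the data.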
Q-Learning
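- Modeled as a Markov decision process; each step yields a (state, action, new state, reward) tuple.
- Lots of exploration in the beginning, then exploitation.
- Converges to the optimal policy, provided every state-action pair keeps being explored.
- Refer to YouTube here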
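The points above can be sketched as tabular Q-learning on a toy problem. Everything here is an assumption chosen for illustration: the five-state chain environment, the `step` helper, the linear epsilon decay, and the hyperparameters.

```python
import random

N_STATES = 5        # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)    # 0 = move left, 1 = move right

def step(state, action):
    """Hypothetical toy environment: returns (new_state, reward, done)."""
    new_state = max(0, state - 1) if action == 0 else state + 1
    done = new_state == N_STATES - 1
    return new_state, (1.0 if done else 0.0), done

def train(episodes=500, alpha=0.1, gamma=0.9, eps_start=1.0, eps_end=0.05, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]
    for ep in range(episodes):
        # Linear decay: mostly exploration early on, mostly exploitation later.
        eps = eps_start + (eps_end - eps_start) * ep / (episodes - 1)
        state, done = 0, False
        while not done:
            if rng.random() < eps:
                action = rng.choice(ACTIONS)                       # explore
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])   # exploit
            new_state, reward, done = step(state, action)
            # Q-learning update from the (state, action, new state, reward) tuple.
            target = reward + (0.0 if done else gamma * max(q[new_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = new_state
    return q

def greedy_policy(q):
    """Best action for each non-terminal state according to the Q-table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]

q_table = train()
policy = greedy_policy(q_table)
```

On this chain the learned greedy policy should choose "right" (action 1) in every non-terminal state, which is the optimal policy for this toy setup.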
Deep Learning
- PyTorch
RLHF
