Methods in Reinforcement Learning
July 4, 2023
Artificial Intelligence
- Instincts (predefined logic)
- Machine Learning (statistical data + improvement)
  - Supervised Learning (all data)
  - Reinforcement Learning (target data only)
    - Brute force
      - Sequential brute force
      - Random brute force
    - Intuitive explore and exploit (policy network)
      - Single-action explore and exploit
        - Random action, but informed by actions known from the past
      - Sequence-of-actions explore and exploit
        - Monte Carlo Tree Search (for max) explore and exploit
    - Explore and exploit with Bellman (Q-learning)
      - Single-action explore and exploit
      - Use a Q-table (large memory issue, but covers all cases)
      - Update formula: Q(s, a) ← Q(s, a) + learning_rate * (reward + discount_factor * max_a' Q(s', a') - Q(s, a))
      - learning_rate scales the update to the Q-table; the term it multiplies (to its right) is the temporal difference
    - Explore and exploit with Bellman (Q-network)
      - Single-action explore and exploit
      - Use a Q-network (small memory, but similar cases only)
      - Network expected value: reward + discount_factor * max_a' Q(s', a')
      - learning_rate does not appear in this formula; it is defined in the optimiser
  - Unsupervised Learning (no data)
    - With a neural net:
      - Autoclustering, etc.
    - Without a neural net:
      - K-means, etc.
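
To make the Q-learning branch of the outline concrete, here is a minimal sketch of single-action explore and exploit with a Q-table. It is an illustration under stated assumptions, not a reference implementation: the five-state chain environment, the epsilon-greedy rule, and the names `td_target`, `q_table_update`, `choose_action`, `step`, `train`, `epsilon` and `discount` are all hypothetical choices made for this example. The same `td_target` value is what a Q-network would typically be trained to predict, with the learning rate moved into the optimiser.

```python
import random


def td_target(reward, next_q_values, discount, done):
    """Bellman target: reward plus the discounted best next value (reward only if terminal)."""
    return reward if done else reward + discount * max(next_q_values)


def q_table_update(q_table, state, action, reward, next_state, done,
                   learning_rate=0.1, discount=0.9):
    """Apply one tabular Q-learning step in place and return the temporal difference."""
    target = td_target(reward, q_table[next_state], discount, done)
    temporal_difference = target - q_table[state][action]
    q_table[state][action] += learning_rate * temporal_difference
    return temporal_difference


def choose_action(q_values, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit the best known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties at random so an all-zero table still explores.
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def step(state, action, n_states):
    """Hypothetical chain environment: action 1 moves right, action 0 moves left.

    Reaching the last state ends the episode with reward 1; every other step gives 0.
    """
    next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def train(n_states=5, episodes=300, epsilon=0.2, seed=0):
    """Run Q-learning on the chain and return the learned Q-table."""
    rng = random.Random(seed)
    q_table = [[0.0, 0.0] for _ in range(n_states)]  # one row per state, one column per action
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q_table[state], epsilon, rng)
            next_state, reward, done = step(state, action, n_states)
            q_table_update(q_table, state, action, reward, next_state, done)
            state = next_state
    return q_table


q_table = train()
# Greedy action in each non-terminal state after training.
greedy_policy = [row.index(max(row)) for row in q_table[:-1]]
print(greedy_policy)
```

With these settings the greedy policy should pick action 1 (move right) in every non-terminal state, since that is the shortest path to the only reward. The table needs one row per state, which is the memory issue noted in the outline; a Q-network would replace the table lookup with a function approximator trained towards `td_target`.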