Methods in Reinforcement Learning

July 4, 2023•182 words

Artificial Intelligence

Instincts (predefined logic)
Machine Learning (statistical data + improvement)
- Supervised Learning (fully labelled data)
- Reinforcement Learning (only a target or reward signal, no labelled examples)
  - Bruteforce
    - Sequential bruteforce
    - Random bruteforce
  - Intuitive Explore and Exploit (Policy Network)
    - Single action explore and exploit
      - Take a random action (explore), or reuse an action known from past experience (exploit)
    - Sequence of actions explore and exploit
      - Monte Carlo Tree Search (for max) explore and exploit
  - Explore and Exploit with Bellman (Q-learning)
    - Single action explore and exploit
    - Uses a q-table (memory grows with the number of state-action pairs, but every case is stored)
    - Update formula:
      - $q += learning\_rate \times (reward + discount \times max(q_{t+1}) - q)$
      - learning_rate scales each q-table update; the bracketed term it multiplies is the temporal-difference (TD) error
  - Explore and Exploit with Bellman (Q-network)
    - Single action explore and exploit
    - Uses a q-network (small memory footprint, but generalises only to cases similar to those seen in training)
    - Network expected value:
      - $q_{target} = reward + discount \times max(q_{t+1})$
      - learning_rate does not appear in this formula; it is set on the optimiser
- Unsupervised Learning (no labelled data)
  - With neuralnet:
    - Autoclustering, etc.
  - Non-neuralnet:
    - K-means, etc.
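
The two Bellman formulas in the outline above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the Q-table is assumed to be a list of lists indexed as `q_table[state][action]`, the helper names (`epsilon_greedy`, `td_target`, `q_table_update`) and the `done` flag are hypothetical, and epsilon-greedy is only one common way to do single-action explore and exploit, not something the outline prescribes.

```python
import random


def epsilon_greedy(q_row, epsilon, rng=random):
    """Pick an action index: explore with probability epsilon, else exploit.

    q_row is the list of q-values for the current state (assumed layout).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))  # explore: uniformly random action
    # exploit: index of the largest q-value seen so far
    return max(range(len(q_row)), key=lambda a: q_row[a])


def td_target(reward, discount, next_q_row, done=False):
    """q_target = reward + discount * max(q_{t+1}).

    Assumption: on a terminal step (done=True) there is no next state,
    so the target is just the reward.
    """
    if done:
        return reward
    return reward + discount * max(next_q_row)


def q_table_update(q_table, state, action, reward, next_state,
                   learning_rate, discount, done=False):
    """Apply q += learning_rate * (q_target - q) in place; return the TD error."""
    target = td_target(reward, discount, q_table[next_state], done)
    td_error = target - q_table[state][action]
    q_table[state][action] += learning_rate * td_error
    return td_error


# Toy usage: 2 states x 2 actions, values chosen only for illustration.
demo_table = [[0.0, 0.0], [1.0, 2.0]]
demo_action = epsilon_greedy(demo_table[0], epsilon=0.1)
demo_error = q_table_update(demo_table, state=0, action=demo_action,
                            reward=1.0, next_state=1,
                            learning_rate=0.5, discount=0.9)
print(demo_action, demo_error, demo_table[0])
```

The value returned by `td_target` is the same quantity a q-network would be trained to predict; there the optimiser's own learning rate takes the place of the explicit `learning_rate` step. The table update can also be read as $q \leftarrow (1 - learning\_rate) \times q + learning\_rate \times q_{target}$, which makes it clear that each update moves the stored value part of the way toward the target.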

