What I don’t know
Published:
- Bounded rationality
- Non-parametric bandits: an arm's reward distribution can be multimodal and need not belong to a single-parameter exponential family.
- Thompson Sampling only works for parametric bandits.
- Statistical bootstrap (ET94)
- Pure exploration
- Ambient dimension
- Optimal Design
- Round robin
- Compact MDPs
- KWIK learners
- Tower rule
- Mansour, Littman, et al. (Lunch 2023)
- IMPACT (Trimbach 2019)
- Mirror Descent
- Paper: Recursive Reward Aggregation
- Rich’s Dyna with function approximation.
- Banach Spaces
- Blackwell optimality
- discrete vs countable
- e-values and e-processes
- Why $n-1$ for the empirical variance?
- ergodic, i.e., aperiodic, recurrent, and irreducible MDPs
- Use of the law of total variance in RL
- $\sigma$-algebra vs. a topology
- Why $\mathrm{rank}(A \otimes B) = \mathrm{rank}(A) \cdot \mathrm{rank}(B)$
- Bayesian vs. Frequentist analysis of TS
- Gradient vs. derivative
- Student's t-distribution
- Exchanging derivative and expectation?
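
One item in the list, the $n-1$ in the empirical variance, can at least be checked by brute force. The sketch below is only an illustration under assumptions of my own: a made-up three-point population (`population`), i.i.d. sampling with replacement, and a hypothetical helper `sample_variance`. It enumerates every equally likely sample of size $n$ and averages both estimators exactly with fractions. Dividing by $n-1$ recovers the population variance on average, while dividing by $n$ underestimates it by the factor $(n-1)/n$.

```python
from fractions import Fraction
from itertools import product


def sample_variance(xs, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** 2 for x in xs) / (n - ddof)


# Hypothetical toy population; any finite list of integers works.
population = [1, 2, 6]
n = 2  # sample size

# Variance of the population itself (divide by its size, ddof=0).
true_var = sample_variance(population, ddof=0)

# All i.i.d. samples of size n drawn with replacement; each is equally likely.
samples = list(product(population, repeat=n))

# Exact expectations of the two estimators over all samples.
mean_biased = sum(sample_variance(s, 0) for s in samples) / len(samples)
mean_unbiased = sum(sample_variance(s, 1) for s in samples) / len(samples)

print("population variance:   ", true_var)
print("E[divide by n]:        ", mean_biased)
print("E[divide by n - 1]:    ", mean_unbiased)
```

The check shows that the correction works in this case, not why. The usual explanation is that the sample mean is fitted to the same data, which uses up one degree of freedom and makes the deviations from it systematically smaller than the deviations from the true mean.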