What I don’t know

Published:

  • Bounded rationality
  • Non-parametric bandits: The arms' distributions can be multimodal, not conforming to a single-parameter exponential family.
  • Thompson Sampling only works for parametric bandits.
  • Statistical bootstrap (ET94)
  • Pure exploration
  • Ambient dimension
  • Optimal Design
  • Round robin
  • Compact MDPs
  • KWIK learners
  • Tower rule
  • Mansour, Littman, et al. (Lunch 2023)
  • IMPACT (Trimbach 2019)
  • Mirror Descent
  • Paper: Recursive Reward Aggregation
  • Rich’s Dyna with function approximation.
  • Banach Spaces
  • Blackwell optimality
  • Discrete vs. countable
  • E-values and e-processes
  • Why $n-1$ for the empirical variance?
  • Ergodic (i.e., aperiodic, recurrent, and irreducible) MDPs
  • Use of the law of total variance in RL
  • $\sigma$-algebra vs. a topology
  • Why $\mathrm{rank}(A \otimes B) = \mathrm{rank}(A) \cdot \mathrm{rank}(B)$
  • Bayesian vs. Frequentist analysis of TS
  • Gradient vs. derivative
  • Student's t-distribution
  • Exchanging derivative and expectation?
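
For the Kronecker rank item in the list above, here is a small numerical sanity check of $\mathrm{rank}(A \otimes B) = \mathrm{rank}(A) \cdot \mathrm{rank}(B)$. It is only a sketch and not a proof: the helper names `kron` and `rank` and the example matrices are made up for illustration, and the rank is computed by plain Gaussian elimination over exact fractions to avoid floating-point tolerance issues.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def rank(m):
    """Matrix rank via Gaussian elimination over exact rationals."""
    rows = [[Fraction(v) for v in row] for row in m]
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(n_cols):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


# Hypothetical example matrices: A has rank 1, B has rank 2, C has rank 2.
A = [[1, 2], [2, 4]]
B = [[1, 0, 1], [0, 1, 1], [1, 1, 2]]
C = [[1, 2], [3, 4]]

for left, right in [(A, B), (C, B), (A, A)]:
    assert rank(kron(left, right)) == rank(left) * rank(right)
```

One way to see why the identity should hold: the nonzero singular values of $A \otimes B$ are exactly the products of the nonzero singular values of $A$ and $B$, so there are $\mathrm{rank}(A) \cdot \mathrm{rank}(B)$ of them.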