The linux scheduler: a Decade of Wasted Cores

Шедулер на одном ядре очень прост: фиксированный интервал, за который любой тред должен работать как минимум один раз. Интервал делится между тредами пропорционально их весу.
Thread accumulates vruntime = runtime/weight. Runqueue: Red-Black tree of threads, sorted by vruntime.
Multi-core => complex. Scalability dictates per-core runqueue: context switch is expensive
Balance runqueue on different cores
Balancing - modify cached data strucures (cache misses)
Periodic load balancing (rare) + emergency load balancing (if core is idle)
load = combination of thread’s weight and CPU utilization
autogroup = tty -> the same cgroup
scheduling: NUMA nodes -> 4 groups by 2 cores -> 2 cores
Bug1: Group Imbalance bug (one R process - vs - 64 make threads). Fix: compare average load -> compare minimum load
Bug2: Scheduling Group Construction bug ({0,1,2,4,6} , {1,2,3,4,5,7}: 1,2 in both groups). Fix: scheduling group construction algorithm -> 1,2 no long included in all scheduling groups
Bug3: Overload-on-Wakeup bug: wake up on current core if it is idle, otherwise - on the longest idle core.
Bug4: Missing Scheduling Domain bug (disable+enable via /proc -> load balancing is no longer performs). Fix: fix the bug :)
Tools: Online Sanity Checker, Scheduler Visualization tool.

Demystifying Deep Reinforcement Learning

small company in London “Deep Mind” uploaded paper “Playing Atari with Deep Reinforcement Learning”
Reinforcement Learning for Breakout game.
Credit assignment problem
Explore-exploit dilemma