1000 nodes and beyond
http://blog.kubernetes.io/2016/03/1000-nodes-and-beyond-updates-to-Kubernetes-performance-and-scalability-in-12.html
- 1.1: watches are served by the API server instead of etcd
- 1.2: added a “read cache” at the API server level (in-memory, so reads no longer go to etcd). The cache is updated via watch in the background
- pod density: 30 (1.1) -> 100 (1.2) pods per node, via the Pod Lifecycle Event Generator (PLEG): Docker state is polled once per host instead of once per pod
- scheduler throughput: improved by ~1400%, i.e. the time to schedule 30k pods dropped roughly 15x + link (coreos)
- JSON parser
- 1.3 and beyond:
  - API server is the bottleneck (JSON) -> protobuf
  - label selector performance improvements
  - etcd 3.0, designed with the Kubernetes use case in mind
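
Sketch of the 1.2 read-cache idea from the notes above: reads are answered from an in-memory map, and the map is kept current by applying watch events. This is only an illustration, not the API server's actual code. The `ReadCache` class, the `(type, key, value)` event tuple, and the `pods/...` keys are all made up for the example; only the ADDED / MODIFIED / DELETED event names mirror what a watch stream reports.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical event shape: (event_type, key, value).
Event = Tuple[str, str, Optional[dict]]


@dataclass
class ReadCache:
    """In-memory copy of the backing store, updated only by watch events."""

    objects: Dict[str, dict] = field(default_factory=dict)

    def apply(self, event: Event) -> None:
        """Apply one watch event to the cache."""
        kind, key, value = event
        if kind in ("ADDED", "MODIFIED"):
            self.objects[key] = value
        elif kind == "DELETED":
            # Deleting a key that is already absent is not an error.
            self.objects.pop(key, None)
        else:
            raise ValueError(f"unknown event type: {kind}")

    def get(self, key: str) -> Optional[dict]:
        """Serve a single-object read from memory (no store round trip)."""
        return self.objects.get(key)

    def list_prefix(self, prefix: str) -> List[dict]:
        """Serve a list read from memory, ordered by key."""
        return [v for k, v in sorted(self.objects.items()) if k.startswith(prefix)]


if __name__ == "__main__":
    cache = ReadCache()
    # In a real system these events would arrive from a background watch.
    for ev in [
        ("ADDED", "pods/a", {"phase": "Pending"}),
        ("MODIFIED", "pods/a", {"phase": "Running"}),
        ("ADDED", "pods/b", {"phase": "Pending"}),
        ("DELETED", "pods/b", None),
    ]:
        cache.apply(ev)
    print(cache.get("pods/a"))
    print(cache.list_prefix("pods/"))
```

The trade-off is the usual one for watch-fed caches: reads get cheap, but they can lag slightly behind the store until the next event is applied.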