CAS++: Cold-Aware Scheduling with Configurable TTL and Prewarming

Interactive demonstration of priority-aware serverless scheduling

Cong Ye • Qinjian Zhao • Guanlin Li

Wenzhou Kean University

Composite Priority • Configurable TTL • Intelligent Prewarming

Section 1: Performance Comparison

Compare five schedulers on the same workload. Adjust the parameters and run the simulation to see the performance metrics.

Core Concepts: Priority & Scheduling Strategy

1. Task Priority Classification

In serverless systems, tasks differ in business importance and latency tolerance. We classify tasks into four priority levels:

🔴 Critical (P0)
Business-critical requests that require an immediate response, e.g. payment transactions, real-time communication
🟡 Normal (P1)
Standard requests with moderate latency tolerance, e.g. user queries, data processing
🔵 Batch (P2)
Batch tasks that can be deferred, e.g. log analysis, report generation
⚪ Low (P3)
Background tasks with the lowest priority, e.g. data backup, cleanup tasks

2. Container States & Cold Start

Serverless functions run in containers, and the state of the container directly affects task execution latency:

  • Warm Container: already initialized and ready to execute immediately → low latency (~100 ms)
  • Cold Container: must be created and initialized first → high latency (~2000 ms)
  • Container Pool: system resources are limited, so the total number of containers is capped (e.g., max 5)
  • Eviction: when the pool is full, idle containers are evicted to make room for new requests
Core Challenge: With a limited container pool, how can high-priority tasks run on warm containers as often as possible without starving low-priority tasks?
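To make the challenge concrete, here is a minimal sketch of a fixed-capacity warm pool with a TTL and plain LRU eviction. It is not the simulator used on this page: the class name `ContainerPool`, the one-container-per-function model, and the TTL value are assumptions for illustration, and only the ~100 ms / ~2000 ms latencies come from the text above.

```python
from dataclasses import dataclass, field

WARM_LATENCY_MS = 100    # approximate warm-start latency from the text
COLD_LATENCY_MS = 2000   # approximate cold-start latency from the text


@dataclass
class ContainerPool:
    """Fixed-capacity pool of idle warm containers, one per function (assumed)."""
    capacity: int = 5    # e.g. max 5 containers, as in the text
    ttl: float = 1.0     # seconds an idle container stays warm (assumed value)
    last_used: dict = field(default_factory=dict)  # function -> last use time

    def expire(self, now: float) -> None:
        """Drop containers that have been idle longer than the TTL."""
        self.last_used = {f: t for f, t in self.last_used.items()
                          if now - t <= self.ttl}

    def invoke(self, function: str, now: float) -> int:
        """Run `function` at time `now` and return its start latency in ms."""
        self.expire(now)
        if function in self.last_used:            # warm hit
            self.last_used[function] = now
            return WARM_LATENCY_MS
        if len(self.last_used) >= self.capacity:
            # Pool full: evict the least recently used container.
            victim = min(self.last_used, key=self.last_used.get)
            del self.last_used[victim]
        self.last_used[function] = now            # cold start creates a container
        return COLD_LATENCY_MS


pool = ContainerPool(capacity=2, ttl=1.0)
latencies = [
    pool.invoke("pay", 0.0),     # cold: pool is empty
    pool.invoke("pay", 0.5),     # warm: reused within the TTL
    pool.invoke("report", 0.6),  # cold: new function
    pool.invoke("backup", 0.7),  # cold: pool full, LRU evicts "pay"
    pool.invoke("pay", 0.8),     # cold again: "pay" was evicted
]
print(latencies)  # [2000, 100, 2000, 2000, 2000]
```

In this toy run a priority-blind LRU policy evicts the container of the payment function to serve a backup job, which is the kind of outcome a priority-aware scheduler tries to avoid.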

3. CAS++ Scheduling Strategy

CAS++ uses a composite priority formula that weighs both business priority and cold-start sensitivity:

P(t) = α·B(t) + β·C_sens(t) + aging(t)
α = 0.6 (business priority weight) • β = 0.4 (cold-start sensitivity weight) • aging(t) (grows with waiting time to prevent starvation)
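As a rough illustration of the formula, the sketch below computes the composite score in Python. The weights α = 0.6 and β = 0.4 come from the text; the per-class values of B(t), the example cold-start sensitivities, and the linear form of `aging` are assumptions, since the page does not define them.

```python
ALPHA = 0.6  # business-priority weight (from the text)
BETA = 0.4   # cold-start-sensitivity weight (from the text)

# Assumed mapping from priority class to B(t) in [0, 1]; not given in the source.
BUSINESS_SCORE = {"critical": 1.0, "normal": 0.6, "batch": 0.3, "low": 0.1}


def aging(wait_s: float, rate: float = 0.05) -> float:
    """Assumed aging term: grows linearly with the time spent waiting."""
    return rate * wait_s


def composite_priority(priority: str, cold_sensitivity: float, wait_s: float) -> float:
    """P(t) = alpha * B(t) + beta * C_sens(t) + aging(t)."""
    return (ALPHA * BUSINESS_SCORE[priority]
            + BETA * cold_sensitivity
            + aging(wait_s))


fresh_critical = composite_priority("critical", cold_sensitivity=0.9, wait_s=0.0)
fresh_low = composite_priority("low", cold_sensitivity=0.2, wait_s=0.0)
waited_low = composite_priority("low", cold_sensitivity=0.2, wait_s=20.0)

# A newly arrived Low task scores far below a Critical one, but after
# waiting long enough the aging term lifts it above the fresh Critical task.
print(fresh_low < fresh_critical < waited_low)  # True
```

With an unbounded aging term like this one, any waiting task eventually outranks newly arrived work, which is one simple way to obtain the starvation protection described above.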

Scheduling Process:

  1. When a task arrives, compute its composite priority score
  2. Assign warm containers to the highest-scoring tasks first (Critical tasks first)
  3. When the pool is full, evict warm containers held for low-priority tasks
  4. Use the aging mechanism to keep low-priority tasks from starving
  5. During traffic bursts, proactively prewarm functions with a high cold-start cost
CAS++ Advantages: A far lower cold-start rate for Critical tasks (12.3%, versus 45-85% for the other schedulers), 47% lower average latency than FIFO-NoEvict, and fairness maintained across all priority levels
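Steps 1 to 3 of the scheduling process can be sketched as a small dispatch loop. This is an illustrative reading rather than the CAS++ implementation: the function `dispatch`, the use of the last task's score as the eviction key, and the example scores are all assumptions, and the scores are taken as already computed (aging included).

```python
import heapq


def dispatch(tasks, warm, capacity):
    """Serve tasks in descending score order and report warm/cold starts.

    tasks:    list of (name, function, score) with precomputed scores.
    warm:     dict mapping function -> score of the task that last used its
              idle warm container (a hypothetical eviction key).
    capacity: maximum number of warm containers.
    """
    # heapq is a min-heap, so push negated scores to pop the highest first.
    heap = [(-score, name, fn) for name, fn, score in tasks]
    heapq.heapify(heap)
    log = []
    while heap:
        neg_score, name, fn = heapq.heappop(heap)
        if fn in warm:
            log.append((name, "warm"))
        else:
            if len(warm) >= capacity:
                # Pool full: evict the container tied to the lowest score.
                victim = min(warm, key=warm.get)
                del warm[victim]
            log.append((name, "cold"))
        warm[fn] = -neg_score  # container is warm for this function now
    return log


warm_pool = {"backup": 0.14, "query": 0.5}
arrivals = [
    ("t1", "backup", 0.14),
    ("t2", "pay", 0.96),
    ("t3", "query", 0.5),
    ("t4", "pay", 0.96),
]
log = dispatch(arrivals, warm_pool, capacity=2)
print(log)  # [('t2', 'cold'), ('t4', 'warm'), ('t3', 'warm'), ('t1', 'cold')]
```

The high-score payment tasks displace the backup container, so the cold start lands on the low-priority task that arrived first but tolerates latency best.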

Section 2: Interactive Scheduler Visualization

Watch tasks execute step by step. Adjust the parameters to create different scenarios and see how each scheduler behaves.

This section explains why the policy works: the animation and live metrics show how priority queues, container eviction, and intelligent prewarming affect cold starts and tail latency, consistent with the comparison charts above.
Average Latency Comparison
Lower average latency is better; under the same load, CAS++ achieves significantly lower latency.
Priority Score Comparison (Higher is Better)
A higher priority score indicates a better balance between business priority and cold-start cost; CAS++ scores the highest.
Tail Latency Comparison: p95 & p99 (Lower is Better)
Lower p95/p99 is better; CAS++ markedly reduces tail latency, improving user experience.

Visualization Analysis & Key Mechanisms

🚀 Intelligent Prewarming

How it works: When the system detects a traffic burst (a rapid rise in the request arrival rate) and there are functions with a high cold-start cost, the scheduler proactively creates warm containers for the functions likely to hit a cold start, so that containers are ready when the requests arrive. This prevents cascading cold starts from causing severe tail-latency spikes.

Triggers:
  1. A traffic burst is detected (arrival-rate surge)
  2. Functions with a high cold-start cost are identified (e.g., Critical)
  3. The container pool has free capacity
  4. Upcoming requests are predicted
Key Advantages:
  • 19% lower tail latency under bursts
  • Protects Critical functions from cold starts
  • Minimal resource overhead (~2%)
  • Adapts to workload patterns
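The trigger conditions above could be checked along the following lines. This is a sketch under stated assumptions, not the page's prewarming logic: `BurstDetector`, `functions_to_prewarm`, the sliding-window rate test, and every threshold are hypothetical, and the burst signal stands in for the request prediction of trigger 4.

```python
from collections import deque


class BurstDetector:
    """Flags a burst when the recent arrival rate exceeds a multiple of a baseline."""

    def __init__(self, window_s=1.0, baseline_rate=5.0, factor=2.0):
        self.window_s = window_s            # sliding-window length (assumed)
        self.baseline_rate = baseline_rate  # expected requests/second (assumed)
        self.factor = factor                # surge multiplier (assumed)
        self.arrivals = deque()

    def record(self, now):
        """Register an arrival and forget arrivals older than the window."""
        self.arrivals.append(now)
        while self.arrivals and now - self.arrivals[0] > self.window_s:
            self.arrivals.popleft()

    def is_burst(self):
        rate = len(self.arrivals) / self.window_s
        return rate > self.factor * self.baseline_rate


def functions_to_prewarm(burst, cold_cost_ms, warm, free_slots,
                         cost_threshold_ms=1000):
    """Pick functions to prewarm, costliest first, limited by free pool slots."""
    if not burst or free_slots <= 0:
        return []
    candidates = [f for f, cost in cold_cost_ms.items()
                  if cost >= cost_threshold_ms and f not in warm]
    candidates.sort(key=lambda f: cold_cost_ms[f], reverse=True)
    return candidates[:free_slots]


detector = BurstDetector(window_s=1.0, baseline_rate=5.0, factor=2.0)
for i in range(12):            # 12 arrivals within 0.55 s: above 2 x 5 req/s
    detector.record(i * 0.05)

plan = functions_to_prewarm(
    burst=detector.is_burst(),
    cold_cost_ms={"pay": 2000, "chat": 1500, "report": 400},
    warm={"chat"},             # already warm, so it is skipped
    free_slots=2,
)
print(plan)  # ['pay']
```

Limiting prewarming to free slots and to functions above a cost threshold is one way to keep the overhead small, in the spirit of the ~2% figure quoted above.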

Key Scheduling Observations

From the interactive visualization above, you can observe:

  • Priority queue ordering: Critical (P0) tasks stay at the front of the queue and get warm containers first
  • Eviction policy: when the pool is full, warm containers of low-priority tasks may be evicted to make room for high-priority tasks
  • Cold-start distribution: CAS++ brings the cold-start rate of Critical tasks down to 12.3%, while Batch tasks absorb more cold starts (they tolerate latency better)
  • Prewarming timing: during traffic bursts the prewarming indicator lights up, showing that warm containers are being created proactively
  • Fairness guarantee: the aging mechanism ensures low-priority tasks are not delayed indefinitely; the longer a task waits, the higher its priority

Section 3: Experimental Results & Performance Analysis

Real experimental data from our poster paper, showing CAS++ performance across different parameter configurations

Experimental Setup & Main Findings

Experimental Setup

We ran large-scale experiments on our custom simulator, comparing five schedulers. The data comes from our poster paper:

  • FIFO-NoEvict: First-in-first-out, no container eviction (baseline)
  • FQFS: Fair Queuing for Serverless
  • GlobalLRU: Global LRU eviction policy
  • BizPriority: Scheduler that considers business priority only
  • CAS++: Our proposed composite priority scheduler

Workload: Mixed-priority tasks (Critical/Normal/Batch), simulating a real production environment
Metrics: Average latency, cold-start rate, priority score, p95/p99 tail latency
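For readers who want to reproduce the latency metrics on their own traces, the sketch below computes average latency, cold-start rate, and p95/p99 from per-task records. The nearest-rank percentile definition and the record layout are assumptions (the page does not say which percentile method the simulator uses), and the priority score is left out because its exact definition is not given here.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of samples <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(records):
    """records: list of (latency_ms, was_cold) tuples, one per finished task."""
    latencies = [latency for latency, _ in records]
    return {
        "avg_latency_ms": sum(latencies) / len(latencies),
        "cold_rate": sum(1 for _, cold in records if cold) / len(records),
        "p95_ms": percentile(latencies, 95),
        "p99_ms": percentile(latencies, 99),
    }


# Toy trace: 18 warm starts at 100 ms and 2 cold starts at 2000 ms.
records = [(100, False)] * 18 + [(2000, True)] * 2
summary = summarize(records)
print(summary["avg_latency_ms"], summary["cold_rate"], summary["p95_ms"])  # 290.0 0.1 2000
```

Even with only 10% cold starts, the toy trace has a p95 equal to the cold-start latency, which illustrates why tail latency is reported alongside the average.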

Priority Score vs TTL

Observation: CAS++ maintains the highest priority score across all TTL values, showing robustness to the TTL configuration. A longer TTL helps reduce cold starts, but CAS++ performs well even with shorter TTLs.

Avg Latency vs Capacity

Observation: Under high load (capacity ≥ 15), CAS++ latency is significantly lower than that of the other schedulers. FQFS performs well at low capacity but degrades as capacity grows. CAS++ keeps latency consistently low across capacities.

Priority Score vs Capacity

Observation: CAS++ achieves the highest priority score at every capacity level, successfully balancing business priority and cold-start cost. The priority score of every scheduler improves as capacity grows, but CAS++ stays ahead.

Cold Start Rate vs TTL

Observation: A longer TTL lowers the cold-start rate for all schedulers. CAS++ keeps its cold-start rate competitive through intelligent prewarming while giving Critical tasks the strongest protection (P0 cold-start rate of only 12.3%).

Chart Analysis Summary

  • TTL effect: CAS++ is insensitive to the TTL setting and performs best across the 0.6–2.0 s range
  • Capacity scalability: the advantage of CAS++ grows with container capacity, especially under high load
  • Cold-start trade-off: the overall cold-start rate is high, but the priority mechanism ensures Critical tasks get the strongest protection
  • Overall performance: CAS++ is best or near-best on latency, priority score, and tail latency

Main Experimental Results Comparison

Scheduler      Avg Latency   Cold Rate   Priority Score   p95         p99
FIFO-NoEvict   100.40 ms     70.5%       0.5084           196.17 ms   203.50 ms
FQFS           100.90 ms     75.7%       0.5021           193.34 ms   200.43 ms
GlobalLRU      101.57 ms     89.5%       0.4925           190.01 ms   198.05 ms
BizPriority     69.49 ms     82.1%       0.6812           198.87 ms   209.67 ms
CAS++           52.83 ms     88.9%       0.7180           159.17 ms   170.65 ms
Key Findings
  • Lowest average latency: CAS++ at 52.83 ms, 47% lower than FIFO-NoEvict and 24% lower than BizPriority
  • Highest priority score: 0.7180, 5.4% higher than BizPriority and 41% higher than FIFO-NoEvict
  • Lowest tail latency: p95 is 19% lower and p99 16% lower than FIFO-NoEvict, significantly improving user experience
  • Cold-start trade-off: the overall cold-start rate is 88.9%, but only 12.3% for Critical tasks (45-85% for the other schedulers)
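The percentages in the key findings can be rechecked from the results table with a few lines of Python. The numbers below are copied from the table; the helper `pct_change` is just illustrative, and the tail-latency figures are computed against FIFO-NoEvict, the baseline that reproduces the quoted 19% and 16%.

```python
results = {
    "FIFO-NoEvict": {"avg": 100.40, "score": 0.5084, "p95": 196.17, "p99": 203.50},
    "BizPriority":  {"avg": 69.49,  "score": 0.6812, "p95": 198.87, "p99": 209.67},
    "CAS++":        {"avg": 52.83,  "score": 0.7180, "p95": 159.17, "p99": 170.65},
}


def pct_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100


cas = results["CAS++"]
fifo = results["FIFO-NoEvict"]
biz = results["BizPriority"]

changes = {
    "avg latency vs FIFO-NoEvict": pct_change(cas["avg"], fifo["avg"]),
    "avg latency vs BizPriority": pct_change(cas["avg"], biz["avg"]),
    "priority score vs BizPriority": pct_change(cas["score"], biz["score"]),
    "priority score vs FIFO-NoEvict": pct_change(cas["score"], fifo["score"]),
    "p95 vs FIFO-NoEvict": pct_change(cas["p95"], fifo["p95"]),
    "p99 vs FIFO-NoEvict": pct_change(cas["p99"], fifo["p99"]),
}
for label, value in changes.items():
    print(f"{label}: {value:+.1f}%")
# avg latency: -47.4% and -24.0%; priority score: +5.4% and +41.2%;
# p95: -18.9%; p99: -16.1%
```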

Hyperparameter Sensitivity Analysis (α / β)

We tested different combinations of α (business priority weight) and β (cold-start sensitivity weight) to verify the robustness of CAS++:

α     β     Avg Latency   Cold Rate   Priority Score
0.4   0.2   51.96 ms      83.4%       0.710
0.4   0.4   49.55 ms      80.3%       0.726
0.4   0.6   52.95 ms      89.2%       0.712
0.6   0.2   50.20 ms      82.4%       0.722
0.8   0.4   52.96 ms      88.6%       0.720
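One simple way to read the sensitivity table is to look at how far each metric moves across the tested settings. The sketch below does that with the rows above; the spread measure is our own illustrative choice, not an analysis taken from the poster.

```python
rows = [
    # (alpha, beta, avg_latency_ms, cold_rate_pct, priority_score)
    (0.4, 0.2, 51.96, 83.4, 0.710),
    (0.4, 0.4, 49.55, 80.3, 0.726),
    (0.4, 0.6, 52.95, 89.2, 0.712),
    (0.6, 0.2, 50.20, 82.4, 0.722),
    (0.8, 0.4, 52.96, 88.6, 0.720),
]

latencies = [row[2] for row in rows]
scores = [row[4] for row in rows]

latency_spread_ms = max(latencies) - min(latencies)  # about 3.41 ms
score_spread = max(scores) - min(scores)             # about 0.016
best = min(rows, key=lambda row: row[2])             # lowest average latency

print(f"latency spread: {latency_spread_ms:.2f} ms")
print(f"priority score spread: {score_spread:.3f}")
print(f"lowest latency at alpha={best[0]}, beta={best[1]}")
```

Across these five settings the average latency stays within about 3.4 ms and the priority score within about 0.016, and the α = 0.4, β = 0.4 row has the lowest latency, the lowest cold-start rate, and the highest priority score.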