LAPRAS: Learning-Augmented Private Answering for Linear Query Streams
The value of sensitive data often lies in aggregate analytics—counting queries, histograms, and linear summations that reveal population-level trends without exposing any individual. Differential Privacy (DP) is the rigorous standard for such systems, but deploying it in real database workloads exposes a fundamental mismatch: every query answered spends from a finite, global privacy budget, and once that budget is gone, no further queries can be answered accurately.
Yet real query streams are far from adversarial. Modern database workloads are highly predictable: they are dominated by recurring jobs and templates, even when the exact arrival order is unknown in advance. This motivates a learning-augmented view of online DP analytics, and the central question of our work: can an algorithm use predictions about which queries will occur to improve utility under a single global privacy budget, while remaining robust when those predictions are wrong?
The Setting
We study online DP query answering, where a curator must answer a stream of S linear queries arriving in uniformly random order under a privacy budget (ε, δ). We introduce LAPRAS, which assumes access to an oracle that outputs a prediction set of queries likely to appear in the stream, and uses it to guide how privacy is spent.
How LAPRAS Works
LAPRAS answers predicted queries using the offline-optimal Matrix Mechanism, and answers the remaining (unpredicted) queries online from a residual budget. The key technical challenge is pacing: we do not know in advance how many unpredicted queries will arrive, so naively over- or under-spending the residual budget destroys utility.
Our solution is Smooth Allocation. It forms an unbiased stopping-time estimate of the total unpredicted load from just the first T = Θ(log² S) unpredicted queries, and then continuously recalibrates per-query expenditure as the stream unfolds. This yields graceful, adaptive spending across an unknown horizon.
The Consistency–Robustness Trade-off
Across two real datasets, LAPRAS realizes the trade-off we want from a learning-augmented algorithm:
- Consistency: when the predictions overlap heavily with the true stream, LAPRAS achieves near-offline utility—close to what an algorithm that knew all the queries in advance could do.
- Robustness: when overlap is low (bad predictions), LAPRAS degrades gracefully to baseline-level performance rather than failing.
In short: good predictions are rewarded, and bad predictions are never catastrophic—all under a single, fixed global privacy budget.
Pranay Mundra, Adam Sealfon, Ziteng Sun, and Quanquan C. Liu. ICML 2026.
← Back to Blog Index