fbpx

Research Papers Library

Optimal Aggregation Policy for Reducing Tail Latency of Web Search

A web search engine often employs partition-aggregate architecture, where an aggregator propagates a user query to all index serving nodes (ISNs) and collects the responses from them. An aggregation policy determines how long the aggregators wait for the ISNs before returning aggregated results to users, crucially affecting both query latency and quality. Designing an aggregation policy is, however, challenging: Response latency among queries and among ISNs varies significantly, and aggregators lack of knowledge about when ISNs will respond. In this paper, we propose aggregation policies that minimize tail latency of search queries subject to search quality service level agreements (SLAs), combining data-driven offline analysis with online processing. Beginning with a single aggregator, we formally prove the optimality of our policy: It achieves the offline optimal result without knowing future responses of ISNs. We extend our policy for commonly-used hierarchical levels of aggregators and prove its optimality when messaging times between aggregators are known. We also present an empiricallyeffective policy to address unknown messaging time. We use production traces from a commercial search engine, a commercial advertisement engine, and synthetic workloads to evaluate the aggregation policy. The results show that compared to prior work, the policy reduces tail latency by up to 40% while satisfying same quality SLAs.

Download PDF

airs logo

Association of Internet Research Specialists is the world's leading community for the Internet Research Specialist and provide a Unified Platform that delivers, Education, Training and Certification for Online Research.

Get Exclusive Research Tips in Your Inbox

Receive Great tips via email, enter your email to Subscribe.

Follow Us on Social Media

Finance your Training & Certification with us - Find out how?      Learn more