Vector+SQL Retrieval with Selectivity Workloads: Measuring Tail Latency and Quality under Filtered Top-K

Authors

DOI:

https://doi.org/10.21015/vtse.v14i1.2353

Abstract

We study Vector+SQL hybrid retrieval under structured filtering conditions and propose a reproducible Top-K evaluation framework that analyzes the quality and tail latency of different retrieval strategies on a common basis. We construct workloads with controllable selectivity on three text domains (arXiv, news, and movies) and compute exact ground truth within the same filtered candidate set to ensure a fair comparison. The experiments evaluate exact prefilter, ANN post-filter, two-stage probing, partition-aligned retrieval, and adaptive routing under the same protocol, and report Recall@20 and p50/p95/p99 latency. The results show that filter selectivity and predicate structure significantly affect the quality-latency trade-off in hybrid retrieval: exact prefilter has the highest accuracy but the largest tail latency; ANN post-filter is faster, but its recall drops significantly under strict filtering; adaptive routing achieves more robust overall performance on the evaluated workloads. These results show that reporting only average latency does not reflect the true system cost of hybrid retrieval. Our report provides a systematic framework for evaluating, designing, and deploying hybrid retrieval strategies under filtering conditions, accounting for correctness, tail latency, and reproducibility.
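
To make the evaluated quantities concrete, the following is a minimal Python sketch of filtered Top-K evaluation, not the authors' implementation. All function names are hypothetical. It assumes squared Euclidean distance, stands in for the ANN index with an exact unfiltered scan truncated to `fetch` results, normalizes recall by the number of ground-truth hits available (at most K), and uses the nearest-rank definition of a percentile. The abstract does not specify any of these choices.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_prefilter_topk(query, vectors, mask, k):
    """Apply the filter first, then rank only the surviving rows exactly.

    Its output doubles as the ground truth within the filtered candidate set.
    """
    candidates = [i for i, keep in enumerate(mask) if keep]
    candidates.sort(key=lambda i: sq_dist(query, vectors[i]))
    return candidates[:k]


def post_filter_topk(query, vectors, mask, k, fetch):
    """Rank all rows, keep the nearest `fetch`, then drop rows failing the filter.

    Assumption: an exact scan replaces the ANN index, so any recall loss
    shown here comes from filtering after truncation, not from approximation.
    """
    order = sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))
    return [i for i in order[:fetch] if mask[i]][:k]


def recall_at_k(result, truth, k):
    """Fraction of the ground-truth top-k that the result recovered."""
    expected = set(truth[:k])
    if not expected:
        return 1.0  # nothing passes the filter, so nothing can be missed
    return len(expected & set(result[:k])) / len(expected)


def percentile(samples, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


# Toy workload: 1000 one-dimensional vectors, a filter with 10% selectivity.
vectors = [[float(i)] for i in range(1000)]
mask = [i % 10 == 0 for i in range(1000)]
query = [0.0]
k = 20

truth = exact_prefilter_topk(query, vectors, mask, k)
narrow = post_filter_topk(query, vectors, mask, k, fetch=100)
wide = post_filter_topk(query, vectors, mask, k, fetch=200)

recall_narrow = recall_at_k(narrow, truth, k)
recall_wide = recall_at_k(wide, truth, k)
print(recall_narrow)  # 0.5
print(recall_wide)    # 1.0

# Synthetic latencies in milliseconds, used only to exercise the percentile helper.
latencies_ms = list(range(1, 101))
p50, p95, p99 = (percentile(latencies_ms, q) for q in (50, 95, 99))
print(p50, p95, p99)  # 50 95 99
```

In this toy setting, only 10 of the nearest 100 unfiltered rows pass the filter, so a post-filter with `fetch=100` recovers half of the filtered top 20. Doubling `fetch` restores full recall at the cost of scanning more candidates, which is the kind of quality-latency trade-off the abstract describes.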

Published

2026-03-31

How to Cite

Bibi, S., Rajput, F. A., Younis, M., Bibi, S., & Raza, B. (2026). Vector+SQL Retrieval with Selectivity Workloads: Measuring Tail Latency and Quality under Filtered Top-K. VFAST Transactions on Software Engineering, 14(1), 335–349. https://doi.org/10.21015/vtse.v14i1.2353