Vector+SQL Retrieval with Selectivity Workloads: Measuring Tail Latency and Quality under Filtered Top-K
DOI: https://doi.org/10.21015/vtse.v14i1.2353
Abstract
We study Vector+SQL hybrid retrieval under structured filter conditions and propose a reproducible Top-K evaluation framework that analyzes the quality and tail latency of different retrieval strategies on a common basis. We construct controllable selectivity workloads over three text domains (arXiv, news, and movies) and compute exact ground truth within the same filtered candidate set to ensure a fair comparison. The experiments evaluate exact prefilter, ANN post-filter, two-stage probing, partition-aligned retrieval, and adaptive routing under the same protocol, and report Recall@20 and p50/p95/p99 latency. The results show that filter selectivity and predicate structure significantly affect the quality-latency trade-off in hybrid retrieval: exact prefilter has the highest accuracy but the largest tail latency; ANN post-filter is faster, but its recall drops significantly under strict filtering; adaptive routing achieves more robust overall performance on the evaluated workloads. These results demonstrate that reporting only average latency does not reflect the true system cost of hybrid retrieval. The work provides a systematic framework for evaluating, designing, and deploying hybrid retrieval strategies under filtering, accounting for correctness, tail latency, and reproducibility.
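To make the evaluation protocol concrete, the following is a minimal sketch of how Recall@K against filtered ground truth and p50/p95/p99 latency might be computed. It is not the authors' implementation. All function names (`exact_prefilter`, `post_filter`, `recall_at_k`, `percentile`, `run_demo`) and parameters such as `fetch` are hypothetical, the data is synthetic, and the post-filter strategy uses an exhaustive unfiltered ranking as a stand-in for a real ANN index.

```python
import math
import random
import time


def dot(a, b):
    """Inner-product similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def exact_prefilter(query, vectors, mask, k):
    """Apply the filter first, then rank every surviving row exactly."""
    candidates = [i for i, keep in enumerate(mask) if keep]
    candidates.sort(key=lambda i: dot(query, vectors[i]), reverse=True)
    return candidates[:k]


def post_filter(query, vectors, mask, k, fetch):
    """Take the top `fetch` rows ignoring the filter, then drop rows that fail it.

    Assumption: an exhaustive ranking stands in for an ANN index here.
    """
    ranked = sorted(range(len(vectors)),
                    key=lambda i: dot(query, vectors[i]), reverse=True)
    return [i for i in ranked[:fetch] if mask[i]][:k]


def recall_at_k(result, truth, k):
    """Fraction of the filtered ground-truth top-k that the result recovers."""
    truth_k = set(truth[:k])
    if not truth_k:
        return 1.0  # nothing passes the filter, so nothing can be missed
    return len(set(result[:k]) & truth_k) / len(truth_k)


def percentile(samples, p):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def run_demo(n=2000, dim=8, selectivity=0.02, k=20, fetch=100,
             queries=30, seed=0):
    """Measure post-filter recall and latency on a synthetic workload."""
    rng = random.Random(seed)
    vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    # Each row passes the filter with probability `selectivity`.
    mask = [rng.random() < selectivity for _ in range(n)]
    recalls, latencies_ms = [], []
    for _ in range(queries):
        query = [rng.gauss(0, 1) for _ in range(dim)]
        truth = exact_prefilter(query, vectors, mask, k)
        start = time.perf_counter()
        result = post_filter(query, vectors, mask, k, fetch)
        latencies_ms.append((time.perf_counter() - start) * 1000)
        recalls.append(recall_at_k(result, truth, k))
    return {
        "mean_recall": sum(recalls) / len(recalls),
        "p50_ms": percentile(latencies_ms, 50),
        "p95_ms": percentile(latencies_ms, 95),
        "p99_ms": percentile(latencies_ms, 99),
    }


if __name__ == "__main__":
    print(run_demo())
```

Lowering `selectivity` while holding `fetch` fixed leaves fewer passing rows among the fetched candidates, which is one way the recall loss of post-filtering under strict filters can arise. Ground truth is computed inside the filtered candidate set, so both strategies are scored against the same reference.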
License
This work is licensed under a Creative Commons Attribution License CC BY