0

arXiv:2502.02193v3 Announce Type: replace
Abstract: These days, Key-Value Stores are widely used for scalable data storage. In this environment, Bloom filter (BF) serves as an efficient probabilistic data structure for representing sets of keys. They allow for set membership queries with no false negatives and with the right choice of the main parameters - length of the BF, number of hash functions used to map an element to the array's indices, and the number of elements inserted - the false positive rate is optimized. However, the number of hash functions is constrained to integer values, and the length of a BF is usually chosen to be a power of two to allow for efficient modulo operations using binary arithmetic. In this paper, we relax these constraints by proposing the Rational Bloom filter, which allows for non-integer numbers of hash functions. This results in optimized fraction-of-zero values for a known number of elements to be inserted. Based on this, we construct the Variably-Sized Block BF to allow for a flexible filter length, especially for large filters, with efficient computation.