Key/value and document filters

The yk_list_*, yk_doc_list_*, yk_iter, and yk_doc_iter functions accept a filter argument that restricts the results to the key/value pairs or documents satisfying a given condition. This tutorial shows how to use such filters.

Key prefix and suffix filters

By default, any filter data provided to the yk_list_* functions will be interpreted as a prefix that the key must start with. If the YOKAN_MODE_SUFFIX mode is used, it will be interpreted as a suffix.
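The matching rule described above can be sketched in plain C++. The two helpers below are hypothetical and not part of Yokan; they only illustrate what "prefix" and "suffix" mean for a key. In this sketch an empty filter matches every key, which is an assumption rather than documented Yokan behavior.

```cpp
#include <string>

// Hypothetical helpers (not part of the Yokan API) mirroring the
// matching rules described in the text.

// Default mode: the filter is a prefix that the key must start with.
bool matches_prefix(const std::string& key, const std::string& filter) {
    return key.size() >= filter.size()
        && key.compare(0, filter.size(), filter) == 0;
}

// YOKAN_MODE_SUFFIX: the filter is a suffix that the key must end with.
bool matches_suffix(const std::string& key, const std::string& filter) {
    return key.size() >= filter.size()
        && key.compare(key.size() - filter.size(), filter.size(), filter) == 0;
}
```

For instance, with the filter "user:" the key "user:42" passes in the default mode, while with the filter ".json" the key "data.json" passes in suffix mode.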

Lua key/value filters

When the YOKAN_MODE_LUA_FILTER mode is used, the content of the filter is interpreted as Lua code, which is executed against each key/value pair. Within this code, the __key__ variable is a string set to the current key. If YOKAN_MODE_FILTER_VALUE is also specified in the mode, the __value__ variable is set to the current value. The Lua script must return a boolean indicating whether the pair satisfies the user-provided condition.
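As an illustration, a Lua key/value filter could look like the scripts below, written here as C++ raw string literals so they can be handed to a yk_list_* call from client code. The variable names and the "user:" key convention are assumptions made for this example only.

```cpp
#include <string>

// Intended for YOKAN_MODE_LUA_FILTER alone: only __key__ is available.
// Keeps pairs whose key starts with "user:" (a made-up convention).
const std::string key_only_filter = R"lua(
return string.sub(__key__, 1, 5) == "user:"
)lua";

// Intended for YOKAN_MODE_LUA_FILTER together with YOKAN_MODE_FILTER_VALUE:
// __value__ is available as well. Keeps "user:" pairs with a non-empty value.
const std::string key_value_filter = R"lua(
return string.sub(__key__, 1, 5) == "user:" and #__value__ > 0
)lua";
```

The bytes of either string (data() and size()) would then be supplied as the filter argument. Both scripts end with a return statement that yields a boolean, as required.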

Lua document filters

For yk_doc_list_* functions, Lua scripts will be provided with the __id__ and __doc__ variables. The former is an integer containing the document id. The latter is a string containing the document’s content.

Since document stores are often used to store JSON documents, Yokan integrates the lua-cjson library, making it possible to convert the __doc__ variable into a hierarchy of Lua tables:

data = cjson.decode(__doc__)
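A complete document filter built on this could look like the following sketch, again written as a C++ raw string literal holding the Lua script. The "active" field is hypothetical and stands for whatever field the stored JSON documents actually contain.

```cpp
#include <string>

// Lua document filter: keeps documents that have an even id and whose
// JSON content has a field "active" set to true.
// The "active" field is an assumption made for this example.
const std::string doc_filter = R"lua(
local data = cjson.decode(__doc__)
return __id__ % 2 == 0 and data.active == true
)lua";
```

Note that cjson.decode raises a Lua error on input that is not valid JSON, so a filter of this kind assumes every document in the collection is JSON.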

Dynamic library filters

The YOKAN_MODE_LIB_FILTER mode can be used to provide a filter implemented in a shared library. The code below shows a custom key/value filter and a custom document filter.

filters.cpp

#include <yokan/filters.hpp>
#include <algorithm>
#include <cstring>
#include <string>

struct CustomKeyValueFilter : public yokan::KeyValueFilter {

    std::string m_to_append;

    CustomKeyValueFilter(margo_instance_id mid,
                         int32_t mode,
                         const yokan::UserMem& data)
    : m_to_append(data.data, data.size) {
        (void)mid;
        (void)mode;
    }

    bool requiresValue() const override {
        return true;
    }

    bool check(const void* key, size_t ksize,
               const void* val, size_t vsize) const override {
        // This custom filter will check if the key size and
        // value size have the same parity
        (void)key;
        (void)val;
        return (vsize % 2) == (ksize % 2);
    }

    size_t keySizeFrom(const void* key, size_t ksize) const override {
        (void)key;
        return ksize;
    }

    size_t valSizeFrom(
        const void* val, size_t vsize) const override {
        (void)val;
        return vsize + m_to_append.size();
    }

    size_t keyCopy(
        void* dst, size_t max_dst_size,
        const void* key, size_t ksize) const override {
        // This custom copy function will reverse the key
        if(max_dst_size < ksize) return YOKAN_SIZE_TOO_SMALL;
        for(size_t i=0; i < ksize; i++) {
            ((char*)dst)[i] = ((const char*)key)[ksize-i-1];
        }
        return ksize;
    }

    size_t valCopy(
        void* dst, size_t max_dst_size,
        const void* val, size_t vsize) const override {
        // This custom copy function will append
        // the user-provided filter argument to the value
        if(max_dst_size < vsize + m_to_append.size()) return YOKAN_SIZE_TOO_SMALL;
        std::memcpy(dst, val, vsize);
        std::memcpy((char*)dst+vsize, m_to_append.data(), m_to_append.size());
        return vsize + m_to_append.size();
    }
};
YOKAN_REGISTER_KV_FILTER(custom_kv, CustomKeyValueFilter);

struct CustomDocFilter : public yokan::DocFilter {

    CustomDocFilter(margo_instance_id mid, int32_t mode,
                    const yokan::UserMem& data) {
        (void)mid;
        (void)mode;
        (void)data;
    }

    bool check(const char* collection, yk_id_t id, const void* doc, size_t docsize) const override {
        // This custom filter only lets through documents with an even id
        (void)collection;
        (void)doc;
        (void)docsize;
        return id % 2 == 0;
    }

    size_t docSizeFrom(const char* collection, const void* val, size_t vsize) const override {
        (void)collection;
        (void)val;
        return vsize;
    }

    size_t docCopy(
        const char* collection,
        void* dst, size_t max_dst_size,
        const void* val, size_t vsize) const override {
        // This copy function truncates the document if the
        // destination buffer is too small, and returns the
        // number of bytes actually copied
        (void)collection;
        vsize = std::min(vsize, max_dst_size);
        std::memcpy(dst, val, vsize);
        return vsize;
    }
};
YOKAN_REGISTER_DOC_FILTER(custom_doc, CustomDocFilter);

A custom key/value filter must inherit from yokan::KeyValueFilter and provide the following member functions.

  • A constructor accepting a margo instance id, a mode, and filter data.

  • requiresValue should return whether the filter needs the value.

  • check runs the filter against a key/value pair.

  • keySizeFrom computes the new key size after the filter is applied, or an upper bound of the key size. This is used for buffer allocation.

  • valSizeFrom computes the new value size after the filter is applied, or an upper bound of the value size. This is used for buffer allocation.

  • keyCopy and valCopy are used to copy the keys and values into a destination buffer. These functions can be used to extract or modify keys and values on the fly when they are read back by the user. They should return the actual size copied.

  • shouldStop (optional) can be implemented to optimize iterations by signaling when no more keys will match the filter (e.g., in a prefix filter).
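The idea behind shouldStop can be sketched without Yokan. The struct below is hypothetical: it does not derive from yokan::KeyValueFilter, and its method signatures are assumptions chosen for brevity. It also assumes keys are visited in ascending byte order, which depends on the backend.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-alone prefix filter (not the Yokan interface).
struct PrefixSketch {
    std::string prefix;

    // True if the key starts with the prefix.
    bool check(const std::string& key) const {
        return key.compare(0, prefix.size(), prefix) == 0;
    }

    // With keys visited in ascending order, a key whose leading bytes
    // compare greater than the prefix means no later key can match.
    bool shouldStop(const std::string& key) const {
        return key.compare(0, prefix.size(), prefix) > 0;
    }
};

// Lists matching keys from a sorted container, counting how many keys
// were examined before the iteration ended.
std::vector<std::string> list_keys(
        const std::map<std::string, std::string>& db,
        const PrefixSketch& filter,
        std::size_t* visited) {
    std::vector<std::string> out;
    *visited = 0;
    for (const auto& kv : db) {
        if (filter.shouldStop(kv.first)) break;
        ++*visited;
        if (filter.check(kv.first)) out.push_back(kv.first);
    }
    return out;
}
```

With the keys "apple", "user:1", "user:2", "zebra", and "zoo" and the prefix "user:", the loop examines only the first three keys and stops on reaching "zebra", instead of scanning the rest of the container.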

The filter should be registered using the YOKAN_REGISTER_KV_FILTER macro, which takes the name of the filter and the name of the class.

Once such a filter is provided, say in a library libmy_custom_filter.so, yk_list_* functions can be called with the following string as filter: "libmy_custom_filter.so:custom_kv:...". Anything after the second colon is interpreted as binary data and passed as the third argument of the filter class's constructor, making it possible to provide arguments to a custom filter.
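To make the layout of this filter string concrete, the sketch below splits it into its three parts. The LibFilterSpec struct and split_lib_filter function are hypothetical and only illustrate the format; they are not how Yokan itself parses the string. Treating a missing second colon as "no arguments" is likewise an assumption of this sketch.

```cpp
#include <string>

// Hypothetical representation of "<library>:<filter name>:<arguments>".
struct LibFilterSpec {
    std::string library;  // e.g. "libmy_custom_filter.so"
    std::string name;     // name given to the registration macro
    std::string args;     // raw bytes passed to the filter constructor
    bool valid;           // false if no colon was found at all
};

LibFilterSpec split_lib_filter(const std::string& filter) {
    LibFilterSpec spec{"", "", "", false};
    const std::string::size_type first = filter.find(':');
    if (first == std::string::npos) return spec;
    spec.library = filter.substr(0, first);
    const std::string::size_type second = filter.find(':', first + 1);
    if (second == std::string::npos) {
        // Assumption: without a second colon, there are no arguments.
        spec.name = filter.substr(first + 1);
    } else {
        spec.name = filter.substr(first + 1, second - first - 1);
        // Everything after the second colon is kept as-is, including
        // any further colons or non-printable bytes.
        spec.args = filter.substr(second + 1);
    }
    spec.valid = true;
    return spec;
}
```

For example, "libmy_custom_filter.so:custom_kv:a:b" yields the library "libmy_custom_filter.so", the filter name "custom_kv", and the argument bytes "a:b".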

The code above also shows a custom document filter. Such a filter must provide the following member functions.

  • A constructor accepting a margo instance id, a mode, and filter data.

  • check runs the filter against the document.

  • docSizeFrom computes the new document size after the filter is applied, or an upper bound of the document size. This is used for buffer allocation.

  • docCopy is used to copy the document into a destination buffer. This function can be used to extract or modify documents on the fly when they are read back by the user. It should return the actual size copied.

  • shouldStop (optional) can be implemented to optimize iterations by signaling when no more documents will match the filter.

Similarly, the YOKAN_REGISTER_DOC_FILTER macro should be used to register the filter.