Backends and their configurations
So far we have used the map
backend for this tutorial.
Yokan provides a number of backends, listed below, along
with some key information on their configuration.
Note that the full configuration format of a backend can be obtained by creating a database using the backend in a Bedrock context, then using the bedrock-query tool to retrieve the full configuration.
Some configuration parameters expect a symbol name that will
be looked up using dlopen and dlsym. These names can be written
in the form "symbol_name" to look them up in the currently
loaded executable and libraries, or "libsome_library.so:symbol_name"
to look them up in an external library that needs to be loaded.
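To make the two forms concrete, here is a minimal sketch of how such a string could be split into a library part and a symbol part. The split_symbol_spec function is a hypothetical helper written for this illustration, not a Yokan function, and Yokan's own parsing may differ. It assumes the first colon separates the library from the symbol.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper (not part of Yokan): splits a symbol specification
// into {library, symbol}. An empty library means the symbol should be
// looked up in the currently loaded executable and libraries.
std::pair<std::string, std::string> split_symbol_spec(const std::string& spec) {
    assert(!spec.empty() && "a symbol specification must not be empty");
    const auto colon = spec.find(':');
    if (colon == std::string::npos) {
        // Form "symbol_name": no library to load.
        return {"", spec};
    }
    // Form "libsome_library.so:symbol_name".
    return {spec.substr(0, colon), spec.substr(colon + 1)};
}
```

With such a split, the library part (if any) would be the one handed to dlopen, and the symbol part the one handed to dlsym.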
Map backend
Backend type: “map”
Spack variant needed: none
Special requirements: none
The Map backend is an in-memory key/value store implemented
using a C++ std::map. Important configuration parameters
include the following.
use_lock: true by default. This parameter can be set to false if you know that the database will never be accessed by multiple execution streams concurrently (either because only one client accesses it, or because only one execution stream services RPCs for it).
comparator: a function with signature bool (*)(const void*, size_t, const void*, size_t) that provides the “less than” comparison operator for two keys.
allocators: a JSON object containing the key_allocator, value_allocator, and node_allocator parameters, along with the corresponding key_allocator_config, value_allocator_config, and node_allocator_config.
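As an illustration of the comparator parameter, the sketch below defines a function with the signature given above. The name my_shortlex_less and the ordering it implements (shorter keys first, then bytewise comparison) are choices made for this example only. The function is declared extern "C" so that its name is not mangled and can be found by dlsym.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical comparator: orders keys by length first, then bytewise.
// With keys holding decimal numbers as text, "9" then sorts before "10".
extern "C" bool my_shortlex_less(const void* a, std::size_t asize,
                                 const void* b, std::size_t bsize) {
    // A null pointer is only acceptable for an empty key.
    assert((a != nullptr || asize == 0) && (b != nullptr || bsize == 0));
    if (asize != bsize) return asize < bsize;  // shorter key comes first
    if (asize == 0) return false;              // two empty keys are equal
    return std::memcmp(a, b, asize) < 0;       // same length: compare bytes
}
```

Assuming this function is compiled into a library named libmy_comparators.so (a hypothetical name), the comparator field would be set to "libmy_comparators.so:my_shortlex_less". Like any “less than” operator used with std::map, the function must define a strict weak ordering.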
The memory allocators used by the Map backend can be customized
using the allocators configuration. The key and value allocators
are used to allocate memory to store keys and values, respectively.
The node allocator is used to allocate memory for the nodes in the map’s
underlying red-black tree. By default, an allocator that uses new and
delete will be used, but if you know in advance that, for instance,
the keys or the values are going to be of a constant size, or that there
will be no deletion from the database, or that the database won’t be accessed
concurrently by multiple threads, you can provide your own allocators
accordingly.
By default the *_allocator fields are set to “default”.
To implement a custom allocator, write a dynamic library implementing
functions from the yk_allocator structure (found in include/yokan/allocator.h),
along with an initialization function of type yk_allocator_init_fn.
You can then set these fields to, for example, "libmy_allocators.so:my_key_alloc_init".
Note that the second argument of the initialization function is a
const char*. It will be passed a serialized version of the JSON object
stored in the corresponding *_config field (e.g. the key allocator
initialization function will be provided with the content of the “key_allocator_config”
field).
Unordered Map backend
Backend type: “unordered_map”
Spack variant needed: none
Special requirements: none
The Unordered Map backend is similar to the Map backend but relies
on C++’s std::unordered_map. It has the same configuration fields.
Unlike the Map backend, this backend does not support the list_*
functions. This backend aims to provide faster lookup by relying on a hash
table rather than Map’s red-black tree, at the expense of not supporting
key ordering.
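The following sketch shows why key ordering matters for listing. The keys_after function is a hypothetical helper, not Yokan's implementation: it returns up to max keys that come strictly after a given key in a std::map. Whether the starting key itself is included is a choice made here for illustration. The call to upper_bound is what makes this possible, and std::unordered_map has no such member function.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: returns up to `max` keys strictly greater than
// `from`, in sorted order. This relies on std::map keeping its keys sorted.
std::vector<std::string> keys_after(
        const std::map<std::string, std::string>& db,
        const std::string& from, std::size_t max) {
    assert(max > 0);
    std::vector<std::string> out;
    // upper_bound finds the first key greater than `from` in O(log n).
    for (auto it = db.upper_bound(from); it != db.end() && out.size() < max; ++it) {
        out.push_back(it->first);
    }
    return out;
}
```

A hash table has no notion of "the next key after this one", so an equivalent operation would require scanning and sorting all the keys.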
Set backend
Backend type: “set”
Spack variant needed: none
Special requirements: none
The Set backend uses C++’s std::set. It is similar to the Map backend
but does not store values. Trying to put anything other than zero-sized values
will result in an error. Its advantage over the Map backend is that
the Map backend, even when storing only zero-sized values, still
has to store a size_t field (generally 8 bytes) for each value.
This backend’s configuration is similar to that of the Map backend, but without the allocator fields related to values.
Unordered Set backend
Backend type: “unordered_set”
Spack variant needed: none
Special requirements: none
This backend is the Unordered Map equivalent for Sets. It stores only
zero-sized values, and does not provide ordering, hence list_* functions
are not available.
BerkeleyDB backend
Backend type: “berkeleydb”
Spack variant needed: +berkeleydb
Special requirements: none
This backend uses BerkeleyDB to implement a key/value store backed by a local file system. Important configuration fields include the following.
type: may be “btree” or “hash”. The former provides a sorted key/value store, while the latter is unsorted and will not provide the list_* operations.
home: path to the “home” of the BerkeleyDB environment (a “yokan” subdirectory will be added to this path).
file: name of the file storing the database.
name: name of the database.
create_if_missing: create the files if they are not present at the specified location in the file system.
If the “file” and “name” fields are empty or not provided, the database will be stored in memory rather than files.
GDBM backend
Backend type: “gdbm”
Spack variant needed: +gdbm
Special requirements: none
This backend uses GDBM, a widely used database management library for Unix systems. Important configuration fields include the following.
path: path to the database file.
This backend does not provide the list_* functionalities.
LevelDB backend
Backend type: “leveldb”
Spack variant needed: +leveldb
Special requirements: none
This backend uses Google’s LevelDB to provide key/value storage capabilities. Important configuration fields include the following.
path: path to the database file.
error_if_exists: fail to open the database if it already exists.
create_if_missing: create the database if it does not exist.
LMDB backend
Backend type: “lmdb”
Spack variant needed: +lmdb
Special requirements: none
This backend uses LMDB to provide key/value storage capabilities. Important configuration fields include the following.
path: path to the database file.
create_if_missing: create the file if it is missing.
no_lock: disable locking.
LMDB uses its own locks internally, which are not Argobots-aware.
The no_lock option disables this internal locking, but Argobots locks
have not yet been added to compensate. Hence, use
this option only if you know that database accesses will be serialized
(either because only one execution stream accesses it, or because only
one client accesses it, in a serial manner).
RocksDB backend
Backend type: “rocksdb”
Spack variant needed: +rocksdb
Special requirements: none
This backend uses Facebook’s RocksDB for key/value storage. Important configuration fields include the following.
create_if_missing: create the database if it does not exist.
error_if_exists: fail to open the database if it exists.
path: path to the database.
db_paths: an array of JSON objects representing storage targets to use to store the database files. Each such object should have a path field and a target_size field.
TKRZW backend
Backend type: “tkrzw”
Spack variant needed: +tkrzw
Special requirements: compiler allowing C++17
The TKRZW library provides multiple types of backends. Important configuration fields include the following.
type: may be “tree”, “hash”, “tiny”, or “baby”. The first is a typical tree-based key/value store backed by a file. The second is a hash-based key/value store (no ordering, so no list_* operations) backed by a file. The last two are in-memory versions of the first two.
path: path to the database file, if relevant given the type.
Unqlite backend
Backend type: “unqlite”
Spack variant needed: +unqlite
Special requirements: none
Unqlite is a document store targeting JSON documents. However, its key/value storage capabilities are accessible natively. This backend ditches the “document store” aspect and simply relies on the key/value storage capabilities.
This backend is unsorted and does not provide list_* operations.
Important configuration fields include the following.
path: path to the database file.
mode: either “create”, “read_write”, “read_only”, “mmap”, or “memory”.
temporary: will erase the database file upon closing.
use_abt_lock: protect database accesses using Argobots locks.
no_unqlite_mutex: disable the use of mutexes inside Unqlite.
The mutexes used by Unqlite are not Argobots-aware. It may therefore make sense
to disable them and enable use_abt_lock for better performance.
Writing your own backend
Yokan aims to provide an easy way for researchers to implement and try
their own backend. To implement your own key/value
storage backend, look at include/yokan/backend.hpp and implement
a child class of the DatabaseInterface
abstract class.
You can take inspiration from src/backends/map.cpp to understand
the semantics of each member function.
Once your backend is implemented in a .cpp file, use the
YOKAN_REGISTER_BACKEND macro, e.g. YOKAN_REGISTER_BACKEND(mybackend, MyBackend).
Its first argument is the name you want to give to the backend
(i.e. the type that will be used in configuration files).
The second argument is the name of your backend class.
Compile your .cpp file into a dynamic library. Then, when specifying the type of a database, use the syntax “library.so:name”, where “library.so” is your dynamic library, and “name” is the name of your backend.
Note that Yokan will use dlopen
to load the library, so its lookup
rules apply (for instance dlopen
will usually look for libraries
in the LD_LIBRARY_PATH
environment variable).
Important
If you have already implemented your backend’s key/value storage functions
and your backend is sorted (i.e. provides the listKeys and listKeyValues
member functions), you can automatically support document storage
by making your class inherit from DocumentStoreMixin<DatabaseInterface>
instead of DatabaseInterface. This will immediately provide
implementations of the document storage functions on top of your key/value storage
implementation (see yokan/doc-mixin.hpp for details).