Exercise 1: simple RPC and RDMA using Thallium
Note
The instructions in these exercises have a checkbox that you can click on to help you keep track of your progress. These checkboxes are not connected to any action, they are just there for you to mark your progress.
The code for this exercise has been cloned in thallium-tutorial-exercises
in your development environment.
In a terminal connected to your docker container, make sure you are in the
appropriate directory.
cd margo-tutorial/margo-tutorial-exercises
The src
directory provides a client.cpp
client code,
a server.cpp
server code, a types.hpp
header defining
some types, and a phonebook.hpp
file containing an
implementation of a phonebook using an std::unordered_map
.
In this exercise we will make the server manage a phonebook and service two kinds of RPCs: adding a new entry, and looking up a phone number associated with a name.
Let’s start by setting up the spack environment and building the existing code:
spack env create thallium-tuto-env spack.yaml
spack env activate thallium-tuto-env
spack install
mkdir build
cd build
cmake ..
make
This will create the client and server programs.
You can test your client and server programs by opening two terminals
(make sure you have run spack env activate thallium-tuto-env
in
them to activate your spack environment) and running the following
from the build
directory.
For the server:
src/server na+sm
This will start the server and print its address. na+sm
(the shared memory transport) may be changed to tcp if you run this
code on multiple machines connected via an Ethernet network.
For the client:
src/client na+sm <server-address>
Copying <server-address>
from the standard output of the
server command. The server is setup to run indefinitely.
You may kill it with Ctrl-C.
Looking at the API in phonebook.hpp
, edit server.cpp
to instanciate a phonebook in main()
(see comment (1)).
Our two RPCs, which we will call “insert” and “lookup”, will need
argument and return types. While you could pass an std::string
and
an uint64_t
directly to your RPC, in this tutorial we will define a
class encapsulating them to showcase custom serialization.
Edit the types.hpp
file to add an entry
class, which will be used as input argument for the insert RPC,
ontaining a std::string name
field and a uint64_t number
field
(see comment (2) in the file). Look at the vector3d
class
as an example and define a serialize
template function in
your own classes following this model.
Note
You may need to include thallium/serialization/stl/string.hpp
so that Thallium knows how to serialize strings.
To summarize:
The insert RPC will take an
entry
as input and respond with anuint32_t
error code (0 representing a successful operation).The lookup RPC will take an
std::string
as input and respond with auint64_t
(we will assume as a phone number of 0 represents a failed lookup).
Edit server.cpp
to add the definitions and declarations of the
lambda functions for our two RPCs. Feel free to copy/paste and modify
the existing sum
RPC (comments (3) and (4)).
Important
Thallium relies on templates and type deduction to know what to serialize
and how when sending RPC arguments and responses. If you
write req.respond(0)
, C++ will infer that you want to send an int
.
If your client expects an uint64_t
as a response, this will cause
serialization issues. It is always recommanded to explicitely define the
variable that will be returned, e.g. uint64_t ret = 0; req.respond(ret)
.
Edit client.cpp
and use the existing code as an example to
register the two RPCs here as well (comment (5)). Make sure that the client
uses the same types as the server for RPC inputs and output. Failing to do so
will cause serialization issues.
Try out your code by calling these insert and lookup functions a few times in the client.
Bonus: using RDMA to transfer larger amounts of data
Do this bonus part only if you have time, or as an exercise later.
In this part, we will add a lookup_multi
RPC that uses RDMA
to send multiple names at once and return the array of associated
phone numbers (in practice this would be too little data to call
for the use of RDMA, but we will just pretent).
For this, you may use the example on Transferring data over RDMA.
We assume that the names to lookup are in a std::vector<std::string>
on the client.
You will need to create two bulk handles (tl::bulk
) on the client
and two on the server. On the client, the first will expose the names as
read-only (remember that engine::expose
can take a vector of
non-contiguous segments, but you will need to use name.size()+1
as the
size of each segment to keep the null terminator of each name), and the second
will expose the output array as write only. The engine::expose
function
can be used to create these bulk handles. It takes an std::vector<std::pair<void*, size_t>>
of segments (represented by their address and size).
The address of the memory of an std::string
str can be obtained
using str.data()
(which should then be cast to void*
).
You will need to transfer the two bulk handles in the RPC arguments, and since names can have a varying size, you will have to also transfer the total size of the bulk handle wrapping them, so that the server knows how much memory to allocate for its local buffer.
On the server side, you will need to allocate two buffers; one to
receive the names (you can use an std::vector<char>
which you
resize to the size required to receive all the names; they will end up
in this contiguous buffer, separated by null characters) via a pull operation,
the other to send the phone numbers via a push (you can use an
std::vector<uint64_t>
for this one).
You will need to create two bulk instances to expose these buffers.
After having transferred the names (remote_names_bulk >> local_names_bulk
),
they will be in the server’s contiguous buffers. You can rely on the null-terminators
to know where one name ends and the next starts, lookup each name in the phonebook,
fill the std::vector<uint64_t>
buffer allocated for the phone numbers,
then transfer the content of this local buffer to the client
(remote_numbers_bulk << local_numbers_bulk
).