Creating a group

The sample code below shows how to create an SSG group. The ssg_group_create() function takes a margo instance id, a group name, an array of null-terminated strings holding the addresses of the processes that are members of this group, the number of addresses in this array, a pointer to a configuration structure, a membership update callback, and a pointer to user data for this callback. The last argument is a pointer that receives the resulting opaque group id.

main.c

#include <assert.h>
#include <stdio.h>
#include <ssg.h>

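static void my_membership_update_cb(void* uargs,
        ssg_member_id_t member_id,
        ssg_member_update_type_t update_type)
{
    (void)uargs; // no user data is passed in this example
    // Cast explicitly so the format specifier matches on every platform.
    unsigned long long id = (unsigned long long)member_id;
    switch(update_type) {
    case SSG_MEMBER_JOINED:
        printf("Member %llu joined\n", id);
        break;
    case SSG_MEMBER_LEFT:
        printf("Member %llu left\n", id);
        break;
    case SSG_MEMBER_DIED:
        printf("Member %llu died\n", id);
        break;
    default:
        break;
    }
}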

int main(int argc, char** argv)
{
    margo_instance_id mid = margo_init("tcp", MARGO_SERVER_MODE, 1, 0);
    assert(mid);

    int ret = ssg_init();
    assert(ret == SSG_SUCCESS);

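    // Retrieve this process's own address and convert it to a string.
    hg_addr_t my_addr;
    hg_return_t hret = margo_addr_self(mid, &my_addr);
    assert(hret == HG_SUCCESS);
    char my_addr_str[128];
    size_t my_addr_str_size = sizeof(my_addr_str);
    hret = margo_addr_to_string(mid, my_addr_str, &my_addr_str_size, my_addr);
    assert(hret == HG_SUCCESS);
    margo_addr_free(mid, my_addr);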

    const char* group_addr_strs[] = { my_addr_str };
    ssg_group_config_t config = {
        .swim_period_length_ms = 1000,
        .swim_suspect_timeout_periods = 5,
        .swim_subgroup_member_count = -1,
        .swim_disabled = 0,
        .ssg_credential = -1
    };

    ssg_group_id_t gid;
    ret = ssg_group_create(
            mid, "mygroup", group_addr_strs, 1,
            &config, my_membership_update_cb, NULL,
            &gid);
    assert(ret == SSG_SUCCESS);

    // ...
    // do stuff using the group
    // ...

    ret = ssg_group_leave(gid);
    assert(ret == SSG_SUCCESS);

    ret = ssg_group_destroy(gid);
    assert(ret == SSG_SUCCESS);

    ret = ssg_finalize();
    assert(ret == SSG_SUCCESS);

    margo_finalize(mid);

    return 0;
}

In this example we initialize a group containing only one process. When multiple processes create a group this way, all the members of the group have to provide the same input parameters (group name, array of addresses, and configuration).
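One inexpensive way to make sure every process passes an identical array of addresses is to put the array into a canonical order before creating the group. The sketch below sorts the address strings with qsort(). It is only an illustration: the helper names (demo_cmp_addr, demo_canonicalize_addrs) are hypothetical, how the addresses are exchanged between processes is left out, and whether SSG actually requires the same ordering (rather than just the same set of addresses) is an assumption not stated here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison function for qsort(): each element is a (const char*). */
static int demo_cmp_addr(const void* a, const void* b)
{
    const char* const* sa = (const char* const*)a;
    const char* const* sb = (const char* const*)b;
    return strcmp(*sa, *sb);
}

/* Hypothetical helper: sorts the address strings in place so that every
 * process ends up with the same array, whatever order it collected them in.
 * Assumption: a byte-wise (strcmp) ordering is good enough as a canonical
 * order. */
static void demo_canonicalize_addrs(const char** addrs, size_t count)
{
    assert(addrs != NULL || count == 0);
    if (count > 1)
        qsort(addrs, count, sizeof(*addrs), demo_cmp_addr);
}
```

The sorted array and its length could then be passed to ssg_group_create() in place of group_addr_strs and 1 in the example above.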

Important

Because SSG group members have to send messages to each other, they need to be initialized as Margo servers and have an actively running progress loop. Anything that prevents the progress loop from running will prevent the process from responding in the SWIM protocol, which may lead other processes to mark it as dead.

Group configuration

The group configuration structure ssg_group_config_t includes the following parameters.

* swim_period_length_ms: the number of milliseconds between each invocation of the SWIM protocol.
* swim_suspect_timeout_periods: the number of periods of the SWIM protocol that should pass without a process answering for this process to be marked as suspected.
* swim_subgroup_member_count: when a process A cannot reach a process B directly during the execution of the SWIM protocol, it will ask swim_subgroup_member_count other processes to try reaching B on its behalf before considering B suspected.
* swim_disabled: can be set to 1 to disable the SWIM protocol.
* ssg_credential: some credential information.
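To get a feel for how the two timing parameters combine, the sketch below multiplies them to estimate how long a process can stay silent before it is marked as suspected, taking the parameter descriptions above at face value. The structure demo_swim_timing_t and the function demo_suspect_delay_ms() are hypothetical stand-ins written for this illustration, not part of SSG, and the integer field types are an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the two timing fields of ssg_group_config_t described above.
 * This is NOT the real SSG structure; it only lets the sketch compile
 * without SSG installed. */
typedef struct {
    int32_t swim_period_length_ms;
    int32_t swim_suspect_timeout_periods;
} demo_swim_timing_t;

/* Rough estimate, in milliseconds, of how long a process must remain
 * unresponsive before being marked as suspected:
 *   period length x number of periods.
 * Returns -1 when either value is not positive, since no estimate can be
 * derived from such values. */
static int64_t demo_suspect_delay_ms(const demo_swim_timing_t* t)
{
    assert(t != NULL);
    if (t->swim_period_length_ms <= 0 || t->swim_suspect_timeout_periods <= 0)
        return -1;
    return (int64_t)t->swim_period_length_ms * t->swim_suspect_timeout_periods;
}
```

With the values used in the example (a period of 1000 ms and 5 periods), this estimate comes to 5000 ms.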

Group membership callback

The my_membership_update_cb() function will be called whenever a membership change is detected. This membership change is indicated by the ssg_member_update_type_t argument, and the ssg_member_id_t argument indicates which member joined, left, or died.
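The user data pointer can carry state into the callback. The following sketch keeps simple counters that are updated on each membership change. It is an illustration under stated assumptions: demo_membership_stats_t, demo_counting_update_cb, and the DEMO_USE_REAL_SSG macro are hypothetical names, the stand-in type definitions (including the uint64_t typedef) only exist so the snippet compiles without SSG installed, and no locking is done, so the state is assumed not to be read concurrently while the callback runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef DEMO_USE_REAL_SSG
#include <ssg.h>
#else
/* Stand-ins mirroring the names used in the example above, so that this
 * sketch compiles on its own. Define DEMO_USE_REAL_SSG to use <ssg.h>. */
typedef uint64_t ssg_member_id_t; /* assumption: an unsigned 64-bit id */
typedef enum {
    SSG_MEMBER_JOINED,
    SSG_MEMBER_LEFT,
    SSG_MEMBER_DIED
} ssg_member_update_type_t;
#endif

/* Hypothetical user state handed to the callback through its first argument. */
typedef struct {
    int alive;    /* members currently believed to be in the group */
    int departed; /* members that left or were declared dead */
} demo_membership_stats_t;

static void demo_counting_update_cb(void* uargs,
        ssg_member_id_t member_id,
        ssg_member_update_type_t update_type)
{
    demo_membership_stats_t* stats = (demo_membership_stats_t*)uargs;
    assert(stats != NULL);
    (void)member_id; /* the id is not needed for plain counting */
    switch(update_type) {
    case SSG_MEMBER_JOINED:
        stats->alive += 1;
        break;
    case SSG_MEMBER_LEFT:
    case SSG_MEMBER_DIED:
        stats->alive -= 1;
        stats->departed += 1;
        break;
    default:
        break;
    }
}
```

To use it, a demo_membership_stats_t variable that outlives the group would be declared, and demo_counting_update_cb and the address of that variable would be passed to ssg_group_create() in place of my_membership_update_cb and NULL.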

Leaving and destroying a group

The ssg_group_leave() function is used to notify other members that the caller is leaving the group. The ssg_group_destroy() function is then used to destroy the internal data structures associated with the group.

If a member calls ssg_group_destroy() without having called ssg_group_leave() first, the other members will eventually consider it to have died.

It is possible for a process to call ssg_group_leave() and then continue to use the group id to look up member addresses, as a simple observer of the group. However, SSG does not currently provide a way to rejoin a group once a process has left it.