Flock integration

Flock is Mochi’s group management service for distributed applications. When building distributed services with Bedrock, Flock enables:

  • Service discovery across multiple processes

  • Dynamic membership management

  • Group-aware service deployment

  • Fault tolerance and elasticity

This tutorial shows how to integrate Flock with Bedrock to build robust distributed services.

Configuring Flock in Bedrock

To use Flock in a Bedrock configuration, you need to:

  1. Load the Flock module

  2. Create a Flock provider

  3. Configure the bootstrap method

Basic Flock configuration

{
    "libraries": [
        "libflock-bedrock-module.so"
    ],
    "providers": [
        {
            "name": "my_group",
            "type": "flock",
            "provider_id": 1,
            "dependencies": {
                "pool": "__primary__"
            },
            "config": {
                "bootstrap": "self",
                "group": {
                    "type": "static",
                    "config": {}
                },
                "file": "my_group.flock"
            }
        }
    ]
}

This configuration creates a Flock provider that:

  • Uses the “self” bootstrap method (single-member group initially)

  • Uses the “static” backend (fixed membership)

  • Persists the group view to a file
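
The same provider entry can also be assembled programmatically, which may help when several configurations differ in only a few fields. The following is a minimal sketch using only Python's standard library; the helper name `make_flock_provider` and its parameters are hypothetical and are not part of Bedrock or Flock.

```python
import json


def make_flock_provider(name, provider_id, bootstrap="self",
                        group_type="static", group_file=None):
    """Build a Flock provider entry shaped like the JSON above.

    Hypothetical helper for illustration; not a Bedrock or Flock API.
    """
    config = {
        "bootstrap": bootstrap,
        "group": {"type": group_type, "config": {}},
    }
    if group_file is not None:
        config["file"] = group_file
    return {
        "name": name,
        "type": "flock",
        "provider_id": provider_id,
        "dependencies": {"pool": "__primary__"},
        "config": config,
    }


bedrock_config = {
    "libraries": ["libflock-bedrock-module.so"],
    "providers": [
        make_flock_provider("my_group", 1, group_file="my_group.flock")
    ],
}

# Serialize to the JSON text that would be saved as the configuration file.
config_text = json.dumps(bedrock_config, indent=4)
print(config_text)
```

The printed text can be written to a file and passed to bedrock with the -c option.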

Bootstrap methods

Flock supports multiple bootstrap methods for initializing groups.

Self bootstrap

Creates a single-member group with just the current process:

{
    "type": "flock",
    "config": {
        "bootstrap": "self",
        "group": {"type": "static", "config": {}}
    }
}

MPI bootstrap

Creates a group from all MPI processes:

{
    "type": "flock",
    "config": {
        "bootstrap": "mpi",
        "group": {"type": "static", "config": {}}
    }
}

Then launch with MPI:

$ mpirun -n 4 bedrock na+sm -c config.json

An extra "mpi_ranks": [ ... ] field may be used in the configuration to specify which MPI ranks are part of the group. By default, all the ranks of MPI_COMM_WORLD will be part of the group.
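
As an illustration, the sketch below builds an MPI-bootstrapped provider entry restricted to a subset of ranks. It assumes that "mpi_ranks" sits next to "bootstrap" inside the provider's "config" object, which the text above does not state explicitly, and the rank values are arbitrary.

```python
import json

# Assumption: "mpi_ranks" is placed alongside "bootstrap" in the provider's
# "config" object; check the Flock documentation for the exact location.
flock_provider = {
    "type": "flock",
    "config": {
        "bootstrap": "mpi",
        "mpi_ranks": [0, 1],  # arbitrary example: ranks 0 and 1 only
        "group": {"type": "static", "config": {}},
    },
}

print(json.dumps(flock_provider, indent=4))
```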

Join bootstrap

Lets a process join an existing group, for use with dynamic groups:

{
    "type": "flock",
    "config": {
        "bootstrap": "join",
        "file": "/path/to/group_file",
        "group": {"type": "centralized", "config": {}}
    }
}

A Bedrock server using the “join” method expects the group file to exist and joins the group that the file identifies.

Note

The “join” method cannot be used with a “static” group.
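
This rule can be checked before launching any server. The sketch below is a hypothetical pre-flight check, not part of Flock or Bedrock; it only encodes the rule stated in the note and is deliberately strict, rejecting "join" even when it appears inside a list of methods.

```python
def check_bootstrap(flock_config):
    """Raise ValueError if "join" is combined with a "static" group.

    Hypothetical helper; it takes the "config" object of a Flock provider
    and returns the bootstrap methods as a list.
    """
    bootstrap = flock_config.get("bootstrap", [])
    # "bootstrap" may be a single string or a list of strings.
    methods = [bootstrap] if isinstance(bootstrap, str) else list(bootstrap)
    group_type = flock_config.get("group", {}).get("type")
    if "join" in methods and group_type == "static":
        raise ValueError('"join" bootstrap cannot be used with a "static" group')
    return methods


ok = {"bootstrap": "join", "file": "/path/to/group_file",
      "group": {"type": "centralized", "config": {}}}
bad = {"bootstrap": "join", "file": "/path/to/group_file",
       "group": {"type": "static", "config": {}}}

check_bootstrap(ok)  # accepted: "join" with a "centralized" group
try:
    check_bootstrap(bad)
except ValueError as err:
    print(f"rejected: {err}")
```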

Using multiple methods

The "bootstrap" field also accepts a list of strings, in which case the methods are attempted in order until one succeeds.
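
The try-in-order behaviour can be pictured with a small model. This is a sketch under stated assumptions rather than Flock's implementation: `bootstrap_group`, `join_existing`, and `bootstrap_self` are hypothetical stand-ins, and a method is considered to have failed when its callable raises an exception.

```python
def bootstrap_group(methods, handlers):
    """Try each bootstrap method in order and return the first that works.

    `methods` is a string or a list of strings, as accepted by "bootstrap".
    `handlers` maps a method name to a callable (hypothetical stand-ins).
    """
    if isinstance(methods, str):
        methods = [methods]
    errors = []
    for method in methods:
        try:
            return method, handlers[method]()
        except Exception as err:  # a failed attempt moves on to the next one
            errors.append(f"{method}: {err}")
    raise RuntimeError("all bootstrap methods failed: " + "; ".join(errors))


def join_existing():
    # Stand-in for a join attempt when no group file exists yet.
    raise FileNotFoundError("group file not found")


def bootstrap_self():
    # Stand-in for "self": a group containing only this process.
    return ["this-process"]


used, members = bootstrap_group(
    ["join", "self"],
    {"join": join_existing, "self": bootstrap_self},
)
print(used, members)  # prints: self ['this-process']
```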

Multi-process deployment

Here’s a complete example for deploying a distributed key-value store using Flock and Yokan across multiple processes.

Configuration file (distributed-config.json, same for all processes)

{
    "margo": {
        "argobots": {
            "pools": [
                {
                    "name": "rpc_pool",
                    "kind": "fifo_wait",
                    "access": "mpmc"
                }
            ]
        },
        "rpc_pool": "rpc_pool"
    },
    "libraries": [
        "libflock-bedrock-module.so",
        "libyokan-bedrock-module.so"
    ],
    "providers": [
        {
            "name": "service_group",
            "type": "flock",
            "provider_id": 1,
            "dependencies": {
                "pool": "__primary__"
            },
            "config": {
                "bootstrap": ["join", "mpi"],
                "group": {
                    "type": "centralized",
                    "config": {}
                },
                "file": "distributed_service.flock"
            }
        },
        {
            "name": "kv_store",
            "type": "yokan",
            "provider_id": 42,
            "dependencies": {
                "pool": "__primary__"
            },
            "config": {
                "database": {
                    "type": "map"
                }
            }
        }
    ]
}

Launching the service

# Launch with MPI
$ mpirun -n 4 bedrock na+sm -c distributed-config.json

# Or launch manually with file-based bootstrap
# Process 1 (creates the group file)
$ bedrock na+sm -c distributed-config.json

# Processes 2-4 (join via the group file)
$ bedrock na+sm -c distributed-config.json

If process 1 is launched before the others, the ["join", "mpi"] bootstrap list makes it try “join” first, fail because the group file does not exist yet, and fall back to “mpi”, which creates a group with process 1 as its only member. Subsequent processes find the group file and join successfully with the “join” method.
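
The effect of launch order can be traced with a toy model. The sketch below is not Flock's implementation: the group file is modelled as a plain text file with one member name per line, and `start_process` is a hypothetical function standing in for a server start with "bootstrap": ["join", "mpi"].

```python
import os
import tempfile


def start_process(name, group_file):
    """Model one server start and return the bootstrap method that worked."""
    if os.path.exists(group_file):
        # "join" succeeds: add this process to the existing group.
        with open(group_file, "a") as f:
            f.write(name + "\n")
        return "join"
    # "join" failed because there is no group file; fall back to "mpi",
    # which here creates a group containing only this process.
    with open(group_file, "w") as f:
        f.write(name + "\n")
    return "mpi"


with tempfile.TemporaryDirectory() as tmp:
    group_file = os.path.join(tmp, "distributed_service.flock")
    # Start four processes one after the other, process 1 first.
    methods = [start_process(f"process-{i}", group_file) for i in range(1, 5)]
    with open(group_file) as f:
        members = f.read().split()

print(methods)  # prints: ['mpi', 'join', 'join', 'join']
print(members)  # prints: ['process-1', 'process-2', 'process-3', 'process-4']
```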

Service discovery example

Once the service is running, clients can discover all members:

/*
 * (C) 2024 The University of Chicago
 *
 * See COPYRIGHT in top-level directory.
 */
#include <thallium.hpp>
#include <flock/cxx/client.hpp>
#include <cstdint>   // uint16_t
#include <cstdlib>   // std::atoi
#include <iostream>
#include <string>

namespace tl = thallium;

int main(int argc, char** argv) {
    if(argc != 3) {
        std::cerr << "Usage: " << argv[0]
                  << " <flock_address> <flock_provider_id>" << std::endl;
        return 1;
    }

    std::string flock_addr = argv[1];
    uint16_t flock_provider_id = std::atoi(argv[2]);

    try {
        // Initialize Thallium engine
        tl::engine engine("na+sm", THALLIUM_CLIENT_MODE);

        // Create Flock client
        flock::Client client(engine);

        // Create group handle
        flock::GroupHandle group = client.makeGroupHandle(
            flock_addr, flock_provider_id
        );

        std::cout << "=== Service Discovery via Flock ===" << std::endl;

        // Get group view
        flock::GroupView view = group.view();

        std::cout << "Group size: " << view.members().count() << " members" << std::endl;
        std::cout << "\nService members:" << std::endl;

        // List all members
        for(size_t i = 0; i < view.members().count(); i++) {
            const auto& member = view.members()[i];
            std::cout << "  [" << i << "]" << std::endl;
            std::cout << "      Address: " << member.address << std::endl;
            std::cout << "      Provider ID: " << member.provider_id << std::endl;

            // You can now connect to services at these addresses
            // For example, to connect to a Yokan provider at this address:
            // yokan::Client yokan_client(engine);
            // auto yokan_db = yokan_client.makeDatabaseHandle(
            //     member.address, 42 /* yokan provider id */);
        }

        std::cout << "\n=== Service discovery completed ===" << std::endl;

    } catch(const flock::Exception& ex) {
        std::cerr << "Flock error: " << ex.what() << std::endl;
        return 1;
    } catch(const std::exception& ex) {
        std::cerr << "Error: " << ex.what() << std::endl;
        return 1;
    }

    return 0;
}

Python integration

Using Flock with Bedrock Python bindings:

#!/usr/bin/env python
"""
Example of using Flock with Bedrock Python bindings.
"""
from mochi.bedrock.server import Server
from mochi.bedrock.client import Client

# Configuration with Flock
config = {
    "libraries": [
        "libflock-bedrock-module.so",
        "libyokan-bedrock-module.so"
    ],
    "providers": [
        {
            "name": "my_group",
            "type": "flock",
            "provider_id": 1,
            "config": {
                "bootstrap": "self",
                "group": {
                    "type": "static",
                    "config": {}
                },
                "file": "/tmp/my_group.flock"
            }
        },
        {
            "name": "my_database",
            "type": "yokan",
            "provider_id": 42,
            "config": {
                "database": {"type": "map"}
            }
        }
    ]
}

# Start server with Flock
print("Starting Bedrock server with Flock...")
server = Server("na+sm", config=config)
address = server.margo.engine.address
print(f"Server started at {address}")

# Connect as client
print("\nConnecting to service...")
client = Client("na+sm")
service = client.make_service_handle(address, provider_id=0)

# Query configuration
config_result = service.config
print(f"\nService has {len(config_result['providers'])} providers:")
for provider in config_result['providers']:
    print(f"  - {provider['name']} (type={provider['type']}, id={provider['provider_id']})")

# Use ServiceGroupHandle to interact with Flock group
# (assuming you have the group file path)
group_file = "/tmp/my_group.flock"
try:
    group = client.make_service_group_handle_from_flock(group_file, provider_id=0)
    print(f"\nFlock group size: {group.size}")
except Exception as e:
    print(f"\nNote: Could not create group handle (single member group): {e}")