Schedulers and Work Distribution

In this tutorial, you will learn about Argobots schedulers and how they control the distribution and execution of work units.

Key Concepts

Schedulers

Each execution stream is associated with a scheduler, which is responsible for pulling work units from pools and executing them on the execution stream. Every execution stream has exactly one main scheduler.

The scheduler’s main loop: 1. Check pools for available work units 2. Select a work unit to execute 3. Execute the work unit 4. Repeat until no more work or shutdown signal

Scheduler Types

  • ABT_SCHED_DEFAULT: Default scheduler (usually same as BASIC)

  • ABT_SCHED_BASIC: Simple FIFO scheduler with blocking wait

  • ABT_SCHED_BASIC_WAIT: BASIC with efficient waiting

  • ABT_SCHED_PRIO: Priority scheduler (checks pools in order)

  • ABT_SCHED_RANDWS: Random work-stealing scheduler

Scheduler-Pool Relationship
  • A scheduler can access multiple pools

  • Pools are checked in order (first pool = highest priority)

  • When the first pool is empty, scheduler checks the second pool, etc.

  • This enables work-stealing: schedulers steal from other pools

BASIC Scheduler

The BASIC scheduler is simple and efficient for single-pool scenarios:

 1/*
 2 * BASIC scheduler example: Simple round-robin scheduling
 3 */
 4
 5#include <stdio.h>
 6#include <abt.h>
 7
 8#define NUM_THREADS 8
 9
10void thread_func(void *arg)
11{
12    int id = *(int *)arg;
13    int rank;
14    ABT_xstream_self_rank(&rank);
15    printf("Thread %d on ES %d (BASIC scheduler)\n", id, rank);
16}
17
18int main(int argc, char **argv)
19{
20    ABT_xstream xstream;
21    ABT_pool pool;
22    ABT_sched sched;
23    ABT_thread threads[NUM_THREADS];
24    int thread_ids[NUM_THREADS];
25
26    ABT_init(argc, argv);
27
28    printf("=== BASIC Scheduler Example ===\n\n");
29
30    /* Create a pool */
31    ABT_pool_create_basic(ABT_POOL_FIFO, ABT_POOL_ACCESS_MPMC,
32                          ABT_TRUE, &pool);
33
34    /* Create BASIC scheduler */
35    ABT_sched_create_basic(ABT_SCHED_BASIC, 1, &pool,
36                           ABT_SCHED_CONFIG_NULL, &sched);
37
38    /* Set scheduler on primary xstream */
39    ABT_xstream_self(&xstream);
40    ABT_xstream_set_main_sched(xstream, sched);
41
42    /* Create threads */
43    for (int i = 0; i < NUM_THREADS; i++) {
44        thread_ids[i] = i;
45        ABT_thread_create(pool, thread_func, &thread_ids[i],
46                          ABT_THREAD_ATTR_NULL, &threads[i]);
47    }
48
49    /* Wait for completion */
50    for (int i = 0; i < NUM_THREADS; i++) {
51        ABT_thread_free(&threads[i]);
52    }
53
54    printf("\nBASIC scheduler: simple FIFO execution\n");
55
56    ABT_finalize();
57    return 0;
58}

The BASIC scheduler: - Pulls work units from its pool in FIFO order - Blocks when the pool is empty (efficient waiting) - Simple, low overhead - Best for dedicated pools with steady work arrival

Priority Scheduler

The PRIO scheduler checks pools in priority order:

 1/*
 2 * PRIO scheduler example: Priority-based scheduling
 3 */
 4
 5#include <stdio.h>
 6#include <abt.h>
 7
 8#define NUM_HIGH 4
 9#define NUM_LOW 4
10
11void work_func(void *arg)
12{
13    char *priority = (char *)arg;
14    int rank;
15    ABT_xstream_self_rank(&rank);
16    printf("Thread (%s priority) on ES %d\n", priority, rank);
17}
18
19int main(int argc, char **argv)
20{
21    ABT_xstream xstream;
22    ABT_pool high_pool, low_pool;
23    ABT_sched sched;
24    ABT_thread high_threads[NUM_HIGH], low_threads[NUM_LOW];
25    ABT_pool pools[2];
26
27    ABT_init(argc, argv);
28
29    printf("=== PRIO Scheduler Example ===\n\n");
30
31    /* Create high and low priority pools */
32    ABT_pool_create_basic(ABT_POOL_FIFO, ABT_POOL_ACCESS_MPMC,
33                          ABT_TRUE, &high_pool);
34    ABT_pool_create_basic(ABT_POOL_FIFO, ABT_POOL_ACCESS_MPMC,
35                          ABT_TRUE, &low_pool);
36
37    /* Priority order: high_pool first, then low_pool */
38    pools[0] = high_pool;
39    pools[1] = low_pool;
40
41    /* Create PRIO scheduler with priority pools */
42    ABT_sched_create_basic(ABT_SCHED_PRIO, 2, pools,
43                           ABT_SCHED_CONFIG_NULL, &sched);
44
45    ABT_xstream_self(&xstream);
46    ABT_xstream_set_main_sched(xstream, sched);
47
48    /* Create low priority threads first */
49    for (int i = 0; i < NUM_LOW; i++) {
50        ABT_thread_create(low_pool, work_func, "LOW",
51                          ABT_THREAD_ATTR_NULL, &low_threads[i]);
52    }
53
54    /* Create high priority threads */
55    for (int i = 0; i < NUM_HIGH; i++) {
56        ABT_thread_create(high_pool, work_func, "HIGH",
57                          ABT_THREAD_ATTR_NULL, &high_threads[i]);
58    }
59
60    /* Wait for all threads */
61    for (int i = 0; i < NUM_HIGH; i++) {
62        ABT_thread_free(&high_threads[i]);
63    }
64    for (int i = 0; i < NUM_LOW; i++) {
65        ABT_thread_free(&low_threads[i]);
66    }
67
68    printf("\nPRIO scheduler: High priority pool drained before low priority\n");
69
70    ABT_finalize();
71    return 0;
72}

Key Points

Pool Priority
pools[0] = high_pool;  /* Checked first */
pools[1] = low_pool;   /* Checked second */

The PRIO scheduler always drains higher-priority pools before moving to lower-priority ones. This is useful for ensuring critical work executes first.

Use Cases:
  • RPC handlers (high priority) vs background tasks (low priority)

  • Latency-critical operations vs throughput-oriented operations

  • Interactive tasks vs batch processing

Work-Stealing Scheduler

The RANDWS (random work-stealing) scheduler enables dynamic load balancing:

 1/*
 2 * Work-stealing scheduler (RANDWS) example
 3 */
 4
 5#include <stdio.h>
 6#include <stdlib.h>
 7#include <unistd.h>
 8#include <abt.h>
 9
10#define NUM_XSTREAMS 4
11#define NUM_THREADS 16
12
13void work_func(void *arg)
14{
15    int id = *(int *)arg;
16    int rank;
17    ABT_xstream_self_rank(&rank);
18
19    printf("Thread %2d started on ES %d", id, rank);
20
21    /* Simulate varying workload */
22    if (id % 4 == 0) {
23        printf(" (heavy)");
24        usleep(50000);
25    } else {
26        printf(" (light)");
27        usleep(5000);
28    }
29
30    ABT_xstream_self_rank(&rank);
31    printf(" -> finished on ES %d\n", rank);
32}
33
34int main(int argc, char **argv)
35{
36    ABT_xstream xstreams[NUM_XSTREAMS];
37    ABT_pool pools[NUM_XSTREAMS];
38    ABT_sched scheds[NUM_XSTREAMS];
39    ABT_thread threads[NUM_THREADS];
40    int thread_ids[NUM_THREADS];
41
42    ABT_init(argc, argv);
43
44    printf("=== Work-Stealing (RANDWS) Scheduler Example ===\n\n");
45
46    /* Create pools */
47    for (int i = 0; i < NUM_XSTREAMS; i++) {
48        ABT_pool_create_basic(ABT_POOL_FIFO, ABT_POOL_ACCESS_MPMC,
49                              ABT_TRUE, &pools[i]);
50    }
51
52    /* Create schedulers with work-stealing capability */
53    for (int i = 0; i < NUM_XSTREAMS; i++) {
54        ABT_pool *sched_pools = malloc(sizeof(ABT_pool) * NUM_XSTREAMS);
55
56        /* Each scheduler can access all pools (own pool first) */
57        for (int j = 0; j < NUM_XSTREAMS; j++) {
58            sched_pools[j] = pools[(i + j) % NUM_XSTREAMS];
59        }
60
61        ABT_sched_create_basic(ABT_SCHED_RANDWS, NUM_XSTREAMS,
62                               sched_pools, ABT_SCHED_CONFIG_NULL, &scheds[i]);
63        free(sched_pools);
64    }
65
66    /* Setup execution streams */
67    ABT_xstream_self(&xstreams[0]);
68    ABT_xstream_set_main_sched(xstreams[0], scheds[0]);
69
70    for (int i = 1; i < NUM_XSTREAMS; i++) {
71        ABT_xstream_create(scheds[i], &xstreams[i]);
72    }
73
74    /* Create threads distributed across pools */
75    for (int i = 0; i < NUM_THREADS; i++) {
76        thread_ids[i] = i;
77        int pool_id = i % NUM_XSTREAMS;
78        ABT_thread_create(pools[pool_id], work_func, &thread_ids[i],
79                          ABT_THREAD_ATTR_NULL, &threads[i]);
80    }
81
82    /* Wait for all threads */
83    for (int i = 0; i < NUM_THREADS; i++) {
84        ABT_thread_free(&threads[i]);
85    }
86
87    /* Cleanup */
88    for (int i = 1; i < NUM_XSTREAMS; i++) {
89        ABT_xstream_join(xstreams[i]);
90        ABT_xstream_free(&xstreams[i]);
91    }
92
93    printf("\nRANDWS scheduler: Work stolen for load balancing\n");
94
95    ABT_finalize();
96    return 0;
97}

Key Points

Multi-Pool Access

Each scheduler gets access to all pools, ordered so its own pool is first. When a scheduler’s pool is empty, it randomly steals from other pools.

Dynamic Load Balancing

Work-stealing automatically balances load: - Execution streams with light work finish early - They steal heavy work from busy execution streams - Better utilization of all cores

Overhead

Work-stealing has higher overhead than fixed allocation due to: - Synchronization on shared pools (MPMC access) - Random selection overhead - Cache effects from work migration

Fibonacci Example

Recursive divide-and-conquer algorithms benefit greatly from work-stealing:

  1/*
  2 * Recursive Fibonacci with work-stealing schedulers
  3 * Demonstrates divide-and-conquer parallelism with dynamic load balancing
  4 */
  5
  6#include <stdio.h>
  7#include <stdlib.h>
  8#include <abt.h>
  9
 10#define NUM_XSTREAMS 4
 11#define FIB_N 20
 12
 13ABT_pool *pools;
 14int num_pools;
 15
 16typedef struct {
 17    int n;
 18    int result;
 19} fib_arg_t;
 20
 21void fibonacci_ult(void *arg)
 22{
 23    fib_arg_t *fib = (fib_arg_t *)arg;
 24    int n = fib->n;
 25
 26    if (n <= 2) {
 27        fib->result = 1;
 28        return;
 29    }
 30
 31    /* Create child tasks */
 32    fib_arg_t child1 = {n - 1, 0};
 33    fib_arg_t child2 = {n - 2, 0};
 34
 35    int rank;
 36    ABT_xstream_self_rank(&rank);
 37    ABT_pool target_pool = pools[rank % num_pools];
 38
 39    ABT_thread thread1;
 40    ABT_thread_create(target_pool, fibonacci_ult, &child1,
 41                      ABT_THREAD_ATTR_NULL, &thread1);
 42
 43    /* Calculate child2 directly (no need to create another ULT) */
 44    fibonacci_ult(&child2);
 45
 46    /* Wait for child1 */
 47    ABT_thread_free(&thread1);
 48
 49    fib->result = child1.result + child2.result;
 50}
 51
 52int main(int argc, char **argv)
 53{
 54    ABT_xstream xstreams[NUM_XSTREAMS];
 55    ABT_pool local_pools[NUM_XSTREAMS];
 56    ABT_sched scheds[NUM_XSTREAMS];
 57    fib_arg_t fib_task = {FIB_N, 0};
 58
 59    ABT_init(argc, argv);
 60
 61    printf("=== Fibonacci with Work-Stealing ===\n");
 62    printf("Computing fib(%d) with %d execution streams\n\n", FIB_N, NUM_XSTREAMS);
 63
 64    num_pools = NUM_XSTREAMS;
 65    pools = local_pools;
 66
 67    /* Create pools */
 68    for (int i = 0; i < NUM_XSTREAMS; i++) {
 69        ABT_pool_create_basic(ABT_POOL_FIFO, ABT_POOL_ACCESS_MPMC,
 70                              ABT_TRUE, &pools[i]);
 71    }
 72
 73    /* Create work-stealing schedulers */
 74    for (int i = 0; i < NUM_XSTREAMS; i++) {
 75        ABT_pool *sched_pools = malloc(sizeof(ABT_pool) * NUM_XSTREAMS);
 76        for (int j = 0; j < NUM_XSTREAMS; j++) {
 77            sched_pools[j] = pools[(i + j) % NUM_XSTREAMS];
 78        }
 79        ABT_sched_create_basic(ABT_SCHED_RANDWS, NUM_XSTREAMS,
 80                               sched_pools, ABT_SCHED_CONFIG_NULL, &scheds[i]);
 81        free(sched_pools);
 82    }
 83
 84    /* Setup execution streams */
 85    ABT_xstream_self(&xstreams[0]);
 86    ABT_xstream_set_main_sched(xstreams[0], scheds[0]);
 87
 88    for (int i = 1; i < NUM_XSTREAMS; i++) {
 89        ABT_xstream_create(scheds[i], &xstreams[i]);
 90    }
 91
 92    /* Compute fibonacci */
 93    ABT_thread main_thread;
 94    ABT_thread_create(pools[0], fibonacci_ult, &fib_task,
 95                      ABT_THREAD_ATTR_NULL, &main_thread);
 96    ABT_thread_free(&main_thread);
 97
 98    printf("fib(%d) = %d\n", FIB_N, fib_task.result);
 99
100    /* Cleanup */
101    for (int i = 1; i < NUM_XSTREAMS; i++) {
102        ABT_xstream_join(xstreams[i]);
103        ABT_xstream_free(&xstreams[i]);
104    }
105
106    printf("\nWork-stealing enabled dynamic load balancing across execution streams\n");
107
108    ABT_finalize();
109    return 0;
110}

Key Points

Recursive Parallelism (lines 30-45)

Each fibonacci call spawns a child ULT for one branch and directly computes the other. This creates a tree of ULTs that work-stealing distributes across execution streams.

Dynamic Work Distribution

Work-stealing is ideal for recursive algorithms because: - Work is created dynamically (can’t predict load upfront) - Some branches complete faster than others - Idle execution streams can steal pending work

Performance

With work-stealing, this fibonacci computation utilizes all cores effectively. Without it, work would be statically assigned and load imbalance would waste cores.

Choosing a Scheduler

Use BASIC when:
  • Single pool per execution stream

  • Work arrives at steady rate

  • Minimal overhead is important

  • No load balancing needed

Use PRIO when:
  • You have distinct work priority levels

  • Critical work must execute before background work

  • Multiple pool types (RPC, I/O, computation)

  • Willing to accept some overhead for prioritization

Use RANDWS when:
  • Workload is unpredictable or bursty

  • Recursive or divide-and-conquer algorithms

  • Maximum throughput is more important than minimal latency

  • You have multiple execution streams

  • Load balancing is critical

Mochi/Bedrock Integration

Bedrock configurations expose scheduler choices in each xstream definition:

{
    "argobots": {
        "pools": [
            {"name": "rpc_pool", "kind": "fifo_wait", "access": "mpmc"},
            {"name": "io_pool", "kind": "fifo_wait", "access": "mpmc"}
        ],
        "xstreams": [
            {
                "name": "rpc_stream",
                "scheduler": {
                    "type": "basic_wait",
                    "pools": ["rpc_pool"]
                }
            },
            {
                "name": "io_stream",
                "scheduler": {
                    "type": "basic_wait",
                    "pools": ["io_pool", "rpc_pool"]
                }
            }
        ]
    }
}

This configuration: - Creates dedicated xstreams for RPC and I/O - I/O stream can steal from RPC pool when idle - BASIC_WAIT scheduler for efficient waiting

Common Pitfalls

Wrong Pool Access Mode for Work-Stealing
/* WRONG: PRIV access with multiple schedulers */
ABT_pool_create_basic(ABT_POOL_FIFO, ABT_POOL_ACCESS_PRIV,
                      ABT_TRUE, &pool);
/* Multiple schedulers accessing this = race conditions */

For work-stealing, pools must use ABT_POOL_ACCESS_MPMC.

Too Many Pools per Scheduler
/* WRONG: Scheduler with too many pools */
ABT_pool pools[100];
ABT_sched_create_basic(ABT_SCHED_PRIO, 100, pools, ...);

Each pool adds overhead. Usually 2-4 pools per scheduler is sufficient.

Forgetting Pool Order Matters
/* Pool order is priority order */
pools[0] = low_priority_pool;   /* Will be checked first! */
pools[1] = high_priority_pool;  /* Checked second */

The first pool has highest priority. Order matters for PRIO schedulers.

API Reference

Scheduler Creation
  • int ABT_sched_create_basic(ABT_sched_predef predef, int num_pools, ABT_pool *pools, ABT_sched_config config, ABT_sched *newsched)

    Create a predefined scheduler.

    Parameters:
    • predef: Scheduler type (DEFAULT, BASIC, BASIC_WAIT, PRIO, RANDWS)

    • num_pools: Number of pools this scheduler accesses

    • pools: Array of pool handles

    • config: Configuration (use ABT_SCHED_CONFIG_NULL for defaults)

    • newsched: Output scheduler handle

Scheduler Management
  • int ABT_xstream_set_main_sched(ABT_xstream xstream, ABT_sched sched)

    Set the main scheduler for an execution stream.

  • int ABT_sched_finish(ABT_sched sched)

    Request scheduler to finish (stop scheduling new work).

  • int ABT_sched_free(ABT_sched *sched)

    Free a scheduler (only if not automatically freed).

Configuration
  • int ABT_sched_config_create(ABT_sched_config *config, ...)

    Create scheduler configuration.

  • int ABT_sched_config_free(ABT_sched_config *config)

    Free scheduler configuration.