Multiple Communicators

A Cicada communicator represents a group of players who have agreed to work on a problem together; importantly, the number of players in a communicator is set when the communicator is created, and never changes. This is usually what you want, but for more complex computations, and in situations where you need the number of players to change over time, you can work with multiple communicators in a single program.

One way to do this is with the split() method, which creates one or more new communicators from an existing communicator. First, let’s look at the name and list of players for a “parent” communicator:

[1]:
import logging

from cicada.communicator import SocketCommunicator
from cicada.logging import Logger

logging.basicConfig(level=logging.INFO)

def main(parent):
    log = Logger(logging.getLogger(), parent)

    log.info(f"Player {parent.rank} parent name: {parent.name} ranks: {parent.ranks}")

SocketCommunicator.run(world_size=4, fn=main);
INFO:root:Player 0 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 1 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 2 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 3 parent name: world ranks: [0, 1, 2, 3]

We call the communicator created by run() the parent because we will use it to create “child” communicators in the examples that follow. As you can see, its default name is “world” and it has four players with ranks “0” through “3”.

In the simplest case, we can use the parent communicator to create a second communicator that includes the same set of players:

[2]:
def main(parent):
    log = Logger(logging.getLogger(), parent)

    log.info(f"Player {parent.rank} parent name: {parent.name} ranks: {parent.ranks}")
    with parent.split(name="child") as child:
        log.info(f"Player {parent.rank} child name: {child.name} ranks: {child.ranks}")

SocketCommunicator.run(world_size=4, fn=main);
INFO:root:Player 0 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 1 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 2 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 3 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 0 child name: child ranks: [0, 1, 2, 3]
INFO:root:Player 1 child name: child ranks: [0, 1, 2, 3]
INFO:root:Player 2 child name: child ranks: [0, 1, 2, 3]
INFO:root:Player 3 child name: child ranks: [0, 1, 2, 3]

The child communicator goes through the same startup process as the parent and is freed automatically when the with block exits. You can repeat this process to create as many communicators as you like; however, each one would have the same players as the parent.

In a more useful scenario, you might start a computation with many players, then divide them into smaller groups to provide redundancy in case of failure. In other words, you need to create multiple communicators that partition the parent communicator into smaller groups that contain subsets of players. To do this, arrange to have different players pass different names in the call to split(), which will create a communicator for every unique name and distribute the players accordingly.

For example, we can split our four players into two groups of two:

[3]:
def main(parent):
    log = Logger(logging.getLogger(), parent)

    log.info(f"Player {parent.rank} parent name: {parent.name} ranks: {parent.ranks}")

    # Players 0 and 1 pass the name "red"; players 2 and 3 pass "blue".
    if parent.rank in [0, 1]:
        name = "red"
    elif parent.rank in [2, 3]:
        name = "blue"

    with parent.split(name=name) as child:
        log.info(f"Player {parent.rank} child name: {child.name} ranks: {child.ranks}")

SocketCommunicator.run(world_size=4, fn=main);
INFO:root:Player 0 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 1 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 2 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 3 parent name: world ranks: [0, 1, 2, 3]
INFO:root:Player 0 child name: red ranks: [0, 1]
INFO:root:Player 1 child name: red ranks: [0, 1]
INFO:root:Player 2 child name: blue ranks: [0, 1]
INFO:root:Player 3 child name: blue ranks: [0, 1]

Now, we see two child communicators being created, “red” and “blue”, each with two players.

Important

If you look carefully, you’ll see that both child communicators report their ranks as ranks: [0, 1] … what gives? The red communicator should have players 0 and 1, and the blue communicator should have players 2 and 3! Relax, grasshopper. Within a communicator, ranks always count from zero, so players 2 and 3 really are members of the blue communicator; they just have different ranks within its context.
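To make the renumbering concrete, here is a minimal pure-Python model of the grouping that split() performs. It does not call Cicada at all: model_split() is a hypothetical helper written for this illustration, and it assumes that child ranks are assigned in ascending order of parent rank, which is consistent with the output above but is not confirmed by it.

```python
from collections import defaultdict

def model_split(names):
    """Model the grouping done by split() (illustration only, not Cicada code).

    names[parent_rank] is the name that player passes to split().
    Returns {name: [parent ranks]}, where a player's position in the
    list stands for its rank inside the child communicator.
    Assumption: child ranks follow ascending parent rank.
    """
    groups = defaultdict(list)
    for parent_rank, name in enumerate(names):
        if name is not None:  # Players passing None join no group.
            groups[name].append(parent_rank)
    return dict(groups)

# The same names used in the red / blue example above.
groups = model_split(["red", "red", "blue", "blue"])
for name, members in groups.items():
    for child_rank, parent_rank in enumerate(members):
        print(f"parent rank {parent_rank} -> {name} rank {child_rank}")
```

Under that assumption, parent ranks 2 and 3 become ranks 0 and 1 of the blue group, which matches the ranks: [0, 1] reported by both children.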

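Finally, here’s a more thoroughly worked example of how split() could be used to work on separate “games” in round-robin fashion. The parent communicator’s four players are partitioned into three overlapping groups of three, so every player takes part in more than one game. A player that does not belong to a group passes name=None to split() and, as the code below relies on, gets None back instead of a communicator. Each group then broadcasts messages: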

[4]:
import collections
import logging

Game = collections.namedtuple("Game", ["communicator", "log"])

def main(communicator):
    # Setup multiple games with separate communicators.
    games = []
    partitions = [[0, 1, 2], [1, 2, 3], [2, 3, 0]]
    for index, partition in enumerate(partitions):
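        # Players outside this partition pass name=None and receive None instead of a communicator.
        name = f"game-{index}" if communicator.rank in partition else None
        game_communicator = communicator.split(name=name)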
        if game_communicator is not None:
            game = Game(
                communicator=game_communicator,
                log=Logger(logging.getLogger(), game_communicator),
                )
            games.append(game)

    # Run games in round-robin fashion.
    for i in range(2):
        for game in games:
            value = f"{game.communicator.name} message {i}" if game.communicator.rank == 0 else None
            value = game.communicator.broadcast(src=0, value=value)
            game.log.info(f"{game.communicator.name} player {game.communicator.rank} received broadcast value: {value}")

    # Cleanup games.
    for game in games:
        game.communicator.free()

SocketCommunicator.run(world_size=4, fn=main);
INFO:root:game-0 player 0 received broadcast value: game-0 message 0
INFO:root:game-0 player 1 received broadcast value: game-0 message 0
INFO:root:game-0 player 2 received broadcast value: game-0 message 0
INFO:root:game-1 player 0 received broadcast value: game-1 message 0
INFO:root:game-2 player 0 received broadcast value: game-2 message 0
INFO:root:game-1 player 1 received broadcast value: game-1 message 0
INFO:root:game-1 player 2 received broadcast value: game-1 message 0
INFO:root:game-2 player 1 received broadcast value: game-2 message 0
INFO:root:game-2 player 2 received broadcast value: game-2 message 0
INFO:root:game-0 player 0 received broadcast value: game-0 message 1
INFO:root:game-0 player 1 received broadcast value: game-0 message 1
INFO:root:game-0 player 2 received broadcast value: game-0 message 1
INFO:root:game-2 player 0 received broadcast value: game-2 message 1
INFO:root:game-1 player 0 received broadcast value: game-1 message 1
INFO:root:game-1 player 1 received broadcast value: game-1 message 1
INFO:root:game-1 player 2 received broadcast value: game-1 message 1
INFO:root:game-2 player 1 received broadcast value: game-2 message 1
INFO:root:game-2 player 2 received broadcast value: game-2 message 1
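Because the partitions overlap, it can help to work out which games each player of the parent communicator joins. The following sketch reuses the partitions list from the example above in plain Python, without Cicada; games_for() is a hypothetical helper introduced only for this illustration.

```python
# Same partitions and world size as the round-robin example above.
partitions = [[0, 1, 2], [1, 2, 3], [2, 3, 0]]
world_size = 4

def games_for(rank):
    """Return the names of the games that a parent rank belongs to."""
    return [
        f"game-{index}"
        for index, partition in enumerate(partitions)
        if rank in partition
    ]

membership = {rank: games_for(rank) for rank in range(world_size)}
for rank, names in membership.items():
    print(f"parent rank {rank} joins: {names}")
```

Player 2 of the parent communicator appears in all three partitions, while the other players each appear in two, which is why every player in the example ends up with several entries in its games list.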