{ "cells": [ { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. _sockets:\n", "\n", "Socket Communication\n", "====================\n", "\n", "As you've seen elsewhere in this documentation, the Cicada :any:`Communicator` interface defines a set of :ref:`communication-patterns` that can be used by algorithms, with :any:`SocketCommunicator` providing an implementation that uses standard operating system sockets and TCP/IP networking for the transport layer. While advanced users with specialized hardware may wish to write their own communicators from scratch, there is much more to :any:`SocketCommunicator` than you've seen so far; this article will introduce you to the full generality and flexibility of socket-based communication in Cicada.\n", "\n", "SocketCommunicator Creation\n", "---------------------------\n", "\n", "You may-or-may-not have noticed that none of the examples in this documentation create an instance of the :any:`SocketCommunicator` class directly - in some cases the communicator is created implicitly as part of another call (:any:`SocketCommunicator.run`), and in others a factory function returns a new communicator, ready for use (:any:`SocketCommunicator.connect`, :any:`SocketCommunicator.split`). This is because an instance of :any:`SocketCommunicator` doesn't just wrap a single socket, it wraps a collection of sockets, allowing every player to communicate directly with every other player. Creating those sockets and setting-up the connections is a complex process involving careful coordination among players, which is why Cicada divides :any:`SocketCommunicator` creation into separate high-level and low-level APIs. The low-level APIs create and initialize sockets that are only handed off for use by a new instance of :any:`SocketCommunicator` once they're ready." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "High Level API\n", "--------------\n", "\n", "The high level API provides the following methods:\n", "\n", "**SocketCommunicator.run**\n", "\n", ":any:`SocketCommunicator.run` is a static method that creates a set of player processes on the local machine, each executing a function that you provide. Before your function is executed, the underlying sockets are created with the low-level API and :any:`SocketCommunicator` instances are created and passed as the first argument to your function. This method is widely used for development, debugging, and throughout this documentation.\n", "\n", "**SocketCommunicator.connect**\n", "\n", ":any:`SocketCommunicator.connect` is a static factory function that returns a new instance of :any:`SocketCommunicator`. It must be called by every player that will be members of the new communicator. The sockets are created using the low level API and parameters you provide, or environment variables set by the :ref:`cicada` `run` and :ref:`cicada` `start` commands. This is a good choice for running your programs in production on separate hosts. See :ref:`running-programs` for examples.\n", "\n", "Note that you can call :any:`SocketCommunicator.connect` more than once within a program to create multiple communicators, which is more flexible (but less convenient) than using :any:`SocketCommunicator.split`.\n", "\n", "**SocketCommunicator.split**\n", "\n", ":any:`SocketCommunicator.split` partitions an existing communicator's players into groups, and returns a new instance of :any:`SocketCommunicator` for the new group containing the calling player, if any. It's a good choice when you need to split a group of players into smaller subgroups to work on separate tasks, and it must be called by every member of the existing communicator. See :ref:`multiple-communicators` for examples. \n", "\n", "**SocketCommunicator.shrink**\n", "\n", ":any:`SocketCommunicator.shrink` is intended for use when failures have occurred - it should be called by as many members of the existing communicator as possible, and returns a new instance of :any:`SocketCommunicator` that contains the subset of players that are still alive and responding. Note that :any:`SocketCommunicator.shrink` cannot guarantee that all remaining players will be included in the new communicator, so you would likely never want to use it outside of failure recovery situations." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Low Level API\n", "-------------\n", "\n", "The low level API is located in the :mod:`cicada.communicator.socket.connect` module, and includes the following:\n", "\n", "**Timer**\n", "\n", "The :class:`~cicada.communicator.socket.connect.Timer` class is used to manage timeouts during initialization. Callers typically create one instance of :class:`~cicada.communicator.socket.connect.Timer` and pass it the other low level API functions, to specify an overall timeout for the entire process.\n", "\n", "**listen**\n", "\n", "The :func:`~cicada.communicator.socket.connect.listen` function provides a standardized way to create a listening (server) socket, ready and waiting to accept connections. Since :any:`SocketCommunicator` depends on connections between every pair of players, every player must create a listening socket in order to setup the network. Typically, the listening socket is passed to :func:`~cicada.communicator.socket.connect.direct` or :func:`~cicada.communicator.socket.connect.rendezvous` and then discarded; however, advanced users may wish to use the listening socket even after establishing the rest of the network. For example, it can be used to setup an *MPC Service* where the players act as servers that listen for requests from clients, performing MPC calculations on their behalf.\n", "\n", "**direct**\n", "\n", "The :func:`~cicada.communicator.socket.connect.direct` function is used to create the complete network of connected sockets required to create an instance of :any:`SocketCommunicator`, when you know the address of every player in advance. It must be called by every player that will become a member of the new communicator. This is useful in managed or datacenter environments where addresses and port numbers can be specified up-front, which greatly simplifies and streamlines the setup process.\n", "\n", "**rendezvous**\n", "\n", "The :func:`~cicada.communicator.socket.connect.rendezvous` function is used to create the complete network of connected sockets required to create an instance of :any:`SocketCommunicator`, when only the address of the root (rank 0) player is known in advance. It must be called by every player that will become a member of the new communicator. This is more convenient when setting-up a group of players in an ad-hoc environment, but requires more time and provides more points for failure.\n", "\n", "There are several other helper functions and classes in :mod:`cicada.communicator.socket.connect` that will not be covered here." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Addressing\n", "----------\n", "\n", "Network addresses throughout the low-level and high-level APIs must be specified using string URLs. The use of URLs makes the choice of address family unambiguous, and allows for possible future expansion. Currently, two types of address family are supported. For TCP/IP networking, use the following:\n", "\n", "* tcp\\://{ip address}:{port} *# e.g. tcp\\://192.168.0.6:25252*\n", "* tcp\\://{ip address} *# e.g. tcp\\://192.168.0.7*\n", "\n", "When using the first form, a port number is specified explicitly. With the second, a port number will be chosen at random by the operating system. Note that in some contexts the API will not allow you to use the second form - for example, :func:`~cicada.communicator.socket.connect.direct` requires explicit ports for every address, while :func:`~cicada.communicator.socket.connect.rendezvous` only requires an explicit port for player 0's address.\n", "\n", "Alternatively, you can use Unix domain sockets for connections between players running on the same machine. To do so, use URLs like the following:\n", "\n", "* file\\://{path} *# e.g. file\\:///tmp/run5/player3*" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "Transport Layer Security (TLS/SSL)\n", "----------------------------------\n", "\n", "Cicada's socket communicator infrastructure supports Transport Layer Security (formerly known as Secure Sockets Layer) for encrypted communication and player authentication.\n", "\n", "To enable TLS when creating a communicator, each player must supply the following:\n", "\n", "* An *identity* - a private key and a certificate that the player will use to identify themselves to others.\n", "* A set of *trusted* identities (certificates) with which the player will communicate. These might be player certificates, a *certificate authority* used to sign player certificates, or any combination thereof.\n", "\n", "The :ref:`cicada` `credentials` command can be used to generate player identity and certificate files. The only required argument is the player rank, so three players could generate their own credentials as follows:\n", "\n", ".. code-block:: bash\n", "\n", " $ cicada credentials --rank 0\n", " \n", ".. code-block:: bash\n", " \n", " $ cicada credentials --rank 1\n", " \n", ".. code-block:: bash\n", "\n", " $ cicada credentials --rank 2\n", "\n", "This would create an identity file and a certificate file in the current directory for each player. The identity file *player-{rank}.pem* contains the player's private key and certificate in PEM format, and must be safeguarded by the player to prevent any other party from assuming their identity. The certificate file *player-{rank}.cert* contains just the player's certificate in PEM format, and should be distributed to the other players so they can recognize the player at runtime.\n", "\n", ".. tip::\n", "\n", " :ref:`cicada` `credentials` generates 2048-bit empty-passphrase RSA private keys\n", " and self-signed certificates for players as a convenience.\n", " Advanced users will likely want to use their own existing tools and workflows to\n", " create proper credentials with passphrases that are signed by real certificate authorities.\n", " \n", "Once credentials have been created for every player, they can be supplied during startup to create encrypted communicators. To use TLS with the high-level API, callers pass the filesystem paths to player identities and certificates. For example, to use TLS with :any:`SocketCommunicator.run`, you might do something like the following:\n", "\n", ".. code-block:: python\n", "\n", " def main(communicator):\n", " pass\n", "\n", " world_size=3\n", " identities = [f\"player-{rank}.pem\" for rank in range(world_size)]\n", " trusted = [f\"player-{rank}.cert\" for rank in range(world_size)]\n", " SocketCommunicator.run(world_size=world_size, fn=main, identities=identities, trusted=trusted)\n", " \n", ".. warning::\n", "\n", " This example assumes that all of the credential files - including the players' private keys -\n", " are in the current directory. This might be acceptable for testing with\n", " :any:`SocketCommunicator.run` but should never happen in production, since it would allow players\n", " to assume each others' identities by loading other players' identity files." ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 2 }