_images/cicada.png

Tutorial

The Millionaires’ Dilemma

Imagine a pair of millionaires [*] who want to settle a bet over whose fortune is larger, yet neither wants to reveal the size of their fortune to the other, and neither is willing to reveal it to a third party. These may seem like mutually-exclusive goals, yet it turns out that we can arrive at a correct answer without revealing the size of either fortune. Using secure multiparty computation (MPC), the millionaires can cooperatively compute which fortune is larger in such a way that both learn the result, yet neither learns the other’s private information.

Cicada provides a collection of components that can be combined in flexible ways to create MPC programs. This tutorial will introduce you to the basic building blocks of a Cicada program, and solve the millionaires’ dilemma. Ready? Let’s get started!

The Basics

An important point to understand fully is that - just as the name says - secure multiparty computation involves multiple cooperating parties, which we will refer to as players throughout this documentation. In computer science terms, each player is a separate process, typically running on a separate host. In other words, you should think of an MPC program as a single set of instructions that run in parallel across multiple computers, communicating and coordinating as they execute.

Writing programs this way can feel a little weird until you become accustomed to it. Fortunately, the high performance computing (HPC) community has been writing programs this way for decades, and Cicada brings that expertise to MPC. If you’re familiar with writing software using popular HPC tools like MPI, you’ll be right at home in Cicada. If not, don’t worry! We’ll explain how it all works as we go.

Before we begin, let’s setup Python’s builtin logging functionality, which Cicada uses to provide feedback from running MPC programs:

[1]:
import logging
logging.basicConfig(level=logging.INFO)

Next, we need to create two players. That means we’re going to have to start two processes, each running the same program, and the programs will need to communicate with each other. In Cicada, all communication is handled through communicators, which coordinate and pass messages between players. What this means is that every Cicada program must do (at a minimum) two things:

  • Start a collection of processes, one per player.

  • Initialize a communicator for each process.

For this tutorial, we’ll be using Cicada’s SocketCommunicator, which uses Python’s builtin socket module for communication.

Tip

Advanced Cicada users can substitute their own custom communicators for use with alternate networking libraries, protocols, and hardware.

Cicada provides several methods to simplify an otherwise tricky bootstrapping process. For example, SocketCommunicator can run both processes on the local machine and setup the communicators for us, which is ideal for development. Let’s see how that works:

[2]:
from cicada.communicator import SocketCommunicator

def main(communicator):
    print("Hello, World!")

SocketCommunicator.run(world_size=2, fn=main);
Hello, World!Hello, World!

There’s a lot to unpack here, so we’ll go over things in detail. First, we define a function named main, which prints a familiar message:

def main(communicator):
    print("Hello, World!")

Note that the main function takes a single communicator argument, which SocketCommunicator.run provides for us: it starts two separate processes (specified using the world_size parameter), creates a SocketCommunicator instance for each, and calls main with the communicator as the first argument, for each process.

We can watch all of this in greater detail as it happens by re-running our function with increased logging:

[3]:
logging.getLogger("cicada.communicator").setLevel(logging.INFO)

SocketCommunicator.run(world_size=2, fn=main);
INFO:cicada.communicator.socket.connect:Comm world player 0 listening to tcp://127.0.0.1:61481 for connections.
INFO:cicada.communicator.socket.connect:Comm world player 1 listening to tcp://127.0.0.1:61483 for connections.
INFO:cicada.communicator.socket.connect:Comm world player 0 direct connect with ['tcp://127.0.0.1:61481', 'tcp://127.0.0.1:61483'].
INFO:cicada.communicator.socket.connect:Comm world player 1 direct connect with ['tcp://127.0.0.1:61481', 'tcp://127.0.0.1:61483'].
INFO:cicada.communicator.socket.connect:Comm world player 0 tcp://127.0.0.1:61481 accepted connection from tcp://127.0.0.1:61487
INFO:cicada.communicator.socket.connect:Comm world player 1 tcp://127.0.0.1:61487 connected to player 0 tcp://127.0.0.1:61481
INFO:cicada.communicator.socket:Comm world player 0 communicator ready.
INFO:cicada.communicator.socket:Comm world player 1 communicator ready.
Hello, World!Hello, World!

INFO:cicada.communicator.socket:Comm world player 1 communicator freed.
INFO:cicada.communicator.socket:Comm world player 0 communicator freed.
INFO:cicada.communicator.socket:Comm world player 0 result: None
INFO:cicada.communicator.socket:Comm world player 1 result: None

First, the individual player processes begin listening for connections from each other:

INFO:cicada.communicator.socket.connect:Comm world player 0 listening to ...
INFO:cicada.communicator.socket.connect:Comm world player 1 listening to ...

Then, the players begin making connections:

INFO:cicada.communicator.socket.connect:Comm world player 1 direct connect ...
INFO:cicada.communicator.socket.connect:Comm world player 0 direct connect ...

Next, we see a confirmation that communications have been established:

INFO:cicada.communicator.socket:Comm world player 0 communicator ready.
INFO:cicada.communicator.socket:Comm world player 1 communicator ready.

Then, we see the output from main … in this case, two copies of “Hello, World!” (one from each player):

Hello, World!Hello, World!

The outputs appear on the same line because they’re being printed by both players at the same time - if you run this notebook yourself, the output may look different, depending on random quirks of timing. We’ll see in a moment how to improve on this.

Once main ends, the communicators are automatically cleaned-up:

INFO:cicada.communicator.socket:Comm world player 0 communicator freed.
INFO:cicada.communicator.socket:Comm world player 1 communicator freed.

Finally, we see the value (if any) returned by each player from main:

INFO:cicada.communicator.socket:Player 0 returned: None
INFO:cicada.communicator.socket:Player 1 returned: None

In this case, our main function doesn’t have a return statement, so the returned value is None. The returned values are also collected and returned by SocketCommunicator.run, though we won’t be using them for this tutorial.

So here’s what we’ve done so far: we defined a function named main. When we pass main to SocketCommunicator.run, the latter executes it in parallel within separate processes (two in this case), using communicators that it automatically creates for us. Be sure you understand these steps before proceeding.

Tip

The name of the function isn’t important - you can pass any function that takes a communicator as its first argument to SocketCommunicator.run.

Before we continue, let’s restore our logging output to “normal” levels:

[4]:
logging.getLogger("cicada.communicator").setLevel(logging.WARNING)

Logging

As we saw above, when multiple players print to stdout at the same time, the results can step on each other. This becomes a significant problem when we start doing real MPC computation and need to debug programs and print results. Let’s add Cicada’s Logger to the current example to tidy things up:

[5]:
from cicada.logging import Logger

def main(communicator):
    log = Logger(logger=logging.getLogger(), communicator=communicator)
    log.info("Hello, World!")

SocketCommunicator.run(world_size=2, fn=main);
INFO:root:Hello, World!
INFO:root:Hello, World!

Now, the output messages are nicely printed on separate lines. Cicada’s Logger wraps a standard Python logger, and uses a communicator to coordinate among the players so that only one player generates output at a time.

Of course, it would be especially useful to know which message belongs to which player. In Cicada, each player has a zero-based integer identifier – its rank – which we can use in our message:

[6]:
def main(communicator):
    log = Logger(logger=logging.getLogger(), communicator=communicator)
    log.info(f"Hello from player {communicator.rank}!")

SocketCommunicator.run(world_size=2, fn=main);
INFO:root:Hello from player 0!
INFO:root:Hello from player 1!

Notice that the player’s rank is accessed using the communicator (a concrete example of how communicators provide the organization for a group of players), and that the logger automatically prints messages to the console in rank order. As you will see in the examples that follow, the player’s rank is one of the most-used pieces of information in an MPC program - we will use rank frequently to change a player’s behavior based on their identity, including targeting communications and MPC operations to specific players based on their rank.

To round things out, a good MPC program should be written in such a way that it can be run using any number of players. Instead of hard-coding the number of players into your programs, you should use the communicator’s world_size attribute to determine at runtime how many players are participating in the computation, and adjust player behavior as-needed. Note that the value of the world_size attribute will always match the value of the world_size parameter passed to SocketCommunicator.run:

[7]:
def main(communicator):
    log = Logger(logger=logging.getLogger(), communicator=communicator)
    log.info(f"Hello from player {communicator.rank} of {communicator.world_size}!")

SocketCommunicator.run(world_size=2, fn=main);
INFO:root:Hello from player 0 of 2!
INFO:root:Hello from player 1 of 2!

Encodings

One of the trickiest topics when working with MPC is managing encodings - as users of MPC, we typically want to perform computation on real numbers, while most MPC protocols require integer operands with special properties - for example, “only positive integers”, or “only integers mod \(p\) where \(p\) is a prime number”.

To manage this, Cicada must convert real numbers between encoded and unencoded representations. While this is normally handled for you automatically, it’s good to have an understanding of the process. To see encoding at work, let’s create an instance of Cicada’s FixedPoint encoding:

[8]:
import cicada.encoding

encoding = cicada.encoding.FixedPoint()
encoding
[8]:
cicada.encoding.FixedPoint(precision=16)

FixedPoint encodes real values as integers by reserving a fixed number of bits (16 by default) to store the fractional part of the original value. To test it, let’s create a real value to encode:

[9]:
import numpy

value = numpy.array(numpy.pi)
value
[9]:
array(3.14159265)

Note

Cicada uses numpy arrays as arguments throughout the API. This greatly simplifies application and implementation code by eliminating error-prone loops, and provides important speedups. This is why our value is a numpy.array in the example above, despite being a scalar value - you may not have known it, but Numpy treats a scalar as “just another array” – one that has zero dimensions, size equal to 1, and shape equal to an empty tuple.

Now that we have a real value and the means to encode it, we need to define how we will store the encoded result. It is extremely common in MPC to perform arithmetic using finite fields:

[10]:
import cicada.arithmetic

field = cicada.arithmetic.Field()
field
[10]:
cicada.arithmetic.Field(order=18446744073709551557)

In this case the Field object defines a finite set of integers (the order in the Field representation) for which multiplication, addition, subtraction and division are defined and satisfy certain basic rules. Now, we can encode (convert) our real value to its representation in the field:

[11]:
encoded_value = encoding.encode(value, field)
encoded_value
[11]:
array(205887, dtype=object)

The encoder turns the unencoded array of real values into an array of integers with the same shape that encode the original values. You may feel that 205887 is an unlikely way to store the value of \(\pi\), but let’s try decoding it to see if what we get out matches the original:

[12]:
decoded_value = encoding.decode(encoded_value, field)
decoded_value
[12]:
array(3.1415863)

You can see that the result is a value that’s pretty close to the original, but not an exact match. This is because the default 16 bits of precision used by the FixedPoint encoding to represent fractions can only approximate some values (of course, the original value was itself a finite approximation of \(\pi\), so this shouldn’t bother you too much). For many computations 16 bits of fractional precision is more than enough, but if you need more (or less) precision, you can arrange to do so. Furthermore, Cicada provides special encodings for working with bits and boolean values, which we’ll see shortly.

Secure Multiparty Computation (MPC)

OK, with the preliminaries out of the way, let’s do some MPC! Recall that we have two players (the millionaires), that each has a secret (their fortune), and that they wish to identify which secret is larger, but without revealing the secret values to each other.

The key technique provided by MPC to accomplish this is secret sharing, where each secret is split into pieces called secret shares, and the shares are distributed to the other players. Because MPC provides protocols that allow players to collaboratively perform mathematical operations on shares, and because it’s provably impossible to reconstruct a secret unless a player has obtained all of the necessary shares, the players in our computation can perform arithmetic and logical operations on values without knowing what those values are!

There is more than one way to generate secret shares, and there are many protocols that have been developed to manipulate them. For this tutorial we’re going to focus on additive secret sharing, where the shares of a secret are randomly-chosen numbers that reveal the secret value when summed. Thus, a player can only learn (or reveal) the value of a secret if they have access to every share, in order to add them all together.

Let’s see what this looks like in Cicada. To create and manipulate additive secret shares, we need an instance of AdditiveProtocolSuite. In the following example, we create the protocol suite, and use it to create a set of secret shares that are distributed to all players:

[13]:
import cicada.additive

def main(communicator):
    log = Logger(logger=logging.getLogger(), communicator=communicator)
    protocol = cicada.additive.AdditiveProtocolSuite(communicator=communicator)

    secret = numpy.array(numpy.pi) if communicator.rank == 0 else None
    log.info(f"Player {communicator.rank} secret: {secret}")

    share = protocol.share(src=0, secret=secret, shape=())
    log.info(f"Player {communicator.rank} share: {share}")

SocketCommunicator.run(world_size=2, fn=main);
INFO:root:Player 0 secret: 3.141592653589793
INFO:root:Player 1 secret: None
INFO:root:Player 0 share: cicada.additive.AdditiveArrayShare(storage=8875355198802812926)
INFO:root:Player 1 share: cicada.additive.AdditiveArrayShare(storage=9571388874906944518)

Let’s dig into this. First, we create the protocol suite:

import cicada.additive
...
protocol = cicada.additive.AdditiveProtocolSuite(communicator=communicator)

Much like the Logger object created previously, we pass the communicator to the protocol suite because it needs to communicate among players to implement its functionality.

Next, we initialize a secret value known only to player 0:

secret = numpy.array(numpy.pi) if communicator.rank == 0 else None

Note that, as we described earlier, we’re using the player rank to change the behavior of our program depending on which player is executing it. In this case, player 0 sets the value of the secret variable to \(\pi\) while the other player leaves it uninitialized.

Note

“But wait!” you may be thinking … how is this value a “secret” if every player is executing the same code!? The other player may not be initializing the variable, but the code is the same everywhere, isn’t it? Doesn’t that mean that every player “knows” the secret?

You are absolutely correct. In this example the secret is embedded into the program code as a literal value, which means it isn’t really a secret. We do this often to keep our examples succinct. Whenever you see examples that embed literal secrets into code, you should keep in mind that a “real” program with “real” secrets would supply them in a privacy-preserving way, by loading data from a player-specific file or database, reading a sensor that only that player has access to, prompting a human to enter a value on a secure terminal, or similar.

The same goes for the log output … in real life we wouldn’t log secret values to stdout where anyone can see them! We do it here for strictly pedagogical purposes.

Our log output confirms that the secret value is only defined for player 0:

INFO:root:Player 0 secret: 3.141592653589793
INFO:root:Player 1 secret: None

Next, player 0 shares the secret value with the other player using additive secret sharing:

share = protocol.share(src=0, secret=secret, shape=())

Again, there’s a lot of detail here to unpack. Remember that every player is running the code in parallel - AdditiveProtocolSuite.share is an example of a collective operation, one that must be called by every player and, except for the values of the secret argument, must be called with the same arguments by every player. This is a common pattern throughout Cicada (and MPI, if you’re familiar with HPC programming). In this case, the arguments indicate that the actual secret value will be provided by player 0 (src=0), and that the secret is a scalar value (shape=() … remember that Cicada can work with arrays of any shape as secrets). Every player has to provide some value for secret because the Python language requires it, but only the value provided by the player matching src matters.

The protocol object takes the secret provided by player 0, creates secret shares, and distributes them to all players, where they become the return value from AdditiveProtocolSuite.share.

Note

The protocol suite provides default field and encoding objects that are used to encode secrets when they’re shared, and decode them when they’re revealed.

Note from the log output that both players receive a share of the secret, including the player that provided it:

INFO:root:Player 0 share: cicada.additive.AdditiveArrayShare(storage=...)
INFO:root:Player 1 share: cicada.additive.AdditiveArrayShare(storage=...)

The values of the shares are random numbers from the protocol object’s field that when summed using field arithmetic will equal the encoded representation of the original secret. To see that this is so, let’s re-run the experiment with a final step where the players cooperatively combine their shares to reveal the original secret:

[14]:
def main(communicator):
    log = Logger(logger=logging.getLogger(), communicator=communicator)
    protocol = cicada.additive.AdditiveProtocolSuite(communicator=communicator, seed=1234)

    secret = numpy.array(numpy.pi) if communicator.rank == 0 else None
    log.info(f"Player {communicator.rank} secret: {secret}")

    share = protocol.share(src=0, secret=secret, shape=())
    log.info(f"Player {communicator.rank} share: {share}")

    revealed = protocol.reveal(share)
    log.info(f"Player {communicator.rank} revealed: {revealed}")

SocketCommunicator.run(world_size=2, fn=main);
INFO:root:Player 0 secret: 3.141592653589793
INFO:root:Player 1 secret: None
INFO:root:Player 0 share: cicada.additive.AdditiveArrayShare(storage=16993927038499601873)
INFO:root:Player 1 share: cicada.additive.AdditiveArrayShare(storage=1452817035210155571)
INFO:root:Player 0 revealed: 3.1415863037109375
INFO:root:Player 1 revealed: 3.1415863037109375

Now, when every player passes their share of the secret to AdditiveProtocolSuite.reveal (which is also a collective operation) the result is the original secret:

INFO:root:Player 0 revealed: 3.1415863037109375
INFO:root:Player 1 revealed: 3.1415863037109375

(Remember that the value isn’t an exact match because the encoding-decoding round trip is lossy.)

Now that we know how to share and reveal secrets, we’re finally ready to do some computation. Let’s get our millionaires a solution for their dilemma. The trick is that, in addition to using AdditiveProtocolSuite to create and reveal shares, we can use it to perform mathematical operations on shares, which return shares of the result. For example, we can perform a less-than comparison on secret-shared values, which returns a share of the answer. If we only reveal that to the players, then they will know which of the two fortunes is larger, without knowing the value of either. Let’s see what it looks like:

[15]:
def main(communicator):
    log = Logger(logger=logging.getLogger(), communicator=communicator)
    protocol = cicada.additive.AdditiveProtocolSuite(communicator=communicator)

    if communicator.rank == 0:
        fortune = numpy.array(10000000)
    elif communicator.rank == 1:
        fortune = numpy.array(12000000)
    log.info(f"Player {communicator.rank} fortune: {fortune}")

    share0 = protocol.share(src=0, secret=fortune, shape=())
    share1 = protocol.share(src=1, secret=fortune, shape=())

    if protocol.reveal(protocol.less(share0, share1), encoding=cicada.encoding.Boolean()):
        winner = 1
    else:
        winner = 0

    log.info(f"Winner revealed to player {communicator.rank}: {winner}")

SocketCommunicator.run(world_size=2, fn=main);
INFO:root:Player 0 fortune: 10000000
INFO:root:Player 1 fortune: 12000000
INFO:root:Winner revealed to player 0: 1
INFO:root:Winner revealed to player 1: 1

Note that both players provide a secret now, not just player 0, so that the fortune variable contains a different value for each process. Also, the same fortune variable is passed as the secret value to AdditiveProtocolSuite.share twice, once for each player. This works because AdditiveProtocolSuite.share only uses the value supplied by the player specified in the src parameter.

We use AdditiveProtocolSuite.less to compare the shares of the two values, which returns a share of the result - either a share of 0 if the comparison is false, or a share of 1 if it is true. Because these are boolean field values rather then real numbers, we pass the Boolean encoding to reveal them.

Looking at the results, we can verify that player 1 does have the largest fortune, so our millionaires finally have their answer!

Although this program works, it is hardcoded in such a way that it will only work with two players. As we mentioned above, it’s usually a good idea to write MPC programs to work with any number of players - in addition to being more flexible, this approach can often make code more compact and easier to understand. Let’s rewrite our example to be agnostic about the number of players, and while we’re at it, let’s have every player get their input from a file, so that we’re no longer embedding secrets in the code as literals:

[16]:
def main(communicator):
    log = Logger(logger=logging.getLogger(), communicator=communicator)
    protocol = cicada.additive.AdditiveProtocolSuite(communicator=communicator, seed=1234)

    fortune = numpy.loadtxt(f"millionaire-{communicator.rank}.txt")

    winner = None
    winning_share = protocol.share(src=0, secret=numpy.array(0), shape=())
    for rank in communicator.ranks:
        fortune_share = protocol.share(src=rank, secret=fortune, shape=())
        less_share = protocol.less(fortune_share, winning_share)
        less = protocol.reveal(less_share, encoding=cicada.encoding.Boolean())
        if not less:
            winner = rank
            winning_share = fortune_share

    log.info(f"Winner revealed to player {communicator.rank}: {winner}")

SocketCommunicator.run(world_size=4, fn=main);
INFO:root:Winner revealed to player 0: 2
INFO:root:Winner revealed to player 1: 2
INFO:root:Winner revealed to player 2: 2
INFO:root:Winner revealed to player 3: 2

In this version of the program, we set aside storage for a winning share that starts with a value of zero. Then, we loop over all of the players, comparing each player’s shared secret with the winning share and updating it if the player’s secret is larger.

If we examine the contents of the individual players’ files, we see that player 2’s fortune is the largest, so the choice of winner is correct:

[17]:
for rank in range(4):
    fortune = numpy.loadtxt(f"millionaire-{rank}.txt")
    print(f"Player {rank} fortune: {fortune:>10}")
Player 0 fortune: 10000000.0
Player 1 fortune:  9000000.0
Player 2 fortune: 11000000.0
Player 3 fortune:  8500000.0

Note

You may be questioning whether the data in the player files is truly “secret” - couldn’t any of the players cheat by reading the other players’ secrets from their files? Once again, you’re absolutely right: when all four players are running as the same user on the same machine, they all have access to the same resources, including files on the filesystem. This highlights the fact that, to be truly secure, an MPC program must be run on separate machines that are under the sole control of their players - only then can we guarantee that player secrets will remain truly private.

That’s it for this tutorial! Of course, real computation requires more than just comparisons - see the User Guide for individual articles with detailed topics describing how to use Cicada to perform a whole host of mathematical operations.

Footnotes