Blockchain mathematical basis: asymmetric cryptography

Introduction

Backup your private key.
Never share your private key.
Not your keys, not your bitcoins.
Send to: Paste public 0x... address
Sign message metamask

Although, the use of jargon has been reduced to improve UX and onboard more users, the terms "private key, public key, public address, signatures" are still common. Users will also encounter mainstream definitions of blockchain and cryptocurrencies mixing the terms "database, cryptography, security, confidentiality".
How does it all fit together?
What is the mathematical rationale behind this proclaimed and proven security?

> Symmetric cryptography
> Assymmetric cryptography
    > Ensure non collusion & "one-way" mathematical properties
    > Hard to hack ~ intractability
    > Bitcoin application
         > Intractability of elliptic curve discrete logarithm Problem
             > Discrete logarithm problem
                 > Algebra
                     > Finite cyclic group
                         > Group
                             > Set
                             > Binary operation
                             > Properties
                 > Application
                     > Elliptic curve over a finite field
                     > Elliptic curve point multiplication
         > Elliptic Curve Signature Algoritm (ECDSA)

1. Symmetric cryptography

Symmetric cryptography
Symmetric cryptography uses the same key to encrypt and decrypt a message. This key has to be shared between the sender and recipient that want to communicate confidentially. This system can only be as secure as the communication channel used to exchange the key.

2. Assymetric cryptography

By contrast, in assymetric or public-key cryptography, 2 different keys are used:

  • public key: can be freely shared
  • private key: kept secret

The generation of such keys depends on cryptographic algorithms based on mathematical problems to produce one-way (or preimage resistant) functions. Effective security only requires keeping the private key private. The public key can be openly distributed without compromising security.
Public-key encryption is like owning a (key, lock) combination and distributing copies of the lock.

  • "sending" = using a copy of someone else's lock to lock your message = encrypting message to send with someone else's public key
  • "receiving" = opening a message locked with a copy of your lock with your key = decrypting received message with your private key

Public-key encryption illustration

Applications

"Hiding": Public-key encryption to ensure confidentiality

To restrict access to a message's content, one can encrypt it using the receiver's public key. That encrypted message can only be decrypted with the receiver's private key. Especially it shall be impossible to deduce the message from its encrypted version.
To ensure this property called "hiding", the mathematical function used to encrypt the message needs to be very hard or impossible to reverse. It needs to be a one-way function. This mathematical one-way (or preimage resistance) property is desirable because confidentiality stems from it.

"Binding": digital signatures to ensure authenticity, non repudiation

Digital signature illustration
Digitally signing is like creating a lock out of the sender's private key and the message to transmit that
is unique,
and that can only be unlocked by the public key associated with the private key that created it.
Any smallest change in the private key or the message would create a different signature/lock. The signature is mathematically bound to both the message and private key it originally was made with.
To be binding, the mathematical function used to generate the the signature need to ensure non collusion . It needs to be injective.

f(x) == f(y) => x = y <-> x != y => f(x) != f(y)

This mathematical non collusion property is desirable because 3 key properties stem from it:

  • message authenticity = integrity: it is very hard to find 2 different messages that generate the same signature out of the same private key. If the signature is valid, the message is authentic.
  • sender authenticity: it is very hard to find 2 different private keys that generate the same signature out of the same message. If the signature is valid, the message has necessarily been signed by the owner of the private key that generated it.

A corollary of sender's authenticity is

  • non repudiation: provided that private key used to digitally sign a message is properly safeguarded by the original owner, the owner cannot dispute the authenticity of the signature. Nobody can forge the signature

Successfully unlocking the message with the sender's public key confirms these 3 properties.

3. Defining "very hard": intractability

THE KEY TO PUBLIC-KEY CRYPTOGRAPHY IS THE INTRACTABILITY OF CERTAIN MATHEMATICAL PROBLEMS.
All the properties introduced above hold only if it is very hard to find colluding elements, reverse the function... etc..
In the context of cryptography what does "very hard" mean?

A problem that can be solved in theory (e.g. given large but finite resources, especially time), but for which in practice any solution takes too many resources to be useful, is known as an intractable problem. [wikipedia]

...so problems which can be solved by brute force but it would take too long. What does too long mean?
Too long means universe-lifetime-long: longer than 13.799±0.021 ×109 years.

For a more rigorous definition, look into computer science courses.

So to guarantee the "hiding" and "binding" properties, we are looking for a mathematical function that poses 2 intractable problems:

  • Finding an input from an output (one-way)
  • Finding 2 colluding inputs (non-collusion)

4. Example: Bitcoin

The Bitcoin protocol leverages the "hiding" and "binding" properties:

  • hiding: "one way" generation of public keys from private keys
  • binding: signatures of transactions to transfer bitcoins

The intractable problems ensuring these properties in the context of Bitcoin are posed by Ellicptic curves.

4.1 Intractability of the Elliptic Curve Discrete Logarithm Problem (ECDLP)

For elliptic-curve-based protocols, it is assumed that finding the discrete logarithm of a random elliptic curve element with respect to a publicly known base point is intractable = infeasible: this is the "elliptic curve discrete logarithm problem" (ECDLP). The security of elliptic curve cryptography depends on the ability to compute a point multiplication and the inability to compute the multiplicand given the original and product points. The size of the elliptic curve determines the difficulty of the problem. [wikipedia]

Discrete logarithm problem

It is the problem of finding solutions x to the equation g^x = h given elements g and h of a finite cyclic group G. [wikipedia]

Finite cyclic group

Cyclic group

group that is generated by a single element. [wikipedia]

Group [wikipedia]

  • set: collection of distinct elements
    • equipped with a binary operation: calculation that combines two elements (called operands) to produce another element (e.g addition, mutiplication...)
    • such that 4 properties are satisfied, noting + the binary operation on a group G
      • closure: (a, b) in G => a + b in G
      • associativity: (a + b) + c = a + (b + c)
      • identiy: it exists an identity element e | a in G => a + e in G
      • invertibility: for all a in G, a has an inverse element i.e. it exists b | a + b = e

Intuitive example of a cyclic group - clock (12PM format): all hours generated by addition of 1 hours, cyclic because all hours value are decreased when higher than 12: 9:00 + 4:00 = 13:00 = 1:00

Application

In the case of the ECDLP, the finite cyclic group chosen is an elliptic curve over a finite field equipped as binary operation with the elliptic curve point multiplication and with infinity as identity element.
Elliptic curve: curve define by y² = x³ +ax +b, where 4a³ + 27b² !== 0. The condition on a and b is to avoid singular points (points of self intersection or points where the tangents of each branch are equal).
Examples of Elliptic curves
Ellipitic curve point multiplication
- addition P + Q = R = take the symmetric point over the horizontal axis of the intersection of the line (PQ) with the elliptic curve
- identity = infinity: indeed P + Q = R => P + Q - R = 0 = infinity ((PQ) crosses the elliptic curve only in a third point R, no fourth intersection point).
- doubling: particular addition case where P = Q, 2 * P = P + P = take the symmetric point over the horizontal axis of the intersection of the tangent in P with the elliptic curve.
- scalar multiplication: addition + doubling, n * P = P + (n-1)P = P + (P + ....(P + (P + P))
Elliptic curve point operation illustration

Elliptic curve over finite field

In the context of Bitcoin or Ethereum or Blockchain protocols, we want to generate address that have a fixed/finite length. More precisely the Bitcoin public key are 512 bits. So we can't work with infinite numbers. This is why we define the elliptic curve over a finite field of integers:
y² mod p = (x³ + ax + b) mod p.
The general point operations definitions remain valid.
In the case of elliptic curves over a finite field, the generator G of the group is called the base point.
The order is the smallest positive number n | n*G = 0 (= infinity) (number of times the point can be added to itself until its slope is infinite, or a vertical line)
Animation elliptic curve over finite field
Here one can draw the points of an elliptic curve over a finite field.

4.2 Elliptic Curve Discrete Signing Algorithm (ECDSA)

The Bitcoin protocol combines the use of an elliptic curve with following parameters (known as secp256k1.

Parameter Value
Elliptic Curve a = 1, b = 7 => y² = x³ + 7
modulo of the field (prime) in hexadecimal 2²⁵⁶ – 2³² – 2⁹ – 2⁸ – 2⁷ – 2⁶ – 2⁴ – 1 = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE FFFFFC2F
G = base point that generates the cyclic group (in hexadecimal) 04 79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798 483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419 9C47D08F FB10D4B8
n = order (in hexadecimal) FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141

To go further and read about why and how these parameters were chosen: link, link.

Further notations:

Notation
z message
d private key
Q public key
G base point
n order

Public-key & private-key

The public key is derived from the private key by scalar multiplication of the base point a number of times equal to the value of the private key.
1.Choose randomly a private key (256-bit integer) d
2.Apply elliptic curve scalar multiplication: public key = private key * base point <-> P = d * G

So given a point P (public key), finding x means finding how many times to substract P to itself to land back on G (base point).
This problem is an ECDLP and is intractable.
So it is computationally infeasible to derive the private key corresponding to a given public key.

Signatures

Sign message, starting point: a message and a private key.

1.Select a random (or generated deterministically in a secret way) integer k from [1,n-1] (reminder: n is the order)
2.Calculate by elliptic point scalar multiplication the point (x, y) = k * G (reminder G is the base point)
3.Calculate r = x mod n. If r = 0, back to 1
4.Caculate s = (z + rd)k⁻¹ mod n. If s = 0, return to step 1. (reminder: we are on an elliptic curve over a finite cyclic group, especially over group where invertibility is fulfilled)
5.Signature is the pair (r, s)

Verify signature

Let Bob be the recipient of a message signed by Alice. Bob must have a copy of Alice's public-key curve point Q.

Validity of public key

-[x] Q != 0 (Q is not infinite)
-[x] Q is on the curve (use curve equation)
-[x] n * Q = 0 (base point * Q is infinite)

Validity of signature, starting point: a message, a signature and a public key

1.[x] r and s are integers
2.Calculate u = zs⁻¹ mod n and v = rs⁻¹
3.Calculate the following elliptic curve point operation: C = (x, y) = u*G + v*Q
4.[ ] if (x, y) = 0 (infinite) the signature is invalid
5.[x] if r = x mod n the signature is valid

Correctness of algorithm

Let C be the point u*G + v*Q.

C = uG + vQ
    = uG + vdG (definition of public key: Q = dG)
    = zs⁻¹G + rs⁻¹dG
    = s⁻¹(z + rd)G

Signature valid <-> s = (z + rd)k⁻¹
                <-> C = ((z + rd)k⁻¹)⁻¹(z + rd)G = (z + rd)(z + rd)⁻¹kG = kG
                <-> x = r mod n (definition of r)

Note

I lied for the sake of simplicity. In reality the very first step is to hash the data with SHA256 to generate a number containing the same number of bits (256) as the order of the curve (z = hash(message)). SHA256 also makes the address quantum resistant. Quantum computers could reverse the point scalar multiplication to get the private address. Quantum computers cannot reverse hash functions.

Summary

To fulfill confidentiality, authencity and integrity, we look for mechanisms to ensure "hiding" and binding. We are looking for functions that are collusion and preimage resistant.
Asymmetric cryptography is about relying on the intractability of some mathematical problems to guarantee these properties.
In the case of Bitcoin (or Ethereum) this problem is the Elliptic Curve Discrete Logarithm Problem.
It is the foundation of the generation of public keys from private keys and of digital signatures.


You'll only receive email when Gauthier Riou publishes a new post

More from Gauthier Riou