[ots-dev] Stamp/Beacon Trees for Secure, Precise, Trusted Timestamping and Clock Synchronization

Peter Todd pete at petertodd.org
Thu Jan 21 22:22:31 UTC 2021


# Abstract

Merkle trees are commonly used to scale timestamping and clock synchronization
protocols, with examples including Guardtime (trusted timestamping) and
Roughtime (clock synchronization). However, the standard approach of
timestamping just the tip of the merkle tree suffers from poor precision,
especially if signatures are recorded in a calendar for auditing. This creates
a fundamental conflict between timestamp precision and scalability.

Here we propose a new construction, the Stamp/Beacon Tree, which greatly
improves on the precision of standard merkle trees by including per-leaf
timestamps that commit to exactly when a given leaf digest was both received,
and published.

This new approach allows for a highly scalable trusted timestamping and random
beacon service, with precision limited only by network latency.  Additionally,
we'll show how to use stamp/beacon trees for clock synchronization, providing a
secure Network Time Protocol (NTP) replacement, with better precision than
Roughtime.


# Cryptographic Construction and Attested Guarantees

Similar to other protocols such as Roughtime, a stamp/beacon tree is a merkle
tree, timestamped with a signature from a trusted notary that signs a
commitment to the merkle tree, an absolute time T, and a radius ε indicating
the uncertainty of the time. The notary receives one or more client-submitted
digests, builds a tree, and then returns to each client a unique merkle path
from that digest's leaf to the tip.

Unlike existing protocols, each leaf i in a stamp/beacon tree commits to the
following:

    d_i - Client-submitted digest.
    n_i - Notary chosen nonce.
    s_i - The delta between time T and when the digest was received (timestamped).
    p_i - The delta between time T and when the nonce was sent (published).

The deltas are the key to the stamp/beacon tree's timing precision. The notary
is attesting that the following is true:

1) Digest d_i was _received_ at true time T_si = T - s_i ± ε
2) Nonce n_i was _sent_ at time T_pi = T + p_i ± ε
3) Prior to being sent, the nonce was a secret.

As usual, the combination of merkle path and signature proves the validity of
the attestation. Note that this means the merkle path must be unique, e.g. by
committing to the total number of leaves in the tree - and thus the expected
merkle path length - or ensuring that leaves can't be confused with inner nodes
via the hashed message length, tags, etc.
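
For concreteness, here's a minimal sketch, in Python, of how a leaf and an
inner node could be hashed, assuming SHA-256, fixed-width serialization of the
deltas, and tag prefixes to keep leaves and inner nodes distinct (the field
names and encoding are illustrative, not an actual wire format):

    import hashlib

    def sha256(b: bytes) -> bytes:
        return hashlib.sha256(b).digest()

    def leaf_hash(d_i: bytes, n_i: bytes, s_i: int, p_i: int) -> bytes:
        # Tag prefix distinguishes leaves from inner nodes; the deltas are
        # serialized here as unsigned 64-bit integers (illustrative only).
        return sha256(b'stamp/beacon:leaf' + d_i + n_i +
                      s_i.to_bytes(8, 'little') +
                      p_i.to_bytes(8, 'little'))

    def inner_hash(left: bytes, right: bytes) -> bytes:
        return sha256(b'stamp/beacon:node' + left + right)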


## Trusted Timestamping

This is the most straightforward use for a stamp/beacon tree. Since the notary
attested to the fact that it received the digest at a specific true time, with
uncertainty ± ε, the notary is also attesting that the digest must have existed
prior to that time. Specifically, prior to the _later_ bound of the
uncertainty, T - s_i + ε.

The nonce does double duty: it prevents clients with sibling leaves in the
tree from learning other clients' digests via brute-force search. In fact, the
OpenTimestamps server already uses a per-leaf nonce.

Additionally, if the actual serialization of the s and p deltas is unsigned
(i.e. the deltas are guaranteed non-negative),
the signature by itself is a valid timestamp: T + ε must be *after* the digest
was submitted, thus the notary is attesting to the fact that the digest existed
*prior* to T + ε. This may be useful for simplified proof validation that does
not need full precision.
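
As a sketch, the bounds a verifier extracts from a single leaf might look like
the following, with T, ε and s_i taken from an already-verified signature and
merkle path (hypothetical names, times in seconds):

    def existed_before(T: float, epsilon: float, s_i: float) -> float:
        # Full precision: the digest was received no later than T - s_i + epsilon.
        return T - s_i + epsilon

    def existed_before_simplified(T: float, epsilon: float) -> float:
        # With unsigned deltas the signature alone bounds the time: the digest
        # existed prior to T + epsilon, without inspecting the per-leaf delta.
        return T + epsilon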


## Random Beacon

Since the notary attests to the fact that the nonce was a secret prior to being
sent - aka published to the world - the nonce acts as a random beacon.
Specifically, the notary is attesting to the fact that the earliest the nonce
value could have been known to the public was after the _earlier_ bound of the
uncertainty, T + p_i - ε.

Again, if the s and p deltas are unsigned, the tip digest of the tree can
itself be used as a random beacon, because it's a commitment to random beacons.
As we'll discuss later, if multicast message delivery is available, it's a good
option to broadcast the tip.

The main reason why per-leaf beacon timing is proposed is bandwidth: being able
to spread out the bandwidth usage of the beacons, rather than having a sudden
spike, reduces the timing jitter caused by network congestion.


# Secure, Precise, Clock Synchronization

Here we have a local clock, and a remote clock, and our goal is to synchronize
the local clock to the remote clock. To do this we need to measure the delta
time, Δ, between the local clock and the remote clock; with Δ measured, the
local clock can be moved forwards or backwards to match the remote clock.
Repeated measurements can also measure the difference in clock frequency
between local and remote.

The Network Time Protocol (NTP) achieves very precise clock synchronization.
But NTP is not resilient to Man-In-The-Middle (MITM) attacks. The Roughtime
protocol is resilient to MITM attacks, but as the name suggests, it only
achieves imprecise synchronization. Here we will combine the principles of both
NTP and Roughtime into a secure, precise clock synchronization protocol using
stamp/beacon trees.


## Threat Model

Since we're trying to achieve secure, high-precision time synchronization, our
threat model (unusually!) has to include attacks on the timing of messages.

We assume the existence of a MITM attacker who is _not_ capable of forging
signatures, breaking hash functions, violating the laws of physics, etc. But
the attacker _is_ capable of causing messages to be:

1. Dropped
2. Delayed
3. Hastened

Dealing with dropped messages is fairly obvious, so we won't go into detail
about them here. Dropped messages of course include corrupted packets, packets
that are so delayed as to exceed retry timeouts, DDoS attacks, etc.

A delayed message is simply the MITM attacker introducing an artificial delay.
A hastened message shows up as a negative delay. While unlikely, an attacker
could in theory accomplish this by rerouting a message to a network link with a
lower latency than the usual route, e.g. via a more direct network path, or even
a low-latency radio or laser link (light travelling through a fiber optic cable
is about 30% slower than EM waves travelling through air/vacuum).


## Delta Time Measurement

As in Roughtime, each delta time measurement is performed by the client
generating a nonce, and asking the notary to timestamp that nonce. Replay
attacks are prevented by the uniqueness of the nonce.

As in NTP, this four-step exchange (request sent, request received, response
sent, response received) results in four different time measurements: t_s and
t_p with respect to the client's local clock, and T_s and T_p with respect to
the notary's clock. We'll refer to the network latency of the request and
response as l_s and l_p respectively.

Finally, we have a MITM attacker, who can artificially delay the request and
response messages. We'll refer to those delays as w_s and w_p. That gives us the
following timing diagram:

    Client: t_s                                      t_p
                ⭨                                   ⭧
    Latency:      l_s + w_s               l_p + w_p
                            ⭨           ⭧
    Notary:                   T_s → T_p

We can assume that the difference in frequency between client and notary clock
is negligible: even the cheapest quartz oscillator will have better than 0.01%
frequency accuracy. With that assumption, we can derive two equations for the
relationship between each pair of time measurements:

    t_s + Δ + l_s + w_s = T_s ± ε
    T_p ± ε + l_p + w_p = t_p + Δ

...isolating the delta time difference:

    Δ = T_s ± ε - t_s - (l_s + w_s)
    Δ = T_p ± ε - t_p + (l_p + w_p)

Neither equation can be solved in isolation, because the latencies in each
direction are unknown. But if we add them together we get:

        T_s + T_p   t_s + t_p       (l_p + w_p) - (l_s + w_s)
    Δ = ───────── - ───────── ± ε + ─────────────────────────
            2           2                       2

The first two terms are our true delta time difference; the third term ± ε is
the notary attested error bounds (which adds directly due to the delta encoding
of the attestation). In a perfect world with no network delay, and no attacker,
they would be our only terms.

The final term comes from the network latencies, both real and any artificial
latency introduced by the attacker (or maybe removed!). There are two key
things to understand about the impact of latency on the uncertainty of the
measurement:

1. Symmetrical latency cancels out: if messages take the same time in both
   directions, latency has no impact on the measurement.

2. Asymmetric latency is undetectable: without additional knowledge it is
   fundamentally impossible to know the ratio of the two message latencies.

Usually networks have close to symmetrical latency; NTP relies on a symmetric
latency assumption. Conversely an attacker who wants to skew our clock will
introduce an asymmetric latency.
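
As a numeric sketch of the combined formula (values illustrative, times in
seconds):

    def measure_delta(t_s: float, t_p: float,
                      T_s: float, T_p: float) -> float:
        # True delta plus the unknown latency/attacker term; the attestation's
        # +/- epsilon adds on top of whatever this returns.
        return (T_s + T_p) / 2 - (t_s + t_p) / 2

    # Example: the client clock is 2.0s behind the notary, with a symmetric
    # 20ms latency each way and the notary holding the request for 10ms:
    delta = measure_delta(t_s=100.000, t_p=100.050,
                          T_s=102.020, T_p=102.030)
    assert abs(delta - 2.0) < 1e-9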


### One-Shot Measurement

Suppose nothing is known about the communication channel between client and
notary, and one delta time measurement is performed. What does the client
learn?

We'll call the sum of the latencies L. The client can measure the sum latency:

    L = l_s + w_s + l_p + w_p
      = (t_p - t_s) - (T_p - T_s)

The worst possible scenario is if the latency is completely asymmetric, and
entirely caused by a MITM attacker (l_s = l_p = 0). Since the attacker can
choose which direction to delay, the worst case uncertainty from MITM attack is
simply ±L/2.
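
Continuing the numeric sketch from above, the worst-case bound is
straightforward to compute:

    def one_shot_bounds(t_s, t_p, T_s, T_p, epsilon):
        # Total measurable latency: client round trip minus notary hold time.
        L = (t_p - t_s) - (T_p - T_s)
        delta = (T_s + T_p) / 2 - (t_s + t_p) / 2
        # Worst case: a MITM attacker injected all of L in one direction.
        return delta, L / 2 + epsilon

    delta, worst = one_shot_bounds(100.000, 100.050, 102.020, 102.030, 0.001)
    # L == 40ms, so the true delta lies within 2.0 +/- 0.021 seconds.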


### Repeated Measurements

Suppose the delta time measurements are repeated. What does the client learn?

Real-world network latencies vary due to network congestion, routing changes,
etc. Additionally, attacks are rarely continuous.

Using the above worst case analysis, each sample can be thought of as a
constraint on what the present time could be. Due to the uncertainty in the
local clock's frequency relative to the notary, that constraint degrades over
time. As calculating these constraints is relatively straightforward, we won't
go into more details. But it's worth noting that constraint degradation happens
relatively quickly with standard computer hardware: a frequency error of
±50ppm, typical for a cheap quartz oscillator, is a ±180ms/hour drift. Thus
prior delta time samples become out of date relative to new ones in typical
conditions in a matter of minutes.
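
For instance, the widening of an old constraint due to frequency error alone is
a one-liner (a sketch using the ±50ppm figure above):

    def widened_uncertainty(base_uncertainty: float, age: float,
                            freq_error_ppm: float = 50.0) -> float:
        # An old sample's bounds widen linearly with age at the local
        # oscillator's maximum frequency error (times in seconds).
        return base_uncertainty + age * freq_error_ppm * 1e-6

    # A +/-1ms measurement taken an hour ago is now only good to ~+/-181ms:
    widened_uncertainty(0.001, 3600)   # ~= 0.181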


#### Discontinuous Attacks/Measuring Minimum Latency

A special case of repeated measurements is to assume that:

1) Latency is symmetric.
2) The MITM attacker can only delay packets, not hasten their arrival.
3) At some point, the attack stops.

With these assumptions the minimum latency can be measured, and subsequently,
subtracted from the measured latency to get a tighter uncertainty bound. With
dedicated near-light-speed communication channels such as radio links, and
careful link characterization, this approach could be used to achieve secure
clock synchronization with accuracy almost as good as existing non-MITM-secure
approaches.
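
A sketch of how a client might apply these assumptions: keep the smallest total
latency ever observed as an estimate of the channel's true (symmetric) latency,
and attribute only the excess over that minimum to a potential attacker (clock
drift between samples is ignored here for simplicity):

    def refine_bounds(samples, epsilon):
        # samples: list of (t_s, t_p, T_s, T_p) tuples from repeated
        # measurements against the same notary.
        L_min = min((t_p - t_s) - (T_p - T_s) for t_s, t_p, T_s, T_p in samples)
        best = None
        for t_s, t_p, T_s, T_p in samples:
            L = (t_p - t_s) - (T_p - T_s)
            delta = (T_s + T_p) / 2 - (t_s + t_p) / 2
            # Only latency in excess of the observed minimum could have been
            # injected by an attacker in one direction.
            uncertainty = (L - L_min) / 2 + epsilon
            if best is None or uncertainty < best[1]:
                best = (delta, uncertainty)
        return best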


# Scaling via Untrusted Aggregation Servers

The security requirements of trusted servers make achieving high performance
expensive for many reasons such as the difficulty of removing heat from
tamper-detecting enclosures, the desire to use simpler microprocessor
architectures that are easier to audit, etc.

Thus, we want to scale our trusted notaries with untrusted aggregation servers.
To recap, an aggregation server accepts n digests from clients, creates a
merkle tree, and requests that the notary timestamp the tip of that merkle
tree. Effectively, we're just adding more layers to the tree - the
OpenTimestamps proof format and calendar software already supports aggregation.
In fact, OpenTimestamps proofs even allow for an arbitrary number of aggregation
layers: the proof format doesn't care how many merkle trees were built.
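
A minimal sketch of the aggregation step itself: batch the client digests into
a merkle tree and hand each client the path from its digest back to the tip
(per-leaf nonces, tagged hashing, and the actual submission of the tip to the
notary are omitted for brevity):

    import hashlib

    def h(b: bytes) -> bytes:
        return hashlib.sha256(b).digest()

    def aggregate(digests):
        # Returns (tip, paths), where paths[i] recomputes the tip from digests[i].
        assert digests
        level, pos = list(digests), list(range(len(digests)))
        paths = [[] for _ in digests]
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])          # duplicate the odd node out
            for i, p in enumerate(pos):
                paths[i].append((level[p ^ 1], p % 2 == 0))
                pos[i] = p // 2
            level = [h(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
        return level[0], paths

    def verify(digest, path, tip):
        node = digest
        for sibling, sibling_is_right in path:
            node = h(node + sibling) if sibling_is_right else h(sibling + node)
        return node == tip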

Our question here is: how does this impact timing uncertainty? Let's look at a
timing diagram for an aggregation round:

    Clients:   t_s_0   ...  t_s_n                         t_p_0 ...  t_p_n
                    ⭨            ⭨                       ⭧          ⭧
    Aggregator:                    t_s               t_p
                                       ⭨           ⭧
    Notary:                              T_s → T_p

Since the aggregator is _not_ trusted, an aggregation layer is simply an
increase in overall end-to-end latency. The most fundamental limit here is how
often merkle trees are built: if the trusted notary can accept f requests per
second, the worst-case interval between an aggregator receiving a request from
a client, and the aggregator sending the merkle tip, must be _at least_ 1/f.

The logic also holds true for each additional aggregation layer. However, with
each aggregation layer able to scale total req/s by a factor of hundreds of
thousands on modern hardware, it's hard to imagine a scenario where you would
need more than two or three layers.

The fundamental limit is unlikely to be the limiting factor in practice: even
an original 700MHz Raspberry Pi can do well over 200,000 SHA256 operations per
second, more than enough to saturate the ~5,000 req/s its 100BaseT ethernet
interface is capable of. 5,000 req/s adds just 0.2ms of inherent max latency -
other sources of latency are likely to dominate.

Aggregation does create an interesting fairness/incentives problem: repeated
requests can get more or less lucky as to where a given request ends up
relative to when the aggregated request is actually sent. Implementations may
want to add artificial delays to ensure that repeated requests don't get
substantially better precision.


# Multicast Beacon Delivery

While the leaves of the merkle tree are unique to each client, the stamp/beacon
tree tip is common to all, and itself constitutes a valid random beacon with
attested time T±ε. Thus if low-latency multicast messaging mechanisms are
available such as radio, broadcast ethernet, multicast routing, etc, tighter
timing bounds may be achievable by using them to broadcast the tip.

For clock synchronization, the tip can be used instead of the per-leaf beacon.
Whether or not this actually reduces uncertainty depends on the multicast
technology used, and the security assumptions. In particular, whether or not
latency to/from the notary is assumed to be symmetric.

-- 
https://petertodd.org 'peter'[:-1]@petertodd.org