July 13, 2014

Blind Birthday Attack

Here's a crypto brain teaser I thought of while having a twitter discussion about double-HMAC and side channels. The solution is given in the next section, but you should try to solve it on your own first.

Blind Birthday Attack Problem: An attacker can query an oracle which takes two messages B1 and B2 as input, computes HMAC-SHA256(K, B1) and HMAC-SHA256(K, B2), for some secret 256-bit key K, and returns the length of the common prefix of the two HMACs (the number of bits that are the same up until the first difference). Can the attacker find a pair of distinct messages B1 and B2 such that HMAC-SHA256(K, B1) is equal to HMAC-SHA256(K, B2) in significantly less than 2256 queries?

If it's not clear from the description: The oracle keeps the actual HMAC values a secret, only telling the attacker the length of the common prefix between the two. In other words, you're trying to find a collision without knowing what the actual colliding values are. The answer should not involve breaking HMAC-SHA256 or guessing K.

I posted the problem a few days before I posted the solution. Only one person, Sam Schinke, solved it in that time.


The answer is yes, it is possible to find a collision in about 2128 queries (ignoring log factors). To see how, first realize that if we sent 2128 queries with random messages, then because of the birthday paradox, it's very likely that the HMACs of some pair of them would collide. It's very unlikely that any of the pairs we sent (we have to send a pair in each query) will collide, but maybe the B1 from our 23727th query collided with the B1 from our 9272617197227th query. If we just sent 2128 random queries, we would never find those cross-query collisions.

So creating a collision is easy: Any set of 2128 messages will probably create an HMAC collision somewhere. It's finding that collision that's hard.

We can find the collision by building a sort of "blind" binary search tree. This tree will have the following properties:

  1. Each node can have 0, 1, or 2 children.
  2. Each child is either a "left child" or a "right child."
  3. A node can only have one "left child" and one "right child." This means for a node with only one child, that child is either a left child or a right child. For a node that has two children, it has one left child and one right child.
  4. There is one node in the tree for each message (one-to-one mapping between nodes and messages).
  5. For a node in the Nth level of the tree (the root is level 1):
    • If it has a right child, the first N bits of the node's message's HMAC matches the first N bits of the right child's message's HMAC.
    • If it has a left child, the first N-1 bits of the node's message's HMAC matches the first N-1 bits of the left child's message's HMAC, and the Nth bit differs.

All we have to do is give the tree a root node, with a random message, and then keep adding random messages to the tree until we find a collision. The process for adding a message to the tree is simple. You should convince yourself that the tree properties are preserved:

Adding a Message to the Tree:

  1. Set the CURRENT node to the root node.
  2. Set LEVEL = 1.
  3. Query the Oracle for (MESSAGE, CURRENT) and get the length of the common prefix LENGTH.
  4. If LENGTH == 256 (MESSAGE, CURRENT) is a collision. Stop.
  5. If LENGTH >= LEVEL, then set CURRENT to CURRENT->RIGHT. Or, if CURRENT doesn't have a right child, set CURRENT->RIGHT = MESSAGE and stop.
  6. Otherwise, set CURRENT to CURRENT->LEFT. Or, if CURRENT doesn't have a left child, set CURRENT->LEFT = MESSAGE and stop.
  7. Set LEVEL = LEVEL + 1.
  8. Go back to Step 3.

That's it. Just keep repeating that with random messages until a collision is found. It's guaranteed to find the first HMAC collision that occurs in the set of messages you add, so overall you should expect to find a collision after about 2128 messages have been added. The tree's maximum depth is 256 (since a 257th level would imply a collision by property 5), so the number of oracle queries should be around 2136, much less than the 2256 queries that would be required if the oracle didn't return the common prefix length.


I wrote a Ruby implementation of this attack against HMAC truncated to 32 bits (so that it finishes in a reasonable amount of time). Here's the output, and the source code follows:

Closest collision so far: 1
Tree size: 0
Closest collision so far: 2
Tree size: 1
Closest collision so far: 3
Tree size: 4
Closest collision so far: 7
Tree size: 5
Closest collision so far: 9
Tree size: 11
Closest collision so far: 10
Tree size: 52
Closest collision so far: 14
Tree size: 86
Closest collision so far: 16
Tree size: 200
Closest collision so far: 17
Tree size: 347
Closest collision so far: 18
Tree size: 407
Closest collision so far: 19
Tree size: 679
Closest collision so far: 20
Tree size: 2895
Closest collision so far: 22
Tree size: 3256
Closest collision so far: 23
Tree size: 6067
Closest collision so far: 25
Tree size: 6678
Closest collision so far: 29
Tree size: 12976
Closest collision so far: 30
Tree size: 13006
Closest collision so far: 32
Tree size: 33425
Found a collision amongst 33425 in 445046 queries!
Message 1: 8db04aea6b6b8d3a80d93d7064ec78a0bd24a8cba3a56d3c2f8755d8a9b63a40
Message 2: 99cec515fff6c583134b0942c0e6381ebdb10c07b472f58fd74e1cb0fbb684b8

Here's the source code:

# Blind Birthday Attack. Proof of concept.
# https://defuse.ca/blind-birthday-attack.htm

require 'securerandom'
require 'openssl'

# Use 32-bit HMAC so that it's actually possible to perform the attack in
# a reasonable amount of time. This value must be a multiple of 8.

# Compute HMAC-SHA256
def hmac(key, message)
  return OpenSSL::HMAC::digest('SHA256', key, message)[0, HMAC_BITS/8]

# Here's the oracle, which we are attackking.  t has a 256-bit random secret
# key. When we give it two inputs a and b, it HMACs them both, and tells us how
# much of the HMACs match.
KEY = SecureRandom.random_bytes(32)
def oracle(a, b)
  h1 = hmac(KEY, a)
  h2 = hmac(KEY, b)
  h1_binary = h1.bytes.map { |c| c.to_s(2).rjust(8,'0') }.join('').split('')
  h2_binary = h2.bytes.map { |c| c.to_s(2).rjust(8,'0') }.join('').split('')
  0.upto(HMAC_BITS - 1) do |i|
    if h1_binary[i] != h2_binary[i]
      return i
  return HMAC_BITS

# Here's an implementation of the attack.

# We organize messages into a tree to find collisions.
class TreeNode
  attr_accessor :message, :left, :right

# We need a good supply of unique messages.
def random
  return SecureRandom.random_bytes(32)

def attack
  # Start the tree with one random message.
  root = TreeNode.new
  root.message = random()

  # Keep track of how much resources we use for the attack.
  queries = 0
  tree_size = 0
  closest = 0

  # Each time this outer loop iterates, a new message is added to the tree.
  loop do

    # Generate a new random message to add to the tree.
    newnode = TreeNode.new
    newnode.message = random()

    # Start inserting at the root of the tree.
    current = root

    # Keep track of how deep we are into the tree.
    # This is the number of HMAC bits that are known to match between the
    # 'current' node's message and the newnode's message.
    matching = 0

    # Each time this inner loop iterates, we go one level deeper into the tree.
    loop do

      # Ask the oracle how many bits the two HMACs match.
      thismatch = oracle(current.message, newnode.message)
      queries += 1

      # If we have a better match than ever before, tell the user, so that they
      # don't get bored and quit the attack before it finishes.
      if thismatch > closest
        closest = thismatch
        puts "Closest collision so far: #{thismatch}"
        puts "Tree size: #{tree_size}"

      # If we found a collision, tell the user and stop the attack.
      if thismatch == HMAC_BITS && current.message != newnode.message
        puts "Found a collision amongst #{tree_size} in #{queries} queries!"
        puts "Message 1: #{current.message.unpack("H*")[0]}"
        puts "Message 2: #{newnode.message.unpack("H*")[0]}"

      # If the (matching+1)st HMAC bit matches, we go right. Else, left.
      if thismatch > matching
        # If the right node is nil, this is where we add the new node.
        if current.right.nil?
          current.right = newnode
          tree_size += 1
        # Otherwise, just move on to the next level.
        current = current.right
        # If the left node is nil, this is where we add the new node.
        if current.left.nil?
          current.left = newnode
          tree_size += 1
        # Otherwise, just move on to the next level.
        current = current.left

      # We've moved down to the next level.
      matching += 1