Challenge 3: Secret Key Generation in Massive MIMO OFDM Under One-Wavelength Eavesdropping

Abstract

Secret key generation (SKG) exploits channel reciprocity and randomness to derive shared cryptographic keys from wireless fading coefficients. While classical models assume decorrelation beyond half-wavelength separation, practical indoor propagation may violate this assumption. This chapter introduces a security challenge evaluating SKG robustness in a massive MIMO orthogonal frequency division multiplexing (OFDM) system when a passive eavesdropper is located approximately one wavelength from a legitimate node. Using 64-antenna ULA and 100-subcarrier channel state information from the Ultra-Dense Indoor MaMIMO dataset [1], and building upon the full-chain SKG framework in [2], the challenge provides reconciliation artifacts and invites the research community to attempt key recovery or entropy reduction attacks.

SKG Protocol: We consider a time-division duplex (TDD) massive MIMO OFDM system in which Alice and Bob aim to generate a shared secret key from reciprocal channel observations H_AB(f_k) ≈ H_BA(f_k) ∈ C^{M×1}, where M is the number of ULA antenna elements, N is the number of active OFDM subcarriers, and f_k is the k-th subcarrier frequency, k = 1, …, N. A passive eavesdropper, Eve, located approximately one wavelength λ from Bob, observes a correlated channel H_AE(f_k) ∈ C^{M×1}. The SKG protocol extracts shared randomness in three steps: quantization, reconciliation via a public syndrome s_A, and privacy amplification. Let r_A, r_B, and r_E denote the bit sequences derived by Alice, Bob, and Eve, respectively, after quantization. Security is evaluated via the conditional min-entropy

$$H_\infty(r_A \mid r_E, s_A),$$

which captures the residual uncertainty in Alice’s bit sequence given all information available to Eve. Contrary to classical fading assumptions, indoor multipath may induce structured spatial correlation even at one-wavelength separation, making robust key extraction non-trivial.
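To make the security metric concrete, the following minimal sketch computes an average-case conditional min-entropy, H_∞(X | Y) = −log₂ Σ_y max_x P(x, y), from a small joint distribution. The function name, the dictionary layout, and the toy probabilities are illustrative assumptions and are not taken from the challenge material.

```python
import math
from collections import defaultdict


def conditional_min_entropy(joint):
    """Average-case H_inf(X | Y) in bits.

    `joint` maps (x, y) pairs to probabilities that sum to 1.
    """
    best = defaultdict(float)  # y -> max over x of P(x, y)
    for (x, y), p in joint.items():
        best[y] = max(best[y], p)
    guess_prob = sum(best.values())  # success probability of the best guess of x given y
    return -math.log2(guess_prob)


# Toy case: one key bit X; Eve's bit Y agrees with X 75% of the time.
correlated = {(0, 0): 0.375, (0, 1): 0.125, (1, 0): 0.125, (1, 1): 0.375}
# Toy case: Eve's bit is independent of the key bit.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

h_correlated = conditional_min_entropy(correlated)
h_independent = conditional_min_entropy(independent)
```

In this toy setting a full bit of min-entropy remains only when Eve's observation is independent of the key bit; any correlation lowers it.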

Dataset

The original Ultra-Dense Indoor MaMIMO dataset [1] is available at this link: https://dx.doi.org/10.21227/nr6k-8r78. The dataset is collected using the KU Leuven ESAT-TELEMIC massive MIMO testbed, which consists of a base station equipped with 64 patch antennas and four user equipments (UEs). Measurements are performed in an indoor 3 m × 3 m area using a time-division duplex (TDD) system with OFDM modulation. During data acquisition, the base station simultaneously receives orthogonal pilot signals from the four UEs and performs channel estimation. Each channel state information (CSI) sample is represented as a complex matrix of size 64 × 100 corresponding to 64 antennas and 100 subcarriers. The system operates at a center frequency of 2.61 GHz with a 20 MHz bandwidth. The dataset is designed for high-precision indoor localization and sensing applications. It includes measurements under three antenna array configurations: a uniform linear array (ULA), a uniform rectangular array (URA), and distributed linear arrays (DIS). The UEs are moved along a controlled zigzag trajectory. The ground truth positions are collected with less than 1 mm error, and the movement step size is 5 mm. In total, 252004 CSI samples are collected across the four UEs. The dataset supports research in fine-grained positioning, multi-device localization, and applications such as smart environments and robotic positioning.

Figure 1: UE positions in the wavelength-scaled sampling grid.

Wavelength-scaled Spatial Sampling: We constructed a wavelength-based spatial sampling grid from the original CSI data. The grid is shown in Fig. 1. This approach enables the structured collection of CSI snapshots at precise spatial locations, allowing for coherent analysis of how channels evolve across distinct propagation paths.

The spatial grid is constructed using wavelength-scaled sampling. Given the wavelength λ ≈ 11.5 cm and the UE movement step of 0.5 cm, we define a sampling resolution parameter that represents the number of measurement samples per wavelength (here λ / 0.5 cm ≈ 23). For each UE, the grid construction process identifies measurement indices at discrete wavelength intervals, creating a regular pattern of CSI observations spaced approximately one wavelength apart. The resulting grid (Fig. 1) spans multiple wavelengths and forms a 12 × 12 structure for each of the four UEs, for a total of 576 measurement points. All extracted grid indices, together with the corresponding CSI measurements and position coordinates, undergo validation to ensure that the grid dataset provides reliable wavelength-scaled spatial sampling.
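The arithmetic behind the grid can be checked with a short sketch. The carrier frequency, step size, and grid dimensions are the values quoted in this section; the variable names and the idea of expressing grid offsets in units of movement steps are assumptions made only for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s
CARRIER_HZ = 2.61e9             # center frequency stated for the dataset
STEP_M = 0.005                  # UE movement step (5 mm)
GRID_SIDE = 12                  # 12 x 12 grid per UE
NUM_UES = 4

wavelength_m = SPEED_OF_LIGHT / CARRIER_HZ

# Sampling resolution: number of 5 mm steps closest to one wavelength.
samples_per_wavelength = round(wavelength_m / STEP_M)

# Offsets (in movement steps) of the grid points along one axis.
axis_offsets = [i * samples_per_wavelength for i in range(GRID_SIDE)]

total_points = GRID_SIDE * GRID_SIDE * NUM_UES
```

How these offsets map to sample indices in the original dataset depends on the zigzag trajectory, which this sketch does not model.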

To support both uplink and downlink channel analysis for SKG purposes, the grid indices identify paired measurement points that emulate channel reciprocity conditions. For each grid point designated as an uplink measurement, we designate a corresponding downlink measurement taken at the consecutive snapshot of the original dataset (5 mm away). This uplink-downlink pairing enables a direct comparison of the two observations and facilitates the study of channel reciprocity properties for SKG.
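One simple way to compare a paired uplink and downlink observation is a normalized correlation magnitude between the two CSI vectors. The sketch below is a generic similarity measure, not a metric prescribed by the challenge; the function name and the toy vectors are assumptions.

```python
import math


def normalized_correlation(h_a, h_b):
    """|<h_a, h_b>| / (||h_a|| * ||h_b||) for two equal-length complex vectors."""
    if len(h_a) != len(h_b):
        raise ValueError("vectors must have the same length")
    inner = sum(a * b.conjugate() for a, b in zip(h_a, h_b))
    norm_a = math.sqrt(sum(abs(a) ** 2 for a in h_a))
    norm_b = math.sqrt(sum(abs(b) ** 2 for b in h_b))
    return abs(inner) / (norm_a * norm_b)


# Toy vectors: the "downlink" equals the "uplink" up to a common phase rotation.
h_up = [1 + 1j, 2 - 1j, -0.5j]
h_down = [x * 1j for x in h_up]
rho_pair = normalized_correlation(h_up, h_down)

# Orthogonal vectors give zero correlation.
rho_orth = normalized_correlation([1 + 0j, 1 + 0j], [1 + 0j, -1 + 0j])
```

Taking the magnitude makes the measure insensitive to a common phase offset between the two snapshots, which may or may not be desirable depending on the quantizer input.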

Challenge Description

No additional software is required beyond standard MATLAB libraries.

Despite the principle of channel reciprocity, practical channel measurements are corrupted by noise, hardware imperfections, and time-variant phenomena. This can lead to random variations and inconsistencies in reciprocal CSI observations. When these CSI estimates are quantized, the granularity of the quantization levels and the magnitude of the channel mismatch can result in different binary representations at Alice and Bob. These mismatches directly limit the achievable secret key rate if no corrective measures are taken. To overcome this fundamental challenge, distributed source coding techniques are adopted, specifically Slepian-Wolf coding [3]. This enables efficient reconciliation [3, 4] of the mismatched sequences through structured error correction, allowing Alice and Bob to agree on a common binary key despite their imperfect observations.

The key generation process includes three main steps: quantization, information reconciliation, and privacy amplification. To extract binary key material from the continuous-valued channel observations, uniform quantization is applied to both uplink and downlink CSI. We use a 2-bit uniform quantizer to quantize the CSI values.
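A 2-bit uniform quantizer can be sketched as follows. The source does not state which real-valued CSI feature is quantized, how the quantizer range is chosen, or how bins are labeled, so the min/max range and the Gray labeling used here are assumptions made only for illustration.

```python
GRAY_LABELS = ["00", "01", "11", "10"]  # assumed labeling; adjacent bins differ in one bit


def uniform_quantize_2bit(values, lo=None, hi=None):
    """Quantize real values into 4 equal-width bins and return the bit string."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi <= lo:
        raise ValueError("quantizer range must have positive width")
    width = (hi - lo) / 4
    bits = []
    for v in values:
        idx = int((v - lo) / width)
        idx = min(3, max(0, idx))  # clamp values on or outside the range edges
        bits.append(GRAY_LABELS[idx])
    return "".join(bits)


# Toy input, e.g. normalized CSI magnitudes.
quantized = uniform_quantize_2bit([0.0, 0.3, 0.6, 1.0])
```

With Gray labels, a small mismatch that pushes a sample into a neighboring bin flips a single bit, which tends to reduce the number of errors left for reconciliation.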

Reconciliation: To reconcile the mismatched binary sequences of Alice and Bob, Slepian-Wolf coding is implemented using Polar codes with a cyclic redundancy check (CRC) as the error-correcting code. Alice generates a syndrome from her quantized sequence and transmits it to Bob over a public channel. Using this syndrome together with his own observation, Bob runs the polar decoder and recovers Alice’s sequence with high reliability despite the channel mismatch. The reconciled sequences form the input to privacy amplification.
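The syndrome-based idea can be illustrated with a toy example. The sketch below deliberately swaps the polar code with CRC for a (7,4) Hamming code, which corrects a single mismatched bit per 7-bit block; it shows only the mechanism (Alice publishes a syndrome, Bob corrects his sequence towards hers) and is not the challenge's reconciliation scheme.

```python
# Parity-check matrix of the (7,4) Hamming code. Column j holds the binary
# representation of j + 1, so a single error at index j has syndrome value j + 1.
H = [[((j + 1) >> b) & 1 for j in range(7)] for b in range(3)]


def syndrome(bits):
    """Syndrome H * bits over GF(2), as a list of 3 bits."""
    return [sum(h * x for h, x in zip(row, bits)) % 2 for row in H]


def reconcile(r_b, s_a):
    """Bob's side: move r_b to the closest sequence whose syndrome equals s_a."""
    # By linearity, syndrome(r_b) XOR s_a is the syndrome of r_a XOR r_b.
    diff = [x ^ y for x, y in zip(syndrome(r_b), s_a)]
    position = diff[0] + 2 * diff[1] + 4 * diff[2]  # 0 means no mismatch detected
    corrected = list(r_b)
    if position:
        corrected[position - 1] ^= 1
    return corrected


r_a = [1, 0, 1, 1, 0, 0, 1]   # Alice's quantized bits (toy values)
r_b = [1, 0, 1, 0, 0, 0, 1]   # Bob's bits, differing from Alice's at index 3
s_a = syndrome(r_a)           # sent over the public channel
r_b_reconciled = reconcile(r_b, s_a)
```

Note that r_a need not be a codeword: the syndrome identifies the coset containing Alice's sequence, and Bob picks the member of that coset closest to his own observation. The same syndrome is visible to Eve, which is why privacy amplification follows.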

Concerning the Polar codes used in the reconciliation block, we use a Gaussian-approximation-based polar code construction [5, 6], which assumes that log-likelihood ratios (LLRs) remain Gaussian distributed through the polar transformation. The idea of the Gaussian approximation is to track the evolution of the LLR densities and thereby estimate the reliability of each bit-channel. For an AWGN channel with noise variance σ², the channel LLR is modeled as LLR ∼ N(μ, 2μ) with μ₀ = 2/σ², where μ₀ is the initial LLR mean. The Gaussian approximation propagates only the mean LLR μ through the polarization process, yielding μ_i for each synthesized bit-channel W⁽ⁱ⁾. Let φ(·) denote the Gaussian approximation check-node function [5], which is evaluated in practice through an accurate empirical fit.

The polarized outputs are given by

$$\mu^{-} = \varphi^{-1}\!\left(1 - \left(1 - \varphi(\mu)\right)^{2}\right), \qquad \mu^{+} = 2\mu,$$

where μ⁻ is the Gaussian approximation estimate of the reliability of the bad channel and μ⁺ is the Gaussian approximation estimate of the reliability of the good channel. The reliability order of the polar code is then found by sorting the μ_i values: larger μ_i correspond to more reliable bit-channels.
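The construction described above can be sketched in a few lines. The recursion used is the standard Gaussian-approximation one, μ⁻ = φ⁻¹(1 − (1 − φ(μ))²) and μ⁺ = 2μ. The two-piece fit of φ below is the widely used one from the density-evolution literature and is an assumption here; the fit used by the challenge authors may differ (for example, the piecewise approximation of [6]). Function names and the bit-channel index convention are also illustrative.

```python
import math


def log_phi(x):
    """Natural log of an assumed two-piece empirical fit of phi(x), x >= 0."""
    if x < 10.0:
        return -0.4527 * x ** 0.86 + 0.0218
    return 0.5 * math.log(math.pi / x) - x / 4.0 + math.log(1.0 - 10.0 / (7.0 * x))


def mu_minus(mu):
    """Solve phi(mu_minus) = 1 - (1 - phi(mu))^2 by bisection in the log domain."""
    log_p = log_phi(mu)
    # 1 - (1 - p)^2 = p * (2 - p); this form avoids cancellation when p is tiny.
    target = log_p + math.log(2.0 - math.exp(log_p))
    lo, hi = 0.0, mu  # phi is (essentially) decreasing, so the root lies in [0, mu]
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if log_phi(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ga_reliabilities(n_levels, sigma2):
    """Mean LLRs of the 2**n_levels bit-channels, starting from mu_0 = 2 / sigma2."""
    mu = [2.0 / sigma2]
    for _ in range(n_levels):
        # Each parent spawns a degraded ("minus") and an upgraded ("plus") channel.
        mu = [child for parent in mu for child in (mu_minus(parent), 2.0 * parent)]
    return mu


def information_set(n_levels, sigma2, k):
    """Indices of the k most reliable bit-channels, in increasing index order."""
    mu = ga_reliabilities(n_levels, sigma2)
    order = sorted(range(len(mu)), key=lambda i: mu[i], reverse=True)
    return sorted(order[:k])


# Toy example: length-8 code designed for sigma^2 = 1, keeping 4 bit-channels.
toy_info_set = information_set(3, 1.0, 4)
```

For the challenge parameters one would call `ga_reliabilities(8, sigma2)` to obtain 256 values, with `sigma2` derived from the 20 dB design SNR under whatever normalization the implementation assumes. Working in the log domain keeps the recursion stable when φ(μ) underflows for large μ.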

The reconciliation process is comprehensively evaluated across a range of code rates at an SNR of 20 dB for a codelength of 256. The reconciliation scheme operates independently at each of the 576 spatial grid points. Hence, for a chosen point on the grid, the reconciliation can output the reconciled vectors for both Alice and Bob. For example, for the point at the coordinates (−1.202, 2.405, 0.4), we show in Fig. 2 the error probability after reconciliation for different code rates. As expected, the error probability increases with the code rate.

Figure 2: Error probability after reconciliation vs Code rate: SNR = 20 dB, Codelength = 256.

Privacy amplification: Following quantization and information reconciliation, privacy amplification is applied to compress the reconciled bit sequence into a shorter secret key while removing any residual information potentially available to the eavesdropper. In this challenge, we do not explicitly run a statistical test to estimate the optimal hashing rate for the considered dataset. Instead, we rely on previously obtained results under a Line of Sight (LoS) static scenario, which is compatible with the propagation conditions of the current dataset [2, 7]. Based on these results, we adopt a conservative hashing rate of R = 0.1, chosen deliberately lower than the estimated requirement to ensure a sufficient security margin. This value may be refined in future updates of the challenge as more precise entropy estimates become available. Once the hashing rate is fixed, privacy amplification is performed by applying a cryptographic hash function to the reconciled bit sequence. Specifically, the reconciled vectors are processed using an AES-128-based hash function, producing a secret key whose length is consistent with the selected compression rate. The hash function is built from a Davies-Meyer compression function (Fig. 3). Let E : K × {0, 1}ⁿ → {0, 1}ⁿ be a block cipher. The Davies-Meyer compression function is given by

$$h(H, m) = E(m, H) \oplus H,$$

where m is the message block, used as the cipher key, and H is the current hash value. Let r be an output vector after reconciliation. At each iteration, we take two 128-bit blocks from r as inputs, and the cipher output of that iteration is XORed with the current hash value. Consequently, each iteration halves the size of r, and the process is repeated on the reduced vector.
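The structure of a Davies-Meyer hash can be sketched with the standard library alone. Because AES is not available there, the block cipher below is a toy 4-round Feistel permutation standing in for AES-128; it is not secure and is not the challenge's cipher. The zero initial value, the zero padding, and the plain block-by-block chaining are likewise assumptions, and the sketch does not reproduce the exact halving iteration described above.

```python
import hashlib

BLOCK = 16  # bytes; matches the 128-bit block and key size of AES-128


def toy_block_cipher(key, block):
    """Stand-in for AES-128: a keyed 4-round Feistel permutation (NOT secure)."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def davies_meyer(h, m, cipher=toy_block_cipher):
    """One compression step: E(m, h) XOR h, with the message block m as the key."""
    return bytes(a ^ b for a, b in zip(cipher(m, h), h))


def dm_hash(message, iv=bytes(BLOCK)):
    """Chain Davies-Meyer over 16-byte blocks (zero-padded tail, assumed zero IV)."""
    if len(message) % BLOCK:
        message += bytes(BLOCK - len(message) % BLOCK)
    h = iv
    for i in range(0, len(message), BLOCK):
        h = davies_meyer(h, message[i:i + BLOCK])
    return h


# Toy input: 32 bytes (256 bits) standing in for a reconciled vector.
reconciled = bytes(range(32))
digest = dm_hash(reconciled)
```

The feed-forward XOR with the previous hash value is what makes the step hard to invert even though the block cipher itself is a permutation. A production design would also need proper length padding, which this sketch omits.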

This obtained key is then used for subsequent cryptographic operations, including one-time pad encryption in the challenge setting.

Input and Output

Inputs provided to the participant:

Eve’s Channel State Information (CSI) H_AE ∈ C^{64×100}, where the 64 rows correspond to the Uniform Linear Array (ULA) antenna elements and the 100 columns correspond to the Orthogonal Frequency Division Multiplexing (OFDM) subcarriers, measured at approximately one wavelength from Bob over a 12 × 12 spatial grid (576 total measurement points across 4 UEs).
Public reconciliation information s_A: the syndrome generated by Alice using Polar codes with Cyclic Redundancy Check (CRC), transmitted over the public channel and therefore fully observable by Eve.
Ciphertext c = m ⊕ k: the One-Time Pad (OTP)-encrypted message, publicly disclosed.
Full Secret-Key-Generation (SKG) protocol specification:
- 2-bit uniform quantization applied to CSI observations.
- Slepian-Wolf reconciliation via Polar codes with CRC (codelength = 256, evaluation Signal-to-Noise Ratio (SNR) = 20 dB).
- Privacy amplification with hashing rate R = 0.1 using an AES-128-based hash function built from a Davies-Meyer compression function.
No access to Alice’s and Bob’s channel observations H_AB and H_BA.

Output expected from the participant:

Recovered plaintext m, obtained by reconstructing the secret key k and computing m = c ⊕ k.

Participants must also submit a description of their methodology, including well-commented code and an explanation of the proposed solution.
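The final step of an attack, m = c ⊕ k, is a bytewise XOR once a candidate key is available. The sketch below uses made-up key and message bytes; the function name is illustrative.

```python
def otp_xor(data, key):
    """XOR two equal-length byte strings (OTP encryption and decryption coincide)."""
    if len(data) != len(key):
        raise ValueError("one-time pad requires a key as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# Toy values only; the real key and ciphertext come from the challenge files.
toy_key = bytes([0x3A, 0x91, 0x5C, 0x07])
toy_plain = b"SKG!"
toy_cipher = otp_xor(toy_plain, toy_key)
recovered = otp_xor(toy_cipher, toy_key)
```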

Evaluation Metric

The objective of challenge participants is to recover the plaintext m from the publicly provided ciphertext c = m ⊕ k, where k is the secret key produced by the SKG protocol and used as an OTP encryption key. This assesses the cryptographic strength of the key under adversarial observation. Successful recovery is defined as exact reconstruction of the plaintext, while partial success is measured by the fraction of correctly recovered bits.

A submission is considered successful if it achieves exact key recovery, exact plaintext recovery, or a statistically significant improvement over random guessing.
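The partial-success measure and a comparison against random guessing can be sketched as follows. The challenge text does not specify the statistical test or its significance level, so the one-sided binomial tail under a fair-coin baseline is an assumption, as are the function names.

```python
from math import comb


def bit_accuracy(recovered, truth):
    """Fraction of bits in `recovered` that match `truth` (equal-length bytes)."""
    if len(recovered) != len(truth):
        raise ValueError("inputs must have the same length")
    total_bits = 8 * len(truth)
    wrong_bits = sum(bin(a ^ b).count("1") for a, b in zip(recovered, truth))
    return (total_bits - wrong_bits) / total_bits


def p_value_vs_guessing(n_correct, n_bits):
    """P[X >= n_correct] for X ~ Binomial(n_bits, 1/2), i.e. pure guessing."""
    tail = sum(comb(n_bits, k) for k in range(n_correct, n_bits + 1))
    return tail / 2 ** n_bits


# Toy check: half of the bits wrong is what guessing achieves on average.
acc_half = bit_accuracy(b"\x0f", b"\x00")
p_all_correct = p_value_vs_guessing(8, 8)
```

A small p-value indicates that the number of correct bits is unlikely under guessing; which threshold counts as significant is left to the challenge organizers.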

Bibliography

[1] “Ultra-dense indoor MaMIMO CSI dataset,” IEEE DataPort, 2023. https://ieee-dataport.org/open-access/ultra-dense-indoor-mamimo-csi-dataset
[2] A. Mayya, M. Mitev, A. Chorti, and G. Fettweis, “A SKG security challenge: Indoor SKG under an on-the-shoulder eavesdropping attack,” in Proc. GLOBECOM 2023 – 2023 IEEE Global Communications Conference, 2023, pp. 3451–3456.
[3] M. Shakiba-Herfeh and A. Chorti, “Comparison of short blocklength Slepian-Wolf coding for key reconciliation,” in 2021 IEEE Statistical Signal Processing Workshop (SSP), Rio de Janeiro, Brazil: IEEE, Jul. 2021, pp. 111–115.
[4] A. K. A. Passah, R. C. de Lamare, and A. Chorti, “Channel state information preprocessing for CSI-based physical-layer authentication using reconciliation,” 2026, under review in IEEE Transactions on Signal Processing.
[5] P. Trifonov, “Efficient design and decoding of polar codes,” IEEE Transactions on Communications, vol. 60, no. 11, pp. 3221–3227, Nov. 2012.
[6] R. M. Oliveira and R. C. de Lamare, “Polar codes based on piecewise Gaussian approximation: Design and analysis,” IEEE Access, vol. 10, pp. 73571–73582, 2022.
[7] A. Mayya, L. Senigagliesi, and A. Chorti, “Theoretical and practical analysis of secret key rates based on design parameters and channel characteristics,” submitted to IEEE IoT Journal, 2026.

To participate: submit your solution using the submission form.