npj Quantum Information

Contractive unitary and classical shadow tomography

Here's a breakdown of the abstract, designed for a zero-base reader:

Research Field Quantum Information

Article Type Research analysis

Authors Wu et al.

Original Paper Published 2026-04-04

ISOM Posted 2026-04-07 08:51 UTC

Read Time 30M

Open PDF Open DOI Open Source Page

Editorial Disclosure

ISOM follows an editorial workflow that structures the source paper into a readable analysis, then publishes the summary, source links, and metadata shown on this page so readers can verify the original work.

The goal of this page is to help readers understand the paper's core question, method, evidence, and implications before opening the original publication.

Background & Academic Lineage

The Origin & Academic Lineage

The problem addressed in this paper originates from the fundamental challenge of characterizing quantum states in the rapidly advancing field of quantum information and computation. As quantum devices grow in scale, involving hundreds or more qubits, the traditional method of "full quantum state tomography" becomes impractical. This is because it demands an exponential number of measurements relative to the system size, making it resource-intensive and computationally prohibitive for large-scale applications.

A significant breakthrough in addressing this issue came with the advent of "classical shadow tomography," first proposed by Aaronson in 2018 [12]. This approach dramatically reduces the "sample complexity"—the number of measurements needed—to predict many properties of a quantum state without requiring its full reconstruction. Classical shadow tomography typically involves applying random unitary operations to a quantum state, measuring it, and then using these "classical snapshots" to infer properties.

Despite the considerable progress made by classical shadow tomography, a key limitation has persisted: reducing the sample complexity below a scaling of $2^k$ for extracting properties of local operators of size $k$. Previous methods, particularly those relying on random Clifford rotations, generally achieve a sample complexity that scales as $2^k$ (or $k \times 2^k$ when considering unknown operator locations). This $2^k$ scaling remains a significant "pain point," as it still demands substantial resources for larger $k$, hindering the efficient characterization of complex quantum many-body states. The authors of this paper were thus motivated to explore new strategies within classical shadow tomography to overcome this $2^k$ barrier and achieve a more efficient scaling.

Intuitive Domain Terms

To help a zero-base reader grasp the core concepts, here are some specialized terms from the paper, translated into everyday analogies:

Quantum State Tomography (QST): Imagine you have a mysterious, complex object, and you want to know everything about it—its exact shape, internal structure, and every tiny detail. QST is like taking a complete, high-resolution 3D scan of that object. It gives you a perfect blueprint, but it's incredibly time-consuming and expensive for very large or intricate objects.
Classical Shadow Tomography: Instead of a full 3D scan, imagine you only need to know a few specific things about your mysterious object, like its weight, whether it's symmetrical, or if it floats. Classical shadow tomography is like taking a few carefully chosen "shadows" or 2D photos from different angles. You can't reconstruct the whole object from these, but you can accurately predict many of its properties with far less effort than a full scan.
Sample Complexity: This refers to how many times you need to repeat an experiment or take a measurement to get a reliable answer. If you're trying to figure out the average height of people in a city, the sample complexity is the number of people you need to measure. Lower sample complexity means you need fewer measurements to get a good estimate.
Pauli String Operator: Think of this as a very specific "question" you can ask a quantum system. For example, "Is qubit 1 spinning up AND qubit 3 spinning down in a particular direction?" A Pauli string is a sequence of these simple questions applied to different parts of the system. The "size" of the string indicates how many qubits are involved in the question.
Contractive Unitary: This is a special kind of quantum operation, like a "data compressor" or "simplifier." Its job is to take complex Pauli string operators and make them "smaller" or less intricate. By simplifying these operators, the contractive unitary makes it much easier and faster to measure the quantum system's properties, thereby reducing the overall sample complexity.

Notation Table

Notation	Description
$\hat{O}$	Pauli string operator (observable) whose expectation value is to be estimated.
$\rho$	The quantum state of the system.
$k$	The size of the local operator / number of qubits in the subsystem on which $\hat{O}$ acts.
$N$	The total number of qubits in the quantum system.
$\hat{U}$	A generic global unitary operation applied to the quantum state.
$\hat{U}_{ct}$	The full contractive unitary for $k$ qubits, constructed from two-qubit contractive unitaries.
$\hat{U}_{ij}$	The two-qubit contractive unitary acting on qubits $i$ and $j$, defined as $\exp(i\frac{\pi}{4} \hat{Z}_i\hat{Z}_j)$.
$\mathcal{E}_U$	A unitary ensemble (e.g., random Clifford, contractive unitary) from which $\hat{U}$ is sampled or constructed.
$\|\|\hat{O}\|\|^2_{\mathcal{E}_U}$	The shadow norm squared of observable $\hat{O}$ averaged over ensemble $\mathcal{E}_U$, directly quantifying sample complexity.
$w(\hat{O}_U)_{\mathcal{E}_U}$	The Pauli weight of the evolved operator $\hat{O}_U = \hat{U}\hat{O}\hat{U}^\dagger$, averaged over ensemble $\mathcal{E}_U$.
$\pi(m)_{\mathcal{E}_U}$	The size distribution of the evolved operator $\hat{O}_U$, representing the probability of $\hat{O}_U$ having size $m$.
$m$	The operator size, defined as the number of non-identity Pauli operators in a Pauli string.
$N_{xy}$	The total number of $\hat{X}$ and $\hat{Y}$ operators in a Pauli string $\hat{O}$.
$\sigma_U(z)$	A classical snapshot of the density matrix reconstructed from a single measurement outcome $z$.
$M$	The number of classical snapshots (measurements) collected to estimate an expectation value.
$D[\langle \hat{O} \rangle]$	The variance of the expectation value of $\hat{O}$.

Problem Definition & Constraints

Core Problem Formulation & The Dilemma

Input/Current State: The current landscape of quantum state characterization is dominated by the challenge of efficiently describing complex quantum many-body states. While a complete quantum state tomography demands an exponentially increasing number of measurements with the system size, rendering it impractical for large-scale quantum devices, classical shadow tomography offers a significant improvement. This technique reduces the sample complexity—the number of measurements needed to estimate state properties—by employing random Clifford rotations before measurements. However, despite these advancements, a persistent challenge remains: reducing the sample complexity below $2^k$ for extracting any non-successive local operators of size $k$. This $2^k$ scaling is still too high for practical applications involving larger numbers of qubits.

Output/Goal State: The primary objective of this paper is to achieve a substantially smaller sample complexity for estimating properties of quantum states, specifically targeting local operators of size $k$. The authors aim to reduce this complexity from the current $2^k$ scaling to an improved $\sim 1.8^k$. This reduction is sought through a novel protocol that strategically combines locally random and globally deterministic unitary operations.

Exact Missing Link / Mathematical Gap: The core mathematical gap addressed by this paper is the absence of an optimal global unitary operation, $\hat{U}$, that can more efficiently "contract" the size distribution of evolved Pauli string operators, $\hat{O}_U = \hat{U}\hat{O}\hat{U}^\dagger$. Existing classical shadow tomography protocols often rely on maximally scrambling random unitaries (such as random Clifford gates), which result in a shadow norm scaling of $2^k$. The paper identifies the missing link as the discovery of a deterministic global unitary, termed a "contractive unitary," capable of reducing the operator size more effectively than these purely random ensembles. Mathematically, the challenge is to find a $\hat{U}$ that minimizes the shadow norm $||\hat{O}||_{\mathcal{E}_U}^2$, which is directly related to the operator size distribution $\pi(m)$ of the evolved operator $\hat{O}_U$, as articulated in Equation (1):
$$||\hat{O}||_{\mathcal{E}_U}^2 = w(\hat{O}_U)_{\mathcal{E}_U}, \quad w(\hat{O}_U)_{\mathcal{E}_U} = \sum_m \frac{\pi(m)_{\mathcal{E}_U}}{3^m}$$
The paper's contribution is to find a $\hat{U}$ that yields a size distribution $\pi(m)$ preferentially peaked at smaller $m$ values, thereby lowering the shadow norm and, consequently, the required sample complexity.

Painful Trade-off / The Dilemma: The central dilemma that has historically constrained researchers in classical shadow tomography is the inherent trade-off between achieving a general, unbiased measurement scheme and minimizing sample complexity. Maximally scrambling random unitaries are employed to ensure that all operators can be measured on equal footing, effectively eliminating local basis dependence. While this approach provides generality and robustness, it leads to a binomial operator size distribution and a $2^k$ shadow norm. The painful trade-off is that this "maximal scrambling," while ensuring broad applicability, might not be the most efficient strategy for reducing the sample complexity for all types of operators, particularly local Pauli strings. The authors' key insight is that a hybrid random-deterministic protocol, incorporating a "contractive unitary," can circumvent this trade-off. This approach selectively contracts operator sizes more efficiently than purely random methods, thereby achieving a better balance between generality and the efficiency of measurement.

Constraints & Failure Modes

The problem of efficiently characterizing quantum states, particularly with the proposed contractive unitary approach, is bounded by several harsh, realistic constraints:

Physical Constraints:
* Exponential Resource Scaling: A fundamental physical limitation is that full quantum state tomography requires an exponentially growing number of measurements with system size. This makes it impractical for quantum devices with more than a few hundred qubits, necessitating alternative methods like classical shadow tomography.
* Hardware Implementability: The proposed contractive unitary must be practically implementable on current and near-future quantum computation platforms. The authors emphasize that their contractive unitary "perfectly matches the advantages of the atom array quantum computation platform and is readily realized in the atom array quantum processor." This implies a strict constraint on the types of gates and operations that can be efficiently executed by existing hardware.
* Circuit Complexity and Depth: For practical deployment, the contractive unitary $\hat{U}_{ct}$ must have low circuit complexity. Although it might seem to involve $\sim k^2$ gates for a $k$-qubit system, the paper notes it "only relies on deterministic and mutually commuting two-qubit gates" and can be implemented in "at most $k-1$ steps of physical operations" on atom array platforms due to their parallel gate capabilities. For platforms with limited connectivity, such as superconducting qubits, the authors show that adding a single ancillary qubit allows decomposition into $k$ steps of local gates, ensuring the circuit depth remains linear in $k$. These are critical constraints for the feasibility of the protocol.

Computational Constraints:
* Sample Complexity Target: The primary computational constraint is the imperative to reduce the sample complexity from $2^k$ to a more manageable scaling, specifically $\sim 1.8^k$. A larger shadow norm directly necessitates a higher number of classical snapshots to reduce prediction variance, thereby increasing the computational cost associated with data acquisition and processing.
* Classical Simulation Limits: For quantum systems with $k \sim 100$ qubits, which are now achievable on modern quantum platforms, classical simulation becomes "impractical to simulate conventionally." This means that any proposed quantum solution must be efficient enough to be useful on actual quantum hardware, as classical verification or full simulation is not a viable option.

Data-Driven Constraints & Failure Modes:
* Prior Knowledge of Operator Location: The initial formulation of the contractive unitary protocol assumes prior knowledge of the precise location where the operators act. This is a significant data-driven constraint, as in many real-world applications (e.g., energy estimation), this information is often unknown.
* Naive Extension Failure: A naive extension of the contractive unitary to scenarios without prior knowledge of operator locations (by applying it to the entire system) would paradoxically lead to an even larger shadow norm than the random Clifford protocol ($\sim 2^N$ for $N$ total qubits). This represents a critical failure mode that the authors address with a "sliding trick" to maintain efficiency when operator locations are unknown. Without this trick, the benefits of the contractive unitary would be lost for a broad class of problems.
* Non-successive Local Operators: The problem specifically targets the characterization of "non-successive local operators with a size $k$." This implies that the solution must be robust and effective even for operators that are not contiguous or simple in their structure.

Why This Approach

The Inevitability of the Choice

The authors of this paper faced a fundamental hurdle in quantum information: fully characterizing complex quantum states, a process known as full quantum state tomography, demands an exponential number of measurements relative to the system size. This exponential scaling renders it impractical for the larger quantum devices we're building today. Classical shadow tomography (CST) emerged as a significant breakthrough, drastically reducing the sample complexity by employing random Clifford rotations before measurements. However, even with this advancement, a critical challenge remained: reducing the sample complexity below $2^k$ for estimating properties of non-successive local operators of size $k$.

The exact moment the authors realized traditional "state-of-the-art" (SOTA) methods, like standard random Clifford protocols, were insufficient was when they identified this persistent $2^k$ scaling. Despite the scrambling effect of random Clifford unitaries, which distributes operator weight across many Pauli strings, the resulting shadow norm (a measure of sample complexity) still scaled as $2^k$. This meant that for larger $k$, the number of measurements required was still prohibitively high. The authors explicitly posed the question: "A challenging question is whether there exist other choices of global unitaries that can outperform the maximally scrambled random unitary and result in a smaller shadow norm compared to $\sim 2^k$." This question itself highlights the insufficiency of existing methods and the necessity for a new approach. The "contractive unitary" was conceived as the direct answer to this, designed specifically to more efficent in reducing operator size and thus achieve a lower shadow norm.

Comparative Superiority

The contractive unitary approach offers qualitative superiority over previous gold standards in classical shadow tomography, going beyond mere numerical improvements. Its primary structural advantage lies in its ability to deterministically contract operator size. Unlike random Clifford unitaries that maximally scramble operators, leading to a broad binomial size distribution peaking around $3k/4$ and a shadow norm of $2^k$, the contractive unitary is engineered to actively reduce the size of certain Pauli string operators. For instance, it can transform four size-2 Pauli operators into size-1 operators. This targeted contraction results in a size distribution that is peaked at smaller operator sizes (e.g., $m/k \approx 2/3$ for odd $N_{XY}$ and a delta peak at $k$ for even $N_{XY}$), which, according to the theory of shadow norms, directly translates to a smaller sample complexity.

This structural advantage is profound: it reduces the sample complexity from $\sim 2^k$ (for random Clifford) to $\sim 1.8^k$ (or $k \times 1.8^k$ when operator locations are unknown). For a system with $k \sim 100$ qubits, this translates to a remarkable "more than $10^4$-fold improvement" in the required sample resources. Furthermore, the paper highlights that this method demonstrates "robustness against quantum noise," maintaining its scaling advantage even in noisy environments, which is a crucial qualitative benefit for real-world quantum devices. Another significant structural advantage is its low circuit complexity; it relies solely on deterministic and mutually commuting two-qubit gates, making it highly amenable to current quantum hardware.

Alignment with Constraints

The chosen contractive unitary method perfectly aligns with the stringent requirements of efficient quantum state characterization. The core problem constraint is the exponential measurement cost of full quantum state tomography and the need to reduce sample complexity for estimating properties of many-body quantum states. The contractive unitary directly addresses this by achieving a sample complexity scaling of $\sim 1.8^k$ (or $k \times 1.8^k$ with the sliding trick), which is a substantial improvement over the $\sim 2^k$ scaling of previous methods. This "marriage" between the problem's harsh requirements and the solution's unique properties is evident in its design goal: to be "more efficient in reducing the operator size to enhance tomography efficiency."

Beyond theoretical efficiency, the method also aligns with practical hardware constraints. The paper explicitly states that the contractive unitary "perfectly matches the advantages of the atom array quantum computation platform and is readily realized in the atom array quantum processor." This is becuase it can be implemented using deterministic, mutually commuting two-qubit gates, which are well-suited for the all-to-all connectivity and parallel gate operations available in atom arrays. Even for platforms with limited connectivity, the paper notes that the contractive unitary can be decomposed into $k$ steps of local gates with the addition of a single ancilla qubit, ensuring its broad applicability across different quantum computing architectures.

Rejection of Alternatives

The paper implicitly, yet strongly, rejects other popular approaches within classical shadow tomography due to thier quantitative inferiority in achieving the desired sample complexity. The primary alternative discussed is the fully random Clifford protocol, which was the previous SOTA. While random Clifford unitaries were a breakthrough in reducing tomography costs, they still resulted in a shadow norm that scaled as $2^k$. The authors' entire motivation stems from the fact that this $2^k$ scaling was "insufficient" for the problem of extracting non-successive local operators with a size $k$ below this bound. The contractive unitary was specifically designed to outperform this $2^k$ scaling by optimizing the unitary's deterministic component to contract operator size more effectively.

Similarly, the paper mentions "shallow circuit protocols" as another alternative. However, Table I clearly shows that the shallow circuit protocol yields a sample complexity of $> 2^k$ for known operator locations and $k \times 2^k$ for unknown locations. This makes it less efficient than both the random Clifford and, critically, the contractive unitary protocol. Therefore, these alternatives were not chosen because they failed to meet the critical requirement of achieving a sub-$2^k$ scaling for sample complexity, which the contractive unitary successfully delivers. The paper does not discuss other machine learning paradigms like GANs or Diffusion models, as they are not directly applicable to the problem of quantum state tomography in the same context as classical shadows.

Mathematical & Logical Mechanism

The Master Equation

The absolute core equation that underpins the efficiency gains in this paper is the definition of the shadow norm, which directly quantifies the sample complexity of classical shadow tomography. The authors aim to minimize this quantity. It's presented as:

$$ ||\hat{O}||^2_{\mathcal{E}_U} = w(\hat{O}_U)_{\mathcal{E}_U} = \sum_m \frac{\pi(m)_{\mathcal{E}_U}}{3^m} $$

While this equation defines the target, the mechanism to achieve the reduction in this norm is primarily driven by the "contractive unitary" and its effect on operator size. The specific two-qubit contractive unitary is given by:

$$ \hat{U}_{12} = \exp\left(i\frac{\pi}{4} \hat{Z}_1\hat{Z}_2\right) $$

And its effect on the operator size for a $k$-qubit Pauli string $\hat{O}$ under the full contractive unitary $\hat{U}_{ct}$ is described by:

$$ m = \text{Size}(\hat{U}_{ct} \hat{O} \hat{U}_{ct}^\dagger) = \begin{cases} N_{xy} & \text{if } N_{xy} \in \text{odd}, \\ k & \text{if } N_{xy} \in \text{even}. \end{cases} $$

Finally, the resulting Pauli weight for the contractive unitary, which directly relates to the shadow norm, is given by:

$$ w(\hat{O})_{ct} = \frac{1}{2 \cdot 3^k} \left[ \frac{(-1)^k + 1}{9^k} + \left(\frac{5}{9}\right)^k \right] $$

Term-by-Term Autopsy

Let's dissect these equations to understand their components.

For the Master Equation (Shadow Norm):
$$ ||\hat{O}||^2_{\mathcal{E}_U} = w(\hat{O}_U)_{\mathcal{E}_U} = \sum_m \frac{\pi(m)_{\mathcal{E}_U}}{3^m} $$

$||\hat{O}||^2_{\mathcal{E}_U}$: This term represents the shadow norm squared of the observable $\hat{O}$ averaged over the unitary ensemble $\mathcal{E}_U$.
- Mathematical Definition: It's essentially the variance of the estimator used to predict the expectation value of $\hat{O}$. A smaller shadow norm implies a lower variance in the predictions.
- Physical/Logical Role: This is the figure of merit for the classical shadow tomography protocol. It directly characterizes the sample complexity – how many measurement snapshots are needed to estimate $\hat{O}$ with a certain precision. The primary goal of this paper is to reduce this value.
- Why squared? Variance is inherently a squared quantity, reflecting the spread of possible outcomes.
$w(\hat{O}_U)_{\mathcal{E}_U}$: This is the Pauli weight of the evolved operator $\hat{O}_U$, averaged over the unitary ensemble $\mathcal{E}_U$.
- Mathematical Definition: It's an average measure of how "spread out" or "scrambled" the operator $\hat{O}$ becomes after being transformed by a unitary from the ensemble $\mathcal{E}_U$.
- Physical/Logical Role: This term is directly proportional to the shadow norm. Reducing the Pauli weight is the mechanism by which the sample complexity is lowered. The authors found a way to make this weight smaller than previous methods.
$\sum_m$: This is a summation over all possible operator sizes $m$.
- Mathematical Definition: A discrete sum.
- Physical/Logical Role: When an initial Pauli string operator $\hat{O}$ is evolved by a unitary $\hat{U}$, it can transform into a superposition of Pauli strings of various sizes. This summation accounts for the contributions from all these possible evolved sizes.
- Why summation instead of integral? Operator size $m$ (the number of non-identity Pauli operators) is a discrete integer quantity, so a sum is the natural mathematical operation.
$\pi(m)_{\mathcal{E}_U}$: This is the size distribution of the evolved operator $\hat{O}_U$, averaged over the unitary ensemble $\mathcal{E}_U$.
- Mathematical Definition: It represents the probability that the evolved operator $\hat{O}_U$ has a size $m$. For an operator $\hat{O}_U = \sum_P c_P P$ (where $P$ are Pauli strings), $\pi(m) = \sum_{\text{Size}(P)=m} |c_P|^2$.
- Physical/Logical Role: This distribution tells us how the unitary transformation affects the "complexity" of the operator. If the unitary can shift this distribution towards smaller $m$ values, it directly reduces the shadow norm. This is the central idea behind the "contractive unitary."
$3^m$: This term appears in the denominator.
- Mathematical Definition: An exponential factor.
- Physical/Logical Role: This factor arises from the properties of Pauli measurements. For a Pauli string of size $m$, there are $3^m$ possible non-identity Pauli operators (X, Y, Z) that can act on those $m$ qubits. This term effectively penalizes larger operator sizes: a larger $m$ makes $1/3^m$ smaller, but if $\pi(m)$ is significant for large $m$, the overall contribution to the shadow norm can still be large. The goal is to make $\pi(m)$ concentrated at small $m$.

For the Contractive Unitary $\hat{U}_{12}$ (Equation 2):
$$ \hat{U}_{12} = \exp\left(i\frac{\pi}{4} \hat{Z}_1\hat{Z}_2\right) $$

$\hat{U}_{12}$: This is the two-qubit contractive unitary.
- Mathematical Definition: A unitary operator acting on qubits 1 and 2.
- Physical/Logical Role: This specific gate is the fundamental building block of the authors' deterministic unitary. It's chosen because of its particular commutation relations with Pauli operators, which allow it to "contract" operator sizes. It's equivalent to a controlled-Z (CZ) gate up to local rotations.
$\exp(\dots)$: The matrix exponential.
- Mathematical Definition: Defines a unitary operator from a Hermitian generator. For a Hermitian operator $H$, $U = \exp(iH)$ is unitary.
- Physical/Logical Role: This is the standard way to construct quantum gates from interaction Hamiltonians or generators in quantum mechanics.
$i$: The imaginary unit.
- Mathematical Definition: $\sqrt{-1}$.
- Physical/Logical Role: Essential for ensuring the unitarity of quantum operations and the complex nature of quantum states.
$\frac{\pi}{4}$: A constant phase factor.
- Mathematical Definition: A specific angle.
- Physical/Logical Role: This specific value determines the strength and type of the two-qubit interaction. For $\hat{Z}_1\hat{Z}_2$, $\pi/4$ is a common choice for entangling gates.
$\hat{Z}_1$: The Pauli Z operator acting on qubit 1.
- Mathematical Definition: A $2 \times 2$ matrix, typically $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ in the computational basis.
- Physical/Logical Role: A fundamental single-qubit Pauli operator, representing a measurement in the Z-basis.
$\hat{Z}_2$: The Pauli Z operator acting on qubit 2.
- Mathematical Definition: Same as $\hat{Z}_1$ but acting on the second qubit.
- Physical/Logical Role: Same as $\hat{Z}_1$ but on a different qubit.
$\hat{Z}_1\hat{Z}_2$: The tensor product of Pauli Z operators on qubits 1 and 2.
- Mathematical Definition: $\hat{Z}_1 \otimes \hat{Z}_2$.
- Physical/Logical Role: Represents a two-qubit interaction or correlation. This specific interaction is what enables the "contraction" of operator sizes when other Pauli operators are evolved through it.

For Operator Size Contraction (Equation 3):
$$ m = \text{Size}(\hat{U}_{ct} \hat{O} \hat{U}_{ct}^\dagger) = \begin{cases} N_{xy} & \text{if } N_{xy} \in \text{odd}, \\ k & \text{if } N_{xy} \in \text{even}. \end{cases} $$

$m$: The new size of the evolved operator.
- Mathematical Definition: An integer count of non-identity Pauli operators.
- Physical/Logical Role: This is the direct outcome of applying the contractive unitary. The goal is to make this $m$ as small as possible.
$\text{Size}(\dots)$: The operator size function.
- Mathematical Definition: A function that counts the number of non-identity Pauli operators in a Pauli string.
- Physical/Logical Role: This is the metric used to quantify the "spread" or "complexity" of an operator.
$\hat{U}_{ct}$: The full contractive unitary for $k$ qubits.
- Mathematical Definition: Defined as $\prod_{i
- Physical/Logical Role: This is the complete deterministic unitary operation applied to the $k$-qubit subsystem. Its specific construction from commuting two-qubit gates is key to its efficiency.
$\hat{O}$: The initial Pauli string operator.
- Mathematical Definition: A tensor product of $k$ Pauli operators (X, Y, Z, I).
- Physical/Logical Role: This is the observable whose expectation value we want to estimate. It's the input to the unitary evolution.
$\hat{U}_{ct}^\dagger$: The Hermitian conjugate of $\hat{U}_{ct}$.
- Mathematical Definition: The inverse of the unitary operator.
- Physical/Logical Role: Unitary evolution of an operator $\hat{O}$ is given by $\hat{U}\hat{O}\hat{U}^\dagger$.
$N_{xy}$: The total number of X and Y operators in the initial Pauli string $\hat{O}$.
- Mathematical Definition: An integer count.
- Physical/Logical Role: This is the critical property of the initial operator that determines how the contractive unitary acts. The parity (odd or even) of $N_{xy}$ dictates whether the operator size is contracted or remains unchanged.
$k$: The number of qubits in the subsystem.
- Mathematical Definition: An integer.
- Physical/Logical Role: This is the initial size of the operator, assuming it's a size-$k$ Pauli string (i.e., no identity operators initially).
if $N_{xy} \in \text{odd}$: This conditional statement describes the contraction.
- Physical/Logical Role: If the initial operator has an odd number of X or Y components, the contractive unitary effectively converts all Z operators to identity operators, reducing the operator size from $k$ to just $N_{xy}$. This is the primary mechanism for reducing the shadow norm.
if $N_{xy} \in \text{even}$: This conditional statement describes no contraction for Z operators.
- Physical/Logical Role: If the initial operator has an even number of X or Y components, the Z operators remain Z, and the operator size remains $k$. The contractive unitary doesn't help in this specific case for Z operators, but it doesn't increase the size either.

For Pauli Weight of Contractive Unitary (Equation 4):
$$ w(\hat{O})_{ct} = \frac{1}{2 \cdot 3^k} \left[ \frac{(-1)^k + 1}{9^k} + \left(\frac{5}{9}\right)^k \right] $$

$w(\hat{O})_{ct}$: The Pauli weight of the evolved operator $\hat{O}_{ct}$, averaged over the ensemble of size-$k$ Pauli operators.
- Mathematical Definition: This is the calculated average Pauli weight when using the contractive unitary. It's a specific instance of $w(\hat{O}_U)_{\mathcal{E}_U}$ from the master equation.
- Physical/Logical Role: This value directly determines the scaling of the shadow norm. The authors show that this value scales as $\sim 1.8^k$, which is better than the $\sim 2^k$ scaling of random Clifford unitaries.
$\frac{1}{2 \cdot 3^k}$: An overall scaling factor.
- Mathematical Definition: Inverse exponential.
- Physical/Logical Role: This is a normalization factor related to the total number of Pauli strings on $k$ qubits.
$\frac{(-1)^k + 1}{9^k}$: This term contributes when $N_{xy}$ is even.
- Mathematical Definition: A term dependent on $k$.
- Physical/Logical Role: This part of the expression accounts for the contribution to the Pauli weight from initial operators where the number of X and Y components ($N_{xy}$) is even. In this scenario, the Z operators are not contracted, and the operator size remains $k$.
$\left(\frac{5}{9}\right)^k$: This term contributes when $N_{xy}$ is odd.
- Mathematical Definition: A term dependent on $k$.
- Physical/Logical Role: This is the crucial term that provides the improved scaling. It accounts for the contribution from initial operators where $N_{xy}$ is odd, leading to the contraction of Z operators and a smaller effective operator size. This term's exponential base of $5/9 \approx 0.55$ is what drives the $1.8^k$ scaling, as opposed to the $2^k$ scaling from random Clifford unitaries.
- Why addition? These two terms represent contributions from two disjoint sets of initial Pauli operators (those with even $N_{xy}$ and those with odd $N_{xy}$), so their average weights add up.

Step-by-Step Flow

Imagine a single abstract data point, which in this context is a quantum state $\rho$ on $N$ qubits, and we want to estimate the expectation value of a specific Pauli string observable $\hat{O}$ acting on a $k$-qubit subsystem. Here's how the "mathematical engine" processes this:

Initial State and Observable: We begin with a quantum state $\rho$ and a target Pauli string observable $\hat{O}$ (e.g., $\hat{X}_1\hat{Z}_3\hat{Y}_5$) on a $k$-qubit subsystem. The goal is to estimate $\text{Tr}(\hat{O}\rho)$.
Random Single-Qubit Rotations: Before applying the core unitary, the $k$-qubit subsystem undergoes a layer of random single-qubit rotations, $\prod_i \hat{u}_{1,i}$. These rotations are sampled from the Clifford group and serve to eliminate local basis dependence, ensuring all operators are measured on equal footing.
Application of the Contractive Unitary: This is where the paper's innovation kicks in. A global deterministic unitary, $\hat{U}_{ct}$, is applied to the $k$-qubit subsystem. This $\hat{U}_{ct}$ is constructed as a product of two-qubit contractive unitaries $\hat{U}_{ij} = \exp(i\frac{\pi}{4} \hat{Z}_i\hat{Z}_j)$ for all pairs of qubits $i, j$ within the subsystem. The purpose of this step is to transform the observable $\hat{O}$ into an "evolved" operator $\hat{O}_{ct} = \hat{U}_{ct} \hat{O} \hat{U}_{ct}^\dagger$. The magic here is that $\hat{U}_{ct}$ is designed to contract the size of $\hat{O}$ for a significant portion of operators. Specifically, if the initial $\hat{O}$ has an odd number of $\hat{X}$ or $\hat{Y}$ components ($N_{xy}$ is odd), then $\hat{U}_{ct}$ converts all $\hat{Z}$ operators in $\hat{O}$ to identity operators, effectively reducing the operator's size from $k$ to $N_{xy}$. If $N_{xy}$ is even, the size remains $k$. The overall architecture of this protocol, including the random single-qubit rotations and the global unitary, is schematically represented in Figure 1a. Figure 1b further illustrates how this contractive unitary reshapes the operator size distribution compared to random Clifford unitaries.

FIG. 1.

Another Layer of Random Single-Qubit Rotations: Following the contractive unitary, another layer of random single-qubit rotations, $\prod_i \hat{u}_{2,i}$, is applied. This completes the full composite unitary operation $\hat{U}_{\text{eff}} = (\prod_i \hat{u}_{2,i}) \hat{U}_{ct} (\prod_j \hat{u}_{1,j})$.
Measurement in Computational Basis: After the composite unitary $\hat{U}_{\text{eff}}$ is applied to the state $\rho$, the $k$-qubit subsystem is measured in the computational basis. This yields a classical measurement outcome, say $|z\rangle = |z_1, \dots, z_k\rangle$.
Classical Snapshot Reconstruction: From each measurement outcome $|z\rangle$, a "classical snapshot" of the density matrix is reconstructed. This snapshot is given by $\sigma_U(z) = \hat{U}_{\text{eff}}^\dagger |z\rangle\langle z| \hat{U}_{\text{eff}}$. This is an "un-evolution" of the measurement outcome.
Prediction for the Observable: For each reconstructed snapshot $\sigma_U(z)$, a prediction for the expectation value of $\hat{O}$ is made. This is calculated as $\text{Tr}(\hat{O} \sigma_U(z))$. The key here is that the effective operator $\hat{O}_{\text{eff}} = \hat{U}_{\text{eff}}^\dagger \hat{O} \hat{U}_{\text{eff}}$ has a size distribution $\pi(m)$ that is shifted towards smaller $m$ values due to the contractive unitary. This makes the $\text{Tr}(\hat{O} \sigma_U(z))$ calculation more efficient.
Averaging Snapshots: Steps 2-7 are repeated many times (say, $M$ times) to collect a sufficient number of classical snapshots. The final estimate for $\text{Tr}(\hat{O}\rho)$ is obtained by averaging these individual predictions: $\frac{1}{M} \sum_{\alpha=1}^M \text{Tr}(\hat{O} \sigma_U(z^\alpha))$.
Sample Complexity Reduction: The entire process is designed such that the variance of this estimator, which is the shadow norm $||\hat{O}||^2_{\mathcal{E}_U}$, is significantly reduced. By making the operator size distribution $\pi(m)$ peak at smaller $m$ values (thanks to the contractive unitary), the $\sum_m \frac{\pi(m)_{\mathcal{E}_U}}{3^m}$ term becomes smaller, leading to a lower shadow norm and thus fewer measurements (smaller $M$) required for a given precision. The paper demonstrates this reduction from a $2^k$ scaling (random Clifford) to a $1.8^k$ scaling (contractive unitary).

Optimization Dynamics

The "optimization" in this paper isn't an iterative learning process in the typical machine learning sense, but rather a deliberate design choice based on theoretical insights into the structure of quantum operators and unitaries. There are no gradients being computed or loss functions being minimized iteratively by an algorithm. Instead, the authors have engineered a superior mechanism.

The "Loss Landscape": Conceptually, one can imagine a "loss landscape" where the "loss" is the shadow norm $||\hat{O}||^2_{\mathcal{E}_U}$ and the "parameters" are the choices of unitary ensemble $\mathcal{E}_U$. Previous work using random Clifford unitaries found a certain point in this landscape, yielding a sample complexity scaling of $2^k$.
The "Learning" by Design: The authors "learn" by analyzing how different unitary operations affect the operator size distribution $\pi(m)$. They discovered that a specific deterministic unitary, the "contractive unitary" ($\hat{U}_{ct}$), has the unique property of selectively reducing the size of Pauli string operators. This is a theoretical breakthrough, not an algorithmic one.
Mechanism of Contraction: The core of this "optimization" lies in the specific algebraic properties of the $\hat{Z}_i\hat{Z}_j$ interaction. When a Pauli operator like $\hat{X}_i$ or $\hat{Y}_i$ is evolved by $\hat{U}_{ij} = \exp(i\frac{\pi}{4} \hat{Z}_i\hat{Z}_j)$, it can transform other $\hat{Z}$ operators. For instance, if an operator $\hat{O}$ has an odd number of $\hat{X}$ or $\hat{Y}$ components ($N_{xy}$ is odd), the $\hat{U}_{ct}$ effectively converts the $\hat{Z}$ operators in $\hat{O}$ to identity operators ($\hat{I}$). This is a direct consequence of the commutation relations: $\hat{U}_{ij} \hat{Z}_i \hat{U}_{ij}^\dagger = \hat{Z}_i$ and $\hat{U}_{ij} \hat{X}_i \hat{U}_{ij}^\dagger = \hat{X}_i \hat{Z}_j$. The crucial part is how these interactions propagate across multiple qubits. When $N_{xy}$ is odd, the $\hat{Z}$ operators effectively "cancel out" or become $\hat{I}$ due to the collective action of the $\hat{Z}_i\hat{Z}_j$ gates. This reduces the operator size from $k$ to $N_{xy}$.
Shaping the Distribution: By contracting the operator size for a significant fraction of initial Pauli strings, the contractive unitary reshapes the size distribution $\pi(m)$. Instead of being broadly distributed around $m \approx 3k/4$ (as with random Clifford unitaries), $\pi(m)$ now has a significant peak at smaller $m$ values (specifically, $N_{xy}$ for odd $N_{xy}$). This shift towards smaller $m$ values directly reduces the sum $\sum_m \frac{\pi(m)_{\mathcal{E}_U}}{3^m}$, thereby lowering the shadow norm.
Convergence to a Better Scaling: The "convergence" is not iterative but a demonstration of a superior scaling law. The authors theoretically derive (and numerically verify) that this carefully constructed contractive unitary leads to a shadow norm scaling of $\sim 1.8^k$ (or $k \times 1.8^k$ with the sliding trick), which is a significant improvement over the $2^k$ scaling of previous methods. This represents a "jump" to a better region in the conceptual loss landscape, achieved through intelligent design rather than iterative optimization. The mechanism is inherently deterministic once the unitary is chosen; the randomness only comes from the single-qubit rotations and measurement outcomes, not from the core unitary itself.

Results, Limitations & Conclusion

Experimental Design & Baselines

To rigorously validate their mathematical claims, the authors architected a series of numerical experiments, primarily focusing on two types of N-qubit long-range entangled states: the Greenberger-Horne-Zeilinger (GHZ) state and the one-dimensional cluster (ZXZ) state with periodic boundary conditions. For these experiments, a subsystem of $k$ successive qubits was chosen, and predictions were made for specific Pauli string operators. For the GHZ state, the target operator was $\hat{O} = Z_1Z_2...Z_{k-1}Z_k$, while for the ZXZ state, it was $\hat{O} = Z_1Y_2X_3X_4...X_{k-2}Y_{k-1}Z_k$. Crucially, these states admit efficient representations under the stabilizer formalism, allowing for the analytical derivation of exact expectation values, such as $\langle \hat{O} \rangle = ((-1)^{k+1})/2$ for GHZ states and $\langle \hat{O} \rangle = (-1)^k$ for ZXZ states. These analytical values served as "rigorous benchmarks" against which the experimental predictions were compared.

The experimental procedure involved several steps for each sampling process. First, single-qubit rotations were independently generated from a set of 24 single-qubit Clifford gates. Next, the composite unitary operation, as depicted in Fig. 1a, was applied, followed by sampling the measurement outcome $z^\alpha$ in the computational basis. The prediction for each snapshot was computed using the exact shadow norm from Eq. (1) as $O^\alpha = ||\hat{O}||^{-2} \text{Tr}(\hat{O}\sigma_U(z^\alpha))$. After collecting a substantial number of snapshots, specifically $N = 10^5$, the final prediction for the expectation value was obtained by averaging these snapshots: $E[\langle \hat{O} \rangle] = \sum_{\alpha=1}^N O^\alpha / N$. The standard deviation of this expectation was then estimated from the variance $D[\langle \hat{O} \rangle] = \sum_{\alpha=1}^N (O^\alpha - E[\langle \hat{O} \rangle])^2 / N$.

The "victims" (baseline models) in this comparison were protocols employing random Clifford unitaries, which represent the state-of-the-art in classical shadow tomography. The primary results were presented for a system size of $N = 20$, with supplementary information extending to larger systems where $k \sim 20$.

A further set of experiments addressed scenarios where the precise location of the Pauli string operator is unknown. For this, a "sliding trick" was introduced. The system of $N = n_0 k$ qubits was divided into $n_0$ subsystems, each containing $k$ qubits. Unitaries were then slid along one direction by one qubit, generating $k$ distinct sets of unitaries. A circuit structure was randomly selected with a probability of $1/k$. For instance, the operator $\hat{O} = Z_{n_r+1}Y_{n_r+2}X_{n_r+3}X_{n_r+4}...X_{n_r+k-2}Y_{n_r+k-1}Z_{n_r+k}$ for the ZXZ state was used, where $n_r \in [0, N)$ was a random integer. The system size for these tests was $N = 3k$. The baseline for this scenario was the random Clifford protocol also augmented with the sliding trick. The schematics of this sliding trick are illustrated in Figure 3a.

Figure 3. The sliding trick for situations in which the location of the Pauli string operator is un- known. a, Schematics of the sliding trick. Each box represents an independent composite unitary applied to a subsystem with k qubits, as shown in Fig. 1a

FIG. 3.

What the Evidence Proves

The evidence presented in this paper definitively proves that the proposed contractive unitary protocol significantly reduces the sample complexity for classical shadow tomography compared to the standard random Clifford approach. The core mechanism at play is the contractive unitary's ability to more efficiently reduce the effective "operator size" of the evolved Pauli string, which directly translates to a smaller shadow norm and thus fewer measurements needed.

For situations where the operator location is known, Figures 2a and 2b provide undeniable evidence that both the contractive unitary and random Clifford protocols yield unbiased predictions for the expectation values of Pauli string operators. The solid lines, representing the numerical results, align remarkably well with the black dashed lines, which denote the analytically derived exact expectation values for both GHZ and ZXZ states. This confirms the validity of both approaches in terms of accuracy.

However, the definitive advantage of the contractive unitary becomes strikingly clear in Figures 2c and 2d, which plot the variance $D[\langle \hat{O} \rangle]$ of the operator expectation. Here, the contractive unitary protocol consistently exhibits a smaller standard deviation (and thus lower variance) compared to the random Clifford protocol, especially as $k$ increases. The dashed lines in these figures directly confirm the theoretical scaling laws: the contractive unitary achieves a sample complexity scaling of approximately $2 \times 1.8^k$, while the random Clifford protocol scales as $2^k$. This is the hard evidence that their core mechanism works in reality. The reduction from $2^k$ to $1.8^k$ is substantial; for $k \sim 100$, a size achievable on modern quantum computation platforms, this amounts to a more than $10^4$-fold improvement in sample resources. The paper also notes that the contractive unitary protocol demonstrates robustness against quantum noise, as shown in the supplementary information (Fig. S4).

FIG. 2.

The underlying reason for this improved scaling is visually supported by Figure 1b, which illustrates the operator size distribution $\pi(m)$. The random Clifford unitary (red line) maximally scrambles the operator, leading to a broad binomial distribution peaked around $m/k \approx 3/4$. In stark contrast, the contractive unitary (blue line) results in a distribution peaked at a smaller operator size, specifically near $m/k \approx 2/3$, with an additional delta peak at $k$. This more concentrated distribution at smaller operator sizes is the direct consequence of the contractive unitary's ability to reduce the operator size, leading to a smaller shadow norm and, consequently, a lower sample complexity.

Even when the precise location of the operator is unknown, the "sliding trick" combined with the contractive unitary maintains its advantage. Figures 3b and 3c show that both protocols, when equipped with the sliding trick, provide unbiased predictions. Crucially, the variance for the contractive unitary with the sliding trick scales as $(32/19)k \times 1.8^k$, outperforming the random Clifford protocol's $k \times 2^k$ scaling. Table I provides a concise summary, highlighting the superior sample complexity of the contractive unitary in both known ($1.8^k$) and unknown ($k \times 1.8^k$) operator location scenarios, compared to random Clifford ($2^k$ and $k \times 2^k$) and shallow circuit protocols ($>2^k$).

Limitations & Future Directions

While the contractive unitary protocol presents a significant advancement, it's important to acknowledge its current limitations and consider avenues for future development.

One notable challenge arises when extending the approach to situations without prior knowledge of operator locations. The "sliding trick" is introduced to address this, but it adds a factor of $k$ to the sample complexity, resulting in a scaling of $k \times 1.8^k$. While still superior to the random Clifford's $k \times 2^k$, it suggests that further optimization might be possible to mitigate this $k$ factor. A naive application of the contractive unitary to the entire system (rather than a subsystem) could even lead to an increase in shadow norm for the random Clifford protocol, and for the contractive unitary, it could convert identity operators back to Z, which is generally undesirable. This highlights the need for careful design when scaling up. Furthermore, the paper notes that the random Clifford protocol with the sliding trick "still can hardly outperform the shallow circuit protocol" for large enough $k$, implying that while the contractive unitary with the sliding trick is better, the random Clifford one is not a strong contender in this specific scenario.

Looking ahead, the findings open up several exciting discussion topics and research directions:

Hardware-Specific Optimizations and Implementations: The paper emphasizes that the contractive unitary perfectly matches the advantages of atom array quantum computation platforms due to their reconfigurable nature, all-to-all connectivity, and ability to perform parallel gate operations. Recent experimental successes in high-fidelity CZ gates and global CZ gates on these platforms make the contractive unitary readily implementable. Moreover, the authors show that by adding a single ancillary qubit, the contractive unitary can be decomposed into $k$ steps of local gates, making it compatible with platforms like superconducting qubits that have limited connectivity. This suggests a rich future for exploring hardware-aware designs and optimizations for contractive unitaries, potentially leading to even more efficient implementations tailored to specific quantum architectures.
Generalization of the "Contractive Unitary" Concept: The core insight is the deliberate design of deterministic quantum circuits to efficiently contract or scramble operator size. This "general idea" is proposed to have broader applications in quantum teleportation, quantum sensing, and quantum machine learning. Future research could focus on identifying and constructing other types of "contractive" or "scrambling" unitaries optimized for different classes of observables or quantum tasks, moving beyond just Pauli string operators. Can we design unitaries that contract other types of operators or achieve different beneficial size distributions?
Hybrid Random-Deterministic Protocols as a New Paradigm: The work explicitly demonstrates that a "random-deterministic hybridized protocol can be more efficient than fully random measurements." This is a profound lesson. It challenges the conventional wisdom of relying solely on fully random measurements for shadow tomography and opens a new paradigm for designing quantum measurement protocols. Future work could explore other combinations of random and deterministic elements, seeking to find optimal trade-offs between measurement efficiency, robustness, and implementation complexity.
Theoretical Bounds and Optimal Unitary Design: While the paper presents a specific contractive unitary, the question of whether it is globally optimal for operator size contraction remains an open theoretical problem. Further research could delve into establishing tighter theoretical bounds on shadow norms and sample complexity, and then using these bounds to guide the search for even more optimal deterministic unitary designs. This might involve exploring different mathematical structures or leveraging advanced optimization techniques.
Robustness and Error Mitigation: The paper briefly mentions the protocol's robustness against quantum noise. Given the inherent noisiness of current quantum devices, a deeper investigation into the noise resilience of contractive unitaries under various realistic noise models (e.g., depolarizing noise, dephasing, gate errors) would be invaluable. Developing error mitigation strategies specifically tailored to these hybrid protocols could further enhance their practical utility.

In essence, this work not only provides a powerful new tool for quantum state characterization but also sparks a broader discussion on the intelligent design of quantum circuits for measurement and information extraction. The future of classical shadow tomography, and indeed quantum information processing, may lie in this thoughtful integration of deterministic and random elements.

Table 1. A comparison of the sample complexity for 34 the contractive unitary protocol, the random Clifford 35 protocol, and the shallow circuits protocol for situations 36 with or without the information of the precise location 37 of the Pauli string operators ˆO. 38 Table 2. Two-qubit Pauli operators with size-2. 39