Baker-Campbell-Hausdorff formula

The Baker-Campbell-Hausdorff (BCH) formula addresses the fundamental question of how to combine exponentials of non-commuting operators. Given two operators (or matrices), \displaystyle X and \displaystyle Y , the goal is to find an operator \displaystyle Z such that

\displaystyle  e^X e^Y = e^Z

If \displaystyle X and \displaystyle Y commute (i.e., \displaystyle [X,Y] = XY - YX = 0 ), the solution is simply \displaystyle Z = X+Y . However, when they do not commute, \displaystyle Z is given by a more complex infinite series involving nested commutators of \displaystyle X and \displaystyle Y .

The formula is essential in Lie group theory, where it relates the Lie algebra (the space of generators like \displaystyle X and \displaystyle Y ) to the Lie group (the space of transformations like \displaystyle e^X and \displaystyle e^Y ). In quantum mechanics, it is crucial for composing unitary operators, such as time-evolution operators.

Baker-Campbell-Hausdorff Formula

The operator \displaystyle Z = \log(e^X e^Y) is given by the series:

\displaystyle Z = X + Y + \frac{1}{2}[X, Y] + \frac{1}{12}\left([X,[X,Y]] - [Y,[X,Y]]\right) - \frac{1}{24}[Y,[X,[X,Y]]] + \dots

where the series consists entirely of nested commutators.
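As a sanity check, the series can be tested numerically: each additional term should bring the truncation closer to \displaystyle \log(e^X e^Y) . A minimal sketch, assuming NumPy and SciPy are available; the matrices below are arbitrary small examples:

```python
# Rough numerical check of the BCH series: successive truncations
# should approximate log(e^X e^Y) better and better.
import numpy as np
from scipy.linalg import expm, logm

def comm(a, b):
    return a @ b - b @ a

X = 0.1 * np.array([[0.0, 1.0], [0.0, 0.0]])
Y = 0.1 * np.array([[0.0, 0.0], [1.0, 0.0]])

Z_exact = logm(expm(X) @ expm(Y))

# successive truncations of the BCH series
approx = [X + Y]
approx.append(approx[0] + comm(X, Y) / 2)
approx.append(approx[1] + (comm(X, comm(X, Y)) - comm(Y, comm(X, Y))) / 12)
approx.append(approx[2] - comm(Y, comm(X, comm(X, Y))) / 24)

errs = [np.linalg.norm(Z_exact - a) for a in approx]
print(errs)  # each truncation error is smaller than the previous one
```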


We aim to find \displaystyle Z(t) such that \displaystyle e^{Z(t)} = e^{tX} e^{tY} .

Assume \displaystyle Z(t) = tZ_1 + t^2 Z_2 + t^3 Z_3 + \cdots .

Our goal is to compute the coefficients \displaystyle Z_n , which are combinations of \displaystyle X and \displaystyle Y . The final BCH formula is given by evaluating at \displaystyle t = 1 .

Expanding Both Sides

Right-Hand Side:

\displaystyle e^{tX} e^{tY} = (1 + tX + \frac{t^2}{2}X^2 + \frac{t^3}{6}X^3 + \cdots)(1 + tY + \frac{t^2}{2}Y^2 + \frac{t^3}{6}Y^3 + \cdots)
\displaystyle = 1 + t(X + Y) + \frac{t^2}{2}(X^2 + 2XY + Y^2) + \frac{t^3}{6}(X^3 + 3X^2Y + 3XY^2 + Y^3) + \cdots

Left-Hand Side:

\displaystyle e^{Z(t)} = 1 + Z(t) + \frac{1}{2}Z(t)^2 + \frac{1}{6}Z(t)^3 + \cdots
\displaystyle = 1 + tZ_1 + t^2 Z_2 + t^3 Z_3 + \frac{1}{2}(tZ_1)^2 + \cdots
\displaystyle = 1 + tZ_1 + t^2(Z_2 + \frac{1}{2}Z_1^2) + t^3(Z_3 + Z_1 Z_2 + \frac{1}{6}Z_1^3) + \cdots


We will now match coefficients.

Order \displaystyle t^1: Z_1 = X + Y

Order \displaystyle t^2: Z_2 + \frac{1}{2}Z_1^2 = \frac{1}{2}(X^2 + 2XY + Y^2)

Substitute \displaystyle Z_1 = X + Y :

\displaystyle Z_1^2 = (X + Y)^2 = X^2 + XY + YX + Y^2

So:

\displaystyle Z_2 + \frac{1}{2}(X^2 + XY + YX + Y^2) = \frac{1}{2}(X^2 + 2XY + Y^2)

Therefore:

\displaystyle Z_2 = \frac{1}{2}(XY - YX) = \frac{1}{2}[X,Y]

Order \displaystyle t^3: Z_3 + Z_1 Z_2 + \frac{1}{6}Z_1^3 = \frac{1}{6}(X^3 + 3X^2Y + 3XY^2 + Y^3)

With \displaystyle Z_1 = X + Y,\; Z_2 = \frac{1}{2}[X,Y]

Solving for \displaystyle Z_3 and expanding \displaystyle Z_1 Z_2 = \frac{1}{2}(X+Y)[X,Y] and \displaystyle Z_1^3 = (X+Y)^3 , all the non-commutator products cancel, leaving:

\displaystyle Z_3 = \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]]

So we have

\displaystyle Z = X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]] + \cdots
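The third-order truncation can also be checked numerically: the residual \displaystyle \log(e^{tX} e^{tY}) - (tZ_1 + t^2 Z_2 + t^3 Z_3) should shrink like \displaystyle t^4 , so halving \displaystyle t should reduce it by roughly a factor of 16. A sketch with NumPy/SciPy, using arbitrary example matrices:

```python
# Check that the third-order BCH truncation has an O(t^4) residual:
# halving t should shrink the residual by roughly 2^4 = 16.
import numpy as np
from scipy.linalg import expm, logm

def comm(a, b):
    return a @ b - b @ a

X = np.array([[0.0, 1.0], [0.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])

Z1 = X + Y
Z2 = comm(X, Y) / 2
Z3 = (comm(X, comm(X, Y)) - comm(Y, comm(X, Y))) / 12

def residual(t):
    exact = logm(expm(t * X) @ expm(t * Y))
    return np.linalg.norm(exact - (t * Z1 + t**2 * Z2 + t**3 * Z3))

ratio = residual(0.2) / residual(0.1)
print(ratio)  # close to 16, consistent with an O(t^4) error
```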


The proof above was by direct computation: we calculated the terms of the formula by matching coefficients in the expansion. The point is that it is far from obvious that every term can be expressed purely in terms of commutators.

We now look at a standard proof of the BCH formula that defines a parameter-dependent operator \displaystyle Z(t) and derives a differential equation for it. This will give a conceptual explanation for the constants and terms that appear.

Define \displaystyle Z(t) such that:

\displaystyle e^{Z(t)} = e^{tX} e^Y

We want to find \displaystyle Z = Z(1) .

At \displaystyle t = 0 , we have: \displaystyle e^{Z(0)} = e^{0 \cdot X} e^Y = e^Y \Rightarrow Z(0) = Y

Differentiate the identity:

\displaystyle \frac{d}{dt} \left( e^{Z(t)} \right) = \frac{d}{dt} \left( e^{tX} e^Y \right)

Right-hand side:

\displaystyle \frac{d}{dt} \left( e^{tX} e^Y \right) = X e^{tX} e^Y = X e^{Z(t)}

Left-hand side: When \displaystyle Z(t) is a scalar, the rule is simply:

\displaystyle \frac{d}{dt} e^{Z(t)} = \dot{Z}(t) e^{Z(t)}

But when \displaystyle Z(t) is an operator that may not commute with \displaystyle \dot{Z}(t) , this identity no longer holds. Instead, we use the formula for differentiating an operator exponential when \displaystyle Z(t) and \displaystyle \dot{Z}(t) may not commute:

\displaystyle \frac{d}{dt} e^{Z(t)} = \left( \int_0^1 ds \, e^{sZ(t)} \dot{Z}(t) e^{-sZ(t)} \right) e^{Z(t)} .

We will prove this identity at the end.

Set both derivatives equal:

\displaystyle X e^{Z(t)} = \left( \int_0^1 ds \, e^{sZ(t)} \dot{Z}(t) e^{-sZ(t)} \right) e^{Z(t)}

Cancel \displaystyle e^{Z(t)} on the right:

\displaystyle X = \int_0^1 ds \, e^{sZ(t)} \dot{Z}(t) e^{-sZ(t)}

Adjoint Representation

Let us now understand the meaning of \displaystyle e^{sZ(t)} \dot{Z}(t) e^{-sZ(t)} .

If \displaystyle A and \displaystyle B are operators, conjugating \displaystyle B by \displaystyle e^A gives the expansion (sometimes called Hadamard's lemma):

\displaystyle e^A B e^{-A} = B + [A, B] + \frac{1}{2!}[A, [A, B]] + \frac{1}{3!}[A, [A, [A, B]]] + \cdots

This is the Taylor expansion of the exponential of the adjoint operator, denoted:

\displaystyle \text{ad}_A(B) = [A, B], \quad \text{ad}_A^2(B) = [A, [A, B]], \quad \text{etc.} .

This is a fundamental object in Lie theory.
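The expansion above is easy to verify numerically by comparing direct conjugation with a truncated sum of nested commutators. A sketch with NumPy/SciPy, using arbitrary example matrices:

```python
# Compare e^A B e^{-A} computed directly with the truncated series
# B + [A,B] + [A,[A,B]]/2! + ... of nested commutators.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 0.3], [-0.2, 0.1]])
B = np.array([[1.0, 0.0], [0.5, -1.0]])

direct = expm(A) @ B @ expm(-A)

term, total = B.copy(), B.copy()
fact = 1.0
for n in range(1, 20):           # truncate the exp(ad_A) series
    term = A @ term - term @ A   # ad_A applied once more
    fact *= n
    total = total + term / fact

err = np.linalg.norm(direct - total)
print(err)  # essentially zero: the series converges rapidly here
```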

Then:

\displaystyle e^A B e^{-A} = e^{\text{ad}_A}(B)

So:

\displaystyle e^{sZ(t)} \dot{Z}(t) e^{-sZ(t)} = e^{s \cdot \text{ad}_{Z(t)}}(\dot{Z}(t))

Now, the integral becomes:

\displaystyle X = \int_0^1 e^{s \cdot \text{ad}_{Z(t)}}(\dot{Z}(t)) ds

Formally performing the scalar integral \displaystyle \int_0^1 e^{sz} ds = (e^z - 1)/z , with \displaystyle z replaced by \displaystyle \text{ad}_{Z(t)} , yields:

\displaystyle X = \left( \frac{e^{\text{ad}_{Z(t)}} - 1}{\text{ad}_{Z(t)}} \right) \dot{Z}(t)

We can now formally invert the operator acting on \displaystyle \dot{Z}(t) to solve for it:

\displaystyle \dot{Z}(t) = \left( \frac{\text{ad}_{Z(t)}}{e^{\text{ad}_{Z(t)}} - 1} \right) X

The function \displaystyle g(z) = \frac{z}{e^z - 1} has the (defining) series in terms of Bernoulli numbers:

\displaystyle \frac{z}{e^z - 1} = \sum_{n=0}^\infty \frac{B_n}{n!} z^n

Thus,

\displaystyle \dot{Z}(t) = \left( \sum_{n=0}^\infty \frac{B_n}{n!} \text{ad}_{Z(t)}^n \right) X

Expanding, with \displaystyle B_0 = 1,\; B_1 = -\tfrac{1}{2},\; B_2 = \tfrac{1}{6},\; B_3 = 0,\; B_4 = -\tfrac{1}{30} , we get

\displaystyle \dot{Z}(t) = X - \frac{1}{2}[Z(t), X] + \frac{1}{12}[Z(t), [Z(t), X]] - \frac{1}{720}[Z(t), [Z(t), [Z(t), [Z(t), X]]]] + \cdots
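The constants above can be recovered symbolically: expanding \displaystyle z/(e^z - 1) with SymPy gives the coefficients \displaystyle 1, -\tfrac{1}{2}, \tfrac{1}{12}, 0, -\tfrac{1}{720} of \displaystyle z^0, \dots, z^4 . A short sketch:

```python
# Recover the Bernoulli-number coefficients of z/(e^z - 1) with SymPy.
from sympy import Rational, exp, series, symbols

z = symbols('z')
s = series(z / (exp(z) - 1), z, 0, 6).removeO()
coeffs = [s.coeff(z, n) for n in range(5)]
print(coeffs)  # [1, -1/2, 1/12, 0, -1/720]
```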

We integrate from \displaystyle t=0 to \displaystyle t=1 :

\displaystyle Z = Z(1) = Z(0) + \int_0^1 \dot{Z}(t) \, dt = Y + \int_0^1 \dot{Z}(t) \, dt

So we have the integral equation:

\displaystyle Z = Y + \int_0^1 \left(X - \frac{1}{2}[Z(t), X] + \frac{1}{12}[Z(t), [Z(t), X]] - \frac{1}{720}[Z(t), [Z(t), [Z(t), [Z(t), X]]]] + \cdots \right) dt

Solving this integral equation iteratively gives the terms of the BCH formula. For example, approximating \displaystyle Z(t) \approx Y + tX inside the expression for \displaystyle \dot{Z}(t) :

\displaystyle \dot{Z}(t) \approx X - \frac{1}{2}[Y + tX, X] = X - \frac{1}{2}[Y, X] - \frac{t}{2}[X, X]

Since \displaystyle [X,X] = 0 , we get:

\displaystyle \dot{Z}(t) \approx X - \frac{1}{2}[Y, X] = X + \frac{1}{2}[X, Y]

Then,

\displaystyle Z(1) \approx Y + \int_0^1 \left( X + \frac{1}{2}[X, Y] \right) dt = Y + X + \frac{1}{2}[X, Y]

Continuing this iterative process to higher orders generates all the terms of the BCH formula stated in the theorem.

\displaystyle Z = X + Y + \frac{1}{2}[X,Y] + \frac{1}{12}[X,[X,Y]] - \frac{1}{12}[Y,[X,Y]] + \cdots


Proof of the derivative formula:

We aim to prove the identity:

\displaystyle \frac{d}{dt} e^{Z(t)} = \int_0^1 e^{sZ(t)} \dot{Z}(t) e^{(1 - s)Z(t)} \, ds

where \displaystyle Z(t) is an operator-valued function (e.g., a matrix), and \displaystyle \dot{Z}(t) = \frac{dZ}{dt} .

The exponential of an operator is defined as a power series:

\displaystyle e^{Z(t)} = \sum_{n=0}^\infty \frac{Z(t)^n}{n!} = 1 + Z(t) + \frac{1}{2!}Z(t)^2 + \frac{1}{3!}Z(t)^3 + \cdots

We differentiate this series term-by-term with respect to \displaystyle t :

\displaystyle \frac{d}{dt} e^{Z(t)} = \sum_{n=1}^\infty \frac{1}{n!} \frac{d}{dt} Z(t)^n

To differentiate \displaystyle Z(t)^n , we use the product rule. Since the operator \displaystyle Z(t) appears \displaystyle n times and generally does not commute with its derivative, we get:


\displaystyle \frac{d}{dt} Z(t)^n = \sum_{k=0}^{n-1} Z(t)^k \dot{Z}(t) Z(t)^{n - 1 - k}

Therefore:

\displaystyle \frac{d}{dt} e^{Z(t)} = \sum_{n=1}^\infty \frac{1}{n!} \sum_{k=0}^{n-1} Z(t)^k \dot{Z}(t) Z(t)^{n - 1 - k}
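The non-commutative product rule used above can be checked against a finite-difference derivative along the line \displaystyle Z(t) = A + tB (so \displaystyle \dot{Z} = B ). A NumPy sketch with arbitrary example matrices:

```python
# Finite-difference check of d/dt Z(t)^n = sum_k Z^k Zdot Z^{n-1-k}
# along Z(t) = A + tB, evaluated at t = 0.
import numpy as np

A = np.array([[0.0, 1.0], [2.0, 0.0]])
B = np.array([[0.5, 0.0], [1.0, -0.5]])
n, h = 3, 1e-5

exact = sum(
    np.linalg.matrix_power(A, k) @ B @ np.linalg.matrix_power(A, n - 1 - k)
    for k in range(n)
)
numeric = (np.linalg.matrix_power(A + h * B, n)
           - np.linalg.matrix_power(A - h * B, n)) / (2 * h)

err = np.linalg.norm(exact - numeric)
print(err)  # small: agreement up to O(h^2) finite-difference error
```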

We’ll now derive the same expression by expanding the integral identity, and match both sides.

Let us now consider the right-hand side:

\displaystyle I = \int_0^1 e^{sZ(t)} \dot{Z}(t) e^{(1 - s)Z(t)} \, ds

We expand the exponentials into power series:


\displaystyle e^{sZ(t)} = \sum_{k=0}^\infty \frac{(sZ(t))^k}{k!}, \quad e^{(1 - s)Z(t)} = \sum_{j=0}^\infty \frac{((1 - s)Z(t))^j}{j!}

Therefore:

\displaystyle I = \int_0^1 \left( \sum_{k=0}^\infty \frac{(sZ(t))^k}{k!} \right) \dot{Z}(t) \left( \sum_{j=0}^\infty \frac{((1 - s)Z(t))^j}{j!} \right) ds

We rearrange the sums and bring them outside the integral:


\displaystyle I = \sum_{k=0}^\infty \sum_{j=0}^\infty \frac{Z(t)^k \dot{Z}(t) Z(t)^j}{k! j!} \int_0^1 s^k (1 - s)^j \, ds

The integral is a standard Beta function:


\displaystyle \int_0^1 s^k (1 - s)^j \, ds = \frac{k! j!}{(k + j + 1)!}
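This Beta-function identity is easy to confirm for small exponents, e.g. with SymPy:

```python
# Confirm the Beta integral int_0^1 s^k (1-s)^j ds = k! j! / (k+j+1)!
# for a few small exponents.
from sympy import factorial, integrate, symbols

s = symbols('s')
checks = [
    integrate(s**k * (1 - s)**j, (s, 0, 1))
    == factorial(k) * factorial(j) / factorial(k + j + 1)
    for k in range(4) for j in range(4)
]
print(all(checks))  # True
```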

So we have:

\displaystyle I = \sum_{k=0}^\infty \sum_{j=0}^\infty \frac{1}{(k + j + 1)!} Z(t)^k \dot{Z}(t) Z(t)^j

Let \displaystyle n = k + j + 1 , so that \displaystyle k + j = n - 1 , and \displaystyle k \in [0, n - 1] .

Then:


\displaystyle I = \sum_{n=1}^\infty \frac{1}{n!} \sum_{k=0}^{n - 1} Z(t)^k \dot{Z}(t) Z(t)^{n - 1 - k}

This matches exactly the expression we derived before. Therefore we have:

\displaystyle \frac{d}{dt} e^{Z(t)} = \int_0^1 e^{sZ(t)} \dot{Z}(t) e^{(1 - s)Z(t)} \, ds
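Finally, the identity itself can be sanity-checked numerically: compare a finite-difference derivative of \displaystyle e^{Z(t)} with a quadrature of the integrand, again along \displaystyle Z(t) = A + tB . A sketch with NumPy/SciPy (trapezoidal rule, arbitrary example matrices):

```python
# Compare a finite-difference derivative of expm(Z(t)) with a
# trapezoidal-rule quadrature of int_0^1 e^{sZ} Zdot e^{(1-s)Z} ds.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.2, 0.0], [0.3, -0.1]])
t, h = 0.7, 1e-6

Z, Zdot = A + t * B, B

# left-hand side: numerical derivative of expm(Z(t))
fd = (expm(A + (t + h) * B) - expm(A + (t - h) * B)) / (2 * h)

# right-hand side: quadrature of e^{sZ} Zdot e^{(1-s)Z} over s in [0, 1]
svals = np.linspace(0.0, 1.0, 201)
vals = np.array([expm(s * Z) @ Zdot @ expm((1 - s) * Z) for s in svals])
ds = svals[1] - svals[0]
integral = (vals[0] + vals[-1]) / 2 * ds + vals[1:-1].sum(axis=0) * ds

err = np.linalg.norm(fd - integral)
print(err)  # small: both sides agree up to discretization error
```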
