Riesz representation theorem
The Riesz representation theorem, sometimes called the Riesz–Fréchet representation theorem after Frigyes Riesz and Maurice René Fréchet, establishes an important connection between a Hilbert space and its continuous dual space. If the underlying field is the real numbers, the two are isometrically isomorphic; if the underlying field is the complex numbers, the two are isometrically anti-isomorphic. The (anti-) isomorphism is a particular natural isomorphism.
Preliminaries and notation
Let H be a Hilbert space over a field \F, where \F is either the real numbers \R or the complex numbers \Complex. If \F = \R (resp. if \F = \Complex) then H is called a real Hilbert space (resp. a complex Hilbert space). Every real Hilbert space can be extended to be a dense subset of a unique (up to bijective isometry) complex Hilbert space, called its complexification, which is why Hilbert spaces are often automatically assumed to be complex. Real and complex Hilbert spaces have in common many, but by no means all, properties and results/theorems.
This article is intended for both mathematicians and physicists and will describe the theorem for both. In both mathematics and physics, if a Hilbert space is assumed to be real (that is, if \F = \R) then this will usually be made clear. Often in mathematics, and especially in physics, unless indicated otherwise, "Hilbert space" is automatically assumed to mean "complex Hilbert space." Depending on the author, in mathematics, "Hilbert space" usually means either (1) a complex Hilbert space, or (2) a real or complex Hilbert space.
Linear and antilinear maps
By definition, an antilinear map (also called a conjugate-linear map) f : H \to Y is a map between vector spaces that is additive:
f(x + y) = f(x) + f(y) for all x, y \in H,
and antilinear (also called conjugate-homogeneous):
f(c x) = \overline{c} f(x) for all x \in H and all scalars c \in \F,
where \overline{c} is the conjugate of the complex number c = a + b i, given by \overline{c} = a - b i.
In contrast, a map f : H \to Y is linear if it is additive and homogeneous:
f(c x) = c f(x) for all x \in H and all scalars c \in \F.
Every constant 0 map is always both linear and antilinear. If \F = \R then the definitions of linear maps and antilinear maps are completely identical. A linear map from a Hilbert space into a Banach space (or more generally, from any Banach space into any topological vector space) is continuous if and only if it is bounded; the same is true of antilinear maps. The inverse of any antilinear (resp. linear) bijection is again an antilinear (resp. linear) bijection. The composition of two antilinear maps is a linear map.
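The composition rule can be illustrated numerically. The following is a minimal sketch (the maps f and g and all constants are hypothetical examples, not taken from the article): each of f(z) = a \overline{z} and g(z) = b \overline{z} is antilinear on \Complex, yet their composition is linear.

```python
# Hypothetical one-dimensional example over C: f(z) = a*conj(z) and
# g(z) = b*conj(z) are antilinear, but g∘f is linear.
a, b = 2 + 1j, 1 - 3j

def f(z): return a * z.conjugate()   # antilinear: f(c*z) == conj(c)*f(z)
def g(z): return b * z.conjugate()   # antilinear

c, z = 3 - 2j, 5 + 4j
# antilinearity of f:
assert abs(f(c * z) - c.conjugate() * f(z)) < 1e-12
# the composition g∘f is linear: (g∘f)(c*z) == c*(g∘f)(z)
assert abs(g(f(c * z)) - c * g(f(z))) < 1e-12
```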
Continuous dual and anti-dual spaces
A functional on H is a function H \to \F whose codomain is the underlying scalar field \F. Denote by H^* (resp. by \overline{H}^*) the set of all continuous linear (resp. continuous antilinear) functionals on H, which is called the (continuous) dual space (resp. the (continuous) anti-dual space) of H. If \F = \R then linear functionals on H are the same as antilinear functionals and consequently, the same is true for such continuous maps: that is, H^* = \overline{H}^*.
One-to-one correspondence between linear and antilinear functionals
Given any functional f : H \to \F, the conjugate of f is the functional
\overline{f} : H \to \F defined by h \mapsto \overline{f(h)}.
This assignment is most useful when \F = \Complex because if \F = \R then f = \overline{f} and the assignment f \mapsto \overline{f} reduces down to the identity map.
The assignment f \mapsto \overline{f} defines an antilinear bijective correspondence from the set of all functionals (resp. all linear functionals, all continuous linear functionals H^*) on H, onto the set of all functionals (resp. all antilinear functionals, all continuous antilinear functionals \overline{H}^*) on H.
Mathematics vs. physics notations and definitions of inner product
The Hilbert space H has an associated inner product valued in H's underlying scalar field \F that is linear in one coordinate and antilinear in the other (as specified below). If H is a complex Hilbert space (\F = \Complex), then there is a crucial difference between the notations prevailing in mathematics versus physics, regarding which of the two variables is linear. However, for real Hilbert spaces (\F = \R), the inner product is a symmetric map that is linear in each coordinate (bilinear), so there can be no such confusion.
In mathematics, the inner product on a Hilbert space H is often denoted by \langle \cdot, \cdot \rangle or \langle \cdot, \cdot \rangle_H, while in physics, the bra–ket notation \langle \cdot \mid \cdot \rangle or \langle \cdot \mid \cdot \rangle_H is typically used. In this article, these two notations will be related by the equality:
\langle x, y \rangle := \langle y \mid x \rangle for all x, y \in H.
These have the following properties:
- The map \langle \cdot, \cdot \rangle is linear in its first coordinate; equivalently, the map \langle \cdot \mid \cdot \rangle is linear in its second coordinate. That is, for fixed y \in H, the map \langle y \mid \cdot \rangle = \langle \cdot, y \rangle : H \to \F with x \mapsto \langle y \mid x \rangle = \langle x, y \rangle is a linear functional on H. This linear functional is continuous, so \langle y \mid \cdot \rangle = \langle \cdot, y \rangle \in H^*.
- The map \langle \cdot, \cdot \rangle is antilinear in its second coordinate; equivalently, the map \langle \cdot \mid \cdot \rangle is antilinear in its first coordinate. That is, for fixed y \in H, the map \langle \cdot \mid y \rangle = \langle y, \cdot \rangle : H \to \F with x \mapsto \langle x \mid y \rangle = \langle y, x \rangle is an antilinear functional on H. This antilinear functional is continuous, so \langle \cdot \mid y \rangle = \langle y, \cdot \rangle \in \overline{H}^*.
In computations, one must consistently use either the mathematics notation \langle \cdot, \cdot \rangle, which is (linear, antilinear); or the physics notation \langle \cdot \mid \cdot \rangle, which is (antilinear | linear).
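The two conventions can be checked concretely in finite dimensions. The following is a minimal sketch, assuming H = \Complex^2 and representing vectors as Python lists (the helper names braket and inner are hypothetical, introduced only for this illustration):

```python
# Sketch of the two conventions on C^2 (assumption: finite-dimensional H = C^n).
def braket(z, w):               # physics ⟨z|w⟩: antilinear in z, linear in w
    return sum(zi.conjugate() * wi for zi, wi in zip(z, w))

def inner(x, y):                # mathematics ⟨x, y⟩ := ⟨y|x⟩: linear in x
    return braket(y, x)

x, y = [1 + 2j, 3j], [2 - 1j, 1 + 1j]
c = 2 + 5j
cx = [c * xi for xi in x]
cy = [c * yi for yi in y]
# ⟨·,·⟩ is linear in its first coordinate ...
assert abs(inner(cx, y) - c * inner(x, y)) < 1e-12
# ... equivalently, ⟨·|·⟩ is linear in its second coordinate:
assert abs(braket(y, cx) - c * braket(y, x)) < 1e-12
# and antilinear in its first coordinate:
assert abs(braket(cy, x) - c.conjugate() * braket(y, x)) < 1e-12
```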
Canonical norm and inner product on the dual space and anti-dual space
If x \in H then \langle x \mid x \rangle = \langle x, x \rangle is a non-negative real number and the map
\|x\| := \sqrt{\langle x, x \rangle}
defines a canonical norm on H that makes H into a normed space. As with all normed spaces, the (continuous) dual space H^* carries a canonical norm, called the dual norm, that is defined by
\|f\|_{H^*} := \sup_{\|x\| \le 1, x \in H} |f(x)| for every f \in H^*.
The canonical norm on the anti-dual space \overline{H}^*, denoted by \|f\|_{\overline{H}^*}, is defined by using this same equation:
\|f\|_{\overline{H}^*} := \sup_{\|x\| \le 1, x \in H} |f(x)| for every f \in \overline{H}^*.
This canonical norm on H^* satisfies the parallelogram law, which means that the polarization identity can be used to define a canonical inner product on H^*, which this article will denote by the notations
\langle f, g \rangle_{H^*} := \langle g \mid f \rangle_{H^*},
where this inner product turns H^* into a Hilbert space. There are now two ways of defining a norm on H^*: the norm induced by this inner product (that is, the norm defined by f \mapsto \sqrt{\langle f, f \rangle_{H^*}}) and the usual dual norm (defined as the supremum over the closed unit ball). These norms are the same; explicitly, this means that the following holds for every f \in H^*:
\sup_{\|x\| \le 1, x \in H} |f(x)| = \|f\|_{H^*} = \sqrt{\langle f, f \rangle_{H^*}}.
As will be described later, the Riesz representation theorem can be used to give an equivalent definition of the canonical norm and the canonical inner product on H^*. The same equations that were used above can also be used to define a norm and inner product on H's anti-dual space \overline{H}^*.
Canonical isometry between the dual and antidual
The conjugate \overline{f} of a functional f, which was defined above, satisfies
\|f\|_{H^*} = \left\|\overline{f}\right\|_{\overline{H}^*} and \left\|\overline{g}\right\|_{H^*} = \|g\|_{\overline{H}^*}
for every f \in H^* and every g \in \overline{H}^*. This says exactly that the canonical antilinear bijection defined by
\operatorname{Cong} : H^* \to \overline{H}^*, where \operatorname{Cong}(f) := \overline{f},
as well as its inverse \operatorname{Cong}^{-1} : \overline{H}^* \to H^* are antilinear isometries and consequently also homeomorphisms. The inner products on the dual space H^* and the anti-dual space \overline{H}^*, denoted respectively by \langle \cdot, \cdot \rangle_{H^*} and \langle \cdot, \cdot \rangle_{\overline{H}^*}, are related by
\left\langle \overline{f}, \overline{g} \right\rangle_{\overline{H}^*} = \overline{\langle f, g \rangle_{H^*}} = \langle g, f \rangle_{H^*} for all f, g \in H^*
and
\left\langle \overline{f}, \overline{g} \right\rangle_{H^*} = \overline{\langle f, g \rangle_{\overline{H}^*}} = \langle g, f \rangle_{\overline{H}^*} for all f, g \in \overline{H}^*.
If \F = \R then H^* = \overline{H}^* and this canonical map \operatorname{Cong} : H^* \to \overline{H}^* reduces down to the identity map.
Riesz representation theorem
Two vectors x and y are orthogonal if \langle x, y \rangle = 0, which happens if and only if \|y\| \le \|y + s x\| for all scalars s. The orthogonal complement of a subset X \subseteq H is
X^{\bot} := \{ y \in H : \langle y, x \rangle = 0 \text{ for all } x \in X \},
which is always a closed vector subspace of H. The Hilbert projection theorem guarantees that for any nonempty closed convex subset C of a Hilbert space there exists a unique vector m \in C such that \|m\| = \inf_{c \in C} \|c\|; that is, m \in C is the (unique) global minimum point of the function C \to [0, \infty) defined by c \mapsto \|c\|.
Statement
Historically, the theorem is often attributed simultaneously to Riesz and Fréchet in 1907 (see references).
Riesz representation theorem: Let H be a Hilbert space whose inner product \langle x, y \rangle is linear in its first argument and antilinear in its second argument (with physics notation \langle y \mid x \rangle := \langle x, y \rangle). For every continuous linear functional \varphi \in H^* there exists a unique vector f_\varphi \in H, called the Riesz representation of \varphi, such that
\varphi(x) = \langle x, f_\varphi \rangle = \langle f_\varphi \mid x \rangle for all x \in H,
and moreover \left\|f_\varphi\right\|_H = \|\varphi\|_{H^*}.
Proof of the norm formula: Let \F denote the underlying scalar field of H. Fix f \in H. Define \Lambda : H \to \F by \Lambda(x) := \langle f \mid x \rangle, which is a linear functional on H since x is in the linear argument. By the Cauchy–Schwarz inequality,
|\Lambda(x)| = |\langle f \mid x \rangle| \le \|f\| \|x\|,
which shows that \Lambda is bounded (equivalently, continuous) and that \|\Lambda\| \le \|f\|. It remains to show that \|f\| \le \|\Lambda\|. By using f in place of x, it follows that
\|f\|^2 = \langle f \mid f \rangle = \Lambda(f) = |\Lambda(f)| \le \|\Lambda\| \|f\|
(the equality \Lambda(f) = |\Lambda(f)| holds because \Lambda(f) = \|f\|^2 is real and non-negative). Thus \|\Lambda\| = \|f\|. The proof above did not use the fact that H is complete, which shows that the formula for the norm \|\langle f \mid \cdot \rangle\|_{H^*} = \|f\|_H holds more generally for all inner product spaces.
Proof of uniqueness: Suppose f, g \in H are such that \varphi(z) = \langle f \mid z \rangle and \varphi(z) = \langle g \mid z \rangle for all z \in H. Then
\langle f - g \mid z \rangle = \langle f \mid z \rangle - \langle g \mid z \rangle = \varphi(z) - \varphi(z) = 0 for all z \in H,
which shows that \langle f - g \mid \cdot \rangle is the constant 0 linear functional. Consequently 0 = \|\langle f - g \mid \cdot \rangle\| = \|f - g\|, which implies that f = g.
Proof of existence: Let K := \ker\varphi := \{ m \in H : \varphi(m) = 0 \}. If K = H (or equivalently, if \varphi = 0) then taking f_\varphi := 0 completes the proof, so assume that K \ne H and \varphi \ne 0. The continuity of \varphi implies that K is a closed subspace of H (because K = \varphi^{-1}(\{0\}) and \{0\} is a closed subset of \F). Let K^\bot denote the orthogonal complement of K in H. Because K is closed and H is a Hilbert space,[1] H can be written as the direct sum H = K \oplus K^\bot [2] (a proof of this is given in the article on the Hilbert projection theorem). Because K \ne H, there exists some non-zero p \in K^\bot. For any h \in H,
\varphi[(\varphi h) p - (\varphi p) h] = (\varphi h)(\varphi p) - (\varphi p)(\varphi h) = 0,
which shows that (\varphi h) p - (\varphi p) h \in \ker\varphi = K, where now p \in K^\bot implies
0 = \langle p \mid (\varphi h) p - (\varphi p) h \rangle = (\varphi h) \langle p \mid p \rangle - (\varphi p) \langle p \mid h \rangle.
Solving for \varphi h shows that
\varphi h = \frac{(\varphi p) \langle p \mid h \rangle}{\|p\|^2} = \left\langle \frac{\overline{\varphi(p)}}{\|p\|^2} p \;\Big|\; h \right\rangle,
which proves that the vector f_\varphi := \frac{\overline{\varphi(p)}}{\|p\|^2} p satisfies
\varphi h = \langle f_\varphi \mid h \rangle for every h \in H.
Applying the norm formula that was proved above with f := f_\varphi shows that
\|\varphi\|_{H^*} = \left\| \left\langle f_\varphi \mid \cdot \right\rangle \right\|_{H^*} = \left\|f_\varphi\right\|_H.
Also, the vector u := \frac{p}{\|p\|} has norm \|u\| = 1 and satisfies f_\varphi := \overline{\varphi(u)} u.
It can now be deduced that K^\bot is 1-dimensional when \varphi \ne 0. Let q \in K^\bot be any non-zero vector. Replacing p with q in the proof above shows that the vector g := \frac{\overline{\varphi(q)}}{\|q\|^2} q satisfies \varphi(h) = \langle g \mid h \rangle for every h \in H. The uniqueness of the (non-zero) vector f_\varphi representing \varphi implies that g = f_\varphi, which in turn implies that \overline{\varphi(q)} \ne 0 and q = \frac{\|q\|^2}{\overline{\varphi(q)}} f_\varphi. Thus every vector in K^\bot is a scalar multiple of f_\varphi.
The formulas for the inner products follow from the polarization identity.
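The theorem's defining condition and norm formula can be verified numerically in finite dimensions. A minimal sketch, assuming H = \Complex^3 with the physics inner product and a hypothetical functional \varphi(w) = \sum_i \varphi_i w_i (all names and coefficients below are illustrative assumptions):

```python
# Numerical sketch in H = C^3 of φ = ⟨f_φ|·⟩ and ‖f_φ‖² = φ(f_φ).
def braket(z, w):                          # physics ⟨z|w⟩, antilinear in z
    return sum(zi.conjugate() * wi for zi, wi in zip(z, w))

phi_coeffs = [1 + 1j, -2j, 3.0]            # φ(w) = Σ φ_i w_i
def phi(w): return sum(p * wi for p, wi in zip(phi_coeffs, w))

f = [p.conjugate() for p in phi_coeffs]    # candidate Riesz vector f_φ
norm = lambda v: abs(braket(v, v)) ** 0.5

w = [0.5 - 1j, 2 + 2j, -1j]
assert abs(phi(w) - braket(f, w)) < 1e-9        # φ = ⟨f_φ | ·⟩
assert abs(phi(f) - norm(f) ** 2) < 1e-9        # φ(f_φ) = ‖f_φ‖² = ‖φ‖²
assert abs(phi(w)) <= norm(f) * norm(w) + 1e-9  # |φ(w)| ≤ ‖f_φ‖‖w‖ (Cauchy–Schwarz)
```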
Observations
If \varphi \in H^* then
\varphi\left(f_\varphi\right) = \left\langle f_\varphi \mid f_\varphi \right\rangle = \left\|f_\varphi\right\|^2 = \|\varphi\|^2.
So in particular, \varphi\left(f_\varphi\right) \ge 0 is always real and furthermore, \varphi\left(f_\varphi\right) = 0 if and only if f_\varphi = 0 if and only if \varphi = 0.
Linear functionals as affine hyperplanes
A non-trivial continuous linear functional \varphi is often interpreted geometrically by identifying it with the affine hyperplane A := \varphi^{-1}(1) (the kernel \ker\varphi = \varphi^{-1}(0) is also often visualized alongside A := \varphi^{-1}(1), although knowing A is enough to reconstruct \varphi, because if A = \varnothing then \varphi = 0 and otherwise \ker\varphi = A - A). In particular, the norm of \varphi should somehow be interpretable as the "norm of the hyperplane A". When \varphi \ne 0, the Riesz representation theorem provides such an interpretation of \|\varphi\| in terms of the affine hyperplane A := \varphi^{-1}(1) as follows: using the notation from the theorem's statement, from \|\varphi\|^2 = \varphi\left(f_\varphi\right) it follows that
C := \varphi^{-1}\left(\|\varphi\|^2\right) = \|\varphi\|^2 \varphi^{-1}(1) = \|\varphi\|^2 A
and so \|\varphi\| = \left\|f_\varphi\right\| = \inf_{c \in C} \|c\| implies \|\varphi\| = \inf_{a \in A} \|\varphi\|^2 \|a\| and thus
\|\varphi\| = \frac{1}{\inf_{a \in A} \|a\|}.
This can also be seen by applying the Hilbert projection theorem to A and concluding that the global minimum point of the map A \to [0, \infty) defined by a \mapsto \|a\| is \frac{f_\varphi}{\|\varphi\|^2}. The formulas
\frac{1}{\inf_{a \in A} \|a\|} = \|\varphi\| = \sup_{\|x\| \le 1} |\varphi(x)|
provide the promised interpretation of the linear functional's norm \|\varphi\| entirely in terms of its associated affine hyperplane A = \varphi^{-1}(1) (because with this formula, knowing only the set A is enough to describe the norm of its associated linear functional). Defining \frac{1}{\infty} := 0, the infimum formula \|\varphi\| = \frac{1}{\inf_{a \in A} \|a\|} will also hold when \varphi = 0. When the supremum is taken in \R \cup \{\pm\infty\} (as is typically assumed), then the supremum of the empty set is \sup \varnothing = -\infty, but if the supremum is taken in the non-negative reals [0, \infty) (which is the image/range of the norm \|\cdot\| when \dim H > 0) then this supremum is instead \sup \varnothing = 0, in which case the supremum formula \|\varphi\| = \sup_{a \in A} \frac{1}{\|a\|} will also hold when \varphi = 0 (although the atypical equality \sup \varnothing = 0 is usually unexpected and so risks causing confusion).
Constructions of the representing vector
Using the notation from the theorem above, several ways of constructing f_\varphi from \varphi \in H^* are now described. If \varphi = 0 then f_\varphi := 0; in other words, f_0 = 0. This special case of \varphi = 0 is henceforth assumed to be known, which is why some of the constructions given below start by assuming \varphi \ne 0.
Orthogonal complement of kernel
If \varphi \ne 0 then for any 0 \ne u \in (\ker\varphi)^\bot,
f_\varphi = \frac{\overline{\varphi(u)}}{\|u\|^2} u.
If u \in (\ker\varphi)^\bot is a unit vector (meaning \|u\| = 1) then
f_\varphi = \overline{\varphi(u)} u
(this is true even if \varphi = 0 because in this case f_\varphi = \overline{\varphi(u)} u = \overline{0} u = 0). If u is a unit vector satisfying the above condition then the same is true of -u, which is also a unit vector in (\ker\varphi)^\bot. However, \overline{\varphi(-u)}(-u) = \overline{\varphi(u)} u = f_\varphi, so both these vectors result in the same f_\varphi.
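This construction can be sketched numerically. A minimal example, assuming H = \Complex^2, where (\ker\varphi)^\bot is spanned by the conjugated coefficient vector of \varphi (the names below are illustrative assumptions):

```python
# Sketch in C^2: take a nonzero u ⟂ ker φ and form f_φ = conj(φ(u)) u / ‖u‖².
def braket(z, w):
    return sum(zi.conjugate() * wi for zi, wi in zip(z, w))

phi_coeffs = [2 + 1j, 1 - 1j]
def phi(w): return sum(p * wi for p, wi in zip(phi_coeffs, w))

# In C^n, (ker φ)^⊥ is spanned by the conjugated coefficient vector:
u = [p.conjugate() for p in phi_coeffs]
k = [-phi_coeffs[1], phi_coeffs[0]]        # a vector in ker φ: φ(k) = 0
assert abs(phi(k)) < 1e-12
assert abs(braket(u, k)) < 1e-12           # u ⟂ ker φ

nu2 = abs(braket(u, u))
f = [phi(u).conjugate() * ui / nu2 for ui in u]
w = [1 + 1j, 3 - 2j]
assert abs(phi(w) - braket(f, w)) < 1e-9   # f satisfies φ = ⟨f|·⟩
```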
Orthogonal projection onto kernel
If x \in H is such that \varphi(x) \ne 0 and if x_K is the orthogonal projection of x onto \ker\varphi, then
f_\varphi = \frac{\|\varphi\|^2}{\varphi(x)} \left(x - x_K\right).
Orthonormal basis
Given an orthonormal basis \left\{e_i\right\}_{i \in I} of H and a continuous linear functional \varphi \in H^*, the vector f_\varphi \in H can be constructed uniquely by
f_\varphi = \sum_{i \in I} \overline{\varphi\left(e_i\right)} e_i,
where all but at most countably many \varphi\left(e_i\right) will be equal to 0, and where the value of f_\varphi does not actually depend on the choice of orthonormal basis (that is, using any other orthonormal basis for H will result in the same vector). If y \in H is written as y = \sum_{i \in I} a_i e_i then
\varphi(y) = \sum_{i \in I} a_i \varphi\left(e_i\right) = \left\langle f_\varphi \mid y \right\rangle
and
\left\|f_\varphi\right\|^2 = \sum_{i \in I} \left|\varphi\left(e_i\right)\right|^2 = \|\varphi\|^2.
If the orthonormal basis \left\{e_i\right\}_{i=1}^\infty is a sequence then this becomes
f_\varphi = \overline{\varphi\left(e_1\right)} e_1 + \overline{\varphi\left(e_2\right)} e_2 + \cdots
and if y \in H is written as y = \sum_i a_i e_i then
\varphi(y) = a_1 \varphi\left(e_1\right) + a_2 \varphi\left(e_2\right) + \cdots.
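The basis formula above can be sketched directly. A minimal example, assuming H = \Complex^3 with the standard basis and a hypothetical functional \varphi (all names are illustrative assumptions):

```python
# Sketch in C^3 with the standard orthonormal basis e_1, e_2, e_3.
def braket(z, w):
    return sum(zi.conjugate() * wi for zi, wi in zip(z, w))

def phi(w):                      # an arbitrary continuous linear functional
    return (1 - 2j) * w[0] + 3j * w[1] + (0.5 + 1j) * w[2]

basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
coeffs = [phi(e).conjugate() for e in basis]     # f_φ = Σ conj(φ(e_i)) e_i
f = [sum(c * e[j] for c, e in zip(coeffs, basis)) for j in range(3)]

y = [2 - 1j, 1j, 4.0]
assert abs(phi(y) - braket(f, y)) < 1e-9         # φ(y) = ⟨f_φ | y⟩
# ‖f_φ‖² = Σ |φ(e_i)|²
assert abs(braket(f, f) - sum(abs(phi(e)) ** 2 for e in basis)) < 1e-9
```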
Example in finite dimensions using matrix transformations
Consider the special case of H = \Complex^n (where n > 0 is an integer) with the standard inner product
\langle z \mid w \rangle := \overline{\vec{z}}^{\operatorname{T}} \vec{w},
where w, z \in H are represented as column matrices \vec{w} := \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix} and \vec{z} := \begin{bmatrix} z_1 \\ \vdots \\ z_n \end{bmatrix} with respect to the standard orthonormal basis e_1, \ldots, e_n on H (here, e_i is 1 at its i-th coordinate and 0 everywhere else; as usual, H^* will now be associated with the dual basis) and where \overline{\vec{z}}^{\operatorname{T}} := \left[\overline{z_1}, \ldots, \overline{z_n}\right] denotes the conjugate transpose of \vec{z}. Let \varphi \in H^* be any linear functional and let \varphi_1, \ldots, \varphi_n \in \Complex be the unique scalars such that
\varphi(w) = \varphi_1 w_1 + \cdots + \varphi_n w_n for all w = \left(w_1, \ldots, w_n\right) \in H,
where it can be shown that \varphi_i = \varphi\left(e_i\right) for all i = 1, \ldots, n. Then the Riesz representation of \varphi is the vector
f_\varphi := \overline{\varphi_1} e_1 + \cdots + \overline{\varphi_n} e_n = \left(\overline{\varphi_1}, \ldots, \overline{\varphi_n}\right).
To see why, identify every vector w = \left(w_1, \ldots, w_n\right) in H with the column matrix \vec{w} := \begin{bmatrix} w_1 \\ \vdots \\ w_n \end{bmatrix}, so that f_\varphi is identified with \vec{f_\varphi} := \begin{bmatrix} \overline{\varphi_1} \\ \vdots \\ \overline{\varphi_n} \end{bmatrix}. As usual, also identify the linear functional \varphi with its transformation matrix, which is the row matrix \vec{\varphi} := \left[\varphi_1, \ldots, \varphi_n\right], so that \vec{f_\varphi} = \overline{\vec{\varphi}}^{\operatorname{T}} and the function \varphi is the assignment \vec{w} \mapsto \vec{\varphi} \vec{w}, where the right hand side is matrix multiplication. Then for all w = \left(w_1, \ldots, w_n\right) \in H,
\varphi(w) = \vec{\varphi} \vec{w} = \overline{\vec{f_\varphi}}^{\operatorname{T}} \vec{w} = \left\langle f_\varphi \mid w \right\rangle,
which shows that f_\varphi satisfies the defining condition of the Riesz representation of \varphi.
The bijective antilinear isometry \Phi : H \to H^* defined in the corollary to the Riesz representation theorem is the assignment that sends z = \left(z_1, \ldots, z_n\right) \in H to the linear functional \Phi(z) on H defined by
w = \left(w_1, \ldots, w_n\right) \mapsto \langle z \mid w \rangle = \overline{z_1} w_1 + \cdots + \overline{z_n} w_n,
where under the identification of vectors in H with column matrices and vectors in H^* with row matrices, \Phi is just the assignment
\vec{z} \mapsto \overline{\vec{z}}^{\operatorname{T}}.
As described in the corollary, \Phi's inverse \Phi^{-1} : H^* \to H is the antilinear isometry \varphi \mapsto f_\varphi, which was just shown above to be:
\varphi \mapsto f_\varphi := \left(\overline{\varphi_1}, \ldots, \overline{\varphi_n}\right),
where in terms of matrices, \Phi^{-1} is the assignment
\vec{\varphi} \mapsto \overline{\vec{\varphi}}^{\operatorname{T}}.
Thus in terms of matrices, each of \Phi : H \to H^* and \Phi^{-1} : H^* \to H is just the operation of conjugate transposition \vec{v} \mapsto \overline{\vec{v}}^{\operatorname{T}} (although between different spaces of matrices: if H is identified with the space of all column (respectively, row) matrices then H^* is identified with the space of all row (respectively, column) matrices).
This example used the standard inner product, which is the map \langle z \mid w \rangle := \overline{\vec{z}}^{\operatorname{T}} \vec{w}, but if a different inner product is used, such as \langle z \mid w \rangle_M := \overline{\vec{z}}^{\operatorname{T}} M \vec{w} where M is any Hermitian positive-definite matrix, or if a different orthonormal basis is used, then the transformation matrices, and thus also the above formulas, will be different.
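For the non-standard inner product, a short sketch: assuming \langle z \mid w \rangle_M := \overline{\vec{z}}^{\operatorname{T}} M \vec{w} with a hypothetical 2×2 Hermitian positive-definite M, the Riesz vector solves the linear system M \vec{f} = \overline{\vec{\varphi}}^{\operatorname{T}} (this system, and all names below, are illustrative assumptions, not formulas from the article):

```python
# Sketch for ⟨z|w⟩_M = conj(z)^T M w on C^2 with a hypothetical Hermitian
# positive-definite M; the Riesz vector solves M f = conj(φ)^T.
def mat_vec(M, v):
    return [sum(Mij * vj for Mij, vj in zip(row, v)) for row in M]

def braket_M(M, z, w):
    Mw = mat_vec(M, w)
    return sum(zi.conjugate() * wi for zi, wi in zip(z, Mw))

M = [[2, 1j], [-1j, 3]]                    # Hermitian positive-definite
phi_coeffs = [1 + 1j, 2 - 1j]              # φ(w) = φ_1 w_1 + φ_2 w_2
def phi(w): return sum(p * wi for p, wi in zip(phi_coeffs, w))

# Solve M f = conj(φ)^T by the explicit 2×2 inverse: f = M^{-1} conj(φ)^T.
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
b = [p.conjugate() for p in phi_coeffs]
f = [(M[1][1] * b[0] - M[0][1] * b[1]) / det,
     (M[0][0] * b[1] - M[1][0] * b[0]) / det]

w = [3 - 2j, 1 + 4j]
assert abs(phi(w) - braket_M(M, f, w)) < 1e-9   # φ = ⟨f_φ | ·⟩_M
```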
Relationship with the associated real Hilbert space
See also: Complexification.
Assume that H is a complex Hilbert space with inner product \langle \cdot \mid \cdot \rangle. When the Hilbert space H is reinterpreted as a real Hilbert space then it will be denoted by H_\R, where the (real) inner product on H_\R is the real part of H's inner product; that is:
\langle x, y \rangle_\R := \operatorname{re} \langle x, y \rangle.
The norm on H_\R induced by \langle \cdot, \cdot \rangle_\R is equal to the original norm on H, and the continuous dual space of H_\R is the set of all real-valued bounded \R-linear functionals on H_\R (see the article about the polarization identity for additional details about this relationship). Let \psi_\R := \operatorname{re} \psi and \psi_i := \operatorname{im} \psi denote the real and imaginary parts of a linear functional \psi, so that \psi = \operatorname{re} \psi + i \operatorname{im} \psi = \psi_\R + i \psi_i. The formula expressing a linear functional in terms of its real part is
\psi(h) = \psi_\R(h) - i \psi_\R(i h) for all h \in H,
where \psi_i(h) = -\psi_\R(i h) for all h \in H. It follows that \ker \psi_\R = \psi^{-1}(i \R), and that \psi = 0 if and only if \psi_\R = 0. It can also be shown that \|\psi\| = \left\|\psi_\R\right\| = \left\|\psi_i\right\|, where \left\|\psi_\R\right\| := \sup_{\|h\| \le 1} \left|\psi_\R(h)\right| and \left\|\psi_i\right\| := \sup_{\|h\| \le 1} \left|\psi_i(h)\right| are the usual operator norms. In particular, a linear functional \psi is bounded if and only if its real part \psi_\R is bounded.
Representing a functional and its real part
The Riesz representation of a continuous linear functional \varphi on a complex Hilbert space is equal to the Riesz representation of its real part \operatorname{re} \varphi on its associated real Hilbert space.
Explicitly, let \varphi \in H^* and, as above, let f_\varphi \in H be the Riesz representation of \varphi obtained in (H, \langle \cdot, \cdot \rangle), so it is the unique vector that satisfies \varphi(x) = \left\langle f_\varphi \mid x \right\rangle for all x \in H. The real part of \varphi is a continuous real linear functional on H_\R, and so the Riesz representation theorem may be applied to \varphi_\R := \operatorname{re} \varphi and the associated real Hilbert space \left(H_\R, \langle \cdot, \cdot \rangle_\R\right) to produce its Riesz representation, which will be denoted by f_{\varphi_\R}. That is, f_{\varphi_\R} is the unique vector in H_\R that satisfies \varphi_\R(x) = \left\langle f_{\varphi_\R} \mid x \right\rangle_\R for all x \in H. The conclusion is f_{\varphi_\R} = f_\varphi. This follows from the main theorem because \ker \varphi_\R = \varphi^{-1}(i\R), and if x \in H then
\left\langle f_\varphi \mid x \right\rangle_\R = \operatorname{re} \left\langle f_\varphi \mid x \right\rangle = \operatorname{re} \varphi(x) = \varphi_\R(x),
and consequently, if m \in \ker \varphi_\R then \left\langle f_\varphi \mid m \right\rangle_\R = 0, which shows that f_\varphi \in \left(\ker \varphi_\R\right)^{\bot_\R}. Moreover, \varphi(f_\varphi) = \|\varphi\|^2 being a real number implies that \varphi_\R\left(f_\varphi\right) = \operatorname{re} \varphi\left(f_\varphi\right) = \|\varphi\|^2. In other words, in the theorem and constructions above, if H is replaced with its real Hilbert space counterpart H_\R and if \varphi is replaced with \operatorname{re} \varphi then f_\varphi = f_{\operatorname{re} \varphi}. This means that the vector f_{\varphi_\R} obtained by using \left(H_\R, \langle \cdot, \cdot \rangle_\R\right) and the real linear functional \operatorname{re} \varphi is equal to the vector obtained by using the original complex Hilbert space \left(H, \langle \cdot, \cdot \rangle\right) and the original complex linear functional \varphi (with identical norm values as well).
Furthermore, if \varphi \ne 0 then f_\varphi is perpendicular to \ker \varphi_\R with respect to \langle \cdot, \cdot \rangle_\R, where the kernel of \varphi is a proper subspace of the kernel of its real part \varphi_\R. Assume now that \varphi \ne 0. Then f_\varphi \not\in \ker \varphi_\R because \varphi_\R\left(f_\varphi\right) = \varphi\left(f_\varphi\right) = \|\varphi\|^2 \ne 0, and \ker \varphi is a proper subset of \ker \varphi_\R. The vector subspace \ker \varphi has real codimension 1 in \ker \varphi_\R, while \ker \varphi_\R has real codimension 1 in H_\R, and
\left\langle f_\varphi, \ker \varphi_\R \right\rangle_\R = 0.
That is, f_\varphi is perpendicular to \ker \varphi_\R with respect to \langle \cdot, \cdot \rangle_\R.
Canonical injections into the dual and anti-dual
Induced linear map into anti-dual
The map defined by placing g into the linear coordinate of the inner product and letting the variable h \in H vary over the antilinear coordinate results in an antilinear functional:
\langle g, \cdot \rangle = \langle \cdot \mid g \rangle : H \to \F defined by h \mapsto \langle g, h \rangle = \langle h \mid g \rangle.
This map is an element of \overline{H}^*, which is the continuous anti-dual space of H. The canonical map from H into its anti-dual \overline{H}^* is the linear operator
\operatorname{In}_H^{\overline{H}^*} : H \to \overline{H}^*, \quad g \mapsto \langle g, \cdot \rangle = \langle \cdot \mid g \rangle,
which is also an injective isometry. The Fundamental theorem of Hilbert spaces, which is related to the Riesz representation theorem, states that this map is surjective (and thus bijective). Consequently, every antilinear functional on H can be written (uniquely) in this form.
If \operatorname{Cong} : H^* \to \overline{H}^* is the canonical antilinear bijective isometry f \mapsto \overline{f} that was defined above, then the following equality holds:
\operatorname{Cong} \circ \Phi = \operatorname{In}_H^{\overline{H}^*},
where \Phi : H \to H^*, g \mapsto \langle g \mid \cdot \rangle, is the bijection from the Riesz representation theorem.
Extending the bra–ket notation to bras and kets
See main article: Bra–ket notation.
Let \left(H, \langle \cdot, \cdot \rangle_H\right) be a Hilbert space and, as before, let \langle y \mid x \rangle_H := \langle x, y \rangle_H. Let
\Phi : H \to H^*, \quad \Phi(g) := \langle g \mid \cdot \rangle,
which is a bijective antilinear isometry that satisfies \Phi(g)(h) = \langle g \mid h \rangle for all g, h \in H.
Bras
Given a vector h \in H, let \langle h \mid denote the continuous linear functional \Phi(h); that is,
\langle h \mid ~:=~ \Phi(h),
so that this functional \langle h \mid is defined by g \mapsto \langle h \mid g \rangle_H. This map was denoted by \langle h \mid \cdot \rangle earlier in this article.
The assignment h \mapsto \langle h \mid is just the isometric antilinear isomorphism \Phi : H \to H^*, which is why ~\langle c g + h \mid ~=~ \overline{c} \langle g \mid ~+~ \langle h \mid~ holds for all g, h \in H and all scalars c. The result of plugging some given g \in H into the functional \langle h \mid is the scalar \langle h \mid g \rangle_H = \langle g, h \rangle_H, which may be denoted by \langle h \mid g \rangle.[3]
Bra of a linear functional
Given a continuous linear functional \psi \in H^*, let \langle \psi \mid denote the vector \Phi^{-1}(\psi) \in H; that is,
\langle \psi \mid ~:=~ \Phi^{-1}(\psi).
The assignment \psi \mapsto \langle \psi \mid is just the isometric antilinear isomorphism \Phi^{-1} : H^* \to H, which is why ~\langle c \psi + \phi \mid ~=~ \overline{c} \langle \psi \mid ~+~ \langle \phi \mid~ holds for all \phi, \psi \in H^* and all scalars c. The defining condition of the vector \langle \psi \mid \in H is the technically correct but unsightly equality
\left\langle \, \langle \psi \mid \, \mid g \right\rangle_H ~=~ \psi(g) \quad \text{ for all } g \in H,
which is why the notation \left\langle \psi \mid g \right\rangle is used in place of \left\langle \, \langle \psi \mid \, \mid g \right\rangle_H = \left\langle g, \langle \psi \mid \right\rangle_H. With this notation, the defining condition becomes
\left\langle \psi \mid g \right\rangle ~=~ \psi(g) \quad \text{ for all } g \in H.
Kets
For any given vector g \in H, the notation \mid g \rangle is used to denote g; that is,
\mid g \rangle ~:=~ g.
The assignment g \mapsto \mid g \rangle is just the identity map \operatorname{Id}_H : H \to H, which is why ~\mid c g + h \rangle ~=~ c \mid g \rangle ~+~ \mid h \rangle~ holds for all g, h \in H and all scalars c.
The notation \langle h \mid g \rangle and \langle \psi \mid g \rangle is used in place of \left\langle h \mid \, \mid g \rangle \right\rangle_H ~=~ \left\langle \mid g \rangle, h \right\rangle_H and \left\langle \psi \mid \, \mid g \rangle \right\rangle_H ~=~ \left\langle g, \langle \psi \mid \right\rangle_H, respectively. As expected, ~\langle \psi \mid g \rangle = \psi(g)~ and ~\langle h \mid g \rangle~ really is just the scalar ~\langle h \mid g \rangle_H ~=~ \langle g, h \rangle_H.
Adjoints and transposes
Let A : H \to Z be a continuous linear operator between Hilbert spaces \left(H, \langle \cdot, \cdot \rangle_H\right) and \left(Z, \langle \cdot, \cdot \rangle_Z\right). As before, let \langle y \mid x \rangle_H := \langle x, y \rangle_H and \langle y \mid x \rangle_Z := \langle x, y \rangle_Z. Denote by
\Phi_H : H \to H^* \quad \text{and} \quad \Phi_Z : Z \to Z^*
the usual bijective antilinear isometries that satisfy:
\Phi_H(g) = \langle g \mid \cdot \rangle_H \text{ for all } g \in H \quad \text{ and } \quad \Phi_Z(y) = \langle y \mid \cdot \rangle_Z \text{ for all } y \in Z.
Definition of the adjoint
See main article: Hermitian adjoint and Conjugate transpose.
For every z \in Z, the scalar-valued map \langle z \mid A(\cdot) \rangle_Z on H defined by
h \mapsto \langle z \mid A h \rangle_Z
is a continuous linear functional on H, and so by the Riesz representation theorem, there exists a unique vector in H, denoted by A^* z, such that \langle z \mid A(\cdot) \rangle_Z = \left\langle A^* z \mid \cdot \right\rangle_H, or equivalently, such that
\langle z \mid A h \rangle_Z = \left\langle A^* z \mid h \right\rangle_H \quad \text{ for all } h \in H.
The assignment z \mapsto A^* z thus induces a function A^* : Z \to H called the adjoint of A : H \to Z, whose defining condition is
\langle z \mid A h \rangle_Z = \left\langle A^* z \mid h \right\rangle_H \quad \text{ for all } h \in H \text{ and all } z \in Z.
The adjoint A^* : Z \to H is necessarily a continuous (equivalently, a bounded) linear operator.
If H is finite dimensional with the standard inner product and if M is the transformation matrix of A with respect to the standard orthonormal basis, then M's conjugate transpose \overline{M^{\operatorname{T}}} is the transformation matrix of the adjoint A^*.
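The defining condition of the adjoint can be checked numerically. A minimal sketch, assuming H = Z = \Complex^2 with the standard inner product and a hypothetical matrix A (all names below are illustrative assumptions):

```python
# Sketch in C^2: the matrix of A* is the conjugate transpose of the matrix of A.
def braket(z, w):
    return sum(zi.conjugate() * wi for zi, wi in zip(z, w))

def mat_vec(M, v):
    return [sum(Mij * vj for Mij, vj in zip(row, v)) for row in M]

A = [[1 + 1j, 2], [3j, 4 - 1j]]
A_star = [[A[j][i].conjugate() for j in range(2)] for i in range(2)]  # conj transpose

z, h = [1 - 1j, 2j], [3, 1 + 2j]
# defining condition of the adjoint: ⟨z | A h⟩ = ⟨A* z | h⟩
assert abs(braket(z, mat_vec(A, h)) - braket(mat_vec(A_star, z), h)) < 1e-9
```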
Adjoints are transposes
See main article: Transpose of a linear map.
See also: Transpose.
It is also possible to define the transpose or algebraic adjoint of A : H \to Z, which is the map {}^t A : Z^* \to H^* defined by sending a continuous linear functional \psi \in Z^* to
{}^t A(\psi) := \psi \circ A,
where the composition \psi \circ A is always a continuous linear functional on H, and it satisfies \|A\| = \left\|{}^t A\right\| (this is true more generally, when H and Z are merely normed spaces). So for example, if z \in Z then {}^t A sends the continuous linear functional \langle z \mid \cdot \rangle_Z \in Z^* (defined on Z by g \mapsto \langle z \mid g \rangle_Z) to the continuous linear functional \langle z \mid A(\cdot) \rangle_Z \in H^* (defined on H by h \mapsto \langle z \mid A(h) \rangle_Z); using bra-ket notation, this can be written as
{}^t A \langle z \mid ~=~ \langle z \mid A,
where the juxtaposition of \langle z \mid with A on the right hand side denotes function composition: H \xrightarrow{A} Z \xrightarrow{\langle z \mid} \Complex.
The adjoint A^* : Z \to H is actually just the transpose {}^t A : Z^* \to H^* when the Riesz representation theorem is used to identify Z with Z^* and H with H^*. Explicitly, the relationship between the adjoint and transpose is:
{}^t A ~\circ~ \Phi_Z ~=~ \Phi_H ~\circ~ A^*,
which can be rewritten as:
A^* = \Phi_H^{-1} \circ {}^t A \circ \Phi_Z \quad \text{ and } \quad {}^t A = \Phi_H \circ A^* \circ \Phi_Z^{-1}.
Alternatively, the value of the left and right hand sides of this equation at any given z \in Z can be rewritten in terms of the inner products as:
\left({}^t A \circ \Phi_Z\right) z = \langle z \mid A(\cdot) \rangle_Z \quad \text{ and } \quad \left(\Phi_H \circ A^*\right) z = \left\langle A^* z \mid \cdot \right\rangle_H,
so that {}^t A \circ \Phi_Z = \Phi_H \circ A^* holds if and only if \langle z \mid A(\cdot) \rangle_Z = \left\langle A^* z \mid \cdot \right\rangle_H holds for all z \in Z; but the equality on the right holds by definition of A^*. The defining condition of A^* can also be written
\langle z \mid A ~=~ \left\langle A^* z \right\mid
if bra-ket notation is used.
Descriptions of self-adjoint, normal, and unitary operators
Assume H = Z and let \Phi := \Phi_H = \Phi_Z. Let A : H \to H be a continuous (that is, bounded) linear operator.
Whether or not A : H \to H is self-adjoint, normal, or unitary depends entirely on whether or not A satisfies certain defining conditions related to its adjoint, which was shown above to essentially be just the transpose {}^t A. Because the transpose of A is a map between continuous linear functionals, these defining conditions can consequently be re-expressed entirely in terms of linear functionals, as the remainder of this subsection will now describe in detail. The linear functionals that are involved are the simplest possible continuous linear functionals on H that can be defined entirely in terms of A, the inner product \langle \cdot \mid \cdot \rangle on H, and some given vector h \in H. Specifically, these are \langle A h \mid \cdot \rangle and \langle h \mid A(\cdot) \rangle where h \in H.
Self-adjoint operators
See also: Self-adjoint operator, Hermitian matrix and Symmetric matrix.
A continuous linear operator A : H \to H is called self-adjoint if it is equal to its own adjoint; that is, if A = A^*. Using the relationship between the adjoint and transpose, this happens if and only if:
\Phi \circ A = {}^t A \circ \Phi,
where this equality can be rewritten in the following two equivalent forms:
A = \Phi^{-1} \circ {}^t A \circ \Phi \quad \text{ or } \quad {}^t A = \Phi \circ A \circ \Phi^{-1}.
Unraveling notation and definitions produces the following characterization of self-adjoint operators in terms of the aforementioned continuous linear functionals: A is self-adjoint if and only if for all z \in H, the linear functional \langle z \mid A(\cdot) \rangle is equal to the linear functional \langle A z \mid \cdot \rangle; that is, if and only if
\langle A z \mid \cdot \rangle = \langle z \mid A(\cdot) \rangle \quad \text{ for all } z \in H,
where if bra-ket notation is used, this is
\langle A z \mid ~=~ \langle z \mid A \quad \text{ for all } z \in H.
Normal operators
See also: Normal operator and Normal matrix.
A continuous linear operator A : H \to H is called normal if A A^* = A^* A, which happens if and only if for all z, h \in H,
\left\langle A A^* z \mid h \right\rangle = \left\langle A^* A z \mid h \right\rangle.
Using the relationship between the adjoint and transpose, and unraveling notation and definitions, produces the following characterization of normal operators in terms of inner products of continuous linear functionals: A is a normal operator if and only if
\left\langle \langle A h \mid \cdot \rangle, \langle A z \mid \cdot \rangle \right\rangle_{H^*} ~=~ \left\langle \langle h \mid A(\cdot) \rangle, \langle z \mid A(\cdot) \rangle \right\rangle_{H^*} \quad \text{ for all } z, h \in H,
where the left hand side is also equal to \overline{\langle A h \mid A z \rangle}_H = \langle A z \mid A h \rangle_H. The left hand side of this characterization involves only linear functionals of the form \langle A h \mid \cdot \rangle, while the right hand side involves only linear functionals of the form \langle h \mid A(\cdot) \rangle (defined as above). So in plain English, this characterization says that an operator is normal when the inner product of any two linear functionals of the first form is equal to the inner product of the two corresponding linear functionals of the second form (using the same vectors z, h \in H for both forms). In other words, if it happens to be the case (and when A is injective or self-adjoint, it is) that the assignment of linear functionals \langle A h \mid \cdot \rangle ~\mapsto~ \langle h \mid A(\cdot) \rangle is well-defined (or alternatively, if \langle h \mid A(\cdot) \rangle ~\mapsto~ \langle A h \mid \cdot \rangle is well-defined), where h ranges over H, then A is a normal operator if and only if this assignment preserves the inner product on H^*.
The fact that every self-adjoint bounded linear operator is normal follows readily by direct substitution of A^* = A into either side of A A^* = A^* A. This same fact also follows immediately from the direct substitution of the equalities \langle A h \mid \cdot \rangle = \langle h \mid A(\cdot) \rangle into either side of the characterization above.
Alternatively, for a complex Hilbert space, the continuous linear operator A is a normal operator if and only if \|A z\| = \left\|A^* z\right\| for every z \in H, which happens if and only if
\langle A z \mid A z \rangle_H = \left\langle A^* z \mid A^* z \right\rangle_H \quad \text{ for every } z \in H.
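The norm criterion \|A z\| = \left\|A^* z\right\| can be illustrated with a normal but not self-adjoint matrix. A minimal sketch, assuming H = \Complex^2 and the 90° rotation matrix as a hypothetical example:

```python
# Sketch: a normal but not self-adjoint 2×2 matrix (a rotation) satisfies
# ‖Az‖ = ‖A*z‖ for every z.
def braket(z, w):
    return sum(zi.conjugate() * wi for zi, wi in zip(z, w))

def mat_vec(M, v):
    return [sum(Mij * vj for Mij, vj in zip(row, v)) for row in M]

A = [[0, -1], [1, 0]]                      # here A* = -A, and A A* = A* A = I
A_star = [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

for z in ([1 + 2j, -3j], [0.5, 1 - 1j]):
    Az, Asz = mat_vec(A, z), mat_vec(A_star, z)
    assert abs(abs(braket(Az, Az)) ** 0.5 - abs(braket(Asz, Asz)) ** 0.5) < 1e-9
```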
Unitary operators
See also: Unitary transformation and Unitary matrix.
An invertible bounded linear operator A : H \to H is said to be unitary if its inverse is its adjoint: A^{-1} = A^*. By using the relationship between the adjoint and transpose, this is seen to be equivalent to
\Phi \circ A^{-1} = {}^t A \circ \Phi.
Unraveling notation and definitions, it follows that A is unitary if and only if
\left\langle A^{-1} z \mid \cdot \right\rangle = \langle z \mid A(\cdot) \rangle \quad \text{ for all } z \in H.
The fact that a bounded invertible linear operator A : H \to H is unitary if and only if A^* A = \operatorname{Id}_H (or equivalently, {}^t A \circ \Phi \circ A = \Phi) produces another (well-known) characterization: an invertible bounded linear map A is unitary if and only if
\langle A z \mid A(\cdot) \rangle = \langle z \mid \cdot \rangle \quad \text{ for all } z \in H.
Because A : H \to H is invertible (and so in particular a bijection), this is also true of the transpose {}^t A : H^* \to H^*. This fact also allows the vector z in the above characterizations to be replaced with A z or A^{-1} z, thereby producing many more equalities. Similarly, \cdot can be replaced with A(\cdot) or A^{-1}(\cdot).
Notes and References
- Showing that there is a non-zero vector p in K^\bot relies on the continuity of \varphi and the Cauchy completeness of H. This is the only place in the proof in which these properties are used.
- Technically, H = K \oplus K^\bot means that the addition map K \times K^\bot \to H defined by (k, p) \mapsto k + p is a surjective linear isomorphism and homeomorphism. See the article on complemented subspaces for more details.
- The usual notation for plugging an element g into a linear map F is F(g) and sometimes F g. Replacing F with \langle h \mid produces \langle h \mid (g) or \langle h \mid g, which is unsightly (despite being consistent with the usual notation used with functions). Consequently, the symbol \rangle is appended to the end, so that the notation \langle h \mid g \rangle is used instead to denote this value.