
Sunday, 12 July 2015

Famous inequalities

In this post we prove several important inequalities that often arise in functional analysis.

Cauchy-Schwarz inequality
Let x and y be two vectors in \mathbb{R}^n. The Cauchy-Schwarz inequality states that |\langle x,y \rangle| \leq \lVert x \rVert \lVert y \rVert. Equality holds if and only if x and y are linearly dependent, that is, if x=\lambda y or y=\lambda x for some scalar \lambda.
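Before the proofs, a quick numerical sanity check (an illustrative Python sketch; the vectors are arbitrary choices, not part of the argument):

```python
import math

def inner(x, y):
    # Euclidean inner product <x, y>
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    # Euclidean norm ||x|| = sqrt(<x, x>)
    return math.sqrt(inner(x, x))

x = [1.0, 2.0, 3.0]
y = [4.0, -5.0, 6.0]
assert abs(inner(x, y)) <= norm(x) * norm(y)

# Equality when the vectors are linearly dependent: z = 2x
z = [2.0, 4.0, 6.0]
assert math.isclose(abs(inner(x, z)), norm(x) * norm(z))
```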

Proof 1: dot product
Note that w \cdot w=w_1^2+w_2^2+\cdots+w_n^2 \geq 0 for any w \in \mathbb{R}^n. We rescale x and y to two new vectors of the same length, namely \lVert x \rVert y and \lVert y \rVert x. Then

\begin{align} \langle \lVert x \rVert y-\lVert y \rVert x,\lVert x \rVert y-\lVert y \rVert x \rangle &\geq 0\\ \lVert x \rVert^2 \langle y,y \rangle-2\lVert x \rVert \lVert y \rVert \langle x,y \rangle+\lVert y \rVert^2 \langle x,x \rangle & \geq 0\\ 2\lVert x \rVert^2 \lVert y \rVert^2-2\lVert x \rVert \lVert y \rVert \langle x,y \rangle & \geq 0\\ \langle x,y \rangle & \leq \lVert x \rVert \lVert y \rVert, \end{align}

where the last step divides by 2\lVert x \rVert \lVert y \rVert, which we may assume is positive (if x=0 or y=0 the inequality is trivial).


Similarly, setting \langle \lVert x \rVert y+\lVert y \rVert x,\lVert x \rVert y+\lVert y \rVert x \rangle \geq 0 gives us -\langle x,y \rangle \leq \lVert x \rVert \lVert y \rVert. Combining both cases, we have |\langle x,y \rangle| \leq \lVert x \rVert \lVert y \rVert.\:\Box

Proof 2: orthogonality
We orthogonally decompose x into a component parallel to y and a component perpendicular to y (assuming y \neq 0; otherwise the inequality is trivial), and use the fact that the perpendicular component has a nonnegative squared norm.

\begin{align}\lVert x-\frac{\langle x,y \rangle}{\langle y,y \rangle}y \rVert^2 & \geq 0\\ \langle x-\frac{\langle x,y \rangle}{\langle y,y \rangle}y,\:x-\frac{\langle x,y \rangle}{\langle y,y \rangle}y \rangle & \geq 0\\ \langle x,x \rangle-2\frac{\langle x,y \rangle^2}{\langle y,y \rangle}+\frac{\langle x,y \rangle^2}{\langle y,y \rangle} & \geq 0\\ \langle x,x \rangle-\frac{\langle x,y \rangle^2}{\langle y,y \rangle} & \geq 0 \quad\Box \end{align}


Proof 3: discriminant
\langle \lambda x-y, \lambda x-y \rangle=\lambda^2\langle x,x \rangle-2\lambda\langle x,y \rangle+\langle y,y \rangle
This is a quadratic polynomial in \lambda (assuming x \neq 0; if x=0 the inequality is trivial), which is nonnegative for every \lambda, hence its discriminant satisfies \Delta \leq 0.
So \langle x,y \rangle^2-\langle x,x \rangle \langle y,y \rangle \leq 0 \iff |\langle x,y \rangle|^2 \leq \lVert x \rVert^2 \lVert y \rVert^2.\:\Box

Proof 4: Lagrange's identity
|\langle x,y \rangle|=\lVert x \rVert \lVert y \rVert \iff x and y are linearly dependent \iff the sequences (a_i) and (b_i) are proportional, where x=(a_1,a_2,\cdots,a_n)^T and y=(b_1,b_2,\cdots,b_n)^T.

Define (a_i) and (b_i) to be proportional if a_ib_j=a_jb_i for every pair i,j, or equivalently a_ib_j-a_jb_i=0 for every pair i,j. Thus

\begin{align}\sum_{i,j} (a_ib_j-a_jb_i)^2&=0\\ \sum_{i,j} (a_i^2b_j^2-2a_ia_jb_ib_j+a_j^2b_i^2)&=0\\ 2(\sum_i a_i^2)(\sum_j b_j^2)-2(\sum_i a_ib_i)^2&=0.\:\Box \end{align}


The inequality, together with the equality case, follows immediately: the left-hand side is a sum of squares, so it is always nonnegative, and it vanishes exactly when the two sequences are proportional.
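Lagrange's identity itself can be verified numerically; the sketch below (illustrative Python, with arbitrary sequences) checks that \sum_{i,j}(a_ib_j-a_jb_i)^2 equals 2(\sum_i a_i^2)(\sum_j b_j^2)-2(\sum_i a_ib_i)^2:

```python
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, -6.0]
n = len(a)

# Sum over all ordered pairs (i, j) of the squared 2x2 "minors"
lhs = sum((a[i] * b[j] - a[j] * b[i]) ** 2
          for i in range(n) for j in range(n))

# Lagrange's identity: lhs = 2*(sum a_i^2)(sum b_j^2) - 2*(sum a_i b_i)^2
rhs = (2 * sum(x * x for x in a) * sum(y * y for y in b)
       - 2 * sum(x * y for x, y in zip(a, b)) ** 2)

assert abs(lhs - rhs) < 1e-9
```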

Intuition for this inequality:
We restate the inequality as |\langle x,y \rangle| \leq \lVert x \rVert for every normalised vector y, that is, every y with \lVert y \rVert=1. Then \langle x,y \rangle is the component of the vector x in the direction y. So the inequality says that the component of a vector x in any single direction is bounded by the magnitude of x.



Lemma
Let 0<\lambda<1. Then t^\lambda \leq 1-\lambda+\lambda t for all t \geq 0, with equality if and only if t=1.

Proof
Define f: [0,\infty) \to \mathbb{R} by f(t)=1-\lambda+\lambda t-t^\lambda.
Then f'(t)=\lambda-\lambda t^{\lambda-1}=\lambda(1-\dfrac{1}{t^{1-\lambda}}).
f'(t) \begin{cases} >0\;\text{if}\; t>1 \\ <0\;\text{if}\; 0<t<1 \end{cases}
Thus f(1)=0=\min f and f(t)\geq 0 for all t \geq 0, with equality iff t=1.
\Rightarrow t^\lambda \leq 1-\lambda+\lambda t \;\; \forall t \geq 0, \quad t^\lambda=1-\lambda+\lambda t \;\; \text{iff} \;\; t=1\:\Box
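The lemma is easy to spot-check on a grid (illustrative Python; the choice \lambda=0.3 and the grid are arbitrary):

```python
lam = 0.3  # any value with 0 < lam < 1

# t^lam <= 1 - lam + lam*t on a grid of t in (0, 5]
for k in range(1, 501):
    t = k / 100.0
    assert t ** lam <= 1 - lam + lam * t + 1e-12

# Equality exactly at t = 1: both sides equal 1
assert abs(1.0 ** lam - (1 - lam + lam * 1.0)) < 1e-12
```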

The inequalities that follow all use the notion of conjugate exponents.

Conjugate exponents are positive real numbers p,q such that \dfrac{1}{p}+\dfrac{1}{q}=1.
For example, the pair 1,\infty is regarded by convention as a pair of conjugate exponents, since q \to \infty as p \to 1.
If p,q are integers, the only pair of conjugate exponents is 2,2.
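Given p>1, solving \frac{1}{p}+\frac{1}{q}=1 gives q=\frac{p}{p-1}. A small helper (hypothetical function name, for illustration only):

```python
def conjugate(p):
    """Return the conjugate exponent q with 1/p + 1/q = 1, for p > 1."""
    if p <= 1:
        raise ValueError("p must be greater than 1")
    return p / (p - 1)

assert conjugate(2) == 2.0                       # the only integer pair: (2, 2)
assert abs(1 / 3 + 1 / conjugate(3) - 1) < 1e-12
```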

Young’s inequality
Let p,q be conjugate exponents with 1<q<\infty. Then ab \leq \dfrac{a^p}{p}+\dfrac{b^q}{q} for all a,b \geq 0.
Equality holds iff a^p=b^q.

Proof 1: the above lemma
If b=0, there is nothing to prove.
Assume b>0. Using the lemma, substitute \frac{1}{p} for \lambda (note 0<\frac{1}{p}<1) and a^pb^{-q} for t.
Then we have
\begin{align}(a^pb^{-q})^{\frac{1}{p}} &\leq 1-\frac{1}{p}+\frac{1}{p} a^pb^{-q}\\ ab^{\frac{-q}{p}} &\leq \frac{1}{p} a^pb^{-q}+\frac{1}{q} \quad [\frac{1}{p}+\frac{1}{q}=1]\\ ab^{\frac{-q}{p}+q} &\leq \frac{a^p}{p}+\frac{b^q}{q}\\ ab &\leq \frac{a^p}{p}+\frac{b^q}{q} \quad [\frac{q}{p}+1=q \Rightarrow \frac{-q}{p}+q=1],\end{align}

with equality exactly when t=1, that is, a^pb^{-q}=1, i.e. a^p=b^q. \:\Box

[How do we know we can prove ab \leq \frac{a^p}{p}+\frac{b^q}{q} by proving t^\lambda \leq 1-\lambda+\lambda t? We divide Young's inequality by b^q, which gives us \frac{a}{b^{q-1}}\leq \frac{1}{p}\cdot \frac{a^p}{b^q}+\frac{1}{q}.]
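Young's inequality and its equality case a^p=b^q can be spot-checked numerically (illustrative Python; the exponents p=3, q=3/2 and sample values are arbitrary):

```python
p, q = 3.0, 1.5  # conjugate exponents: 1/3 + 2/3 = 1

def young_gap(a, b):
    # a^p/p + b^q/q - ab; Young's inequality says this is >= 0
    return a ** p / p + b ** q / q - a * b

assert young_gap(2.0, 5.0) >= 0
assert young_gap(0.3, 0.7) >= 0

# Equality iff a^p = b^q: with a = 2, take b = 4 so that a^3 = b^{3/2} = 8
assert abs(young_gap(2.0, 4.0)) < 1e-9
```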

Proof 2i: convex function
Recall that a function f:I \to \mathbb{R} is convex if for all x,y \in I and \lambda \in [0,1], f((1-\lambda)x+\lambda y)\leq (1-\lambda)f(x)+\lambda f(y).
We can make use of the convexity of the exponential function.
Assume a,b>0 (if ab=0 there is nothing to prove). Substitute \ln a^p for x, \ln b^q for y, and \frac{1}{q} for \lambda.
Since \frac{1}{p}+\frac{1}{q}=1, we have e^{\frac{1}{p}\ln a^p+\frac{1}{q}\ln b^q} \leq \frac{1}{p}e^{\ln a^p}+\frac{1}{q}e^{\ln b^q}.
Thus ab \leq \frac{a^p}{p}+\frac{b^q}{q}.\:\Box

Proof 2ii: concave function
Alternatively, we can use the concavity of the logarithm (again assuming a,b>0):
\log(\frac{a^p}{p}+\frac{b^q}{q}) \geq \frac{1}{p}\log a^p+\frac{1}{q}\log b^q=\log(ab).
Since \log is increasing, ab \leq \frac{a^p}{p}+\frac{b^q}{q} follows.\:\Box

Proof 3: integration
Consider the increasing function f(x)=x^{q-1} for q>1.
Its inverse is g(y)=y^{p-1}, since (p-1)(q-1)=1 when \frac{1}{p}+\frac{1}{q}=1.
Comparing the rectangle of area ab with the areas under f and g (Young's inequality for increasing functions) gives ab \leq \int_0^b x^{q-1}\:dx+\int_0^a y^{p-1}\:dy=\frac{b^q}{q}+\frac{a^p}{p}.\:\Box
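The area argument can be illustrated with a crude Riemann sum (illustrative Python; the exponents and endpoints are arbitrary choices). The two integrals are approximated and their sum is checked against ab:

```python
p, q = 3.0, 1.5  # conjugate exponents
a, b = 2.0, 3.0

def riemann(f, upper, n=10000):
    # left Riemann sum of f on [0, upper]
    h = upper / n
    return sum(f(i * h) for i in range(n)) * h

area_f = riemann(lambda x: x ** (q - 1), b)  # ~ b^q / q
area_g = riemann(lambda y: y ** (p - 1), a)  # ~ a^p / p

assert abs(area_f - b ** q / q) < 1e-3
assert area_f + area_g >= a * b - 1e-3       # ab <= b^q/q + a^p/p
```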

Proof 4: differentiation
Let f(a)=\frac{a^p}{p}+\frac{b^q}{q}-ab. Then f'(a)=a^{p-1}-b.
Setting f'(a)=0, we have a=b^\frac{1}{p-1}=b^\frac{q}{p} \Rightarrow a^p=b^\frac{p}{p-1}=b^q. Since f''(a)=(p-1)a^{p-2}>0, this critical point is a minimum.
\min f=f(b^\frac{q}{p})=0 \Rightarrow 0 \leq \frac{a^p}{p}+\frac{b^q}{q}-ab \Rightarrow ab \leq \frac{a^p}{p}+\frac{b^q}{q}.\:\Box

Remark: Young's inequality is one of the most important inequalities: it estimates a product by a sum. The AM-GM inequality is a special case, and the Cauchy-Schwarz inequality can in turn be proved from the AM-GM inequality. [to be explored in another post]

Hölder's inequality
Let p,q be conjugate exponents with 1<q<\infty. For any integer n\geq 1, assume that a_1,\cdots,a_n and b_1,\cdots,b_n are nonnegative. Then \sum\limits_{k=1}^n a_kb_k \leq (\sum\limits_{k=1}^n a_k^p)^\frac{1}{p} (\sum\limits_{k=1}^n b_k^q)^\frac{1}{q}.

Proof from Young's inequality:
Let A=(\sum\limits_{k=1}^n a_k^p)^\frac{1}{p} and B=(\sum\limits_{k=1}^n b_k^q)^\frac{1}{q}. If A=0 or B=0 the inequality holds trivially, so assume A,B>0.
Then \dfrac{\sum\limits_{k=1}^n a_k^p}{A^p}=1 and \dfrac{\sum\limits_{k=1}^n b_k^q}{B^q}=1.
Put a=\dfrac{a_k}{A}, b=\dfrac{b_k}{B} in Young's inequality.
\begin{align}\frac{a_kb_k}{AB} &\leq \frac{a_k^p}{pA^p}+\frac{b_k^q}{qB^q} \quad \forall k=1,2,\cdots,n\\ \sum\limits_{k=1}^n \frac{a_kb_k}{AB} &\leq \frac{1}{p}+\frac{1}{q}=1\\ \sum\limits_{k=1}^n a_kb_k &\leq (\sum\limits_{k=1}^n a_k^p)^\frac{1}{p} (\sum\limits_{k=1}^n b_k^q)^\frac{1}{q}\:\Box \end{align}
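A numerical spot-check of Hölder's inequality (illustrative Python; the exponents and sequences are arbitrary):

```python
p, q = 3.0, 1.5  # conjugate exponents

a = [1.0, 2.0, 0.5]
b = [0.3, 1.0, 2.0]

lhs = sum(x * y for x, y in zip(a, b))
rhs = (sum(x ** p for x in a) ** (1 / p)
       * sum(y ** q for y in b) ** (1 / q))

assert lhs <= rhs + 1e-12
```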

We will delve into the applications of these inequalities in another post.

References:
proof 1 of Young’s inequality, Hölder's inequality
proof 4 of Cauchy-Schwarz inequality
Intuition of Cauchy-Schwarz inequality

Related:
Equality in Young's inequality for convolution
Young's inequality for three variables
Hölder's inequality with three functions
