In this post we prove several important inequalities that often arise in functional analysis.
Cauchy-Schwarz inequality
Let $x$ and $y$ be two vectors in $\mathbb{R}^n$. The Cauchy-Schwarz inequality states that $|\langle x,y \rangle| \leq \lVert x \rVert \lVert y \rVert$. Equality holds if and only if $x$ and $y$ are linearly dependent, that is, if one of them is a scalar multiple of the other.
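Before the proofs, here is a quick numerical sanity check of the statement in plain Python (the helper names `inner` and `norm` are just illustrative):

```python
import math
import random

def inner(x, y):
    """Standard dot product on R^n."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm induced by the dot product."""
    return math.sqrt(inner(x, x))

random.seed(0)
x = [random.uniform(-1, 1) for _ in range(5)]
y = [random.uniform(-1, 1) for _ in range(5)]

# The inequality for generic (almost surely independent) vectors.
lhs = abs(inner(x, y))
rhs = norm(x) * norm(y)
assert lhs <= rhs

# Equality (up to rounding) when x is a scalar multiple of y.
x2 = [3.0 * yi for yi in y]
assert abs(abs(inner(x2, y)) - norm(x2) * norm(y)) < 1e-12
```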
Proof 1: dot product
Note that $w \cdot w=w_1^2+w_2^2+\cdots+w_n^2 \geq 0$ for any $w \in \mathbb{R}^n$. We rescale $x$ and $y$ to the new vectors $\lVert x \rVert y$ and $ \lVert y \rVert x$, which have the same length $\lVert x \rVert \lVert y \rVert$. Then
$$\begin{align} \langle \lVert x \rVert y-\lVert y \rVert x,\lVert x \rVert y-\lVert y \rVert x \rangle &\geq 0\\
\lVert x \rVert^2 \langle y,y \rangle-2\lVert x \rVert \lVert y \rVert \langle x,y \rangle+\lVert y \rVert^2 \langle x,x \rangle & \geq 0\\
2\lVert x \rVert^2 \lVert y \rVert^2-2\lVert x \rVert \lVert y \rVert \langle x,y \rangle & \geq 0\\
\langle x,y \rangle & \leq \lVert x \rVert \lVert y \rVert
\end{align}$$
(The last step divides by $2\lVert x \rVert \lVert y \rVert$, so assume $x,y \neq 0$; if either vector is zero the inequality is trivial.) Similarly, expanding $\langle \lVert x \rVert y+\lVert y \rVert x,\lVert x \rVert y+\lVert y \rVert x \rangle \geq 0$ gives us $-\langle x,y \rangle \leq \lVert x \rVert \lVert y \rVert$. Combining both cases, we have $|\langle x,y \rangle| \leq \lVert x \rVert \lVert y \rVert.\:\Box$
Proof 2: orthogonality
Assuming $y \neq 0$ (otherwise the inequality is trivial), we orthogonally decompose $x$ into a component parallel to $y$ and a component perpendicular to $y$, and use the fact that the perpendicular component has a non-negative square norm.
$$\begin{align}\lVert x-\frac{\langle x,y \rangle}{\langle y,y \rangle}y \rVert^2 & \geq 0\\ \langle x-\frac{\langle x,y \rangle}{\langle y,y \rangle}y,\:x-\frac{\langle x,y \rangle}{\langle y,y \rangle}y \rangle & \geq 0\\ \langle x,x \rangle-2\frac{\langle x,y \rangle^2}{\langle y,y \rangle}+\frac{\langle x,y \rangle^2}{\langle y,y \rangle} & \geq 0\\ \langle x,x \rangle-\frac{\langle x,y \rangle^2}{\langle y,y \rangle} & \geq 0 \quad\Box \end{align}$$
Proof 3: discriminant
$\langle \lambda x-y, \lambda x-y \rangle=\lambda^2\langle x,x \rangle-2\lambda\langle x,y \rangle+\langle y,y \rangle$
This is a quadratic polynomial in $\lambda$ (assume $x \neq 0$; otherwise the inequality is trivial), which is nonnegative for every $\lambda$, hence its discriminant $\Delta \leq 0$.
So $\langle x,y \rangle^2-\langle x,x \rangle \langle y,y \rangle \leq 0 \iff |\langle x,y \rangle|^2 \leq \lVert x \rVert^2 \lVert y \rVert^2.\:\Box$
Proof 4: Lagrange's identity
$|\langle x,y \rangle|=\lVert x \rVert \lVert y \rVert \iff$ $x$ and $y$ are linearly dependent $\iff$ the sequences $(a_i)$ and $(b_i)$ are proportional, where $x=(a_1,a_2,\cdots,a_n)^T$ and $y=(b_1,b_2,\cdots,b_n)^T$.
Define $(a_i)$ and $(b_i)$ to be proportional if $a_ib_j=a_jb_i$ for every pair $i,j$, or equivalently $a_ib_j-a_jb_i=0$ for every pair $i,j$. Expanding the sum of squares over all pairs gives Lagrange's identity:
$$\begin{align}\sum_{i,j} (a_ib_j-a_jb_i)^2&=\sum_{i,j} (a_i^2b_j^2-2a_ia_jb_ib_j+a_j^2b_i^2)\\
&=2\Big(\sum_i a_i^2\Big)\Big(\sum_j b_j^2\Big)-2\Big(\sum_i a_ib_i\Big)^2. \end{align}$$
The left-hand side is a sum of squares, hence nonnegative, which gives $(\sum_i a_ib_i)^2 \leq (\sum_i a_i^2)(\sum_i b_i^2)$; and it vanishes precisely when the sequences are proportional, which is the equality case. $\Box$
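Lagrange's identity is easy to verify numerically for arbitrary (not necessarily positive) sequences; a minimal check:

```python
import random

random.seed(1)
n = 6
a = [random.uniform(-2, 2) for _ in range(n)]
b = [random.uniform(-2, 2) for _ in range(n)]

# Left side: sum over all pairs (i, j) of (a_i b_j - a_j b_i)^2.
lhs = sum((a[i] * b[j] - a[j] * b[i]) ** 2
          for i in range(n) for j in range(n))

# Right side: 2 (sum a_i^2)(sum b_j^2) - 2 (sum a_i b_i)^2.
rhs = 2 * sum(ai * ai for ai in a) * sum(bi * bi for bi in b) \
      - 2 * sum(ai * bi for ai, bi in zip(a, b)) ** 2

assert abs(lhs - rhs) < 1e-9   # the identity holds for arbitrary sequences
assert lhs >= 0                # forcing (sum a_i b_i)^2 <= (sum a_i^2)(sum b_i^2)
```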
Intuition for this inequality:
We restate the inequality as $|\langle x,y \rangle| \leq \lVert x \rVert$ for every unit vector $y$, namely $\lVert y \rVert=1$. Then $\langle x,y \rangle$ is the component of the vector $x$ in the direction $y$. So the inequality can be thought of as saying that the component of a vector $x$ in a single direction is bounded by the magnitude of $x$.
Lemma
Let $0<\lambda<1$. Then $t^\lambda \leq 1-\lambda+\lambda t$ for all $t \geq 0$. This inequality becomes equality only when $t=1$.
Proof
Define $f: [0,\infty) \to \mathbb{R}$ by $f(t)=1-\lambda+\lambda t-t^\lambda$.
Then $f'(t)=\lambda-\lambda t^{\lambda-1}=\lambda(1-\dfrac{1}{t^{1-\lambda}})$.
$f'(t) \begin{cases} >0\;\text{if}\; t>1 \\ <0\;\text{if}\; 0<t<1 \end{cases}$
Thus $f(1)=0=\min f$ and $f(t)\geq 0$ for all $t \geq 0$, with equality iff $t=1$.
$\Rightarrow t^\lambda \leq 1-\lambda+\lambda t \;\; \forall t \geq 0, \quad t^\lambda=1-\lambda+\lambda t \;\; \text{iff} \;\; t=1\:\Box$
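The lemma can be sampled numerically over a grid of $t$ values; a small sketch (the value of `lam` is an arbitrary choice in $(0,1)$):

```python
lam = 0.3  # any value with 0 < lam < 1

# t^lam <= 1 - lam + lam*t on a grid over (0, 4].
for i in range(1, 401):
    t = i / 100.0
    assert t ** lam <= 1 - lam + lam * t + 1e-12

# Equality holds only at t = 1.
assert abs(1.0 ** lam - (1 - lam + lam * 1.0)) < 1e-12
```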
The inequalities that follow all use the notion of conjugate exponents.
Conjugate exponents are positive real numbers $p,q$ such that $\dfrac{1}{p}+\dfrac{1}{q}=1$.
For example, the pair $1,\infty$ is regarded as a limiting pair of conjugate exponents, since $q=\dfrac{p}{p-1} \to \infty$ as $p \to 1^+$.
If $p,q$ are integers, the only pair of conjugate exponents is $2,2$.
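The relation $q=\frac{p}{p-1}$ is worth having as a one-liner; a quick sketch (the helper name `conjugate` is illustrative):

```python
def conjugate(p):
    """Conjugate exponent q with 1/p + 1/q = 1 (requires p > 1)."""
    return p / (p - 1)

assert conjugate(2) == 2                          # the self-conjugate pair
assert abs(1 / 3 + 1 / conjugate(3) - 1) < 1e-12  # 1/p + 1/q = 1
# As p -> 1+, q blows up, matching the limiting pair (1, infinity).
assert conjugate(1.0001) > 10000
```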
Young’s inequality
Let $p,q$ be conjugate exponents with $1<q<\infty$. Then $ab \leq \dfrac{a^p}{p}+\dfrac{b^q}{q}$ for all $a,b \geq 0$.
Equality holds iff $a^p=b^q$.
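A numerical check of the statement and its equality case (the helper name `young_gap` is illustrative; it measures how much the right side exceeds the left):

```python
import random

def young_gap(a, b, p):
    """a^p/p + b^q/q - ab, which Young's inequality says is >= 0."""
    q = p / (p - 1)
    return a ** p / p + b ** q / q - a * b

random.seed(2)
for _ in range(100):
    p = random.uniform(1.1, 5.0)
    a = random.uniform(0.0, 3.0)
    b = random.uniform(0.0, 3.0)
    assert young_gap(a, b, p) >= -1e-12

# Equality iff a^p = b^q: choosing b = a^(p-1) gives b^q = a^p.
p = 3.0
a = 1.7
b = a ** (p - 1)
assert abs(young_gap(a, b, p)) < 1e-9
```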
Proof 1: the above lemma
If $b=0$, there is nothing to prove.
Assume $b>0$. Apply the lemma with $\lambda=\frac{1}{p} \in (0,1)$ and $t=a^pb^{-q}$.
Then we have
$$\begin{align}(a^pb^{-q})^{\frac{1}{p}} &\leq 1-\frac{1}{p}+\frac{1}{p} a^pb^{-q}\\
ab^{\frac{-q}{p}} &\leq \frac{1}{p} a^pb^{-q}+\frac{1}{q} \quad [\frac{1}{p}+\frac{1}{q}=1]\\
ab^{\frac{-q}{p}+q} &\leq \frac{a^p}{p}+\frac{b^q}{q}\\
ab &\leq \frac{a^p}{p}+\frac{b^q}{q} \quad [\frac{q}{p}+1=q \Rightarrow \frac{-q}{p}+q=1],\end{align}$$
with equality when $t=1$, namely $a^pb^{-q}=1$, i.e. $a^p=b^q. \:\Box$
[How do we know we can prove $ab \leq \frac{a^p}{p}+\frac{b^q}{q}$ by proving $t^\lambda \leq 1-\lambda+\lambda t$? We divide Young's inequality by $b^q$, which gives us $\frac{a}{b^{q-1}}\leq \frac{1}{p}\cdot \frac{a^p}{b^q}+\frac{1}{q}$.]
Proof 2i: convex function
Recall that a function $f:I \to \mathbb{R}$ is convex if for all $x,y \in I$ and $\lambda \in [0,1]$, $f((1-\lambda)x+\lambda y)\leq (1-\lambda)f(x)+\lambda f(y)$.
We can make use of the convexity of the exponential function.
Assume $a,b>0$ (otherwise the inequality is trivial) and substitute $\ln a^p$ for $x$ and $\ln b^q$ for $y$.
Since $\frac{1}{p}+\frac{1}{q}=1$, we have $\large e^{\frac{1}{p}\ln a^p+\frac{1}{q}\ln b^q} \leq \frac{1}{p}e^{\ln a^p}+\frac{1}{q}e^{\ln b^q}$.
Thus $ab \leq \frac{a^p}{p}+\frac{b^q}{q}.\:\Box$
Proof 2ii: concave function
Alternatively, we can use the concavity of the logarithmic function (again assuming $a,b>0$).
$\log(\frac{a^p}{p}+\frac{b^q}{q}) \geq \frac{1}{p}\log a^p+\frac{1}{q}\log b^q=\log(ab)\:\Box$
Proof 3: integration
Consider the increasing function $f(x)=x^{q-1}$ for $q>1$. Its inverse is $g(y)=y^{p-1}$, since $(p-1)(q-1)=1$ when $\frac{1}{p}+\frac{1}{q}=1$. The rectangle with sides $a$ and $b$ is covered by the region under the graph of $f$ over $[0,b]$ together with the region under the graph of $g$ over $[0,a]$.
So $ab \leq \int_0^b x^{q-1}\:dx+\int_0^a y^{p-1}\:dy=\frac{b^q}{q}+\frac{a^p}{p}.\:\Box$
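The two areas can be checked numerically; a small sketch using a midpoint-rule quadrature (the helper name `midpoint_integral` is illustrative):

```python
def midpoint_integral(f, lo, hi, n=100000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

p, q = 3.0, 1.5          # a conjugate pair: 1/3 + 1/1.5 = 1
a, b = 2.0, 1.3

# Area under f(x) = x^(q-1) over [0, b] plus area under g(y) = y^(p-1) over [0, a].
area = midpoint_integral(lambda x: x ** (q - 1), 0.0, b) \
     + midpoint_integral(lambda y: y ** (p - 1), 0.0, a)

# The combined area equals b^q/q + a^p/p and dominates the rectangle ab.
assert abs(area - (b ** q / q + a ** p / p)) < 1e-4
assert area >= a * b
```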
Proof 4: differentiation
Let $f(a)=\frac{a^p}{p}+\frac{b^q}{q}-ab$. Then $f'(a)=a^{p-1}-b$.
Setting $f'(a)=0$, we have $a=b^\frac{1}{p-1} \Rightarrow a^p=b^\frac{p}{p-1}=b^q$.
Since $f''(a)=(p-1)a^{p-2} \geq 0$, $f$ is convex, so this critical point is a global minimum.
$\min f=f(b^\frac{q}{p})=0 \Rightarrow 0 \leq \frac{a^p}{p}+\frac{b^q}{q}-ab \Rightarrow ab \leq \frac{a^p}{p}+\frac{b^q}{q}.\:\Box$
Remark: Young's inequality is one of the most important inequalities. It says that a product can be estimated by a sum. The [[AM-GM inequality]] is its special case. The Cauchy-Schwarz inequality can be proved from the AM-GM inequality. [to be explored in another post]
Hölder's inequality
Let $p,q$ be conjugate exponents with $1<q<\infty$. For any integer $n\geq 1$, assume that $a_1,\cdots,a_n$ and $b_1,\cdots,b_n$ are nonnegative. Then $\sum\limits_{k=1}^n a_kb_k \leq (\sum\limits_{k=1}^n a_k^p)^\frac{1}{p} (\sum\limits_{k=1}^n b_k^q)^\frac{1}{q}$.
Proof from Young's inequality:
Let $A=(\sum\limits_{k=1}^n a_k^p)^\frac{1}{p}$ and $B=(\sum\limits_{k=1}^n b_k^q)^\frac{1}{q}$, and assume $A,B>0$ (if $A=0$ or $B=0$, the left-hand side vanishes and there is nothing to prove).
Then $\dfrac{\sum\limits_{k=1}^n a_k^p}{A^p}=1$ and $\dfrac{\sum\limits_{k=1}^n b_k^q}{B^q}=1$.
Put $a=\dfrac{a_k}{A}, b=\dfrac{b_k}{B}$ in Young's inequality.
$\begin{align}\frac{a_kb_k}{AB} &\leq \frac{a_k^p}{pA^p}+\frac{b_k^q}{qB^q} \quad \forall k=1,2,\cdots,n\\
\sum\limits_{k=1}^n \frac{a_kb_k}{AB} &\leq \frac{1}{p}+\frac{1}{q}=1\\
\sum\limits_{k=1}^n a_kb_k &\leq (\sum\limits_{k=1}^n a_k^p)^\frac{1}{p} (\sum\limits_{k=1}^n b_k^q)^\frac{1}{q}\:\Box \end{align}$
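Hölder's inequality is also easy to sanity-check numerically; a minimal sketch (the helper names `holder_lhs` and `holder_rhs` are illustrative):

```python
import random

def holder_lhs(a, b):
    """Left side: sum of a_k * b_k."""
    return sum(ai * bi for ai, bi in zip(a, b))

def holder_rhs(a, b, p):
    """Right side: product of the l^p and l^q norms, q conjugate to p."""
    q = p / (p - 1)
    return sum(ai ** p for ai in a) ** (1 / p) * \
           sum(bi ** q for bi in b) ** (1 / q)

random.seed(3)
p = 2.5
a = [random.uniform(0, 1) for _ in range(8)]
b = [random.uniform(0, 1) for _ in range(8)]
assert holder_lhs(a, b) <= holder_rhs(a, b, p) + 1e-12

# The case p = q = 2 recovers the Cauchy-Schwarz inequality.
assert holder_lhs(a, b) <= holder_rhs(a, b, 2.0) + 1e-12
```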
We will delve into the applications of these inequalities in another post.
References:
proof 1 of Young’s inequality, Hölder's inequality
proof 4 of Cauchy-Schwarz inequality
Intuition of Cauchy-Schwarz inequality
Related:
Equality in Young's inequality for convolution
Young's inequality for three variables
Hölder's inequality with three functions