
Tuesday, 6 September 2016

Convolution as Weighted Average

We know that the average of f over [x-T,x+T] for any fixed x is given by f_{\text{avg}}(x)=\frac{1}{2T}\int_{x-T}^{x+T} f(t)\,\mathrm{d}t.

Define \phi=\frac{\Pi_T}{2T}, where \begin{align}\Pi_T(t)=\begin{cases}1,&\quad -T\leq t\leq T\\ 0,&\quad |t|>T.\end{cases}\end{align} We then have \int_{-\infty}^\infty \phi(t)\,\mathrm{d}t=\frac{1}{2T}\int_{-T}^T \Pi_T(t)\,\mathrm{d}t=1.

For any fixed x, \begin{align}\phi(x-t)=\begin{cases}\frac{1}{2T},&\quad x-T\leq t\leq x+T\\ 0,&\quad \text{else}.\end{cases}\end{align} Thus, for any given function f and any fixed value of x, \begin{align}f(t)\phi(x-t)=\begin{cases}\frac{1}{2T}f(t),&\quad x-T\leq t\leq x+T\\ 0,&\quad \text{else}.\end{cases}\end{align}

We conclude that f_{\text{avg}}(x)=\int_{-\infty}^\infty f(t)\phi(x-t)\,\mathrm{d}t. More generally, when we replace \phi with an arbitrary function, we have the definition of a convolution: the convolution of two functions f and g is given by f*g(x)=\int_{-\infty}^\infty f(t)g(x-t)\,\mathrm{d}t.

Reference: The Mathematics of Imaging by Timothy G. Freeman
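The weighted-average view of convolution is easy to check numerically. Below is a minimal sketch (my own illustration, not from the text), using NumPy with an arbitrarily chosen f(t)=\sin t and T=0.5: a discrete box kernel of total weight 1 recovers the moving average.

```python
import numpy as np

# Discrete analogue of f_avg(x) = (f * phi)(x) with phi = Pi_T / (2T).
T = 0.5
dx = 0.01
t = np.arange(-5, 5, dx)        # sample grid; t[500] == 0.0, t[600] == 1.0
f = np.sin(t)                   # arbitrary test function

# Box kernel: equal weights summing to 1, so convolving computes the
# mean of f over a window of width 2T.
width = int(round(2 * T / dx)) + 1   # odd length keeps the window centred
phi = np.ones(width) / width

f_avg = np.convolve(f, phi, mode="same")
# Exact moving average of sin over [x-T, x+T] is sin(x) * sin(T) / T.
```

For x away from the ends of the grid, f_avg agrees with the exact average \sin(x)\sin(T)/T; near the ends, mode="same" pads with zeros, so those values are edge artefacts.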

Friday, 1 July 2016

Nth root

We first study the sequence x_{n+1}=\dfrac{1}{2}(x_n+\dfrac{a}{x_n}), where a=2,x_1=1, and examine the numbers x_2,x_3,x_4,x_5: x_2=\dfrac{1}{2}(1+2)=1.5\\x_3=\dfrac{1}{2}(1.5+\dfrac{2}{1.5})\approx 1.4167\\x_4=\dfrac{1}{2}(1.4167+\dfrac{2}{1.4167})\approx1.4142\\x_5=\dfrac{1}{2}(1.4142+\dfrac{2}{1.4142})\approx 1.4142\\ \cdots

We can show that the sequence tends to \sqrt{2}. More generally, the limit exists (by the AM-GM inequality, x_n\geq \sqrt{a} for n\geq 2, and the sequence is then decreasing), so take l=\lim x_n; then l=\lim x_{n+1} and so \begin{align*}l&=\dfrac{1}{2}(l+\dfrac{a}{l})\\ 2l^2&=l^2+a\\ l&=\sqrt{a}\;\;\text{(reject the negative root because }\;x_n>0).\end{align*} This sequence helps us generate the square root of any positive number.

You may ask: how can we come up with such a sequence? The reasoning is as follows: if x_1\neq \sqrt{a}, then we have x_1<\sqrt{a} or x_1>\sqrt{a}. In the first case, multiplying both sides by \sqrt{a} gives \sqrt{a}\,x_1<a, i.e. \sqrt{a}<\dfrac{a}{x_1}. In the other case, if x_1>\sqrt{a}, then \sqrt{a}>\dfrac{a}{x_1}. Therefore, we have x_1<\sqrt{a}<\dfrac{a}{x_1} or \dfrac{a}{x_1}<\sqrt{a}<x_1. If we take the average of x_1 and \dfrac{a}{x_1}, namely x_2=\dfrac{1}{2}(x_1+\dfrac{a}{x_1}), then the result will be between x_1 and \dfrac{a}{x_1}. Repeat this process several times: x_3=\dfrac{1}{2}(x_2+\dfrac{a}{x_2}), x_4=\dfrac{1}{2}(x_3+\dfrac{a}{x_3}), \cdots The sequence (x_n) tends to \sqrt{a}.
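The iteration is easy to run. A minimal Python sketch (the function name babylonian_sqrt is my choice, not from the text):

```python
def babylonian_sqrt(a, x=1.0, tol=1e-12):
    """Approximate sqrt(a) for a > 0 via x_{n+1} = (x_n + a/x_n) / 2."""
    while abs(x * x - a) > tol:
        x = 0.5 * (x + a / x)
    return x
```

Starting from x_1=1 with a=2 reproduces the values 1.5, 1.4167, 1.4142, ... shown above.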

In fact, the sequence for the square root can be generalised to one for the nth root: x_{n+1}=\dfrac{1}{p}\bigg((p-1)x_n+\dfrac{a}{x_n^{p-1}}\bigg). The reasoning is similar: we consider \underbrace{x_n\times x_n\times \cdots\times x_n}_{p-1\;\text{times}}\times \dfrac{a}{x_n^{p-1}}=a and take the average of these p factors.

These sequences are related to the method of Newton-Raphson. Finding the square root is equivalent to solving the equation x^2-a=0. Let f(x)=x^2-a. Then, x_{n+1}=x_n-\dfrac{f(x_n)}{f'(x_n)}=x_n-\dfrac{x_n^2-a}{2x_n}=\dfrac{1}{2}(x_n+\dfrac{a}{x_n}). As for the nth root, we can solve the equation x^p-a=0 and we have x_{n+1}=x_n-\dfrac{x_n^p-a}{px_n^{p-1}}=x_n-\dfrac{1}{p}x_n+\dfrac{a}{px_n^{p-1}}=\dfrac{1}{p}\bigg((p-1)x_n+\dfrac{a}{x_n^{p-1}}\bigg).
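Under the same Newton-Raphson view, here is a short Python sketch of the pth-root iteration (the name nth_root and the fixed iteration count are my choices):

```python
def nth_root(a, p, x=1.0, iterations=60):
    """Approximate a**(1/p) for a > 0 via Newton's method on x^p - a = 0,
    i.e. x_{n+1} = ((p-1)*x_n + a/x_n^(p-1)) / p."""
    for _ in range(iterations):
        x = ((p - 1) * x + a / x ** (p - 1)) / p
    return x
```

For p = 2 this reduces exactly to the averaging sequence above.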

Saturday, 21 May 2016

Deriving Power Sums by Polynomial Interpolation

Most of us have seen these formulas: \sum_{k=1}^n k=1+2+\cdots+n=\dfrac{1}{2}n(n+1)\\ \sum_{k=1}^n k^2=1+4+\cdots+n^2=\dfrac{1}{6}n(n+1)(2n+1)\\ \sum_{k=1}^n k^3=1+8+\cdots+n^3=\dfrac{1}{4}n^2(n+1)^2\\ \sum_{k=1}^n k^4=1+16+\cdots+n^4=\dfrac{1}{30}n(n+1)(2n+1)(3n^2+3n-1). These can be proved by mathematical induction, but we can actually derive them on our own!

We simply acknowledge the following fact: for any positive integer p, the power sum \sum_{k=1}^n k^p can always be expressed as a polynomial f(n) that has degree p+1 and has rational coefficients.

Let's try to derive the first and the last formula. For the first, let f(n)=an^2+bn+c. We have f(1)=1,f(2)=3,f(3)=6. We want to solve the linear system: a+b+c=1\\ 4a+2b+c=3\\ 9a+3b+c=6, or equivalently in matrix form \begin{pmatrix}1&1&1\\ 4&2&1\\ 9&3&1\end{pmatrix}\begin{pmatrix}a\\b\\c\end{pmatrix}=\begin{pmatrix}1\\3\\6\end{pmatrix}. Note that \begin{pmatrix}1&1&1\\ 4&2&1\\ 9&3&1\end{pmatrix} is a Vandermonde matrix. Just type in vander([1 2 3])\[1;3;6] in Matlab and we have the solution: a=1/2,b=1/2,c=0.
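The same computation works in Python (a sketch; NumPy's np.vander builds the same decreasing-powers matrix as Matlab's vander):

```python
import numpy as np

# Rows of np.vander([1, 2, 3]) are [i^2, i, 1] -- exactly the matrix above.
V = np.vander([1, 2, 3])
a, b, c = np.linalg.solve(V, [1, 3, 6])   # right-hand side: f(1), f(2), f(3)
# a = 0.5, b = 0.5, c = 0, i.e. f(n) = n(n+1)/2
```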

For the last formula, let f(n)=an^5+bn^4+cn^3+dn^2+en+f. Then f(1)=1,f(2)=17,f(3)=98,f(4)=354,f(5)=979,f(6)=2275. Type vander([1 2 3 4 5 6])\[1;17;98;354;979;2275] in Matlab and we have a=1/5, b=1/2, c=1/3, d=0, e=-1/30, f=0. Finally, factor(1/5*x^5+1/2*x^4+1/3*x^3-1/30*x) gives the formula. The reader can also try to derive the formulas for \sum_{k=1}^n k^5 and \sum_{k=1}^n k^6.
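Floating-point solvers return decimals; to recover the rational coefficients exactly, one can instead eliminate over fractions.Fraction. A sketch (solve_exact is my helper, plain Gauss-Jordan elimination, not part of the original post):

```python
from fractions import Fraction

def solve_exact(A, b):
    """Solve A x = b exactly over the rationals by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]      # normalise pivot row
        for r in range(n):
            if r != col:                                # eliminate column col
                M[r] = [x - M[r][col] * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

pts = [1, 2, 3, 4, 5, 6]
powers = [sum(k ** 4 for k in range(1, n + 1)) for n in pts]  # 1, 17, 98, 354, 979, 2275
V = [[n ** e for e in range(5, -1, -1)] for n in pts]         # rows [n^5, ..., n, 1]
coeffs = solve_exact(V, powers)   # [1/5, 1/2, 1/3, 0, -1/30, 0]
```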

Sunday, 15 May 2016

Manipulations of Sums

Let a_1,a_2,\cdots be any sequence of numbers. We are often interested in sums such as a_1+\cdots+a_n. This sum can be written more compactly using either of the following equivalent notations: \sum_{i=1}^n a_i\quad \text{or}\quad \sum_{1\leq i\leq n} a_i. In general, if R(i) is any relation involving i, the symbol \sum_{R(i)} a_i means the sum of all a_i, where i is an integer satisfying the condition R(i). The letter i is a dummy variable or index variable that may be renamed without changing the result.

We now discuss four important algebraic operations on sums.

1) The distributive law, for products of sums: \bigg(\sum_{R(i)} a_i\bigg)\bigg(\sum_{S(j)} b_j\bigg)=\sum_{R(i)}\bigg(\sum_{S(j)} a_ib_j\bigg). Consider for example \begin{align}\bigg(\sum_{i=1}^2 a_i\bigg)\bigg(\sum_{j=1}^3 b_j\bigg)&=(a_1+a_2)(b_1+b_2+b_3)\\ &=(a_1b_1+a_1b_2+a_1b_3)+(a_2b_1+a_2b_2+a_2b_3)\\ &=\sum_{i=1}^2\bigg(\sum_{j=1}^3 a_ib_j\bigg).\end{align}

2) Interchanging the order of summation: \sum_{R(i)}\sum_{S(j)} a_{ij}=\sum_{S(j)}\sum_{R(i)} a_{ij}. For example, \sum_{R(i)}\sum_{j=1}^2 a_{ij}=\sum_{R(i)}(a_{i1}+a_{i2})\\ \sum_{j=1}^2 \sum_{R(i)} a_{ij}=\sum_{R(i)} a_{i1}+\sum_{R(i)} a_{i2}. This means that \sum_{R(i)}(b_i+c_i)=\sum_{R(i)}b_i+\sum_{R(i)}c_i, where b_i=a_{i1} and c_i=a_{i2}. This operation is very useful because we often know a simple form for \sum_{R(i)} a_{ij} but not for \sum_{S(j)} a_{ij}.

More generally, we have \sum_{R(i)}\sum_{S(i,j)}a_{ij}=\sum_{S'(j)}\sum_{R'(i,j)}a_{ij}, where S'(j) is the relation 'there is an integer i s.t. both R(i) and S(i,j) are true', and R'(i,j) is the relation 'both R(i) and S(i,j) are true'. For example, if the summation is \sum_{i=1}^n\sum_{j=1}^i a_{ij}, then S'(j) is the relation 'there is an integer i s.t. 1\leq i\leq n and 1\leq j\leq i', namely 1\leq j\leq n; and R'(i,j) is the relation '1\leq i\leq n and 1\leq j\leq i', namely j\leq i\leq n. Thus, \sum_{i=1}^n\sum_{j=1}^i a_{ij}=\sum_{j=1}^n\sum_{i=j}^n a_{ij}.

3) Change of variable: \sum_{R(i)} a_i=\sum_{R(j)} a_j=\sum_{R(p(j))} a_{p(j)}. The first equality holds because we are simply renaming the dummy variable. In the second, p(j) is a function of j that represents a permutation of the range: for each integer i satisfying the relation R(i), there is a unique integer j satisfying p(j)=i. This condition is always satisfied when p(j)=c+j or p(j)=c-j, where c is an integer not depending on j. For instance, \sum_{1\leq i\leq n} a_i=\sum_{1\leq i-1\leq n} a_{i-1}=\sum_{2\leq i\leq n+1} a_{i-1}. Note that for infinite sums, replacing j by p(j) is valid provided that \sum_{R(j)}|a_j| exists.
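The triangular-sum interchange derived under rule (2) is easy to sanity-check numerically; a small Python sketch with arbitrarily chosen values a_{ij}:

```python
# Check: sum_{i=1}^n sum_{j=1}^i a_ij  ==  sum_{j=1}^n sum_{i=j}^n a_ij
n = 6
a = {(i, j): i * 10 + j for i in range(1, n + 1) for j in range(1, n + 1)}  # arbitrary a_ij

lhs = sum(a[i, j] for i in range(1, n + 1) for j in range(1, i + 1))
rhs = sum(a[i, j] for j in range(1, n + 1) for i in range(j, n + 1))
```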

4) Manipulating the domain
If R(i) and S(i) are arbitrary relations, we have \sum_{R(i)}a_i+\sum_{S(i)}a_i=\sum_{R(i)\;\text{or}\;S(i)}a_i+\sum_{R(i)\;\text{and}\;S(i)}a_i. For example, \sum_{1\leq i\leq m}a_i+\sum_{m\leq i\leq n}a_i=\bigg(\sum_{1\leq i\leq n}a_i\bigg)+a_m, assuming that 1\leq m\leq n.

Examples from high school:
[The sum of a geometric progression] Assume that r\neq 1,n\geq 0. A conventional way of deriving the formula is to consider S=a+ar+\cdots+ar^n and S-rS: a+ar+\cdots+ar^n\\ \underline{-)\; ar+ar^2+\cdots+ar^{n+1}}\\ a-ar^{n+1}=S(1-r)\\ \Rightarrow S=a\dfrac{1-r^{n+1}}{1-r} This can be derived just as easily by manipulating sums: \begin{align}S=a+ar+\cdots+ar^n&=a+\sum_{i=1}^n ar^i\\ &=a+r\sum_{i=1}^n ar^{i-1}\\ &=a+r\sum_{i=0}^{n-1} ar^i\\ &=a+r\sum_{i=0}^n ar^i-ar^{n+1}\\ &=a+rS-ar^{n+1}.\end{align} Solving for S gives S=a\dfrac{1-r^{n+1}}{1-r}.
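A quick numerical check of the closed form, with arbitrarily chosen a, r, n:

```python
# S = a + ar + ... + ar^n  ==  a(1 - r^(n+1)) / (1 - r),  for r != 1
a, r, n = 3.0, 0.5, 10
term_sum = sum(a * r ** i for i in range(n + 1))
closed_form = a * (1 - r ** (n + 1)) / (1 - r)
```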
[The sum of an arithmetic progression] Assume n\geq 0. Then \begin{align}S=a+(a+d)+\cdots+(a+nd)&=\sum_{i=0}^n (a+di)\\ &=\sum_{i=0}^n (a+d(n-i))\quad\text{(replacing } i \text{ by } n-i\text{)}\\ &=\sum_{i=0}^n (a+dn-di)\\ &=\sum_{i=0}^n (2a+dn)-\sum_{i=0}^n (a+di)\\ &=(n+1)(2a+dn)-S.\end{align} Solving for S gives S=\dfrac{1}{2}(n+1)(2a+dn)=a(n+1)+\dfrac{1}{2}dn(n+1). Another example:
Let \begin{align}S&=\sum_{i=0}^n\sum_{j=0}^i a_ia_j\\ &=\sum_{j=0}^n\sum_{i=j}^n a_ia_j\\ &=\sum_{i=0}^n\sum_{j=i}^n a_ia_j, \end{align} where the last step renames the dummy variables i and j (valid since a_ia_j=a_ja_i). Then \begin{align}2S&=\sum_{i=0}^n\bigg(\sum_{j=0}^i a_ia_j+\sum_{j=i}^n a_ia_j\bigg)\\ &=\sum_{i=0}^n\bigg(\bigg(\sum_{j=0}^n a_ia_j\bigg)+a_ia_i\bigg)\quad\text{(the } j=i \text{ term is counted twice)}\\ &=\bigg(\sum_{i=0}^n a_i\bigg)^2+\bigg(\sum_{i=0}^n a_i^2\bigg)\end{align} We have derived the identity \sum_{i=0}^n\sum_{j=0}^i a_ia_j=\dfrac{1}{2}\bigg(\bigg(\sum_{i=0}^n a_i\bigg)^2+\bigg(\sum_{i=0}^n a_i^2\bigg)\bigg).
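Both results above are easy to verify numerically; a short sketch checking the arithmetic-progression sum and the double-sum identity on arbitrary data:

```python
# Arithmetic progression: a + (a+d) + ... + (a+nd) == (n+1)(2a+dn)/2
a0, d, n = 2, 3, 10
assert sum(a0 + d * i for i in range(n + 1)) == (n + 1) * (2 * a0 + d * n) // 2

# Double-sum identity: sum_{i=0}^n sum_{j=0}^i a_i a_j == ((sum a_i)^2 + sum a_i^2) / 2
a = [2, -1, 3, 5, 0, 4]                  # arbitrary sequence
lhs = sum(a[i] * a[j] for i in range(len(a)) for j in range(i + 1))
rhs = (sum(a) ** 2 + sum(x * x for x in a)) / 2
```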

Examples in analysis
In Riemann integration, we have the following result: Suppose f\in \mathscr{R}[a,b] and g\in \mathscr{C}([m,M]), where M=\sup\{f(x)\mid a\leq x\leq b\} and m=\inf\{f(x)\mid a\leq x\leq b\}. Then g\circ f\in \mathscr{R}[a,b].

Idea: Consider continuous points and discontinuous points separately.

Proof: Let \varepsilon>0. Since g is uniformly continuous on [m,M], there exists \delta>0 s.t. |g(x)-g(y)|<\varepsilon whenever x,y\in [m,M] and |x-y|<\delta. Also, since f\in \mathscr{R}[a,b], there exists a partition P=\{x_0,x_1,\cdots,x_n\} of [a,b] s.t. U(f,P)-L(f,P)<\varepsilon \delta. Consider A=\{i\mid M_i(f)-m_i(f)< \delta\} (indices where f oscillates little) and A^c=\{i\mid M_i(f)-m_i(f)\geq \delta\} (indices where f oscillates a lot). Indices i\in A cause no trouble: on such subintervals any two values of f differ by less than \delta, so any two values of g\circ f differ by less than \varepsilon. To handle i\in A^c, consider \begin{align}\sum_{i\in A^c}(x_i-x_{i-1})&=\dfrac{1}{\delta}\sum_{i\in A^c}\delta (x_i-x_{i-1})\\ &\leq \dfrac{1}{\delta}\sum_{i=1}^n(M_i(f)-m_i(f))(x_i-x_{i-1})\\ &=\dfrac{1}{\delta}(U(f,P)-L(f,P))\\ &<\varepsilon. \end{align}Therefore, \begin{align}U(g\circ f,P)-L(g\circ f,P)&=\sum_{i=1}^n\bigg(\sup_{[x_{i-1},x_i]}(g\circ f)-\inf_{[x_{i-1},x_i]}(g\circ f)\bigg)(x_i-x_{i-1})\\ &=\sum_{i\in A}\bigg(\sup_{[x_{i-1},x_i]}(g\circ f)-\inf_{[x_{i-1},x_i]}(g\circ f)\bigg)(x_i-x_{i-1})\\&+\sum_{i\in A^c}\bigg(\sup_{[x_{i-1},x_i]}(g\circ f)-\inf_{[x_{i-1},x_i]}(g\circ f)\bigg)(x_i-x_{i-1})\\ &\leq \varepsilon \sum_{i\in A}(x_i-x_{i-1})+C\sum_{i\in A^c}(x_i-x_{i-1})\\ &\leq \varepsilon(b-a)+C\varepsilon\\ &=(b-a+C)\varepsilon,\end{align} where C=\max\{g(u)\mid m\leq u\leq M\}-\min\{g(u)\mid m\leq u\leq M\}. It follows that g\circ f\in \mathscr{R}[a,b]. \Box

Say we want to evaluate \sum_{n=1}^\infty \dfrac{2n-1}{2^n} (which is convergent by the ratio test). Consider \begin{align}s_n&=\sum_{k=1}^n \dfrac{2k-1}{2^k}\\ &=\dfrac{1}{2}+\sum_{k=2}^n\bigg(\dfrac{2k-1}{2^k}-\dfrac{2k-3}{2^k}\bigg)+\sum_{k=2}^n \dfrac{2k-3}{2^k}\\ &=\dfrac{1}{2}+\sum_{k=2}^n\dfrac{1}{2^{k-1}}+\sum_{k=2}^n \dfrac{2(k-1)-1}{2^{(k-1)+1}}\\ &=\dfrac{1}{2}+\sum_{k=2}^n\dfrac{1}{2^{k-1}}+\sum_{k=1}^{n-1} \dfrac{2k-1}{2^{k+1}}\\ &=\dfrac{1}{2}+\sum_{k=2}^n\dfrac{1}{2^{k-1}}+\dfrac{1}{2}\bigg(s_n-\dfrac{2n-1}{2^n}\bigg).\end{align} Solving for s_n gives s_n=1+2\bigg(1-\dfrac{1}{2^{n-1}}\bigg)-\dfrac{2n-1}{2^n}. As n\to \infty, s_n\to 3.
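The partial sums can be checked against the closed form numerically (a small sketch):

```python
# s_n = sum_{k=1}^n (2k-1)/2^k and the closed form derived above.
def s(n):
    return sum((2 * k - 1) / 2 ** k for k in range(1, n + 1))

def closed(n):
    return 1 + 2 * (1 - 1 / 2 ** (n - 1)) - (2 * n - 1) / 2 ** n
```

Already s(50) is within 1e-12 of the limit 3.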

Many manipulations of sums and other formulas become significantly simpler if we use the bracket notation: [\text{statement}]=\begin{cases} 1,\quad \text{if the statement is true;}\\ 0,\quad \text{if the statement is false.}\end{cases} Then we can write \sum_{R(i)} a_i=\sum_i a_i[R(i)] since the terms of that infinite sum are zero when R(i) is false. In fact, we have seen this expression before -- the Kronecker delta is a special case of bracket notation: \delta_{ij}=[i=j]=\begin{cases}1,\quad \text{if }i=j\\ 0,\quad \text{if }i\neq j.\end{cases} With bracket notation, we can derive rule (3) from rules (1) and (2): \begin{align}\sum_{R(p(j))} a_{p(j)}&=\sum_j a_{p(j)}[R(p(j))]\\ &=\sum_j \sum_i a_i [R(i)][i=p(j)]\\ &=\sum_i a_i[R(i)]\sum_j [i=p(j)].\end{align} If p is a permutation of the range, then \sum_j [i=p(j)]=1 for each i with R(i), and we are left with \sum_i a_i [R(i)], which is \sum_{R(i)} a_i.

We can also derive rule (2): \sum_{i=1}^n\sum_{j=1}^i a_{ij}=\sum_{i,j} a_{ij}[1\leq i\leq n][1\leq j\leq i]=\sum_{i,j}a_{ij}[1\leq j\leq n][j\leq i\leq n]=\sum_{j=1}^n\sum_{i=j}^n a_{ij}, since [1\leq i\leq n][1\leq j\leq i]=[1\leq j\leq i\leq n]=[1\leq j\leq n][j\leq i\leq n].
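In Python, booleans already behave like Iverson brackets (True == 1, False == 0), so the derivation above can be mimicked directly; a sketch with arbitrarily chosen a_{ij}:

```python
n = 5

def a(i, j):
    return i ** 2 + 3 * j   # arbitrary test values

# Sum over a rectangle of (i, j) pairs large enough to contain the support;
# the bracket factors kill everything outside the triangular region.
grid = range(0, n + 2)
lhs = sum(a(i, j) * (1 <= i <= n) * (1 <= j <= i) for i in grid for j in grid)
rhs = sum(a(i, j) * (1 <= j <= n) * (j <= i <= n) for i in grid for j in grid)
```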

More to explore:
Möbius inversion formula
arithmetic function

Reference
illustration

Saturday, 7 May 2016

Evaluate \int_0^\pi \log(1-2a\cos x+a^2)\;dx

From the identity 1-a^{2n}=(1-a^2)\prod_{i=1}^{n-1}(1-2a\cos \dfrac{i\pi}{n}+a^2),\quad a\in \Bbb R, prove that \int_0^\pi \log(1-2a\cos x+a^2)\;dx=\begin{cases}0,\quad &|a|<1\\ 2\pi\log|a|,\quad &|a|>1.\end{cases}


The identity follows from finding the roots of \dfrac{1-a^{2n}}{1-a^2}, whose proof is similar to this problem.

I: Elementary real analysis
1-a^{2n}=(1-a^2)\prod_{i=1}^{n-1}(1-2a\cos \dfrac{i\pi}{n}+a^2)\\ \dfrac{1-a^{2n}}{1-a^2}=\prod_{i=1}^{n-1}(1-2a\cos \dfrac{i\pi}{n}+a^2)\\ 1+a^2+a^4+\cdots+a^{2n-2}=\prod_{i=1}^{n-1}(1-2a\cos \dfrac{i\pi}{n}+a^2)\\ \dfrac{\pi}{n}\log(1+a^2+a^4+\cdots+a^{2n-2})=\dfrac{\pi}{n}\bigg[\log(1-2a\cos \dfrac{\pi}{n}+a^2)+\log(1-2a\cos \dfrac{2\pi}{n}+a^2)+\cdots+\log(1-2a\cos \dfrac{(n-1)\pi}{n}+a^2)\bigg]

Recall the motivation of the definition of the Riemann sum: we approximate an area bounded by a curve by areas of rectangles. Partition [0,\pi] into n subintervals and sum the areas of rectangles, each with width \dfrac{\pi}{n} and height \log(1-2a\cos \dfrac{i\pi}{n}+a^2) for i=1,2,\cdots,n. (The right-hand side above has only n-1 terms, but the missing i=n term is \dfrac{\pi}{n}\log(1+a)^2\to 0, so including it does not affect the limit.) Therefore, we have \begin{align}\int_0^\pi \log(1-2a\cos x+a^2)\;dx&=\lim_{n\to \infty}\dfrac{\pi}{n}\sum_{i=1}^n \log(1-2a\cos \dfrac{i\pi}{n}+a^2)\\ &=\lim_{n\to \infty}\dfrac{\pi}{n}\log(1+a^2+a^4+\cdots+a^{2n-2}).\end{align}

When |a|<1, the sum 1+a^2+\cdots+a^{2n-2} lies between 1 and \dfrac{1}{1-a^2}, so \dfrac{\pi}{n}\log(1+a^2+\cdots+a^{2n-2})\to 0. It follows that the integral is 0 when |a|<1. As for |a|>1, the trick is to make use of the case |a|<1. Suppose a>1, so that \dfrac{1}{a}<1: \lim_{n\to \infty} \dfrac{\pi}{n}\log\bigg(a^{2n-2}\bigg(\dfrac{1}{a^{2n-2}}+\dfrac{1}{a^{2n-4}}+\cdots+1\bigg)\bigg)=\lim_{n\to \infty} \dfrac{2n-2}{n}\pi \log a=2\pi\log a, since the bracketed sum is bounded and its logarithm is killed by the factor \dfrac{1}{n}. The case a<-1 is identical with |a| in place of a, giving 2\pi\log|a|.
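Both cases can be verified by direct numerical quadrature (a sketch; plain trapezoidal rule with NumPy, sample values a = 0.5 and a = 2 chosen by me):

```python
import numpy as np

def I(a, n=100000):
    """Trapezoidal approximation of the integral of log(1 - 2a cos x + a^2) over [0, pi]."""
    x = np.linspace(0.0, np.pi, n + 1)
    y = np.log(1 - 2 * a * np.cos(x) + a * a)
    h = np.pi / n
    return h * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)
```

I(0.5) is numerically 0 and I(2.0) agrees with 2\pi\log 2; values of a close to \pm 1 converge slowly because the integrand develops a logarithmic singularity at an endpoint.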
II: Complex analysis

More to explore
other proofs
another proof
yet another proof
Evaluating a similar integral by complex analysis methods