Thursday 29 January 2015

Interesting Graphs: Catenary


If a flexible string is suspended under gravity by its two ends, the shape resembles a chain (Latin catena). That's why the curve is called a catenary. The equation of a catenary in Cartesian coordinates: $y=a\cosh \frac{x}{a}$, where $a$ is a constant; $a$ depends on the mass per unit length and tension of the string. The derivation of equation for the curve can be found here.

Catenaries occur naturally, since they minimize the gravitational potential energy of a string or rope whose location is fixed at two ends, which is equivalent to minimizing the area under the string. But they are also optimal for architects when a flexible cable (or its equivalent) is subject to a uniform force (e.g., gravity or the weight of a bridge, etc.).

Catenoids
The catenoid is a surface of revolution of a catenary curve rotated around its directrix.



Claim: Catenoids are minimal surfaces, that is, surfaces that (locally) minimize area for a given boundary.

Definition: A minimal surface is a surface $M$ with mean curvature $H=0$ at all points $p \in M$.

Mean curvature $\large H=\frac{Eg+Ge-2Ff}{2(EG-F^2)}$ [Proof: later]

Proof of the claim:

A catenoid can be parametrized by $$x(u,v)=(a\cosh v \cos u,a\cosh v \sin u,av).$$ We then evaluate the partial derivatives of $x$: $$x_u = (-a\cosh v \sin u,a\cosh v \cos u,0)\\
x_v = (a\sinh v \cos u,a\sinh v \sin u,a)$$ We know that $$n=\frac{x_u \times x_v}{|x_u \times x_v|}=\frac{(\cos u,\sin u,-\sinh v)}{\cosh v},$$ and so we have the coefficients of the first fundamental form: $$E=x_u \cdot x_u=a^2\cosh^2 v\\
F=x_u \cdot x_v=0\\
G=x_v \cdot x_v=a^2\sinh^2 v+a^2=a^2\cosh^2 v,$$ and the coefficients of the second fundamental form: $$e=n \cdot x_{uu}=-a\\
f=n \cdot x_{uv}=0\\
g=n \cdot x_{vv}=a.$$ Substituting these values into $H$, we have $$H=\frac{Eg+Ge-2Ff}{2(EG-F^2)}=\frac{a^3\cosh^2 v-a^3\cosh^2 v}{2a^4\cosh^4 v}=0.$$ Thus, the catenoid is a minimal surface.
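The computation above can be spot-checked numerically. The following is a minimal sketch, assuming the parametrization $x(u,v)=(a\cosh v\cos u, a\cosh v\sin u, av)$ from the proof; the function name `mean_curvature` and the sample point are illustrative choices, not part of the original post.

```python
import math

def mean_curvature(a, u, v):
    """Mean curvature H of the catenoid x(u,v) = (a cosh v cos u, a cosh v sin u, a v)."""
    c, s = math.cosh(v), math.sinh(v)
    cu, su = math.cos(u), math.sin(u)

    # First and second partial derivatives, written out by hand.
    x_u = (-a * c * su, a * c * cu, 0.0)
    x_v = (a * s * cu, a * s * su, a)
    x_uu = (-a * c * cu, -a * c * su, 0.0)
    x_uv = (-a * s * su, a * s * cu, 0.0)
    x_vv = (a * c * cu, a * c * su, 0.0)

    def dot(p, q):
        return sum(pi * qi for pi, qi in zip(p, q))

    def cross(p, q):
        return (p[1] * q[2] - p[2] * q[1],
                p[2] * q[0] - p[0] * q[2],
                p[0] * q[1] - p[1] * q[0])

    # Unit normal n = (x_u x x_v) / |x_u x x_v|.
    n = cross(x_u, x_v)
    length = math.sqrt(dot(n, n))
    n = tuple(ni / length for ni in n)

    # Coefficients of the first and second fundamental forms.
    E, F, G = dot(x_u, x_u), dot(x_u, x_v), dot(x_v, x_v)
    e, f, g = dot(n, x_uu), dot(n, x_uv), dot(n, x_vv)
    return (E * g + G * e - 2 * F * f) / (2 * (E * G - F * F))

print(mean_curvature(2.0, 0.7, -0.4))  # should be zero up to rounding error
```

Any other choice of $a>0$, $u$ and $v$ should give a value that is zero up to floating-point rounding.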

2nd method:
http://www.princeton.edu/~rvdb/WebGL/catenoid_explanation.html

Similarity of catenary and parabola
In the previous post, we have proved that $\cosh x=\frac{e^x+e^{-x}}{2}$, then we have $\cosh x=1+\frac{x^2}{2!}+\frac{x^4}{4!}+...$
For $x\approx 0$, $\cosh x \approx 1+\frac{x^2}{2}$.
RHS represents a parabola, so we conclude that for small $x$, a catenary can be approximated by a parabola.
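To see how quickly the approximation degrades, here is a small sketch that compares $\cosh x$ with $1+\frac{x^2}{2}$ at a few sample points. The helper name `parabola` and the sample points are arbitrary illustrative choices.

```python
import math

def parabola(x):
    """Second-order Taylor approximation of cosh x around 0."""
    return 1 + x**2 / 2

for x in (0.1, 0.5, 1.0, 2.0):
    gap = math.cosh(x) - parabola(x)
    print(f"x={x}: cosh={math.cosh(x):.6f}, parabola={parabola(x):.6f}, gap={gap:.6f}")
```

The gap is dominated by the next Taylor term $\frac{x^4}{4!}$, so it is tiny near $0$ and grows rapidly as $|x|$ increases.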

A common misconception is that a parabola can be used to construct an arch. If we look at the function for a parabola, we see that the slope at any point is given by $2x$ and is changing linearly. On the other hand if we look at the function $\cosh x$, we see that the slope at any point is given by $\frac{e^x-e^{-x}}{2}$, which means the slope of a catenary curve is changing exponentially. In other words, the legs of an inverted catenary curve will be straighter at the base of the arch compared to an inverted parabola, giving the structure more horizontal support.

Applications
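Real-life examples of catenaries include a rope hanging between two posts, each strand of a spider web, the cables of a suspension bridge while they hang freely (once they carry a uniformly distributed deck load, their shape is closer to a parabola), and, in inverted form, the Gateway Arch in St. Louis.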

If you use physics to model the differential equation describing the effect of a uniform force on a flexible cable, its solution will be of the form $A\cosh ax$.

Reference:
http://www.ias.ac.in/resonance/Volumes/11/08/0081-0085.pdf
http://aleph0.clarku.edu/~djoyce/ma131/gallery.pdf
http://www.princeton.edu/~rvdb/WebGL/catenoid_explanation.html

Wednesday 28 January 2015

$68\%, 95\%, 99.7\%$

When we learn about standard deviations, we learn about 68%, 95%, 99.7%, the percentage of values that lie within one, two and three standard deviations of the mean for a normal distribution. Have you ever wondered how these numerical values come about?

They come from the cumulative distribution function of the normal distribution:
[Proof: pending] $\large \Phi(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-\frac{1}{2}t^2}dt$

Note that $\phi(x) \ne \Phi(x)$. $\phi(x)$ refers to height, whereas $\Phi(x)$ refers to area.


For example, $\large \Phi(2)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{2}e^{-\frac{1}{2}t^2}dt$. This integral cannot be expressed in closed form, so we resort to numerical integration, which gives $\Phi(2) \approx 0.9772$, or $P(x\leq\mu+2\sigma) \approx 0.9772$. To compute the probability that an observation is within two standard deviations of the mean, that is, the area shaded in light blue:

$\begin{align}P(\mu-2\sigma \leq x \leq \mu+2\sigma)&=\Phi(2)-\Phi(-2)\\&=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{2}e^{-\frac{1}{2}t^2}dt-\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{-2}e^{-\frac{1}{2}t^2}dt\\&\approx 0.9772-(1-0.9772)\\&\approx 0.9545\end{align}$

Remark: $\Phi(x)=1-\Phi(-x)$.
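The three percentages can be reproduced with a few lines of code. This sketch relies on the standard identity $\Phi(x)=\frac{1}{2}\left[1+\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right]$, which is not derived in this post; the function name `Phi` is just a label chosen here.

```python
import math

def Phi(x):
    """Standard normal CDF, expressed through the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

for k in (1, 2, 3):
    # Probability of lying within k standard deviations of the mean.
    print(k, round(Phi(k) - Phi(-k), 4))
```

The printed values should match the 68%, 95% and 99.7% figures to the precision quoted above.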

Tuesday 27 January 2015

Hyperbolic sines and cosines

We all know that the equation for the unit circle is $x^2+y^2=1$. What about the unit hyperbola? It is $x^2-y^2=1$. Since $\cosh t, \sinh t$ satisfy this equation, that is, $\cosh^2 t - \sinh^2 t=1$, they are called hyperbolic functions.

In this post, we will prove two results: $$\cosh a=\frac{e^a+e^{-a}}{2}\\ \sinh a=\frac{e^a-e^{-a}}{2}.$$

In the right figure, the area of the sector of the rectangular hyperbola $x^2-y^2=1$ bounded by the x-axis, ray, and hyperbola is $\dfrac{a}{2}$.

Proof:

Prerequisite knowledge:
Area of the sector of a parametric curve
Area of the sector bounded by two radii and the arc $P_0P$ of a parametric curve is given by $A=\frac{1}{2}\int_{t_0}^t [x(t)y'(t)-y(t)x'(t)]dt$ where the parametric values, $t_0$ and t relate to the endpoints, $P_0$ and P of the arc, respectively.

Now, we can find the area of the sector.
Since $\cosh^2 a-\sinh^2 a=1$, we have a parametric curve $x=\cosh a, y=\sinh a$.
Then $x'(a)=\sinh a, y'(a)=\cosh a$.
$x(a)y'(a)=\cosh^2 a, y(a)x'(a)=\sinh^2 a$
$A=\frac{1}{2}\int_0^a (\cosh^2 a-\sinh^2 a)da=\frac{1}{2}\int_0^a da=\frac{a}{2} \:\Box$


$\frac{a}{2}=\frac{1}{2}xy-\int_1^x y\:dx$
$a=x\sqrt{x^2-1}-2\int_1^x \sqrt{x^2-1}\:dx$
$2\int_1^x \sqrt{x^2-1}\:dx\\=2\int_0^{\sec^{-1} x} \sqrt{\sec^2 u-1} \tan u \sec u \:du\\=2\int_0^{\sec^{-1} x} \tan^2 u \sec u\:du\\=2\int_0^{\sec^{-1} x} \frac{\sin^2 u}{\cos^3 u}du\\=\int_0^{\sec^{-1} x} \sin u \: d(\cos^{-2} u)\\=\frac{\sin u}{\cos^2 u}|_0^{\sec^{-1} x}-\int_0^{\sec^{-1} x}\cos^{-2} u \cos u \: du \quad (*)\\=xy-\int_0^{\sec^{-1} x}\sec u\:du\\=xy-[\ln|\sec u+\tan u|]_0^{\sec^{-1} x}\\=xy-\ln|x+\sqrt{x^2-1}| \quad (**)$
$a=xy-xy+\ln|x+\sqrt{x^2-1}|=\ln|x+\sqrt{x^2-1}|$
$e^a=x+\sqrt{x^2-1}$
$e^{2a}=x^2+2x\sqrt{x^2-1}+x^2-1=2x(x+\sqrt{x^2-1})-1=2xe^a-1$
$e^{2a}+1=2xe^a$
$e^a+e^{-a}=2x \Rightarrow x \equiv \cosh a = \frac{e^a+e^{-a}}{2}$
$\sinh a \equiv y=\sqrt{(\frac{e^a+e^{-a}}{2})^2-1}=\sqrt{\frac{e^{2a}+2+e^{-2a}}{4}-1}=\sqrt{\frac{(e^a-e^{-a})^2}{4}}=\frac{e^a-e^{-a}}{2} \Box$

Explanations:
$(*) \frac{\sin u}{\cos^2 u}|_0^{\sec^{-1} x}=\sec u \tan u |_0^{\sec^{-1} x}$
$(**) \tan (\sec^{-1} x)=\:?\\ \sec^{-1} x=y\\ \sec y=x\\ \sec^2 y=x^2\\ 1+\tan^2 y=x^2\\ \tan^2 y=x^2-1\\ \
\therefore \tan (\sec^{-1} x)=\sqrt{x^2-1}$

Relationship to differential equation
In physics, one of the most important differential equations is $y^{\prime \prime}(x)+a^2y(x)=0$. The solution of this equation is $y=A \cos(ax)+B \sin(ax)$, where A and B are constants. (Verify it yourself: differentiate the expression twice.) By analogy, you might guess that the solution to $y^{\prime \prime}(x) - a^2y(x)=0$ is $y=A \cosh(ax)+B \sinh(ax)$. In fact, it is the solution!
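If you would rather check this numerically than by hand, the sketch below approximates $y''$ with a central finite difference and evaluates the residual $y''-a^2y$. The constants `A`, `B`, `a` and the step size `h` are arbitrary values picked for illustration.

```python
import math

def y(x, A=2.0, B=-1.0, a=1.5):
    """Candidate solution y = A cosh(ax) + B sinh(ax)."""
    return A * math.cosh(a * x) + B * math.sinh(a * x)

def second_derivative(func, x, h=1e-4):
    """Central finite-difference approximation of func''(x)."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h**2

a = 1.5
# Residual of y'' - a^2 y at a few sample points; each should be close to 0.
residuals = [second_derivative(y, x) - a**2 * y(x) for x in (-1.0, 0.0, 0.5, 2.0)]
print(residuals)
```

The residuals are not exactly zero because the second derivative is only approximated, but they should be far smaller than the values of $y$ themselves.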

Application of hyperbolic trig functions
They are used in the field of engineering, and can be used to solve second order ordinary differential equations. Going beyond this, we can often find hyperbolic trig functions being used in architecture. In particular, the cosh function is used to trace out a curve called a catenary, which is formed from simply hanging a string from two equally high points. We will discuss more about catenary in another post.

References:
http://math.scu.edu/~dostrov/Hyperbolic_Functions.pdf
http://www.nabla.hr/CL-DefiniteIntAppl2.htm
http://www.mathed.soe.vt.edu/Undergraduates/EulersIdentity/HyperbolicTrig.pdf

Monday 26 January 2015

Binomial and Poisson distributions

Let's say in a town with a population of 60000, we would expect 1 in 40000 of the population suffering from a rare disease in a year. Then the expected number of cases is 60000 $\large \cdot \frac{1}{40000}$ or 1.5.

This situation can be modelled by the binomial distribution.
P(get disease) = $\large \frac{1}{40000}$ and P(not get disease) = $\large \frac{39999}{40000}$.

Probability of 5 cases among 60000 people (and thus 59995 people not getting the disease):
$\large C_5^{60000} (\frac{39999}{40000})^{59995}(\frac{1}{40000})^5 \approx 0.0141$

What we are probably interested in, however, is not the probability of exactly 5 cases but that of 5 or more cases.

P(5 or more cases)
$= 1 - [P(0) + P(1) + P(2) + P(3) + P(4)]\\
= \large 1-(\frac{39999}{40000})^{60000}-C_1^{60000} (\frac{39999}{40000})^{59999}\frac{1}{40000}\\ \large-C_2^{60000} (\frac{39999}{40000})^{59998}(\frac{1}{40000})^2-C_3^{60000} (\frac{39999}{40000})^{59997}(\frac{1}{40000})^3\\ \large-C_4^{60000} (\frac{39999}{40000})^{59996}(\frac{1}{40000})^4\\
= \large 1-0.223-0.335-0.251-0.126-0.047\\
= \large 0.019$

As you can see, the calculation for this binomial distribution is tedious. In fact, we can approximate the binomial terms as follows, giving us a completely different distribution: the Poisson distribution. We assume the event is rare but there are many opportunities for it to occur, that is, p is small and n is large.

Let $\large (\frac{39999}{40000})^{60000}=k$, a constant.

Then P(1)
$\Large =C_1^{60000} (\frac{39999}{40000})^{59999}\frac{1}{40000}=\frac{60000\cdot(\frac{39999}{40000})^{60000}\cdot\frac{40000}{39999}}{40000}\\
\Large =k\cdot \frac{60000}{39999}\approx k\cdot\frac{60000}{40000}=k\cdot 1.5$

In the same vein, we found P(2) to be approximately $k\cdot \frac{(1.5)^2}{2}$.

Now do you notice something?

$\text{Number of cases} \:\:\:\:\: 0 \:\:\:\:\:\:\:\:\:\: 1 \:\:\:\:\:\:\:\:\:\:\:\:\:\:\: 2 \:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\: 3 \:\:\:\:\:\:\:\:\:\:\:\:\:\:\:\: 4 \:\:\:\:\:\:\: \ldots\\
\text{Probability} \:\:\:\:\:\:\:\:\:\:\:\:\:\:\: k \:\:\:\:\: k \cdot 1.5 \:\:\:\:\: \frac{k \cdot (1.5)^2}{2!} \:\:\:\:\: \frac{k \cdot (1.5)^3}{3!} \:\:\:\:\: \frac{k \cdot (1.5)^4}{4!} \:\:\:\:\: \ldots $

Sum of probabilities = 1
$\begin{align} k + 1.5k + \frac{(1.5)^2}{2!}\cdot k + \frac{(1.5)^3}{3!}\cdot k + … &=1\\
k (1+1.5+\frac{(1.5)^2}{2!}+\frac{(1.5)^3}{3!}+…) &=1\\
k \cdot e^{1.5} &=1\\
k&=e^{-1.5}\end{align}$
$\large P(X=r)=\frac{e^{-1.5}(1.5)^r}{r!}$, where the discrete random variable X denotes the number of cases of the disease.

This can be generalised to the Poisson distribution with mean λ for which $\large P(X=r)=e^{-\lambda}\frac{\lambda^r}{r!}$.
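To see how close the approximation is for the disease example, here is a short sketch that evaluates both distributions side by side. The function names `binomial_pmf` and `poisson_pmf` are labels chosen here; `math.comb` requires Python 3.8 or later.

```python
import math

n, p = 60000, 1 / 40000
lam = n * p  # expected number of cases, 1.5

def binomial_pmf(r):
    """Exact binomial probability of r cases among n people."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r):
    """Poisson approximation with mean lam."""
    return math.exp(-lam) * lam**r / math.factorial(r)

for r in range(6):
    print(r, round(binomial_pmf(r), 4), round(poisson_pmf(r), 4))

# Probability of 5 or more cases under each model.
tail_binomial = 1 - sum(binomial_pmf(r) for r in range(5))
tail_poisson = 1 - sum(poisson_pmf(r) for r in range(5))
print(round(tail_binomial, 4), round(tail_poisson, 4))
```

With $n$ this large and $p$ this small, the two columns should agree to about four decimal places.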

More to explore:
http://www.math.uah.edu/stat/expect/Properties.html
http://web.mit.edu/jorloff/www/18.05/pdf/class6-prep-a.pdf
http://math.arizona.edu/~jwatkins/h-expectedvalue.pdf
P40-52, 134-137

Saturday 24 January 2015

Thursday 22 January 2015

Use of vectors

Prove trigonometry identities
Sine formula $\LARGE \frac{a}{\sin\alpha}=\frac{b}{\sin\beta}=\frac{c}{\sin\gamma}$
$\vec{A}+\vec{B}+\vec{C}=\vec{0}\\
\begin{align}0&=\vec{A}\times (\vec{A}+\vec{B}+\vec{C})\\
&=\vec{A}\times\vec{B}+\vec{A}\times\vec{C}\\
\vec{B}\times\vec{A}&=\vec{A}\times\vec{C}\\
ba \sin\gamma&=ac \sin\beta\\
\frac{b}{\sin\beta}&=\frac{c}{\sin\gamma}\\
\end{align}$
Similarly, expanding $\vec{B}\times (\vec{A}+\vec{B}+\vec{C})$ gives $\Large \frac{a}{\sin\alpha}=\frac{c}{\sin\gamma}$ $\Box$

Cosine formula $a^2+b^2-c^2=2ab\cos\gamma$
$\begin{align}c^2&=\vec{c}\cdot\vec{c}\\
&=(\vec{a}+\vec{b})\cdot(\vec{a}+\vec{b})\\
&=\vec{a}\cdot\vec{a}+2\vec{a}\cdot\vec{b}+\vec{b}\cdot\vec{b}\\
&=a^2+2ab\cos(\pi-\gamma)+b^2\\
&=a^2-2ab\cos\gamma+b^2\:\:\Box
\end{align}$

Prove inequalities

$\sqrt{a^2+(1-b)^2}+\sqrt{b^2+(1-c)^2}+\sqrt{c^2+(1-a)^2}\geq \dfrac{3}{\sqrt{2}}$
Proof: Let $\vec{x}=\small\begin{pmatrix}a \\ 1-b\end{pmatrix}$, $\vec{y}=\small\begin{pmatrix}b \\ 1-c\end{pmatrix}$,$\vec{z}=\small\begin{pmatrix}c \\ 1-a\end{pmatrix}$.
By Minkowski's inequality, $|\vec{x}|+|\vec{y}|+|\vec{z}| \geq |\vec{x}+\vec{y}+\vec{z}|$.
$\begin{align}\sqrt{a^2+(1-b)^2}+\sqrt{b^2+(1-c)^2}+\sqrt{c^2+(1-a)^2}&\geq \sqrt{(a+b+c)^2+[3-(a+b+c)]^2}\\&=\sqrt{2(a+b+c-\dfrac{3}{2})^2+\dfrac{9}{2}}\\&\geq \dfrac{3}{\sqrt{2}}\:\Box \end{align}$

$\sqrt{x^2+1}+\sqrt{y^2+1}+\sqrt{z^2+1}\geq \sqrt{6(x+y+z)}\quad x,y,z>0$
Proof: Let $\vec{a}=\small\begin{pmatrix}x \\ 1\end{pmatrix}$, $\vec{b}=\small\begin{pmatrix}y \\ 1\end{pmatrix}$,$\vec{c}=\small\begin{pmatrix}z \\ 1\end{pmatrix}$.
$\begin{align}|\vec{a}|+|\vec{b}|+|\vec{c}| &\geq |\vec{a}+\vec{b}+\vec{c}| \\
\iff \sqrt{x^2+1}+\sqrt{y^2+1}+\sqrt{z^2+1}&\geq \sqrt{(x+y+z)^2+3^2}\\
&\geq \sqrt{6(x+y+z)}\:\Box\end{align}$

Tuesday 20 January 2015

Application of linear algebra: searching database

Basic idea:
$\large \cos\theta=\frac{|\vec{a} \cdot \vec{b}|}{|\vec{a}||\vec{b}|}=\vec{x}^T\vec{y}$
$\vec{x},\vec{y}$: unit vectors of $\vec{a}$ and $\vec{b}$ respectively
$\theta$: angle between $\vec{x}$ and $\vec{y}$.

database matrix: $\large \vec{x}=\frac{\vec{a}}{|\vec{a}|}$
search vector: $\large \vec{y}=\frac{\vec{b}}{|\vec{b}|}$

If $\cos\theta = 0$, $\theta=90^\circ$, the document does not contain any of the search words and the corresponding column vector of the database matrix is orthogonal to the search vector.

If $\cos\theta$ is close to $1$, $\theta \sim 0$, the data corresponding to that vector best matches our search criteria.
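The idea can be sketched in a few lines. Everything specific below (the four-word vocabulary, the document names and the word counts) is hypothetical data invented for illustration; only the cosine formula comes from the text above.

```python
import math

def normalize(v):
    """Return the unit vector in the direction of v."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def cosine(a, b):
    """cos(theta) between a and b, computed as x^T y for unit vectors x, y."""
    x, y = normalize(a), normalize(b)
    return sum(xi * yi for xi, yi in zip(x, y))

# Hypothetical vocabulary: ["matrix", "vector", "eigenvalue", "integral"].
# Each document is a column of word counts in that order.
documents = {
    "doc_linear_algebra": [3, 2, 1, 0],
    "doc_calculus": [0, 0, 0, 4],
    "doc_mixed": [1, 1, 0, 1],
}
query = [1, 1, 0, 0]  # search words: "matrix" and "vector"

# Rank documents from best match (cosine near 1) to worst (cosine 0).
ranking = sorted(documents, key=lambda name: cosine(documents[name], query), reverse=True)
for name in ranking:
    print(name, round(cosine(documents[name], query), 4))
```

The document that shares no words with the query gets a cosine of exactly $0$, matching the orthogonality remark above.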

More to explore:
Latent Semantic Indexing (LSI)
Singular value decomposition
Covariance
Least squares problem

Monday 19 January 2015

Cross Product and Lagrange's Identity

$\begin{align}|x \times y|^2&=\begin{vmatrix}x\cdot x&x\cdot y\\x\cdot y & y\cdot y \end{vmatrix}\\&=|x|^2|y|^2-(x\cdot y)^2\\
&=|x|^2|y|^2(1-\cos^2 \theta)\\
&=|x|^2|y|^2\sin^2 \theta\end{align}$

Proof of $|x \times y|^2=|x|^2|y|^2-(x\cdot y)^2$ (Lagrange's Identity):

$\large{\begin{align}RHS
&=\sum_{i=1}^n {x_i}^2 \sum_{i=1}^n {y_i}^2-(\sum_{i=1}^n {x_i y_i})^2\\
&=\sum_{i=1}^n {x_i}^2 \sum_{j=1}^n {y_j}^2-\sum_{i=1}^n {x_i y_i}\sum_{j=1}^n {x_j y_j}\\
&=\sum_{i,j=1}^n {x_i}^2 {y_j}^2-\sum_{i,j=1}^n {x_i y_i x_j y_j}\\
&=\sum_{i<j} {x_i}^2{y_j}^2+\sum_{i=1}^n {x_i}^2 {y_i}^2+\sum_{i>j} {x_i}^2{y_j}^2\\
& -\sum_{i<j} x_i y_i x_j y_j- \sum_{i=1}^n{x_i}^2{y_i}^2-\sum_{i>j} x_i y_i x_j y_j\\
&=\sum_{i<j} {x_i}^2 {y_j}^2+\sum_{i<j} {x_j}^2 {y_i}^2-2\sum_{i<j} x_i y_i x_j y_j\\
&=\sum_{i<j} ({x_i}^2{y_j}^2-2x_i y_i x_j y_j+{x_j}^2{y_i}^2)\\
&=\sum_{i<j} (x_i y_j-x_j y_i)^2\\
&=LHS \end{align}}$

Computations with cross products:

Area of triangle in space
Let $u,v,w$ be vertices of a triangle. Treat them as arrows from the origin. Then, two sides of the triangle can be given by $v-u$ and $w-u$. Thus, the area of the triangle is $\frac{1}{2}|(v-u)\times (w-u)|$.
By the bilinearity of cross products, $(\boldsymbol{v-u})\times (\boldsymbol{w-u})=v\times w-u\times w-v\times u+u\times u$.
By antisymmetry, $u\times u=0$.
Therefore, the area of the triangle is given by $\frac{1}{2}|v\times w+w\times u+u\times v|$. Note the cyclic pattern.

Remark: We can make use of this result to prove the three dimensional Pythagorean theorem. Consider the tetrahedron with vertices $A=(a,0,0)$, $B=(0,b,0)$, $C=(0,0,c)$ and $O=(0,0,0)$. For any three points $X,Y,Z$ in space, write $[XYZ]$ for the area of the triangle with vertices $X,Y,Z$. Then, the three dimensional Pythagorean theorem states that $[ABC]^2=[OAB]^2+[OBC]^2+[OCA]^2$.
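A quick numerical check of this remark, using the cross-product area formula from above. This is a sketch only: the edge lengths `a, b, c = 3, 4, 12` are an arbitrary choice, and the helper names are made up for this example.

```python
import math

def sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def triangle_area(u, v, w):
    """Area of the triangle with vertices u, v, w: |(v-u) x (w-u)| / 2."""
    n = cross(sub(v, u), sub(w, u))
    return 0.5 * math.sqrt(sum(ni * ni for ni in n))

a, b, c = 3.0, 4.0, 12.0
O, A, B, C = (0, 0, 0), (a, 0, 0), (0, b, 0), (0, 0, c)

lhs = triangle_area(A, B, C) ** 2
rhs = (triangle_area(O, A, B) ** 2
       + triangle_area(O, B, C) ** 2
       + triangle_area(O, C, A) ** 2)
print(lhs, rhs)
```

Both sides should come out as $936$ for these edge lengths, since the three right-angled faces have areas $6$, $24$ and $18$.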

Equation of the plane spanned by two vectors (skipped)

Intersection of two planes
Say we have two non-parallel planes $ax+by+cz=0$ and $a'x+b'y+c'z=0$. They have normal vectors $n=(a,b,c)$ and $n'=(a',b',c')$ respectively. The vector $n \times n'$ is perpendicular to both $n$ and $n'$, so it lies in both planes. Since both planes pass through the origin, their line of intersection is the line through the origin in the direction of $n \times n'$.

References:
http://www.owlnet.rice.edu/~fjones/chap7.pdf
https://www2.bc.edu/~reederma/Linalg13.pdf

Useful:
http://www.odeion.org/pythagoras/pythag3d.html
http://www.netcomuk.co.uk/~jenolive/homevec.html

Sunday 18 January 2015

Interesting Graphs: Conics

Conic sections: intersection curves of a plane and a right circular conical surface



The four conic sections (hyperbola, parabola, ellipse, circle) are produced when the plane does NOT pass through the vertex. When the plane passes through the vertex, degenerate conics (two intersecting lines, a line, a point) will be produced.

Conics are generally given by a second degree equation: $Ax^2+Bxy+Cy^2+Dx+Ey+F=0$. They are classified by the discriminant $Δ=B^2-4AC$:
$Δ>0$ hyperbola, pair of intersecting lines
$Δ<0$ ellipse, circle, point or no graph
$Δ=0$ parabola, line, pair of parallel lines or no graph

Demonstration:


Green: Hyperbola $Δ=5^2-4(2)(2)=9>0$
Orange: Ellipse $Δ=1^2-4(2)(2)=-15<0$
Blue: Circle $Δ=0^2-4(2)(2)=-16<0$
Purple: Parabola $Δ=4^2-4(2)(2)=0$

 

Hyperbola: the set of points in a plane whose distances from two fixed points (foci) in the plane have a constant difference.
Implicit form: $\dfrac{(x-h)^2}{a^2}-\dfrac{(y-k)^2}{b^2}=1$
Parametric form: $x=h+a\sec\theta$, $y=k+b\tan\theta$

 

Ellipse: the set of all points in a plane whose distances from two fixed points (foci) in the plane have a constant sum.
Implicit form: $\dfrac{(x-h)^2}{a^2}+\dfrac{(y-k)^2}{b^2}=1$
Parametric form: $x=h+a\cos\theta$, $y=k+b\sin\theta$

Ellipses and hyperbolas are called central conics because they have a centre of symmetry, while parabolas are non-central. For an ellipse, a and b are the semi-axis lengths: the larger of the two is the semi-major axis and the smaller is the semi-minor axis. For a hyperbola written in the form above, a is the semi-transverse axis and b is the semi-conjugate axis, regardless of which is larger.

Circle:
Implicit form: $(x-h)^2+(y-k)^2=r^2$
Parametric form: $x=h+r\cos\theta$, $y=k+r\sin\theta$

 

Parabola:
Implicit form: $x^2=4py$
Parametric form: $x=t$, $y=\dfrac{t^2}{4p}$

How to distinguish between non-degenerate and degenerate conics?
[pending]

Conics in matrix form
Each point $\vec{x}=(x,y)$ is considered to be a column vector with 1 as its third component, i.e. $\vec{x}=\begin{pmatrix} x\\y\\1\end{pmatrix}$ and $\vec{x}^T= (x, y, 1)$. The six coefficients of the general second degree polynomial are then used to construct a 3x3 symmetric matrix as follows: $Q=\begin{pmatrix} A&B/2&D/2\\B/2&C&E/2\\D/2&E/2&F\end{pmatrix}$. The off-diagonal coefficients are halved because each of them appears twice in the product.
$\vec{x}^TQ\vec{x}=0$
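As a sanity check on the matrix form, the sketch below builds the symmetric matrix with half of each mixed coefficient off the diagonal (an assumption needed for the quadratic form to reproduce $Ax^2+Bxy+Cy^2+Dx+Ey+F$ exactly) and compares the two expressions at a sample point. It also classifies a conic by the sign of $B^2-4AC$. The function names and sample coefficients are illustrative only.

```python
def conic_matrix(A, B, C, D, E, F):
    """Symmetric 3x3 matrix Q such that (x, y, 1) Q (x, y, 1)^T equals the conic polynomial."""
    return [[A, B / 2, D / 2],
            [B / 2, C, E / 2],
            [D / 2, E / 2, F]]

def quadratic_form(Q, x, y):
    """Evaluate (x, y, 1) Q (x, y, 1)^T."""
    p = (x, y, 1.0)
    return sum(p[i] * Q[i][j] * p[j] for i in range(3) for j in range(3))

def classify(A, B, C):
    """Classify by the sign of the discriminant B^2 - 4AC (degenerate cases not separated)."""
    disc = B * B - 4 * A * C
    if disc > 0:
        return "hyperbola family"
    if disc < 0:
        return "ellipse family"
    return "parabola family"

# Sample coefficients, chosen arbitrarily.
A, B, C, D, E, F = 2, 5, 2, 1, -3, 4
x, y = 0.5, -1.2
polynomial = A * x**2 + B * x * y + C * y**2 + D * x + E * y + F
print(polynomial, quadratic_form(conic_matrix(A, B, C, D, E, F), x, y))
print(classify(2, 5, 2), classify(2, 1, 2), classify(2, 4, 2))
```

The second line of output uses the same $A=C=2$ coefficients as the demonstration above, with $B=5$, $1$ and $4$.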

Application:
Although conics and quadric surfaces existed around 2000 years ago, they are still the most popular objects in many computer aided design and modeling systems.

Thursday 15 January 2015

Eigenvalues and eigenvectors

$\lambda$ is an eigenvalue of $A$ if there exists a non-zero vector $\vec{v}$ such that
$A\vec{v}=\lambda \vec{v}$.
Vector $\vec{v}$ is called an eigenvector of $A$ corresponding to $\lambda$.

$A\vec{v}=\lambda \vec{v}\\
(A-\lambda I)\vec{v}=\vec{0}$

Note that $A-\lambda I$ must be singular, implying that $\det(A-\lambda I)=0$.
[If $A-\lambda I$ is invertible, $(A-\lambda I)^{-1}(A-\lambda I)\vec{v}=(A-\lambda I)^{-1}\vec{0} \Rightarrow \vec{v}=\vec{0}$. Contradiction. We want a non-zero vector.]
Solving the equation $\det(A-\lambda I)=0$ for $\lambda$ then gives the eigenvalues.

Sidenote: the set of all vectors $\vec{v}$ satisfying $A\vec{v}=\lambda \vec{v}$ is called the eigenspace of $A$ corresponding to $\lambda$.

Application: solving differential equations

Example:
$\frac{dx}{dt}=4x-y\\ \frac{dy}{dt}=2x+y$
$\frac{dX}{dt}=AX$ where $X=\begin{pmatrix}x\\y \end{pmatrix},A=\begin{pmatrix}4&-1\\2&1 \end{pmatrix}$

Change variables by setting $X=PY$, where $P$ is a constant invertible matrix:
$\frac{d}{dt}PY=APY\\
P\frac{dY}{dt}=APY\\
\frac{dY}{dt}=P^{-1}APY$
Characteristic polynomial $\delta(t)$ of $A$:
$\delta(t)=|tI-A|=\begin{vmatrix} t-4 & 1\\ -2 & t-1 \end{vmatrix}=t^2-5t+6=(t-3)(t-2)$
Eigenvalues of $A$: $2,3$
Substituting $t=3$ into $(tI-A)\vec{v}=\vec{0}$, we have the homogeneous system $-x+y=0$ and $-2x+2y=0 \Rightarrow \vec{v_1}=(1,1)$.
Substituting $t=2$ gives $-2x+y=0 \Rightarrow \vec{v_2}=(1,2)$.
$P=(\vec{v_1}\vec{v_2})=\begin{pmatrix}1&1\\1&2\end{pmatrix}$
$B=P^{-1}AP=\begin{pmatrix}3&0\\0&2\end{pmatrix}$
Diagonalise the system by changing variables using $P$.
$\begin{pmatrix}x\\y \end{pmatrix}=P\begin{pmatrix}r\\s \end{pmatrix}\\
x = r + s\\
y = r + 2s\\
\frac{d}{dt}\begin{pmatrix}r\\s \end{pmatrix}=\begin{pmatrix}3&0\\0&2\end{pmatrix}\begin{pmatrix}r\\s \end{pmatrix}\\
\begin{pmatrix}\frac{dr}{dt}\\\frac{ds}{dt} \end{pmatrix}=\begin{pmatrix}3r\\2s \end{pmatrix}\\
\therefore x=ae^{3t}+be^{2t}\\
y=ae^{3t}+2be^{2t}$
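The worked example can be verified numerically. This sketch checks that the two eigenpairs satisfy $A\vec{v}=\lambda\vec{v}$ and that the final formulas for $x$ and $y$ satisfy the original system, using a finite-difference derivative. The constants `a = b = 1` and the sample time `t = 0.3` are arbitrary illustrative choices.

```python
import math

A = [[4.0, -1.0], [2.0, 1.0]]

def matvec(M, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

# Eigenvalues and eigenvectors found in the example.
eigenpairs = [(3.0, [1.0, 1.0]), (2.0, [1.0, 2.0])]

def solution(t, a=1.0, b=1.0):
    """x = a e^{3t} + b e^{2t}, y = a e^{3t} + 2b e^{2t}."""
    return [a * math.exp(3 * t) + b * math.exp(2 * t),
            a * math.exp(3 * t) + 2 * b * math.exp(2 * t)]

def derivative(t, h=1e-6):
    """Central-difference approximation of dX/dt."""
    plus, minus = solution(t + h), solution(t - h)
    return [(p - m) / (2 * h) for p, m in zip(plus, minus)]

t = 0.3
print(derivative(t), matvec(A, solution(t)))  # the two vectors should agree closely
```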

More applications

Saturday 10 January 2015

Methods of integration

Usual methods:
1. Direct Integration
It's not too difficult to recognise integrals of the form $\int f(x)f'(x)dx=\int f(x) d[f(x)]$.

Example:
$\int (5x^3+4x^2)(15x^2+8x)dx\\= \int (5x^3+4x^2) d(5x^3+4x^2)\\= \frac{(5x^3+4x^2)^2}{2}+C$

2. Expansion
Whenever we see a power of a binomial such as $(x \pm c)^n$ with a whole-number exponent and it seems that substitution won't work, expansion is a good approach.

Example:
$\int (x^2-1)^6 dx\\= \int (x^{12}-6x^{10}+15x^8-20x^6+15x^4-6x^2+1) dx\\= \frac{x^{13}}{13}-6\frac{x^{11}}{11}+5\frac{x^9}{3}-20\frac{x^7}{7}+3x^5-2x^3+x+C$

3. Substitution
i. u-substitution
ii. trig-substitution
[How to find the right substitution?]
[<=> change of coordinates]
[<=> linear algebra]

4. Integration by parts
$\frac{d}{dx}(uv)=u\frac{dv}{dx}+v\frac{du}{dx}\\ uv=\int udv+\int vdu\\ \int udv=uv - \int vdu$

[How to know which is u and which is v?]


6. Reduction formula
Technical.

Special methods:
3. Substitution (cont'd)
iii. Weierstrass t-substitution / Tangent half-angle substitution

Fundamental Theorem of Calculus

Example:
$\large \int_0^1 \frac{x^t-1}{\ln x} dx$ where $t\geq 0$

Let $\large{F(t)=\int_0^1 \frac{x^t-1}{\ln x}dx\\
\begin{align}F'(t)&=\frac{d}{dt}\int_0^1\frac{x^t-1}{\ln x}dx\\
&=\int_0^1 \frac{d}{dt}\frac{x^t-1}{\ln x}dx\\
&=\int_0^1\frac{1}{\ln x}\frac{d}{dt}(x^t-1)dx \:\:\:\:\:[\frac{d}{dt}x^t=x^t \ln x]\\
&=\int_0^1 x^t dx\\
&=\frac{1}{t+1}\end{align}}$

Integrate again,
$F(t)=\ln(t+1)+C$
Since $F(0)=0$, we have $C=0$ and $F(t)=\ln(t+1)$.
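The closed form $F(t)=\ln(t+1)$ can be compared against a direct numerical integration. The sketch below uses a plain midpoint rule, which never evaluates the integrand at the endpoints $x=0$ and $x=1$ where $\ln x$ is problematic; the function name `F_numeric` and the number of subintervals are arbitrary choices.

```python
import math

def F_numeric(t, n=100_000):
    """Midpoint-rule estimate of the integral of (x^t - 1)/ln x over (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoint of the i-th subinterval, never 0 or 1
        total += (x**t - 1) / math.log(x)
    return total * h

for t in (0.0, 1.0, 2.0, 5.0):
    print(t, round(F_numeric(t), 5), round(math.log(t + 1), 5))
```

The two columns should agree to several decimal places for each $t\geq 0$ tried.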

Make use of known integrals

Manipulations

Change of coordinates (Transform into a double integral then switch to polar coordinates)

Let I = $\int_0^\infty e^{-x^2}dx$

$I^2=\int_0^\infty e^{-x^2}dx \int_0^\infty e^{-y^2}dy$

$=\int_0^\infty[\int_0^\infty e^{-x^2}dx]e^{-y^2}dy$

$=\int_0^\infty \int_0^\infty e^{-(x^2+y^2)}dx dy$

$=\int_0^{\frac{\pi}{2}}\int_0^\infty e^{-r^2}rdr d\theta$

$=\int_0^{\frac{\pi}{2}}-\frac{1}{2}e^{-r^2}|_0^\infty d\theta$

$=\frac{\pi}{4}$

It follows that I = $\frac{\sqrt\pi}{2}$.

Evaluate a general integral first: Parameter differentiation p56

Symmetry
http://www2.math.umd.edu/~punshs/Calculus/Integration.pdf

Odd function
$\int_{-2}^2 (1-2x-x^{16} \sin x+17x^7\sqrt{1+x^2}-x^{11} \cos x )dx$
Let $f(x)=-2x-x^{16} \sin x+17x^7\sqrt{1+x^2}-x^{11} \cos x$
$f(-x)=2x+x^{16} \sin x-17x^7\sqrt{1+x^2}+x^{11} \cos x=-f(x)$
$\Rightarrow \int_{-2}^2 (1-2x-x^{16} \sin x+17x^7\sqrt{1+x^2}-x^{11} \cos x )dx\\=\int_{-2}^2 (1+f(x))dx\\=4$

Linear Algebra
i. Orthogonality
Example:
$\int_{-\pi}^\pi (3+2\sin x+3\cos x)(1+4\sin x)dx$

$1,\sin x,\cos x$ are orthogonal on $[-\pi,\pi]$

$=\int_{-\pi}^\pi 3 dx+\int_{-\pi}^\pi 8\sin^2 x dx$

$=3(2\pi)+8\pi$

$=14\pi$

ii. Change of basis
Example:
Let $\mathbb{B}=\left\{1,\cos t,\cos^2 t,...,\cos^6 t\right\}$ and $\mathbb{C}=\left\{1,\cos t,\cos 2t,...,\cos 6t\right\}$.
By De Moivre's theorem,
$\cos 2t=-1+2\cos^2 t$
$\cos 3t=-3\cos t+4\cos^3 t$
$\cos 4t=1-8\cos^2 t+8\cos^4 t$
$\cos 5t=5\cos t-20\cos^3 t+16\cos^5 t$
$\cos 6t=-1+18\cos^2 t-48\cos^4 t+32\cos^6 t$

$P=[\mathbb{B}]_\mathbb{C}=\begin{pmatrix} 1 & 0 & -1 & 0 & 1 & 0 & -1\\ 0 & 1 & 0 & -3 & 0 & 5 & 0 \\ 0 & 0 & 2 & 0 & -8 & 0 & 18 \\ 0 & 0 & 0 & 4 & 0 & -20 & 0 \\ 0 & 0 & 0 & 0 & 8 & 0 & -48 \\ 0 & 0 & 0 & 0 & 0 & 16 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 32 \end{pmatrix}$

$P^{-1}=\frac{1}{32}\begin{pmatrix} 32 & 0 & 16 & 0 & 12 & 0 & 10\\ 0 & 32 & 0 & 24 & 0 & 20 & 0 \\ 0 & 0 & 16 & 0 & 16 & 0 & 15 \\ 0 & 0 & 0 & 8 & 0 & 10 & 0 \\ 0 & 0 & 0 & 0 & 4 & 0 & 6 \\ 0 & 0 & 0 & 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$

Say we want to evaluate $\int (5\cos^3 t-6\cos^4 t+5\cos^5 t-12\cos^6 t)dt$.

$P^{-1}(0,0,0,5,-6,5,-12)=(-6,\frac{55}{8},-\frac{69}{8},\frac{45}{16},-3,\frac{5}{16},\frac{-3}{8})$

$\int (5\cos^3 t-6\cos^4 t+5\cos^5 t-12\cos^6 t)dt\\
=\int (-6+\frac{55}{8}\cos t-\frac{69}{8}\cos 2t+\frac{45}{16}\cos 3t-3\cos 4t+\frac{5}{16}\cos 5t-\frac{3}{8}\cos 6t)dt\\
=-6t+\frac{55}{8}\sin t-\frac{69}{16}\sin 2t+\frac{15}{16}\sin 3t-\frac{3}{4}\sin 4t+\frac{1}{16}\sin 5t-\frac{1}{16}\sin 6t+C$

Geometry
$\int_0^1 [(1-x^5)^{\frac{1}{4}}-(1-x^4)^{\frac{1}{5}}]dx$
Note that $\int_0^1 (1-x^5)^{\frac{1}{4}}dx=\int_0^1 (1-y^4)^{\frac{1}{5}}dy$.
Both are the area bounded by $x^5+y^4=1$ with the x and y-axes.
So answer: 0.

Recursive [prob solv] refer to notebook

Probability density function
This will be explained in another post.

Examples:
Normal distribution
$\large \int_{-\infty}^\infty e^{-4x^2}dx$
$\large =\int_{-\infty}^\infty \exp\left\{-\frac{x^2}{\frac{1}{4}} \right\}dx$
$\large =\int_{-\infty}^\infty \exp\left\{-\frac{x^2}{2(\frac{1}{8})} \right\}dx$
$\large \sigma^2=\frac{1}{8}$ and $\mu =0$
$\large{\begin{align}\therefore \int_{-\infty}^\infty e^{-4x^2}dx &= \sqrt{\frac{1}{8}} \sqrt{2\pi} \int_{-\infty}^\infty \frac{1}{\sqrt{\frac{1}{8}} \sqrt{2\pi}}\exp\left\{-\frac{x^2}{2(\frac{1}{8})} \right\}dx\\
&=\sqrt{\frac{1}{8}} \sqrt{2\pi}\\
&=\frac{\sqrt{\pi}}{2}\end{align}}$

Gamma distribution
$\large \int_0^\infty 81e^{-3x}x^4dx$

$\large =\frac{\Gamma(5)}{3}\int_0^\infty \frac{3e^{-3x}(3x)^4}{\Gamma(5)}dx$

$\large =\frac{4!}{3}$

$\large =8$

We can integrate a function of the form $e^{-kx}x^s$ using gamma distribution. This avoids integrating by parts as many as s times.

Beta distribution
$\large \int_0^1 x^5 (1-x^2)^9 dx$

$\large =\int_0^1 (x^2)^{\frac{5}{2}} (1-x^2)^9 dx$

$\large \stackrel{y=x^2}{=}\int_0^1 \frac{y^{\frac{5}{2}}(1-y)^9}{2\sqrt{y}}dy$

$\large =\frac{1}{2}\int_0^1 y^2(1-y)^9dy$

$\large =\frac{1}{2}B(3,10) \int_0^1 \frac{y^2(1-y)^9}{B(3,10)}dy$

$\large =\frac{1}{2}\frac{\Gamma(3)\Gamma(10)}{\Gamma(13)}$

$\large =\frac{1}{2}\frac{2!9!}{12!}$

$\large =\frac{1}{12\cdot 11\cdot 10}$

$\large =\frac{1}{1320}$
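The three answers obtained from the normal, gamma and beta densities can be cross-checked numerically. This is a rough sketch: the midpoint rule, the truncation of the infinite ranges to $[-10,10]$ and $[0,40]$, and all names below are choices made for this illustration.

```python
import math

def midpoint(f, a, b, n=100_000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Closed-form values from the three examples.
normal_value = math.sqrt(1 / 8) * math.sqrt(2 * math.pi)              # sqrt(pi)/2
gamma_value = math.gamma(5) / 3                                        # 4!/3 = 8
beta_value = 0.5 * math.gamma(3) * math.gamma(10) / math.gamma(13)     # 1/1320

# Direct numerical integration; the integrands are negligible beyond the truncated ranges.
normal_numeric = midpoint(lambda x: math.exp(-4 * x * x), -10.0, 10.0)
gamma_numeric = midpoint(lambda x: 81 * math.exp(-3 * x) * x**4, 0.0, 40.0)
beta_numeric = midpoint(lambda x: x**5 * (1 - x * x) ** 9, 0.0, 1.0)

print(normal_value, normal_numeric)
print(gamma_value, gamma_numeric)
print(beta_value, beta_numeric)
```

Each pair of printed numbers should agree closely, which supports the density-function shortcuts used above.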

Tabular Integration by Parts
Refer to http://www.maa.org/sites/default/files/pdf/mathdl/CMJ/Horowitz307-311.pdf

Comparing coefficients
Refer to https://johnmayhk.wordpress.com/2014/08/08/integration-by-comparing-coefficients/

Laplace transform <=> residue theory

Complex analysis

Friday 9 January 2015

Integration and statistics

Have you ever wondered why you see many integration problems of the form $\int xf(x)dx$ or $\int (x-\text{a number})^2 f(x)dx$ in your textbook?

Those problems are related to expected value and variance, which are applied in statistics.

Prerequisite knowledge:
For a discrete random variable, the expected value is the sum of all xP(x). For a continuous random variable, P(x) is the probability density function, and integration takes the place of addition.

Definitions:
Let X be an absolutely continuous random variable with probability density function $f_X(x)$.

The expected value of X is:
$E(X)=\int_{-\infty}^{\infty}xf_X(x)dx$

It should fulfill absolute integrability, i.e. $\int_{-\infty}^{\infty}|x|f_X(x)dx<\infty$. This ensures that the improper integral $\int_{-\infty}^{\infty}xf_X(x)dx$, a shorthand for $\lim\limits_{t \to -\infty} \int_t^0 xf_X(x)dx+\lim\limits_{t \to \infty} \int_0^t xf_X(x)dx$, is well-defined. When the absolute integrability condition is not satisfied, the expected value of X does not exist.

Let f(x) be a probability density function on the domain [a,b] with mean $\mu=\int_a^b xf(x)dx$. Then the variance is $\int_a^b(x-\mu)^2 f(x)dx$.
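As a small illustration of these definitions, the sketch below computes the expected value and variance of a made-up density, $f(x)=2x$ on $[0,1]$, by numerical integration. The density, the helper `integrate` and the grid size are all assumptions introduced for this example.

```python
def integrate(f, a, b, n=100_000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def pdf(x):
    """Hypothetical density f(x) = 2x on [0, 1]."""
    return 2 * x

total = integrate(pdf, 0.0, 1.0)                                   # should be 1 for a valid density
mean = integrate(lambda x: x * pdf(x), 0.0, 1.0)                   # E(X) = 2/3
variance = integrate(lambda x: (x - mean) ** 2 * pdf(x), 0.0, 1.0) # Var(X) = 1/18
print(total, mean, variance)
```

By hand, $E(X)=\int_0^1 2x^2dx=\frac{2}{3}$ and the variance is $\int_0^1 (x-\frac{2}{3})^2\,2x\,dx=\frac{1}{18}$, which the numerical values should reproduce.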

Examples

Monday 5 January 2015

Generating functions in probability

Say we want to find the probability of a family of three children having exactly two boys and one girl.

Conventional methods
1. List out all possibilities
There are $2^3 = 8$ possibilities (because a child can be either boy or girl, and there are three of them): BBB, BBG, BGB, BGG, GBB, GBG, GGB, and GGG.
As a result, the required probability is $\dfrac{3}{8}$.

2. Binomial probability
$$P(X=r)=C^n_r\:p^r (1-p)^{n-r}$$In this case, $r$ is the number of boy(s), $n = 3, p = 1-p = \frac{1}{2}$. $$P(X=2)=C^3_2 (\frac{1}{2})^2 (\frac{1}{2})^1 = \frac{3}{8}$$

Generating function method
This method makes use of properties of combination.
The function $(b+g)^3=b^3+3b^2g+3bg^2+g^3$ demonstrates that there is 1 possibility of having 3 boys, 3 possibilities of having 2 boys and 1 girl (BBG, BGB, GBB), and so on.
$$\text{Required probability}=\dfrac{\text{coefficient of}\: b^2g}{\text{sum of all coefficients}}=\dfrac{3}{8}$$
Example:

Conventional method
7 can only be obtained if we have 2, 2, 3 or 1, 3, 3.

Cases for 2, 2, 3:
2, 2, 3 --> 3*3*1
2, 3, 2 --> 3*1*3
3, 2, 2 --> 1*3*3

Cases for 1, 3, 3:
1, 3, 3 --> 2*1*1
3, 1, 3 --> 1*2*1
3, 3, 1 --> 1*1*2

So probability: $\dfrac{3(3\cdot3 + 2)}{6^3}=\dfrac{11}{72}$

Generating function method
The outcomes of a single die are given by the polynomial $2x+3x^2+x^3$,
and the outcomes of three dice rolls are given by $(2x+3x^2+x^3)^3$.
Let $f(x)=(2x+3x^2+x^3)^3$.
Number of dice outcomes with a value of $7$ is the coefficient of $x^7$. $$\begin{align}f(x)&=(2x+3x^2+x^3)^3\\
&=\cdots+3(3\cdot 3+2)x^7+\cdots\\
&=\cdots+33x^7+\cdots\end{align}$$
[Here we are essentially doing the same thing as the conventional method, looking for the coefficient of $x^2x^2x^3$ and $xx^3x^3$. This generating function method can be useful to those who don't know the conventional method.]
So the probability of getting a total of $7$ is $\dfrac{33}{6^3}=\dfrac{11}{72}$.
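Expanding the cube by hand is error-prone, so here is a sketch that multiplies the polynomials mechanically. It assumes the single-die polynomial is $2x+3x^2+x^3$, as in the expansion of $f(x)$ above; a polynomial is stored as a list whose index is the power of $x$, and the names `poly_mul` and `die` are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            result[i + j] += pi * qj
    return result

die = [0, 2, 3, 1]  # 2x + 3x^2 + x^3: coefficient = number of faces showing that value
three_dice = poly_mul(poly_mul(die, die), die)

ways = three_dice[7]                              # coefficient of x^7
probability = Fraction(ways, sum(three_dice))     # sum of coefficients = 6^3
print(ways, probability)
```

The same list also contains the number of ways to obtain every other total, which is where the generating function starts to pay off.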

How to find the right function?
[Disclaimer: The following is just my interpretation.]
For an event repeated n times, each time with mutually exclusive outcomes a, b and c, the function will be $(a+b+c)^n$.
The coefficient of a particular term of the function which represents a particular outcome will be the number of possibilities of that outcome. The sum of all the coefficients of the function will be the total number of possibilities. Alternatively, you can find it by referring to the question. For example, for the dice question, we know the total number of possibilities is $6^3$ because we are rolling $3$ dice and there are $6$ possible outcomes in rolling each die.

Thursday 1 January 2015

Differential equations

First Order Linear Differential Equations:
1. Separating variables
Example 1:
$\dfrac{dy}{dx}=f(y)\\
\dfrac{dy}{f(y)}=dx\\
\int \dfrac{dy}{f(y)}=\int dx$

Example 2:
$\dfrac{dy}{dx}=\dfrac{1}{f(x)f(y)}\\
f(y)dy=\dfrac{dx}{f(x)}\\
\int f(y)dy=\int \dfrac{dx}{f(x)}$

2. Integrating factors

To solve equations of the form
$a(x)\dfrac{dy}{dx}+b(x)y=c(x)$

i. Express in standard form
$\dfrac{dy}{dx}+p(x)y=q(x)$

ii. Multiply both sides by the integrating factor $e^{\int p(x) dx}$

iii. $\dfrac{d}{dx}(ye^{\int p(x) dx})=q(x)e^{\int p(x) dx}$

iv. $ye^{\int p(x) dx}=\int q(x)e^{\int p(x) dx}dx+C$

v. Divide both sides by the integrating factor

vi. Use initial conditions to find particular solutions
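The steps above can be traced on a concrete case. The equation $\frac{dy}{dx}+y=x$ with $y(0)=1$ is an example chosen here, not one from the original notes: the integrating factor is $e^x$, so $ye^x=\int xe^x dx+C=(x-1)e^x+C$, giving $y=x-1+Ce^{-x}$ and $C=2$ from the initial condition. The sketch below checks that result numerically.

```python
import math

def y(x):
    """Particular solution of dy/dx + y = x with y(0) = 1, found via the integrating factor e^x."""
    return x - 1 + 2 * math.exp(-x)

def dydx(x, h=1e-6):
    """Central-difference approximation of y'(x)."""
    return (y(x + h) - y(x - h)) / (2 * h)

# Residual of dy/dx + p(x) y - q(x) with p(x) = 1 and q(x) = x; should be close to 0.
residuals = [dydx(x) + y(x) - x for x in (0.0, 0.5, 1.0, 3.0)]
print(y(0.0), residuals)
```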

3. Laplace Transform