Saturday 30 May 2015

Asymptotic notations

Introduction
Let's start with some motivating examples showing how these notations are used.

Using Taylor series expansions, we have

$\begin{align} e^x &=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\frac{x^4}{4!}+\cdots\:\: \text{for all x}\\
&=1+x+O(x^2)\:\:\text{as}\: x \to 0 \end{align}\\
\Rightarrow e^x \sim 1+x$

$\begin{align}\log(1+x)&=x-\frac{x^2}{2}+\frac{x^3}{3}-\frac{x^4}{4}+\cdots\\
&=x+O(x^2) \end{align}\\
\Rightarrow \log(1+x) \sim x$

The symbol $O(x^2)$ means that the remainder is bounded by $Ax^2$ as $x$ tends to $0$ for some finite $A$.

Big O notation is a symbolism used in complexity theory, computer science, and mathematics to describe the asymptotic behavior of functions. It tells you how fast a function grows or declines. The letter O is used because the rate of growth of a function is also called its order.

If $x_n=3n^2-2n+3$, then $x_n=o(n^3)$, $x_n=O(n^2)$ and $x_n \sim 3n^2\: \text{as}\: n \to \infty$.

o-order is less useful than O-order. For example, $\sin z = z + o(z^2)$ as $z \to 0$ tells us that $\sin z - z \to 0$ faster than $z^2$, however $\sin z = z + O(z^3)$, tells us specifically that $\sin z - z \to 0$ like $z^3$.

Definitions
The symbols $O$, $o$ and $\sim$ were first used by E. Landau and P. Du Bois- Reymond.
Suppose $f(z)$ and $g(z)$ are functions of the continuous complex variable $z$ defined on some domain $D \subset \mathbb{C}$ and possess limits as $z \to z_0$ in $D$.

Asymptotically bounded
$f(z)=O(g(z))$ $\: \text{as}\:$ $z \to z_0$ means that there exists constants $K \geq 0$ and $\delta >0$ such that, for $0<|x-x_0|<\delta$, $|f(z)| \leq K |g(z)|$.

Asymptotically smaller
$f(z)=o(g(z))$ $\: \text{as}\:$ $z \to z_0$ means that for all $\epsilon >0$ such that, for $0<|z-z_0|<\epsilon$, $|f(z)|\leq \epsilon |g(z)|$
Equivalently this means that for non-zero $g(z)$ in a neighbourhood of $z_0$ except possibly at $z_0$, then as $z \to z_0$: $\large \frac{f(z)}{g(z)} \to 0$.

Asymptotically equal
$f(z) \sim g(z)$ $\: \text{as}\:$ $z \to z_0$ means that for non-zero $g(z)$ in a neighbourhood of $z_0$ except possibly at $z_0$, then as $z \to z_0$: $\large \frac{f(z)}{g(z)} \to 1$.

Applications
Approximations & finding limits
This notation saves us the trouble of computing many arithmetic operations. Meanwhile, it avoids the ambiguity of using "the three dots".
$f(x) \sim g(x)$ when $x \to \infty \Rightarrow \large\lim\limits_{x \to \infty} \frac{f(x)}{g(x)}=1$
This is why we can use the following approximations in multiplicative situations.



Computer science -- Analysis of algorithms [later]

Monday 18 May 2015

Methods to find probability II

Methods to find probability I

Tree diagram, tables

Conditional probability
$P(A|B)=P(AB)/P(B)$

Rule of total probability
$P(A)=P(A|B)+P(A|B^c)$
Sometimes P(A) itself is hard to find, we can use the rule of total probability to find P(A) indirectly.

Reduced sample space
"In the card game bridge, the 52 cards are dealt out equally to 4 players—called East, West, North, and South. If North and South have a total of 8 spades among them, what is the probability that East has 3 of the remaining 5 spades?"

Since we know that North and South have 8 spades among their 26 cards, East and West together have 26 cards, 5 of them being spades. Therefore, the probability that East has 3 of the 5 spades is given by
$\large \binom{5}{3} \binom{21}{10} / \binom{26}{13}$.

Note that this is actually a hypergeometric distribution, which we will introduce in our next post.
Further note that this problem looks very much like a conditional probability problem in the form "Given that ..., find the probability that ..." We will show you the calculations using conditional probability, but the method of reduced sample space would be far more straightforward.

Let B denote the event North and South have a total of 8 spades and 18 non-spades.
$\large P(B)=\binom{13}{8} \binom{39}{18}/ \binom{52}{26}\\$
Let A denote the event East has a total of 3 spades and 10 non-spades.
$\large P(AB)=\binom{13}{8} \binom{39}{18} \binom{5}{3} \binom{21}{10} / [\binom{52}{26}\binom{26}{13}]\\
\large P(A|B)=P(AB)/P(B)=\binom{5}{3} \binom{21}{10}/ \binom{26}{13}$

Bayes' formula
$P(A|B)=P(B|A)P(A)/P(B)$

Inclusion-exclusion principle
Let's say you toss a dice for four times. What is the probability that there is at least one 3 in the four tosses? We can make use of complement to simplify our calculations. The required probability is just $1-\text{P(no 3s)}=1-(5/6)^4=671/1296$. What about calculating the probability directly? We use the inclusion-exclusion principle. $P(\bigcup\limits_{i=1}^n A_i)=\sum\limits_{i=1}^nP(A_i)-\sum\limits_{i<j}P(A_iA_j)+\sum\limits_{i<j<k}P(A_iA_jA_k)-\cdots+(-1)^{n-1}P(\bigcap\limits_{i=1}^n A_i)$

Let $A_i$ denote the event ith dice is a 3.
Thus we need to find $P(\bigcup\limits_{i=1}^4 A_i)$. By the inclusion-exclusion principle, it is $\sum\limits_{i=1}^4 P(A_i)-\sum\limits_{1<i<j<4}P(A_iA_j)+\sum\limits_{i<j<k}P(A_iA_jA_k)-\cdots-P(A_1A_2A_3A_4)$
$=4(1/6)-6(1/6)^2+4(1/6)^3-(1/6)^4=671/1296$.

Permutations and combinations