Let
a_1,a_2,\cdots be any sequence of numbers. We are often interested in sums such as
a_1+\cdots+a_n. This sum can be written more compactly using either of the following equivalent notations:
\sum_{i=1}^n a_i\quad \text{or}\quad \sum_{1\leq i\leq n} a_i. In general, if
R(i) is any relation involving
i, the symbol
\sum_{R(i)} a_i means the sum of all
a_i, where
i is an integer satisfying the condition
R(i). The letter
i is a dummy variable or index variable that may be renamed without changing the result.
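Viewed computationally, \sum_{R(i)} a_i is just a loop that accumulates a_i over the indices satisfying R(i). A minimal sketch in Python (the sequence a_i=i^2 and the relation R here are arbitrary choices for illustration):

```python
# a_i = i^2 and R(i): 1 <= i <= 4, both chosen arbitrarily.
a = lambda i: i * i
R = lambda i: 1 <= i <= 4

# sum_{R(i)} a_i: accumulate a_i over every index satisfying R(i).
total = sum(a(i) for i in range(-10, 11) if R(i))
print(total)  # 1 + 4 + 9 + 16 = 30
```

Renaming the loop variable obviously does not change the result, which is all the "dummy variable" remark says.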
We now discuss four important algebraic operations on sums.
1) The distributive law, for products of sums
\bigg(\sum_{R(i)} a_i\bigg)\bigg(\sum_{S(j)} b_j\bigg)=\sum_{R(i)}\bigg(\sum_{S(j)} a_ib_j\bigg). Consider for example
\begin{align}\bigg(\sum_{i=1}^2 a_i\bigg)\bigg(\sum_{j=1}^3 b_j\bigg)&=(a_1+a_2)(b_1+b_2+b_3)\\
&=(a_1b_1+a_1b_2+a_1b_3)+(a_2b_1+a_2b_2+a_2b_3)\\
&=\sum_{i=1}^2\bigg(\sum_{j=1}^3 a_ib_j\bigg).\end{align} 2) Interchanging order of summation
\sum_{R(i)}\sum_{S(j)} a_{ij}=\sum_{S(j)}\sum_{R(i)} a_{ij}. For example,
\sum_{R(i)}\sum_{j=1}^2 a_{ij}=\sum_{R(i)}(a_{i1}+a_{i2})\\ \sum_{j=1}^2 \sum_{R(i)} a_{ij}=\sum_{R(i)} a_{i1}+\sum_{R(i)} a_{i2}. This means that
\sum_{R(i)}(b_i+c_i)=\sum_{R(i)}b_i+\sum_{R(i)}c_i, where
b_i=a_{i1} and
c_i=a_{i2}. This operation is very useful because we often know a simple form for
\sum_{R(i)} a_{ij} but not for
\sum_{S(j)} a_{ij}. More generally, we have
\sum_{R(i)}\sum_{S(i,j)}a_{ij}=\sum_{S'(j)}\sum_{R'(i,j)}a_{ij}, where
S'(j) is the relation 'there is an integer
i s.t. both
R(i) and
S(i,j) are true'; and
R'(i,j) is the relation 'both
R(i) and
S(i,j) are true'. For example, if the summation is
\sum_{i=1}^n\sum_{j=1}^i a_{ij}, then
S'(j) is the relation 'there is an integer
i s.t.
1\leq i\leq n and
1\leq j\leq i', namely
1\leq j\leq n; and
R'(i,j) is the relation '
1\leq i\leq n and
1\leq j\leq i', namely
j\leq i\leq n. Thus,
\sum_{i=1}^n\sum_{j=1}^i a_{ij}=\sum_{j=1}^n\sum_{i=j}^n a_{ij}. 3) Change of variable
\sum_{R(i)} a_i=\sum_{R(j)} a_j=\sum_{R(p(j))} a_{p(j)} The first equality holds because we are simply changing the dummy variable. In the second case,
p(j) is a function of
j that represents a permutation of the range. For each integer
i satisfying the relation
R(i), there is a unique integer
j satisfying the relation
p(j)=i. This condition is always satisfied in the cases
p(j)=c+j and
p(j)=c-j, where
c is an integer not depending on
j. For instance,
\sum_{1\leq i\leq n} a_i=\sum_{1\leq i-1\leq n} a_{i-1}=\sum_{2\leq i\leq n+1} a_{i-1}. Note that for infinite sums, replacing
j by
p(j) is valid iff
\sum_{R(j)}|a_j| exists.
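For finite sums the shift substitution p(j)=c+j is easy to spot-check numerically; a small sketch (the sequence values are arbitrary):

```python
# Arbitrary values a_i; check sum_{1<=i<=n} a_i = sum_{2<=i<=n+1} a_{i-1},
# i.e. the change of variable i -> i-1 from the example above.
a = {i: 3 * i + 1 for i in range(0, 10)}

n = 6
lhs = sum(a[i] for i in range(1, n + 1))       # sum_{1<=i<=n} a_i
rhs = sum(a[i - 1] for i in range(2, n + 2))   # sum_{2<=i<=n+1} a_{i-1}
assert lhs == rhs
print(lhs)
```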
4) Manipulating the domain
If
R(i) and
S(i) are arbitrary relations, we have
\sum_{R(i)}a_i+\sum_{S(i)}a_i=\sum_{R(i)\;\text{or}\;S(i)}a_i+\sum_{R(i)\;\text{and}\;S(i)}a_i. For example,
\sum_{1\leq i\leq m}a_i+\sum_{m\leq i\leq n}a_i=\bigg(\sum_{1\leq i\leq n}a_i\bigg)+a_m, assuming that
1\leq m\leq n.
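Since rules (1), (2) and (4) are identities between finite sums, they can be checked mechanically on sample data. A sketch in Python (all the sequences below are arbitrary choices):

```python
# Arbitrary test data.
a = {i: i * i + 1 for i in range(1, 5)}        # a_i, i = 1..4
b = {j: 2 * j - 1 for j in range(1, 4)}        # b_j, j = 1..3
aa = {(i, j): 10 * i + j for i in range(1, 5) for j in range(1, 5)}  # a_{ij}

# (1) Distributive law: (sum a_i)(sum b_j) = sum_i sum_j a_i b_j.
lhs1 = sum(a.values()) * sum(b.values())
rhs1 = sum(a[i] * b[j] for i in a for j in b)
assert lhs1 == rhs1

# (2) Interchange with a dependent range:
#     sum_{i=1}^{n} sum_{j=1}^{i} = sum_{j=1}^{n} sum_{i=j}^{n}.
n = 4
lhs2 = sum(aa[i, j] for i in range(1, n + 1) for j in range(1, i + 1))
rhs2 = sum(aa[i, j] for j in range(1, n + 1) for i in range(j, n + 1))
assert lhs2 == rhs2

# (4) Merging domains:
#     sum_{1<=i<=m} + sum_{m<=i<=n} = sum_{1<=i<=n} + a_m.
m = 2
lhs4 = sum(a[i] for i in range(1, m + 1)) + sum(a[i] for i in range(m, n + 1))
rhs4 = sum(a[i] for i in range(1, n + 1)) + a[m]
assert lhs4 == rhs4
print("rules (1), (2), (4) hold on this data")
```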
Examples from high school:
[The sum of a geometric progression] Assume that
r\neq 1 and n\geq 0. A conventional way to derive the formula is to consider
S=a+ar+\cdots+ar^n and
S-rS.
\begin{align}S&=a+ar+ar^2+\cdots+ar^n\\ rS&=\phantom{a+{}}ar+ar^2+\cdots+ar^n+ar^{n+1}\\ S-rS&=a-ar^{n+1}\end{align} so S(1-r)=a-ar^{n+1}, i.e. S=a\dfrac{1-r^{n+1}}{1-r}. This can also be derived by manipulating sums:
\begin{align}a+ar+\cdots+ar^n&=a+\sum_{i=1}^n ar^i\\
&=a+r\sum_{i=1}^n ar^{i-1}\\
&=a+r\sum_{i=0}^{n-1} ar^i\\
&=a+r\sum_{i=0}^n ar^i-ar^{n+1}.\end{align} Since \sum_{i=0}^n ar^i=S, this says S=a+rS-ar^{n+1}; solving for S gives S=a\dfrac{1-r^{n+1}}{1-r}.
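A quick sanity check of the closed form with exact rational arithmetic (the values of a and r are arbitrary):

```python
from fractions import Fraction

# Check S = a*(1 - r^(n+1))/(1 - r) against the explicit sum,
# with arbitrarily chosen a and r (r != 1).
a, r = Fraction(3), Fraction(1, 2)
for n in range(0, 8):
    S = sum(a * r**i for i in range(n + 1))   # a + ar + ... + ar^n
    assert S == a * (1 - r**(n + 1)) / (1 - r)
print("geometric sum formula verified for n = 0..7")
```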
[The sum of an arithmetic progression] Assume
n\geq 0. Then
\begin{align}a+(a+d)+\cdots+(a+nd)&=\sum_{i=0}^n (a+di)\\
&=\sum_{0\leq n-i\leq n} (a+d(n-i))\\
&=\sum_{i=0}^n (a+dn-di)\\
&=\sum_{i=0}^n (2a+dn)-\sum_{i=0}^n (a+di)\\
&=(n+1)(2a+dn)-\sum_{i=0}^n (a+di).\end{align} Writing S=\sum_{i=0}^n (a+di), this says S=(n+1)(2a+dn)-S; solving for S gives S=\dfrac{1}{2}(n+1)(2a+dn)=a(n+1)+\dfrac{1}{2}dn(n+1). Another example:
Let
\begin{align}S&=\sum_{i=0}^n\sum_{j=0}^i a_ia_j\\
&=\sum_{j=0}^n\sum_{i=j}^n a_ia_j\\
&=\sum_{i=0}^n\sum_{j=i}^n a_ia_j
\end{align} Then
\begin{align}2S&=\sum_{i=0}^n\bigg(\sum_{j=0}^i a_ia_j+\sum_{j=i}^n a_ia_j\bigg)\\
&=\sum_{i=0}^n\bigg(\sum_{j=0}^n a_ia_j+a_i^2\bigg)\\
&=\bigg(\sum_{i=0}^n a_i\bigg)^2+\bigg(\sum_{i=0}^n a_i^2\bigg)\end{align} We have derived the identity
\sum_{i=0}^n\sum_{j=0}^i a_ia_j=\dfrac{1}{2}\bigg(\bigg(\sum_{i=0}^n a_i\bigg)^2+\bigg(\sum_{i=0}^n a_i^2\bigg)\bigg).
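A numerical spot-check of this identity (the sequence is arbitrary):

```python
# Check sum_{i=0}^{n} sum_{j=0}^{i} a_i a_j = ((sum a_i)^2 + sum a_i^2)/2
# on an arbitrarily chosen sequence.
a = [3, -1, 4, 1, -5, 9]
n = len(a) - 1

lhs = sum(a[i] * a[j] for i in range(n + 1) for j in range(i + 1))
rhs = (sum(a) ** 2 + sum(x * x for x in a)) / 2
assert lhs == rhs
print(lhs)
```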
Examples in analysis:
In Riemann integration, we have the following result: Suppose
f\in \mathscr{R}[a,b] and
g\in \mathscr{C}([m,M]), where
M=\sup\{f(x)\mid a\leq x\leq b\} and
m=\inf\{f(x)\mid a\leq x\leq b\}. Then
g\circ f\in \mathscr{R}[a,b].
Idea: Treat the subintervals where f oscillates little and those where it oscillates a lot separately.
Proof: Let
\varepsilon>0. Since
g is uniformly continuous on
[m,M], there exists
\delta>0 s.t.
|g(x)-g(y)|<\varepsilon whenever
x,y\in [m,M] and
|x-y|<\delta. Also, since
f\in \mathscr{R}[a,b], there exists a partition
P=\{x_0,x_1,\cdots,x_n\} of
[a,b] s.t.
U(f,P)-L(f,P)<\varepsilon \delta. Consider
A=\{i\mid M_i(f)-m_i(f)< \delta\} (indices where f oscillates little) and
A^c=\{i\mid M_i(f)-m_i(f)\geq \delta\} (indices where f oscillates a lot), where M_i(f)=\sup_{[x_{i-1},x_i]}f and m_i(f)=\inf_{[x_{i-1},x_i]}f.
For i\in A, any two values of f on [x_{i-1},x_i] differ by less than \delta, so by the choice of \delta the corresponding values of g\circ f differ by less than \varepsilon; hence \sup_{[x_{i-1},x_i]}(g\circ f)-\inf_{[x_{i-1},x_i]}(g\circ f)\leq \varepsilon. To handle
i\in A^c, consider
\begin{align}\sum_{i\in A^c}(x_i-x_{i-1})&=\dfrac{1}{\delta}\sum_{i\in A^c}\delta (x_i-x_{i-1})\\
&\leq \dfrac{1}{\delta}\sum_{i=1}^n(M_i(f)-m_i(f))(x_i-x_{i-1})\\
&=\dfrac{1}{\delta}(U(f,P)-L(f,P))\\
&<\varepsilon. \end{align}Therefore,
\begin{align}U(g\circ f,P)-L(g\circ f,P)&=\sum_{i=1}^n\bigg(\sup_{[x_{i-1},x_i]}(g\circ f)-\inf_{[x_{i-1},x_i]}(g\circ f)\bigg)(x_i-x_{i-1})\\
&=\sum_{i\in A}\bigg(\sup_{[x_{i-1},x_i]}(g\circ f)-\inf_{[x_{i-1},x_i]}(g\circ f)\bigg)(x_i-x_{i-1})\\&+\sum_{i\in A^c}\bigg(\sup_{[x_{i-1},x_i]}(g\circ f)-\inf_{[x_{i-1},x_i]}(g\circ f)\bigg)(x_i-x_{i-1})\\
&\leq \varepsilon \sum_{i\in A}(x_i-x_{i-1})+C\sum_{i\in A^c}(x_i-x_{i-1})\\
&\leq \varepsilon(b-a)+C\varepsilon\\
&=(b-a+C)\varepsilon,\end{align} where
C=\max\{g(u)\mid m\leq u\leq M\}-\min\{g(u)\mid m\leq u\leq M\}. It follows that
g\circ f\in \mathscr{R}[a,b].
\Box
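The proof is quantitative, and its mechanism can be illustrated numerically: even when f has a jump, the gap U(g\circ f,P)-L(g\circ f,P) shrinks as the partition is refined. A rough sketch in Python (the particular f, g, and the sampling-based estimate of sup and inf are my own choices, not part of the proof):

```python
# Numeric illustration (not part of the proof): f has a jump at x = 1,
# g is continuous, yet the upper/lower sums of g∘f squeeze together.

def f(x):            # Riemann integrable on [0, 2], jump at x = 1
    return 0.0 if x < 1 else 1.0

def g(y):            # continuous on [0, 1]
    return y * y + 1.0

def osc_gap(h, a, b, n, m=50):
    """Approximate U(h, P) - L(h, P) on the uniform n-part partition
    of [a, b], sampling m points per subinterval for sup and inf."""
    w = (b - a) / n
    total = 0.0
    for i in range(n):
        xs = [a + i * w + k * w / (m - 1) for k in range(m)]
        vals = [h(x) for x in xs]
        total += (max(vals) - min(vals)) * w
    return total

h = lambda x: g(f(x))
gaps = [osc_gap(h, 0.0, 2.0, n) for n in (4, 16, 64, 256)]
print(gaps)  # roughly 2/n: only the subinterval containing the jump contributes
```

Here the "bad" set A^c is the single subinterval containing the jump, and its total length shrinks with the mesh, exactly as in the proof.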
Say we want to evaluate
\sum_{n=1}^\infty \dfrac{2n-1}{2^n} (which converges by the ratio test). Consider
\begin{align}s_n&=\sum_{k=1}^n \dfrac{2k-1}{2^k}\\
&=\dfrac{1}{2}+\sum_{k=2}^n\bigg(\dfrac{2k-1}{2^k}-\dfrac{2k-3}{2^k}\bigg)+\sum_{k=2}^n \dfrac{2k-3}{2^k}\\
&=\dfrac{1}{2}+\sum_{k=2}^n\dfrac{1}{2^{k-1}}+\sum_{k=2}^n \dfrac{2k-3}{2^k}\\
&=\dfrac{1}{2}+\sum_{k=2}^n\dfrac{1}{2^{k-1}}+\sum_{k=2}^n \dfrac{2(k-1)-1}{2^{k-1+1}}\\
&=\dfrac{1}{2}+\sum_{k=2}^n\dfrac{1}{2^{k-1}}+\sum_{k=1}^{n-1} \dfrac{2k-1}{2^{k+1}}\\
&=\dfrac{1}{2}+\sum_{k=2}^n\dfrac{1}{2^{k-1}}+\dfrac{1}{2}s_n-\dfrac{2n-1}{2^{n+1}}.\end{align} Solving for s_n and using \sum_{k=2}^n\dfrac{1}{2^{k-1}}=1-\dfrac{1}{2^{n-1}}, we obtain s_n=1+2\bigg(1-\dfrac{1}{2^{n-1}}\bigg)-\dfrac{2n-1}{2^n}. As
n\to \infty,
s_n\to 3.
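Expanding the last expression gives the closed form s_n=3-\dfrac{2n+3}{2^n}, which a short computation confirms:

```python
# Partial sums of sum_{n>=1} (2n-1)/2^n, compared with the closed
# form s_n = 3 - (2n+3)/2^n obtained by expanding the result above.
def s(n):
    return sum((2 * k - 1) / 2**k for k in range(1, n + 1))

for n in (1, 2, 5, 20, 50):
    assert abs(s(n) - (3 - (2 * n + 3) / 2**n)) < 1e-12
print(s(50))  # already very close to the limit 3
```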
Many manipulations of sums and other formulas become significantly simpler if we use the bracket notation:
[\text{statement}]=\begin{cases} 1,\quad \text{if the statement is true;}\\ 0,\quad \text{if the statement is false.}\end{cases} Then we can write
\sum_{R(i)} a_i=\sum_i a_i[R(i)] since the terms of the infinite sum on the right are zero when
R(i) is false. In fact, we have seen this expression before -- the Kronecker delta symbol is a special case of bracket notation:
\delta_{ij}=[i=j]=\begin{cases}1,\quad \text{if }i=j\\ 0,\quad \text{if }i\neq j\end{cases}. With bracket notation, we can derive rule (3) from rule (1) and (2):
\begin{align}\sum_{R(p(j))} a_{p(j)}&=\sum_j a_{p(j)}[R(p(j))]\\
&=\sum_j \sum_i a_i [R(i)][i=p(j)]\\
&=\sum_i a_i[R(i)]\sum_j [i=p(j)].\end{align} If
p is a permutation of the range, then \sum_j [i=p(j)]=1 for each i satisfying R(i), and we are left with
\sum_i a_i [R(i)], which is
\sum_{R(i)} a_i.
We can also derive rule (2):
\sum_{i=1}^n\sum_{j=1}^i a_{ij}=\sum_{i,j} a_{ij}[1\leq i\leq n][1\leq j\leq i]=\sum_{i,j}a_{ij}[1\leq j\leq n][j\leq i\leq n]=\sum_{j=1}^n\sum_{i=j}^n a_{ij}, since
[1\leq i\leq n][1\leq j\leq i]=[1\leq j\leq i\leq n]=[1\leq j\leq n][j\leq i\leq n].
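The bracket (Iverson) notation translates directly into code, which makes identities like the last one easy to spot-check (the array a_{ij} is an arbitrary choice):

```python
# Iverson bracket: [P] = 1 if P is true, 0 otherwise.
def br(p):
    return 1 if p else 0

n = 5
aa = {(i, j): 10 * i + j for i in range(n + 2) for j in range(n + 2)}

# Summing over ALL (i, j), weighted by brackets, reproduces
# [1<=i<=n][1<=j<=i] = [1<=j<=n][j<=i<=n].
lhs = sum(aa[i, j] * br(1 <= i <= n) * br(1 <= j <= i)
          for i in range(n + 2) for j in range(n + 2))
rhs = sum(aa[i, j] * br(1 <= j <= n) * br(j <= i <= n)
          for i in range(n + 2) for j in range(n + 2))
assert lhs == rhs
print(lhs)
```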
More to explore:
Möbius inversion formula
arithmetic function