How many strings are there of seven lowercase letters that have the substring tr in them?
So I am having a little problem with this question, I know that the total number of combinations is $26^6$ but there is double counting on some of the combinations.
For example, when you have a case that contains multiples 'tr' then it will be counted multiple times depending on the location of tr, even though its the same string.
t r t r t r a
Any advice on what to subtract to remove this double counting?
Thanks for any help!
$\endgroup$ 62 Answers
$\begingroup$Let $a_n$ be the number of strings of length $n$ with no tr's. An $(n-1)$-long string can be extended in $26$ ways unless it ends in "t", in which case it can only be extended in $25$ ways. So $$a_n= 26a_{n-1}-a_{n-2}$$ for $n\ge2$; the initial conditions are $a_0=1$ and $a_1=26$.
You can give a formula for $a_n$, but to compute $a_7$ it suffices to run out the recurrence. The final answer is $26^7-a_7 = 71,112,600$.
In case there's doubt, another way to verify the recurrence is to write down the $26\times26$ transition matrix for letters. All of the entries are $1$'s except for a single $0$ off the diagonal (corresponding to the prohibited tr). The characteristic polynomial is computed to be $z^{24}(z^2-26z+1)$.
$\endgroup$ 5 $\begingroup$There are two possible solutions here, one using recurrences and the other one using PIE.
Recurrence.
The recurrences use two sequences $\{a_n\}$ and $\{b_n\}$ which count strings not containing the two-character pattern that end in the first character of the pattern and that do not. This gives
$$a_1 = 1 \quad\text{and}\quad b_1 = 25$$ and for $n\gt 1$ $$a_n = a_{n-1} + b_{n-1} \quad\text{and}\quad b_n = 24a_{n-1} + 25b_{n-1}.$$
These recurrences produce for $1\le n\le 7$ the sequence of sums $\{a_n+b_n\}$ which is $$26, 675, 17524, 454949, 11811150, 306634951, 7960697576,\ldots$$ so that the answer to the problem is (count of no ocurrences) $$7960697576.$$
Inclusion-Exclusion.
Let $M_{\ge q}$ be the set of strings containing at least $q$ ocurrences of the two-character pattern and let $M_{=q}$ be the set containing exactly $q$ ocurrences of the pattern. Then by inclusion-exclusion we have
$$|M_{=0}| = \sum_{k=0}^{\lfloor n/2\rfloor} (-1)^k |M_{\ge k}|.$$
Note however that $$|M_{\ge k}| = 26^{n-2k} {n-2k + k\choose k} = 26^{n-2k} {n-k\choose k}.$$
This is because when we have $k$ copies of the pattern there are $n-2k$ freely choosable letters that remain. Hence we have a total of $n-2k+k=n-k$ items to permute. We then choose the $k$ locations of the patterns among the $n-k$ items.
This gives the formula $$|M_{=0}| = \sum_{k=0}^{\lfloor n/2\rfloor} (-1)^k \times 26^{n-2k} \times {n-k\choose k}.$$
a := proc(n) option remember; if n=1 then return 1 fi; a(n-1)+b(n-1); end; b := proc(n) option remember; if n=1 then return 25 fi; 24*a(n-1)+25*b(n-1); end; ex_pie := proc(n) option remember; add((-1)^q*26^(n-2*q)*binomial(n-q,q), q=0..floor(n/2)); end;
Proof that the two answers are the same.
Introduce the generating functions $$A(z) = \sum_{n\ge 0} a_n z^n \quad\text{and}\quad B(z) = \sum_{n\ge 0} b_n z^n.$$
Observe that the correct intial value pair is $a_0 = 0$ and $b_0 = 1.$
Multiply the two recurrences by $z^n$ and sum over $n\ge 1$ to get $$A(z) - 0 = z A(z) + z B(z) \quad\text{and}\quad B(z) - 1 = 24 z A(z) + 25 z B(z).$$
Solve these two obtain $$A(z) = \frac{z}{z^2-26z+1} \quad\text{and}\quad B(z) = \frac{1-z}{z^2-26z+1}.$$
This yields the following generating function $G(z)$ for $\{a_n+b_n\}:$ $$G(z) = \frac{1}{z^2-26z+1}.$$
On the other hand we have $$G(z) = \sum_{n\ge 0} z^n \left(\sum_{k=0}^{\lfloor n/2\rfloor} 26^{n-2k} (-1)^k {n-k\choose k}\right).$$
This is $$\sum_{k\ge 0} 26^{-2k} (-1)^k \sum_{n\ge 2k} 26^n z^n {n-k\choose k} \\ = \sum_{k\ge 0} 26^{-2k} (-1)^k \sum_{n\ge 0} 26^{n+2k} z^{n+2k} {n+2k-k\choose k} \\ = \sum_{k\ge 0} z^{2k} (-1)^k \sum_{n\ge 0} 26^n z^{n} {n+k\choose k} = \sum_{k\ge 0} z^{2k} (-1)^k \frac{1}{(1-26z)^{k+1}} \\ = \frac{1}{1-26z} \sum_{k\ge 0} z^{2k} (-1)^k \frac{1}{(1-26z)^{k}} = \frac{1}{1-26z} \frac{1}{1+z^2/(1-26z)} \\ = \frac{1}{1-26z+z^2}.$$
This establishes the equality of the generating functions which was to be shown.
Closed form and OEIS entry.
The roots of the denominator of the generating function are $$\rho_{1,2} = 13 \pm 2\sqrt{42}.$$ Writing $$\frac{1}{1-26z+z^2} = \frac{1}{(z-\rho_1)(z-\rho_2)} = \frac{1}{\rho_1-\rho_2} \left(\frac{1}{z-\rho_1}-\frac{1}{z-\rho_2}\right) \\ = \frac{1}{4\sqrt{42}} \left(\frac{1}{\rho_1}\frac{1}{z/\rho_1-1} -\frac{1}{\rho_2}\frac{1}{z/\rho_2-1}\right) \\ = \frac{1}{4\sqrt{42}} \left(-\frac{1}{\rho_1}\frac{1}{1-z/\rho_1} +\frac{1}{\rho_2}\frac{1}{1-z/\rho_2}\right).$$
We now extract coefficients to get $$[z^n] G(z) = \frac{1}{4\sqrt{42}} \left(\rho_2^{-n-1}-\rho_1^{-n-1}\right).$$
Since $\rho_1\rho_2 = 1$ this finally becomes $$[z^n] G(z) = \frac{1}{4\sqrt{42}} \left(\rho_1^{n+1}-\rho_2^{n+1}\right)$$
which is the sequence $$26, 675, 17524, 454949, 11811150, 306634951, 7960697576, \\ 206671502025, 5365498355074, 139296285729899, \ldots$$
This is OEIS A097309 which has additional material and where in fact we find a copy of the problem statement that initiated this thread.
Alternative derivation of the closed form of $G(z).$
This uses the following integral representation. $${n-k\choose k} = \frac{1}{2\pi i} \int_{|w|=\epsilon} \frac{(1+w)^{n-k}}{w^{k+1}} \; dw.$$
This gives for the inner sum $$\frac{1}{2\pi i} \int_{|w|=\epsilon} \frac{(1+w)^n}{w} \left(\sum_{k=0}^{\lfloor n/2\rfloor} 26^{n-2k} (-1)^k \frac{1}{(1+w)^k w^k}\right) \; dw.$$
Note that the defining integral is zero when $\lfloor n/2\rfloor \lt k \le n,$ so this is in fact $$\frac{1}{2\pi i} \int_{|w|=\epsilon} \frac{(1+w)^n}{w} \left(\sum_{k=0}^n 26^{n-2k} (-1)^k \frac{1}{(1+w)^k w^k}\right) \; dw.$$
Simplifying we obtain $$\frac{1}{2\pi i} \int_{|w|=\epsilon} \frac{(1+w)^n}{w} 26^n \frac{(-1)^{n+1}/26^{2(n+1)}/(1+w)^{n+1}/w^{n+1}-1} {(-1)/26^2/(1+w)/w-1} \; dw$$ or $$\frac{1}{2\pi i} \int_{|w|=\epsilon} (1+w)^{n+1} 26^n \frac{(-1)^{n+1}/26^{2(n+1)}/(1+w)^{n+1}/w^{n+1}-1} {(-1)/26^2-w(w+1)} \; dw$$
The difference from the geometric series contributes two terms, the second of which has no poles inside the contour, leaving just
$$\frac{1}{2\pi i} \int_{|w|=\epsilon} \frac{(-1)^{n+1}}{26^{n+2}} \frac{1}{w^{n+1}} \frac{1}{(-1)/26^2-w(w+1)} \; dw.$$
It follows that $$G(z) = \sum_{n\ge 0} z^n [w^n] \frac{(-1)^{n+1}}{26^{n+2}} \frac{1}{(-1)/26^2-w(w+1)}.$$
What we have here is an annihilated coefficient extractor which simplifies as follows. $$\frac{-1}{26^2} \sum_{n\ge 0} (-z/26)^n [w^n] \frac{1}{(-1)/26^2-w(w+1)} \\ = \frac{-1}{26^2} \frac{1}{(-1)/26^2+z/26(-z/26+1)} = -\frac{1}{-1+z(-z+26)} \\ = \frac{1}{1-26z+z^2}.$$
This concludes the argument.
There is another annihilated coefficient extractor at thisMSE link.
$\endgroup$ 3