Indian Computing Olympiad

Training Material


Sliding Window Algorithms→Writing (IOI 2006)

Deciphering the Mayan writing has proven to be a harder task than anticipated by the early investigations. After almost two hundred years, very little of it was actually understood. It has been only in the last three decades that real advances have been made.

Mayan writing is based on small drawings known as glyphs which represent sounds. Mayan words are normally written as glyphs put together at various positions.

One of several problems in deciphering Mayan writing arises in the order of reading. When placing several glyphs in order to form a word, Mayan writers sometimes decided the position based more on their own esthetic views than on any particular rule. This leads to the fact that, even though the sound for many glyphs is known, sometimes archaeologists are not sure how to pronounce a written word.

The archaeologists are looking for a special word W. They know the glyphs for it, but they don't know all the possible ways of arranging them. Since they knew you were coming to IOI-06, they have asked for your help. They will provide you with the g glyphs from W and a sequence S of all the glyphs (in the order they appear) in the carvings they are studying. Help them by counting the number of possible appearances of the word W.

Example

Suppose W = cAda and the given sequence S is AbrAcadAbR then the answer is 2.

You may assume that the number of glyphs in W is at most 3000 and the number of glyphs in S is not more than 3000000.

Solution

The obvious idea is the following: Pick each position in S. Check if the next |W| positions is a permutation of W, where |W| is the length of W.

Checking if X is a permutation of W is best done by counting the occurances of each letter in X and in W and verifying that they are indeed equal.

We can implement this idea best by using a sliding window. Begin with the |W| length word beginning at position 1 in S and check if that is permutation of W (as described above). Then shift write by decrementing the count of letter 1 and incrementing the count of letter |W|+1 and so on.

The window shifts right by at most |S|-|W| steps and at each point we have to examine the count of all the letters in our window equals the count of all the letters in W (which can be done in 52 steps, the size of the alphabet).

Thus the algorithm runs in time 52*|S|.

It is possible to make this algorithm more efficient by the doing the following. For each window we also keep track of the count of letters in the alphabet for which the window as the same frequency has the asthe word W. When the window shifts, this can be updated in constant number of operations. With this information checking whether the current window as the same count for all letters as W is just verifying that our count is 52! This works in time proportional to |S| + 52.