
Commit 9e210ae

hariv authored and trekhleb committed
Z algorithm implementation (trekhleb#77)
* Implemented Z algorithm
* Fixed bugs in implementation and added tests
* Added README explaining z algorithm
1 parent d74234d commit 9e210ae


3 files changed, 96 additions and 0 deletions

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
# Z-algorithm

The Z-algorithm finds occurrences of a "word" `W`
within a main "text string" `T` in linear time.

Given a string `S` of length `n`, the algorithm produces
an array `Z`, where `Z[i]` is the length of the longest substring
starting at `S[i]` that is also a prefix of `S`. Computing
`Z` for the string obtained by concatenating the word `W`,
a sentinel character (say `$`) that occurs in neither string,
and the text `T` solves pattern matching: if there is an index `i`
such that `Z[i]` equals the pattern length, then the pattern
occurs at that point, i.e. at index `i - |W| - 1` of `T`.
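
The definition above can be illustrated with a deliberately naive, quadratic-time sketch. It is not the code from this commit: the names `naiveZArray` and `firstMatch` are hypothetical, and setting `Z[0]` to `0` is an assumed convention (the commit leaves that entry unassigned).

```javascript
// Quadratic-time Z-array, computed straight from the definition.
// Assumption: z[0] is set to 0; it is never consulted by the search below.
function naiveZArray(s) {
  const z = new Array(s.length).fill(0);
  for (let i = 1; i < s.length; i += 1) {
    // Extend the match between the prefix of s and the suffix starting at i.
    while (i + z[i] < s.length && s[z[i]] === s[i + z[i]]) {
      z[i] += 1;
    }
  }
  return z;
}

// Hypothetical helper: first index of `word` in `text`, or -1.
// Assumes the sentinel '$' occurs in neither string.
function firstMatch(text, word) {
  const z = naiveZArray(`${word}$${text}`);
  for (let i = word.length + 1; i < z.length; i += 1) {
    if (z[i] === word.length) {
      // Shift back past the word and the sentinel to get a text index.
      return i - word.length - 1;
    }
  }
  return -1;
}

console.log(naiveZArray('aabxaab')); // [ 0, 1, 0, 0, 3, 1, 0 ]
console.log(firstMatch('xabab', 'ab')); // 1
```

For `S = 'aabxaab'`, `Z[4]` is `3` because `aab` starting at index 4 matches the prefix `aab`.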

While the `Z` array can be computed with two nested loops in
quadratic time, it can also be obtained in linear time. As we
iterate over the letters of the string (index `i` from `1` to
`n - 1`), we maintain an interval `[L, R]`: the interval with
maximum `R` such that `1 ≤ L ≤ i ≤ R` and `S[L...R]` is a
substring that is also a prefix of `S` (if no such interval
exists, let `L = R = -1`). For `i = 1`, we can simply compute
`L` and `R` by comparing `S[0...]` to `S[1...]`.
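
The interval bookkeeping described above might be sketched as follows for a single string `S`. This is a compact textbook-style formulation rather than the code in this commit; the function name `computeZ` is hypothetical, `Z[0]` is set to `0` by assumption, and the step that reuses `Z[i - L]` (capped at `R - i + 1`) is the standard way to exploit the interval.

```javascript
// Linear-time Z-array using a closed interval [left, right], where
// s[left..right] is the rightmost-ending substring found so far that
// is also a prefix of s. left = right = -1 means "no interval yet".
function computeZ(s) {
  const n = s.length;
  const z = new Array(n).fill(0); // assumption: z[0] stays 0
  let left = -1;
  let right = -1;

  for (let i = 1; i < n; i += 1) {
    if (i <= right) {
      // i lies inside the interval, so s[i..right] mirrors
      // s[i - left..right - left]; reuse that answer, capped at the
      // part of the interval that is actually known.
      z[i] = Math.min(right - i + 1, z[i - left]);
    }
    // Extend by explicit comparison; each success moves `right` forward,
    // which is what bounds the total work by O(n).
    while (i + z[i] < n && s[z[i]] === s[i + z[i]]) {
      z[i] += 1;
    }
    if (i + z[i] - 1 > right) {
      left = i;
      right = i + z[i] - 1;
    }
  }
  return z;
}

console.log(computeZ('aabxaab')); // [ 0, 1, 0, 0, 3, 1, 0 ]
console.log(computeZ('aaaaa')); // [ 0, 4, 3, 2, 1 ]
```

In the `'aabxaab'` example, once `i = 4` sets the interval to `[4, 6]`, the value at `i = 5` is copied from `Z[1]` without re-reading any matched characters.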

## Complexity

- **Time:** `O(|W| + |T|)`
- **Space:** `O(|W| + |T|)` (the concatenated string and its `Z` array are both stored in full)
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
import zAlgorithm from '../zAlgorithm';

describe('zAlgorithm', () => {
  it('should find word position in given text', () => {
    expect(zAlgorithm('abcbcglx', 'abca')).toBe(-1);
    expect(zAlgorithm('abcbcglx', 'bcgl')).toBe(3);
    expect(zAlgorithm('abcxabcdabxabcdabcdabcy', 'abcdabcy')).toBe(15);
    expect(zAlgorithm('abcxabcdabxabcdabcdabcy', 'abcdabca')).toBe(-1);
    expect(zAlgorithm('abcxabcdabxaabcdabcabcdabcdabcy', 'abcdabca')).toBe(12);
    expect(zAlgorithm('abcxabcdabxaabaabaaaabcdabcdabcy', 'aabaabaaa')).toBe(11);
  });
});
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
/**
 * @param {string} word
 * @param {string} text
 * @return {number[]}
 */
function buildZArray(word, text) {
  const zString = `${word}$${text}`;
  const zArray = new Array(zString.length);
  let left = 0;
  let right = 0;
  let k = 0;

  for (let i = 1; i < zString.length; i += 1) {
    if (i > right) {
      left = i;
      right = i;

      while (right < zString.length && zString[right - left] === zString[right]) {
        right += 1;
      }

      zArray[i] = right - left;
      right -= 1;
    } else {
      k = i - left;
      if (zArray[k] < (right - i) + 1) {
        zArray[i] = zArray[k];
      } else {
        left = i;
        while (right < zString.length && zString[right - left] === zString[right]) {
          right += 1;
        }

        zArray[i] = right - left;
        right -= 1;
      }
    }
  }

  return zArray;
}

/**
 * @param {string} text
 * @param {string} word
 * @return {number}
 */
export default function zAlgorithm(text, word) {
  const zArray = buildZArray(word, text);
  for (let i = 1; i < zArray.length; i += 1) {
    if (zArray[i] === word.length) {
      return (i - word.length - 1);
    }
  }
  return -1;
}
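
As a possible extension, not part of this commit, the same idea can report every match instead of only the first. The sketch below is self-contained; `zArrayOf` and `findAllOccurrences` are hypothetical names, the empty-word behaviour (returning `[]`) is an assumption, and the comparison uses `>=` rather than `===` so that a `$` occurring inside the text cannot hide a match by making a `Z` value longer than the word.

```javascript
// Hypothetical helper: linear-time Z-array of a single string.
// [left, right) is the rightmost-ending prefix match found so far.
function zArrayOf(s) {
  const z = new Array(s.length).fill(0);
  let left = 0;
  let right = 0;
  for (let i = 1; i < s.length; i += 1) {
    if (i < right) {
      z[i] = Math.min(right - i, z[i - left]);
    }
    while (i + z[i] < s.length && s[z[i]] === s[i + z[i]]) {
      z[i] += 1;
    }
    if (i + z[i] > right) {
      left = i;
      right = i + z[i];
    }
  }
  return z;
}

// Hypothetical variant: all start indices of `word` in `text`.
function findAllOccurrences(text, word) {
  if (word.length === 0) return []; // assumed convention for an empty word
  const z = zArrayOf(`${word}$${text}`);
  const offset = word.length + 1; // skip the word and the sentinel
  const positions = [];
  for (let i = offset; i < z.length; i += 1) {
    if (z[i] >= word.length) {
      positions.push(i - offset);
    }
  }
  return positions;
}

console.log(findAllOccurrences('abcxabcdabxabcdabcdabcy', 'abcd')); // [ 4, 11, 15 ]
console.log(findAllOccurrences('ab$ab', 'ab')); // [ 0, 3 ]
```

Starting the scan at `word.length + 1` restricts it to positions inside the text, so only genuine text matches are reported.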
