Interview Camp: Level: Hard String Search: Find The Index Where The Larger String A Contains A Target String T

Interview Camp

Technique: Rabin Karp String Search
Level: Hard
String Search: Find the index where the larger string A contains a target string T.

Questions to Clarify:
Q. If T occurs multiple times in A, do you want just the first index?
A. Yes

Q. If T doesn't exist in A, can I return -1?
A. Yes

Q. If T is empty, does that mean it exists in A?
A. Yes, empty strings exist in any non-null string.

Q. What if A or T is null?
A. Throw a null pointer exception.

Solution:
In Rabin Karp string search, we make a sliding window of the same length as T. We slide this
window one character at a time across A and check whether its contents are equal to T.

For example, let’s say T is "ello" and the larger string, which we call S from here on, is
"hello world". Let’s say our sliding window is called W.

Our first sliding window is "hell". Remember the sliding window is always the size of T.

S -> "hello world"
W -> "hell"

We compare this with T, and it is not equal. So we slide it by one letter:

S -> "hello world"
W -> "ello"

We compare it with T again and now W = T, so T exists in S. We return the index of 'e'.
If we slide all the way to the end of S (which means W is never equal to T), we return -1.
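
The window idea above, before any hashing is added, can be sketched roughly as follows. This is only an illustration: the names SlidingWindowSketch and naiveSearch are made up for this example and are not part of the original solution. Each window comparison can cost up to size(T) character checks, which is the cost the hash is meant to avoid.

```java
class SlidingWindowSketch {
    // Returns the first index where target occurs in str, or -1 if it never does.
    // Every window is compared with target directly; no hashing is involved.
    static int naiveSearch(String str, String target) {
        int n = str.length();
        int m = target.length();
        for (int start = 0; start + m <= n; start++) {
            // W is the window of length m that begins at 'start'
            String window = str.substring(start, start + m);
            if (window.equals(target)) {
                return start;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(naiveSearch("hello world", "ello")); // prints 1
        System.out.println(naiveSearch("hello world", "xyz"));  // prints -1
    }
}
```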

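Comparing W with T character by character at every position would take up to size(T) steps
each time. To avoid this, we represent the window by a hash code and compare hash codes
instead. Every time we slide the window by one character, we add the new character to the hash
code and remove the first character from it. This is also called a "rolling" hash function,
because it "rolls" forward one character at a time.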

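We create the hash of T, treating its characters as the coefficients of a polynomial in a
constant x (the code below uses x = 53):
hash(T:"ello") => 'e'.x^3 + 'l'.x^2 + 'l'.x + 'o'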
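
A minimal sketch of this hash in Java, assuming x = 53 as in the full solution. The names PolynomialHashSketch and hash are hypothetical. The loop uses Horner's rule, so no explicit powers of x are needed, and it produces the same value as writing the polynomial out term by term.

```java
class PolynomialHashSketch {
    static final int X = 53; // assumed constant, matching the full solution

    // Computes s[0]*X^(k-1) + s[1]*X^(k-2) + ... + s[k-1], where k = s.length().
    // int overflow simply wraps around, which is acceptable for a hash.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * X + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // The same polynomial written out term by term for "ello"
        int direct = 'e' * X * X * X + 'l' * X * X + 'l' * X + 'o';
        System.out.println(hash("ello") == direct); // prints true
    }
}
```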

© 2017 Interview Camp (interviewcamp.io)

Let’s say we have the following hash for the first 4 letters of S:

W => 'h'.x^3 + 'e'.x^2 + 'l'.x + 'l'

We compare W with hash(T) and see that they are not equal, so we roll W forward by 1 character.
The next character is 'o'. To roll it, we subtract 'h'.x^3, multiply by x and add 'o':

W => [('h'.x^3 + 'e'.x^2 + 'l'.x + 'l') - 'h'.x^3].x + 'o'
  => 'e'.x^3 + 'l'.x^2 + 'l'.x + 'o'
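
The rolling step can be checked with a small Java sketch, again assuming x = 53. The names RollingHashSketch, hash and roll are made up for this illustration. It rolls the hash of "hell" forward by one character and confirms that the result equals the hash of "ello" computed from scratch.

```java
class RollingHashSketch {
    static final int X = 53; // assumed constant, matching the full solution

    // Polynomial hash of the whole string, computed from scratch.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * X + s.charAt(i);
        }
        return h;
    }

    // Slides the window by one character: drops 'outgoing' (the first character
    // of the old window) and appends 'incoming'.
    // xPow must be X^(windowLength - 1).
    static int roll(int hash, char outgoing, char incoming, int xPow) {
        return (hash - outgoing * xPow) * X + incoming;
    }

    public static void main(String[] args) {
        int xPow = X * X * X;        // X^(4 - 1) for a window of 4 characters
        int w = hash("hell");        // first window of "hello world"
        w = roll(w, 'h', 'o', xPow); // window is now "ello"
        System.out.println(w == hash("ello")); // prints true
    }
}
```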

Now we compare with hash(T) again and see that the two hash values are equal. This tells us
that the window and T are very likely to be the same string.

Why very likely? Because two different strings can map to the same hash code.
So we compare the two strings directly to make sure they are actually equal, and if so, we
return the index of 'e', which is 1. If they are not equal, it was a false positive and we just
carry on to the next character.
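
To make the false-positive case concrete, here is a small sketch of a collision under the same polynomial hash with x = 53. The strings "Az" and "BE" were picked for this illustration (65*53 + 122 and 66*53 + 69 are both 3567), and the class name HashCollisionSketch is hypothetical.

```java
class HashCollisionSketch {
    static final int X = 53; // assumed constant, matching the full solution

    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * X + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        String a = "Az";
        String b = "BE";
        // Equal hashes: 'A'*53 + 'z' == 'B'*53 + 'E' == 3567
        System.out.println(hash(a) == hash(b)); // prints true
        // ...but the strings differ, so a direct comparison is still required
        System.out.println(a.equals(b));        // prints false
    }
}
```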

Pseudocode:
(Note: Never write pseudocode in an actual interview, unless you're writing a few
lines quickly to plan out your solution. Your actual solution should be in a real
language and use good syntax.)
search(S, T)
    if T or S is null, throw exception
    if T is empty, return 0 // empty string exists in every string
    if size(T) > size(S), return -1

    // process first T.length characters
    hashT = 0 // hash of T
    hash = 0 // sliding window hash
    for i -> 0 to T.length - 1
        add T[i] to hashT
        add S[i] to hash
    if hashT == hash and S[0 .. T.length - 1] equals T
        return 0

    for i -> T.length to S.length - 1
        remove S[i - T.length] from hash, add S[i] to hash
        // window now covers S[i - T.length + 1 .. i]
        if hash == hashT and S[i - T.length + 1 .. i] equals T
            Target found, return i - T.length + 1
    Target not found, return -1

Test Cases:
Edge Cases: size(T) > size(S), T is empty/null, S is empty/null
Base Cases: T is 1 char, S is 1 char
Regular Cases: T is at S[0], T is at end, T is in middle
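
One way to turn the cases above into an executable check is sketched below. The harness accepts any search(S, T) implementation as a BiFunction. The names SearchTestCasesSketch and passesAll, and the concrete input strings, are assumptions made for this example. In main, Java's built-in String.indexOf is plugged in only as a stand-in, since it follows the same conventions agreed in the clarifying questions (first index, -1 when missing, 0 for an empty target, NullPointerException on null).

```java
import java.util.function.BiFunction;

class SearchTestCasesSketch {
    // Runs the listed cases against any search(S, T) implementation.
    static boolean passesAll(BiFunction<String, String, Integer> search) {
        String[][] cases = {
            // S, T, expected index
            {"ab", "abc", "-1"},           // edge: size(T) > size(S)
            {"abc", "", "0"},              // edge: T is empty
            {"", "a", "-1"},               // edge: S is empty
            {"abc", "b", "1"},             // base: T is 1 char
            {"a", "a", "0"},               // base: S is 1 char
            {"hello world", "hello", "0"}, // regular: T is at S[0]
            {"hello world", "world", "6"}, // regular: T is at end
            {"hello world", "lo w", "3"},  // regular: T is in middle
        };
        for (String[] c : cases) {
            int actual = search.apply(c[0], c[1]);
            if (actual != Integer.parseInt(c[2])) {
                return false;
            }
        }
        // edge: null S or null T should throw
        try {
            search.apply(null, "a");
            return false;
        } catch (NullPointerException expected) {
            // expected
        }
        try {
            search.apply("a", null);
            return false;
        } catch (NullPointerException expected) {
            // expected
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in implementation; replace the lambda with the search under test.
        System.out.println(passesAll((s, t) -> s.indexOf(t))); // prints true
    }
}
```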


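Time Complexity: O(n) on average, where n = size(S). Each hash match costs an extra O(size(T))
string comparison, so with many hash collisions the worst case is O(n * size(T)).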
Space Complexity: O(1)

public static int search(String str, String target) {
    if (str == null || target == null)
        throw new NullPointerException();
    if (target.isEmpty()) // empty string exists in every string
        return 0;
    if (target.length() > str.length())
        return -1;

    int x = 53; // a prime number

    // calculate hash for first target.length letters
    int hashT = 0;
    int hash = 0;
    for (int i = 0; i < target.length(); i++) {
        hashT = hashT * x + target.charAt(i);
        hash = hash * x + str.charAt(i);
    }

    // found match in first substring
    if (hashT == hash && target.equals(str.substring(0, target.length())))
        return 0;

    // calculate x^(target.length - 1) beforehand. Notice we didn't use the inbuilt
    // Math.pow() function. This is because it does not wrap around integers if
    // they overflow. If an int goes past ~2 billion, Java wraps it around to the
    // negative end of the range and keeps counting from there. This is the
    // behavior we want, because the rolling hash relies on it. Math.pow() returns
    // a double, which becomes Integer.MAX_VALUE when cast to int instead of
    // wrapping around.
    int xPow = 1;
    for (int i = 0; i < target.length() - 1; i++) {
        xPow *= x;
    }

    for (int i = target.length(); i < str.length(); i++) {
        // the window now covers str[i - target.length + 1 .. i]
        int toRemove = str.charAt(i - target.length());
        hash = ((hash - toRemove * xPow) * x + str.charAt(i));
        if (hash == hashT
                && target.equals(str.substring(i - target.length() + 1, i + 1))) {
            return i - target.length() + 1;
        }
    }
    return -1; // not found
}
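
The remark about Math.pow() in the code comments can be seen in a small experiment. The sketch below uses hypothetical names (PowOverflowSketch, intPow) and compares the wrapped int product for 53^10 with the result of casting Math.pow(53, 10) to int. The exact value still fits in a long, so it serves as a reference for what wrapping should produce.

```java
class PowOverflowSketch {
    // Computes x^n with plain int multiplication, letting overflow wrap around.
    static int intPow(int x, int n) {
        int result = 1;
        for (int i = 0; i < n; i++) {
            result *= x;
        }
        return result;
    }

    public static void main(String[] args) {
        int wrapped = intPow(53, 10);        // wraps past the int range
        int viaPow = (int) Math.pow(53, 10); // double too large for int

        long exact = 1;                      // 53^10 still fits in a long
        for (int i = 0; i < 10; i++) {
            exact *= 53;
        }

        // Wrapping keeps the low 32 bits of the exact value
        System.out.println(wrapped == (int) exact);      // prints true
        // The cast from double saturates instead of wrapping
        System.out.println(viaPow == Integer.MAX_VALUE); // prints true
    }
}
```

Because the rolling update subtracts and multiplies in int arithmetic, only the wrapped value keeps the window hash consistent with a hash computed from scratch.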

