Interview Camp: Level: Hard String Search: Find The Index Where The Larger String A Contains A Target String T
Interview Camp
Technique: Rabin Karp String Search
Level: Hard
String Search: Find the index where the larger string A contains a target string T.
Questions to Clarify:
Q. If T occurs multiple times in A, do you want just the first index?
A. Yes
Q. If T doesn't exist in A, can I return -1?
A. Yes
Q. If T is empty, does that mean it exists in A?
A. Yes, empty strings exist in any non-null string.
Q. What if A or T is null?
A. Throw a null pointer exception.
Solution:
In Rabin Karp string search, we make a sliding window whose size is the length of T. We slide this window one
character at a time across the larger string and check whether its contents are equal to T.
For example, let’s say T is "ello" and the larger string (A in the problem statement, called S from here on) is "hello world". Let’s say our sliding window is
called W.
Our first sliding window is "hell". Remember, the sliding window is always the same length as T.
S -> "hello world"
W -> "hell"
We compare this with T, and it is not equal. So we slide it by one letter:
S -> "hello world"
W -> "ello"
We compare it with T again and now W = T, so T exists in S. We return the index of 'e'.
If we slide all the way to the end of S (which means W is never equal to T), we return -1.
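The sliding window on its own, before any hashing is introduced, can be sketched as below. This is a minimal illustration in Python, not the handout's solution; the function name find_by_sliding_window is hypothetical, and each window is compared to T directly.

```python
def find_by_sliding_window(s: str, t: str) -> int:
    """Return the first index where t occurs in s, or -1 if it never does."""
    window_size = len(t)
    # The last valid window starts at len(s) - window_size.
    # If t is longer than s, the range is empty and we fall through to -1.
    for start in range(len(s) - window_size + 1):
        window = s[start:start + window_size]
        if window == t:
            return start
    return -1


first_match = find_by_sliding_window("hello world", "ello")
```

Comparing every window directly costs up to size(T) character comparisons per position, which is the cost the hash-based window is meant to avoid.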
We implement this window using a hash function. Every time we slide the window by one
character, we add the new character to the hash code and remove the first character from it.
This is also called a "rolling" hash function, because it "rolls" forward one character at a time.
We create the hash of T. Here x is a fixed base and each quoted character stands for its character code:
hash(T:"ello") => 'e'.x^3 + 'l'.x^2 + 'l'.x + 'o'
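A hash of this polynomial form might be computed as in the following sketch. The base x = 31 and the name poly_hash are assumptions made for illustration; the handout does not fix a base. Python integers do not overflow, so no modulus is applied here.

```python
def poly_hash(text: str, x: int = 31) -> int:
    """Polynomial hash: c0*x^(n-1) + c1*x^(n-2) + ... + c(n-1).

    x = 31 is an assumed base, chosen only for illustration.
    """
    h = 0
    for ch in text:
        # Horner's rule: multiplying by x shifts every earlier term up one power.
        h = h * x + ord(ch)
    return h


target_hash = poly_hash("ello")
```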
© 2017 Interview Camp (interviewcamp.io)
Let’s say we have the following hash for the first 4 letters of S:
W => 'h'.x^3 + 'e'.x^2 + 'l'.x + 'l'
We compare W with hash(T) and see that they are not equal, so we roll W forward by 1 character.
The next character is 'o'. To roll it, we subtract 'h'.x^3, multiply by x and add 'o':
W => [('h'.x^3 + 'e'.x^2 + 'l'.x + 'l') - 'h'.x^3].x + 'o'
  => 'e'.x^3 + 'l'.x^2 + 'l'.x + 'o'
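The roll step can be checked numerically with a small sketch. The names direct_hash and roll_hash and the base x = 31 are hypothetical choices for this example. It rolls the hash of "hell" forward by one character so it can be compared against the hash of "ello" computed from scratch.

```python
def direct_hash(text: str, x: int = 31) -> int:
    """Polynomial hash computed from scratch (assumed base x = 31)."""
    h = 0
    for ch in text:
        h = h * x + ord(ch)
    return h


def roll_hash(window_hash: int, outgoing: str, incoming: str,
              window_size: int, x: int = 31) -> int:
    """Slide the window one character: drop `outgoing`, append `incoming`."""
    highest_power = x ** (window_size - 1)      # weight of the first character
    without_first = window_hash - ord(outgoing) * highest_power
    return without_first * x + ord(incoming)    # shift up one power, add new char


# Roll "hell" -> "ello" by removing 'h' and adding 'o'.
rolled = roll_hash(direct_hash("hell"), "h", "o", window_size=4)
```

The roll costs a constant number of arithmetic operations regardless of window size, whereas recomputing the hash from scratch would touch every character in the window.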
Now we compare with hash(T) again and see that the values are equal. This tells us that the window
and T are very likely to be equal.
Why only very likely? Because two different strings can map to the same hash code.
So we compare the two strings character by character to make sure they are actually equal, and if so, we return
the index of 'e', which is 1. If they are not equal, it was a false positive, and we just carry on
to the next character.
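To see why the final string comparison is needed, the sketch below looks for two different strings with the same hash. It assumes the hash is reduced by a modulus, as implementations commonly do to keep values bounded; the handout does not state this. The tiny modulus 101, the base 31, and the names mod_hash and find_collision are illustrative assumptions. With 676 two-letter strings and only 101 possible hash values, a collision must exist.

```python
from itertools import product


def mod_hash(text: str, x: int = 31, modulus: int = 101) -> int:
    """Polynomial hash reduced by a deliberately tiny, assumed modulus."""
    h = 0
    for ch in text:
        h = (h * x + ord(ch)) % modulus
    return h


def find_collision(alphabet: str = "abcdefghijklmnopqrstuvwxyz", length: int = 2):
    """Return the first pair of distinct strings found with equal hashes."""
    seen = {}  # hash value -> first string that produced it
    for chars in product(alphabet, repeat=length):
        word = "".join(chars)
        h = mod_hash(word)
        if h in seen:
            return seen[h], word
        seen[h] = word
    return None


collision = find_collision()
```

A realistic modulus makes such collisions rare, but not impossible, which is why equal hashes are treated only as a candidate match.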
Pseudocode:
(Note: Never write pseudocode in an actual interview. Unless you're writing a few
lines quickly to plan out your solution. Your actual solution should be in a real
language and use good syntax.)
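search(S, T)
if T or S is null, throw exception
if T is empty, return 0
if size(T) > size(S), return -1
compute hash(T) and hash(W) for the first window of S
for each window start i: if hash(W) = hash(T) and W = T, return i; else roll W forward
return -1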
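One way the full search might look in a real language is sketched below in Python. Several choices are assumptions rather than part of the handout: the base x = 31, the modulus 1,000,000,007 used to keep hash values bounded, the name rabin_karp_search, and raising TypeError as the Python stand-in for a null pointer exception. Following the clarifying questions above, an empty T returns 0.

```python
def rabin_karp_search(s: str, t: str, x: int = 31,
                      modulus: int = 1_000_000_007) -> int:
    """Return the first index of t in s, or -1. x and modulus are assumed values."""
    if s is None or t is None:
        raise TypeError("s and t must not be None")
    n, m = len(s), len(t)
    if m == 0:
        return 0          # an empty target exists in any non-null string
    if m > n:
        return -1

    # Weight of the window's first character: x^(m-1) mod modulus.
    high_power = pow(x, m - 1, modulus)

    # Hash the target and the first window with the same polynomial.
    target_hash = 0
    window_hash = 0
    for i in range(m):
        target_hash = (target_hash * x + ord(t[i])) % modulus
        window_hash = (window_hash * x + ord(s[i])) % modulus

    for start in range(n - m + 1):
        # Equal hashes are only a candidate match; confirm with a real comparison.
        if window_hash == target_hash and s[start:start + m] == t:
            return start
        if start + m < n:
            # Roll: remove s[start], shift up one power, add s[start + m].
            window_hash = (window_hash - ord(s[start]) * high_power) % modulus
            window_hash = (window_hash * x + ord(s[start + m])) % modulus
    return -1


index = rabin_karp_search("hello world", "ello")
```

Python's % operator returns a non-negative result for a positive modulus, so the subtraction in the roll step needs no extra correction here; in languages where % can return a negative value, the modulus would need to be added back before continuing.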
Test Cases:
Edge Cases: size(T) > size(S), T is empty/null, S is empty/null
Base Cases: T is 1 char, S is 1 char
Regular Cases: T is at S[0], T is at end, T is in middle
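The cases listed above could be written down as a small table-driven check, as in the sketch below. The helper names run_cases and reference_search are hypothetical; reference_search wraps Python's built-in str.find only as a stand-in so the table can be exercised, and any search function with the same (s, t) signature could be passed instead.

```python
# Each entry: (S, T, expected index), following the categories listed above.
CASES = [
    ("abc", "abcd", -1),   # edge: size(T) > size(S)
    ("abc", "", 0),        # edge: T is empty
    ("", "", 0),           # edge: S and T are empty
    ("", "a", -1),         # edge: S is empty
    ("a", "a", 0),         # base: both are 1 char, match
    ("a", "b", -1),        # base: both are 1 char, no match
    ("hello", "he", 0),    # regular: T is at S[0]
    ("hello", "lo", 3),    # regular: T is at the end
    ("hello", "ll", 2),    # regular: T is in the middle
]


def reference_search(s: str, t: str) -> int:
    """Stand-in implementation built on str.find, used only to run the table."""
    if s is None or t is None:
        raise TypeError("s and t must not be None")
    return s.find(t)


def run_cases(search) -> list:
    """Return the cases where `search` disagrees with the expected index."""
    failures = []
    for s, t, expected in CASES:
        actual = search(s, t)
        if actual != expected:
            failures.append((s, t, expected, actual))
    return failures


failures = run_cases(reference_search)
```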