Interview Camp: Level: Hard String Search: Find The Index Where The Larger String A Contains A Target String T

Interview Camp 

 
Technique: Rabin Karp String Search 

Level: Hard 

String Search: Find the index where the larger string A contains a target string T. 
 
Questions to Clarify: 
Q. If T occurs multiple times in A, do you want just the first index? 
A. Yes 
 
Q. If T doesn't exist in A, can I return -1? 
A. Yes 
 
Q. If T is empty, does that mean it exists in A? 
A. Yes, empty strings exist in any non-null string. 
 
Q. What if A or T is null? 
A. Throw a null pointer exception. 

Solution: 
In Rabin Karp string search, we make a sliding window whose length equals the length of T. We
slide this window one character at a time across A and check whether its contents equal T.
 
For example, let's say T is "ello" and the larger string, which we call S from here on, is
"hello world". Let's say our sliding window is called W.
 
Our first sliding window is "hell". Remember the sliding window is always the size of T.
 
S -> "hello world"
W -> "hell" 
 
We compare this with T, and it is not equal. So we slide it by one letter: 
 
S -> "hello world"
W -> "ello"  
 
We compare it with T again and now W = T, so T exists in S. We return the index of 'e'.
If we slide all the way to the end of S (which means W is never equal to T), we return -1.
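
The window walk described above can be sketched directly, before any hashing is introduced. This is a minimal illustration only: the class name NaiveWindowSearch is made up for this example, and it compares characters at every position, so it is the plain sliding window rather than Rabin Karp itself.

```java
final class NaiveWindowSearch {
    // Returns the first index where target occurs in str, or -1 if it never does.
    static int search(String str, String target) {
        int m = target.length();
        // The window always has length m, so its start can go no further
        // than str.length() - m.
        for (int start = 0; start + m <= str.length(); start++) {
            // W is the window of length m beginning at `start`; compare it with T.
            if (str.regionMatches(start, target, 0, m)) {
                return start;
            }
        }
        return -1; // slid past the end without W ever equalling T
    }
}
```

For the running example, NaiveWindowSearch.search("hello world", "ello") stops at the second window and returns 1, the index of 'e'.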
 
Comparing W with T character by character at every position costs O(size(T)) per step, so we
implement this window using a hash function instead. Every time we slide the window by one
character, we add the new character to the hash code and remove the first character from it.
This is also called a "rolling" hash function, because it "rolls" forward one character at a time.
 
We create the hash of T: 
hash(T:"ello") => 'e'*x^3 + 'l'*x^2 + 'l'*x + 'o' 
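
As a rough sketch, a polynomial hash of this shape can be computed in one left-to-right pass (Horner's rule). The class name PolyHash is invented for this illustration, and the base 53 is borrowed from the Java solution later in this handout; any other base would follow the same pattern.

```java
final class PolyHash {
    static final int X = 53; // the base x; 53 matches the solution code in this handout

    // Computes s[0]*X^(n-1) + s[1]*X^(n-2) + ... + s[n-1].
    // Multiplying the running value by X before adding each character
    // raises the power of every earlier term by one.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * X + s.charAt(i); // int overflow wraps around silently
        }
        return h;
    }
}
```

Here PolyHash.hash("ello") evaluates to the same value as writing out 'e'*x^3 + 'l'*x^2 + 'l'*x + 'o' term by term with x = 53.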
 

 
 
© 2017 Interview Camp (interviewcamp.io) 

Let’s say we have the following hash for the first 4 letters of S: 
  
W => 'h'*x^3 + 'e'*x^2 + 'l'*x + 'l' 
 
We compare W with hash(T) and see that they are not equal, so we roll W forward by 1 character. 
The next character is 'o'. To roll it, we subtract 'h'*x^3, multiply by x and add 'o': 
 
W => [('h'*x^3 + 'e'*x^2 + 'l'*x + 'l') - 'h'*x^3]*x + 'o'
  => 'e'*x^3 + 'l'*x^2 + 'l'*x + 'o' 
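
The roll step can be written as a single expression. The sketch below is an illustration under assumptions: the names RollingHashStep, roll and highPow are hypothetical, and the base 53 is again taken from the solution code in this handout.

```java
final class RollingHashStep {
    static final int X = 53; // the base x

    // Polynomial hash of the whole string, used here only to check the roll.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * X + s.charAt(i);
        }
        return h;
    }

    // Slides the window by one character: removes `outgoing` (the leftmost
    // character), shifts the remaining terms up one power, appends `incoming`.
    // highPow must equal X^(windowLength - 1), the weight of the leftmost term.
    static int roll(int windowHash, char outgoing, char incoming, int highPow) {
        return (windowHash - outgoing * highPow) * X + incoming;
    }
}
```

With a window of 4 characters, highPow is 53^3, and rolling hash("hell") with outgoing 'h' and incoming 'o' gives the same value as computing hash("ello") from scratch.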
  
Now we check against hash(T) again and see that the values are equal, so the window and T are
very likely to be equal. 
 
Why only very likely? Because two different strings can map to the same hash code. 
So we compare the two strings directly to make sure they are actually equal, and if so, we return 
the index of 'e', which is 1. If they are not equal, it was a false positive, and we just carry on 
to the next character. 
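
A small concrete collision shows why the extra comparison is needed. Assuming the base x = 53 used in the solution code, the strings "Aa" and "B," hash to the same value (65*53 + 97 = 66*53 + 44 = 3542) although they differ. The class and method names below are made up for this sketch.

```java
final class CollisionDemo {
    static final int X = 53; // the base x

    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * X + s.charAt(i);
        }
        return h;
    }

    // A window counts as a match only when the hashes agree AND the
    // characters agree; the hash comparison alone can be a false positive.
    static boolean windowMatches(String str, int start, String target,
                                 int windowHash, int targetHash) {
        return windowHash == targetHash
                && str.regionMatches(start, target, 0, target.length());
    }
}
```

Because the character comparison runs only when the hashes already agree, it is skipped at most positions, which is where the speed of the approach comes from.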

Pseudocode: 
(Note: Never write pseudocode in an actual interview unless you're writing a few
lines quickly to plan out your solution. Your actual solution should be in a real
language and use good syntax.)

search(S, T)
    if T or S is null, throw exception
    if T is empty, return 0 // an empty string exists in every string
    if size(T) > size(S), return -1

    // process first T.length characters
    hashT = 0 // hash of T
    hash = 0 // sliding window hash
    for i -> 0 to T.length - 1
        add T[i] to hashT
        add S[i] to hash
    if hashT == hash and S[0 .. T.length - 1] equals T
        return 0

    for i -> T.length to S.length - 1
        remove S[i - T.length] from hash, add S[i] to hash
        if hash == hashT and S[i - T.length + 1 .. i] equals T
            Target found, return i - T.length + 1

    Target not found, return -1

Test Cases: 
Edge Cases: size(T) > size(S), T is empty/null, S is empty/null 
Base Cases: T is 1 char, S is 1 char 
Regular Cases: T is at S[0], T is at end, T is in middle 
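
One way to exercise these cases, sketched under the assumption that the search function should agree with Java's built-in String.indexOf (first index, -1 when absent, 0 for an empty target). The class SearchTestCases and its helper names are hypothetical, and the specific input strings are chosen only for illustration.

```java
import java.util.function.ToIntBiFunction;

final class SearchTestCases {
    // Returns true if `search` agrees with String.indexOf on every case below.
    static boolean passesAll(ToIntBiFunction<String, String> search) {
        String[][] cases = {
            {"ab", "abc"},            // edge: size(T) > size(S)
            {"hello", ""},            // edge: T is empty
            {"", "a"},                // edge: S is empty
            {"", ""},                 // edge: both empty
            {"hello", "l"},           // base: T is 1 char
            {"h", "h"},               // base: S is 1 char, match
            {"h", "x"},               // base: S is 1 char, no match
            {"hello world", "hello"}, // regular: T is at S[0]
            {"hello world", "world"}, // regular: T is at end
            {"hello world", "lo w"},  // regular: T is in middle
            {"hello world", "xyz"},   // regular: T is absent
        };
        for (String[] c : cases) {
            if (search.applyAsInt(c[0], c[1]) != c[0].indexOf(c[1])) {
                return false;
            }
        }
        return true;
    }

    // Null inputs are expected to throw a NullPointerException.
    static boolean throwsOnNull(ToIntBiFunction<String, String> search) {
        try {
            search.applyAsInt(null, "a");
            return false;
        } catch (NullPointerException expected) {
            // fall through to the second check
        }
        try {
            search.applyAsInt("a", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }
}
```

Any candidate with the signature (String, String) -> int can be passed in as a lambda or method reference, so the same cases can be reused while the implementation changes.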

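Time Complexity: O(n) on average, where n = size(S). Each hash match triggers an O(m)
string comparison (m = size(T)), so the worst case, with a hash match at every position,
is O(n * m). 

Space Complexity: O(1) for the hash state. Each substring() call made to verify a match
allocates a temporary string of length m. 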


 
public static int search(String str, String target) {
    if (str == null || target == null)
        throw new NullPointerException();

    if (target.isEmpty()) // empty string exists in every string
        return 0;
    if (target.length() > str.length())
        return -1;

    int x = 53; // a prime number

    // calculate hash for first target.length letters
    int hashT = 0;
    int hash = 0;
    for (int i = 0; i < target.length(); i++) {
        hashT = hashT * x + target.charAt(i);
        hash = hash * x + str.charAt(i);
    }
    // found match in first substring
    if (hashT == hash && target.equals(str.substring(0, target.length())))
        return 0;

    // calculate x^(target.length - 1) beforehand. Notice we didn't use the inbuilt
    // Math.pow() function. Int arithmetic wraps around on overflow: if a value
    // goes past ~2 billion, it continues from the most negative int. This is the
    // behavior we want. Math.pow() returns a double, and casting an out-of-range
    // double to int clamps it to Integer.MAX_VALUE instead of wrapping.
    int xPow = 1;
    for (int i = 0; i < target.length() - 1; i++) {
        xPow *= x;
    }

    for (int i = target.length(); i < str.length(); i++) {
        // drop the leftmost character of the old window, then append str[i]
        int toRemove = str.charAt(i - target.length());
        hash = ((hash - toRemove * xPow) * x + str.charAt(i));
        // the hashes can collide, so confirm with an actual string comparison
        if (hash == hashT
                && target.equals(str.substring(i - target.length() + 1, i + 1))) {
            return i - target.length() + 1;
        }
    }

    return -1; // not found
}
