Algorithms For Prediction
This algorithm predicts the next possible choices for a given prefix string.
Global Variables:
Local Variables:
Begin:
Set str[ ] = prefix.
Set L = length of str.
Set current = pointer.
For i=0 to i=L.
{
If (current has a valid pointer for str[i])
{
current = current->str[i].
if (current->is_end == true)
// 1
{
if (i < (L-2))
// 2
{
Predict(str[i+1, L], head).
// 3
}
}
}
else
// 4
{
if (i != L)
{
Predict(str[i+1, L], head).
}
else
{
return.
}
}
}
last_pointer = current;
Initialize current_children linked list.
For all valid children pointers of current
{
insert new node at the end of current_children.
insert the child's value and prefix_count in the last node of current_children.
}
Select the top 5 nodes from current_children, ranked by prefix_count.
Store them sequentially in the predictions array.
End
Comments:
1. To account for multiple words in the string.
2. If at least two characters follow the character with is_end set to true.
3. str[i+1, L] returns the substring from index i+1 up to the end of the array.
This allows predictions for inputs such as testpassword or passwordmonkey.
4. If no predictions are possible after a prefix but more characters follow that
point, split the string as in case 3.
This allows predictions for inputs such as ezqpasswordcat.
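The Predict procedure above can be sketched in Python as follows. This is a minimal illustration, not the original implementation. The TrieNode class, the insert helper, and the dictionary of children are assumptions made for the example; is_end, prefix_count, and the top-5 selection come from the pseudocode. Two further assumptions: the walk covers indices 0 to L-1, and the result of a recursive call is returned immediately.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # character -> TrieNode
        self.is_end = False     # True if a stored word ends at this node
        self.prefix_count = 0   # number of stored words passing through this node


def insert(head, word):
    """Hypothetical helper: add one word to the trie rooted at head."""
    node = head
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
        node.prefix_count += 1
    node.is_end = True


def predict(prefix, head, top_n=5):
    """Return up to top_n likely next characters for prefix."""
    current = head
    length = len(prefix)
    for i, ch in enumerate(prefix):
        if ch in current.children:
            current = current.children[ch]
            # Comments 1-3: a word ended and at least two characters follow,
            # so treat the remainder as the start of a new word.
            if current.is_end and i < length - 2:
                return predict(prefix[i + 1:], head, top_n)
        else:
            # Comment 4: nothing matches at this level; retry on the suffix.
            if i < length - 1:
                return predict(prefix[i + 1:], head, top_n)
            return []
    # Rank the children of the last matched node by prefix_count.
    ranked = sorted(current.children.items(),
                    key=lambda item: item[1].prefix_count, reverse=True)
    return [ch for ch, _ in ranked[:top_n]]


head = TrieNode()
for word in ["password", "monkey", "monkeys", "test", "cat", "cats"]:
    insert(head, word)
print(predict("testpass", head))        # splits after "test", then follows "pass"
print(predict("ezqpasswordcat", head))  # skips "ezq", splits after "password"
```

Returning as soon as a stored word ends means a short word can hide a longer one that shares its prefix. A fuller version might keep walking and fall back to the split only when the longer match fails.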
*The code inside the first For loop of the above algorithm can be altered by removing
the if conditions, so that the Predict algorithm is called recursively on every suffix
of the passed string argument. This way a word starting anywhere in the string can be
detected, but this will be slower on large inputs.
//3 and //4 cover most of these cases, so the above algorithm calls Predict
recursively only when it detects the end of the word or when there are no more
predictions at some level.
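The every-suffix variant mentioned in the note can be sketched as follows. This is only an illustration under stated assumptions: the nested-dictionary trie, the END marker, and the function names are hypothetical, the suffixes are tried in a loop rather than by recursion, and ranking by prefix_count is left out to keep the example short.

```python
END = "<end>"  # multi-character key, so it cannot collide with a single character


def build_trie(words):
    """Hypothetical helper: build a nested-dict trie from a list of words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def walk(root, text):
    """Follow text from the root; return the node reached, or None on a mismatch."""
    node = root
    for ch in text:
        node = node.get(ch)
        if node is None:
            return None
    return node


def predict_every_suffix(root, text):
    """Try each suffix of text, longest first, and return the first suffix that
    is a proper prefix of a stored word, together with its next characters."""
    for start in range(len(text)):
        node = walk(root, text[start:])
        if node is not None:
            next_chars = sorted(ch for ch in node if ch != END)
            if next_chars:
                return text[start:], next_chars
    return "", []


root = build_trie(["password", "cat", "cats"])
print(predict_every_suffix(root, "ezqpasswordcat"))
```

Each of the L suffixes may need a walk of up to L steps, so the worst case grows quadratically with the input length. This is the slowdown the note refers to.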
Algorithm 3: Read_password( )
This algorithm reads input from stdin, calls the Predict algorithm and gives feedback
to the user.
Global Variables:
Local Variables:
Begin:
Initialize i to zero.
Initialize d_i to zero.
Declare char string[ ].
Declare char c.
Read a character from stdin into c.
If (c == \r)
// 1
{
return.
}
switch (c)
// 2
{
case @:
If a exists in predictions
{
display c in red.
Echo @ in place of a is predictable.
Change c to a.
}
case 1:
If l exists in predictions
{
display c in red.
Echo 1 in place of l is predictable.
Change c to l.
}
case 0:
If o exists in predictions
{
display c in red.
Echo 0 in place of o is predictable.
Change c to o.
}
}
display c in green.
if (c == _ or c == @ or c == &)
// 3
{
reset i and d_i to zero.
reset string[ ] and digit_string[ ].
}
If c is a digit or special character
// 4
{
reset digit_predictions array.
append c to digit_string[ ].
Increment d_i.
Predict_digit(digit_string, digit_head).
Display predictions from digit_predictions array.
}
else
{
reset predictions array.
append c to string[ ].
Increment i.
If (c already exists in string[0, i-1])
// 5
{
Let j be the index of the latest occurrence of c in string[0, i-1].
If (j == i-1)
{
append c to predictions array.
// 6
// 7
}
}
Predict(string, head).
Display predictions from predictions array.
}
End
Comments:
1. Stop when user hits carriage return.
2. When the user replaces a character with a similar-looking symbol.
This allows predictions for inputs such as p@$$w0rd.
3. Many users separate the words in a password string with _, @ or &. In this
case, discard the previous word from the prefix and predict the next word separately.
This allows predictions for inputs such as singh@punjab or jass_password.
4. Digits are stored and predicted separately. This allows proper predictions when
digits are interleaved with a word or embedded in the middle of one.
This allows predictions for inputs such as pass1234word or p1a1s1s1w1o1r1d.
5. To handle repetitions.
6. A single character is being repeated, e.g. errr.
7. A group of characters is being repeated, e.g. oyeoye.
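The character handling described in comments 2 to 7 can be sketched in Python as follows. This is a simplified illustration, not the original code. It processes a whole string instead of reading stdin, omits the colored display, and takes a hypothetical predict callable that stands in for the Predict algorithm. The make_predict helper and the group-repetition branch of repetition_guess are assumptions, since the pseudocode does not spell them out.

```python
LOOKALIKES = {"@": "a", "1": "l", "0": "o"}  # symbol -> letter, from the switch cases
SEPARATORS = {"_", "@", "&"}                 # comment 3


def make_predict(words):
    """Hypothetical stand-in for Predict: next characters of words starting with prefix."""
    def predict(prefix):
        return {w[len(prefix)] for w in words
                if w.startswith(prefix) and len(w) > len(prefix)}
    return predict


def track_prefixes(password, predict):
    """Feed password one character at a time; return (string, digit_string)
    as they stand after the last character."""
    string, digit_string = "", ""
    predictions = predict(string)
    for c in password:
        # Comment 2: undo a look-alike substitution if the letter was predicted.
        letter = LOOKALIKES.get(c)
        if letter is not None and letter in predictions:
            c = letter
        # Comment 3: a separator discards the previous word.
        if c in SEPARATORS:
            string, digit_string = "", ""
        # Comment 4: digits and special characters are tracked separately.
        if not c.isalpha():
            digit_string += c
        else:
            string += c
        predictions = predict(string)
    return string, digit_string


def repetition_guess(string):
    """Comments 5-7: guess the next character from a repetition, or return None."""
    if len(string) < 2:
        return None
    c = string[-1]
    j = string.rfind(c, 0, len(string) - 1)  # latest earlier occurrence of c
    if j == -1:
        return None
    if j == len(string) - 2:
        return c              # comment 6: single character repeated
    return string[j + 1]      # comment 7 (assumed): group of characters repeated


predict = make_predict(["password", "punjab", "singh"])
print(track_prefixes("p@ssw0rd", predict))
print(track_prefixes("pass1234word", predict))
print(repetition_guess("oyeoy"))
```

As in the pseudocode, a separator that is not undone as a look-alike resets both strings and is then stored in digit_string, because it also counts as a special character.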
Other Notes:
Because prediction time depends on the length of the prefix rather than on the size of the trie,
the database should contain all the common words and phrases. The