Java Program to Implement Bitap Algorithm for String Matching

The Bitap Algorithm is an approximate string matching algorithm. The algorithm tells whether a given text contains a substring which is “approximately equal” to a given pattern. Here approximately equal states that if the substring and pattern are within a given distance k of each other. The algorithm begins by precomputing a set of bitmasks containing one bit for each element of the pattern. This does most of the bitwise operations, which are quick.
The Bitap Algorithm is also known as shift-or, shift-and, or Baeza Yates Gonnet Algorithm.
Example:
Input: Text: zambiatek Pattern: geeks Output: Pattern found at index: 0 Input: Text: Youareawesome Pattern: Youareamazing Output: No Match.
Approach:
- Input the text pattern as a string.
 - We will convert this String to a simple Char Array
 - If the length is 0 or exceeding 63, we return “No Match”.
 - Now, by taking array R which complements 0.
 - Taking an array “pattern_mask” which complements 0, to 1
 - Taking the pattern as an index of “pattern_mask” then by using the and operator we and it with the result of the complement of 1L(long integer) by shifting it to left side by i times.
 - Now by running the loop till text length.
 - We now or it with R and pattern mask.
 - Now by left shifting the 1L by the length of the pattern and then the result is an and with R
 - If it is equal to zero we print it by I-len+1 else return -1
 - Output is the index of the text, where the pattern matches.
 
Code:
Java
// Java Program to implement Bitap Algorithm.import java.io.*;import java.io.IOException;public class GFG {    public static void main(String[] args)        throws IOException    {        System.out.println("Bitap Algorithm!");        String text = "zambiatek";        String pattern = "geeks";        // This is an object created of the class for        // extension of functions.        GFG g = new GFG();        // Now here we go to findPattern functions , we two        // arguments.        g.findPattern(text, pattern);    }    public void findPattern(String t, String p)    {        // we convert the String text to Character Array for        // easy indexing        char[] text = t.toCharArray();        // we convert the String pattern to Character Array        // to access each letter in the String easily.        char[] pattern = p.toCharArray();        // Index shows the function bitap search if they are        // equal at a particular index or not        int index = bitap_search(text, pattern);        // If the pattern is not equal to the text of the        // string approximately Then we tend to return -1 If        // index is -1 Then we print there is No match        if (index == -1) {            System.out.println("\nNo Match\n");        }        else {            // Else if there is a match            // Then we print the position of the index at            // where the pattern and the text matches.            System.out.println(                "\nPattern found at index: \n" + index);        }    }    private int bitap_search(char[] text, char[] pattern)    {        // Here the len variable is taken        // This variable accepts the pattern length as its        // value        int len = pattern.length;        // This is an array of pattern_mask of all        // character values in it.        long pattern_mask[]            = new long[Character.MAX_VALUE + 1];        // Here the variable of being long type is        // complemented with 1;        long R = ~1;        // Now if the length of the pattern is 0        // we would return -1        if (len == 0) {            return -1;        }        // Or if the length of the pattern exceeds the        // length of the character array Then we would        // declare that the pattern is too long. We would        // return -1               if (len > 63) {            System.out.println("Pattern too long!");            return -1;        }        // Now filling the values in the pattern mask        // We would run th eloop until the max value of        // character Initially all the values of Character        // are put up inside the pattern mask array And        // initially they are complemented with zero               for (int i = 0; i <= Character.MAX_VALUE; ++i)            pattern_mask[i] = ~0;        // Now len being the variable of pattern length ,        // the loop is set  till there Now the pattern being        // the index of the pattern_mask 1L means the long        // integer is shifted to left by i times The result        // of that is now being complemented the result of        // the above is now being used as an and operator We        // and the pattern_mask and the result of it               for (int i = 0; i < len; ++i)            pattern_mask[pattern[i]] &= ~(1L << i);               // Now the loop is made to run until the text length        // Now what we do this the R array is used        // as an Or function with pattern_mask at index of        // text of i        for (int i = 0; i < text.length; ++i) {                       R |= pattern_mask];            // Now  result of the r after the above            // operation            // we shift it to left side by 1 time            R <<= 1;            // If the 1L long integer if shifted left of the            // len And the result is used to and the result            // and R array            // If that result is equal to 0            // We return the index value            // Index=i-len+1                       if ((R & (1L << len)) == 0)                return i - len + 1;        }        // if the index is not matched        // then we return it as -1        // stating no match found.               return -1;    }} | 
Output
Bitap Algorithm! Pattern found at index: 0
				
					


