Distinct elements in subarray using Mo’s Algorithm

Given an array ‘a[]’ of size n and number of queries q. Each query can be represented by two integers l and r. Your task is to print the number of distinct integers in the subarray l to r. Given a[i] <= 10^6

Examples :

Input : a[] = {1, 1, 2, 1, 2, 3}
        q = 3
        0 4
        1 3
        2 5
Output : 2
         2
         3
In query 1, number of distinct integers
in a[0...4] is 2 (1, 2)
In query 2, number of distinct integers 
in a[1..3] is 2 (1, 2)
In query 3, number of distinct integers 
in a[2..5] is 3 (1, 2, 3)

Input : a[] = {7, 3, 5, 9, 7, 6, 4, 3, 2}
        q = 4
        1 5
        0 4
        0 7
        1 8
Output : 5
         4
         6
         7

Let a[0…n-1] be input array and q[0..m-1] be array of queries. 

Approach :

  1. Sort all queries in a way that queries with L values from 0 to \sqrt(n) - 1     are put together, then all queries from \sqrt(n)     to 2*\sqrt(n) - 1     , and so on. All queries within a block are sorted in increasing order of R values.
  2. Initialize an array freq[] of size 10^6     with 0 . freq[] array keep count of frequencies of all the elements in lying in a given range.
  3. Process all queries one by one in a way that every query uses number of different elements and frequency array computed in previous query and stores the result in structure.
    • Let ‘curr_Diff_element’ be number of different elements of previous query.
    • Remove extra elements of previous query. For example if previous query is [0, 8] and current query is [3, 9], then remove a[0], a[1] and a[2]
    • Add new elements of current query. In the same example as above, add a[9].
  4. Sort the queries in the same order as they were provided earlier and print their stored results

Adding elements()

  • Increase the frequency of element to be added(freq[a[i]]) by 1.
  • If frequency of element a[i] is 1.Increase curr_diff_element by 1 as 1 new element has been added in range.

Removing elements() 

  • Decrease frequency of element to be removed (a[i]) by 1.
  • if frequency of an element a[i] is 0.Just decrease curr_diff_element by 1 as 1 element has been completely removed from the range.

Note : 

In this algorithm, in step 2, index variable for R change at most O(n * \sqrt(n)     ) times throughout the run and same for L changes its value at most O(m * \sqrt(n)     ) times. All these bounds are possible only because sorted queries first in blocks of \sqrt(n)     size. The preprocessing part takes O(m Log m) time. Processing all queries takes O(n * \sqrt(n)     ) + O(m * \sqrt(n)     ) = O((m+n) *\sqrt(n)     ) time. 

Below is the implementation of above approach : 

CPP




// Program to compute no. of different elements
// of ranges for different range queries
#include <bits/stdc++.h>
using namespace std;
 
// Used in frequency array (maximum value of an
// array element).
const int MAX = 1000000;
 
// Variable to represent block size. This is made
// global so compare() of sort can use it.
int block;
 
// Structure to represent a query range and to store
// index and result of a particular query range
struct Query {
    int L, R, index, result;
};
 
// Function used to sort all queries so that all queries
// of same block are arranged together and within a block,
// queries are sorted in increasing order of R values.
bool compare(Query x, Query y)
{
    // Different blocks, sort by block.
    if (x.L / block != y.L / block)
        return x.L / block < y.L / block;
 
    // Same block, sort by R value
    return x.R < y.R;
}
 
// Function used to sort all queries in order of their
// index value so that results of queries can be printed
// in same order as of input
bool compare1(Query x, Query y)
{
    return x.index < y.index;
}
 
// calculate distinct elements of all query ranges.
// m is number of queries n is size of array a[].
void queryResults(int a[], int n, Query q[], int m)
{
    // Find block size
    block = (int)sqrt(n);
 
    // Sort all queries so that queries of same
    // blocks are arranged together.
    sort(q, q + m, compare);
 
    // Initialize current L, current R and current
    // different elements
    int currL = 0, currR = 0;
    int curr_Diff_elements = 0;
 
    // Initialize frequency array with 0
    int freq[MAX] = { 0 };
 
    // Traverse through all queries
    for (int i = 0; i < m; i++) {
         
        // L and R values of current range
        int L = q[i].L, R = q[i].R;
 
        // Remove extra elements of previous range.
        // For example if previous range is [0, 3]
        // and current range is [2, 5], then a[0]
        // and a[1] are subtracted
        while (currL < L) {
             
            // element a[currL] is removed
            freq[a[currL]]--;
            if (freq[a[currL]] == 0)
                curr_Diff_elements--;
             
            currL++;
        }
 
        // Add Elements of current Range
        // Note:- during addition of the left
        // side elements we have to add currL-1
        // because currL is already in range
        while (currL > L) {
            freq[a[currL - 1]]++;
 
            // include a element if it occurs first time
            if (freq[a[currL - 1]] == 1)
                curr_Diff_elements++;
             
            currL--;
        }
        while (currR <= R) {
            freq[a[currR]]++;
 
            // include a element if it occurs first time
            if (freq[a[currR]] == 1)
                curr_Diff_elements++;
             
            currR++;
        }
 
        // Remove elements of previous range. For example
        // when previous range is [0, 10] and current range
        // is [3, 8], then a[9] and a[10] are subtracted
        // Note:- Basically for a previous query L to R
        // currL is L and currR is R+1. So during removal
        // of currR remove currR-1 because currR was
        // never included
        while (currR > R + 1) {
 
            // element a[currL] is removed
            freq[a[currR - 1]]--;
 
            // if occurrence of a number is reduced
            // to zero remove it from list of
            // different elements
            if (freq[a[currR - 1]] == 0)
                curr_Diff_elements--;
             
            currR--;
        }
        q[i].result = curr_Diff_elements;
    }
}
 
// print the result of all range queries in
// initial order of queries
void printResults(Query q[], int m)
{
    sort(q, q + m, compare1);
    for (int i = 0; i < m; i++) {
        cout << "Number of different elements" <<
            " in range " << q[i].L << " to "
            << q[i].R << " are " << q[i].result << endl;
    }
}
 
// Driver program
int main()
{
    int a[] = { 1, 1, 2, 1, 3, 4, 5, 2, 8 };
    int n = sizeof(a) / sizeof(a[0]);
    Query q[] = { { 0, 4, 0, 0 }, { 1, 3, 1, 0 },
                { 2, 4, 2, 0 } };
    int m = sizeof(q) / sizeof(q[0]);
    queryResults(a, n, q, m);
    printResults(q, m);
    return 0;
}


Java




//Java program to compute no. of different elements
// of ranges for different range queries
 
import java.io.*;
import java.util.*;
 
class GFG {
     
    // Class to represent a query range and to store
    // index and result of a particular query range
    public static class Query{
        int L, R, index, result;
        Query(int l, int r, int i, int res){
            this.L = l;
            this.R = r;
            this.index = i;
            this.result = res;
        }
    }
     
    // Used in frequency array (maximum value of an
    // array element).
   public static  int MAX = 1000000;
       
    // Variable to represent block size. This is made
    // global so compare() of sort can use it.
    public static int block;
     
    // calculate distinct elements of all query ranges.
    // m is number of queries n is size of array a[].
    public static void queryResults(int a[], int n, Query q[], int m)
    {
        // Find block size
        block = (int)Math.sqrt(n);
       
        // Sort all queries so that all queries
        // of same block are arranged together and within a block,
        // queries are sorted in increasing order of R values.
        Arrays.sort(q, new Comparator<Query>() {
            public int compare(Query x, Query y) {
                // Different blocks, sort by block.
                if (x.L / block != y.L / block)
                    return x.L / block < y.L / block?1:-1;
               
                // Same block, sort by R value
                return x.R < y.R?1:-1;
            }
        });
       
        // Initialize current L, current R and current
        // different elements
        int currL = 0, currR = 0;
        int curr_Diff_elements = 0;
       
        // Initialize frequency array with 0
        int freq[] = new int[MAX];
        Arrays.fill(freq,0);
       
        // Traverse through all queries
        for (int i = 0; i < m; i++) {
               
            // L and R values of current range
            int L = q[i].L, R = q[i].R;
       
            // Remove extra elements of previous range.
            // For example if previous range is [0, 3]
            // and current range is [2, 5], then a[0]
            // and a[1] are subtracted
            while (currL < L) {
                   
                // element a[currL] is removed
                freq[a[currL]]--;
                if (freq[a[currL]] == 0) {
                    curr_Diff_elements--;
                }
                currL++;
            }
       
            // Add Elements of current Range
            // Note:- during addition of the left
            // side elements we have to add currL-1
            // because currL is already in range
            while (currL > L) {
                freq[a[currL - 1]]++;
       
                // include a element if it occurs first time
                if (freq[a[currL - 1]] == 1) {
                    curr_Diff_elements++;
                }
                currL--;
            }
            while (currR <= R) {
                freq[a[currR]]++;
       
                // include a element if it occurs first time
                if (freq[a[currR]] == 1){
                    curr_Diff_elements++;
                }
                currR++;
            }
       
            // Remove elements of previous range. For example
            // when previous range is [0, 10] and current range
            // is [3, 8], then a[9] and a[10] are subtracted
            // Note:- Basically for a previous query L to R
            // currL is L and currR is R+1. So during removal
            // of currR remove currR-1 because currR was
            // never included
            while (currR > R + 1) {
       
                // element a[currL] is removed
                freq[a[currR - 1]]--;
       
                // if occurrence of a number is reduced
                // to zero remove it from list of
                // different elements
                if (freq[a[currR - 1]] == 0){
                    curr_Diff_elements--;
                }
                currR--;
            }
            q[i].result = curr_Diff_elements;
        }
    }
     
    // print the result of all range queries in
    // initial order of queries
    public static void printResults(Query q[], int m)
    {
        // FSort all queries in order of their
        // index value so that results of queries can be printed
        // in same order as of input
        Arrays.sort(q, new Comparator<Query>() {
            public int compare(Query x, Query y) {
                return (x.index < y.index)?-1:1;
            }
        });
         
         
        for (int i = 0; i < m; i++) {
            System.out.println("Number of different elements in range " + q[i].L + " to " + q[i].R + " are " + q[i].result);
        }
    }
     
    //Driver Code
    public static void main (String[] args) {
        int a[] = { 1, 1, 2, 1, 3, 4, 5, 2, 8 };
        int n = a.length;
        Query q[] = new Query[3];
        q[0]=new Query(0,4,0,0);
        q[1] = new Query(1, 3, 1, 0);
        q[2] = new Query( 2, 4, 2, 0);
        int m = q.length;
        queryResults(a, n, q, m);
        printResults(q, m);
    }
}
//This code is contributed by shruti456rawal


Output

Number of different elements in range 0 to 4 are 3
Number of different elements in range 1 to 3 are 2
Number of different elements in range 2 to 4 are 3
Feeling lost in the world of random DSA topics, wasting time without progress? It’s time for a change! Join our DSA course, where we’ll guide you on an exciting journey to master DSA efficiently and on schedule.
Ready to dive in? Explore our Free Demo Content and join our DSA course, trusted by over 100,000 zambiatek!

Related Articles

Leave a Reply

Your email address will not be published. Required fields are marked *

Check Also
Close
Back to top button