Adjacent duplicate sequences

Please help me resolve this issue.
I need an algorithm that finds all consecutive (adjacent) duplicates in a sequence, i.e. the longest substrings/subsequences that repeat back to back.


For example (indices are 0-based and inclusive):

    subsequence_1 = ['A', 'C', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'E', 'E']

    [9 ... 10] = ['E', 'E']
    [3 ... 8]  = ['B', 'A']
    [2 ... 7]  = ['A', 'B']

    subsequence_2 = ['A', 'C', 'A', 'B', 'B', 'A', 'B', 'B', 'A', 'B', 'B', 'Z', 'Z', 'Z', 'E', 'N', 'D']

    [11 ... 13] = ['Z']
    [2 ... 10]  = ['A', 'B', 'B']
    [3 ... 8]   = ['B', 'B', 'A', 'B', 'B', 'A']
    [4 ... 9]   = ['B', 'A', 'B', 'B', 'A', 'B']
    

I used the Main-Lorentz algorithm to find duplicates,
but it only finds tandems.
This is not exactly what I need:
sometimes the sequence is repeated more than twice.


I need to implement this functionality in Python.
Please help me find an algorithm that solves my problem.
It would be nice to see a ready-made Python implementation.
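
For reference, here is a minimal brute-force sketch of the search described above. It is not the Main-Lorentz algorithm or any other linear-time method: it simply tries every period and extends each run for as long as `seq[j] == seq[j + period]` holds, which is quadratic in the worst case. The function name `find_adjacent_repeats` and the `(start, end, unit, copies)` result shape are assumptions made for illustration, and the sequence is assumed to be a list.

```python
def find_adjacent_repeats(seq):
    """Return (start, end, unit, copies) for every block that repeats back to back.

    Indices are 0-based and inclusive. Brute force, roughly O(n^2).
    """
    n = len(seq)
    results = []
    for period in range(1, n // 2 + 1):
        i = 0
        while i + period < n:
            if seq[i] != seq[i + period]:
                i += 1
                continue
            # Extend the run while each element equals the one `period` ahead.
            j = i
            while j + period < n and seq[j] == seq[j + period]:
                j += 1
            run_len = j + period - i  # the run covers seq[i : j + period]
            # Each of the first `period` offsets yields a different rotation
            # of the repeated unit (e.g. ['A', 'B'] and ['B', 'A']).
            for offset in range(period):
                start = i + offset
                copies = (run_len - offset) // period
                if copies >= 2:
                    end = start + copies * period - 1
                    results.append((start, end, seq[start:start + period], copies))
            i = j + 1  # position j mismatched, so the next run starts after it
    return sorted(results)


subsequence_1 = ['A', 'C', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'E', 'E']
for start, end, unit, copies in find_adjacent_repeats(subsequence_1):
    print(f"[{start} ... {end}] = {unit} x {copies}")
```

Note that this sketch reports every such run, including short ones nested inside longer ones (for `subsequence_2` it also reports the `['B', 'B']` pairs), and it prints the repeated unit plus a copy count rather than the whole span, so the output would still need filtering to match the lists above exactly.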

asked May 19, 2016 by romensd

1 Answer

answered May 20, 2016 by Bharat Singh

It is not exactly what I need.

For the sequence [X,A,A,B,A,A,B,A,A,B,X,A,B,C,A,B,C,A,B,C] I expect to get the result

[X, A, (1 A removed), B, A, (1 A removed), B, A, (1 A removed), B, X, A, B, C, (2 ABC removed)]

Maybe the Kolpakov-Kucherov or Landau-Schmidt algorithms would work?
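
One possible reading of the expected result above, as a rough sketch only: collapse the shortest repeated units first, and never collapse a block that overlaps one already collapsed. That tie-breaking rule is an assumption inferred from the single example (it explains why `A, A` is collapsed but `A, A, B` repeated three times is not). The name `collapse_adjacent_repeats` is hypothetical, the input is assumed to be a list, and this is plain brute force, not the Kolpakov-Kucherov or Landau-Schmidt algorithm.

```python
def collapse_adjacent_repeats(seq):
    """Collapse back-to-back repeats, shortest unit first.

    Returns (collapsed, removed), where removed holds
    (start, unit, copies_removed) with 0-based starts in the original list.
    """
    n = len(seq)
    claimed = [False] * n  # positions already covered by a collapsed block
    spans = {}             # start index -> (period, copies)
    for period in range(1, n // 2 + 1):
        i = 0
        while i + 2 * period <= n:
            unit = seq[i:i + period]
            copies = 1
            # Count how many times the unit repeats immediately after itself.
            while seq[i + copies * period:i + (copies + 1) * period] == unit:
                copies += 1
            end = i + copies * period
            if copies >= 2 and not any(claimed[i:end]):
                claimed[i:end] = [True] * (end - i)
                spans[i] = (period, copies)
                i = end
            else:
                i += 1
    collapsed, removed = [], []
    i = 0
    while i < n:
        if i in spans:
            period, copies = spans[i]
            unit = seq[i:i + period]
            collapsed.extend(unit)  # keep one copy of the unit
            removed.append((i, unit, copies - 1))
            i += period * copies
        else:
            collapsed.append(seq[i])
            i += 1
    return collapsed, removed


seq = list("XAABAABAABXABCABCABC")
collapsed, removed = collapse_adjacent_repeats(seq)
print(collapsed)
for start, unit, count in removed:
    print(f"at {start}: {count} x {''.join(unit)} removed")
```

On the sequence from this comment, the sketch keeps X, A, B, A, B, A, B, X, A, B, C and records one removed A at three places plus two removed copies of ABC. Other inputs may call for a different rule, for example preferring the longest unit.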


...