Relative Content

Tag Archive for algorithmpermutationset-difference

Efficient algorithm for finding the non-overlapping set of two very large sets of strings generated by permutation

I am looking for the most efficient algorithm for comparing large set of strings that are representing possible amino acid substitutions at various positions of a 9mer peptide.
As a toy example, the matrix M1

1 2 3
A 1 1 1
B 1 1 0
C 0 1 0
defines the set of three-letter strings which have an A, B, or C at positions 1, 2, and 3 when a “1” exists in the corresponding row in the matrix (set1).

Matrices in which one or more “1”s are replaced by “0” define subsets of set1. For example a set2 comprising the strings “ABA”, “ACA” can be defined by a matrix M2

1 2 3
A 1 0 1
B 0 1 0
C 0 1 0