For example, I have an array of strings like the following:
364VMS1029
364VMSH920
364VMSH192
364VMSU839
364VMN2382
364VMR223
364VMR2X3
364VMN829
364VMN8757
364VMN831
How can I get the program to dynamically recognise the characters common to all strings in the array (in this case 364VM) and filter them out?
If there are no common characters, it should do nothing.
When you have a complicated problem that you can’t solve, try breaking it down into simpler problems and solve those.
Your problem rephrased:
Remove the common prefix and suffix from a list of strings.
A simpler problem would be:
Find the common prefix for a pair of strings.
This should be much simpler to solve, for example like this:
string GetPrefix(string first, string second)
{
    // Count matching characters from the start; stop at the first difference.
    int prefixLength = 0;
    for (int i = 0; i < Math.Min(first.Length, second.Length); i++)
    {
        if (first[i] != second[i])
            break;
        prefixLength++;
    }
    return first.Substring(0, prefixLength);
}
Now that you have this, you can build back to the original problem:
Find the common prefix for a list of strings.
Here, it’s very helpful to realize that the prefix of three strings is the same as the prefix of the prefix of the first two strings and the third string. (Hmm, that sounds confusing, maybe it will be clearer in a more formal notation: prefix(A, B, C) = prefix(prefix(A, B), C).)
This means that you can use the LINQ method Aggregate() on the GetPrefix() method above to get the prefix of a whole list of strings:
string GetPrefix(IEnumerable<string> strings)
{
    return strings.Aggregate(GetPrefix);
}
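To see the reduction at work on the IDs from the question, here is a minimal self-contained sketch. The class name `PrefixDemo` and the method name `GetCommonPrefix` are hypothetical (the list version is renamed only to keep the two methods visually distinct), and the `DefaultIfEmpty` guard is an added assumption: `Aggregate()` without a seed throws `InvalidOperationException` on an empty sequence, so you may want to decide what an empty list should return.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixDemo
{
    // Common prefix of two strings, compared character by character.
    public static string GetPrefix(string first, string second)
    {
        int limit = Math.Min(first.Length, second.Length);
        int prefixLength = 0;
        while (prefixLength < limit && first[prefixLength] == second[prefixLength])
            prefixLength++;
        return first.Substring(0, prefixLength);
    }

    // Hypothetical name. Folds GetPrefix over the list:
    // prefix(A, B, C) = prefix(prefix(A, B), C).
    // Assumption: an empty list yields an empty prefix instead of throwing.
    public static string GetCommonPrefix(IEnumerable<string> strings)
    {
        return strings.DefaultIfEmpty(string.Empty).Aggregate(GetPrefix);
    }

    public static void Main()
    {
        var ids = new[] { "364VMS1029", "364VMSH920", "364VMN2382", "364VMR223" };
        Console.WriteLine(GetCommonPrefix(ids)); // prints 364VM
    }
}
```

The intermediate results are "364VMS" after the first two strings and "364VM" once a string starting with 364VMN is folded in.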
The next step:
Remove the common prefix from a list of strings.
Now that we can find the common prefix, we can remove it using LINQ Select() and Substring():
IEnumerable<string> RemovePrefix(IEnumerable<string> strings)
{
    var prefix = GetPrefix(strings);
    // Substring(startIndex) returns everything from startIndex to the end.
    return strings.Select(s => s.Substring(prefix.Length));
}
This assumes you want to get a new sequence containing the filtered strings. If you want to modify an existing collection, use a for loop instead of Select().
One last step:
Remove the common prefix and suffix from a list of strings.
I’ll leave this as an exercise for the reader. This answer contains simple code for reversing a string, which could be helpful. (I think you don’t need the code from other answers to that question, since it looks like your strings are ASCII-only.)
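For readers who want to check their solution, here is one possible sketch of that last step. It does not reverse the strings; it compares characters from the end instead. The names `AffixTrimmer`, `GetSuffix` and `RemoveCommonAffixes` are hypothetical, the sample IDs have a trailing X appended purely for illustration, and the suffix is computed only after the prefix has been stripped so that the two can never overlap (for example when all strings are identical).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AffixTrimmer
{
    // Common prefix of two strings.
    public static string GetPrefix(string first, string second)
    {
        int limit = Math.Min(first.Length, second.Length);
        int length = 0;
        while (length < limit && first[length] == second[length])
            length++;
        return first.Substring(0, length);
    }

    // Common suffix of two strings, found by walking backwards from the end.
    public static string GetSuffix(string first, string second)
    {
        int limit = Math.Min(first.Length, second.Length);
        int length = 0;
        while (length < limit &&
               first[first.Length - 1 - length] == second[second.Length - 1 - length])
            length++;
        return first.Substring(first.Length - length);
    }

    // Hypothetical helper: strips the common prefix, then the common suffix
    // of what remains. An empty input is returned unchanged.
    public static List<string> RemoveCommonAffixes(IEnumerable<string> strings)
    {
        var items = strings.ToList();
        if (items.Count == 0)
            return items;

        string prefix = items.Aggregate(GetPrefix);
        var withoutPrefix = items.Select(s => s.Substring(prefix.Length)).ToList();

        string suffix = withoutPrefix.Aggregate(GetSuffix);
        return withoutPrefix
            .Select(s => s.Substring(0, s.Length - suffix.Length))
            .ToList();
    }

    public static void Main()
    {
        // Illustrative data only: the question's IDs with an "X" appended.
        var result = RemoveCommonAffixes(new[] { "364VMS1029X", "364VMSH920X", "364VMN831X" });
        Console.WriteLine(string.Join(", ", result)); // prints S1029, SH920, N831
    }
}
```

Note that with a single-element list the whole string counts as the common prefix, so the result is an empty string; whether that is desirable depends on your use case.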
You have to compare every character at the beginning of the strings to see how long the longest common prefix is, and stop once you’ve found a difference.
(Depending on how expensive string access is vs. sorting, you might save time by first sorting the list and then looking only at the first and last string: because of the sort order, whatever prefix they have in common will also be common to all strings between them. But it is a matter of profiling to find out whether this saves or adds time.)
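The sorting idea could look roughly like the sketch below. The names `SortedPrefix` and `GetCommonPrefixBySorting` are hypothetical, and the ordinal comparer is an assumption: the first/last argument relies on a plain character-by-character ordering, which a culture-sensitive sort does not necessarily provide.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedPrefix
{
    // Hypothetical helper: sorts a copy of the input, then compares only
    // the first and last entries.
    public static string GetCommonPrefixBySorting(IEnumerable<string> strings)
    {
        var sorted = strings.ToList();
        if (sorted.Count == 0)
            return string.Empty;

        // Assumption: ordinal (character code) ordering.
        sorted.Sort(StringComparer.Ordinal);

        string first = sorted[0];
        string last = sorted[sorted.Count - 1];

        int limit = Math.Min(first.Length, last.Length);
        int length = 0;
        while (length < limit && first[length] == last[length])
            length++;
        return first.Substring(0, length);
    }

    public static void Main()
    {
        var ids = new[] { "364VMS1029", "364VMR223", "364VMN831", "364VMSU839" };
        Console.WriteLine(GetCommonPrefixBySorting(ids)); // prints 364VM
    }
}
```

Only the smallest and largest strings are actually needed, so a single pass that tracks the ordinal minimum and maximum would avoid the full sort; as the answer says, profiling is the way to tell which variant is faster on your data.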