I am writing a program that needs to be able to process data from a number of different sources. The sources output data in a variety of formats. So, depending on which source is being used, I need to initialize a string list with a different set of column headers. The quantity of headers varies between 10 and 18 columns depending on the data source.
Essentially, I need a clean method to conditionally initialize a list. I have three working solutions but I am not sure which is the cleanest/best practice:
- Example #1 requires that I add each header to the list line-by-line. With the number of potential sources and columns that need to be added for each, this list gets pretty long.
- Example #2 requires that I use {} around each case statement to limit the scope so that I can declare the dataHeaders list individually in each case statement. I have never encountered a situation wherein I had to limit the scope to an individual case statement and as such, I suspect it is bad practice.
- Example #3 seems like the best option. I actually implemented this one as I was writing this question. I don’t really see any drawbacks here but I thought I’d still pose the question anyhow.
Is there a better/cleaner way to do this? If not, which example is the best practice?
Example #1:
List<string> headers = new List<string>();
switch (dataSource)
{
case "Source 1":
headers.Add("header1");
headers.Add("header2");
headers.Add("header3");
headers.Add("...");
headers.Add("header18");
break;
case "Source 2":
headers.Add("headerA");
headers.Add("headerB");
headers.Add("headerC");
headers.Add("...");
headers.Add("headerJ");
break;
default:
// TODO: Handle bad sources.
break;
}
Example #2:
List<string> headers = new List<string>();
switch (dataSource)
{
case "Source 1":
{
List<string> dataHeaders = new List<string>() { "header1", "header2", "header3", "...", "header18" }
headers = dataHeaders;
break;
}
case "Source 2":
{
List<string> dataHeaders = new List<string>() { "headerA", "headerB", "headerC", "...", "headerJ" }
headers = dataHeaders;
break;
}
default:
{
// TODO: Handle bad sources.
break;
}
}
Example #3:
List<string> headers = new List<string>();
switch (dataSource)
{
case "Source 1":
headers.AddRange(new List<string>() { "header1", "header2", "header3", "...", "header18" })
break;
case "Source 2":
headers.AddRange(new List<string>() { "headerA", "headerB", "headerC", "...", "headerJ" })
break;
default:
// TODO: Handle bad sources.
break;
}
2
A lot of times, you can really clean up a problem like this by expressing your data in a form with less syntax than a programming language, such as a CSV string. I don’t know C#, but an example in Scala is:
def headersForDataSource(dataSource: String): Option[Array[String]] = {
val headers = """
Source1,header1,header2,header3,...,header18
Source2,headerA,headerB,headerC,...,headerJ
"""
val lines = headers.lines map (_.trim split ',')
lines find (_(0) == dataSource) map (_.tail)
}
You can see this makes your list of headers significantly more compact and easier to read, because it doesn’t clutter and intermingle it with the syntax of the data structure. It’s putting it in your preferred format, instead of the format forced on you by the programming language.
What is the issue you’re trying to fix?
If you’re looking to improve performance, then this is definitely a case of premature optimisation. Any of the three methods you give could fill a list with thousands of items in the fraction of a second it takes to lift your finger from the mouse after clicking on a button.
If you’re just trying to make your code less verbose, then a variation on option 3 is the best.
var headers = new List<string>();
switch (dataSource)
{
case "Source 1":
headers.AddRange(new [] { "header1", "header2", "header3", "...", "header18" })
break;
case "Source 2":
headers.AddRange(new [] { "headerA", "headerB", "headerC", "...", "headerJ" })
break;
default:
// TODO: Handle bad sources.
break;
}
Note that you don’t have to specify the type of the list, or the arrays for the elements – these will be automatically inferred by the compiler.
You also can avoid switch
and if
using maps
. I’m going to use Java syntaxis, but I guess it could be done in C# easily
Unmodificable (static or constant) Descriptor
Map<String,String[]> HEADERS_MAP = new HashMap<String,String[]>();
HEADERS_MAP.put("Source 1",new String[]{"header1","headerN"});
HEADERS_MAP.put("Source 2",new String[]{"headerA","headerZ"});
...
//Preventing Map from changes
HEADERS_MAP = Collections.unmodifiableMap(HEADERS_MAP);
Note: I think Maps are Dictorinary in C#
Header builder
//Each consumer will have it's onw list
return Arrays.asList(HEADERS_MAP.get(dataSource));
Edit:
I have edited by first answer @unholysampler’s comment.
Now Map stores String[] and builder
returns a new fresh/unbound list. So builder consumer will be able to modify the content of the list and it will prevent code from unwanted changes. It could be done with Collections.unmodificableList
but that way adds little bit more code and complexity. It does code less readable.
2
I think a slightly modified option 2 is your best bet. Leave the declaration of headers outside the switch, and inside the switch just set headers = new List
You also don’t need the curly braces to create blocks for each case statement; C# will force you to have break at the end of each case.
The advantage of doing this is that if you add a new case, you’ll be warned that headers isn’t initialized if you try to use it later, where as your current #2 will just end up with an empty list, which may be a logic error to track down later.
Here is shorter version, combining suggestions of Peregrine and Karl Bielefeldt:
string hline;
switch (dataSource)
{
case "Source 1":
hline = "header1,header2,header3,...,header18";
break;
case "Source 2":
hline = "headerA,headerB,headerC,...,headerZ";
break;
default:
// TODO: Handle bad sources.
break;
}
var headers = new List<string>();
headers.AddRange(hline.Split(','));
If you can refer to your different data sources by an index from a consecutive interval, I would consider to avoid the whole switch-case at all, and initialize hline
like this:
string hLine = new []{
"header1,header2,header3,...,header18",
"headerA,headerB,headerC,...,headerZ",
"..."}[dataSourceIndex];
Direct answer
Based on the options presented, I’d err towards #2 (i.e. collection initializer), but dataHeaders
is an unnecessary inbetween step (just assign the new list to headers
directly) and I do prefer multiline here for readability’s sake.
Improving the approach
However, your suggested solutions all give me a magic string bee in my bonnet.
These headers might change one day, when one of the sources decides to update its own code, or when you shift towards using new sources. If you push this header configuration to a config file (of any sort you’d like), then you get the best of both worlds:
- No more magic string header values.
- Your code no longer needs to worry about how to initialize a bunch of string values.
- Adding more sources no longer requires adding more code, which is OCP-friendlier.
There are plenty of ways to write this data to an external file. To keep things simple, I’m just going to assume a flat txt file whose filename is the source name.
// Source 1.txt
header1
header2
header3
...
header18
Your code can then parse this file, instead of needing to juggle all the values in the file.
public IEnumerable<string> GetHeadersForSource(string sourceName)
{
var filepath = Path.Combine(myDirectory, $"{sourceName}.txt");
if(!File.Exists(filepath))
{
// return string[0];
// OR
// throw new FileNotFoundException("...", filepath);
}
var headers = File.ReadAllLines(filepath)
.Select(line => line.Trim())
.Where(line => !String.IsNullOrWhitespace(line))
.ToList();
return headers;
}
And now your entire switch case can be replaced with a single statement:
var headers = GetHeadersForSource(dataSource);
Disclaimer
This example focused on the header values. You’re still using a magic string for sources themselves, which should also be changed, but I considered to be out of scope for the posted question.
Why not use a Dictionary?
var headers = new List<string>();
var headersToAdd = new Dictionary<string, string[]>
{
{"Source 1", new string[] {"header1", "header2"} },
{"Source 2", new string[] {"headerA", "headerB"} }
};
headers.AddRange(headersToAdd["Source 1"]);
headers.AddRange(headersToAdd["Source 2"]);
1
If you prefer a functional style with immutability:
static IEnumerable<string> GetHeaders(string dataSource)
{
switch (dataSource)
{
case "Source 1":
yield return "header1";
yield return "header2";
yield return "header3";
yield return "...";
yield return "header18";
yield break;
case "Source 2":
yield return "headerA";
yield return "headerB";
yield return "headerC";
yield return "...";
yield return "headerJ";
yield break;
default:
// TODO: Handle bad sources.
yield break;
}
}
2