The short script below shows how to combine CSV files without duplicating headers using C#. The technique assumes that all of the files have the same structure and that each contains a header row.
The timings in this post come from combining 8 CSV files with 13 columns and a combined total of 9.2 million rows.
I first tried combining the files with the PowerShell technique described here. It was painfully slow and took an hour and a half! This is likely because it deserializes and then serializes every bit of data in the files, which adds a lot of unnecessary overhead.
Next I tried the C# script below using LINQPad. When reading from and writing to a network share, it took 3 minutes and 56 seconds. Much better! Running it against a local SSD brought that down to just 44 seconds. To recap:
- PowerShell with deserialization/serialization: 90 minutes
- C# with source and destination on a network drive: 3 minutes and 56 seconds
- C# with source and destination on a local SSD: 44 seconds
// C# script that combines multiple CSV files without duplicating headers.
// Combining 8 files with a combined total of about 9.2 million rows
// took just under 4 minutes on a network share and 44 seconds on an SSD.
string sourceFolder = @"C:\CSV_Files";
string destinationFile = @"C:\CSV_Files\CSV_Files_Combined.csv";

// Wildcard search that matches the CSV files to be combined
string[] filePaths = Directory.GetFiles(sourceFolder, "CSV_File_Number?.csv");

using (StreamWriter fileDest = new StreamWriter(destinationFile, true))
{
    for (int i = 0; i < filePaths.Length; i++)
    {
        string file = filePaths[i];
        string[] lines = File.ReadAllLines(file);
        if (i > 0)
        {
            lines = lines.Skip(1).ToArray(); // Skip header row for all but the first file
        }
        foreach (string line in lines)
        {
            fileDest.WriteLine(line);
        }
    }
}
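Because File.ReadAllLines loads an entire file into memory before any of it is written out, very large inputs can put pressure on memory. A streaming variant using File.ReadLines keeps memory use flat by reading one line at a time. This is a minimal sketch, not the post's original script; the CsvCombiner class and Combine method names are mine:

```csharp
using System;
using System.IO;
using System.Linq;

static class CsvCombiner
{
    // Streams each source file line by line instead of loading it whole,
    // so memory use stays flat regardless of file size.
    public static void Combine(string[] sourceFiles, string destinationFile)
    {
        using (var writer = new StreamWriter(destinationFile))
        {
            for (int i = 0; i < sourceFiles.Length; i++)
            {
                // File.ReadLines is lazy; nothing is read until enumeration.
                var lines = File.ReadLines(sourceFiles[i]);

                // Skip the header row for every file after the first.
                foreach (string line in (i > 0 ? lines.Skip(1) : lines))
                {
                    writer.WriteLine(line);
                }
            }
        }
    }
}
```

Note this version also opens the destination without the append flag, so rerunning it overwrites the combined file instead of growing it.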
Let me know what you think in the comments. And if you know of a faster way to accomplish this, I’d love to hear it!