Read the Top N Rows from Large Text Files with C#

This post describes one way that you can read the top N rows from large text files with C#. This is very useful when working with giant files that are too big to open, but you need to view a portion of them to determine the schema, data types, etc. I’ve used PowerShell many times to do this with large csv files, but in this example we’re going to use C# and look at the Wikipedia XML dump of pages and articles....

March 17, 2017 · Chris Koester

Combine CSV Files using C#

The simple script below shows how to combine csv files without duplicating headings using C#. This technique assumes that all of the files have the same structure, and that all of them contain a header row. using System.IO; using System.Linq; namespace CombineCsvFiles { class Program { static void Main(string[] args) { var sourceFolder = @"C:\csv_files"; var destinationFile = @"C:\csv_files\csv_files_combined.csv"; // Wildcard search returns files with 1-2 digit suffix string[] filePaths = Directory....

January 27, 2017 · Chris Koester

Delete SSAS Partitions with TMSL and PowerShell

This post demonstrates how to delete SSAS partitions with TMSL and PowerShell. TMSL stands for Tabular Model Scripting Language, and it is used for working with SSAS Tabular databases programmatically at compatibility level 1200 (SQL Server 2016). Multidimensional models and previous versions of SQL Server all use Analysis Services Scripting Language (ASSL for XMLA). A common use for TMSL is automating the processing of tabular models using the SQL Agent, SSIS, PowerShell, or some other application....

January 5, 2017 · Chris Koester

Query XML on the Web with C# and SSIS

This post describes how to query XML on the web with C# and SSIS. The emphasis is on the C# code, as I assume the reader is somewhat familiar with the SSIS Script Task. And in order to allow anyone to easily try this example, there is no authentication step. If you’re retrieving XML values from an API, there’s a good chance that you will need to authenticate first. I recently worked on a data integration project where I had to perform two steps involving a REST API in order to download web analytics data....

May 5, 2016 · Chris Koester

Select Columns from CSV Files with PowerShell

# ------------------------------------------------------------------------ # NAME: CSV_SelectColumns.ps1 # AUTHOR: Chris Koester # DATE: 11/2/2015 # # KEYWORDS: CSV, text, text file # # COMMENTS: This script is used to loop through all CSV files in a folder, # select the desired columns, then output the files to a separate folder. # # DIRECTIONS: Enter the source/destination folder paths and the # desired columns as variable values below. # # REFERENCES: # http://blogs.technet.com/b/heyscriptingguy/archive/2011/10/17/easily-remove-columns-from-a-csv-file-by-using-powershell.aspx # ------------------------------------------------------------------------ # Folder containing source CSV files....

December 8, 2015 · Chris Koester

Get Data from Twitter API with Power Query

For my inaugural Power Query post, I’ll show how you can get data from the Twitter API, specifically the Twitter GET search/tweets method. The web API is one of my favorite data sources in Power Query, because it opens a new world of data to Excel users. There are literally thousands, and perhaps tens of thousands of APIs out there. Some examples include Data.gov, United States Census, Weather Underground, and Twitter....

July 16, 2015 · Chris Koester