• Twitter
  • LinkedIn

Qubole ODBC from C

Here is a detailed tutorial for accessing the Qubole ODBC driver from .NET. Because of the level of detail, you may want to scan the headers and skip to the section you want. I m assuming you re going to use Visual Studio to develop in some sort of Windows environment.

Setting up the Qubole ODBC Driver:

This is how to set up the Qubole ODBC Driver as of 6/18/2014. They may add functionality, so get more current documentation if what you see does not fit what you see when setting up.

1. Get the driver:

According to Qubole, the driver can be obtained by contacting Qubole Support. They will provide you with one or two *.msi files and possibly some *.tdc files. We re going to ignore the .tdc files for the purpose of this demo. Pick the *.msi file that matches your processor architecture (I got files for x86 or x64).

2. Install the driver: This works like most install files.

  • Click the *.msi file that you chose above..
  • You ll get the Welcome page. Click Next.
  • Then comes Terms of Use. Read (you do read these things, don t you?) and click the Accept checkbox (assuming of course you accept the terms).
  • Next is the Destination Folder dialog. I chose the default.
  • Then there is the Account Token dialog. You get your account token by navigating to the User s Tab of the Control Panel in the Qubole UI. The account token is right under the label API Token for Account: as shown on the bottom of this screenshot:

  • Then you can enter the cluster labels. Qubole recommends using a different cluster for Presto use, so there are dialogs for Hive and Presto. You get these from the labels. Mine were default and Presto .

  • On the next dialog, you click the Install button, and the ODBC installation is complete.

3. Check installation:

  • From your Windows Start Menu, search for Data Sources (ODBC) . Click this program.
  • Click the System DSN tab and you should find 2 new DSNs one for Qubole Hive and one for Qubole Presto. Pay attention to those names. We ll use them below.

Creating test data:

I generally have a small Hive table to use for initial work on a project. It saves delay for example if I forget to use the Limit keyword in my query, or the Test or Constrained Run query modes.

In this case I grabbed 1000 lines from the default_qubole_airline_origin_destination table and put them in table ODBCTestData with the following Hive statements:

drop table if exists ODBCTestData;
create table ODBCTestData like default_qubole_airline_origin_destination;
insert overwrite table ODBCTestData select * from default_qubole_airline_origin_destination limit 1000;

Making a Hive query in Qubole and retrieving the results:

Given this is a programming blog, I ll provide the code here. Aside from pasting this into a C# console application, the following things may be necessary for this to work:

  • Change the filename in line 30 to go into a file and folder of your choice.
  • If the name of the datasource in the ODBC Data Source Administrator is not the same as the DSN in line 27, correct that.
  • If you are using the 64-bit driver, change your Visual Studio project s properties so that:
    • Prefer 32-bit is unchecked in the Build Tab.
    • Enable the Visual Studio hosting process is unchecked in the Debug Tab.

The last 2 items avoid an architecture mismatch between a 32-bit application and the 64-bit driver.

using System;
using System.Data.Odbc;
using System.IO;

namespace QuboleOdbcCsharp
{
	class Program
	{
		static void Main(string[] args)
		{
			ConnectToData();
		}

		public static void ConnectToData()
		{
			OdbcConnection odbcConnection = null;
			StreamWriter streamWriter = null;
			OdbcDataReader reader = null;

			try
			{
				//Define the connection
				odbcConnection = new System.Data.Odbc.OdbcConnection()
				{
					//Note: "Qubole Hive ODBC Driver DSN" below is the name of the 
					//DSN from ODBC Data Source Administrator application
					ConnectionString = "DSN=Qubole Hive ODBC Driver DSN;Initial Catalog=default"
				};
				//Set up an output file
				streamWriter = new StreamWriter(@"c:\junk\OdbcTest.txt");
				{
					//Open the connection
					odbcConnection.Open();

					//Issue a query against this connection returning an OdbcDataReader
					var command = odbcConnection.CreateCommand();
					command.CommandText = "select * from odbctestdata;";
					reader = command.ExecuteReader();

					//Loop through the data reader getting 4 columns from the table.
					while (reader.Read())
					{
						streamWriter.WriteLine("{0}\t{1}\t{2}\t{3}",
							reader.GetString(reader.GetOrdinal("origin")),
							reader.GetString(reader.GetOrdinal("dest")),
							reader.GetString(reader.GetOrdinal("tkcarrier")),
							reader.GetString(reader.GetOrdinal("passengers")));
					}
				}
			}
			catch (Exception ex)
			{
				// Should handle exceptions.  This is a demo.
			}
			finally
			{
				if (reader != null && !reader.IsClosed)
				{
					reader.Close();
				}
				if (streamWriter != null)
				{
					streamWriter.Close();
				}
				if (odbcConnection != null)
				{
					odbcConnection.Close();
				}
			}
		}
	}
}

Related Blogs

Latest Blogs