Qubole ODBC from C
Here is a detailed tutorial for accessing the Qubole ODBC driver from .NET. Because of the level of detail, you may want to scan the headers and skip to the section you want. I m assuming you re going to use Visual Studio to develop in some sort of Windows environment.
Setting up the Qubole ODBC Driver:
This is how to set up the Qubole ODBC Driver as of 6/18/2014. They may add functionality, so get more current documentation if what you see does not fit what you see when setting up.
1. Get the driver:
According to Qubole, the driver can be obtained by contacting Qubole Support. They will provide you with one or two *.msi files and possibly some *.tdc files. We re going to ignore the .tdc files for the purpose of this demo. Pick the *.msi file that matches your processor architecture (I got files for x86 or x64).
2. Install the driver: This works like most install files.
- Click the *.msi file that you chose above..
- You ll get the Welcome page. Click Next.
- Then comes Terms of Use. Read (you do read these things, don t you?) and click the Accept checkbox (assuming of course you accept the terms).
- Next is the Destination Folder dialog. I chose the default.
- Then there is the Account Token dialog. You get your account token by navigating to the User s Tab of the Control Panel in the Qubole UI. The account token is right under the label API Token for Account: as shown on the bottom of this screenshot:
- Then you can enter the cluster labels. Qubole recommends using a different cluster for Presto use, so there are dialogs for Hive and Presto. You get these from the labels. Mine were default and Presto .
- On the next dialog, you click the Install button, and the ODBC installation is complete.
3. Check installation:
- From your Windows Start Menu, search for Data Sources (ODBC) . Click this program.
- Click the System DSN tab and you should find 2 new DSNs one for Qubole Hive and one for Qubole Presto. Pay attention to those names. We ll use them below.
Creating test data:
I generally have a small Hive table to use for initial work on a project. It saves delay for example if I forget to use the Limit keyword in my query, or the Test or Constrained Run query modes.
In this case I grabbed 1000 lines from the default_qubole_airline_origin_destination table and put them in table ODBCTestData with the following Hive statements:
drop table if exists ODBCTestData;
create table ODBCTestData like default_qubole_airline_origin_destination;
insert overwrite table ODBCTestData select * from default_qubole_airline_origin_destination limit 1000;
Making a Hive query in Qubole and retrieving the results:
Given this is a programming blog, I ll provide the code here. Aside from pasting this into a C# console application, the following things may be necessary for this to work:
- Change the filename in line 30 to go into a file and folder of your choice.
- If the name of the datasource in the ODBC Data Source Administrator is not the same as the DSN in line 27, correct that.
- If you are using the 64-bit driver, change your Visual Studio project s properties so that:
- Prefer 32-bit is unchecked in the Build Tab.
- Enable the Visual Studio hosting process is unchecked in the Debug Tab.
The last 2 items avoid an architecture mismatch between a 32-bit application and the 64-bit driver.
using System;
using System.Data.Odbc;
using System.IO;
namespace QuboleOdbcCsharp
{
class Program
{
static void Main(string[] args)
{
ConnectToData();
}
public static void ConnectToData()
{
OdbcConnection odbcConnection = null;
StreamWriter streamWriter = null;
OdbcDataReader reader = null;
try
{
//Define the connection
odbcConnection = new System.Data.Odbc.OdbcConnection()
{
//Note: "Qubole Hive ODBC Driver DSN" below is the name of the
//DSN from ODBC Data Source Administrator application
ConnectionString = "DSN=Qubole Hive ODBC Driver DSN;Initial Catalog=default"
};
//Set up an output file
streamWriter = new StreamWriter(@"c:\junk\OdbcTest.txt");
{
//Open the connection
odbcConnection.Open();
//Issue a query against this connection returning an OdbcDataReader
var command = odbcConnection.CreateCommand();
command.CommandText = "select * from odbctestdata;";
reader = command.ExecuteReader();
//Loop through the data reader getting 4 columns from the table.
while (reader.Read())
{
streamWriter.WriteLine("{0}\t{1}\t{2}\t{3}",
reader.GetString(reader.GetOrdinal("origin")),
reader.GetString(reader.GetOrdinal("dest")),
reader.GetString(reader.GetOrdinal("tkcarrier")),
reader.GetString(reader.GetOrdinal("passengers")));
}
}
}
catch (Exception ex)
{
// Should handle exceptions. This is a demo.
}
finally
{
if (reader != null && !reader.IsClosed)
{
reader.Close();
}
if (streamWriter != null)
{
streamWriter.Close();
}
if (odbcConnection != null)
{
odbcConnection.Close();
}
}
}
}
}