Connect to Hive from Python using JDBC
You can get a copy of the source by cloning from the JayDeBeApi GitHub project and installing from there. If you are using CPython, ensure that you have installed JPype properly.
It has been tested with JPype1 0.x; older JPype installations may cause problems. Basically you just import the jaydebeapi Python module and execute the connect method. The first argument to connect is the name of the Java driver class, and the second is the JDBC connection URL. Third, you can optionally supply a sequence consisting of user and password, or alternatively a dictionary containing arguments that are internally passed as properties to the Java DriverManager; see the Javadoc of the DriverManager class for details. You may also need to tell JayDeBeApi where your Java runtime lives; see the documentation of your Java runtime environment.
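A minimal sketch of this call, assuming the standard Apache Hive driver class; the host, port, credentials, and jar path are placeholders to adjust for your environment:

```python
def build_hive_url(host, port, database="default"):
    """Assemble a HiveServer2 JDBC URL from its parts."""
    return "jdbc:hive2://{0}:{1}/{2}".format(host, port, database)


def connect_hive(host, port, user, password, jar_path):
    """Open a DB-API connection through the Hive JDBC driver.

    jar_path is the local path to the Hive JDBC driver jar.
    """
    import jaydebeapi  # third-party: pip install JayDeBeApi
    return jaydebeapi.connect(
        "org.apache.hive.jdbc.HiveDriver",  # 1st arg: Java driver class
        build_hive_url(host, port),         # 2nd arg: JDBC URL
        [user, password],                   # 3rd arg: DriverManager credentials
        jar_path,                           # jars to put on the Java classpath
    )

# Usage (requires a reachable HiveServer2):
# conn = connect_hive("localhost", 10000, "hive", "secret",
#                     "/opt/hive/lib/hive-jdbc-standalone.jar")
```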
For example, I had to set it accordingly on my Ubuntu machine. In theory, every database with a suitable JDBC driver should work.
It is confirmed to work with a number of databases. Please submit bugs and patches; all contributors will be acknowledged.

Connecting to Hive on Azure HDInsight works a little differently. The public gateway that the clusters sit behind redirects the traffic to the port that HiveServer2 is actually listening on.
The connection string for HDInsight follows a specific format; you can get the exact string through the Azure portal.
You can only connect to the cluster through the public gateway port from outside the Azure virtual network. HDInsight is a managed service, which means that all connections to the cluster are managed via a secure gateway. You cannot connect to HiveServer2 directly on its internal ports, because those ports are not exposed to the outside.
When establishing the connection, you must use the HDInsight cluster admin name and password to authenticate to the cluster gateway; from a Java application, you supply them when opening the connection with the connection string. To copy files from an HDInsight cluster instead, replace sshuser with the SSH user account name for the cluster, change your working directory from a command line, and copy the files over SSH.
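A sketch of the same connection in Python via JayDeBeApi rather than Java. The host suffix, port 443, and httpPath follow the HDInsight gateway format, but treat the exact URL parameters as assumptions to verify against the string shown in the Azure portal; the cluster name, credentials, and jar path are placeholders:

```python
def build_hdinsight_url(cluster_name):
    """JDBC URL for an HDInsight cluster, via the public HTTPS gateway.

    transportMode=http is needed because HiveServer2's binary port is
    not exposed outside the cluster; all traffic goes through port 443.
    """
    return ("jdbc:hive2://{0}.azurehdinsight.net:443/default;"
            "ssl=true;transportMode=http;httpPath=/hive2").format(cluster_name)


def connect_hdinsight(cluster_name, admin_user, admin_password, jar_path):
    """Open a connection authenticated against the cluster gateway."""
    import jaydebeapi  # third-party: pip install JayDeBeApi
    return jaydebeapi.connect(
        "org.apache.hive.jdbc.HiveDriver",
        build_hdinsight_url(cluster_name),
        [admin_user, admin_password],  # cluster admin name and password
        jar_path,                      # path to the Hive JDBC driver jar
    )

# Usage (requires a reachable cluster):
# conn = connect_hdinsight("mycluster", "admin", "secret",
#                          "/opt/hive/lib/hive-jdbc-standalone.jar")
```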
In SQuirreL SQL, select Drivers from the left of the window. Use the Test button to verify that the connection works; if the test succeeds, you see a Connection successful dialog. If an error occurs, see the troubleshooting notes below.
To save the connection alias, use the Ok button at the bottom of the Add Alias dialog. When prompted, select Connect.
Once connected, enter a Hive query into the SQL query dialog, and then select the Run icon (a running person). The results area should show the results of the query. Follow the instructions in the repository to build and run the sample.

Symptoms: an error occurs when connecting to an HDInsight cluster that is version 3.
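In Python terms, running such a query over a JayDeBeApi connection looks like the following sketch. The helper names are mine, not from the article, and hivesampletable is assumed to be the sample table present on the cluster; the cursor API is the standard Python DB-API that JayDeBeApi exposes:

```python
def rows_to_dicts(description, rows):
    """Pair each row with the column names from a DB-API cursor description."""
    columns = [col[0] for col in description]
    return [dict(zip(columns, row)) for row in rows]


def run_query(conn, sql):
    """Execute a statement and return its results as a list of dicts."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        return rows_to_dicts(cursor.description, cursor.fetchall())
    finally:
        cursor.close()

# Usage (conn is an open jaydebeapi connection; table name is assumed):
# for row in run_query(conn, "SELECT * FROM hivesampletable LIMIT 10"):
#     print(row)
```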
Cause: this error is caused by an older version of commons-codec. To fix it, replace the existing commons-codec jar in the lib directory under the SQuirreL directory with a newer version.

A second failure mode is caused by the limitation on gateway nodes: a gateway is not designed to download a huge amount of data, so the connection might be closed by the gateway if it cannot handle the traffic.
Copy data directly from blob storage instead.

Sometimes we may want to do a row count of all tables in one of our Hive databases, without having to hard-code a fixed list of tables in our Hive code.
We can compile Java code to run queries against Hive dynamically, but this can be overkill for smaller requirements; scripting can be a better way to code more complex Hive tasks. Python code can be used to execute dynamic Hive statements, which is useful in scenarios such as code branching depending on the results of a Hive query. There are several Python libraries available for connecting to Hive, such as PyHive and Pyhs2 (the latter unfortunately now unmaintained).
Some major Hadoop vendors, however, decline to explicitly support this type of direct integration. JayDeBeApi is effectively a wrapper that allows Java database drivers to be used in Python scripts. Hive queries can be dynamically generated and executed to retrieve row counts for all the tables found above.
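A sketch of that pattern, assuming a JayDeBeApi-style DB-API connection; the function names are mine, not from the post:

```python
def count_statements(tables):
    """Generate one COUNT(*) query per table name."""
    return ["SELECT COUNT(*) FROM `{0}`".format(t) for t in tables]


def table_row_counts(conn, database="default"):
    """Return {table_name: row_count} for every table in a Hive database."""
    cursor = conn.cursor()
    try:
        cursor.execute("USE `{0}`".format(database))
        cursor.execute("SHOW TABLES")
        tables = [row[0] for row in cursor.fetchall()]
        counts = {}
        for table, sql in zip(tables, count_statements(tables)):
            cursor.execute(sql)            # run the generated COUNT(*) query
            counts[table] = cursor.fetchone()[0]
        return counts
    finally:
        cursor.close()

# Usage (conn is an open jaydebeapi connection):
# for table, n in table_row_counts(conn).items():
#     print(table, n)
```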
Output from the Hive queries should now be printed to the screen. The solution has both pros and cons.
A related discussion from the HiveRunner GitHub project follows. The motivation behind HiveRunner is to provide a local, self-contained unit test harness with no external or platform dependencies. As such, it offers no features for connecting to remote Hive instances, and I suspect it is unlikely to do so in the future.
Given that you wish to use JDBC, I would recommend looking at libraries focused on building database integration tests.
A long, long time ago I used something called DbUnit to do something similar. We currently use an in-house tool, but I suspect there are some good OSS options out there. Generally it would be wise to build up a suite of small, lightweight HiveRunner unit tests in addition to any integration tests that you might create; they each serve a different purpose.

In reply: doesn't HiveRunner use an embedded server?
What is hsqldb? Can I give connection information for hsqldb to JDBC, and if so, what would the above properties be? — Ok, I think I misunderstood your original post. You wish to connect to the internal Hive instance created by HiveRunner. It should be possible to connect to this using JDBC, but you'll only be able to query metadata, not Hive warehouse data.
I've not tried this, but it looks like the connection details could be obtained from HiveRunner's internals. However, I don't think this is actually what you want.
Let me know what, if anything, works. I'd also be curious to understand what it is you're trying to do once you connect to the instance. — The reply: a ConnectException: Connection refused. Does it mean HiveRunner is not starting a HiveServer on the given host and port? What are my host and port with the above JDBC URL? — It looks as though the HiveServer2 instance is never started within HiveRunner, meaning that there is no thread listening for connections. I suspect this is intentional, as there seems to be little point spinning this up if tests can be run locally without it.
As it stands, without a concrete use-case, the lack of JDBC connectivity is not really an issue. Unless ameybarve15 has any further comments, I think this issue can now be closed.

A separate question concerns Spark: along with the dependencies, I moved all the jars from the directory via --jars in spark-submit, and that didn't work either. Could anyone let me know what dependencies I am missing in the sbt file? If not, what could be the mistake I am making here, since the same type of code works in Java with the same library dependencies in the project, and I can't understand what is wrong here?
Any help is much appreciated.
I am trying to connect to the Hive server from Scala code as below. Check the settings. The full exception stack begins: Exception in thread "main" java.lang.ClassNotFoundException: org.…
Activate your account here.In combination with UDF s this has the potential to be quite a powerful approach to leverage the best of the two.
In this post I would like to demonstrate the preliminary steps necessary to make R and Hive work together. If you have the Hortonworks Sandbox set up, you should be able to simply follow along as you read.
If not, you are probably able to adapt where appropriate. By default, this means the machine should be able to access the port where the Hive server is listening. Next, we are going to use a sample table in Hive to query from R, setting up all required packages.
Using Hive from R with JDBC
This is quite straightforward except for the need to run javareconf. You might be able to omit this step, but if you are having trouble you may run into an exception about JNI programs failing to compile. To check whether or not the jars are really on the classpath, you can inspect the classpath from within R. Hive comes with some sample tables, either pre-installed or ready to be set up after installation.
We are going to use these tables to run some sample queries using R. To set the required privileges, log in to the machine where the Hive CLI is running and grant them from there.
A reader comments: I am using a Linux server with the Hortonworks Sandbox 2 and this script to connect. I have checked my setup and viewed some of the jar files, including hive-service. Any idea what is causing the problem? I am fairly new to R and programming for that matter, so any specific steps are much appreciated. My server has Hadoop, and I followed the steps mentioned above to connect to Hive.
The connection attempt fails with an R error. I also used RHive to connect to Hive, but with no success either. I have pulled the data into my client machine and processed the data using some statistical techniques. Now I have a table created in Hive, and I want to write the processed data to this table. — You can always write the data back, e.g. over the same connection.
The javareconf-related exception mentioned earlier looks like this: "checking whether JNI programs can be compiled... You may need to use non-standard compiler flags or a different compiler in order to fix this."
I am trying to connect to a Hive database from Python; the database is SSL enabled. Can someone help me with this?
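One hedged possibility is JayDeBeApi with the Hive JDBC driver's SSL URL parameters (ssl=true, sslTrustStore, trustStorePassword). The truststore path and all credentials below are placeholders, and you should verify the parameter names against your driver version:

```python
def build_ssl_hive_url(host, port, database, truststore, truststore_password):
    """JDBC URL for an SSL-enabled HiveServer2 endpoint."""
    return ("jdbc:hive2://{0}:{1}/{2};ssl=true;"
            "sslTrustStore={3};trustStorePassword={4}").format(
                host, port, database, truststore, truststore_password)


def connect_ssl_hive(host, port, user, password,
                     truststore, truststore_password, jar_path):
    """Connect with SSL; the server's certificate must be in the truststore."""
    import jaydebeapi  # third-party: pip install JayDeBeApi
    return jaydebeapi.connect(
        "org.apache.hive.jdbc.HiveDriver",
        build_ssl_hive_url(host, port, "default",
                           truststore, truststore_password),
        [user, password],
        jar_path,
    )

# Usage (placeholders throughout):
# conn = connect_ssl_hive("hive.example.com", 10000, "user", "secret",
#                         "/etc/security/truststore.jks", "changeit",
#                         "/opt/hive/lib/hive-jdbc-standalone.jar")
```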