Why do Apache Hive needs Apache Thrift? On the Thrift's site it says that it can compile in multiple languages, but I can't understand where does it fits and why do Hive need it.
Thanks
Cited from safaribooksonline:
Chapter 16. Hive Thrift Service
Hive has an optional component known as HiveServer or HiveThrift that
allows access to Hive over a single port. Thrift is a software
framework for scalable cross-language services development. See
http://thrift.apache.org/ for more details. Thrift allows clients
using languages including Java, C++, Ruby, and many others, to
programmatically access Hive remotely.
The CLI is the most common way to access Hive. However, the design of
the CLI can make it difficult to use programmatically. The CLI is a
fat client; it requires a local copy of all the Hive components and
configuration as well as a copy of a Hadoop client and its
configuration. Additionally, it works as an HDFS client, a MapReduce
client, and a JDBC client (to access the metastore). Even with the
proper client installation, having all of the correct network access
can be difficult, especially across subnets or datacenters.
Couldn't have said it better. Emphasis mine.
https://cwiki.apache.org/confluence/display/Hive/HiveServer
HiveServer is an optional service that allows a remote client to submit requests to Hive, using a variety of programming languages, and retrieve results. HiveServer is built on Apache ThriftTM (http://thrift.apache.org/), therefore it is sometimes called the Thrift server although this can lead to confusion because a newer service named HiveServer2 is also built on Thrift.
For more details on how to connect to hive server(thrift server) see the link above.
Related
We are analyzing ignite to use it in .NET platform, In Ignite document we can see there is some limitation on running cluster in Java and Thin Client in .NET. The list provided in the document doesn't mention about "Data Streaming". So would like to know whether "Data Streaming" is supported in "Mixed-Platform" or not ?
Yes, Data Streaming is supported in mixed-platform clusters, and you can do it from Ignite.NET Thin Client without any limitations.
P.S. The document is slightly outdated (fix on the way), Services are also supported in this scenario, you can make Java <-> .NET service calls in both directions, which some customers use extensively:
https://ignite.apache.org/docs/latest/net-specific/net-java-services-execution
I am going to use redis as the web cache for geoserver, so I need to understand the database connection in geoserver. I have imported geoserver in Eclipse, and there are two projects named gs-sec-jdbc and gs-web-sec-jdbc. I do not know what the “sec" means and what are the differences between these two projects. Any help or tutorial is appreciated! Thank you !
These two modules are part of the security subsystem (gs-web-sec-jdbc is the web based user interface). They are almost certainly not the modules you are looking for.
If you intend to use redis to store WMS output images then you need to look at how GeoWebCache works.
If you want to provide a cache between GeoServer and a JDBC database then you will need to explore GeoTools' JDBC datastore mechanism.
Good day,i have a need to use hive just the way we could use mysql. I want to find a way in which I can host it online so that people in different places can communicate to one hive service. Thanks in advance.
This is the functionality that Hive Server 2 provides
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2
It exposes Hive as a thrift web service, and there are JDBC and ODBC drivers available.
You can also put Apache Knox in front of it, in order to have more options for authentication and authorization https://knox.apache.org/
If you are using a common distribution such as Hortonworks or Cloudera, Hive Server 2 will probably be installed automatically when you install Hive.
what is difference between org.apache.hive.jdbc.HiveDriver and org.apache.hadoop.hive.jdbc.HiveDriver ?
Which one to use to Write a JDBC Client to connect to hive ?
Hive 0.11 includes a new JDBC driver that works with HiveServer2, enabling users to write JDBC applications against Hive. The application needs to use the JDBC driver class and specify the network address and port in the connection URL in order to connect to Hive.
HiveServer2 (HIVE-2935), brings concurrency, authentication, and a foundation for authorization to Hive
HiveServer2 is an improved version of HiveServer that supports Kerberos authentication and multi-client concurrency and use the driver "org.apache.hive.jdbc.HiveDriver"
HiveServer1 or Thrift server cannot handle concurrent requests from more than one client. This is actually a limitation imposed by the Thrift interface that HiveServer exports, and can't be resolved by modifying the HiveServer code. The Driver which Hive Server "org.apache.hadoop.hive.jdbc.HiveDriver"
Please find the links which will help you to understand more.
org.apache.hadoop.hive.jdbc.HiveDriver and org.apache.hive.jdbc.HiveDriver
To work with, it depend on ur requirement which version you are having and how you have done Hive Configuration.
i use sesame for a project and i use a local nativestore file repository. everything is fine but when multiple clients use my application silmuntaneusly the repository locks.How can i deal with parallel connections problem;
A Sesame Native Store assumes it has sole, unique access to its datadir. This means that you can not create two NativeStore objects that use the same datadir, as this will cause inconsistencies and potential deadlocks. So, you need to share a single NativeStore object.
In a single JRE, this can be easily achieved by using a RepositoryManager. See this article for an explanation and code examples. If your setup requires several independent client applications to connect to Sesame, you will either have to implement your own server app for these clients to connect to, or you can use a Sesame Server and have each client connect via a HTTPRepository.