How to connect to a remote Hive server using Spark 2.0 - hive

I'm trying to connect to a remote Hive server in a different cluster using Spark. I have tried both a hive2 JDBC URL and a thrift URI, but with no luck:
// Attempt 1: hive2 JDBC URL
val s = SparkSession.builder()
  .appName("Man test")
  .config("hive.metastore.uris", "jdbc:hive2://abc.svr.yy.xxxx.net:2171/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_HOST@abc.AD.xxx.COM")
  .enableHiveSupport()
  .getOrCreate()

// Attempt 2: thrift metastore URI
val s = SparkSession.builder()
  .appName("Man test")
  .config("hive.metastore.uris", "thrift://xxxx.svr.us.yyyy.net:2000")
  .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
  .enableHiveSupport()
  .getOrCreate()
println("in method session created")
s.sql("show databases").show()
I'm getting the below error when I use jdbc:hive2:
java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
and when I use thrift:
javax.security.sasl.SaslException: No common protection layer between client and server.
Please let me know if I am missing something here.

I solved the same issue by adding the following to the JVM options.
-Djavax.security.sasl.qop="auth-conf"
See: https://github.com/prestodb/presto/issues/8604
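If you are launching through spark-submit, the same property can be passed via spark.driver.extraJavaOptions / spark.executor.extraJavaOptions. If you build the session in code, a minimal sketch (assuming the property is set before the first metastore connection is made) is:

import org.apache.spark.sql.SparkSession

// Set the SASL quality-of-protection before the metastore client is created,
// so the client negotiates "auth-conf" instead of failing the handshake.
System.setProperty("javax.security.sasl.qop", "auth-conf")

val s = SparkSession.builder()
  .appName("Man test")
  .config("hive.metastore.uris", "thrift://xxxx.svr.us.yyyy.net:2000")
  .config("spark.sql.warehouse.dir", "/apps/hive/warehouse")
  .enableHiveSupport()
  .getOrCreate()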

Related

spark-bigquery-connector VS firewall

I need some help. I am trying to import data from BigQuery using the spark-bigquery-connector: Spark 2.4.0, Scala 2.11.12, Hadoop 2.7, spark-bigquery-with-dependencies_2.11-0.24.2.
The corporate firewall blocks access to external services. Please tell me which URLs need to be allowed through the firewall for the spark-bigquery-connector to work.
I get this error:
Exception in thread "main" com.google.cloud.spark.bigquery.repackaged.com.google.cloud.bigquery.BigQueryException: Error getting access token for service account: Connection refused: connect, iss:
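For context, the failure happens before any data is read: the connector first requests an OAuth token for the service account, so the Google API endpoints (typically oauth2.googleapis.com for tokens, plus bigquery.googleapis.com and bigquerystorage.googleapis.com for the data itself) must be reachable from the cluster. A minimal sketch of the read that triggers this (the table name is a hypothetical placeholder):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("bq-test").getOrCreate()

// The "Error getting access token" above is thrown here, when the connector
// contacts Google's OAuth endpoint before reading any rows.
val df = spark.read
  .format("bigquery")
  .option("table", "my-project.my_dataset.my_table") // hypothetical table
  .load()

df.show()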

Database proxy in karate

I am using the Karate framework for API testing in our organization. I am able to execute my project locally, where the DB connections are successful, but when I execute it in cloud Jenkins we get the below error:
Error : Failed to obtain JDBC Connection; nested exception is java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
DB class used: https://github.com/intuit/karate/blob/master/karate-demo/src/main/java/com/intuit/karate/demo/util/DbUtils.java
Do we have any option to set a proxy for the DB only? I have also gone through the proxy setup in karate-config.js, like karate.configure('proxy', { uri: 'http://my.proxy.host:8080', username: 'john', password: 'secret', nonProxyHosts: ['http://example.com'] }). This sets up the proxy for my API, but not for the DB instance.
I am also trying to check whether my Jenkins server's firewall is blocking the connection to my DB.
Any help from the Karate framework creators or implementers?
"whether my jenkins server firewall is blocking"
That is most likely the case; there is nothing Karate (or anyone associated with it) can do here to help.
Also please read this: https://stackoverflow.com/a/52078427/143475
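To confirm the firewall theory, a plain TCP probe run from the Jenkins agent tells you whether the DB port is reachable at all, independent of Karate. A sketch (the host and port are hypothetical placeholders for your DB endpoint):

import java.net.{InetSocketAddress, Socket}

// Probe the DB endpoint from the Jenkins agent; host/port are placeholders.
val socket = new Socket()
try {
  socket.connect(new InetSocketAddress("db.example.com", 1521), 5000) // 5 s timeout
  println("TCP connect succeeded - the port is not blocked")
} catch {
  case e: Exception => println(s"TCP connect failed: $e") // consistent with the 'Network Adapter' error
} finally {
  socket.close()
}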

Kafka Schema Registry - StoreInitializationException: Timed out trying to create or validate schema topic configuration

I'm trying to configure the Schema Registry to work with SSL; I already have ZooKeeper and the Kafka brokers working with the same SSL keys.
But whenever I start the Schema Registry, I get the following error:
ERROR Error starting the schema registry (io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication)
io.confluent.kafka.schemaregistry.exceptions.SchemaRegistryInitializationException: Error initializing kafka store while initializing schema registry
Caused by: io.confluent.kafka.schemaregistry.storage.exceptions.StoreInitializationException: Timed out trying to create or validate schema topic configuration
schema-registry.properties configuration:
listeners=https://IP:8081
kafkastore.connection.url=IP:2181
kafkastore.bootstrap.servers=SSL://IP:9092
kafkastore.topic=_schemas
kafkastore.topic.replication.factor=1
kafkastore.security.protocol=SSL
ssl.truststore.location=/.kafka_ssl/kafka.server.truststore.jks
ssl.truststore.password=password
ssl.keystore.location=/.kafka_ssl/kafka.server.keystore.jks
ssl.keystore.password=password
ssl.key.password=password
ssl.endpoint.identification.algorithm=
inter.instance.protocol=https
Can someone advise?
There are a couple of reasons that might cause this issue. Try to use a different topic for kafkastore.topic in case _schemas got corrupted.
For example,
kafkastore.topic=_schemas_new
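If the timeout persists even with a fresh topic, it is worth confirming that these SSL settings can reach the broker at all. A sketch of such a check using the Kafka AdminClient (reusing the keystore paths from the question; requires the kafka-clients library):

import java.util.Properties
import org.apache.kafka.clients.admin.AdminClient

// Connect to the broker with the same SSL settings the Schema Registry uses.
val props = new Properties()
props.put("bootstrap.servers", "IP:9092")
props.put("security.protocol", "SSL")
props.put("ssl.truststore.location", "/.kafka_ssl/kafka.server.truststore.jks")
props.put("ssl.truststore.password", "password")
props.put("ssl.keystore.location", "/.kafka_ssl/kafka.server.keystore.jks")
props.put("ssl.keystore.password", "password")
props.put("ssl.key.password", "password")

val admin = AdminClient.create(props)
// If this also times out, the problem is broker-side SSL, not the registry.
println(admin.listTopics().names().get())
admin.close()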

Superset with Apache Spark on Hive

I have Apache Superset installed via Docker on my local machine. I have a separate production 20-node Spark cluster with Hive as the metastore. I want my Superset to be able to connect to Hive and run queries via Spark SQL.
To connect to Hive, I tried the following:
Add Database --> SQLAlchemy URI
hive://hive@<hostname>:10000/default
but it gives an error when I test the connection. I believe I have to do some tunneling, but I am not sure how.
I have the Hive Thrift server running as well.
Please let me know how to proceed.
What is the error you are receiving? Although the docs do not mention this, the best way to provide the connection URL is in the following format:
hive://<url>/default?auth=NONE (when there is no security)
hive://<url>/default?auth=KERBEROS
hive://<url>/default?auth=LDAP
First, you should connect the two containers together.
Let's say you have container_superset running Superset and container_spark running Spark.
Run: docker network ls # displays the containers and their networks
Select the name of the Superset network (it should be something like superset_default).
Run: docker run --network="superset_default" --name=NameTheContainerHere --publish port1:port2 imageName
Here port1:port2 is the port mapping and imageName is the Spark image.
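Alternatively, if the Spark container already exists, you can attach it to the Superset network instead of starting a new one (assuming the container is named container_spark as above): docker network connect superset_default container_spark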

JMX connection to Gemfire over SSL

I have used GFSH to start the locator as below:
start locator --name=gemfire_locator --security-properties-file="../config/gfsecurity.properties" --J=-Dgemfire.ssl-enabled-components=all --mcast-port=0 --J=-Dgemfire.jmx-manager-ssl=true
I also started the server:
start server --name=server1 --security-properties-file="../config/gfsecurity.properties" --J=-Dgemfire.ssl-enabled-components=all --mcast-port=0 --J=-Dgemfire.jmx-manager-ssl=true
Connecting to Gemfire as a ClientCache works perfectly fine over SSL. But when I connect as a JMX client, I get the below error, in Java code as well as in JConsole.
Error:
Exception in thread "main" java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.CommunicationException [Root exception is java.rmi.ConnectIOException: non-JRMP server at remote endpoint]
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369)
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
at SamplePlugin.main(SamplePlugin.java:101)
Am I missing any other configuration?
Here is my JAVA_TOOL_OPTIONS:
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=true
-Djava.rmi.server.hostname=myhostname
You will also need to add the geode-core jar to your classpath for jvisualvm. Use the --cp:a option. I would suggest just using geode-dependencies.jar as that will get everything you might need.
The reason this is required is explained a bit in the comments for ContextAwareSSLRMIClientSocketFactory. Basically, when RMI uses SSL, the necessary RMIClientSocketFactory is exported from the server to the client for use there. In general this would simply be SslRMIClientSocketFactory, but in our case we have a custom socket factory, so the client (jvisualvm in this case) needs to have access to it.
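For reference, the client side of a generic JMX-over-SSL connection looks something like the sketch below. The host and port are assumptions (1099 is Geode's default jmx-manager-port), the truststore referenced by your JVM options must trust the manager's certificate, and per the answer above Geode substitutes its own socket factory, which is why geode-core must be on the classpath. The key detail is supplying an SSL socket factory for the JNDI/RMI registry lookup; without it the client speaks plain JRMP to an SSL port, which is exactly the "non-JRMP server at remote endpoint" error:

import java.util.{HashMap => JHashMap}
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}
import javax.rmi.ssl.SslRMIClientSocketFactory

// Host and port are assumptions; 1099 is Geode's default jmx-manager-port.
val url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://myhostname:1099/jmxrmi")
val env = new JHashMap[String, AnyRef]()
// Make the registry lookup itself use SSL, not just the RMI connection.
env.put("com.sun.jndi.rmi.factory.socket", new SslRMIClientSocketFactory())

val connector = JMXConnectorFactory.connect(url, env)
println(connector.getMBeanServerConnection.getMBeanCount)
connector.close()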