Are communications with a Spark Thrift server in binary mode transmitted securely? - ssl

In my organisation, we have a Spark Thrift server setup with HTTP & SSL because there is an underlying assumption that the binary mode is not securely encrypted over the wire and thus may reveal credentials or sensitive query data.
I have Googled, scan read a research paper and looked at the Thrift protocol spec searching for a definitive answer, but to no avail. It seems that the sheer lack of mention on authentication and encryption means that it is expected to be taken care of by an encasing networking layer?
Is the assumption that a Spark Thrift server in binary mode is transmitting unencrypted or otherwise insecure data correct?

The Thrift protocol does include low level transport:
In the context of a Spark Thrift server this can be enabled in the hive-site.xml file like this:
<property>
<name>hive.server2.use.SSL</name>
<value>true</value>
</property>
Combined with the default TCP Thrift protocol, this does encrypt the thrift protocol traffic. There is not a lot of explicit documentation on this, but since the Spark Thrift server is a fork of the Hive2 server, I found this about setting up a Hive2 server which implies this is possible:
The final problem seems to be that some tools, notably Power BI do not seem to be able to use SSL for a 'Standard' (TCP Thrift protocol) connection.

Related

Kafka and Zookeeper TLS

I am trying to enable TLS for kafka broker exchanges and had a thought regarding Zookeeper TLS. Currently, on Apache Kafka Documentation I cannot see much mentioned about ZK TLS setup (ok, probably because it's a different apache project) and any possible performance impact.
The question is, can I not have the ONLY broker-client and inter-broker exchanges secured? Do I also need to add TLS to zookeeper? Extra security isn't bad, but is it really necessary to it even for zookeeper?
Zookeeper with TLS is only available in Zookeeper 3.5 which is still in beta. Therefore, Kafka isn't supporting TLS connections to zookeeper yet. Doesn't mean you can't do it but it does mean you won't find much documentation on it and if you run in it on something important, you are putting yourself at risk. In this case, I would say the extra security could hurt.

How does Apache thrift fits with Apache hive?

Why do Apache Hive needs Apache Thrift? On the Thrift's site it says that it can compile in multiple languages, but I can't understand where does it fits and why do Hive need it.
Thanks
Cited from safaribooksonline:
Chapter 16. Hive Thrift Service
Hive has an optional component known as HiveServer or HiveThrift that
allows access to Hive over a single port. Thrift is a software
framework for scalable cross-language services development. See
http://thrift.apache.org/ for more details. Thrift allows clients
using languages including Java, C++, Ruby, and many others, to
programmatically access Hive remotely.
The CLI is the most common way to access Hive. However, the design of
the CLI can make it difficult to use programmatically. The CLI is a
fat client; it requires a local copy of all the Hive components and
configuration as well as a copy of a Hadoop client and its
configuration. Additionally, it works as an HDFS client, a MapReduce
client, and a JDBC client (to access the metastore). Even with the
proper client installation, having all of the correct network access
can be difficult, especially across subnets or datacenters.
Couldn't have said it better. Emphasis mine.
https://cwiki.apache.org/confluence/display/Hive/HiveServer
HiveServer is an optional service that allows a remote client to submit requests to Hive, using a variety of programming languages, and retrieve results. HiveServer is built on Apache ThriftTM (http://thrift.apache.org/), therefore it is sometimes called the Thrift server although this can lead to confusion because a newer service named HiveServer2 is also built on Thrift.
For more details on how to connect to hive server(thrift server) see the link above.

WebRtc client to server connection

I'm going to implement Java VoiP server to work with WebRtc. Implementation of browser p2p connection is really straightforward. Server to client connection is slightly more tricky.
After a quick look at RFC I wrote down what should be done to make Java server as browser. Kindly help me to complete list below.
Implement STUN server. Server should be abke to respond binding
request and keep-alive pings.
Implement DTLS protocol along with DTLS handshake. After the DTLS
handshake shared secret will be used as keying material within SRTP
and SRTCP.
Support multiplexing of SRTP and SRTCP stream. SRTP and SRTCP use
same port to adress NAT issue.
Not sure whether should I implement SRTCP. I believe connection will
not be broken, if server does not send SRTCP reports to client.
Decode SRTP stream to RTP.
Questions:
Is there anything else which should be done on server-side ?
How webRtc handles SRTCP reports ? Does it adjust sample rate/bit
rate depends on SRTCP report?
WebRtc claims that following issues will be addressed:
packet loss concealment
echo cancellation
bandwidth adaptivity
dynamic jitter buffering
automatic gain control
noise reduction and suppression
Is is webRtc internals or codec(Opus) internals? Do I need to do anything on server side to handle this issues, for example variable bitrate etc ?
The first step would be to implement Interactive Connectivity Establishement (RFC 5245). Whether you make use of a STUN/TURN server or not is irrelevant, your code needs to issue connectivity checks (which use STUN messages) to the browser and respond to the brower's connectivity checks. ICE is a fairly complex state machine, but it's doable.
You don't have to reinvent the wheel. STUN / TURN servers are external components. Use as they are. WebRTC source code is available which you can use in your application code and call the related methods.
Pls. refer to similar post - Server as WebRTC data channel peer

Understanding Apache ActiveMQ

I am confused about the function of Apache ActiveMQ.
I downloaded ActiveMQ from this link.
So I use it this way (environment: Windows 7): I start the bin/activemq.bat, then it works.
My question is: Does this mean I start a server on my machine? When I initialize the ActiveMQConnectionFactory, the broker URL is tcp://localhost:61616. But what if I want my machine to serve as a server and another machine to connect to my server?
Yes, you can use the primary box as a server and have consumers/subscribers running on other boxes (which will need to connect to the server - you will need to specify the server hostname & port for the connection to be established) - once in place, the messages on the server (topic or queue) can be consumed by the clients.
If you one have one producer and one consumer, you can look into using queues - if you have more than one consumer/subscriber, you can look into setting up a topic to which the consumers will subscribe to. Messages need to be inserted to the topic/queue as needed.
You can specify the server information in your code or preferably in the config file.
For reference to topologies:
http://activemq.apache.org/topologies.html
Also, you can choose to persist your messages or not based on your use case. Kaha DB is the preferred route (specially if performance is of concern).
Useful examples:
http://sujitpal.blogspot.com/2007/12/jms-patterns-with-activemq.html
http://vvratha.blogspot.com/2012/05/java-client-to-sendreceive-messages-for.html
Hope it helps.
Apache ActiveMQ ™ is the most popular and powerful open source messaging and Integration Patterns server
& it act like a third party server.
Apache ActiveMQ is fast, supports many Cross Language Clients and Protocols, comes with easy to use Enterprise Integration Patterns and many advanced features while fully supporting JMS 1.1 and J2EE 1.4. Apache ActiveMQ is released under the Apache 2.0 License.
ActiveMQ have the capabilities to send 100 MB single message framework and maintain 1000 concurrent connection simultaneously , for the further information you can check activemq.xml in your documentation.
Further Info at here about the ActiveMQ

Can I configure Apache ActiveMQ to use the STOMP protocol over UDP?

I'm developing a STOMP binding for Ada, which is working fine utilizing TCP/IP as the transport between the client and an ActiveMQ server configured as a STOMP broker. I thought to support UDP as well (i.e. STOMP over UDP), however, the lack of pertinent information in the ActiveMQ documentation or in web searches suggests to me that this isn't possible, and perhaps it doesn't even make any sense :-)
Confirmation one way or the other (and an ActiveMQ configuration excerpt if this is possible) would be appreciated.
this is not implemented in ActiveMQ at the moment as Stomp transport uses TCP only. It is possible to implement, so if you have a time to do it, give it a try.