How to use BigQuery as a sink in Apache Flink? - google-bigquery

Is it possible to use the JDBC connector to write a Flink DataStream to BigQuery, or are there any other options?
I'm new to Apache Flink; any suggestions/examples would be very helpful.

BigQuery is currently not supported as a JDBC dialect by Flink. An overview of the currently supported versions can be found at https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/jdbc/
I'm not aware of a BigQuery sink being available. That implies that in order to write to BigQuery, you would have to create your own custom sink.
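If you go down that route, here is a minimal sketch of such a custom sink that streams records into BigQuery using the google-cloud-bigquery client library. The dataset/table names, the Map<String, Object> record type, and the use of default credentials are illustrative assumptions, not a definitive implementation.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.InsertAllResponse;
import com.google.cloud.bigquery.TableId;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.util.Map;

// Hypothetical custom sink: streams each record into a BigQuery table.
public class BigQuerySink extends RichSinkFunction<Map<String, Object>> {

    private final String dataset;  // illustrative target dataset
    private final String table;    // illustrative target table
    private transient BigQuery bigQuery;

    public BigQuerySink(String dataset, String table) {
        this.dataset = dataset;
        this.table = table;
    }

    @Override
    public void open(Configuration parameters) {
        // Uses application-default credentials; adjust for your environment.
        bigQuery = BigQueryOptions.getDefaultInstance().getService();
    }

    @Override
    public void invoke(Map<String, Object> value, Context context) {
        // Streaming insert of a single row; in practice you would batch rows.
        InsertAllRequest request = InsertAllRequest
                .newBuilder(TableId.of(dataset, table))
                .addRow(value)
                .build();
        InsertAllResponse response = bigQuery.insertAll(request);
        if (response.hasErrors()) {
            throw new RuntimeException("BigQuery insert failed: " + response.getInsertErrors());
        }
    }
}

You would attach it with something like stream.addSink(new BigQuerySink("my_dataset", "my_table")).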

Related

Start Thrift server in standalone Zeppelin

Is it possible to access the contents of spark.sql output via JDBC, like the %sql interpreter does?
You just need to set up Spark's Thrift Server as described in the Spark documentation. Zeppelin is just a consumer of Spark's execution and doesn't expose the data itself.
If you really need to extract specific information from a paragraph, you can use the Notebook API.
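If JDBC access to the data Spark can see is what you're after, below is a minimal sketch of a client querying the Spark Thrift Server through the Hive JDBC driver; the host, port, credentials, and table name are illustrative.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SparkThriftServerClient {
    public static void main(String[] args) throws Exception {
        // The Spark Thrift Server speaks the HiveServer2 protocol, so the Hive JDBC driver is used.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://localhost:10000/default", "user", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM my_table LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}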

Writing events from Kafka to Hive in ORC format

I am trying to use a Kafka connector to write data to Hive in ORC format from a Kafka bus.
The events on the bus are in Avro format. I need something like NiFi's ConvertAvroToORC,
but with Kafka Connect.
ORC is not currently supported by the HDFS Kafka Connect connector.
You're welcome to build the PR below and try it out on your own.
https://github.com/confluentinc/kafka-connect-hdfs/pull/294
It has since been released: io.confluent.connect.hdfs.orc.OrcFormat is included in the HDFS connector io.confluent.connect.hdfs.HdfsSinkConnector. See https://docs.confluent.io/current/connect/kafka-connect-hdfs/configuration_options.html#hdfs-2-sink-connector-configuration-properties
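For reference, a minimal sketch of a connector configuration that writes Avro events from a topic to HDFS as ORC might look like the following; the topic name, HDFS URL, Schema Registry URL, and flush size are illustrative placeholders.

name=hdfs-orc-sink
connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
format.class=io.confluent.connect.hdfs.orc.OrcFormat
tasks.max=1
topics=my_avro_topic
hdfs.url=hdfs://namenode:8020
flush.size=1000
# The events on the bus are Avro, so use the Avro converter with Schema Registry
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://schema-registry:8081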

How to integrate Apache NiFi with Amazon Athena?

My requirements:
1. Users will run SQL queries through Apache NiFi against data in Amazon S3.
Is it possible to integrate NiFi with Amazon Athena?
You should be able to integrate Apache NiFi and Amazon Athena quite easily. NiFi's ability to plug in JDBC drivers and reuse that connection in many places helps greatly here. See https://docs.aws.amazon.com/athena/latest/ug/connect-with-jdbc.html for information on the JDBC drivers for Athena, and https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-dbcp-service-nar/1.5.0/org.apache.nifi.dbcp.DBCPConnectionPool/index.html for NiFi's DBCP facilities.
You should also be able to do it using a combination of ExecuteStreamCommand and the AWS CLI, which can issue Athena queries.
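For the JDBC route, here is a minimal sketch of connecting to Athena over JDBC (the same kind of connection a DBCPConnectionPool would manage), assuming the Simba-based Athena JDBC driver; the driver class, URL format, property names, region, S3 output location, and credentials depend on the driver version and your account, so treat them as placeholders.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class AthenaJdbcExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Credentials and the S3 location for query results are placeholders.
        props.put("User", System.getenv("AWS_ACCESS_KEY_ID"));
        props.put("Password", System.getenv("AWS_SECRET_ACCESS_KEY"));
        props.put("S3OutputLocation", "s3://my-athena-query-results/");

        Class.forName("com.simba.athena.jdbc.Driver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:awsathena://AwsRegion=us-east-1;", props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM my_table LIMIT 10")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}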

Integrating Kafka and SQL Server 2008

I have a SQL Server 2008 R2 database that I would like to integrate with Kafka.
Essentially, I want to use Change Data Capture to capture changes to my tables and put them on a Kafka queue, so the front-end devs can read the data off Kafka. Has anyone done this before, or does anyone have tips on how to go about it?
Kafka Connect will solve this problem, in particular the JDBC connector.
The JDBC connector allows you to import data from any relational database with a JDBC driver into Kafka topics. By using JDBC, this connector can support a wide variety of databases without requiring custom code for each one.
Source: http://docs.confluent.io/3.0.0/connect/connect-jdbc/docs/jdbc_connector.html
See also:
Kafka Connect JDBC Connector source code on GitHub
Kafka Connect Documentation
There is no way to do it directly from SQL Server. You have to write your own producer that pulls data from SQL Server and pushes it to a Kafka queue. We are currently doing the same thing via background services that push data to Kafka.
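As a rough illustration of that approach, here is a minimal sketch of a hand-rolled producer that polls SQL Server over JDBC and publishes changed rows to a Kafka topic. The connection string, table, column names, topic, and poll interval are placeholders; a real implementation would read from the CDC change tables and persist its watermark.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Timestamp;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SqlServerToKafka {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props);
             Connection conn = DriverManager.getConnection(
                     "jdbc:sqlserver://localhost;databaseName=MyDb", "user", "password")) {

            Timestamp lastSeen = new Timestamp(0L); // watermark; persist it in real code
            while (true) {
                try (PreparedStatement stmt = conn.prepareStatement(
                        "SELECT id, payload, updated_at FROM dbo.MyTable "
                                + "WHERE updated_at > ? ORDER BY updated_at")) {
                    stmt.setTimestamp(1, lastSeen);
                    try (ResultSet rs = stmt.executeQuery()) {
                        while (rs.next()) {
                            // One Kafka record per changed row, keyed by the row id
                            producer.send(new ProducerRecord<>("my-topic",
                                    rs.getString("id"), rs.getString("payload")));
                            lastSeen = rs.getTimestamp("updated_at");
                        }
                    }
                }
                Thread.sleep(5_000); // poll interval
            }
        }
    }
}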

Write data to Apache Accumulo

I want to write streaming data to Accumulo. Is there an API for Accumulo to write data? Is it possible in Python instead of Java?
See the BatchWriter, which you instantiate via a Connector. The Accumulo Thrift Proxy enables non-Java clients to interact with Accumulo.
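A minimal Java sketch of writing a single entry with a BatchWriter follows; the instance name, ZooKeeper hosts, credentials, table, and cell contents are illustrative, and the Accumulo 1.x Connector API is assumed.

import java.nio.charset.StandardCharsets;

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;

public class AccumuloWriteExample {
    public static void main(String[] args) throws Exception {
        // Instance name, ZooKeeper hosts, credentials, and table are placeholders.
        ZooKeeperInstance instance = new ZooKeeperInstance("myInstance", "zk1:2181");
        Connector connector = instance.getConnector("user", new PasswordToken("secret"));

        BatchWriter writer = connector.createBatchWriter("mytable", new BatchWriterConfig());
        Mutation mutation = new Mutation("row1"); // row id
        mutation.put("family", "qualifier",
                new Value("hello accumulo".getBytes(StandardCharsets.UTF_8)));
        writer.addMutation(mutation);
        writer.close(); // flushes any buffered mutations
    }
}

For Python, the Thrift Proxy mentioned above exposes equivalent write operations to non-Java clients.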