Unable to create the storage plugin for Hive in Apache Drill

I am new to Apache Drill. While creating the storage plugin for Apache Hive, I am getting an error. I have tried two ways; the configurations are below.
1. First approach:
{
  "type": "hive",
  "enabled": false,
  "configProps": {
    "hive.metastore.uris": "thrift2:localhost:10000",
    "fs.default.name": "hdfs://localhost:9000/",
    "hive.metastore.sasl.enabled": "false"
  }
}
2. Second approach:
{
  "type": "hive",
  "enabled": false,
  "configProps": {
    "hive.metastore.uris": "",
    "javax.jdo.option.ConnectionURL": "jdbc:derby://localhost:1527/metastore_db;create=true",
    "hive.metastore.warehouse.dir": "/user/tmp/warehouse/hive",
    "fs.default.name": "hdfs://localhost:9000",
    "hive.metastore.sasl.enabled": "false"
  }
}
I am using plain Apache components, and both Drill and Hive 2 are installed on the same machine.
In both cases I get the following error in the GUI:
Please retry: error (unable to create/ update storage)
Kindly help me resolve this. Thanks in advance!

I am able to connect through the first approach, i.e. the Hive remote metastore connection.
Here is the configuration:
{
  "type": "hive",
  "enabled": false,
  "configProps": {
    "hive.metastore.uris": "thrift:localhost:9083",
    "fs.default.name": "hdfs://localhost:9000/",
    "hive.metastore.sasl.enabled": "false"
  }
}
Also make sure that the Hive metastore is up and running. It can be started with the command below:
hive --service metastore &
The hive.metastore.uris parameter in hive-site.xml should also be updated to thrift://localhost:9083.
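For reference, the corresponding entry in hive-site.xml would look something like this (a sketch, assuming the default metastore port 9083):
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://localhost:9083</value>
</property>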
Thanks

Related

AWS AMI cannot retrieve password after packer creation using private key

I am building a Windows Server AMI using Packer. It works fine with a hardcoded password, but I am trying to create the AMI so that the password is autogenerated. I tried what was suggested in the post below, and the Packer logs look good; it gets a password.
How to create windows image in packer using the keypair
However, when I create an EC2 instance from the AMI in Terraform, the connection to the Windows password is lost and it cannot be retrieved. What is missing here?
Packer JSON:
{
  "builders": [
    {
      "profile": "blah",
      "type": "amazon-ebs",
      "region": "eu-west-1",
      "instance_type": "t2.micro",
      "source_ami_filter": {
        "filters": {
          "virtualization-type": "hvm",
          "name": "*Windows_Server-2012-R2*English-64Bit-Base*",
          "root-device-type": "ebs"
        },
        "most_recent": true,
        "owners": "amazon"
      },
      "ssh_keypair_name": "shared.key",
      "ssh_private_key_file": "./common/sharedkey.pem",
      "ssh_agent_auth": "true",
      "ami_name": "test-{{timestamp}}",
      "user_data_file": "./common/bootstrap_win.txt",
      "communicator": "winrm",
      "winrm_username": "Administrator"
    }
  ]
}
Adding Ec2Config.exe -sysprep at the end worked.
{
  "type": "windows-shell",
  "inline": ["C:\\progra~1\\Amazon\\Ec2ConfigService\\Ec2Config.exe -sysprep"]
}
Beware, though: it seems my IIS configuration does not work after sysprep.
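For context, here is a minimal sketch of where that snippet fits in the Packer template: the provisioner goes in the top-level "provisioners" array alongside the existing "builders" block, and the Ec2Config path shown is the Windows Server 2012 R2 default.
{
  "builders": [ ... ],
  "provisioners": [
    {
      "type": "windows-shell",
      "inline": ["C:\\progra~1\\Amazon\\Ec2ConfigService\\Ec2Config.exe -sysprep"]
    }
  ]
}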

Druid RabbitMQ Firehose

I'm trying to set up Druid to work with the RabbitMQ firehose, but I get the following error from Tranquility:
java.lang.IllegalArgumentException: Could not resolve type id 'rabbitmq' into a subtype of [simple type, class io.druid.data.input.FirehoseFactory]
I did the following:
1. Installed Druid
2. Downloaded the druid-rabbitmq extension
3. Copied druid-rabbitmq into the Druid extensions directory
4. Copied the amqp-client jar to the Druid lib directory
5. Added druid-rabbitmq to druid.extensions.loadList in common.runtime.properties (see the sketch after the firehose config below)
6. Added the firehose config to the Tranquility server.json configuration:
"ioConfig" : {
"type" : "realtime",
"firehose" : {
"type" : "rabbitmq",
"connection" : {
"host": "localhost",
"port": "5672",
"username": "blackbox",
"password": "blackbox",
"virtualHost": "blackbox-vhost",
"uri": "amqp://localhost:5672/blackbox-vhost"
},
"config" : {
"exchange": "test-exchange",
"queue" : "test-q",
"routingKey": "#",
"durable": "true",
"exclusive": "false",
"autoDelete": "false",
"maxRetries": "10",
"retryIntervalSeconds": "1",
"maxDurationSeconds": "300"
}
}
}
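For reference, step 5 amounts to a line like the following in common.runtime.properties (a sketch; keep whatever extensions you already load in the list):
# keep any extensions you already load, and add druid-rabbitmq
druid.extensions.loadList=["druid-rabbitmq"]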
I'm using Imply 1.3.0, but I think Tranquility is for stream pushing while a firehose is used for stream pulling, so I think that was the problem. I have now created a realtime node and it's running fine. I also had to copy the lyra jar file into the Druid lib directory. Now I can publish data from RabbitMQ, it gets inserted into Druid, and I can query the data, but the problem is that in RabbitMQ the messages still show as unacked. Any idea?

Making storage plugin on Apache Drill to HDFS

I'm trying to make a storage plugin for Hadoop (HDFS) in Apache Drill.
Actually, I'm confused and I don't know what to set as the port for the hdfs:// connection, and what to set for the location.
This is my plugin:
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://localhost:54310",
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    },
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "psv": {
      "type": "text",
      "extensions": ["tbl"],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": ["csv"],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": ["tsv"],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json"
    },
    "avro": {
      "type": "avro"
    }
  }
}
So, is it correct to set localhost:54310, since I got that with the command:
hdfs getconf -nnRpcAddresses
or should it be :8020?
Second question: what do I need to set for location? My Hadoop folder is in /usr/local/hadoop, and there you can find /etc, /bin, /lib, /log, and so on. So, do I need to set the location to my datanode, or what?
Third question: when I'm connecting to Drill, I go through sqlline and then connect to my ZooKeeper like:
!connect jdbc:drill:zk=localhost:2181
My question here is: after I make the storage plugin and connect to Drill with zk, can I query an HDFS file?
I'm very sorry if this is a noob question, but I haven't found anything useful on the internet, or at least it hasn't helped me.
If you can explain some of this to me, I'll be very grateful.
As per the Drill docs:
{
  "type": "file",
  "enabled": true,
  "connection": "hdfs://10.10.30.156:8020/",
  "workspaces": {
    "root": {
      "location": "/user/root/drill",
      "writable": true,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "json": {
      "type": "json"
    }
  }
}
In "connection",
put namenode server address.
If you are not sure about this address.
Check fs.default.name or fs.defaultFS properties in core-site.xml.
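For example, core-site.xml typically contains an entry like this (the host and port below are just placeholders for your namenode address):
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://localhost:8020</value>
</property>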
Coming to "workspaces",
you can save workspaces in this. In the above example, there is a workspace with name root and location /user/root/drill.
This is your HDFS location.
If you have files under /user/root/drill hdfs directory, you can query them using this workspace name.
Example: abc is under this directory.
select * from dfs.root.`abc.csv`
After successfully creating the plugin, you can start Drill and start querying.
You can query any directory irrespective of workspaces.
Say you want to query employee.json in the /tmp/data HDFS directory.
The query is:
select * from dfs.`/tmp/data/employee.json`
I had a similar problem where Drill could not read the dfs server. In the end, the problem was caused by the namenode port.
The default address of the namenode web UI is http://localhost:50070/.
The default address of the namenode server (RPC) is hdfs://localhost:8020/.
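If in doubt, the configured value can also be printed from the command line (assuming the Hadoop client is on the PATH):
hdfs getconf -confKey fs.defaultFS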

Apache Drill: table not found on s3 bucket

I'm a newbie with Apache Drill.
The scenario is this:
I have an S3 bucket, where I placed my CSV file called test.csv.
I've installed Apache Drill following the instructions from the official website.
I followed this tutorial: https://drill.apache.org/blog/2014/12/09/running-sql-queries-on-amazon-s3/ to create an S3 plugin.
I start Drill and use the correct "workspace" (with: use my-s3;), but when I try to select records from the test.csv file, an error occurs:
Table 's3./test.csv' not found.
Can anyone help me?
Thanks!
Use the name of your workspace (if you use one) and backticks in the USE command, as follows:
USE `my-s3`.`<workspace-name>`;
SHOW files; -- should list the test.csv file
SELECT * FROM `test.csv`;
Query the CSV in the local file system using the dfs storage plugin configuration to rule out things like a header causing a problem. This page might help if you haven't seen it.
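For example, a quick sanity check against the local file system with the default dfs plugin could look like this (the path is just a placeholder for wherever a local copy of the file lives):
SELECT * FROM dfs.`/tmp/test.csv`;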
Storage plugin mentioned in the comment above:
{
  "type": "file",
  "enabled": true,
  "connection": "s3n://<accesskey>:<secret>@catpaws",
  "workspaces": {},
  "formats": {
    "psv": {
      "type": "text",
      "extensions": ["tbl"],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": ["csv"],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": ["tsv"],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json"
    }
  }
}
This is probably not relevant, but it's an excerpt from the Amazon S3 help, which contains lots more info:
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>ID</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>SECRET</value>
</property>

composer to disable https completely

My network does not work well with HTTPS, so running
composer.phar install
throws
[Composer\Downloader\TransportException]
The "https://packagist.org/packages.json" file could not be downloaded: Failed to enable crypto
failed to open stream: operation failed
I used
{
  "packagist": false
},
{
  "type": "composer",
  "url": "http://packagist.org",
  "options": {
    "ssl": {
      "verify_peer": "false"
    }
  }
}
as an HTTP fallback, but it crashes again at some other point:
Installing dependencies
- Installing symfony/translation (v2.4.0)
Downloading: 100%
Downloading: 100%
Downloading: 100%
[Composer\Downloader\TransportException]
The "https://api.github.com/repos/symfony/Translation/zipball/0919e0fc709217f8c9e5049f2603419fdd4a34ff" file could not be downloaded: Failed to
enable crypto
failed to open stream: operation failed
My problem is just with TLSv1; previous SSL versions should work, as the browsers work correctly.
What should I do? The problem also exists in other command-line tools that depend on HTTPS, like npm, bower, git, curl, ...
composer config --global disable-tls true
composer config --global secure-http false
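To confirm the two settings took effect, you can list the global configuration afterwards (a sketch using the standard composer config --list option, piped through grep for convenience):
composer config --global --list | grep -E 'disable-tls|secure-http'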
You can turn off TLS (for your specific project) in your composer.json, like so:
{
  "require": {
    "laravel/framework": "5.2.43"
  },
  "config": {
    "preferred-install": "dist",
    "disable-tls": true,
    "secure-http": false
  }
}
NB: take note of the "disable-tls": true entry in the config section.
The problem is simply that you wrapped "false" in quotes, and a non-empty string converts to true as a bool.
Use "verify_peer": false instead of "verify_peer": "false":
{
  "repositories": [
    {
      "type": "composer",
      "url": "http://packagist.org",
      "options": {
        "ssl": {
          "verify_peer": false
        }
      }
    }
  ]
}
It will work; you just have a type mismatch:
"options": {
"ssl": {
"verify_peer": false
}
}
In order to disable HTTPS totally (not recommended), you need to add "secure-http": false under the config key in your composer.json file, like this:
{
  "name": "laravel/laravel",
  "description": "The Laravel Framework.",
  "keywords": ["framework", "laravel"],
  "license": "MIT",
  "require": {
    "laravel/framework": "5.3.*"
  },
  ...
  "config": {
    "preferred-install": "dist",
    "bin-dir": "vendor/bin/",
    "secure-http": false
  },
  "minimum-stability": "dev"
}
You cannot really disable SSL with Composer. Even if it works in your setup, you cannot control the source URLs of the packages you use; some of them do not offer anything without SSL, so you MUST use SSL.
I think the best idea is to make SSL work. Did you try composer diag to see where the problem is?