Where does "insert overwrite local directory" create file on local file system? - hive

The INSERT OVERWRITE LOCAL DIRECTORY command in Hive creates a file on the local filesystem, but on which node's local filesystem will the file be created? Will it always be the namenode, or whichever node happens to run the job?

which node's local filesystem will the file be created on?
The file will be created on the system where you execute the Hive query.
Example: if you have two nodes, a namenode and a slave node, and you run the Hive query on the slave node, the file will be created on the slave node's filesystem.
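For illustration, a minimal sketch (the table my_table and the output path are hypothetical); run from the slave node, the directory below is created on the slave node's local filesystem:

# run this on the node whose local filesystem should receive the output
hive -e "INSERT OVERWRITE LOCAL DIRECTORY '/tmp/hive_out'
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  SELECT * FROM my_table;"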
NOTE: if you want to have Hive installations on both the namenode and the slave node, just use the hive.metastore.uris property to point to both the namenode and slave node metastore locations.
The property should look like this:
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://namenode-ip:9083,thrift://slave-ip:9083</value>
</property>
Replace namenode-ip and slave-ip with the respective IP addresses.

The file will be created locally on the node from which you execute the job.

Related

I am not able to find how the previous team routed redis dump.rdb to a different directory; it is not storing in /var/lib/redis

I am quite familiar with Redis conf files, and I am also aware that by default Redis stores dump.rdb files under /var/lib/redis.
I transitioned to handling an app where the previous team installed Redis in /opt/app/. I do see dump.rdb files present in /var/lib/redis, but nothing is being stored there and the date stamp is two years old. I have now found that Redis is storing dump.rdb in a different location, but I am not able to find that location specified in the redis.conf file. Is there any other file where the dump.rdb location could be specified that tells Redis to store dump.rdb in a specific location?
You can use CONFIG GET dir and CONFIG GET dbfilename to get the path and filename of the current RDB file. You can also use CONFIG SET dir xxx and CONFIG SET dbfilename xxx to dynamically change the path and filename.
Also, you can use the INFO server command to get the path of the config file your Redis instance loaded (check the config_file item).
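Putting that together, a minimal sketch (the host and port are assumptions):

# ask the running instance where it writes the RDB file
redis-cli -h 127.0.0.1 -p 6379 CONFIG GET dir
redis-cli -h 127.0.0.1 -p 6379 CONFIG GET dbfilename
# find which config file this instance was started with
redis-cli -h 127.0.0.1 -p 6379 INFO server | grep config_file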

How to configure redis to use an environment variable as the disk location path?

I have enabled both aof and rdb in my redis server. Redis will save the two files appendonly.aof and dump.rdb on the disk. How can I use environment variables to control the path of these two files?
AFAIK, Redis does NOT read these configurations from environment variables.
You CAN configure these paths in the redis.conf file, or use the CONFIG SET command to set them dynamically.
The corresponding configuration keys are: dir, dbfilename, and appendfilename.
NOTE: it seems that, as of now, appendfilename is NOT supported for dynamic changes with the CONFIG SET command.
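As a workaround, one sketch that gets a similar effect (REDIS_DATA_DIR is a hypothetical variable name): let the shell expand the environment variable and pass the paths as command-line options to redis-server, which override the values in redis.conf at startup:

# the shell reads the environment here; redis-server itself does not
REDIS_DATA_DIR=/data/redis
redis-server /etc/redis/redis.conf --dir "$REDIS_DATA_DIR" --dbfilename dump.rdb --appendfilename appendonly.aof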

Copy redis database (.rdb file) from a remote server to local

I was given a Redis server that is set up remotely.
I can access data on it and can do CRUD operations against that server.
But I want a replica of the same database on my local machine.
I have Redis Desktop Manager set up locally, and also a local redis-server running.
Things I have tried:
using the SAVE command.
I connected to the remote server and executed SAVE. It ran successfully and created a dump.rdb file on that server. But I can't access that file, as I don't have FTP permission on the server.
using BGSAVE
same scenario here as well
using the redis-cli command
redis-cli -h <server-ip> -p 6379 save > \\<local-ip>\dump.rdb
Here I got the error "The network name cannot be found."
Can anyone please suggest how I can copy the .rdb file from the server to my local machine?
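One approach worth noting, as a sketch (the server address is a placeholder): redis-cli has an --rdb option that streams an RDB snapshot from the remote server over the Redis protocol itself, so no FTP or SSH access to the server's filesystem is needed:

# fetch a snapshot of the remote dataset into a local file
redis-cli -h <server-ip> -p 6379 --rdb /tmp/dump.rdb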

Redis dump file prevents slave mode

I have a redis master and slave configuration. What I'm finding is that if I restart the slave while there is a dump.rdb file in the directory, the instance starts as a master with no connected slaves rather than obeying the slaveof command in the config file. If I restart the service after deleting the dump.rdb, then the slave will come back online and connect to the master as expected.
Is this expected behavior? If so, is there a way to allow a slave to be restarted and go directly into slave mode?
Edit: This is running on Windows.
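A quick check sketch (host and port are assumptions) to confirm which role the restarted instance actually took, rather than relying on the config file alone:

# inspect the replication role after restart
redis-cli -h 127.0.0.1 -p 6379 INFO replication | grep -E "role|master_host"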

graceful_stop.sh not found in HDP2.1 Hbase

I was reading the Hortonworks documentation on removing a regionserver from any host of the cluster (http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1-latest/bk_system-admin-guide/content/admin_decommission-slave-nodes-3.html).
It uses the graceful_stop.sh script. The same script is described in the Apache HBase book (https://hbase.apache.org/book/node.management.html).
I tried to find this script but was not able to locate it.
[hbase@node ~]$ ls /usr/lib/hbase/bin/
draining_servers.rb hbase.cmd hbase-daemon.sh region_status.rb test
get-active-master.rb hbase-common.sh hbase-jruby replication
hbase hbase-config.cmd hirb.rb start-hbase.cmd
hbase-cleanup.sh hbase-config.sh region_mover.rb stop-hbase.cmd
[hbase@node ~]$
Has this script been removed from HBase?
Is there any other way to stop a region server from another host in the cluster? For example, I want to stop region server 1. Can I do this by logging into region server 2?
Yes, the script is removed from HBase if you use a package install. But you can still find it in the source files.
If you want to stop region server A from another host B, then host B must have the privilege to access A, e.g. you have added the public key of host B to authorized_keys on A. In a typical cluster, an RS cannot log in to another RS directly, for security reasons.
For how to write graceful_stop.sh yourself, see: https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/fA3019_vpZY
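As a rough sketch of what graceful_stop.sh does, using only the scripts visible in the listing above (the hostname is a placeholder, and this assumes HBase's environment is set up on the node you run it from):

# drain regions off the target regionserver, then stop it
hbase org.jruby.Main /usr/lib/hbase/bin/region_mover.rb unload <regionserver-hostname>
/usr/lib/hbase/bin/hbase-daemon.sh stop regionserver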