Impala: Show tables like query - hive

I am working with Impala and fetching the list of tables from a database that match a pattern, as below.
Assume I have a database bank, and the tables under this database are as below.
cust_profile
cust_quarter1_transaction
cust_quarter2_transaction
product_cust_xyz
....
....
etc
Now I am filtering like this:
show tables in bank like '*cust*'
It returns the expected results: the tables that have the word cust in their name.
Now my requirement is to get all the tables that have cust in their name but do not have quarter2 in it.
Can someone please help me solve this?

Execute the query from the shell and then filter the output:
impala-shell -q "show tables in bank like '*cust*'" | grep -v 'quarter2'
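If you then need to act on each matching table, a minimal sketch (assuming a bash shell and that impala-shell is on the PATH; the describe step is just an illustrative action) is:
# loop over the filtered table names and describe each one
for t in $(impala-shell -B -q "show tables in bank like '*cust*'" | grep -v 'quarter2'); do
  impala-shell -q "describe bank.$t"
done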

Query the metastore
mysql -u root -p -e "select TBL_NAME from metastore.TBLS where TBL_NAME like '%cust%' and TBL_NAME not like '%quarter2%'";
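Note that the query above matches table names across every database in the metastore; to restrict it to the bank database you can join TBLS to DBS (a sketch, assuming the default Hive metastore schema in a database named metastore):
select t.TBL_NAME
from metastore.TBLS t
join metastore.DBS d on t.DB_ID = d.DB_ID
where d.NAME = 'bank'
  and t.TBL_NAME like '%cust%'
  and t.TBL_NAME not like '%quarter2%';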

How to pass dynamic values into snowflake query?

I have a query to find potential SSNs in a table using a regex pattern.
Table name: db_name.schema_name.ABC
Column with sensitive data: senstve_col
select regexp_substr(senstve_col, '\\b[0-9]{3}[ -][0-9]{2}[ -][0-9]{4}\\b') as sensitive_data, * from db_name.schema_name.ABC;
I need to do this for 200 tables with 200 different column names. The db_name and schema_name also vary for each table.
Is there a way to pass the values dynamically and store the data into a new table in Snowflake?
Can someone help with a query to automate the above for multiple tables?
This is how you call the .sql file from Unix:
snowsql -o variable_substitution=True --variable NEXT_TABLE=tblname --variable NEXT_COL=colname -f /home/sagar/snowflake/create_table.sql
And this is how you reference the variables in the .sql file:
create or replace view ntgrpa_hist.vw_rt_satelliteinfo_latest as
select &NEXT_COL from public.&NEXT_TABLE;
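To cover all 200 tables, one approach (a sketch; the tables.csv file, its layout, and the extra NEXT_DB/NEXT_SCHEMA variables are assumptions, not something snowsql requires) is to drive snowsql from a control file, one call per table:
# tables.csv layout (illustrative): db_name,schema_name,table_name,column_name
while IFS=',' read -r db schema tbl col; do
  snowsql -o variable_substitution=True \
          --variable NEXT_DB="$db" --variable NEXT_SCHEMA="$schema" \
          --variable NEXT_TABLE="$tbl" --variable NEXT_COL="$col" \
          -f /home/sagar/snowflake/create_table.sql
done < tables.csv
The .sql file would then reference &NEXT_DB, &NEXT_SCHEMA, &NEXT_TABLE, and &NEXT_COL in the same way as &NEXT_TABLE above.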

How can I find last modified timestamp for a table in Hive?

I'm trying to fetch the last modified timestamp of a table in Hive.
Please use the below command:
show TBLPROPERTIES table_name ('transient_lastDdlTime');
Get the transient_lastDdlTime from your Hive table.
SHOW CREATE TABLE table_name;
Then paste the transient_lastDdlTime value into the query below to get it as a timestamp.
SELECT CAST(from_unixtime(your_transient_lastDdlTime_value) AS timestamp);
With the help of the above answers, I have created a simple solution for future readers.
time_column=`beeline --hivevar db=hiveDatabase --hivevar tab=hiveTable --silent=true --showHeader=false --outputformat=tsv2 -e 'show create table ${db}.${tab}' | egrep 'transient_lastDdlTime'`
time_value=`echo $time_column | sed 's/[|,)]//g' | awk -F '=' '{print $2}' | sed "s/'//g"`
tran_date=`date -d @$time_value +'%Y-%m-%d %H:%M:%S'`
echo $tran_date
I used a beeline alias here. Make sure the alias is set up properly before invoking the script above; if you do not use an alias, replace beeline with the complete beeline command (with the JDBC connection string). Leave a question in the comments if anything is unclear.
There is already an answer here for how to see the last modified date of a Hive table; I am just sharing how to check the last modified date of a Hive table partition.
Connect to the Hive cluster to run Hive queries. In most cases you can simply connect by running the hive command: hive
DESCRIBE FORMATTED <database>.<table_name> PARTITION(<partition_column>=<partition_value>);
In the response you will see something like this : transient_lastDdlTime 1631640957
SELECT CAST(from_unixtime(1631640957) AS timestamp);
You may get the timestamp by executing
describe formatted table_name
You can execute the command below and convert the transient_lastDdlTime value in its output from a Unix timestamp to a date. It will give the last modified timestamp for the table.
show create table TABLE_NAME;
If you are using MySQL as the metastore backend, use the following:
select TABLE_NAME, UPDATE_TIME, TABLE_SCHEMA from information_schema.TABLES where TABLE_SCHEMA = 'employees';
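Alternatively, you can read transient_lastDdlTime straight out of the metastore tables (a sketch, assuming the default Hive metastore schema in a MySQL database named metastore):
select t.TBL_NAME, from_unixtime(p.PARAM_VALUE) as last_ddl_time
from metastore.TBLS t
join metastore.TABLE_PARAMS p on p.TBL_ID = t.TBL_ID
where p.PARAM_KEY = 'transient_lastDdlTime';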

BigQuery command line tool - append to table using query

Is it possible to append the results of running a query to a table using the bq command line tool? I can't see any flags to specify this, and when I run it, it fails and states "table already exists".
bq query --allow_large_results --destination_table=project:DATASET.table "SELECT * FROM [project:DATASET.another_table]"
BigQuery error in query operation: Error processing job '':
Already Exists: Table project:DATASET.table
Originally BigQuery did not support the standard SQL idiom
INSERT foo SELECT a,b,c from bar where d>0;
and you had to do it their way with --append_table
But according to Will's answer, it works now.
Originally with bq, there was
bq query --append_table ...
The help for the bq query command is
$ bq query --help
And the output shows an append_table option in the top 25% of the output.
Python script for interacting with BigQuery.
USAGE: bq.py [--global_flags] <command> [--command_flags] [args]
query Execute a query.
Examples:
bq query 'select count(*) from publicdata:samples.shakespeare'
Usage:
query <sql_query>
Flags for query:
/home/paul/google-cloud-sdk/platform/bq/bq.py:
--[no]allow_large_results: Enables larger destination table sizes.
--[no]append_table: When a destination table is specified, whether or not to
append.
(default: 'false')
--[no]batch: Whether to run the query in batch mode.
(default: 'false')
--destination_table: Name of destination table for query results.
(default: '')
...
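Putting it together for the example in the question (a sketch reusing the question's dataset and table names):
bq query --allow_large_results --append_table --destination_table=project:DATASET.table "SELECT * FROM [project:DATASET.another_table]"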
Instead of appending two tables together, you might be better off with a UNION ALL, which is SQL's version of concatenation.
In BigQuery (legacy SQL), the comma operator between two tables, as in SELECT something FROM tableA, tableB, is a UNION ALL, not a JOIN, or at least it was the last time I looked.
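For example, in legacy SQL the following (a sketch using the question's table names) concatenates the two tables rather than joining them:
SELECT * FROM [project:DATASET.table], [project:DATASET.another_table]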
Just in case someone ends up finding this question on Google: BigQuery has evolved a lot since this post, and it now supports standard SQL.
If you want to append the results of a query to a table using the DML syntax feature of the Standard version, you could do something like:
INSERT dataset.Warehouse (warehouse, state)
SELECT *
FROM UNNEST([('warehouse #1', 'WA'),
('warehouse #2', 'CA'),
('warehouse #3', 'WA')])
As presented in the docs.
For the command line tool it follows the same idea, you just need to add the flag --use_legacy_sql=False, like so:
bq query --use_legacy_sql=False "insert into dataset.table (field1, field2) select field1, field2 from table"
According to the current documentation (March 2018): https://cloud.google.com/bigquery/docs/loading-data-local#appending_to_or_overwriting_a_table_using_a_local_file
You should add:
--noreplace or --replace=false

run SQL script using Rocket Universe from AIX command line (uvsh)

Hopefully this is a simple question. I want to write a shell script that calls a SQL script to run some queries against a Rocket UniVerse database. I am doing this from the server command line (the same machine where the database resides).
In SQLSERVER I might do something like the following:
sqlcmd -S myServer\instanceName -i C:\myScript.sql
In Oracle like this:
SQL> @/dir/test.sql
In UV I can't figure it out:
uvsh ??? some.sql file
So in the test.sql file I might have something like the following:
"SELECT ID, COL1, COL2 FROM PRODUCT WHERE #ID=91;"
"SELECT ID, COL1, COL2 FROM PRODUCT WHERE #ID=92;"
"SELECT ID, COL1, COL2 FROM PRODUCT WHERE #ID=93;"
So can this be done or am I going about this the wrong way? Maybe a different method is more optimal? -- Thanks!
You can send the list of commands to the UniVerse process with the following command.
C:\U2\UV\HS.SALES>type test.sql | ..\bin\uv
You will not need the double quotes around each statement as you described.
For the HS.SALES account the following SQL commands should work:
SELECT #ID, FNAME, LNAME FROM CUSTOMER WHERE #ID='2';
SELECT #ID, FNAME, LNAME FROM CUSTOMER WHERE #ID='3';
SELECT #ID, FNAME, LNAME FROM CUSTOMER WHERE #ID='4';
Note that this may not do exactly what you want: it displays the results on standard output. Also, caution should be taken when sending commands to UniVerse in this manner; if all of the UniVerse licenses are in use, the uv command will fail and the SQL commands will never execute.
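Since the question is about AIX, the same idea works from a Unix shell (a sketch; the account directory and the path to the uv binary are assumptions and depend on your installation):
# run the SQL statements from test.sql against the account in the current directory
cd /u2/uv/HS.SALES && cat test.sql | /u2/uv/bin/uv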

MySQL dump by query

Is it possible to do a mysqldump with a single SQL query?
I mean, to dump the whole database, like phpMyAdmin does when you export to SQL.
Not mysqldump, but the mysql CLI...
mysql -e "select * from myTable" -u myuser -pxxxxxxxxx mydatabase
You can redirect it out to a file if you want:
mysql -e "select * from myTable" -u myuser -pxxxxxxxx mydatabase > mydumpfile.txt
Update:
The original post asked whether he could dump from the database by query. What he asked and what he meant were different; he really just wanted to mysqldump all tables.
mysqldump mydatabase --tables myTable --where="id < 1000"
This should work
mysqldump --databases X --tables Y --where="1 limit 1000000"
Dump a table using a where query:
mysqldump mydatabase mytable --where="mycolumn = myvalue" --no-create-info > data.sql
Dump an entire table:
mysqldump mydatabase mytable > data.sql
Notes:
Replace mydatabase, mytable, and the where statement with your desired values.
By default, mysqldump will include DROP TABLE and CREATE TABLE statements in its output. Therefore, if you wish to not delete all the data in your table when restoring from the saved data file, make sure you use the --no-create-info option.
You may need to add the appropriate -h, -u, and -p options to the example commands above in order to specify your desired database host, user, and password, respectively.
You can dump a query as csv like this:
SELECT * from myTable
INTO OUTFILE '/tmp/querydump.csv'
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
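Note that INTO OUTFILE writes the file on the database server itself, and on many installations the secure_file_priv setting restricts (or disables) the directories it may write to. You can check it first:
SHOW VARIABLES LIKE 'secure_file_priv';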
You can use the --where option of mysqldump to produce the output you are looking for:
mysqldump -u root -p test t1 --where="1=1 limit 100" > arquivo.sql
At most 100 rows from table t1 in database test will be dumped.
If you want to export your last n amount of records into a file, you can run the following:
mysqldump -u user -p -h localhost --where "1=1 ORDER BY id DESC LIMIT 100" database table > export_file.sql
The above will save the last 100 records into export_file.sql, assuming the table you're exporting from has an auto-incremented id column.
You will need to alter the user, localhost, database and table values. You may optionally alter the id column and export file name.
MySQL Workbench also has this feature neatly in the GUI. Simply run a query and click the save icon next to Export/Import.
Then choose "SQL INSERT statements (*.sql)" in the list.
Enter a name, click save, confirm the table name and you will have your dump file.
Combining much of the above, here is my real, practical example, selecting records based on both MeterID and TIMESTAMP. I have needed this command for years, and it executes really quickly.
mysqldump -uuser -ppassword main_dbo trHourly --where="MeterID =5406 AND TIMESTAMP<'2014-10-13 05:00:00'" --no-create-info --skip-extended-insert | grep '^INSERT' > 5406.sql
Export the query results from the MySQL command line:
mysql -h120.26.133.63 -umiyadb -proot123 miya -e "select * from user where id=1" > mydumpfile.txt
If you want to dump specific fields from a table, this can be handy:
1. Create a temporary table with your query:
create table tmptable select field1, field2, field3 from mytable where filter1 and filter2;
2. Dump the whole temporary table. You then have a dump file containing only your specific fields:
mysqldump -u user -p mydatabase tmptable > my-quick-dump.sql
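Once the dump has been written, you may want to remove the helper table again:
drop table tmptable;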
To dump a specific table,
mysqldump -u root -p dbname -t tablename --where="id<30" > post.sql
Here is my mysqldump to select the same relation from different tables:
mysqldump --defaults-file=~/.mysql/datenbank.rc -Q -t -c --hex-blob \
--default-character-set=utf8 --where="`cat where-relation-ids-in.sql`" \
datenbank table01 table02 table03 table04 > recovered-data.sql
where-relation-ids-in.sql:
relation_id IN (6384291, 6384068, 6383414)
~/.mysql/datenbank.rc
[client]
user=db_user
password=db_password
host=127.0.0.1
Remark: if your relation_id list is huge, the WHERE-clause comment in the dump file will be truncated, but all data is still selected correctly ;-)
I hope it helps someone ;-)