Create table name in Hive using variable substitution - hive

I'd like to create a table name in Hive using variable substitution.
E.g.
SET market = "AUS";
create table ${hiveconf:market_cd}_active as ... ;
But it fails. Any idea how it can be achieved?

You should wrap the table name in backticks (``) for that, like:
SET market=AUS;
CREATE TABLE `${hiveconf:market}_active` AS SELECT 1;
DESCRIBE `${hiveconf:market}_active`;
Example run of script.sql from beeline:
$ beeline -u jdbc:hive2://localhost:10000/ -n hadoop -f script.sql
Connecting to jdbc:hive2://localhost:10000/
...
0: jdbc:hive2://localhost:10000/> SET market=AUS;
No rows affected (0.057 seconds)
0: jdbc:hive2://localhost:10000/> CREATE TABLE `${hiveconf:market}_active` AS SELECT 1;
...
INFO : Dag name: CREATE TABLE `AUS_active` AS SELECT 1(Stage-1)
...
INFO : OK
No rows affected (12.402 seconds)
0: jdbc:hive2://localhost:10000/> DESCRIBE `${hiveconf:market}_active`;
...
INFO : Executing command(queryId=hive_20190801194250_1a57e6ec-25e7-474d-b31d-24026f171089): DESCRIBE `AUS_active`
...
INFO : OK
+-----------+------------+----------+
| col_name | data_type | comment |
+-----------+------------+----------+
| _c0 | int | |
+-----------+------------+----------+
1 row selected (0.132 seconds)
0: jdbc:hive2://localhost:10000/> Closing: 0: jdbc:hive2://localhost:10000/

Markovitz's criticisms are correct, but do not produce a correct solution. In summary, you can use variable substitution for things like string comparisons, but NOT for things like naming variables and tables. If you know much about language compilers and parsers, you get a sense of why this would be true. You could construct such behavior in a language like Java, but SQL is just too crude.
Running that code produces an error, "cannot recognize input near '$' '{' 'hiveconf' in table name". (I am running Hortonworks, Hive 1.2.1000.2.5.3.0-37.)
I spent a couple hours Googling and experimenting with different combinations of punctuation, different tools ranging from command line, Ambari, and DB Visualizer, etc., and I never found any way to construct a table name or a field name with a variable value. I think you're stuck with using variables in places where you need a string literal, like comparisons, but you cannot use them in place of reserved words or existing data structures, if that makes sense. By example:
--works
drop table if exists user_rgksp0.foo;
-- Does NOT work:
set MY_FILE_NAME=user_rgksp0.foo;
--drop table if exists ${hiveconf:MY_FILE_NAME};
-- Works
set REPORT_YEAR=2018;
select count(1) as stationary_event_count, day, zip_code, route_id from aaetl_dms_pub.dms_stationary_events_pub
where part_year = '${hiveconf:REPORT_YEAR}'
-- Does NOT Work:
set MY_VAR_NAME='zip_code'
select count(1) as stationary_event_count, day, '${hiveconf:MY_VAR_NAME}', route_id from aaetl_dms_pub.dms_stationary_events_pub
where part_year = 2018
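If substitution in the table-name position is rejected on your setup, one possible workaround is to build the statement text outside Hive and hand the finished script to beeline or the hive CLI. The following is only a sketch in Python: the function names and the identifier rule (letters, digits, underscores, optional database prefix) are assumptions made for illustration and are not part of Hive.

```python
import re

# Assumed rule for a safe identifier: letters, digits and underscores,
# with an optional "database." prefix. Adjust to your own naming policy.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def quote_identifier(name: str) -> str:
    """Validate a (possibly database-qualified) name and backtick-quote each part."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return ".".join(f"`{part}`" for part in name.split("."))


def drop_table_statement(table: str) -> str:
    """Build a DROP TABLE statement with the table name already filled in."""
    return f"DROP TABLE IF EXISTS {quote_identifier(table)};"


def create_active_table_statement(market: str) -> str:
    """Build the CREATE TABLE statement for a hypothetical <market>_active table."""
    return f"CREATE TABLE {quote_identifier(market + '_active')} AS SELECT 1;"


if __name__ == "__main__":
    print(drop_table_statement("user_rgksp0.foo"))
    print(create_active_table_statement("AUS"))
```

The generated text can be written to a .sql file and run with -f. Validating the name before interpolating it keeps arbitrary input from ending up inside the statement.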

The quotes should be removed.
You're also using the wrong variable name (market_cd instead of market):
SET market=AUS; create table ${hiveconf:market}_active as select 1;

Related

Cannot define a BigQuery column as ARRAY<STRUCT<INT64, INT64>>

I am trying to define a table that has a column that is an array of structs using standard SQL. The docs here suggest this should work:
CREATE OR REPLACE TABLE ta_producer_conformed.FundStaticData
(
id STRING,
something ARRAY<STRUCT<INT64,INT64>>
)
but I get an error:
$ bq query --use_legacy_sql=false --location=asia-east2 "$(cat xxxx.ddl.temp.sql | awk 'ORS=" "')"
Waiting on bqjob_r6735048b_00000173ed2d9645_1 ... (0s) Current status: DONE
Error in query string: Error processing job 'xxxxx-10843454-yyyyy-
dev:bqjob_r6735048b_00000173ed2d9645_1': Illegal field name:
Changing the field (edit: column!) name does not fix it. What am I doing wrong?
The fields within the struct need to be named so this works:
CREATE OR REPLACE TABLE ta_producer_conformed.FundStaticData
(
id STRING,
something ARRAY<STRUCT<x INT64,y INT64>>
)

DBeaver, How to declare variables and use them?

I just want to know if it is possible to declare variables in DBeaver's SQL editor and use them in a query.
You have to enable variable processing in the "SQL Processing" settings of DBeaver -> Window -> Preferences -> Database -> Editors -> SQL Editor -> SQL Processing. There is a block on Parameters with settings you can change. See the Dynamic Parameter binding section on the wiki.
You should then be able to do:
@set date = '2019-10-09'
SELECT ${date}::DATE, ${date}::TIMESTAMP WITHOUT TIME ZONE
which produces:
| date | timestamp |
|------------|---------------------|
| 2019-10-09 | 2019-10-09 00:00:00 |
Yes, you can, using a colon (:) prefix.
An example:
SELECT * FROM "SYSIBM".SYSDUMMY1
WHERE IBMREQD = :YOUR_VARIABLE
Based on the incredibly helpful post from @nicoschl, here are a couple of minor improvements:
-- using declarations
@set datex_start = cast('2120-01-01' as date) as date_start;
-- datex_start is the var name
-- casting the value in the declaration saves us the work later
-- the var can be given a default fieldname (e.g. "date_start")
-- run as a standalone command since the subsequent SELECT statement doesn't return values when it's all run together
select
${datex_start}
;
This will return a value "2120-01-01" with a fieldname of "date_start".
In the DBeaver SQL editor you can type the following:
-- define variable
@set my_var='hello SQL world'
-- call variable
select :my_var
You can also use ${my_var} to reference the variable; $my_var however did not work for me. I am using DBeaver v. 21.1.
You have to enable this in the DBeaver settings:
Top menu Window > Preferences > and then see the screenshot below (updated 2022/08).

How to store the output of a query in a variable in HIVE

I want to store current_day - 1 in a variable in Hive. I know there are already previous threads on this topic, but the solutions provided there recommend first defining the variable outside Hive in a shell environment and then using that variable inside Hive.
Storing result of query in hive variable
I first got the current_Date - 1 using
select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);
Then I tried two approaches:
1. set date1 = ( select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);
and
2. set hivevar:date1 = ( select date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1);
Both approaches throw an error:
"ParseException line 1:82 cannot recognize input near 'select' 'date_sub' '(' in expression specification"
When I printed the variable from (1), the select query itself was stored in the variable instead of yesterday's date. Approach (2) throws "{hivevar:dt_chk} is undefined".
I am new to Hive, would appreciate any help. Thanks.
Hive doesn't support a straightforward way to store a query result in a variable. You have to use the shell option along with hiveconf.
date1=$(hive -e "set hive.cli.print.header=false; select date_sub(from_unixtime(unix_timestamp(),'yyyy-MM-dd'),1);")
hive -hiveconf "date1"="$date1" -f hive_script.hql
Then in your script you can reference the newly created variable date1:
select '${hiveconf:date1}'
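The same idea can be driven from Python instead of a shell script. As a rough sketch: yesterday's date is computed in the host language, so no preliminary Hive query is needed, and the value is handed to the script through --hiveconf. The script name hive_script.hql comes from the answer above; the function names are made up for illustration, and actually running the command assumes the hive CLI is on the PATH.

```python
from datetime import date, timedelta
from typing import List, Optional


def yesterday(today: Optional[date] = None) -> str:
    """Return the day before `today` (default: the current date) as yyyy-MM-dd."""
    today = today or date.today()
    return (today - timedelta(days=1)).isoformat()


def build_hive_command(script: str, date1: str) -> List[str]:
    """Build the argument list that passes date1 to the script via --hiveconf."""
    return ["hive", "--hiveconf", f"date1={date1}", "-f", script]


if __name__ == "__main__":
    cmd = build_hive_command("hive_script.hql", yesterday())
    print(" ".join(cmd))
    # To actually run the script (requires the hive CLI on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

Inside hive_script.hql the value is then available as '${hiveconf:date1}', exactly as in the shell version.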
After lots of research, this is probably the best way to achieve setting a variable as an output of an SQL:
INSERT OVERWRITE LOCAL DIRECTORY '<home path>/config/date1'
select CONCAT('set hivevar:date1=',date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP(),'yyyy-MM-dd'),1)) from <some table> limit 1;
source <home path>/config/date1/000000_0;
You will then be able to use ${date1} in your subsequent SQLs.
Here we had to use <some table> limit 1 because Hive has a bug in INSERT OVERWRITE when no table name is specified.

SQL statement to list all of the current database properties

I have tried a few different ways to list the db properties and have come up short.
SQL> SHOW DATABASE VERBOSE emp;
SP2-0158: unknown SHOW option "DATABASE"
SP2-0158: unknown SHOW option "VERBOSE"
SP2-0158: unknown SHOW option "emp"
Here's another one, and I don't understand why it's not working:
SQL> show database;
SP2-0158: unknown SHOW option "database"
SQL> DGMGRL
SP2-0042: unknown command "DGMGRL" - rest of line ignored.
Does anyone have ideas as to what I am missing?
There's a table called database_properties; you should query that:
select property_name, property_value, description from database_properties
If this isn't what you're looking for, you should be more specific
If you want the full version information for your DB then:
SELECT *
FROM v$version;
If you want your DB parameters then:
SELECT *
FROM v$parameter;
If you want more information about your DB instance then:
SELECT *
FROM v$database;
If you want the database properties then:
SELECT *
FROM database_properties;
If you want the "size" of your database then this will give you a close enough calculation:
SELECT SUM(bytes / (1024*1024)) "DB Size in MB"
FROM dba_data_files;
You will need DBA level permissions to see these views or you could request the data from your DBA and he will (probably) oblige.
Hope it helps...
SHOW DATABASE is not a valid SQL*Plus command.
The correct syntax is SHOW option where option is one of:
system_variable ALL BTI[TLE] ERR[ORS] [ { FUNCTION | PROCEDURE | PACKAGE |
PACKAGE BODY | TRIGGER | VIEW | TYPE | TYPE BODY | DIMENSION | JAVA CLASS }
[schema.]name] LNO PARAMETERS [parameter_name] PNO RECYC[LEBIN] [original_name]
REL[EASE] REPF[OOTER] REPH[EADER] SGA SPOO[L] SPPARAMETERS [parameter_name]
SQLCODE TTI[TLE] USER XQUERY

How to format Oracle SQL text-only select output

I am using Oracle SQL (in SQLDeveloper, so I don't have access to SQLPLUS commands such as COLUMN) to execute a query that looks something like this:
select assigner_staff_id as staff_id, active_flag, assign_date,
complete_date, mod_date
from work where assigner_staff_id = '2096';
The results it gives me look something like this:
STAFF_ID ACTIVE_FLAG ASSIGN_DATE COMPLETE_DATE MOD_DATE
---------------------- ----------- ------------------------- ------------------------- -------------------------
2096 F 25-SEP-08 27-SEP-08 27-SEP-08 02.27.30.642959000 PM
2096 F 25-SEP-08 25-SEP-08 25-SEP-08 01.41.02.517321000 AM
2 rows selected
This can very easily produce a very wide and unwieldy textual report when I'm trying to paste the results as a nicely formatted quick-n-dirty text block into an e-mail or problem report, etc. What's the best way to get rid of all the extra white space in the output columns when I'm using just plain-vanilla Oracle SQL? So far my web searches haven't turned up much, as all the results show how to do it using formatting commands like COLUMN in SQLPLUS (which I don't have).
In your statement, you can specify the type of output you're looking for:
select /*csv*/ col1, col2 from table;
select /*Delimited*/ col1, col2 from table;
there are other formats available such as xml, html, text, loader, etc.
You can change the formatting of these particular options under tools > preferences > Database > Utilities > Export
Be sure to choose Run Script rather than Run Statement.
* this is for Oracle SQL Developer v3.2
What are you using to get the results? The output you pasted looks like it's coming from SQL*PLUS. It may be that whatever tool you are using to generate the results has some method of modifying the output.
By default Oracle sizes each output column to the width of the title or the width of the column data, whichever is wider.
If you want to make columns smaller you will need to either rename them or convert them to text and use substr() to make the defaults smaller.
select substr(assigner_staff_id, 1, 8) as staff_id,
active_flag as Flag,
to_char(assign_date, 'DD/MM/YY'),
to_char(complete_date, 'DD/MM/YY'),
mod_date
from work where assigner_staff_id = '2096';
What you can do with sql is limited by your tool. SQL Plus has commands to format the columns but they are not real easy to use.
One quick approach is to paste the output into excel and format it there or just attach the spreadsheet. Some tools will save the output directly as a spreadsheet.
Nice question. I really had to think about it.
One thing you could do is change your SQL so that it only returns the narrowest usable columns.
e.g. (I'm not very hot on oracle syntax, but something similar should work):
select substring( convert(varchar(4), assigner_staff_id), 1, 4 ) as id,
active_flag as act, -- use shorter column name
-- etc.
from work where assigner_staff_id = '2096';
Does that make sense?
If you were doing this on unix/linux, I would suggest running it from the command line and piping it through an awk script.
If I've misunderstood, then please update your question and I'll have another go :)
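The post-processing idea can also be sketched in Python rather than awk. The sketch below assumes the pasted output has the column titles on the first line, a dashed separator on the second line, and data rows aligned under the dashes, with no trailing "N rows selected" line. The function name compact_report is made up for illustration.

```python
import re
from typing import List


def compact_report(text: str) -> str:
    """Re-render fixed-width query output with each column only as wide as needed.

    Column boundaries are taken from the dashed separator line, so values
    that contain spaces (such as timestamps) stay in one column.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header, separator, body = lines[0], lines[1], lines[2:]

    # Each run of dashes gives the start and end offset of one column.
    spans = [m.span() for m in re.finditer(r"-+", separator)]
    if not spans:
        raise ValueError("no dashed separator found on the second line")
    # Let the last column run to the end of the line.
    spans[-1] = (spans[-1][0], None)

    def cells(line: str) -> List[str]:
        return [line[start:end].strip() for start, end in spans]

    rows = [cells(header)] + [cells(line) for line in body]
    widths = [max(len(row[i]) for row in rows) for i in range(len(spans))]

    def render(row: List[str]) -> str:
        return "  ".join(c.ljust(w) for c, w in zip(row, widths)).rstrip()

    out = [render(rows[0]), "  ".join("-" * w for w in widths)]
    out += [render(row) for row in rows[1:]]
    return "\n".join(out)


if __name__ == "__main__":
    sample = "\n".join([
        "STAFF_ID".ljust(22) + " " + "ACTIVE_FLAG".ljust(11) + " " + "ASSIGN_DATE",
        "-" * 22 + " " + "-" * 11 + " " + "-" * 25,
        "2096".ljust(22) + " " + "F".ljust(11) + " " + "25-SEP-08",
    ])
    print(compact_report(sample))
```

Columns never get narrower than their titles, so shortening the aliases in the query still helps.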
If you don't have a lot of rows returned, I'll often use Tom Kyte's print_table function.
SQL> set serveroutput on
SQL> execute print_table('select * from all_objects where rownum < 3');
OWNER : SYS
OBJECT_NAME : /1005bd30_LnkdConstant
SUBOBJECT_NAME :
OBJECT_ID : 27574
DATA_OBJECT_ID :
OBJECT_TYPE : JAVA CLASS
CREATED : 22-may-2008 11:41:13
LAST_DDL_TIME : 22-may-2008 11:41:13
TIMESTAMP : 2008-05-22:11:41:13
STATUS : VALID
TEMPORARY : N
GENERATED : N
SECONDARY : N
-----------------
OWNER : SYS
OBJECT_NAME : /10076b23_OraCustomDatumClosur
SUBOBJECT_NAME :
OBJECT_ID : 22390
DATA_OBJECT_ID :
OBJECT_TYPE : JAVA CLASS
CREATED : 22-may-2008 11:38:34
LAST_DDL_TIME : 22-may-2008 11:38:34
TIMESTAMP : 2008-05-22:11:38:34
STATUS : VALID
TEMPORARY : N
GENERATED : N
SECONDARY : N
-----------------
PL/SQL procedure successfully completed.
SQL>
If it's lots of rows, I'll just do the query in SQL Developer and save as xls; businessy types love Excel for some reason.
Why not just use the "cast" function?
select
(cast(assigner_staff_id as VARCHAR2(4))) AS STAFF_ID,
(cast(active_flag as VARCHAR2(1))) AS A,
(cast(assign_date as VARCHAR2(10))) AS ASSIGN_DATE,
(cast(COMPLETE_date as VARCHAR2(10))) AS COMPLETE_DATE,
(cast(mod_date as VARCHAR2(10))) AS MOD_DATE
from work where assigner_staff_id = '2096';