Impala SQL in Jupyter Notebooks - sql

Can anybody help me with the correct syntax to run multi line Impala - SQL queries in Jupyter notebooks. I have been using the approach below to run queries on a single line but have not been able to work out how to run multi line queries with indentation.
! impala-shell -q 'SELECT COUNT(*) FROM (SELECT DISTINCT eyesight_evaluation.patient_id FROM eyesight_evaluation WHERE severity_of_sight_loss IN ("Slight","Mild","Moderate","Moderately Severe","Severe","Profound")) AS TotalDiagnosed;'
Thanks!

You have to use 3 single quotes for multi-line comments
'''SELECT COUNT(*) FROM (SELECT DISTINCT eyesight_evaluation.patient_id FROM
eyesight_evaluation WHERE severity_of_sight_loss IN
("Slight","Mild","Moderate","Moderately Severe","Severe","Profound")) AS TotalDiagnosed;'''

Related

BigQuery SELECT statement with WHERE clause fail only on BQ CLI

I am running simple select query in BigQuery GCP Console and it works perfectly fine. But, when I run the same query using BQ CLI, it fails.
When I run the same query without the "WHERE" clause, it works.
SELECT field_path FROM `GCP_PROJECT_ID.MY_DATASET.INFORMATION_SCHEMA.COLUMN_FIELD_PATH`
WHERE table_name="MY_TABLE_NAME"
Below is the error message
Error in query string: Error processing job 'GCP_project_ID:jobidxxxxx': Unrecognized name:
MY_DATASET at [1:1xx]
I have tried the following "WHERE" clauses as well. None of these work as well.
... WHERE table_name IN ("MY_TABLE_NAME")
... WHERE table_name like "%MY_TABLE_NAME%"
I have reproduced your query on my own datasets using the Command Line tool and it worked fine. This is the command I ran:
bq query --use_legacy_sql=false 'SELECT field_path FROM `<projectName>.<DatasetName>.INFORMATION_SCHEMA.COLUMN_FIELD_PATHS`where table_name="<TableName>"'

SQL where clause parameterization

I am trying to parameterize certain where clauses to standardized my Postgres SQL scripts for DB monitoring. But have not found a solution that will allow me to have the following script run successfully
variablename = "2021-04-08 00:00:00"
select * from table1
where log_date > variablename;
select * from table2
where log_date > variablename;
Ideally, I would be able to run each script separately, but being able to find/replace the variable line would go a long way for productivity.
Edit: I am currently using DBeaver to run my scripts
To do this in DBeaver I figured out you can use Dynamic Parameter Bindings feature.
https://github.com/dbeaver/dbeaver/wiki/SQL-Execution#dynamic-parameter-bindings
Here is a quick visual demo of how to use it:
https://twitter.com/dbeaver_news/status/1085222860841512960?lang=en
select * from table1
where log_date > :variablename;
When executing the query DBeaver will prompt you for the value desired and remember it when running another query.

Pasting multi-line queries into BigQuery SQL shell

I'm running the BigQuery command line shell and I'm not able to successfully run multi-line queries (aka queries with line breaks) because whenever I paste the query into the shell each line gets run individually instead of the whole thing together.
For example,
select * from table
works fine because it is in one line but if I try to run
select
*
from
table
it does not work because each line gets run separately.
Is there any way to get this to work?
The query command creates a query job that runs the supplied SQL query. in the documentation Using the bq command-line tool you can find some examples like:
bq query --nouse_legacy_sql \
'SELECT
COUNT(*)
FROM
`bigquery-public-data`.samples.shakespeare'
Honestly I too couldn't find decent help on this, so I created a bit of a workaround using the CONCAT function. Notice that I have a space before the FROM clause. Hope his helps.
DECLARE table_name STRING DEFAULT '20220214';
DECLARE query STRING;
SET query = CONCAT("""select user_pseudo_id, user_id""",
""" FROM `app.analytics.events_""" || table_name || """` where user_id IS NOT NULL group by user_pseudo_id, user_id""");
EXECUTE IMMEDIATE query;

Standard SQL in bq command line

I'm trying to run queries in standard mode via command line. I'm using recently version of this bigquery-2.0.17.
firstly I tried to use bq query command:
bq query --use_legacy_sql=False "SELECT string_col FROM `source_name`"
And get exception
FATAL Flags parsing error: Unknown command line flag 'use_legacy_sql'
After, I've run in shell mode
project_name> query --use_legacy_sql=False "SELECT string_col FROM `source_name`"
And get:
float() argument must be a string or a number
Could you advise me, how can I run query in command line.
Thanks
The commands you are running are all correct. The only thing I do differently is I read the queries from a file so I don't have problems with my bash trying to interpret the ` sign, like so:
cat query.sql | bq query --use_legacy_sql=False
(You won't have this issue in the bq shell interpreter).
Supposing your queries are all correct and following the Standard syntax (you can also run a simple query like "select [1, 2]" to see if it works, if it does, then maybe your source_name has some issue going on), the only thing that comes to mind is maybe if you try to reinstall gcloud and even update it (currently we are at bq version 2.0.24), as this seems to be more related to environment rather than command syntax.
Need to use gcloud installation instead of alone bq.
Thanks

From unix to Sql

I am making a shell script where I am reading inputs from one file. File contains data
123
1234
121
I am reading the inputs from this file using while read line do condition and putting all inputs in SQL statements.Now in my shell script i am going on SQL Prompt and running some queries. In one condition, I am using EXECUTE IMMEDIATE STATEMENT in SQL.
as
EXECUTE IMMEDIATE 'CREATE TABLE BKP_ACDAGENT4 as SELECT * FROM BKP_ACDAGENT WHERE DATASOURCEAGENTID IN ('123','1234','121')';
I want this to be execute, but somehow its not working.
Can anyone help me in executing it?
You need to escape the single quotes which you have used for the predicates in your IN list, that is the single quotes in
WHERE DATASOURCEAGENTID IN ('123','1234','121')';
are causing the issue here. You need to escape the single quotes using two single quotes
EXECUTE IMMEDIATE 'CREATE TABLE BKP_ACDAGENT4 as SELECT * FROM BKP_ACDAGENT WHERE DATASOURCEAGENTID IN (''123'',''1234'',''121'')';
The above will work on all Oracle version.
If you're one Oracle 10g or above, you can use q keyword
EXECUTE IMMEDIATE q'[CREATE TABLE BKP_ACDAGENT4 as SELECT * FROM BKP_ACDAGENT WHERE DATASOURCEAGENTID IN ('123','1234','121')]';