BigQuery create Table differences between standard and legacy sql - google-bigquery

I have a few questions around the create table syntax in standard and legacy sql
The new BigQueryUI doesn't show standard sql types and shows only legacy types. I understand they are mapped one to one with the legacy types but the examples in creating partitioned tables shows options which are not available in the UI
If I create a table using the JSON field schema can I still use the standard sql?
The BigQueryUI shows only partitioning the table by Ingestion time, but I want to create a table with date column and I don't see an option for it. If I have to create the DDL manually, I did not see the examples on how to use JSON field schema to construct a create table statement.

The new BigQueryUI doesn't show standard sql types
BigQuery standardSQL and LegacySQL are 2 options to write SQL syntax (See this link for more detail) and have nothing to do with the column Types in BigQuery, Details on table types can be found in this link, I also find this Link helpful
If I create a table using the JSON field schema can I still use the standard sql?
To create a table using JSON you need to run bq command line, If you need help how to write this syntax let us know
but I want to create a table with date column and I don't see an option for it
You can use this standardSQL syntax to do this:
#standardSQL
CREATE OR REPLACE TABLE `project.dataset.tableId`
PARTITION BY myDate
CLUSTER BY cluster_col AS
SELECT * from sourceTable
Note: myDate column is a column in the source table

Related

What is the equivalent of Select into in Google bigquery

I am trying to write a SQL Command to insert some data from one table to a new table without any insert statement in bigquery but I cannot find a way to do it. (something similar to select into)
Here is the table:
create table database1.table1
(
pdesc string,
num int64
);
And here is the insert statement. I also tried the select into but it is not supported in bigquery.
insert into database1.table1
select column1, count(column2) as num
from database1.table2
group by column1;
Above is a possible way to insert. but I am looking for a way that I do not need to use any select statement. I am looking for something similar to 'select into' statement.
I am thinking of declaring variables and then somehow feed the data into the tables but do not know how.
I am not a Google employee. However - I understand the reasoning for not supporting creating a copy of a table (or query) from the console.
The challenge is that each table needs to be created must have a number of features defined such as associated project and expiry time.
Looking through the documentation (briefly) - it is worth exploring using bq utility - specifically the cp command -
Explore the following operations :
cache the query results to a temporary table
get the name of said temporary table
pass to a copy table command perhaps?
Other methods are described in the google cloud doco https://cloud.google.com/bigquery/docs/managing-tables#copy-table

Get query used to create a table

We use snowflake at work to store data, and for one of the tables, I dont have the SQL query used to create the table. Is there a way to see the query used to make that table?
I tried using the following
get_ddl('table', 'db.table', true)
but this gives me an output like-
This doesnt give me any information about the sql query that was used. How do I get that in snowflake?
If get_ddl() is not enough you may use INFORMATION_SCHEMA.
To get more information you have 2 options:
Use the QUERY_HISTORY() table functions: https://docs.snowflake.com/en/sql-reference/functions/query_history.html
Use the QUERY_HISTORY() view: https://docs.snowflake.com/en/sql-reference/account-usage/query_history.html
If you use the funtions/view above and filter all the records by QUERY_TEXT, maybe you get more information about the exact SQL that was used to create your table.

Adding today date in Table name when using Create Table function in standard sql GBQ

I am quite new to GBQ and any help is appreciated it.
I have a query below:
#Standard SQL
create or replace table `xxx.xxx.applications`
as select * from `yyy.yyy.applications`
What I need to do is to add today's date at the end of the table name so it is something like xxx.xxx.applications_<todays date>
basically create a filename with Application but add date at the end of the name applications.
I am writing a procedure to create a table every time it runs but need to add the date for audit purposes every time I create the table (as a backup).
I searched everywhere and can't get the exact answer, is this possible in Query Editor as I need to store this as a Proc.
Thanks in advance
BigQuery doesn't support dynamic SQL at the moment which means that this kind of construction is not possible.
Currently BigQuery supports Parameterized Queries but its not possible to use parameters to dynamically change the source table's name as you can see in the provided link.
BigQuery supports query parameters to help prevent SQL injection when
queries are constructed using user input. This feature is only
available with standard SQL syntax. Query parameters can be used as
substitutes for arbitrary expressions. Parameters cannot be used as
substitutes for identifiers, column names, table names, or other parts
of the query.
If you need to build a query based on some variable's value, I suggest that you use some script in SHELL, Python or any other programming language to create the SQL statement and then execute it using the bq command.
Another approach could be using the BigQuery client library in some of the supported languages instead of the bq command.

How do I generate a table name that contains today's date?

It may seem a little strange, but there are already tables with names for each date.
In my project, I have tables for each date to make statistics easier to handle.
Of course, I don't think this is always the best way, but this is the table structure for my project.
(It's a common technique in Google BigQuery and Amazon Athena. This question is about Google BigQuery)
So to get the data, I want to generate today's date. If I use TODAY, I can get the data of the latest day without rewriting the code even if it is the next day.
I tried, but the code didn't work.
Not work 1:
CONCAT in FROM
SELECT
*
FROM
CONCAT('foo_', FORMAT_TIMESTAMP('%Y%m%d', CURRENT_TIMESTAMP(), 'Asia/Tokyo'))
Error:
Table-valued function not found: CONCAT at [4:3]
Not work 2:
create temporary function:
create temporary function getTableName() as (CONCAT('foo_', FORMAT_TIMESTAMP('%Y%m%d', CURRENT_TIMESTAMP(), 'Asia/Tokyo')));
Error:
CREATE TEMPORARY FUNCTION statements must be followed by an actual query.
Question
How do I generate a table name that contains TODAY's date?
In this case, I would recommend you to use Wild tables in BigQuery, which allows you to use some features in Standard SQL.
With Wild Tables you can use _TABLE_SUFFIX, it grants you the ability to filter/scan tables containing this parameter. The syntax would be as follows:
SELECT *
FROM `test-proj-261014.sample.test_*`
where _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', CURRENT_DATE)
I hope it helps.
Your first query should go like this:
select CONCAT('foo_', FORMAT_TIMESTAMP('%Y%m%d', CURRENT_TIMESTAMP(), 'Asia/Tokyo'))
For creating temporary function, use the below code:
create temp function getTableName() as
((select CONCAT('foo_', FORMAT_TIMESTAMP('%Y%m%d', CURRENT_TIMESTAMP(), 'Asia/Tokyo'))
));
select getTableName()
The error "CREATE TEMPORARY FUNCTION statements must be followed by an actual query." is because once the temporary functions are defined then you have to use the actual query to use that function and then the validity of function dies out. To define persistent UDFs and use them in multiple queries please go through the link to define permanent functions.You can reuse persistent UDFs across multiple queries, whereas you can only use temporary UDFs in a single query.

retrieve data type,size,precision and scale of postgresql columns

How to retreive data type, size, precision and scale of all columns of a particular table in postgresql with a single sql statement?
The table can have columns of the data types ( 'int2','int4','int8','float4','float8','numeric','varchar','char','date','timestamp'). If precision/scale are not applicable for a particular data type we can leave it as null.
I should not use INFORMATION_SCHEMA for this because though this schema is built-in we can drop this schema. SO i wrote code using this schema and if some how the customer drops this schema my code breaks. I just want to use pg_catalog tables/views.
The tables in information_schema are views on the system tables. If you use this command in psql, you will get the view definition that generates the columns view:
\dS+ information_schema.columns
You'll still have some work to do as the type casts to user-friendly outputs are also based on information_schema, but that shouldn't be too hard.
EDIT: In versions prior to 8.4, you have to use \d+, not \dS+