What is the meaning of last_modified_timestamp in BigQuery - google-bigquery

I am running the query below, trying to understand the meaning of last_modified_timestamp. Is it when the data in the table was last updated (a DML operation), or when the table structure was last modified (a DDL operation)?
SELECT * FROM dataset_name.__TABLES__;

As per the official documentation at:
https://cloud.google.com/bigquery/docs/information-schema-tables
LAST_MODIFIED_TIME is "The time when the data was most recently written to the partition".

Related

What is the equivalent of Select into in Google bigquery

I am trying to write a SQL command that inserts data from one table into a new table without any INSERT statement in BigQuery, but I cannot find a way to do it (something similar to SELECT INTO).
Here is the table:
create table database1.table1
(
pdesc string,
num int64
);
And here is the insert statement. I also tried SELECT INTO, but it is not supported in BigQuery.
insert into database1.table1
select column1, count(column2) as num
from database1.table2
group by column1;
The above is a possible way to insert, but I am looking for a way that does not need any SELECT statement, something similar to a SELECT INTO statement.
I am thinking of declaring variables and then somehow feeding the data into the tables, but I do not know how.
I am not a Google employee; however, I understand the reasoning for not supporting the creation of a copy of a table (or query) from the console.
The challenge is that each table to be created must have a number of properties defined, such as the associated project and expiry time.
Looking (briefly) through the documentation, it is worth exploring the bq utility, specifically the cp command.
Explore the following operations :
cache the query results to a temporary table
get the name of said temporary table
pass that name to a copy table command, perhaps?
Other methods are described in the Google Cloud documentation: https://cloud.google.com/bigquery/docs/managing-tables#copy-table
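If a SELECT inside DDL is acceptable after all, the whole flow above can often be collapsed into a single CREATE TABLE ... AS SELECT statement. A sketch, reusing the names from the question:

```sql
-- CTAS: creates the table and populates it in one statement,
-- with no separate INSERT required.
CREATE TABLE database1.table1 AS
SELECT column1 AS pdesc, COUNT(column2) AS num
FROM database1.table2
GROUP BY column1;
```

This still contains a SELECT internally, but there is no standalone INSERT step.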

How can I schedule a script in BigQuery?

At last BigQuery supports using ; in queries, so I can write more than one query in one "block" if I separate them with semicolons.
If I run the code manually, it works, but I cannot schedule it.
When I want to schedule, I have two choices:
(New) Web UI: I must give a destination table. If I don't, I cannot save the scheduled query. But all my queries are UPDATEs and INSERTs with different "destination tables", like these:
UPDATE project.exampledataset.a
SET date = current_date()
WHERE true
;
INSERT INTO project.otherdataset.b
SELECT c,d
FROM project.otherdataset.c
So I cannot even set up scheduling in the Web UI.
Classic UI: I tried this because the official documentation states that I should leave the "destination table" blank, and the Classic UI allows it. I can set up the scheduling, but it doesn't run when it should. I get this error message by email: "Error status: Dataset specified in the query ('') is not consistent with Destination dataset 'exampledataset'."
AFAIK scripting (and using semicolons) is a very new feature in BigQuery, but I hope someone can help me.
Yes, I know that I could schedule every query one by one, but I would like to solve this with one big script.
It looks like the scheduled query was defined earlier with a destination dataset and an APPEND/TRUNCATE write disposition. When updating the same scheduled query to a DML query, the GUI doesn't reset the dataset and table name fields to NULL, so this error arises from the previously set dataset and table name in the scheduled query.
The fix is to delete the scheduled query and create it from scratch with the DML query option. It worked for me.
Scripting is now supported in scheduled queries. However, a scripted query, when scheduled, doesn't currently support setting a destination table; you still need to use DDL/DML to change an existing table.
E.g.:
CREATE OR REPLACE TABLE destinationTable AS
SELECT *
FROM sourceTable
WHERE date >= maxDate
As of 2022, the BQ Console UI will let you create a new scheduled query without a destination dataset, but it won't let you update a prior SELECT to use DDL/DML block syntax. However, you can use the BigQuery Data Transfer API to update the destinationDatasetId field, via transferconfigs/patch. Use transferconfigs/list to get the configId for a given scheduled query.
Note that you can either use the in-browser API Explorer, if you have the appropriate credentials, or write a programmatic solution. This also seems useful for setting or updating any other fields, including renaming scheduled queries.
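As a rough sketch of the transferconfigs/patch call mentioned above (PROJECT_ID, LOCATION, and CONFIG_ID are placeholders; the exact field names are worth checking against the Data Transfer API reference):

```
PATCH https://bigquerydatatransfer.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/transferConfigs/CONFIG_ID?updateMask=destination_dataset_id

{
  "destinationDatasetId": "exampledataset"
}
```

The CONFIG_ID comes from the transferconfigs/list response for the scheduled query you want to change.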

BigQuery data using SQL "INSERT INTO" is gone after some time

Today I noticed another strange behaviour in BigQuery.
I ran a standard SQL query with a UDF in the BQ web UI:
CREATE TEMPORARY FUNCTION ...
INSERT INTO projectid.dataset.inserttable...
All seemed good: the results of the UDF SQL were inserted into the table correctly, as I could tell from "Number of rows". But the table size was not correct; it still showed the size from before the insert query ran. Furthermore, I found that all the inserted rows were gone an hour later.
Some more info: when I run a "DELETE FROM inserttable WHERE true" or a "SELECT ...", the deleted number of rows and the table size seem consistent with the inserted data. I just cannot preview the insert table correctly in the web UI.
So I am guessing the "Details" or "Preview" info of the table has a time delay? Do you have any idea about this behaviour?
The preview may have a delay, so SELECT * FROM YourTable; will give the most up-to-date results, or you can use COUNT(*) just to verify that the number of rows is correct. You can think of it as being similar to streaming, if you have tried that, where some rows may be in the streaming buffer for a while before they make it into regular storage.
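For example, a quick consistency check on the table from the question, bypassing the possibly stale preview:

```sql
-- COUNT(*) reads the table directly, so it is not subject
-- to the preview's delay.
SELECT COUNT(*) AS row_count
FROM projectid.dataset.inserttable;
```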

"Invalid snapshot time" error without table decorator usage

We got the error {"message":"Invalid snapshot time 1472342794519, unable to read before 1472342794785","reason":"invalid"}. Other Q&As describe such an error happening when table decorator parameters are invalid; however, our query does not use table decorators.
The query uses TABLE_DATE_RANGE, but its arguments are day-aligned timestamps, so the lower digits must be 0s, unlike the values in the error above.
Retrying the same query succeeded.
I can provide the job ID, but it includes internal information about our company, so I apologize that I cannot write it directly here.
The tables that the TABLE_DATE_RANGE wildcard evaluates to are resolved as of the time of the start of the query. Looking at the timestamps, it looks like the table was deleted right after the job started execution. This causes the table resolution to throw that error.

Oracle SQL table history

So this is the problem: I have an Oracle table, just a normal table with, let's say, 300 rows of data.
I need to figure out what that data looked like at least 100 days ago; knowing who made the changes, with timestamps, would be nice as well.
The thing is, I do have scheduled backups, but I believe someone changed the data and removed the backup, so I am trying to find this change through system records.
I googled this, but the results I got so far did not give me what I needed.
select * from dba_hist_sqltext - it gives me results for current session only
SELECT SCN_TO_TIMESTAMP(ORA_ROWSCN) - specified number is not a valid system change number
Oracle SQL info -
Java(TM) Platform 1.7.0_51
Oracle IDE 4.0.0.13.80
Versioning Support 4.0.0.13.80
Do you guys have any other idea?
Thank you in advance
You may use flashback: Using Oracle Flashback Technology
With that, you can query using the AS OF TIMESTAMP clause, like this:
SELECT * FROM yourtable
AS OF TIMESTAMP TO_TIMESTAMP('2004-04-04 09:30:00', 'YYYY-MM-DD HH:MI:SS');
This has to be enabled, though, and there is a certain (configurable) retention limit, so there is no guarantee that you can still query the records from 100 days ago. On a busy database the history may go back only a couple of minutes.
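To get a feel for how far back flashback queries can reach on a given database, the UNDO_RETENTION parameter (in seconds) is the relevant knob. A sketch, assuming you have privileges to read v$parameter:

```sql
-- UNDO_RETENTION is the low threshold of undo retention, in seconds.
-- 100 days would require roughly 8,640,000 seconds of retained undo,
-- which is far beyond typical settings.
SELECT value AS undo_retention_seconds
FROM v$parameter
WHERE name = 'undo_retention';
```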