capture data from bigquery table to another bigquery table every 15 minutes - google-bigquery

I need to copy data one big query table to anther big query table which has same structure
The data should be copied every 5 minute.
How can I do that?
Regards

Use Scheduling queries
You can schedule queries to run on a recurring basis. Scheduled
queries must be written in standard SQL, which can include data
definition language (DDL) and data manipulation language (DML)
statements. You can organize query results by date and time by
parameterizing the query string and destination table.

Related

Run a Big Query Schedule everytime a table is updated

So I have this schedule that gets data from some tables and aggregate then into a single table, that I use as a source in a data studio dash, one of these tables (table 1), is updated daily, sometimes more than once, I wanted to know if there is a way to automatically run the schedule every time this table (table1) is updated.
Unfortunately, BigQuery scheduled queries need the parameters of run_time and run_date, so if you want this to be only with BigQuery, you can schedule multiple queries at different times.
Additionally, using a trigger in BigQuery is not possible because this is unsupported.
I recommend you to use a Cloud Run Action that gets the event after any Insert and this creates an event Trigger, so this could do what you are looking for, but without any scheduled queries.

SSIS Incremental Load-15 mins

I have 2 tables. The source table being from a linked server and destination table being from the other server.
I want my data load to happen in the following manner:
Everyday at night I have scheduled a job to do a full dump i.e. truncate the table and load all the data from the source to the destination.
Every 15 minutes to do incremental load as data gets ingested into the source on second basis. I need to replicate the same on the destination too.
For incremental load as of now I have created scripts which are stored in a stored procedure but for future purposes we would like to implement SSIS for this case.
The scripts run in the below manner:
I have an Inserted_Date column, on the basis of this column I take the max of that column and delete all the rows that are greater than or equal to the Max(Inserted_Date) and insert all the similar values from the source to the destination. This job runs evert 15 minutes.
How to implement similar scenario in SSIS?
I have worked on SSIS using the lookup and conditional split using ID columns, but these tables I am working with have a lot of rows so lookup takes up a lot of the time and this is not the right solution to be implemented for my scenario.
Is there any way I can get Max(Inserted_Date) logic into SSIS solution too. My end goal is to remove the approach using scripts and replicate the same approach using SSIS.
Here is the general Control Flow:
There's plenty to go on here, but you may need to learn how to set variables from an Execute SQL and so on.

SSIS Alternatives to one-by-one update from RecordSet

I'm looking for a way to speed up the following process: I have a SSIS package that loads data from Excel files on a weekly basis to SQL Server. There are 3 fields: Brand, Date, Value.
In the dataflow, I check for existing combinations of Brand+Date, and new combinations go to the table directly, the existing ones go to a RecordSet destination for updates:
The next step is to update the Value of the existing combinations:
As you can see, there are thousands of records to update, and it takes too long. The number of records tend to grow week by week. Please suggest.
The fastest way will be do this inside a Stored procedure using ELT (Extract Load Transform) approach.
Push all data from excel as is into a table(called load to a staging table in theory). Since you do not seem to be concerned with data validation steps, this table can be a replica of final destination table columns.
Next step is to call a stored procedure using Execute SQL task. Inside this procedure you can put all your business logic. Since this steps with native data manipulation on SQL server entities, it is the fastest alternative.
As a last part, please delete all entries from the staging table.
You can use indexes on staging table to make the SP part even faster.

Insert bigquery query result to mysql

In one of my PHP application, I need to show a report based on the aggregate data, which is fetched from BigQuery. I am planning to execute the queries using a PHP cron job then insert data to MySQL table from which the report will fetch data. Is there any better way of doing this like directly insert the data to MySQL without an application layer in between ?
Also I am interested in real time data, but the daily cron only update data once and there will be some mismatch of the counts with actual data if I check it after some time. If I run hourly cron jobs, I am afraid the data reading charges will be high as I am processing a dataset which is 20GB. Also my report cannot be fetched fro Bigquery itself and it needs to have data from MySQL database.

How to transfer data between databases with SELECT statement in Teradata

So I am stuck on this Teradata problem and I am looking to the community for advice as I am new to the TD platform. I am currently working with a Teradata Data Warehouse and have an interesting task to solve. Currently we store our information in a live production database but want to stage tables in another database before using FastExport to export the files. Basically we want to move our tables into a database to take a quick snapshot.
I have been exploring different solutions and am unsure how to proceed. I need to be able to automate a create table process from one DB in Teradata to another. The tricky part is I would like to create many tables off of the source table using a WHERE clause. For example, I have a transaction table and want to take a snapshot of the transaction table for a certain date range month by month. Meaning that the original table Transaction would be split into many tables such as Transaction_May2001, Transaction_June2001, Transaction_July2001 and so on and so forth.
Thanks
This is assuming by two databases you are referring to the same physical installation of Teradata.
You can use the CREATE TABLE AS construct to accomplish this:
CREATE TABLE {MyDB}.Transaction_May2001
AS (
SELECT *
FROM Transaction
WHERE Transaction_Date BETWEEN DATE '2001-05-01' AND '2001-05-31'
)
{UNIQUE} PRIMARY INDEX ({Same PI definition as Transaction Table})
WITH DATA AND STATS;
If you neglect to specify the explicit PI in the CREATE TABLE AS then Teradata will take the first column of the SELECT clause and use it as the PI of the new table.
Otherwise, you would be looking to use a Teradata utility as suggested by ryanbwork in the comment to your question.