At work I came across a set of tables which all have the same schema. Until now, no one has added descriptions to the columns in these tables, and I want to add my own descriptions to them. My question is: how do I add my schema's descriptions to all the tables (there are hundreds of them) without going into every table and changing it manually?
Thanks!
By using the bq command-line tool, you can prepare a quick script on Linux.
First, list all the tables with the "bq ls" command (and, if needed, fetch each table's current schema with "bq show"); then, in a loop, apply the needed descriptions to each table with the "bq update" command.
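For illustration, here is a rough, untested sketch of such a loop. It assumes every table shares one schema and that a file named schema_with_descriptions.json (a made-up name) already contains that schema with the descriptions filled in:
#!/bin/bash
# Rough sketch, untested: apply one schema file (with column descriptions)
# to every table in a dataset. Dataset and file names are placeholders.
DATASET="mydataset"
SCHEMA_FILE="schema_with_descriptions.json"
# "bq ls" prints a two-line header before the table list, hence "tail -n +3"
for TABLE in $(bq ls --max_results=10000 "$DATASET" | tail -n +3 | awk '{print $1}'); do
  echo "Updating schema of ${DATASET}.${TABLE}"
  bq update "${DATASET}.${TABLE}" "$SCHEMA_FILE"
done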
Just for further help, here is a link explaining how to write shell scripts:
http://www.wikihow.com/Write-a-Shell-Script-Using-Bash-Shell-in-Ubuntu
Related
In our BQ export schema, we have one table for each day as per the screenshot below.
I want to copy the tables before a certain date (2021-feb-07). I know how to copy one day at a time via the UI, but is there no way to write code in the Cloud Console to copy the selected date range all at once? Or maybe an SQL command directly from a query window?
I think you should transform your sharded tables into a partitioned table, so you can handle them with just a single query. As mentioned in the official documentation, partitioned tables also perform better.
To make the conversion, you can just execute the following command in the console.
bq partition \
--time_partitioning_type=DAY \
--time_partitioning_expiration 259200 \
mydataset.sourcetable_ \
mydataset.mytable_partitioned
This will convert your sharded tables sourcetable_(xxx) into a single partitioned table, mytable_partitioned, which can be queried across your entire set of data with just a single query.
SELECT
*
FROM
`myprojectid.mydataset.mytable_partitioned`
WHERE
_PARTITIONTIME BETWEEN TIMESTAMP('2022-01-01') AND TIMESTAMP('2022-01-03')
For more details about the conversion command you can check this link. I also recommend checking the links about querying partitioned tables and about partitioned tables in general.
I need to list all tables in my BigQuery dataset, but I don't know how to do it; I searched but didn't find anything about it.
I need to know whether a table exists: if it exists, I search for the record; if not, I create the table and insert the record.
Depending on where/how you want to do this, you can use the CLI, API calls, or client libraries. Here you have all the info about listing tables.
As an example, if you want to list them using the command-line interface, you can do it like this:
bq ls <project>:<dataset>
If you want to use normal SQL queries, you can use the INFORMATION_SCHEMA beta feature:
SELECT table_name from `<project>.<dataset>.INFORMATION_SCHEMA.TABLES`
(project is optional)
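For completeness, here is a rough, untested sketch of the exists-check from the question, wrapping that same INFORMATION_SCHEMA query in a script (the project, dataset and the table name my_table are placeholders):
#!/bin/bash
# Rough sketch, untested: decide whether the table needs to be created first.
COUNT=$(bq query --use_legacy_sql=false --format=csv \
  "SELECT COUNT(*) FROM \`myproject.mydataset.INFORMATION_SCHEMA.TABLES\`
   WHERE table_name = 'my_table'" | tail -n 1)
if [ "$COUNT" -eq 0 ]; then
  echo "Table is missing: create it and insert the record"
else
  echo "Table exists: search for the record"
fi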
I'm using Hive to aggregate stats, and I want to do a breakdown by the industry our customers fall under. Ideally, I'd like to write the stats for each industry to a separate output file per industry (e.g. industry1_stats, industry2_stats, etc.). I have a list of various industries our customers are in, but that list isn't pre-set.
So far, everything I've seen from Hive documentation indicates that I need to know what tables I'd want beforehand and hard-code those into my Hive script. Is there a way to do this dynamically, either in the Hive script itself (preferable) or through some external code before kicking off the Hive script?
I would suggest going for a shell script.
Get the list of industry names:
hive -e 'select distinct industry_name from [dbname].[table_name];' > list
Iterate over every line, passing each line (an industry name) of list as an argument to the while loop:
tail -n +1 list | while IFS=' ' read -r industry_name
do
hive -hiveconf MY_VAR="$industry_name" -f my_script.hql
done
Save the shell script as test.sh, and in my_script.hql put:
use uvtest;
create table ${hiveconf:MY_VAR} (id INT, name CHAR(10));
You'll have to place both test.sh and my_script.hql in the same folder.
The command below should then create all the tables from the list of industry names:
sh test.sh
Follow this link for using hive in shell scripts:
https://www.mapr.com/blog/quick-tips-using-hive-shell-inside-scripts
I wound up achieving this using Hive's dynamic partitioning (each partition writes to a separate directory on disk, so I can just iterate over those output files). The official Hive documentation on partitioning and this blog post were particularly helpful for me.
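For anyone after a concrete starting point, here is a rough, untested sketch of that dynamic-partitioning approach (the table and column names industry_stats, customer_stats, stat_a, stat_b and industry are placeholders):
# Rough sketch, untested: let Hive create one partition (one output
# directory) per distinct industry value. All table/column names are placeholders.
hive -e "
  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;
  -- the dynamic partition column must come last in the SELECT list
  INSERT OVERWRITE TABLE industry_stats PARTITION (industry)
  SELECT stat_a, stat_b, industry
  FROM customer_stats;
"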
How do I copy data from multiple tables within one database to another database residing on a different server?
Is this possible through a BTEQ Script in Teradata?
If so, provide a sample.
If not, are there other options to do this other than using a flat-file?
This is not possible using BTEQ alone, since you have mentioned that the two databases reside on different servers.
There are two solutions for this.
Arcmain - You need to run an Arcmain backup first, which creates files containing the data from your tables. Then you run an Arcmain restore, which restores the data from those files.
TPT - Teradata Parallel Transporter. This is a very advanced tool which, unlike Arcmain, does not create any intermediate files; it moves the data directly between the two Teradata servers. (Wikipedia)
If I am understanding your question, you want to move a set of tables from one DB to another.
You can use the following syntax in a BTEQ Script to copy the tables and data:
CREATE TABLE <NewDB>.<NewTable> AS <OldDB>.<OldTable> WITH DATA AND STATS;
Or just the table structures:
CREATE TABLE <NewDB>.<NewTable> AS <OldDB>.<OldTable> WITH NO DATA AND NO STATS;
If you get really savvy, you can create a BTEQ script that dynamically builds the above statement in a SELECT statement, exports the results, and then runs the newly exported file, all within a single BTEQ script.
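For what it's worth, here is a rough, untested sketch of that generate-and-run idea, wrapped in a shell call to BTEQ (the logon details and the <OldDB>/<NewDB> placeholders need to be filled in; DBC.TablesV is the standard dictionary view):
# Rough sketch, untested: build CREATE TABLE ... AS ... statements from the
# data dictionary, export them to a file, then run that file, all in one
# BTEQ session. Logon details and database names are placeholders.
# WIDTH is widened so the generated statements are not wrapped mid-line.
bteq <<'EOF'
.LOGON tdpid/myuser,mypassword;
.SET WIDTH 254;
.EXPORT REPORT FILE = copy_tables.btq;
SELECT 'CREATE TABLE <NewDB>.' || TRIM(TableName) ||
       ' AS <OldDB>.' || TRIM(TableName) ||
       ' WITH DATA AND STATS;' (TITLE '')
FROM DBC.TablesV
WHERE DatabaseName = '<OldDB>'
  AND TableKind = 'T';
.EXPORT RESET;
.RUN FILE = copy_tables.btq;
.LOGOFF;
.QUIT;
EOF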
There are a bunch of other options you can use with CREATE TABLE <...> AS <...>; you would be best served reviewing the Teradata manuals for more details.
There are a few more options which will allow you to copy from one table to another.
Possibly the simplest way would be to write a smallish program which uses one of the communication layers (ODBC, .NET Data Provider, JDBC, CLI, etc.) to run a SELECT statement against the source and the corresponding INSERT statements against the target. This would require some work, but it would have less overhead than trying to learn how to write TPT scripts, and you would not need any 'DBA' permissions to write your own.
Teradata also sells other applications which hide the complexity of some of these tools. Teradata Data Mover provides an abstraction layer over tools like Arcmain and TPT. Access to this tool is most likely restricted to DBA types.
If you want to move data from one server to another, you can do this with a flat file.
First, fetch the data from the source table into a flat file through a utility such as BTEQ or FastExport.
Then load this data into the target table with the help of MultiLoad, FastLoad, or BTEQ scripts.
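As an illustration, here is a rough, untested sketch of that flat-file route using plain BTEQ (server names, credentials, and the table/column definitions are all placeholders; FastExport/FastLoad/MultiLoad follow the same export-then-load pattern with their own script syntax):
# 1. Export the source table to a flat file from the source server
bteq <<'EOF'
.LOGON source_tdpid/myuser,mypassword;
.EXPORT DATA FILE = table1.dat;
SELECT * FROM sourcedb.table1;
.EXPORT RESET;
.LOGOFF;
.QUIT;
EOF
# 2. Load the flat file into the target table on the target server
bteq <<'EOF'
.LOGON target_tdpid/myuser,mypassword;
.IMPORT DATA FILE = table1.dat;
.QUIET ON;
.REPEAT *
USING (c1 INTEGER, c2 VARCHAR(10))
INSERT INTO targetdb.table1 (col1, col2) VALUES (:c1, :c2);
.LOGOFF;
.QUIT;
EOF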
I need to know whether there is any option to skip a particular column and back up the remaining table using the mysqldump command.
If yes, please let me know.
I wanted to move a table from one host to another but only include some of the columns and replace others with dummy data (such as password columns). So I made a shell script that makes it possible to run a SELECT query and get INSERT statements as the result.
You find the script here: https://gist.github.com/1239299
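This is not the linked gist itself, but the core idea can be sketched with a one-off query (untested; the database, table and column names are placeholders, and QUOTE() handles the escaping):
# Rough sketch, untested: dump only the chosen columns as INSERT statements,
# replacing the skipped password column with a dummy value.
mysql --batch --skip-column-names -e \
  "SELECT CONCAT('INSERT INTO users (id, name, password) VALUES (',
                 id, ', ', QUOTE(name), ', ', QUOTE('dummy'), ');')
   FROM mydb.users;" > users_partial.sql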