I have tables of stocks in my BigQuery dataset. The name of each table is simply the ticker of the stock: AAPL, AMZN, MSFT, etc.
"{dataset_name}.AAPL"
"{dataset_name}.AMZN"
"{dataset_name}.MSFT"
I am working in Google Sheets, trying to create a query that dynamically SELECTs from the tables one by one, based on the stock names.
So I'm trying to use a parameter for the table name, and I tried the wildcard table feature from BigQuery.
SELECT * FROM `dataset_name.*` WHERE _TABLE_SUFFIX = 'AAPL'
The result is "Query valid. Will process 0 bytes." and it returns nothing.
I then tried modifying the query to look like this:
SELECT * FROM `dataset_name.A*` WHERE _TABLE_SUFFIX = 'APL'
And it works...
The problem is, I cannot wildcard the whole table name. I can only use it for a suffix, with at least one letter at the front of the name. However, in my case the whole table name is dynamic. How can I write a query where the whole table name is dynamic?
Any help would be appreciated.
This is an expected behavior in BigQuery, as per this GCP Documentation:
In order to execute a standard SQL query that uses a wildcard table, BigQuery automatically infers the schema for that table. BigQuery uses the schema for the most recently created table that matches the wildcard as the schema for the wildcard table.
It is possible in your scenario that you have tables in your dataset with different schemas, and BigQuery references the most recently created table.
One way to dynamically SELECT without using _TABLE_SUFFIX in BigQuery is to use EXECUTE IMMEDIATE and put the table name into a variable, as sketched below. However, this seems to be a limitation in Google Sheets. To request this feature in Google Sheets, you may file a Feature Request.
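For illustration, a minimal sketch of that approach when run directly in BigQuery (dataset_name is a placeholder, and the ticker value would come from your parameter):

DECLARE ticker STRING DEFAULT 'AAPL';

-- Build the fully qualified table name at run time and execute the query.
EXECUTE IMMEDIATE FORMAT("""
  SELECT * FROM `dataset_name.%s`
""", ticker);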
Related
We have a partitioned table in Google BigQuery that we query using the _PARTITIONTIME field (otherwise the queries cost too much).
How can I make Tableau use _PARTITIONTIME pseudo column?
When configuring your datasource in Tableau, select "Google BigQuery" from the list of available sources, go through the OAuth dance, and then select your project and dataset.
At this point, you will be presented with a list of tables in the dataset, as well as an option to use "New Custom SQL" at the bottom. Select this option, and enter your query exactly as you have been using it. Assuming that the query contains a segment similar to the below:
...WHERE _PARTITIONTIME BETWEEN TIMESTAMP("2016-05-01") AND TIMESTAMP("2016-05-06")
Now, highlight the dates within that where clause and click on the "Insert Parameter" dropdown menu in the query editor. This will allow you to parameterize your query and dynamically choose the dates you want to query from within your Tableau workbook!
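Once the parameters are inserted, the custom SQL might end up looking something like this (a sketch; the table name and the StartDate/EndDate parameter names are placeholders for whatever you create in your workbook):

SELECT *
FROM `project.dataset.mytable`
WHERE _PARTITIONTIME BETWEEN TIMESTAMP(<Parameters.StartDate>) AND TIMESTAMP(<Parameters.EndDate>)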
Hopefully this helps!
A more flexible solution would be to just drag and drop the table as you would normally.
After that, click 'Data source - Convert to Custom SQL'.
This way Tableau has written most of the SQL for you.
Last part: don't add a WHERE clause that stops user exploration. Add this to the SELECT list instead:
_PARTITIONTIME AS mypartitiondate
Now you can use this as you would any other date column, but you are using partition time instead. The caveat is that if you use this column, you can't drill down below the level of the partitions (e.g. trying to see hourly trades on a daily-partitioned dataset). That would require the user to know to switch to another date column for that part instead.
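For example, the converted custom SQL might look like this (a sketch; the table name is a placeholder):

SELECT
  _PARTITIONTIME AS mypartitiondate,
  t.*
FROM `project.dataset.trades` AS t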
I accidentally added a wrong column to my BigQuery table schema.
Instead of reloading the complete table (million of rows), I would like to know if the following is possible:
remove bad rows (rows whose values ended up in the wrong column) by running a "select *" query on the table with some kind of filter, and saving the result to the same table.
removing the (now) unused column.
Is this functionality (or similar) supported?
Possibly the "save result to table" functionality can have a "compact schema" option.
The simplest, least time-consuming way to remove a column from BigQuery, according to the documentation:
ALTER TABLE [table_name] DROP COLUMN IF EXISTS [column_name]
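Using the dataset, table, and column names that appear later in this thread, that would be, for instance:

ALTER TABLE transactions.test_table DROP COLUMN IF EXISTS c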
If your table does not consist of record/repeated type fields - your simple option is:
Select the valid columns while filtering out bad records into a new temp table
SELECT < list of original columns >
FROM YourTable
WHERE < filter to remove bad entries here >
Write the above to a temp table - YourTable_Temp
Make a backup copy of "broken" table - YourTable_Backup
Delete YourTable
Copy YourTable_Temp to YourTable
Check if all looks as expected and if so - get rid of temp and backup tables
Please note: the cost of #1 above is exactly the same as the action in the first bullet of your question. The rest of the actions (copies) are free. The whole plan is sketched below.
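For reference, the plan expressed in BigQuery SQL (a sketch; the table, column, and filter names are placeholders, and the free copies use the CREATE TABLE ... COPY DDL):

-- 1. Filter good rows into a temp table (this query is billed).
CREATE TABLE dataset_name.YourTable_Temp AS
SELECT col_a, col_b                 -- the original columns, minus the bad one
FROM dataset_name.YourTable
WHERE bad_col IS NULL;              -- your filter to remove bad entries

-- 2. Back up the broken table (table copies are free).
CREATE TABLE dataset_name.YourTable_Backup COPY dataset_name.YourTable;

-- 3. Replace the original with the cleaned version.
DROP TABLE dataset_name.YourTable;
CREATE TABLE dataset_name.YourTable COPY dataset_name.YourTable_Temp;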
If you have repeated/record fields, you can still execute the above plan, but in #1 you will need to use some BigQuery user-defined functions to get the proper schema in the output.
You can see the examples below - of course this will require some extra dev work, but if you are in a critical situation, this should work for you:
Create a table with Record type column
create a table with a column type RECORD
I hope at some point the Google BigQuery team will add better support for cases like yours, where you need to manipulate and output repeated/record data, but for now this is the best workaround I have found, at least for myself.
Below is the code to do it. Let's say c is the column that you want to delete.
CREATE OR REPLACE TABLE transactions.test_table AS
SELECT * EXCEPT (c) FROM transactions.test_table;
A second method, and my favorite, is to follow the steps below.
Write a SELECT query that excludes the columns you don't want.
Go to Query Settings
In the Destination section, enable "Set a destination table for query results", then enter the project name, dataset name, and table name exactly as the source table in step 1.
For "Destination table write preference", select "Overwrite table".
Save the Query Setting and run the query.
"Save results to table" is the way to go. Try it on the big table with only the columns you are interested in, and you can apply a LIMIT to keep the result small.
Is there a way to retrieve the field names from an actual query, similar to how you can retrieve field names from a table using INFORMATION_SCHEMA? In essence, I want to accept an entire query as a parameter and be able to build an empty table with the field names from that query.
Right now, I'm using a STUFF() function to replace the SELECT with SELECT TOP 1 and replacing the FROM with an INTO tablename FROM, and then truncating tablename (sketched below). It works, but I need this to be able to handle complex queries where the first occurrence of FROM might not be the one I need (in case the user is using a subquery for a field, for example).
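For context, a minimal sketch of that workaround in T-SQL (the query text and the dbo.Source / dbo.QueryShape names are hypothetical):

DECLARE @query NVARCHAR(MAX) = N'SELECT a, b FROM dbo.Source';

-- Prepend TOP 1, then inject INTO before FROM.
SET @query = STUFF(@query, 1, LEN('SELECT'), 'SELECT TOP 1');
-- Note: REPLACE rewrites every FROM, which is exactly what breaks on subqueries.
SET @query = REPLACE(@query, 'FROM', 'INTO dbo.QueryShape FROM');

EXEC sp_executesql @query;       -- creates dbo.QueryShape with one row
TRUNCATE TABLE dbo.QueryShape;   -- leave only the empty shell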
I was wondering if there was a way to search an entire SQLite database for one specific word. I do not know the column that it is in or even the table that it is in.
The table and row/column that contains this specific word also contains the other entries that I need to edit.
In-short:
Need to find a specific word
Can't query (I don't think I can, at least) since I don't know the table or column name where it's located.
I need to know where this specific word is referenced, in what table and row, so I can access the others that are alongside it.
Basically, is there a CTRL+F functionality for SQLite that searches the entirety of the SQLite file?
I have Mac/Windows/Linux machines. I am not limited by software if that is a solution.
Any such functionality would essentially be running queries that check every column of every table. You can do that via a script that runs the following SQL:
1) Get a list of all the tables:
select name from sqlite_master where type = 'table'
2) For each table, get all of its columns (the column name is available in the name field):
pragma table_info(cows)
3) Then for each table, generate a query that checks every field and run it:
select *
from cows
where name like '%Daisy%'
   or owner like '%Daisy%'
   or farm like '%Daisy%'
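If you don't want to hand-write the per-table queries, SQLite 3.16+ exposes the pragmas as table-valued functions, so a single query can generate all of the search statements for you (a sketch; 'Daisy' stands in for the word you are looking for):

select 'select ''' || m.name || ''' as table_name, * from "' || m.name || '" where ' ||
       group_concat('"' || p.name || '" like ''%Daisy%''', ' or ') || ';'
from sqlite_master as m
join pragma_table_info(m.name) as p
where m.type = 'table'
group by m.name;

Run the statements it prints to find every row, in every table, that contains the word.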
Hi, I have a table which was designed by a lazy developer who did not create it in third normal form. He saved arrays in the table instead of using an M:M relation, and the application is running, so I cannot change the database schema.
I need to query the table like this:
SELECT * FROM myTable
WHERE usergroup = 20
where the usergroup field contains data like this: 17,19,20. It could also be only 20, or only 19.
I could search with LIKE:
SELECT * FROM myTable
WHERE usergroup LIKE 20
but in this case it would also match fields that contain 200, for example.
Anybody got any idea?
Thanks
Fix the bad database design.
A short-term fix is to add a related table with the correct structure. Add a trigger that parses the info in the old field into the related table on insert and update. Then write a script to parse out the existing data. Now you can properly query, but you haven't broken any of the old code. Then you can search for the old code and fix it. Once you have done that, just change how data is inserted or updated to use the new table, and drop the old column.
Write a table-valued user-defined function (a UDF in SQL Server; it will have a different name in other RDBMSs) to parse the values of the column containing the list, which is stored as a string. For each item in the comma-delimited list, your function should return a row in the table result. When you run a query like this, query against the results returned from the UDF.
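On SQL Server 2016 and later, the built-in STRING_SPLIT table-valued function already does this parsing, so (reusing the table and column names from the question) the query could be sketched as:

SELECT t.*
FROM myTable AS t
CROSS APPLY STRING_SPLIT(t.usergroup, ',') AS s
WHERE s.value = '20'

Because every list item becomes its own row, 20 no longer matches 200.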
Write a function to convert a comma-delimited list to a table. It should be pretty simple. Then you can use IN().