Is it possible to insert rows with different fields into a BigQuery table? - google-bigquery

Using the BigQuery UI I've created a new table free_schema_table without setting any schema, then I tried to execute:
insert into my_dataset.free_schema_table (chatSessionId, chatRequestId, senderType, senderFriendlyName)
values ("123", "1234", "CUSTOMER", "Player")
But the BigQuery UI showed me a popup that said:
Column chatSessionId is not present in table my_dataset.free_schema_table at [1:43]
I expected that BigQuery is a NoSQL store and that I should be able to insert rows with different columns.
How could I achieve this?
P.S.
schema:

BigQuery requires a schema with strongly typed columns.
If you need a free schema, the closest thing in BigQuery is to define a single STRING column and store JSON inside it.
JSON functions will help you extract fields from the JSON string later, but you don't get the benefit of BigQuery's optimizations that you would get by predefining your schema and saving the data in separate columns.
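For example, a minimal sketch of that pattern: define the table with a single STRING column (here called payload, a made-up name for illustration) and read fields back with JSON_EXTRACT_SCALAR, a BigQuery Standard SQL function:

CREATE TABLE my_dataset.free_schema_table (payload STRING);

INSERT INTO my_dataset.free_schema_table (payload)
VALUES ('{"chatSessionId":"123","chatRequestId":"1234","senderType":"CUSTOMER","senderFriendlyName":"Player"}');

-- later, pull individual fields back out of the JSON string
SELECT
  JSON_EXTRACT_SCALAR(payload, '$.chatSessionId') AS chatSessionId,
  JSON_EXTRACT_SCALAR(payload, '$.senderType')    AS senderType
FROM my_dataset.free_schema_table;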

Related

How to set up custom fields in Superset?

Our database has a field with JSON data that we'd like to use in reports. E.g.
{
  "owner_type": "USER",
  "updated_at": 1641996749092389600,
  "version_no": 1,
  "entity_type": "INDIVIDUAL",
  "country": "ES"
}
How can one create dynamic fields in Superset, e.g. to expose owner_type as its own field?
I'm coming from tools like Snowflake and Zoho Analytics where you could build Views, Dynamic Tables and Formula Fields based on aggregated raw data.
You can add columns to your table in Superset. Hover over 'Sources' in the header and select 'Tables'. Then from there, choose the option to edit the record of your table. In that you can add a calculated column/custom column.
To add a column for owner_type, let's name the custom column owner_type. Set the data type for the new column to VARCHAR(100). Choose the table from the dropdown. In the expression, put json_column->"$.owner_type" and then hit save. This expression is for a MySQL database. You can find the expression to parse JSON in your particular DB.
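As a rough illustration of that MySQL (5.7+) expression, where json_column and my_table are placeholder names for the JSON column and its table, the calculated column effectively evaluates to:

SELECT
  json_column->"$.owner_type"  AS owner_type,          -- shorthand for JSON_EXTRACT, value still quoted
  json_column->>"$.owner_type" AS owner_type_unquoted  -- ->> also strips the surrounding quotes
FROM my_table;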

Convert the latest data pull of a raw Variant table into a normal table: Snowflake

I have a variant table where raw JSON data is stored in a column called "raw", as shown here.
Each row of this table is a full data pull from an API, ingested via Snowpipe. Within the JSON there is a 'pxQueryTimestamp' key/value pair. The latest value for this field should have the most up-to-date data. How would I go about normalizing only this row?
Usually my way around this, is to only pipe over the latest data from "s3" so that this table has only one row, then I normalize that.
I'd like to keep a historic table of all data pulls, as shown below, but when normalizing we only care about the most recent, up-to-date data.
Any help is appreciated!
If you are saying that you want to flatten and retain everything in the most current variant record, then I'd suggest leveraging a STREAM object in Snowflake, which would then only have the latest variant record. You could then TRUNCATE your flattened table and run an insert from the STREAM object to your flattened table, which would then move the offset forward and your STREAM would then be empty.
Take a look at the documentation here:
https://docs.snowflake.net/manuals/user-guide/streams.html
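A minimal sketch of that flow, assuming the raw table is called raw_pulls (variant column "raw") and the flattened target is flattened_pulls; both names, and the results array path, are made up for illustration:

-- capture new variant rows as they arrive via Snowpipe
CREATE OR REPLACE STREAM raw_pulls_stream ON TABLE raw_pulls;

-- when a new pull has landed, rebuild the flattened table from the stream
TRUNCATE TABLE flattened_pulls;

INSERT INTO flattened_pulls
SELECT
    s.raw:pxQueryTimestamp::string AS px_query_timestamp,
    f.value                        AS item
FROM raw_pulls_stream s,
     LATERAL FLATTEN(input => s.raw:results) f;
-- consuming the stream in a DML statement advances its offset, so it is empty again afterwards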

Pixel tracking to BQ: how to save querystring parameter values directly to BQ table fields

I was setting up a serverless tracking pixel using this article: https://cloud.google.com/solutions/serverless-pixel-tracking-tutorial
This works, but it saves the entire pixel GET URL into one single field in BQ. Since the pixel URL carries multiple querystring parameter values, it would be best if these went into individual fields in BQ: I want to tweak the setup so that each querystring parameter value of the GET tracking pixel is saved into its own BQ table field.
Assuming the names and number of the querystring parameters are known and they match 1-to-1 to the BQ table columns - what would be the recommended way to achieve this?
I was looking in the article to see whether the logs query can be tuned to do this.
I also saw that Dataflow may be the way to go, but I'm wondering if it is possible in a more direct and simple way.
The simple, direct way is to create a query in BigQuery that exposes the parameters as columns.
You can save this query as a VIEW first and query the view instead of repeating the full query.
Then you can set up a scheduled query in BigQuery to run this query regularly and save the results into a new table. This way the old table keeps receiving the raw URLs as before, and the scheduled query maintains the new table.
You can tune it to remove already-processed rows from the incoming table and append only the new rows to the materialized table.
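For instance, a rough sketch of such a view, assuming the raw hits land in my_dataset.pixel_hits with the full request in a column named url, and with made-up parameter names uid and event:

CREATE OR REPLACE VIEW my_dataset.pixel_hits_parsed AS
SELECT
  REGEXP_EXTRACT(url, r'[?&]uid=([^&]*)')   AS uid,    -- one REGEXP_EXTRACT per known parameter
  REGEXP_EXTRACT(url, r'[?&]event=([^&]*)') AS event,
  url
FROM my_dataset.pixel_hits;

A scheduled query can then SELECT from this view and write the result into the parsed table.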

How to add a new BigQuery table data to Tableau extract?

Is it possible to add data from a new BigQuery table to existing Tableau extract?
For example, there are BigQuery tables partitioned by date like access_20160101, access_20160102, ..., and data from 2016/01/01 to 2016/01/24 is already in a Tableau Server extract. Now a new table for 2016/01/25, access_20160125, has been created, and I want to add its data to the existing extract, but I don't want to read the old tables because there is no change in them and loading them would be charged by Google.
If I understand correctly: you created an extract for a table in BigQuery and now you want to append data in a different table to that extract.
As long as both tables have exactly the same column names and data types in those columns you can do this:
Create an extract from the new table.
Append that extract to the old one. (see: add data from a file)
Now you have one extract with the data of both tables.

Query a table not in 3rd normal form

Hi, I have a table which was designed by a lazy developer who did not create it in 3rd normal form. He saved arrays in the table instead of using an M:M relation, and the application is running, so I cannot change the database schema.
I need to query the table like this:
SELECT * FROM myTable
WHERE usergroup = 20
where the usergroup field contains data like 17,19,20, or it could also be only 20 or only 19.
I could search with LIKE:
SELECT * FROM myTable
WHERE usergroup LIKE '%20%'
but in this case it would also match fields that contain 200, for example.
Anybody have any idea?
Thanks!
Fix the bad database design.
A short-term fix is to add a related table with the correct structure. Add a trigger that parses the info in the old field into the related table on insert and update. Then write a script to parse out the existing data. Now you can properly query, but you haven't broken any of the old code. Then you can search for the old code and fix it. Once you have done that, just change how data is inserted or updated in the original table to use the new table, and drop the old column.
Write a table-valued user-defined function (a UDF in SQL Server; it will have a different name in other RDBMSs) to parse the values of the column containing the list, which is stored as a string. For each item in the comma-delimited list, your function should return a row in the table result. When you run a query like this, query against the results returned from the UDF.
Write a function to convert a comma delimited list to a table. Should be pretty simple. Then you can use IN().
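A hedged sketch of that idea on SQL Server 2016+, where the built-in STRING_SPLIT is already such a table-valued split function (on older versions or other databases you would write your own split UDF as described above):

SELECT t.*
FROM myTable t
WHERE '20' IN (SELECT value FROM STRING_SPLIT(t.usergroup, ','));  -- exact item match, so 200 no longer qualifies

On MySQL, FIND_IN_SET('20', usergroup) > 0 achieves the same exact-item match without a UDF.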