Is the column of type JSON deprecated? - google-bigquery

In the BigQuery console, when creating a table, there used to be a JSON option among the column types, but weirdly enough it was never present in their docs. We used this column type in our production tables and discovered later on that you can't select it in queries without BigQuery throwing an error, and the JSON functions also didn't work with it. So we simply stopped using this column in queries, but it still exists in our tables.
However, in the past couple of days, all queries against this table have been failing with the error 400 Json is not enabled for current project. and this column type is no longer present in the BigQuery console. It seems it was removed or deprecated? I checked the release notes, but the latest release was well before the error occurred. This broke our production environment, and we couldn't even export the data, because exporting gave the same error. Instead we had to use a new table without this column, which meant we lost all our history.
Did anyone face the same problem with any other column types before? Is it normal for a type to be deprecated without users being notified beforehand? This is making me question the reliability of BigQuery.

Please reach out to Google Cloud support and we will help you fix your issue with that problematic table. You may also want to try fixing it yourself using the ALTER TABLE DROP COLUMN statement that is currently in public preview [1]. This will drop the erroneous column (only the data in that column will be lost); the rest of the data will remain usable.
[1] https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#alter_table_drop_column_statement
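For reference, a minimal sketch of what that statement looks like, using a hypothetical dataset, table, and column name (the broken JSON column is called payload here):

-- drops only the problematic column; the remaining columns and their data stay intact
ALTER TABLE mydataset.mytable
DROP COLUMN payload;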

I ran into the same error message a few days ago and was surprised to read about this policy change, which is not backed up by a mitigation process. My attempt to use Vlad Grachev's suggestion to drop this column did not succeed, as the console does not allow querying this table (same "Json is not enabled for current project." error).
My only remediation at this point (sketched after the list below) is to:
build a new table where the json column is switched to type string
create a pipeline that transforms the objects to strings
migrate the data through the pipeline to the new table
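A rough sketch of the first step, assuming hypothetical table and column names (the broken JSON column is called payload here); the transform and migration steps have to run outside of SQL, since the original column cannot be selected:

-- new table with the same columns, but the JSON column replaced by a STRING column
CREATE TABLE mydataset.events_v2 (
  id INT64,
  created_at TIMESTAMP,
  payload STRING
);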

In BigQuery, JSON data can be stored in a column of type RECORD. Are you referring to the same thing by "JSON column type"?
BigQuery uses the RECORD (or STRUCT) type to represent nested structures. A column of RECORD type is in fact a large column containing multiple child columns. For more information, refer to the link below:
Json Data in BigQuery
If you are not referring to the RECORD data type, the JSON column type might have been a test feature that was not subject to a deprecation scheme.
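For illustration, a minimal sketch of the RECORD/STRUCT approach in standard SQL; the field names here are made up, but it shows how nested JSON-like data is usually modeled and queried:

-- a STRUCT column holds named child columns that are addressed with dot notation
SELECT person.name, person.age
FROM (
  SELECT STRUCT('Alice' AS name, 30 AS age) AS person
);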

Related

Typeorm Migrations - How to deal with 'column "<column_name>" contains null values' errors in production

The main problem I want to discuss is the schema synchronization conflicts that occur when tables already have data and a new required attribute is added, or a required attribute is renamed. This question already has some possible solutions, but they are not acceptable in a production environment where you already have user data, since they simply suggest deleting the data. I also want to enforce required fields, so setting the column to {nullable: true} is not an option for me.
As an example suppose I had a column named "time" that I renamed to "minutes". When I synchronize the schemas TypeORM produces the following error:
QueryFailedError: column "minutes" contains null values
Is there a more elegant/automated way to deal with these errors other than just setting the column to {nullable: true}? I can imagine that you could write some custom SQL in the migration script to also modify the row values (sketched below), but that seems like a little too much manual effort to me.
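For reference, a minimal sketch of what that custom migration SQL could look like for the rename case, assuming Postgres and a hypothetical "session" table; renaming the column in place keeps the existing values, so no nulls are introduced:

-- rename instead of drop-and-add, so existing data is preserved
ALTER TABLE session RENAME COLUMN "time" TO minutes;
-- enforce the required constraint afterwards
ALTER TABLE session ALTER COLUMN minutes SET NOT NULL;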

problem with auto-created bigquery field name that contains "."

I used a simple ETL tool to import QuickBooks data into Google BigQuery. Great! The only notable limitation at this step is that I can't do any translation ... it's more like an EL tool.
That said, now I want to query the imported table. It's no problem at all for correctly named fields in BigQuery (like txndate). However, some of the fields are of the format abc.xyz (e.g., deposittoaccountref.value) and can't be queried. The "." in the name is apparently confusing BigQuery.
If I dump the whole table, I can see the "." name fields and the associated values.
However, I can't create a custom query against those fields. They don't show up in the auto-generated schema that allows one to drag and drop field names into the query.
Also, I tried to manually type the field name in and received the following error message: Missing column alias. Any expression in a SELECT statement that is not a column from the original data source must be followed by an alias, for example: AS my_alias.
I've tried quoting the field name and bracketing the field name but they still throw the same error.
I traced back to QB API documentation and this is indeed how Intuit labels the fields.
Finally, once I can query these fields at all, I can rename them to eliminate the "." problem.
Please advise and thank you!
OK, I solved this myself.
The way to fix this within the BigQuery query editor is to manually type in the field name (i.e., it is not available in the auto-generated schema) and to parenthesize the field name.
E.g., deposittoaccountref.value becomes (deposittoaccountref.value)
Now, this will label the column in the result set as "value", so you may want to relabel the field to something without the ".". For example, I took the original
deposittoaccountref.value and modified it to
(deposittoaccountref.value) as deposittoaccountref_value
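To make that concrete, here is a full query using this trick; the dataset and table name are made up:

SELECT
  txndate,
  (deposittoaccountref.value) AS deposittoaccountref_value
FROM [mydataset.quickbooks_deposits]
LIMIT 100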
Hopefully, this will help someone else in the future!
The above answer works when there is a single dot in the name, as in the example.
However, if there are multiple dots, e.g. "line.value.amount", then the parenthesis trick doesn't work.
I've tried nesting the parentheses in different ways to no avail,
e.g. (line.value.amount) = error, ((line.value).amount) = error, (line.(value.amount)) = error

Doing a SELECT * from TABLE gives "Cannot read field 'records' of type INT64 as UINT64"

I am running into something that appears to be a global BigQuery issue that started maybe only a few days ago. It was definitely working on Jan 7th, 2019. I narrowed the issue down to a simple SELECT * FROM TABLE, which throws Cannot read field 'records' of type INT64 as UINT64. The records field is declared as INTEGER in the schema, and the table is the result of an aggregate query.
I am getting the same error both programmatically as well as in BigQuery UI.
If I explicitly list STRING fields, the query works. As soon as I reference records which is INTEGER, the query fails.
Job id is dulcet-outlook-94110:US.bquxjob_5883645e_16858aba0ae.
Alternatively, anyone can reproduce this using public data by saving the following query into a temp table and then doing a simple SELECT * from temp.
SELECT state, count(*) cnt FROM [bigquery-public-data:samples.natality]
GROUP BY state
This gives a slightly different but essentially the same error: Type mismatch for column 'cnt' in table temp. Expected type 'uint64', actual type 'int64' in file :mdb=cloud-dataengine.
(EDIT: Make sure to use "Allow Large Results"; otherwise it will work fine.)
Thank you for raising this. This is indeed a bug in BigQuery; a fix has now been completely rolled out.
For the broken tables, although no data is lost, they are in a state that is inconsistent with their schema. Please try to regenerate them if you can, as their schemas won't fix themselves automatically yet. We are working on ways to fix the schemas of the existing affected tables, but it might take some time.
If you still have any problems, feel free to report them to the public issue tracker wpfwannabe created above.

Google BigQuery - Insert All with Table Suffix failing

I have a project where I was previously creating tables on Insert. I am attempting to instead perform an insertAll with a templateSuffix. It seems to work great with new tables, but I have this odd case.
The following URL (https://gist.github.com/dovy/b5b5b25e660ac037aaa130294ab42e3a) provides an example insert. I have some data from a source, the desired table schema (table_schema.txt), and a template schema (table_template_schema.txt). The only difference between the two schemas is the order of the last 2 columns:
|- cache_file: string
|- deduped: integer
The error I get is
HttpError:
https://www.googleapis.com/bigquery/v2/projects/flash-student-96619/datasets/log_data_v7/tables/day/insertAll?alt=json
returned "Provided Schema does not match Table
flash-student-96619:log_data_v7.day20160423. Template and generated
table schemas are incompatible"
Is insertAll really that picky? There's no way to re-order columns unless I do a query and replace on the same table. That seems incredibly painful.
Any clues from anyone out there?
I ended up doing a standard insert without the templateSuffix and, if it failed (try/catch), an insert with the templateSuffix. That bypasses this insane requirement of perfect column order, and it all works for me.
I just wish I didn't have to work around this.

Error: Schema changed for Timestamp field (additional)

I am getting an error message when I query a specific table in my dataset that has a nullable timestamp field. In the BigQuery web tool, I run a simple query, e.g.:
SELECT * FROM [reztrack.201401] LIMIT 100
The result I get is: Error: Schema changed for Timestamp field date
Example Job ID: esiteisthebomb:job_6WKi7ZhSi8D_Ewr8b5rKV-a5Eac
This is the exact same issue that was noted here: Error: Schema changed for Timestamp field.
I also logged this under https://code.google.com/p/google-bigquery/issues/detail?id=307, but I was unsure, since it said we should be logging everything on Stack Overflow.
Any information on how to fix this for this or other tables would be greatly appreciated.
Note: The original answer states to contact Google support, but Google support for BigQuery was moved to Stack Overflow. Therefore I assume that means opening it as a new question in the hope that the engineers will respond.
BigQuery recently improved the representation of its internal timestamp format (there had previously been a lot of cases where timestamps broke in strange ways, and this change should fix that). Your table was still using the old timestamp format, and you tickled a bug in the old format when schemas changed (in this case, the field went from REQUIRED to OPTIONAL).
We have an automated process that coalesces tables to make their storage more efficient. I scheduled this to run over your table, and have verified that it has rewritten your table using the new timestamp format.
You should now be able to query this field of your table without further problems.