Fetching attribute from JSON string with JSON_VAL cause "<attribute> is invalid in the used context" error - sql

A proprietary third-party application stores JSON strings in it's database like this one:
{"state":"complete","timestamp":1614776473000}
I need the timestamp and found out that
DB2 offers JSON functions. Since it's stored as string in the PROF_VALUE column, I guess that converting with SYSTOOLS.JSON2BSON is required, before I can use JSON_VAL to fetch the timestamp:
SELECT SYSTOOLS.JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), "timestamp", "f")
FROM EMPINST.PROFILE_EXTENSIONS ext
WHERE PROF_PROPERTY_ID = 'touchpointState'
This causes an error that timestamp is invalid in the used context ( SQLCODE=-206, SQLSTATE=42703, DRIVER=4.26.14). The same error is thown when I remove the JSON2BSON call like this
SELECT SYSTOOLS.JSON_VAL(PROF_VALUE, "timestamp", "f")
Also not working with the same error (different data-types):
SELECT SYSTOOLS.JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), "state", "s:1000")
SELECT SYSTOOLS.JSON_VAL(PROF_VALUE) "state", "s:1000")
I don't understand this error. My syntax is like the documented JSON_VAL ( json-value , search-string , result-type) and it is the same like in the examples, where they show how to fetch the name field of an object.
I also played around a bit with JSON_TABLE to use raw input data for testing (instead of the database data), but it seems not suiteable for that.
SELECT *
FROM TABLE(SYSTOOLS.JSON_TABLE( SYSTOOLS.JSON2BSON('{"state":"complete","timestamp":1614776473000}'), 'state','s:32')) DATA
This gave me a table with one row: Type = 2 and Value = complete.

I had two problems in my query: First it seems that double quotes " are for object references. I wasn't aware that there is any difference, because in most databases I used yet, both single ' and double quotes " are equal.
The second problem is, that JSON_VAL needs to be called without SYSTOOLS, but the reference is still needed on SYSTOOLS.JSON2BSON(PROF_VALUE).
With those changes, the following query worked:
SELECT JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), 'timestamp', 'f')
FROM EMPINST.PROFILE_EXTENSIONS ext
WHERE PROF_PROPERTY_ID = 'touchpointState'

Related

dynamic split json string in bigquery [duplicate]

I have load the entire json file into a STRING column of BigQuery table. Now I am trying to access the keys using JSON_EXTRACT_SCALAR function, but I am getting null result for the child keys which contain special character period(".") within their name.
Here's the snippet of the data:
{"server_received_time":"2019-01-17 15:00:00.482000","app":161,"device_carrier":null,"$schema":12,"city":"Caro","user_id":null,"uuid":"9018","event_time":"2019-01-17 15:00:00.045000","platform":"Web","os_version":"49","vendor_id":711,"processed_time":"2019-01-17 15:00:00.817195","user_creation_time":"2018-11-01 19:16:34.971000","version_name":null,"ip_address":null,"paying":null,"dma":null,"group_properties":{},"user_properties":{"location.radio":"ca","vendor.userTier":"free","vendor.userID":"a989","user.id":"a989","user.tier":"free","location.region":"ca"},"client_upload_time":"2019-01-17 15:00:00.424000","$insert_id":"e8410","event_type":"LOADED","library":"amp\/4.5.2","vendor_attribution_ids":null,"device_type":"Mac","device_manufacturer":null,"start_version":null,"location_lng":null,"server_upload_time":"2019-01-17 15:00:00.493000","event_id":64,"location_lat":null,"os_name":"Chrome","vendor_event_type":null,"device_brand":null,"groups":{},"event_properties":{"content.authenticated":false,"content.subsection1":"regions","custom.DNT":true,"content.subsection2":"ca","referrer.url":"","content.url":"","content.type":"index","content.title":"","custom.cookiesenabled":true,"app.pillar":"feed","content.area":"news","app.name":"oc"},"data":{},"device_id":"","language":"English","device_model":"Mac","country":"","region":"","is_attribution_event":false,"adid":null,"session_id":15,"device_family":"Mac","sample_rate":null,"idfa":null,"client_event_time":"2019-01-17 14:59:59.987000"}
{"server_received_time":"2019-01-17 15:00:00.913000","app":161,"device_carrier":null,"$schema":12,"city":"Fo","user_id":null,"uuid":"9052","event_time":"2019-01-17 15:00:00.566000","platform":"Web","os_version":"71","vendor_id":797,"processed_time":"2019-01-17 15:00:01.301936","user_creation_time":"2019-01-17 15:00:00.566000","version_name":null,"ip_address":null,"paying":null,"dma":"CO","group_properties":{},"user_properties":{"user.tier":"free"},"client_upload_time":"2019-01-17 15:00:00.157000","$insert_id":"69ae","event_type":"START WEB SESSION","library":"amp\/4.5.2","vendor_attribution_ids":null,"device_type":"Android","device_manufacturer":null,"start_version":null,"location_lng":null,"server_upload_time":"2019-01-17 15:00:00.925000","event_id":1,"location_lat":null,"os_name":"Chrome Mobile","vendor_event_type":null,"device_brand":null,"groups":{},"event_properties":{"content.subsection3":"home","content.subsection2":"archives","content.title":"","content.keywords.subject":["Lifestyle\/Recreation and leisure\/Outdoor recreation\/Boating","Lifestyle\/Relationships\/Couples","General news\/Weather","Oddities"],"content.publishedtime":154687,"app.name":"oc","referrer.url":"","content.subsection1":"archives","content.url":"","content.authenticated":false,"content.keywords.location":["Ot"],"content.originaltitle":"","content.type":"story","content.authors":["Archives"],"app.pillar":"feed","content.area":"news","content.id":"1.49","content.updatedtime":1546878600538,"content.keywords.tag":["24 1","boat house","Ot","Rockcliffe","River","m"],"content.keywords.person":["Ber","Shi","Jea","Jean\u00e9tien"]},"data":{"first_event":true},"device_id":"","language":"English","device_model":"Android","country":"","region":"","is_attribution_event":false,"adid":null,"session_id":15477,"device_family":"Android","sample_rate":null,"idfa":null,"client_event_time":"2019-01-17 14:59:59.810000"}
{"server_received_time":"2019-01-17 15:00:00.913000","app":16,"device_carrier":null,"$schema":12,"city":"","user_id":null,"uuid":"905","event_time":"2019-01-17 15:00:00.574000","platform":"Web","os_version":"71","vendor_id":7973,"processed_time":"2019-01-17 15:00:01.301957","user_creation_time":"2019-01-17 15:00:00.566000","version_name":null,"ip_address":null,"paying":null,"dma":"DCO","group_properties":{},"user_properties":{"user.tier":"free"},"client_upload_time":"2019-01-17 15:00:00.157000","$insert_id":"d045","event_type":"LOADED","library":"am-js\/4.5.2","vendor_attribution_ids":null,"device_type":"Android","device_manufacturer":null,"start_version":null,"location_lng":null,"server_upload_time":"2019-01-17 15:00:00.925000","event_id":2,"location_lat":null,"os_name":"Chrome Mobile","vendor_event_type":null,"device_brand":null,"groups":{},"event_properties":{"content.subsection3":"home","content.subsection2":"archives","content.subsection1":"archives","content.keywords.subject":["Lifestyle\/Recreation and leisure\/Outdoor recreation\/Boating","Lifestyle\/Relationships\/Couples","General news\/Weather","Oddities"],"content.type":"story","content.keywords.location":["Ot"],"app.pillar":"feed","app.name":"oc","content.authenticated":false,"custom.DNT":false,"content.id":"1.4","content.keywords.person":["Ber","Shi","Jea","Je\u00e9tien"],"content.title":"","content.url":"","content.originaltitle":"","custom.cookiesenabled":true,"content.authors":["Archives"],"content.publishedtime":1546878600538,"referrer.url":"","content.area":"news","content.updatedtime":1546878600538,"content.keywords.tag":["24 1","boat house","O","Rockcliffe","River","pr"]},"data":{},"device_id":"","language":"English","device_model":"Android","country":"","region":"","is_attribution_event":false,"adid":null,"session_id":1547737199081,"device_family":"Android","sample_rate":null,"idfa":null,"client_event_time":"2019-01-17 14:59:59.818000"}
Here's the sample query against the table:
SELECT
CAST(JSON_EXTRACT_SCALAR(data,'$.uuid')AS INT64) AS uuid_id,
CAST(JSON_EXTRACT_SCALAR(data,'$.event_time') AS TIMESTAMP) AS event_time,
JSON_EXTRACT_SCALAR(data,'$[event_properties].app.name') AS app_name,
JSON_EXTRACT_SCALAR(data,'$[user_properties].user.tier') AS user_tier
FROM
mytable
Above query give null result for app_name & user_tier columns even though data exists for them.
Following the BigQuery JSON function documentation - JSON Functions in Standard SQL
In cases where a JSON key uses invalid JSONPath characters, you can escape those characters using single quotes and brackets, [' '].
and running the query as:
SELECT
CAST(JSON_EXTRACT_SCALAR(data,"$.uuid_id")AS INT64) AS uuid_id,
CAST(JSON_EXTRACT_SCALAR(data,"$.event_time") AS TIMESTAMP) AS event_time,
JSON_EXTRACT_SCALAR(data,"$.event_properties.['app.name']") AS app_name,
JSON_EXTRACT_SCALAR(data,"$.user_properties.['user.tier']") AS user_tier
FROM
mytable
result into following error:
Invalid token in JSONPath at: .['app.name']
Please advise. What am I missing here?
You have an extra . before the [. Use
"$.event_properties['app.name']"

timestamp VS TIMESTAMP_NTZ in snowflake sql

I am using an sql script to parse a json into a snowflake table using dbt.
One of the cols contain this datetime value: '2022-02-09T20:28:59+0000'.
What's the correct way to define ISO datetime's data type in Snowflake?
I tried date, timestamp and TIMESTAMP_NTZ like this in my dbt sql script:
JSON_DATA:",my_date"::TIMESTAMP_NTZ AS MY_DATE
but clearly, these aren't the correct one because later on when I test it in snowflake with select * , I get this error:
SQL Error [100040] [22007]: Date '2022-02-09T20:28:59+0000' is not recognized
or
SQL Error [100035] [22007]: Timestamp '2022-02-13T03:32:55+0100' is not recognized
so I need to know which Snowflake time/date data type suits the best for this one
EDIT:
This is what I am trying now.
SELECT
JSON_DATA:"date_transmission" AS DATE_TRANSMISSION
, TO_TIMESTAMP(DATE_TRANSMISSION:text, 'YYYY-MM-DDTHH24:MI:SS.FFTZH:TZM') AS DATE_TRANSMISSION_TS_UTC
, JSON_DATA:"authorizerClientId"::text AS AUTHORIZER_CLIENT_ID
, JSON_DATA:"apiPath"::text API_PATH
, MASTERCLIENT_ID
, META_FILENAME
, META_LOAD_TS_UTC
, META_FILE_TS_UTC
FROM {{ source('INGEST_DATA', 'TABLENAME') }}
I get this error:
000939 (22023): SQL compilation error: error line 6 at position 4
10:21:46 too many arguments for function [TO_TIMESTAMP(GET(DATE_TRANSMISSION, 'text'), 'YYYY-MM-DDTHH24:MI:SS.FFTZH:TZM')] expected 1, g
However, if I comment out the the first 2 lines(related to timpstamp types), the other two work perfectly fine. What's the correct syntax of parsing json with TO_TIMESTAMP?
Not that JSON_DATA:"apiPath"::text API_PATH gives the correct value for it in my snowflake tables.
Did some testing and it seems you have 2 options.
You can either get rid of the +0000 at the end: left(column_date, len(column_date)-5)
or try_to_timestamp with format
try_to_timestamp('2022-02-09T20:28:59+0000','YYYY-MM-DD"T"HH24:MI:SS+TZHTZM')
TZH and TZM are TimeZone Offset Hours and Minutes
So there are 2 main points here.
when getting data from JSON to pass to any of the timestamp functions that want a ::TEXT object, but the values to get from JSON are still ::VARIANT so they need to be cast. This is the cause of the error you quote
(22023): SQL compilation error: error line 6 at position 4
10:21:46 too many arguments for function [TO_TIMESTAMP(GET(DATE_TRANSMISSION, 'text'), 'YYYY-MM-DDTHH24:MI:SS.FFTZH:TZM')] expected 1, g
also your SQL is wrong there it should have been
TO_TIMESTAMP(DATE_TRANSMISSION::text,
How you handle the timezone format.As other have noted you (as I did in your last question) do you want to ignore the timezone values or read them. I forgot about the TZHTZM formatting. Given you have timezone data, you should use the TO_TIMESTAMP_TZ`TRY_TO_TIMESTAMP_TZto make sure the time zone data is keep, given you second example shows+0100`
putting those together (assuming you didn't want an extra date_transmission as a variant in you data) :
SELECT
TO_TIMESTAMP_TZ(JSON_DATA:"date_transmission"::text, 'YYYY-MM-DDTHH24:MI:SS+TZHTZM') AS DATE_TRANSMISSION_TS_UTC
, JSON_DATA:"authorizerClientId"::text AS AUTHORIZER_CLIENT_ID
, JSON_DATA:"apiPath"::text AS API_PATH
, MASTERCLIENT_ID
, META_FILENAME
, META_LOAD_TS_UTC
, META_FILE_TS_UTC
FROM {{ source('INGEST_DATA', 'TABLENAME') }}
You should use timestamp (not date which does not store the time information), but probably the format you are using is not autodetected. You can specify the input format as YYYY-MM-DD"T"HH24:MI:SSTZHTZM as shown here. The autodetected one has a : between the TZHTZM.

How to resolve this sql error of schema_of_json

I need to find out the schema of a given JSON file, I see sql has schema_of_json function
and something like this works flawlessly
> SELECT schema_of_json('[{"col":0}]');
ARRAY<STRUCT<`col`: BIGINT>>
But if I query for my table name, it gives me the following error
>SELECT schema_of_json(Transaction) as json_data from table_name;
Error in SQL statement: AnalysisException: cannot resolve 'schemaofjson(`Transaction`)' due to data type mismatch: The input json should be a string literal and not null; however, got `Transaction`.; line 1 pos 7;
The Transaction is one of the columns in my table and after checking it manually I can attest that it is of String type(json).
The SQL statement has it to give me the schema of the JSON, how to do it?
after looking further into the documentation that it is clear that the word foldable means that of the static one, and a column from a table JSON won't work
for minimal reroducible example here you go:
SELECT schema_of_json(CAST('{ "a": "b" }' AS STRING))
As soon as the cast is introduced in the above statement, the schema_of_json will fail......... It needs a static JSON as it's input

How to use sql statement in django?

I want to get the latest date from my database.
Here is my sql statement.
select "RegDate"
from "Dev"
where "RegDate" = (
select max("RegDate") from "Dev")
It works in my database.
But how can I use it in django?
I tried these codes but it return error. These code are in views.py.
Version 1:
lastest_date = Dev.objects.filter(reg_date=max(reg_date))
Error:
'NoneType' object is not iterable
Version 2:
last_activation_date = Dev.objects.filter(regdate='reg_date').order_by('-regdate')[0]
Error:
"'reg_date' value has an invalid format. It must be in YYYY-MM-DD HH:MM[:ss[.uuuuuu]][TZ] format."
I've defined reg_date at beginning of the class.
What should I do for this?
You make things too complicated, you simply order by regdate, that's all:
last_activation_dev = Dev.objects.order_by('-regdate').first()
The .first() will return such Dev object if it is available, or None if there are no Dev objects.
If you only are interested in the regdate column itself, you can use .values_list(..):
last_activation_date = Dev.objects.order_by('-regdate').values_list('regdate', flat=True).first()
By using .filter() you actually were filtering the Dev table by Dev records such that the regdate column had as value 'reg_date', since 'reg_date' is not a valid datetime format, this thus produced an error.

Parsing JSON file in Snowflake Database

Database :SNOWFLAKE
My table contains JSON data for example:
{
"bucket":"IN_Apps",
"bySeqno":56,
"cas":1527639206906626048,
"content":"eyJoaWdoQmluIjoiNTQ4NTA4MDkiLCJkb2N1bWVudFR5cGUiOiJJSU5ETyIsImNhcmRUeXBlIyayI6Ik1BU1RFUkNBUkQifQ==",
"event":"mutation",
"expiration":0,
"flags":33554432,
"key":"iin54850809",
"lockTime":0,
"partition":948,
"revSeqno":1,
"vBucketUuid":137987627737694
}
when i tried to parse it.
select
parse_json:bucket::string as bucket ,
parse_json:bySeqno::string as bySeqno ,
parse_json:cas::INT as cas ,
parse_json:content::string as content ,
parse_json:event::string as event
,parse_json:expiration::string as expiration
,parse_json:flags::string as flags
,parse_json:key::string as key
,parse_json:lockTime::string as lockTime
,parse_json:partition::string as partition
,parse_json:revSeqno::string as revSeqno
,parse_json:vBucketUuid::string as vBucketUuid
from STG_YS_APPS v
but it is throwing error like.
SQL compilation error: error line 2 at position 0 invalid identifier > >'PARSE_JSON'
may someone please help me.
Answer with a known schema
Update: Since you provided schema, which shows a VAR column of VARIANT type, here's what you need, couldn't be simpler:
select
var:bucket::string as bucket,
var:bySeqno::string as bySeqno,
var:cas::int as cas
...
from STG_YS_APPS v
Below the answer before the schema was known
I'll assume you have a VARCHAR (or similar) column in your table that is called json, and stores the values you presented. You didn't provide the schema, so please adjust the column name as necessary.
You're not using PARSE_JSON as a function in your SQL. You should write something like
select
parse_json(json):bucket::string as bucket,
parse_json(json):bySeqno::string as bySeqno,
parse_json(json):cas::int as cas
...
from STG_YS_APPS v