Invalid identifier while parsing JSON - SQL

I am compiling a dbt base model and currently get the error below. Line 6 looks the same as the other lines above it, so it might be a small syntax error that I could not spot.
15:40:22 Database Error in model base_datacenter_handling_unit (models/l10_staging_datacenter/base_unit.sql)
15:40:22 000904 (42000): SQL compilation error: error line 6 at position 3
15:40:22 invalid identifier 'VALUE'
15:40:22 compiled SQL at target/run/dbt/models/l10_staging_datacenter/base_unit.sql
This is what my file looks like:
SELECT
JSON_DATA:"key"::text AS KEY
, value:"description"::text AS DESCRIPTION
, value:"globalHandlingUnitId"::text AS GLOBAL_HANDLING_UNIT_ID
, value:"tareWeight"::NUMBER(38,0) AS TARTE_WEIGHT
, value:"tareWeight_unit"::text AS TARTE_WEIGHT_UNIT
, value:"width"::NUMBER(38,0) AS WIDTH
, value:"width_unit"::text AS WIDTH_UNIT
, value:"length"::NUMBER(38,0) AS LENGTH
, value:"validFrom"::TIMESTAMP_NTZ AS VALID_FROM_TS_UTC
, value:"validTo"::TIMESTAMP_NTZ AS VALID_TO_TS_UTC
, value:"lastModified"::TIMESTAMP_NTZ AS LAST_MODIFIED_TS_UTC
, value:"status"::text AS STATUS
, md5(KEY::STRING || MASTERCLIENT_ID) AS HANDLING_UNIT_KEY --different logic than in POSTGRESDWH!
,MASTERCLIENT_ID
,{{ extract_masterclientname_clause('META_FILENAME') }} AS MASTERCLIENT_NAME
,META_ROW_NUM
,META_FILENAME
,META_LOAD_TS_UTC
,META_FILE_TS_UTC
,CASE WHEN {{table_dedup_clause('HANDLING_UNIT_KEY')}}
THEN True
ELSE False
END AS IS_RECORD_CURRENT
FROM {{ source('INGEST_DATACENTER', 'HANDLING_UNIT') }} src
QUALIFY {{table_dedup_clause('HANDLING_UNIT_KEY')}}
It could also be because of the STRING cast in md5(KEY::STRING || MASTERCLIENT_ID), but I have another file based on the same pattern, and that one does not throw an error:
SELECT
JSON_DATA:"issueId"::NUMBER(38,0) AS ISSUE_ID
, value:"slaName"::text AS SLA_NAME
, value:"slaTimeLeft"::NUMBER(38,0) AS SLA_TIME_USED_SECONDS
, md5(ISSUE_ID::STRING || SLA_NAME) AS ISSUE_SLA_ID
,MASTERCLIENT_ID
,{{ extract_masterclientname_clause('META_FILENAME') }} AS MASTERCLIENT_NAME
,META_ROW_NUM
,META_FILENAME
,META_LOAD_TS_UTC
,META_FILE_TS_UTC
,CASE WHEN {{table_dedup_clause('ISSUE_SLA_ID')}}
THEN True
ELSE False
END AS IS_RECORD_CURRENT
FROM {{ source('INGEST_EMS', 'ISSUES') }} src
, lateral flatten ( input => JSON_DATA:slas)
QUALIFY {{table_dedup_clause('ISSUE_SLA_ID')}}
I don't see any significant difference between the two.

value is one of the output columns of a FLATTEN, which you have in your second SQL but not in your first.
This is where giving every table an alias, and using it on EVERY reference, pays off: you would see something like
SELECT t.json_data:"key",
f.value:"json_prop_name"
FROM table AS t;
and immediately ask: where does f come from...
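In other words, the first model never defines value, because there is no FLATTEN in its FROM clause. A minimal sketch of the fix, assuming the handling-unit objects sit in an array at JSON_DATA:units (a hypothetical path, substitute the real one) and aliasing the flatten output as f:
SELECT
src.JSON_DATA:"key"::text AS KEY
, f.value:"description"::text AS DESCRIPTION
-- ...remaining f.value:"..." extractions...
FROM {{ source('INGEST_DATACENTER', 'HANDLING_UNIT') }} src
, lateral flatten ( input => src.JSON_DATA:units ) f
With src and f on every reference, a missing FLATTEN surfaces immediately as an unknown alias rather than a puzzling invalid identifier 'VALUE'.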

The most likely reason is that the column is not actually named "tareWeight_unit". Snowflake creates column names in upper case regardless of how they are written, unless the original CREATE statement puts the column names in double quotes (e.g. "MyColumn"), in which case it creates them with the exact case specified. Use SHOW COLUMNS IN TABLE and check the actual column name.
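For example (a sketch; the schema qualifier is a guess, adjust it to wherever your ingest layer lands):
SHOW COLUMNS IN TABLE INGEST_DATACENTER.HANDLING_UNIT;
The column_name field of the output shows the exact, case-sensitive spelling you need to use.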

Related

JSON stored in SUPER type fails to select camelCase element. Too long to be serialized. How can I select?

Summary:
I am working with a large JSON object that is stored in a Redshift SUPER type column.
Context
This issue is nearly identical to the question posted here for T-SQL. My schema:
chainId BIGINT
properties SUPER
Sample data:
{
"chainId": 5,
"$browser": "Chrome",
"token": "123x5"
}
I have this stored in a column called properties in my table.
Desired behavior
I want to be able to retrieve the value 5 from the chainId key and store it in a BIGINT column.
What I've tried
I have referenced the following aws docs:
https://docs.aws.amazon.com/redshift/latest/dg/JSON_EXTRACT_PATH_TEXT.html
https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html
https://docs.aws.amazon.com/redshift/latest/dg/super-overview.html
I have tried the following which haven't worked for me:
SELECT
properties.chainId::varchar as test1
, properties.chainId as test2
, properties.chainid as test3
, properties."chainId" as test4
, properties."chainid" as test5
, json_extract_path_text(json_serialize(properties), 'chainId') serial_then_extract
, properties[0].chainId as testval1
, properties[0]."chainId" as testval2
, properties[0].chainid as testval3
, properties[0]."chainid" as testval4
, properties[1].chainId as testval5
, properties[1]."chainId" as testval6
FROM clean
Of these attempts, serial_then_extract returned a correct, non-null value, but not all of the values in my properties field are short enough to serialize, so this only works on some of the rows.
All of the others return null.
Referencing the following docs: https://docs.aws.amazon.com/redshift/latest/dg/query-super.html#unnest I have also attempted to iterate over the SUPER type using PartiQL:
SELECT ps.*
, p.chainId
from clean ps, ps.properties p
where 1=1
But this returns no rows.
I also tried the following:
select
properties
, properties.token
, properties."$os"
from base
And this returned rows with values. I know that there is a chainId value, as I've checked the corresponding key, and I am working with sample data.
What am I missing? What else should I be trying?
Does anyone know if this has to do with the way the JSON key is formatted? (camelCase)
You need to enable case-sensitive identifiers. By default, Redshift maps all table and column names to lower case. If you have mixed-case identifiers, like those in your SUPER field, you need to enable case sensitivity with
SET enable_case_sensitive_identifier TO true;
See: https://docs.aws.amazon.com/redshift/latest/dg/r_enable_case_sensitive_identifier.html
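With the setting enabled, a quoted, exact-case reference should resolve; a minimal sketch against the clean table and chainId key from the question:
SET enable_case_sensitive_identifier TO true;

SELECT properties."chainId"::bigint AS chain_id
FROM clean;
Note the double quotes: left unquoted, the name would still be folded to chainid and miss the key.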

Fetching attribute from JSON string with JSON_VAL causes "<attribute> is invalid in the used context" error

A proprietary third-party application stores JSON strings in its database, like this one:
{"state":"complete","timestamp":1614776473000}
I need the timestamp and found out that DB2 offers JSON functions. Since the JSON is stored as a string in the PROF_VALUE column, I guess that converting it with SYSTOOLS.JSON2BSON is required before I can use JSON_VAL to fetch the timestamp:
SELECT SYSTOOLS.JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), "timestamp", "f")
FROM EMPINST.PROFILE_EXTENSIONS ext
WHERE PROF_PROPERTY_ID = 'touchpointState'
This causes an error saying that timestamp is invalid in the used context (SQLCODE=-206, SQLSTATE=42703, DRIVER=4.26.14). The same error is thrown when I remove the JSON2BSON call like this:
SELECT SYSTOOLS.JSON_VAL(PROF_VALUE, "timestamp", "f")
These were also not working, with the same error (different data types):
SELECT SYSTOOLS.JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), "state", "s:1000")
SELECT SYSTOOLS.JSON_VAL(PROF_VALUE, "state", "s:1000")
I don't understand this error. My syntax follows the documented JSON_VAL ( json-value , search-string , result-type ) and is the same as in the examples, where they show how to fetch the name field of an object.
I also played around a bit with JSON_TABLE to use raw input data for testing (instead of the database data), but it seems not suitable for that.
SELECT *
FROM TABLE(SYSTOOLS.JSON_TABLE( SYSTOOLS.JSON2BSON('{"state":"complete","timestamp":1614776473000}'), 'state','s:32')) DATA
This gave me a table with one row: Type = 2 and Value = complete.
I had two problems in my query. First, it seems that double quotes " are for object references. I wasn't aware that there is any difference, because in most databases I have used so far, single quotes ' and double quotes " behave the same.
The second problem is that JSON_VAL needs to be called without the SYSTOOLS prefix, while the prefix is still needed on SYSTOOLS.JSON2BSON(PROF_VALUE).
With those changes, the following query worked:
SELECT JSON_VAL(SYSTOOLS.JSON2BSON(PROF_VALUE), 'timestamp', 'f')
FROM EMPINST.PROFILE_EXTENSIONS ext
WHERE PROF_PROPERTY_ID = 'touchpointState'

Invalid column issue

I created a view as below from my existing tables:
Create View LostCase
As
Select L.LNO,Count(*) "NoOfCasesLost"
From L, LWCS,C
Where L.LNO=LWCS.LNO
And LWCS.CASENO=C.CNO
And C.OUTCOME='LOSE'
Group By L.LNO ;
However, whenever I run the query below, I always get
invalid column name for LostCase.NoOfCasesLost
Select L.LNO,L.LNAME,LostCase.NoOfCasesLost
From L, LostCase
Where L.LNO = LostCase.LNO;
I don't understand why this happens.
If you have specified the column name in quotation marks, you have to refer to it afterwards as:
Select L.LNO,L.LNAME,LostCase."NoOfCasesLost"
From L,LostCase
Where L.LNO=LostCase.LNO;
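Alternatively, if you are free to recreate the view, drop the quotes so the alias folds to upper case and any unquoted reference matches; a sketch:
Create View LostCase
As
Select L.LNO,Count(*) As NoOfCasesLost
From L, LWCS,C
Where L.LNO=LWCS.LNO
And LWCS.CASENO=C.CNO
And C.OUTCOME='LOSE'
Group By L.LNO ;
Your original query then works unchanged.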

Error in select statement, with union all in a subquery

In Oracle 11g, I came across an error for a query and cannot figure out why it is erroring on me. Here is the query:
select
main_data.issue_number,
main_data.transaction_number
from
(
select
p1.payment_date,
p1.media_number,
p1.payment_amount,
p1.issue_number,
p1.advice_na_number,
name.name_address_line_1,
name.name_address_line_2,
name.name_address_line_3,
name.name_address_line_4,
name.name_address_line_5,
name.name_address_line_6,
name.name_address_line_7,
name.name_address_city,
name.state_code,
name.address_country_code,
name.zip_code,
name.tax_id_number,
p1.output_tx_number_prin,
p1.output_tx_number_int,
'' as "transaction_number",
p1header.check_account_number
from
p1
left join name on p1.name_address_number = name.name_address_number
left join p1header on p1.issue_number = p1header.issue_number
UNION ALL
select
check.date_of_payment,
check.media_number,
check.payment_amount,
check.issue_number,
check.payee_na_number,
name.name_address_line_1,
name.name_address_line_2,
name.name_address_line_3,
name.name_address_line_4,
name.name_address_line_5,
name.name_address_line_6,
name.name_address_line_7,
name.name_address_city,
name.state_code,
name.address_country_code,
name.zip_code,
name.tax_id_number,
'' as "output_tx_number_prin",
'' as "output_tx_number_int",
check.transaction_number,
check.dda_number as "check_account_number"
from check
left join name on check.payee_na_number = name.name_address_number
) main_data
Selecting individual fields like the above gives me an "invalid identifier" error. If I do select *, it gives me back the data without any error. What am I doing wrong here? Thank you.
The old quoted identifier problem... see point 9 in the database object naming documentation, and note that Oracle does not recommend using quoted identifiers.
You've put your column alias in lower case inside double quotes. That means any reference to it also has to be quoted and must exactly match the case. So this would work:
select
main_data.issue_number,
main_data."transaction_number"
from
...
But unless you have a burning need to keep that alias as it is - and I doubt you do, since none of the identifiers taken from actual table columns are quoted - it would be simpler to remove the double quotes from the inner selects:
select
main_data.issue_number,
main_data.transaction_number
from
(
select
...
'' as transaction_number,
p1header.check_account_number
...
UNION ALL
select
...
'' as output_tx_number_prin,
'' as output_tx_number_int,
check.transaction_number,
check.dda_number as check_account_number
...
You don't actually need to alias the columns in the second branch of the union; the column identifiers will all be taken from the first branch.

using multiple parameters in append query in Access 2010

I have been trying to get an append query to work, but I keep getting an error stating that 0 rows are being appended whenever I use more than one parameter in the query.
The table in question has one PK, which is a GUID [generating values with newid()], and one required field (Historical), which I am explicitly defining in the query.
INSERT INTO dbo_sales_quotas ( salesrep_id
, [year]
, territory_id
, sales_quota
, profit_quota
, product_super_group_uid
, product_super_group_desc
, class_9
, Historical
, sales_quotas_UID )
SELECT dbo_sales_quotas.salesrep_id
, dbo_sales_quotas.Year
, dbo_sales_quotas.territory_id
, dbo_sales_quotas.sales_quota
, dbo_sales_quotas.profit_quota
, dbo_sales_quotas.product_super_group_uid
, dbo_sales_quotas.product_super_group_desc
, dbo_sales_quotas.class_9
, dbo_sales_quotas.Historical
, dbo_sales_quotas.sales_quotas_UID
FROM dbo_sales_quotas
WHERE (((dbo_sales_quotas.salesrep_id)=[cboSalesRepID])
AND ((dbo_sales_quotas.Year)=[txtYear])
AND ((dbo_sales_quotas.territory_id)=[txtTerritoryID])
AND ((dbo_sales_quotas.sales_quota)=[txtSalesQuota])
AND ((dbo_sales_quotas.profit_quota)=[txtProfitQuota])
AND ((dbo_sales_quotas.product_super_group_uid)=[cboProdSuperGroup])
AND ((dbo_sales_quotas.product_super_group_desc)=[txtProductSuperGroupDesc])
AND ((dbo_sales_quotas.class_9)=[cboClass9])
AND ((dbo_sales_quotas.Historical)='No')
AND ((dbo_sales_quotas.sales_quotas_UID)='newid()'));
Even if I assign specific values, I still get a 0-rows error, except when I reduce the number of parameters to one, in which case it works perfectly regardless of which parameter I keep. I have verified that the parameters have the correct formats.
Can anyone tell me what I'm doing wrong?
Break out the SELECT part of your query and examine it separately. I'll suggest a simplified version which may be easier to study ...
SELECT
dsq.salesrep_id,
dsq.Year,
dsq.territory_id,
dsq.sales_quota,
dsq.profit_quota,
dsq.product_super_group_uid,
dsq.product_super_group_desc,
dsq.class_9,
dsq.Historical,
dsq.sales_quotas_UID
FROM dbo_sales_quotas AS dsq
WHERE
dsq.salesrep_id=[cboSalesRepID]
AND dsq.Year=[txtYear]
AND dsq.territory_id=[txtTerritoryID]
AND dsq.sales_quota=[txtSalesQuota]
AND dsq.profit_quota=[txtProfitQuota]
AND dsq.product_super_group_uid=[cboProdSuperGroup]
AND dsq.product_super_group_desc=[txtProductSuperGroupDesc]
AND dsq.class_9=[cboClass9]
AND dsq.Historical='No'
AND dsq.sales_quotas_UID='newid()';
I wonder about the last two conditions in the WHERE clause. Is the Historical field type bit instead of text? Does the string 'newid()' match sales_quotas_UID in any rows of the table?
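As a quick sanity check, you could run each suspect condition on its own and see whether any rows survive (a sketch):
SELECT Count(*) AS historical_matches
FROM dbo_sales_quotas
WHERE dbo_sales_quotas.Historical='No';

SELECT Count(*) AS uid_matches
FROM dbo_sales_quotas
WHERE dbo_sales_quotas.sales_quotas_UID='newid()';
If either count comes back 0, that condition is the one filtering out every row, and the fix is to compare against the column's real type and values rather than those string literals.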