How to Split JSON Variant Column in Snowflake - sql

We have a JSON VARIANT column in a table. Column D holds a value like this:
[
"[{\"xyz_id\":0001,\"abc_id\":10032,\"dis_name\":\" AP 20%\",\"dis_type_name\":\"Subtotal Dis\",\"disc_rate\":20.0,\"discount_total\":-1.0000}]"
]
We want to create a new column E that holds the xyz_id value from that column. We need to strip out these values ()

Is that a valid JSON sample as copied from Snowflake? I'm not too sure how to interpret it. If I strip the outer [" and "], the code below can be used to extract the field you're looking for.
select parse_json('[{"xyz_id":0001,"abc_id":10032,"dis_name":" AP 20%","dis_type_name":"Subtotal Dis","disc_rate":20.0,"discount_total":-1.0000}]') COL_D,
COL_D[0]:"xyz_id" COL_E;
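The sample in the question looks like a JSON array whose only element is a string that itself contains JSON, so it has to be parsed twice. The Python sketch below only illustrates that two-step idea outside Snowflake; the shortened sample and the variable names are assumptions, and xyz_id is written as 1 because strict JSON parsers reject leading zeros such as 0001.

```python
import json

# Assumed, shortened copy of the value in column D: an array holding one JSON-encoded string.
raw = r'["[{\"xyz_id\":1,\"abc_id\":10032,\"dis_name\":\" AP 20%\",\"disc_rate\":20.0}]"]'

outer = json.loads(raw)        # first pass: a list containing a single string
inner = json.loads(outer[0])   # second pass: the string decodes to a list of objects
xyz_id = inner[0]["xyz_id"]    # the value wanted for column E

print(xyz_id)
```

In Snowflake the same shape would presumably mean applying parse_json to the first array element before indexing into it.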

Related

How to extract string between characters in Hive

I have a Hive table with a column which includes a string with multiple topic names. I am looking to split out the first topic name (and if possible the second and third). The string can contain up to 8 topic names.
The format of the string is:
["T.Topic1", "T.Topic2", "T.Topic3", "S.Topic4", "S.Topic5"]
I have tried the following, but wanted to know whether there is a better way that avoids having to remove the leading and trailing " characters in a subsequent step, and whether it is possible to extract more than the first topic.
SELECT SUBSTR(split(l.Intent, '[\\,]')[0], 2) AS TOPIC_1
FROM Table l
Results:
"T.Topic1"
Thank you
You are really close to the solution.
I suggest tackling it in 2 stages:
Remove the array brackets.
Split the string.
regexp_extract(l.Intent,'^\\["(.*)"\\]' ) gets the text inside the array.
split ( text , '", "' ) splits that text into the array you want.
Putting it together:
with l as (select '["T.Topic1", "T.Topic2", "T.Topic3", "S.Topic4", "S.Topic5"]' as Intent)
select
  split(
    regexp_extract(topics.Intent, '^\\["(.*)"\\]'),
    '", "'
  ) as array_of_topics
from l as topics;
You can now access the individual topics as array_of_topics[0], array_of_topics[1], array_of_topics[2], and so on.
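To check the two stages outside Hive, here is a small Python sketch that mirrors them: a regex strips the outer [" and "], then a split on '", "' produces the list. The variable names are made up for illustration.

```python
import re

intent = '["T.Topic1", "T.Topic2", "T.Topic3", "S.Topic4", "S.Topic5"]'

# Stage 1: capture everything between the opening [" and the closing "]
match = re.match(r'^\["(.*)"\]', intent)
inner = match.group(1) if match else ""

# Stage 2: split on the '", "' separator that sits between topics
topics = inner.split('", "')

print(topics[0], topics[1], topics[2])
```

Because the separator includes the quotes, no quote characters are left on the individual topics.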

How to use a db column name in regexp_replace() - presto

I am new to Presto and want to use regexp_replace on a particular db column instead of a string literal.
E.g.: replace all entries in column "Description" of "table1" that start with a digit followed by a space.
Can someone please help with an example?
I tried this:
select(regexp_replace(col("Description"), '\d+\s')) from table1
but I am getting the error: "Function not found", function "Col" is not registered.
Thanks in advance!
It's enough to pass the column name directly, and a replacement string should be added as the third argument. So:
col("Description") -> Description
add a third argument -> , 'replace_string'
The final query should look something like this:
select regexp_replace(Description, '\d+\s', 'replace_string') from table1 b
I think this does what you want:
SELECT
  REGEXP_REPLACE("Description", '\d+\s')
FROM "table1"
Quoting the names is optional here, but in Presto identifiers are quoted with double quotes (backticks are MySQL/Hive syntax and are rejected), and strings use apostrophes.
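One detail neither answer spells out: the pattern '\d+\s' is not anchored, so it matches digits followed by whitespace anywhere in the value, not only at the start. The Python sketch below shows the difference using re.sub; the sample description is invented, and the assumption is that only the leading number should go.

```python
import re

description = "12 boxes of 24 pens"  # hypothetical column value

# Unanchored: every "digits + whitespace" run is removed
unanchored = re.sub(r'\d+\s', '', description)

# Anchored with ^: only a run at the very start of the value is removed
anchored = re.sub(r'^\d+\s', '', description)

print(unanchored)
print(anchored)
```

If only leading digits should be replaced, the same ^ anchor can be added to the pattern passed to regexp_replace.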

SQL: Extract from messy JSON nested field with backslashes

I have a table that has some rows with normal JSON and some with escaped values in the JSON field (backslashes).
id | obj
1 | {"is_from_shopping_bag":true,"products":[{"price":{"amount":"18.00","currency":"USD","offset":100,"amount_with_offset":"1800"},"product_id":"1234","quantity":1}],"source":"cart"}
2 | {"is_from_shopping_bag":"","products":"[{\ "product_id\ ":\ "2345\ ",\ "price\ ":{\ "currency\ ":\ "USD\ ",\ "amount\ ":\ "140.00\ ",\ "offset\ ":100},\ "quantity\ ":1}]"}
(Note: I needed to include a space after the backslashes in the above table so that they would show up in the github generated markdown table -- my actual table does not include those spaces between the backslash and the quote character)
I am doing a sql query in Hive to get the 'currency' field.
Currently I can run
SELECT
id,
JSON_EXTRACT(obj, '$.products[0].price.currency')
FROM my_table
Which will give me the correct output for the first row, but gives me a NULL in the second row:
id | obj
1 | "USD"
2 | NULL
What is the best way to get currency field from the second row? Is there a way to clean up the field and remove the backslashes before trying to JSON_EXTRACT the relevant data?
I could use REPLACE to swap the '\ ' for '', but is that the most efficient method?
Replace \" with " using regexp_replace like this:
regexp_replace(obj,'\\\\"','"')
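Note that in the second row the products array is also wrapped in quotes, so removing the backslashes alone may still not leave a valid document ("products":"[{"product_id":...). As a sketch of an alternative, the Python below parses the outer object first and decodes products a second time only when it turns out to be a string. The function name is hypothetical and the rows are copied from the question without the extra spaces.

```python
import json

def extract_currency(obj_text):
    """Return products[0].price.currency, decoding products twice if it is a string."""
    obj = json.loads(obj_text)
    products = obj["products"]
    if isinstance(products, str):
        # Row 2 style: the array was stored as a JSON-encoded string
        products = json.loads(products)
    return products[0]["price"]["currency"]

row1 = '{"is_from_shopping_bag":true,"products":[{"price":{"amount":"18.00","currency":"USD","offset":100,"amount_with_offset":"1800"},"product_id":"1234","quantity":1}],"source":"cart"}'
row2 = r'{"is_from_shopping_bag":"","products":"[{\"product_id\":\"2345\",\"price\":{\"currency\":\"USD\",\"amount\":\"140.00\",\"offset\":100},\"quantity\":1}]"}'

print(extract_currency(row1), extract_currency(row2))
```

In SQL the equivalent idea would be to extract products as a string first and then run a second JSON extraction on that result.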

How to Split String {} with Multiple Values

Let's say that I have a type STRING column 'debugdata'. An example value for a given user looks like this:
{"TITLE_DESCRIPTION":"approve","CATEGORY":"approve"}
However, let's say there can be multiple values for the TITLE_DESCRIPTION
{"TITLE_DESCRIPTION":"No, name does not match,No, summary is not clear","CATEGORY":"Yes"}
How can I split out the "No, name does not match" and "No, summary is not clear" into two columns?
I've tried using JSON_EXTRACT and JSON_ARRAY_GET and other JSON syntax, but I can't quite break this up into two columns. Thanks!
Let's say x is the map from your example:
let x = {"TITLE_DESCRIPTION":"No, name does not match,No, summary is not clear","CATEGORY":"Yes"}
Then all you need to do is this:
let b = (x.TITLE_DESCRIPTION).split(',')
Edit: in your example the sentences are separated by ',' but also contain ',' themselves, so either escape the ',' inside the sentences or use a different separator character and pass that to split instead of ','.
What about using json_extract first and then the string function split?
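For the exact sample in the question there is one more option, sketched here in Python: the comma that separates the two answers is not followed by a space, while the commas inside each answer are. That observation is an assumption drawn from this single example and may not hold for all of the data.

```python
import json
import re

debugdata = '{"TITLE_DESCRIPTION":"No, name does not match,No, summary is not clear","CATEGORY":"Yes"}'

title = json.loads(debugdata)["TITLE_DESCRIPTION"]

# Split only on commas that are NOT followed by whitespace
parts = re.split(r',(?!\s)', title)

print(parts)
```

If the engine's regex flavor supports lookahead, the same pattern could be passed to its split function after json_extract.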

Find substring in string

Is it possible to check whether a substring stored in a SQL Server column is contained in a user-provided string?
Example :
SELECT * FROM Table WHERE 'random words to check, which are in a string' CONTAINS Column
From my understanding, CONTAINS can't do this kind of search.
EDIT :
I have a fully indexed text column and would like to check (by the fastest method) whether a string I provide contains words that are present in that column.
You can use LIKE:
SELECT * FROM YourTable t
WHERE 'random words ....' LIKE '%' + t.column + '%'
Or
SELECT * FROM YourTable t
WHERE t.column LIKE '%random words ....%'
It depends on what you mean: the first query selects the records whose column value is part of the provided string. The second one is the opposite.
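The first form (provided string LIKE '%' + column + '%') can be tried out quickly with an in-memory SQLite database from Python. This is only a sketch: the table and column names are invented, and SQLite concatenates with || where SQL Server uses +.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keywords (word TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO keywords (word) VALUES (?)",
    [("random",), ("string",), ("banana",)],
)

user_text = "random words to check, which are in a string"

# Keep the rows whose column value occurs somewhere inside the provided string
rows = conn.execute(
    "SELECT word FROM keywords WHERE ? LIKE '%' || word || '%' ORDER BY word",
    (user_text,),
).fetchall()
matches = [r[0] for r in rows]

print(matches)
conn.close()
```

Column values that themselves contain % or _ would be treated as wildcards in this form, so they may need escaping.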
Just use the LIKE syntax together with % around the string you are looking for:
SELECT
*
FROM
table
WHERE
Column LIKE '%some random string%'
This will return all rows in the table table in which the column Column contains the text "some random string".
1) If you want rows where the column starts with the string, put the % operator at the end of the pattern in your WHERE clause:
WHERE
Column LIKE 'some random string%'
2) If you want rows where the column contains the string anywhere, use:
WHERE
Column LIKE '%some random string%'
3) If you want rows where the column ends with the string, use:
WHERE
Column LIKE '%some random string'