SQL: Remove \n and parse JSON in one command

The data is formatted like so:
Query:
select X from DB
Output:
{\n "_id": "5a7e4b7cf36d3920dd24bc0e",\n "price": 0,\n "name": "XXX"\n}
What I'm trying to do is both remove the \n characters and parse the response itself. I'd like to grab just the _id field.
My current query is not quite right:
Step 1: Remove the \n characters:
SELECT REPLACE(REPLACE(X, CHAR(13), ''), CHAR(10), '') from DB
Output:
{"_id": "5a7e4b7cf36d3920dd24bc0e", "price": 0,"name": "XXX"}
Question: How can I tweak this query to parse the JSON and return the _id field all at once? I've tried this with no luck:
SELECT PARSE_JSON(REPLACE(REPLACE(X, CHAR(13), ''), CHAR(10), '')) from DB
^ This query just outputs the same as the first query.

Have you tried
SELECT X:_id FROM DB
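Side note: the \n characters are legal JSON whitespace, so a JSON parser accepts them without any REPLACE step. A minimal Python sketch of the idea (standing in for Snowflake's PARSE_JSON, which I'd expect to behave the same way on this input):

```python
import json

# The raw column value, with literal newline characters inside the JSON text.
raw = '{\n "_id": "5a7e4b7cf36d3920dd24bc0e",\n "price": 0,\n "name": "XXX"\n}'

# Newlines are ordinary JSON whitespace, so no REPLACE step is needed:
# any JSON parser accepts the string as-is.
doc = json.loads(raw)
print(doc["_id"])  # 5a7e4b7cf36d3920dd24bc0e
```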

Related

How to remove all \ from nested json in SQL Redshift?

I've got some problems extracting values from nested JSON in a column.
I've got a column of data with values that look almost like nested JSON, but some of the JSONs have \ between values and I need to clean them.
The JSON looks like this:
{"mopub_json":
"{\"currency\":\"USD\",
\"country\":\"US\",
\"publisher_revenue\":0.01824}
"}
I need to get currency and publisher_revenue as separate columns, and I tried this:
SET json_serialization_enable TO true;
SET json_serialization_parse_nested_strings TO true;
SELECT
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'publisher_revenue') as revenue_mopub,
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'currency') as currency_mopub
FROM(
SELECT replace(column_name, "\t", '')
FROM table_name)
I receive the following error:
[Amazon](500310) Invalid operation: column "\t" does not exist in events
When I try this:
SET json_serialization_parse_nested_strings TO true;
SELECT
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'publisher_revenue') as revenue_mopub,
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'currency') as currency_mopub
FROM(
SELECT replace(column_name, chr(92), '')
FROM table_name)
I receive
Invalid operation: JSON parsing error
When I try to extract the values without any replacing, I get empty columns.
Thank you for your help!
So your JSON isn't valid. JSON doesn't allow multiline text strings, and I expect that's the issue. Based on your query, I think you don't want a single key with a string value but the whole nested structure. The reason the quotes are backslashed is that they are inside a string. The JSON should look like:
{
"mopub_json": {
"currency": "USD",
"country": "US",
"publisher_revenue": 0.01824
}
}
Then the SQL you have should work.
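For what it's worth, if the data cannot be fixed upstream and the value really is double-encoded (a JSON string whose value is itself JSON text), parsing twice is the conceptual fix. A minimal Python sketch of that idea, using a single-line version of the data (not Redshift SQL, just an illustration of the double decode):

```python
import json

# A single-line version of the source data: the value of "mopub_json"
# is itself a JSON document encoded as a string (hence the \" escapes).
raw = '{"mopub_json": "{\\"currency\\":\\"USD\\",\\"country\\":\\"US\\",\\"publisher_revenue\\":0.01824}"}'

outer = json.loads(raw)                  # first parse: inner JSON comes out as a plain string
inner = json.loads(outer["mopub_json"])  # second parse: the nested object

print(inner["currency"], inner["publisher_revenue"])
```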

How can I convert a string to JSON in Snowflake?

I have this string {id: evt_1jopsdgqxhp78yqp7pujesee, created: 2021-08-14t16:38:17z} and would like to convert it to JSON. I tried parse_json but got an error; to_variant just converted it to the string "{id: evt_1jopsdgqxhp78yqp7pujesee, created: 2021-08-14t16:38:17z}".
To Gokhan & Simon's point, the original data isn't valid JSON.
If you're 100% (1000%) certain it'll "ALWAYS" come in that format, you can treat it as a string-parsing exercise and do something like this, but once someone changes the format a bit it'll break.
create temporary table abc (str varchar);
insert into abc values ('{id: evt_1jopsdgqxhp78yqp7pujesee, created: 2021-08-14t16:38:17z}');
select to_json(parse_json(json_str)) json_json
FROM (
select split_part(ltrim(str, '{'), ',', 1) as part_a,
split_part(rtrim(str, '}'), ',', 2) as part_b,
split_part(trim(part_a), ': ', 1) part_a_name,
split_part(trim(part_a), ': ', 2) part_a_val,
split_part(trim(part_b), ': ', 1) part_b_name,
split_part(trim(part_b), ': ', 2) part_b_val,
'{"'||part_a_name||'":"'||part_a_val||'", "'||part_b_name||'":"'||part_b_val||'"}' as json_str
FROM abc);
which returns valid JSON:
{"created":"2021-08-14t16:38:17z","id":"evt_1jopsdgqxhp78yqp7pujesee"}
Overall this is very fragile, but if you must do it, feel free to.
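The same string-parsing exercise sketched in Python, with the same fragility caveats (it assumes exactly two comma-separated pairs, keys without colons, and values without commas):

```python
import json

raw = "{id: evt_1jopsdgqxhp78yqp7pujesee, created: 2021-08-14t16:38:17z}"

# Strip the braces, split on commas, split each pair on the FIRST colon
# (values like the timestamp contain colons of their own), then re-emit
# valid JSON. Just as fragile as the SQL version above.
pairs = (p.split(":", 1) for p in raw.strip("{}").split(","))
doc = {k.strip(): v.strip() for k, v in pairs}
print(json.dumps(doc))  # {"id": "evt_1jopsdgqxhp78yqp7pujesee", "created": "2021-08-14t16:38:17z"}
```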
Your JSON is not valid, as you can check with any online validator:
https://jsonlint.com/
This is a valid JSON version of your data:
{
"id": "evt_1jopsdgqxhp78yqp7pujesee",
"created": "2021-08-14t16:38:17z"
}
And you can parse it successfully using parse_json:
select parse_json('{ "id": "evt_1jopsdgqxhp78yqp7pujesee", "created": "2021-08-14t16:38:17z"}');

Using JSON_VALUE() when value contains unescaped double quotes

I have a table in the database where in one field (name of the field - JSONDetail) JSON is stored. Recently we encountered a problem where in this field in one of the values there are unescaped double quotes. It's due to migration from another system which allowed double quotes to be stored in the database without backslash before them.
Example (see field "comment"):
{
"noteId": "a34f17c4-f4fd-45ea-b4da-732ef8126a6b",
"memberName": "Test LINKOUS",
"tenantId": "548bead1-bdab-e811-bce7-0003ff21d46b",
"noteType": "General Note",
"memberId": "84cf0adb-850d-e711-80c8-000d3a103f46",
"createdOn": "2020-09-13T17:47:33.2864868Z",
"comment": "test "word" test",
"contacts": [
{
"otherContactType": "",
"communicationType": ""
}
]
}
We need to identify such cases in the database. I tried:
select JSON_VALUE (JSONDetail, '$.comment') as Comment
But instead of test "word" test, it returned
How can I return what is actually stored in the "comment" key?
SQL Server does not have a "fix_json" function.
To find the junk records:
select *
from table
where ISJSON(json_col) = 0
Fix the found records via a back-end language (PHP, C#, etc.).
To prevent such behavior in the future, add a constraint:
ALTER TABLE table
ADD CONSTRAINT [record should be formatted as JSON]
CHECK (ISJSON(json_col)=1)
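On the back-end side, the ISJSON check boils down to "does it parse". A minimal Python sketch of that junk-record test (the is_json helper is hypothetical, not a SQL Server function):

```python
import json

def is_json(text: str) -> bool:
    """Back-end equivalent of SQL Server's ISJSON: True if parseable, else False."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

good = '{"comment": "test \\"word\\" test"}'  # properly escaped inner quotes
bad  = '{"comment": "test "word" test"}'      # unescaped inner quotes, as in the question
print(is_json(good), is_json(bad))  # True False
```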
If the comment key is always followed by the contacts key throughout the JSONDetail column, then you can use the following code block, which combines the SUBSTRING(), PATINDEX(), TRIM() and LEN() functions to extract the whole value of the comment key and compare it with the value extracted by JSON_VALUE(JSONDetail, '$.comment'):
WITH t(json_extracted,str) AS
(
SELECT JSON_VALUE (JSONDetail, '$.comment'),
SUBSTRING(
JSONDetail,
PATINDEX('%"comment"%', JSONDetail),
PATINDEX('%"contacts"%', JSONDetail)-PATINDEX('%"comment"%', JSONDetail)
)
FROM tab
), t2(json_extracted,str) AS
(
SELECT json_extracted,
TRIM(
SUBSTRING( str, PATINDEX('%:%', str) + 1,
PATINDEX('%,%', str) - PATINDEX('%:%', str) - 1 ) )
FROM t
)
SELECT SUBSTRING(str,2,LEN(str)-2) AS extracted_comment,
CASE WHEN json_extracted = SUBSTRING(str,2,LEN(str)-2)
THEN
'No'
ELSE
'Yes'
END AS "is_it_corrupted"
FROM t2
Demo
[EDIT] It wasn't practical to infer the field's location in the JSON string based on length. Using CHARINDEX to search for the field names, this code finds and fixes the 'comment' values in the JSON.
Data
drop table if exists #json_to_fix;
go
create table #json_to_fix(
json_col nvarchar(max));
declare #json nvarchar(max)=N'
{
"noteId": "a34f17c4-f4fd-45ea-b4da-732ef8126a6b",
"memberName": "Test LINKOUS",
"tenantId": "548bead1-bdab-e811-bce7-0003ff21d46b",
"noteType": "General Note",
"memberId": "84cf0adb-850d-e711-80c8-000d3a103f46",
"createdOn": "2020-09-13T17:47:33.2864868Z",
"comment": "test "word" test",
"contacts": [
{
"otherContactType": "",
"communicationType": ""
}
]
}';
insert #json_to_fix(json_col) values (#json);
Query
select s.not_escaped, fix.string_to_fix,
replace(fix.string_to_fix, '"', '') fixed
from #json_to_fix j
cross apply
(select charindex('"comment":', j.json_col, 1) strt_ndx) c_start
cross apply
(select charindex('"contacts"', j.json_col, c_start.strt_ndx) end_ndx) c_end
cross apply
(select substring(json_col, c_start.strt_ndx, c_end.end_ndx-c_start.strt_ndx-11) not_escaped) s
cross apply
(select substring(s.not_escaped, 13, len(s.not_escaped)-13) string_to_fix) fix
Output
not_escaped: "comment": "test "word" test"
string_to_fix: test "word" test
fixed: test word test

how to read key/value from a column which values are JSON type in postgreSQL

I'm trying to read a column whose type is json; the values in the column look like this:
column1
---------------------------------------------
"[{'name': 'Kate', 'position': 'painter'}]"
I'm using this query, but all I get is null. What can I do to get the values for each key?
SELECT
column1 ->> 'name' AS name
FROM
table1;
You can use jsonb_pretty, which returns the given JSON as indented text:
select jsonb_pretty('[{"name": "Kate", "position": "painter"}]');
Output:
jsonb_pretty
-------------------------------
[ +
{ +
"name": "Kate", +
"position": "painter"+
} +
]
So in your case you would use:
SELECT jsonb_pretty(column1) AS name FROM table1;
Use the json_array_elements function:
SELECT json_array_elements(t) -> 'name'
FROM table1;
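To see why ->> 'name' yields null here: the value is a JSON array, so there is no top-level 'name' key; you have to unnest the array first, which is what json_array_elements does. A minimal Python sketch of the same idea, using a valid double-quoted version of the data (the single-quoted form shown in the question is not valid JSON):

```python
import json

# Valid-JSON version of the column value (note the double quotes).
column1 = '[{"name": "Kate", "position": "painter"}]'

# The top level is an ARRAY, so there is no 'name' key to look up directly.
# Unnest the array first, then take the key from each element.
rows = [element["name"] for element in json.loads(column1)]
print(rows)  # ['Kate']
```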

Issues with JSON_EXTRACT in Presto for keys containing ' ' character

I'm using Presto (0.163) to query data and am trying to extract fields from a JSON.
I have a JSON like the one given below, stored in the column 'style_attributes':
"attributes": {
"Brand Fit Name": "Regular Fit",
"Fabric": "Cotton",
"Fit": "Regular",
"Neck or Collar": "Round Neck",
"Occasion": "Casual",
"Pattern": "Striped",
"Sleeve Length": "Short Sleeves",
"Tshirt Type": "T-shirt"
}
I'm unable to extract the 'Sleeve Length' field.
Below is the query I'm using:
Select JSON_EXTRACT(style_attributes,'$.attributes.Sleeve Length') as length from table;
The query fails with the following error: Invalid JSON path: '$.attributes.Sleeve Length'
For fields without a ' ' (space), the query runs fine.
I tried to find the resolution in the Presto documentation, but with no success.
presto:default> select json_extract_scalar('{"attributes":{"Sleeve Length": "Short Sleeves"}}','$.attributes["Sleeve Length"]');
_col0
---------------
Short Sleeves
or
presto:default> select json_extract_scalar('{"attributes":{"Sleeve Length": "Short Sleeves"}}','$["attributes"]["Sleeve Length"]');
_col0
---------------
Short Sleeves
JSON Function Changes
The json_extract and json_extract_scalar functions now
support the square bracket syntax:
SELECT json_extract(json, '$.store[book]');
SELECT json_extract(json,'$.store["book name"]');
As part of this change, the set of characters
allowed in a non-bracketed path segment has been restricted to
alphanumeric, underscores and colons. Additionally, colons cannot be
used in an un-quoted bracketed path segment. Use the new bracket syntax
with quotes to match elements that contain special characters.
https://github.com/prestodb/presto/blob/c73359fe2173e01140b7d5f102b286e81c1ae4a8/presto-docs/src/main/sphinx/release/release-0.75.rst
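The space in the key is only a problem for the dotted-path syntax, not for JSON itself. A minimal Python sketch making that point (plain subscripting after parsing; the bracketed JSONPath above is the Presto equivalent):

```python
import json

style_attributes = '{"attributes": {"Sleeve Length": "Short Sleeves"}}'

# Keys containing spaces are perfectly valid JSON; the limitation is only in
# the dotted-path syntax. After parsing, plain subscripting works.
doc = json.loads(style_attributes)
print(doc["attributes"]["Sleeve Length"])  # Short Sleeves
```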
SELECT
tags -- the column with JSON string data
,json_extract(tags , '$.Brand') AS Brand
,json_extract(tags , '$.Portfolio') AS Portfolio
,cost
FROM
TableName
Sample data for tags - {"Name": "pxyblob", "Owner": "", "Env": "prod", "Service": "", "Product": "", "Portfolio": "OPSXYZ", "Brand": "Limo", "AssetProtectionLevel": "", "ComponentInfo": ""}
Here is the correct answer.
Let's say:
JSON: {"Travel Date":"2017-9-22", "City": "Seattle"}
Column Name: ITINERARY
And I want to extract 'Travel Date' from this JSON:
Query: SELECT JSON_EXTRACT(ITINERARY, "$.\"Travel Date\"") from Table
Note: Just add \" at the start and end of the key name.
Hope this works for your need. :)