PostgreSQL update JSONB column with value from another column - sql

I want to migrate data from one column (varchar) to another column (jsonb).
Column | Type | Modifiers
------------+-----------------------------+--------------------------------------------------------
id | integer | not null default nextval('merchants_id_seq'::regclass)
name | character varying | not null
nameb | jsonb | not null default '{}'::jsonb
So that nameb becomes {"en": "$name"}, where $name is the value of the name field.
For example:
SELECT name, nameb
before:
name | nameb
--------------------------------------+------------
hello | {}
world | {}
after:
name | nameb
--------------------------------------+------------
hello | {"en": "hello"}
world | {"en": "world"}
With regular types I can do UPDATE SET whatever = (SELECT ...), but how do I do this with jsonb?
UPDATE merchants SET nameb = (SELECT '{"en": "fillme!"}'::jsonb); works, but how do I set the "fillme!" value from another field?

This can be done with the jsonb_build_object function, which lets you build JSON objects from simple data types.
So to do what you want:
update merchants set nameb = nameb || jsonb_build_object('en', name)
With jsonb_build_object we make {"en": "hello"}, {"en": "world"}, ... dynamically, based on the value of the "name" column.
After that we simply merge the two jsonb values with the || operator.
This will not work if nameb is NULL, because NULL will "eat" everything and the result will be NULL again.
In that case I'd suggest using COALESCE:
update merchants set nameb = COALESCE(nameb, '{}') || jsonb_build_object('en', name)
The other way to achieve the same is to use the jsonb_set function. For this particular case it's overkill, however it may be handy if you need to set some keys deep inside the JSON:
update merchants set nameb = jsonb_set(nameb, '{en}', ('"' || name || '"')::jsonb)
This looks weird because we have to construct a string surrounded by quotes, i.e. '"hello"', to set it as the value for the 'en' key. If you need to set a JSON object rather than a plain string, jsonb_build_object is handier.
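A variation that avoids the manual quoting entirely (a sketch, assuming PostgreSQL 9.5+, where both functions exist): to_jsonb converts a text value into a properly escaped JSON string, so it can feed jsonb_set directly.
-- to_jsonb(name) renders e.g. hello as "hello", no hand-built quoting needed
update merchants set nameb = jsonb_set(COALESCE(nameb, '{}'), '{en}', to_jsonb(name))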

I found solution:
UPDATE merchants AS m1
SET nameb = (
    SELECT row_to_json(t) FROM (
        SELECT name AS en FROM merchants AS m2 WHERE m1.id = m2.id
    ) t
)::jsonb;
Not sure if it's right, but it works

Yes, jsonb_build_object is the best choice.
UPDATE merchants
SET nameb = jsonb_build_object('en', "name",
'cs', '')
WHERE ...
This produces:
name | nameb
--------------------------+------------------------------
hello | {"en": "hello", "cs": ""}
world | {"en": "world", "cs": ""}
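Note that assigning jsonb_build_object(...) directly, as above, replaces the whole nameb value, so any keys already stored there are lost. To keep them, merge with || as in the earlier answer (a sketch):
update merchants set nameb = COALESCE(nameb, '{}') || jsonb_build_object('en', name, 'cs', '')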

Related

Get value from JSON dimensional array in Oracle

I have the JSON below, from which I need to fetch the value of issuedIdentValue where issuedIdentType = PANCARD:
{
"issuedIdent": [
{"issuedIdentType":"DriversLicense","issuedIdentValue":"9797979797979797"},
{"issuedIdentType":"SclSctyNb","issuedIdentValue":"078-01-8877"},
{"issuedIdentType":"PANCARD","issuedIdentValue":"078-01-8877"}
]
}
I cannot hard-code the index value [2] in my query below, as the order of these records can change, so I want to get rid of any hardcoded index.
select json_value(
'{"issuedIdent": [{"issuedIdentType":"DriversLicense","issuedIdentValue":"9797979797979797"},{"issuedIdentType":"SclSctyNb","issuedIdentValue":"078-01-8877"}, {"issuedIdentType":"PANCARDSctyNb","issuedIdentValue":"078-01-8877"}]}',
'$.issuedIdent[2].issuedIdentValue'
) as output
from d1entzendev.ExternalEventLog
where
eventname = 'CustomerDetailsInqSVC'
and applname = 'digitalBANKING'
and requid = '4fe1fa1b-abd4-47cf-834b-858332c31618';
What changes do I need to apply to the json_value function to achieve the expected result?
In Oracle 12c or higher, you can use JSON_TABLE() for this:
select value
from json_table(
    '{"issuedIdent": [{"issuedIdentType":"DriversLicense","issuedIdentValue":"9797979797979797"},{"issuedIdentType":"SclSctyNb","issuedIdentValue":"078-01-8877"}, {"issuedIdentType":"PANCARD","issuedIdentValue":"078-01-8877"}]}',
    '$.issuedIdent[*]' columns
        type varchar(50) path '$.issuedIdentType',
        value varchar(50) path '$.issuedIdentValue'
) t
where type = 'PANCARD'
This returns:
| VALUE |
| :---------- |
| 078-01-8877 |
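On Oracle 12.2 and later you may also be able to stay with json_value and use a JSON path filter expression instead of a hardcoded index (a sketch against the same sample document; adapt the FROM clause to your table):
select json_value(
    '{"issuedIdent": [{"issuedIdentType":"DriversLicense","issuedIdentValue":"9797979797979797"},{"issuedIdentType":"SclSctyNb","issuedIdentValue":"078-01-8877"}, {"issuedIdentType":"PANCARD","issuedIdentValue":"078-01-8877"}]}',
    -- keep only array entries whose type is PANCARD, then take their value
    '$.issuedIdent[*]?(@.issuedIdentType == "PANCARD").issuedIdentValue'
) as output
from dual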

How do I identify problematic documents in S3 when querying data in Athena?

I have a basic Athena query like this:
SELECT *
FROM my.dataset LIMIT 10
When I try to run it I get an error message like this:
Your query has the following error(s):
HIVE_BAD_DATA: Error parsing field value for field 2: For input string: "32700.000000000004"
How do I identify the S3 document that has the invalid field?
My documents are JSON.
My table looks like this:
CREATE EXTERNAL TABLE my.data (
`id` string,
`timestamp` string,
`profile` struct<
`name`: string,
`score`: int>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1',
'ignore.malformed.json' = 'true'
)
LOCATION 's3://my-bucket-of-data'
TBLPROPERTIES ('has_encrypted_data'='false');
Inconsistent schema
An inconsistent schema is when values in some rows are of a different data type. Let's assume that we have two JSON files:
// inside s3://path/to/bad.json
{"name":"1Patrick", "age":35}
{"name":"1Carlos", "age":"eleven"}
{"name":"1Fabiana", "age":22}
// inside s3://path/to/good.json
{"name":"2Patrick", "age":35}
{"name":"2Carlos", "age":11}
{"name":"2Fabiana", "age":22}
Then a simple query SELECT * FROM some_table will fail with
HIVE_BAD_DATA: Error parsing field value 'eleven' for field 1: For input string: "eleven"
However, we can exclude that file in the WHERE clause:
SELECT
"$PATH" AS "source_s3_file",
*
FROM some_table
WHERE "$PATH" != 's3://path/to/bad.json'
Result:
source_s3_file | name | age
---------------------------------------
s3://path/to/good.json | 2Patrick | 35
s3://path/to/good.json | 2Carlos | 11
s3://path/to/good.json | 2Fabiana | 22
Of course, this is the best-case scenario, when we already know which files are bad. However, you can employ this approach to manually infer which files are good. You can also use LIKE or regexp_like to walk through multiple files at a time:
SELECT
COUNT(*)
FROM some_table
WHERE regexp_like("$PATH", 's3://path/to/go[a-z]*.json')
-- If this query doesn't fail, then those files are good.
The obvious drawback of such an approach is the cost and time of executing the queries, especially if it is done file by file.
Malformed records
In the eyes of AWS Athena, good records are those which are formatted as a single JSON per line:
{ "id" : 50, "name":"John" }
{ "id" : 51, "name":"Jane" }
{ "id" : 53, "name":"Jill" }
AWS Athena supports the OpenX JSON SerDe library, which can be set to evaluate malformed records as NULL by specifying
-- When you create table
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( 'ignore.malformed.json' = 'true')
when you create the table. Thus, the following query will reveal files with malformed records:
SELECT
DISTINCT("$PATH")
FROM "some_database"."some_table"
WHERE(
col_1 IS NULL AND
col_2 IS NULL AND
col_3 IS NULL
-- etc
)
Note: you can use just a single col_1 IS NULL if you are 100% sure that the column doesn't contain empty fields other than in corrupted rows.
In general, malformed records are not that big of a deal, provided that 'ignore.malformed.json' = 'true'.
For example, if a file contains:
{"name": "2Patrick","age": 35,"address": "North Street"}
{
"name": "2Carlos",
"age": 11,
"address": "Flowers Street"
}
{"name": "2Fabiana","age": 22,"address": "Main Street"}
the following query will still succeed
SELECT
"$PATH" AS "source_s3_file",
*
FROM some_table
Result:
source_s3_file | name | age | address
-----------------------------|----------|-----|-------------
1 s3://path/to/malformed.json| 2Patrick | 35 | North Street
2 s3://path/to/malformed.json| | |
3 s3://path/to/malformed.json| | |
4 s3://path/to/malformed.json| | |
5 s3://path/to/malformed.json| | |
6 s3://path/to/malformed.json| | |
7 s3://path/to/malformed.json| 2Fabiana | 22 | Main Street
While with 'ignore.malformed.json' = 'false' (which is the default behaviour) exactly the same query will throw an error
HIVE_CURSOR_ERROR: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2 [character 3 line 1]
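Building on the all-columns-NULL trick above, grouping by "$PATH" can rank files by how many malformed rows each one contributes (a sketch, under the same assumption that only corrupted rows come out all-NULL):
SELECT
    "$PATH" AS source_s3_file,
    COUNT(*) AS bad_rows
FROM "some_database"."some_table"
WHERE col_1 IS NULL AND col_2 IS NULL AND col_3 IS NULL
GROUP BY "$PATH"
ORDER BY bad_rows DESC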

How to get JSON value from varchar field

*outdated Oracle version
I have a table for receipt data.
I want to get some data from field EXT_ATTR. such as PAYMENT_RECEIPT_NO
The field "EXT_ATTR" is varchar(4000) stored JSON value
SerialId | EXT_ATTR
1 |
{
"PAYMENT_RECEIPT_NO": "PS00000000000000001",
"IS_CORPOR": "1",
"POSTCODE1": "51000",
"POSTCODE2": "51000",
"BILLADDR1PART1": "BILLADDR1PART1_DATA",
"BILLADDR1PART2": "BILLADDR1PART2_DATA",
"NEED_PRINT_WHT": "1",
"WHT_AMT": "0",
"TRXAMT": "2340600",
"LOCATIONID": "02140",
"PAYMENT_METHOD_NAME": "Cash",
"WITH_TAX": "1"
}
2 |
{
"PAYMENT_RECEIPT_NO": "PS00000000000000055",
"IS_CORPOR": "1",
"POSTCODE1": "51000",
"POSTCODE2": "51000",
"BILLADDR1PART1": "BILLADDR1PART1_DATA",
"BILLADDR1PART2": "BILLADDR1PART2_DATA",
"NEED_PRINT_WHT": "1",
"WHT_AMT": "0",
"TRXAMT": "2340600",
"LOCATIONID": "02140",
"PAYMENT_METHOD_NAME": "Cash",
"WITH_TAX": "1"
}
How can I extract the varchar field to get only the value?
SerialId | PAYMENT_RECEIPT_NO
1 | PS00000000000000001
2 | PS00000000000000055
Thank you very much.
To work with JSON documents you can use PL/JSON.
If you want to parse it without JSON tools, then you can use the substr and instr functions in Oracle.
Depending on what your string looks like, you will have to adjust the string positions.
create table tab (json varchar2(1000));
insert into tab values('{"PAYMENT_RECEIPT_NO": "PS00000000000000001","IS_CORPOR": "1","POSTCODE1": "51000","POSTCODE2": "51000","BILLADDR1PART1": "BILLADDR1PART1_DATA","BILLADDR1PART2": "BILLADDR1PART2_DATA","NEED_PRINT_WHT": "1","WHT_AMT": "0","TRXAMT": "2340600","LOCATIONID": "02140","PAYMENT_METHOD_NAME": "Cash","WITH_TAX": "1"}');
insert into tab values('{"PAYMENT_RECEIPT_NO": "PS00000000000000055","IS_CORPOR": "1","POSTCODE1": "51000","POSTCODE2": "51000","BILLADDR1PART1": "BILLADDR1PART1_DATA","BILLADDR1PART2": "BILLADDR1PART2_DATA","NEED_PRINT_WHT": "1","WHT_AMT": "0","TRXAMT": "2340600","LOCATIONID": "02140","PAYMENT_METHOD_NAME": "Cash","WITH_TAX": "1"}');
select substr(json,instr(json,': ',1,1)+3,instr(json,',',1,1)-instr(json,': ',1,1)-4)
from tab;
| SUBSTR(JSON,INSTR(JSON,':',1,1)+3,INSTR(JSON,',',1,1)-INSTR(JSON,':',1,1)-4) |
| :--------------------------------------------------------------------------- |
| PS00000000000000001 |
| PS00000000000000055 |
db<>fiddle here
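For completeness: on Oracle 12c or later (the question notes an outdated version, so this may not apply) the built-in json_value function reads the key straight out of the varchar column, with no string arithmetic (a sketch against the tab table created above):
select json_value(json, '$.PAYMENT_RECEIPT_NO') as payment_receipt_no
from tab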
JSON functions are available in Oracle Database 12c and later. For earlier releases, the APEX_JSON package (APEX release 5.0+) should be installed. Once installation is complete, the following code can treat the JSON as an XML data type through the APEX_JSON.TO_XMLTYPE() function in order to extract the desired values:
WITH t AS
(
SELECT SerialId, APEX_JSON.TO_XMLTYPE(EXT_ATTR) AS xml_data
FROM receipts -- hypothetical name for the table holding EXT_ATTR
)
SELECT SerialId, Payment_Receipt_No
FROM t
CROSS JOIN
XMLTABLE('/json'
PASSING xml_data
COLUMNS
Payment_Receipt_No VARCHAR2(100) PATH 'PAYMENT_RECEIPT_NO'
)

How to select a row from any hstore values?

I have a table Content in a PostgreSQL (9.5) database, which contains the column title. The title column is an hstore. It's an hstore because the title is translated into different languages. For example:
example=# SELECT * FROM contents;
id | title | content | created_at | updated_at
----+---------------------------------------------+------------------------------------------------+----------------------------+----------------------------
1 | "de"=>"Beispielseite", "en"=>"Example page" | "de"=>"Beispielinhalt", "en"=>"Example conten" | 2016-07-17 09:20:23.159248 | 2016-07-17 09:20:23.159248
(1 row)
My question is: how can I select the content whose title contains Example page?
SELECT * FROM contents WHERE title = 'Example page';
This query unfortunately doesn't work.
example=# SELECT * FROM contents WHERE title = 'Example page';
ERROR: Syntax error near 'p' at position 8
LINE 1: SELECT * FROM contents WHERE title = 'Example page';
The avals() function returns an array of all values in an hstore column. You can then match your value using ANY against that array:
select *
from contents
where 'Example page' = any(avals(title))
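If you know which language key should hold the value, hstore's -> operator fetches it directly, and @> expresses the same thing as a containment test (a sketch, assuming the English title lives under the 'en' key):
-- look up the value stored under the 'en' key
select * from contents where title -> 'en' = 'Example page';
-- or: does the hstore contain this key/value pair?
select * from contents where title @> '"en"=>"Example page"';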
You can also cast the hstore to text and use LIKE in the WHERE clause (note that this matches keys as well as values):
SELECT * FROM contents WHERE title::text LIKE '%Example page%';
Hope it helps you.

PostgreSQL: Sub-select inside insert

I have a table called map_tags:
map_id | map_license | map_desc
And another table (widgets) whose records contains a foreign key reference (1 to 1) to a map_tags record:
widget_id | map_id | widget_name
Given the constraint that all map_licenses are unique (however, they are not set up as keys on map_tags), then if I have a map_license and a widget_name, I'd like to perform an insert on widgets all inside of the same SQL statement:
INSERT INTO
widgets w
(
map_id,
widget_name
)
VALUES (
(
SELECT
mt.map_id
FROM
map_tags mt
WHERE
// This should work and return a single record because map_license is unique
mt.map_license = '12345'
),
'Bupo'
)
I believe I'm on the right track but know right off the bat that this is incorrect SQL for Postgres. Does anybody know the proper way to achieve such a single query?
Use the INSERT INTO ... SELECT variant, putting whatever constants you need right into the SELECT statement.
The PostgreSQL INSERT syntax is:
INSERT INTO table [ ( column [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }
[ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]
Take note of the query option at the end of the second line above.
Here is an example for you.
INSERT INTO
widgets
(
map_id,
widget_name
)
SELECT
mt.map_id,
'Bupo'
FROM
map_tags mt
WHERE
mt.map_license = '12345'
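Since the syntax above includes a RETURNING clause, it's worth noting you can fetch the generated key in the same statement (a sketch, assuming widget_id is filled by a sequence or serial default):
INSERT INTO widgets (map_id, widget_name)
SELECT mt.map_id, 'Bupo'
FROM map_tags mt
WHERE mt.map_license = '12345'
RETURNING widget_id;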
INSERT INTO widgets
(
map_id,
widget_name
)
SELECT
mt.map_id, 'Bupo'
FROM
map_tags mt
WHERE
mt.map_license = '12345'
Quick Answer:
You don't have "a single record", you have a "set with 1 record".
If this were JavaScript: you have an "array with 1 value", not "1 value".
In your example, one record may be returned by the sub-query, but you are still trying to unpack an "array" of records into a place that takes only 1 parameter.
It took me a few hours to wrap my head around the "why not", as I was trying to do something very similar. Here are my notes:
tb_table01: (no records)
+---+---+---+
| a | b | c | << column names
+---+---+---+
tb_table02:
+---+---+---+
| a | b | c | << column names
+---+---+---+
|'d'|'d'|'d'| << record #1
+---+---+---+
|'e'|'e'|'e'| << record #2
+---+---+---+
|'f'|'f'|'f'| << record #3
+---+---+---+
--This statement will fail:
INSERT into tb_table01
( a, b, c )
VALUES
( 'record_1.a', 'record_1.b', 'record_1.c' ),
( 'record_2.a', 'record_2.b', 'record_2.c' ),
-- This sub query has multiple
-- rows returned. And they are NOT
-- automatically unpacked like in
-- javascript were you can send an
-- array to a variadic function.
(
SELECT a,b,c from tb_table02
)
;
Basically, don't think of VALUES as a variadic function that can unpack an array of records. There is no argument unpacking here like you would have in a JavaScript function, such as:
function takeValues( ...values ){
    values.forEach((v)=>{ console.log( v ) });
};
var records = [ [1,2,3],[4,5,6],[7,8,9] ];
takeValues( ...records ); // the spread operator unpacks the outer array
//:RESULT:
//: console.log #1 : [1,2,3]
//: console.log #2 : [4,5,6]
//: console.log #3 : [7,8,9]
Back to your SQL question:
The fact that this functionality doesn't exist does not change just because your sub-selection contains only one result. It is a "set with one record", not "a single record".