How to select a row from any hstore values? - sql

I've a table Content in a PostgreSQL (9.5) database, which contains the column title. The title column is a hstore. It's a hstore, because the title is translated to different languages. For example:
example=# SELECT * FROM contents;
id | title | content | created_at | updated_at
----+---------------------------------------------+------------------------------------------------+----------------------------+----------------------------
1 | "de"=>"Beispielseite", "en"=>"Example page" | "de"=>"Beispielinhalt", "en"=>"Example conten" | 2016-07-17 09:20:23.159248 | 2016-07-17 09:20:23.159248
(1 row)
My question is, how can I select the content which title contains Example page?
SELECT * FROM contents WHERE title = 'Example page';
This query unfortunately doesn't work.
example=# SELECT * FROM contents WHERE title = 'Example page';
ERROR: Syntax error near 'p' at position 8
LINE 1: SELECT * FROM contents WHERE title = 'Example page';

The avals() function returns an array of all values in a hstore column. You can then match your value using any against that array:
select *
from contents
where 'Example page' = any(avals(title))

You should use like in where clause
SELECT * FROM contents WHERE title like '%Example page%';
Hope it helps you.

Related

How to extract text from jsonb array in PostgreSQL

I want to be able to extract text from jsonb array in a new column as text.
This is my SQL table.
Id.
ErrorCode.
101
["exit code: 1"]
102
["exit code: 3"]
103
["OOMKILLED"]
This is my column definition '[]'::jsonb
I needed help understanding which select command I could use in that case. I used the query above but with no success.
Select reasons -> ' ' as TTT from my_table
I want to get these results after a select command., so I can do SQL filters like
where error = 'exit code: 1' or where error = 'OOMKILLED'
Id.
Error
101
exit code: 1
102
exit code: 3
103
OOMKILLED
Use ->> to extract the first array element as text:
Select id, reasons ->> 0 as reason
from my_table
where reasons ->> 0 in ('exit code: 1','OOMKILLED');
If you don't want to repeat the expression use a derived table:
select *
from (
select id, reasons ->> 0 as reason
from my_table
)
where reason in ('exit code: 1','OOMKILLED');
try this :
SELECT Id, jsonb_array_elements_text(ErrorCode :: jsonb) AS Error
FROM my_table
For more info about json functions see the manual
When you need check is JSONB array contains some value, just use ? operator:
select * from t where err ? 'OOMKILLED';
https://sqlize.online/s/eW

How would I remove specific words or text from KQL query?

I have the following query which provides me with all the data I need exported but I would like text '' removed from my final query. How would I achieve this?
| where type == "microsoft.security/assessments"
| project id = tostring(id),
Vulnerabilities = properties.metadata.description,
Severity = properties.metadata.severity,
Remediations = properties.metadata.remediationDescription
| parse kind=regex id with '/virtualMachines/' Name '/providers/'
| where isnotempty(Name)
| project Name, Severity, Vulnerabilities, Remediations ```
You could use replace_string() (https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/replace-string-function) to replace any substring with an empty string

Get value from json dimensional array in oracle

I have below JSON from which i need to fetch the value of issuedIdentValue where issuedIdentType = PANCARD
{
"issuedIdent": [
{"issuedIdentType":"DriversLicense","issuedIdentValue":"9797979797979797"},
{"issuedIdentType":"SclSctyNb","issuedIdentValue":"078-01-8877"},
{"issuedIdentType":"PANCARD","issuedIdentValue":"078-01-8877"}
]
}
I can not hard-code the index value [2] in my below query as the order of these records can be changed. So want to get rid off any hardcoded index.
select json_value(
'{"issuedIdent": [{"issuedIdentType":"DriversLicense","issuedIdentValue":"9797979797979797"},{"issuedIdentType":"SclSctyNb","issuedIdentValue":"078-01-8877"}, {"issuedIdentType":"PANCARDSctyNb","issuedIdentValue":"078-01-8877"}]}',
'$.issuedIdent[2].issuedIdentValue'
) as output
from d1entzendev.ExternalEventLog
where
eventname = 'CustomerDetailsInqSVC'
and applname = 'digitalBANKING'
and requid = '4fe1fa1b-abd4-47cf-834b-858332c31618';
What changes will need to apply in json_value function to achieve the expected result
In Oracle 12c or higher, you can use JSON_TABLE() for this:
select value
from json_table(
'{"issuedIdent": [{"issuedIdentType":"DriversLicense","issuedIdentValue":"9797979797979797"},{"issuedIdentType":"SclSctyNb","issuedIdentValue":"078-01-8877"}, {"issuedIdentType":"PANCARD","issuedIdentValue":"078-01-8877"}]}',
'$.issuedIdent[*]' columns
type varchar(50) path '$.issuedIdentType',
value varchar(50) path '$.issuedIdentValue'
) t
where type = 'PANCARD'
This returns:
| VALUE |
| :---------- |
| 078-01-8877 |

How do I identify problematic documents in S3 when querying data in Athena?

I have a basic Athena query like this:
SELECT *
FROM my.dataset LIMIT 10
When I try to run it I get an error message like this:
Your query has the following error(s):
HIVE_BAD_DATA: Error parsing field value for field 2: For input string: "32700.000000000004"
How do I identify the S3 document that has the invalid field?
My documents are JSON.
My table looks like this:
CREATE EXTERNAL TABLE my.data (
`id` string,
`timestamp` string,
`profile` struct<
`name`: string,
`score`: int>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1',
'ignore.malformed.json' = 'true'
)
LOCATION 's3://my-bucket-of-data'
TBLPROPERTIES ('has_encrypted_data'='false');
Inconsistent schema
Inconsistent schema is when values in some rows are of different data type. Let's assume that we have two json files
// inside s3://path/to/bad.json
{"name":"1Patrick", "age":35}
{"name":"1Carlos", "age":"eleven"}
{"name":"1Fabiana", "age":22}
// inside s3://path/to/good.json
{"name":"2Patrick", "age":35}
{"name":"2Carlos", "age":11}
{"name":"2Fabiana", "age":22}
Then a simple query SELECT * FROM some_table will fail with
HIVE_BAD_DATA: Error parsing field value 'eleven' for field 1: For input string: "eleven"
However, we can exclude that file within WHERE clause
SELECT
"$PATH" AS "source_s3_file",
*
FROM some_table
WHERE "$PATH" != 's3://path/to/bad.json'
Result:
source_s3_file | name | age
---------------------------------------
s3://path/to/good.json | 1Patrick | 35
s3://path/to/good.json | 1Carlos | 11
s3://path/to/good.json | 1Fabiana | 22
Of course, this is the best case scenario when we know which files are bad. However, you can employ this approach to somewhat manually infer which files are good. You can also use LIKE or regexp_like to walk through multiple files at a time.
SELECT
COUNT(*)
FROM some_table
WHERE regexp_like("$PATH", 's3://path/to/go[a-z]*.json')
-- If this query doesn't fail, that those files are good.
The obvious drawback of such approach is cost to execute query and time spent, especially if it is done file by file.
Malformed records
In the eyes of AWS Athena, good records are those which are formatted as a single JSON per line:
{ "id" : 50, "name":"John" }
{ "id" : 51, "name":"Jane" }
{ "id" : 53, "name":"Jill" }
AWS Athena supports OpenX JSON SerDe library which can be set to evaluate malformed records as NULL by specifying
-- When you create table
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( 'ignore.malformed.json' = 'true')
when you create table. Thus, the following query will reveal files with malformed records:
SELECT
DISTINCT("$PATH")
FROM "some_database"."some_table"
WHERE(
col_1 IS NULL AND
col_2 IS NULL AND
col_3 IS NULL
-- etc
)
Note: you can use only a single col_1 IS NULL if you are 100% sure that it doesn't contain empty fields other then in corrupted rows.
In general, malformed records are not that big of a deal provided that 'ignore.malformed.json' = 'true'. For example the following query will still succeed
For example if a file contains:
{"name": "2Patrick","age": 35,"address": "North Street"}
{
"name": "2Carlos",
"age": 11,
"address": "Flowers Street"
}
{"name": "2Fabiana","age": 22,"address": "Main Street"}
the following query will still succeed
SELECT
"$PATH" AS "source_s3_file",
*
FROM some_table
Result:
source_s3_file | name | age | address
-----------------------------|----------|-----|-------------
1 s3://path/to/malformed.json| 2Patrick | 35 | North Street
2 s3://path/to/malformed.json| | |
3 s3://path/to/malformed.json| | |
4 s3://path/to/malformed.json| | |
5 s3://path/to/malformed.json| | |
6 s3://path/to/malformed.json| | |
7 s3://path/to/malformed.json| 2Fabiana | 22 | Main Street
While with 'ignore.malformed.json' = 'false' (which is the default behaviour) exactly the same query will throw an error
HIVE_CURSOR_ERROR: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2 [character 3 line 1]

Hive Explode and extract a value from a String

Folks, I'm trying to extract value of 'status' from below string(column name: people) in hive. The problem is, the column is neither a complete JSON nor stored as an Array.
I tried to make it look like a JSON by replacing '=' with ':', which didnt help.
[{name=abc, org=true, self=true, status=accepted, email=abc#gmail.com}, {name=cab abc, org=false, self=false, status=needsAction, email=cab#google.com}]
Below is the query I used:
SELECT
str.name,
str.org,
str.status
FROM table
LATERAL VIEW EXPLODE (TRANSLATE(people,'=',':')) exploded as str;
but I'm getting below error:
FAILED: UDFArgumentException explode() takes an array or a map as a parameter
Need output something like this:
name | org | status
-------- ------- ------------
abc | true | accepted
cab abc | false | needsAction
Note: There is a table already, the datatype is string, and I
can't change the table schema.
Solution for Hive. It possibly can be optimized. Read comments in the code:
with your_table as ( --your data example, you select from your table instead
select "[{name=abc, org=true, self=true, status=accepted, email=abc#gmail.com}, {name=cab abc, org=false, self=false, status=needsAction, email=cab#google.com}]" str
)
select --get map values
m['org'] as org ,
m['name'] as name ,
m['self'] as self ,
m['status'] as status ,
m['email'] as email
from
(--remove spaces after commas, convert to map
select str_to_map(regexp_replace(a.s,', +',','),',','=') m --map
from your_table t --replace w your table
lateral view explode(split(regexp_replace(str,'\\[|\\{|]',''),'}, *')) a as s --remove extra characters: '[' or '{' or ']', split and explode
)s;
Result:
OK
true abc true accepted abc#gmail.com
false cab abc false needsAction cab#google.com
Time taken: 1.001 seconds, Fetched: 2 row(s)