How to Query JSON Within A Database - sql

I would like to query information from databases that were created in this format:
index | label    | key    | data
------+----------+--------+----------------------------------------------------------
1     | sneaker  | UPC    | {"size": "value", "color": "value", "location": "shelf2"}
2     | location | shelf2 | {"height": "value", "row": "value", "column": "value"}
Where a large portion of the data is in one cell, stored as JSON. To make matters a bit tricky, the attributes in the JSON aren't in any particular order, and sometimes they reference other rows; e.g. in the above example there is a "location" attribute whose details live in another row. Additionally, sometimes the data cell is multidimensional, with values nested inside another JSON array.
I’m seeking to do certain query tasks like
Find all locations that have a sneaker
Or find all sneakers with a particular color etc
What’s the industry accepted solution on how to do this?
These are sqlite databases that I’m currently using DB Browser for SQLite to query. Definitely open to better solutions if they exist.

The design that you have needs SQLite's JSON1 extension.
The tasks that you mention in your question can be accomplished with the use of functions like json_extract().
Find all locations that have a sneaker
SELECT t1.*
FROM tablename t1
WHERE t1.label = 'location'
  AND EXISTS (
    SELECT 1
    FROM tablename t2
    WHERE t2.label = 'sneaker'
      AND json_extract(t2.data, '$.location') = t1.key
  )
Find all sneakers with a particular color
SELECT *
FROM tablename
WHERE label = 'sneaker'
AND json_extract(data, '$.color') = 'blue'
For more complicated tasks, such as getting values out of JSON arrays, there are other functions like json_each().
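If it helps, here are the same queries run end to end from Python's built-in sqlite3 module (whose bundled SQLite normally includes the JSON1 functions). The table name and sample values are stand-ins for the question's; the index column is renamed idx because index is a reserved word:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (idx INTEGER, label TEXT, key TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (1, "sneaker", "UPC",
         json.dumps({"size": "10", "color": "blue", "location": "shelf2"})),
        (2, "location", "shelf2",
         json.dumps({"height": "2m", "row": "1", "column": "3"})),
    ],
)

# Find all locations that have a sneaker
locations = conn.execute("""
    SELECT t1.key
    FROM items t1
    WHERE t1.label = 'location'
      AND EXISTS (SELECT 1 FROM items t2
                  WHERE t2.label = 'sneaker'
                    AND json_extract(t2.data, '$.location') = t1.key)
""").fetchall()

# Find all sneakers with a particular color
blue = conn.execute("""
    SELECT key FROM items
    WHERE label = 'sneaker' AND json_extract(data, '$.color') = 'blue'
""").fetchall()

# json_each() flattens a JSON object (or array) into key/value rows
pairs = conn.execute("""
    SELECT j.key, j.value
    FROM items, json_each(items.data) AS j
    WHERE items.label = 'sneaker'
""").fetchall()

print(locations)    # [('shelf2',)]
print(blue)         # [('UPC',)]
print(dict(pairs))  # {'size': '10', 'color': 'blue', 'location': 'shelf2'}
```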

Related

Parsing location from a search query in postgresql

I have a table of location data that is stored in json format with an attributes column that contains data as below:-
{
"name" : "Common name of a place or a postcode",
"other_name":"Any aliases",
"country": "country"
}
This is indexed as follows:-
CREATE INDEX location_jsonb_ts_vector
ON location
USING gin (jsonb_to_tsvector('simple'::regconfig, attributes,'["string","numeric"]'::jsonb));
I can search this for a location using the query:-
SELECT *
FROM location
WHERE jsonb_to_tsvector('simple'::regconfig, attributes, '["string", "numeric"]'::jsonb) @@ plainto_tsquery('place name')
This works well if just using place names. But I want to search using more complex text strings such as:-
'coffee shops with wifi near charing cross'
'all restaurants within 10 miles of swindon centre'
'london nightlife'
I want to get the location found first and then strip it from the search text and go looking for the items in other tables using my location record to narrow down the scope.
This does not work with my current search mechanism as the intent and requirement pollute the text search vector and can cause odd results. I know this is a NLP problem and needs proper parsing of the search string, but this is for a small proof of concept and needs to work entirely in postgres via SQL or PL/PGSQL.
How can I modify my search to get better matches? I've tried splitting the text into keywords and looking for them individually, but they risk not bringing back results unless combined. For example, "Kings Cross" will bring back "Kings".
I've come up with a cheap and cheerful solution:-
WITH tsv AS (
    SELECT to_tsquery('english', 'football | matches | in | swindon') AS search_vector,
           'football matches in swindon' AS search_text
)
SELECT * FROM
(
    SELECT attributes,
           position(lower(attributes->>'name1') IN lower(search_text)) AS name1_position
    FROM location, tsv
    WHERE jsonb_to_tsvector('simple'::regconfig, attributes, '["string", "numeric"]'::jsonb) @@ search_vector
) loc
ORDER BY name1_position DESC
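For what it's worth, the "find the location first, then strip it from the search text" step can be prototyped outside the database too. A rough Python sketch of that logic (the location names here are made up; in the real query they would come from the attributes matched by the tsvector search):

```python
# Known location names; in the real system these would be the
# attributes->>'name' values returned by the tsvector match.
known_locations = ["swindon", "charing cross", "london"]

def split_location(search_text):
    """Return (location, remaining_text), preferring the longest match."""
    text = search_text.lower()
    matches = [loc for loc in known_locations if loc in text]
    if not matches:
        return None, search_text
    best = max(matches, key=len)  # longest name wins, so "charing cross" beats "cross"
    remaining = " ".join(text.replace(best, " ").split())
    return best, remaining

print(split_location("coffee shops with wifi near charing cross"))
# ('charing cross', 'coffee shops with wifi near')
print(split_location("all restaurants within 10 miles of swindon centre"))
# ('swindon', 'all restaurants within 10 miles of centre')
```

Preferring the longest match is the cheap way around the "Kings Cross" vs "Kings" problem noted in the question.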

How can I assign pre-determined codes (1,2,3, etc,) to a JSON-type column in PostgreSQL?

I'm extracting a table of 2000+ rows which are park details. One of the columns is JSON type. Image of the table
We have about 15 attributes like this and we also have a documentation of pre-determined codes assigned to each attribute.
Each row in the extracted table has a different set of attributes that you can see in the image. Right now, I have cast(parks.services AS text) AS "details" to get all the attributes for a particular park or extract just one of them using the code below:
CASE
WHEN cast(parks.services AS text) LIKE '%uncovered%' THEN '2'
WHEN cast(parks.services AS text) LIKE '%{covered%' THEN '1' END AS "details"
This time around, I need to extract these attributes by assigning them the codes. As an example, let's just say
Park 1 - {covered, handicap_access, elevator} to be {1,3,7}
Park 2 - {uncovered, always_open, handicap_access} to be {2,5,3}
I have thought of using subquery to pre-assign the codes, but I cannot wrap my head around JSON operators - in fact, I don't know how to extract them on 2000+ rows.
It would be helpful if someone could guide me in this topic. Thanks a lot!
You should really think about normalizing your tables. Don't store arrays. You should add a mapping table to map the parks and the attribute codes. This makes everything much easier and more performant.
SELECT
    t.name,
    array_agg(c.code ORDER BY elems.index) AS codes              -- 3
FROM mytable t,
    unnest(attributes) WITH ORDINALITY AS elems(value, index)    -- 1
JOIN codes c ON c.name = elems.value                             -- 2
GROUP BY t.name
1. Extract the array elements into one record per element. Add WITH ORDINALITY to save the original order.
2. Join your codes on the elements.
3. Create the code arrays. To ensure the correct order, you can use the index values created by the WITH ORDINALITY clause.

Convert Xpath to SQL

Previously we were fetching a number of countries from a table. Now the table changed to support multiple languages inside the text cell. The previous SQL statement was:
select b.text, a.iso_num, a.iso2, a.iso3, a.kfz, a.ind_member
from schema.zl_countrycode a, schema.zl_countrycode_lsd b
where a.ind_land = 1
  and a.kfz is not NULL
  and dat_end = to_date('99991231','YYYYMMDD')
  and a.ZL_COUNTRYCODE_id = b.ZL_COUNTRYCODE_LSD_ID
order by text
The list previously outputted each country in a row.
With the addition of the language selection, the list suddenly doubled in size - because it lists every country in two different languages.
This is how the table looks with the GUI
I'm trying to figure out how to expand the statement to only select each "DE:" value in the text column. After having a quick indirect discussion with the developers, they provided me with this XPATH statement:
Parentofparent/Parent/Valuestable[name="ZL_COUNTRYCODE"]/Valueclassification[name="TEXT"]/Value/ValueLng[lng="DE"]
I don't know the first thing about XPATH and no matter what I have tried so far, it failed.
Please help me out if you're knowledgeable in this field. It's probably an Oracle database, as I have experienced a plethora of Oracle error codes in the last few days.
I think this should do the trick:
SELECT b.TEXT
,a.iso_num
,a.iso2
,a.iso3
,a.kfz
,a.ind_member
,a.de
FROM SCHEMA.zl_countrycode a
JOIN SCHEMA.zl_countrycode_lsd b
ON a.ZL_COUNTRYCODE_id = b.ZL_COUNTRYCODE_LSD_ID AND b.spec_lng = 'DE'
WHERE a.ind_land = 1
AND a.kfz IS NOT NULL
AND dat_end = to_date('99991231', 'YYYYMMDD')
ORDER BY TEXT
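The key idea is that the language filter belongs in the JOIN condition, so each country keeps exactly one text row. A minimal, self-contained illustration of that pattern using Python's sqlite3 (table and column names are simplified stand-ins for zl_countrycode / zl_countrycode_lsd):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (id INTEGER, iso2 TEXT);
    CREATE TABLE country_text (country_id INTEGER, spec_lng TEXT, text TEXT);
    INSERT INTO country VALUES (1, 'DE'), (2, 'FR');
    INSERT INTO country_text VALUES
        (1, 'DE', 'Deutschland'), (1, 'EN', 'Germany'),
        (2, 'DE', 'Frankreich'), (2, 'EN', 'France');
""")

# Filtering the language inside the JOIN keeps one text row per country.
rows = conn.execute("""
    SELECT b.text, a.iso2
    FROM country a
    JOIN country_text b
      ON a.id = b.country_id AND b.spec_lng = 'DE'
    ORDER BY b.text
""").fetchall()
print(rows)  # [('Deutschland', 'DE'), ('Frankreich', 'FR')]
```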

SQL MIN() returns multiple values?

I am using SQL server 2005, querying with Web Developer 2010, and the min function appears to be returning more than one value (for each ID returned, see below). Ideally I would like it to just return the one for each ID.
SELECT Production.WorksOrderOperations.WorksOrderNumber,
MIN(Production.WorksOrderOperations.OperationNumber) AS Expr1,
Production.Resources.ResourceCode,
Production.Resources.ResourceDescription,
Production.WorksOrderExcel_ExcelExport_View.PartNumber,
Production.WorksOrderOperations.PlannedQuantity,
Production.WorksOrderOperations.PlannedSetTime,
Production.WorksOrderOperations.PlannedRunTime
FROM Production.WorksOrderOperations
INNER JOIN Production.Resources
ON Production.WorksOrderOperations.ResourceID = Production.Resources.ResourceID
INNER JOIN Production.WorksOrderExcel_ExcelExport_View
ON Production.WorksOrderOperations.WorksOrderNumber = Production.WorksOrderExcel_ExcelExport_View.WorksOrderNumber
WHERE Production.WorksOrderOperations.WorksOrderNumber IN
( SELECT WorksOrderNumber
FROM Production.WorksOrderExcel_ExcelExport_View AS WorksOrderExcel_ExcelExport_View_1
WHERE (WorksOrderSuffixStatus = 'Proposed'))
AND Production.Resources.ResourceCode IN ('1303', '1604')
GROUP BY Production.WorksOrderOperations.WorksOrderNumber,
Production.Resources.ResourceCode,
Production.Resources.ResourceDescription,
Production.WorksOrderExcel_ExcelExport_View.PartNumber,
Production.WorksOrderOperations.PlannedQuantity,
Production.WorksOrderOperations.PlannedSetTime,
Production.WorksOrderOperations.PlannedRunTime
If you can get your head around it, I am selecting certain columns from multiple tables where the WorksOrderNumber is also contained within a subquery, and numerous other conditions.
Result set looks a little like this, have blurred out irrelevant data.
http://i.stack.imgur.com/5UFIp.png (Wouldn't let me embed image).
The highlighted rows are NOT supposed to be there, I cannot explicitly filter them out, as this result set will be updated daily and it is likely to happen with a different record.
I have tried casting and converting the OperationNumber to numerous other data types, varchar type returns '100' instead of the '30'. Also tried searching search engines, no one seems to have the same problem.
I did not structure the tables (they're horribly normalised), and it is not possible to restructure them.
Any ideas appreciated, many thanks.
The MIN function returns the minimum within the group.
If you want the minimum for each ID, you need to group on just the ID.
I assume that by "ID" you are referring to Production.WorksOrderOperations.WorksOrderNumber.
You can add this as a "table" in your SQL:
(SELECT Production.WorksOrderOperations.WorksOrderNumber,
        MIN(Production.WorksOrderOperations.OperationNumber) AS MinOperationNumber
 FROM Production.WorksOrderOperations
 GROUP BY Production.WorksOrderOperations.WorksOrderNumber) AS MinOps
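Joined back to the detail rows, that derived table keeps only the operation row holding each group's minimum. A small runnable sketch of the pattern using Python's sqlite3 (names are trimmed-down stand-ins for the Production tables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ops (order_no TEXT, op_no INTEGER, resource TEXT);
    INSERT INTO ops VALUES
        ('WO1', 30, '1303'), ('WO1', 100, '1604'),
        ('WO2', 10, '1303'), ('WO2', 20, '1604');
""")

# The derived table computes one MIN per order; joining it back keeps
# only the operation row that holds that minimum, so no other grouped
# columns can split the groups apart.
rows = conn.execute("""
    SELECT o.order_no, o.op_no, o.resource
    FROM ops o
    JOIN (SELECT order_no, MIN(op_no) AS min_op
          FROM ops
          GROUP BY order_no) m
      ON o.order_no = m.order_no AND o.op_no = m.min_op
    ORDER BY o.order_no
""").fetchall()
print(rows)  # [('WO1', 30, '1303'), ('WO2', 10, '1303')]
```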

Having trouble returning rows of table using 'LIKE' operator

I am quite new to database programming and I am having trouble doing searches in my database. I have a table with a column named Required_Items; it is just a list of required items separated by ';'. I can't get the server to return the rows when querying:
SELECT * FROM The_Table WHERE Required_Items LIKE '%item1%'
It seems that the database can't find that item in the column. The problem is that I want to be able to return rows that contain ALL the items. I would try something like:
SELECT *
FROM The_Table
WHERE Required_Items LIKE '%item1%' AND
      Required_Items LIKE '%item2%' AND
      Required_Items LIKE '%item3%' AND -- etc...
How can I do that knowing that there will be a variable number of these "items" to test ?
When I have problems like this it invariably turns out to be that my comparison strings simply don't match. (That's not to say they don't look like they match.) Common reasons are spelling mistakes, upper vs lower case issues, characters (particularly spaces) aren't what they appear to be. Do you have spaces in any of your 'like' comparisons?
How did you get the data into the database? Did you copy it from Word or Excel and paste it into the SQL query builder, or something of that nature? That can cause problems if you're not careful.
And of course you know that ALL of your 'like' comparisons must match in order to get data...?
Here's an example of what may be happening:
If the 'Required Items' field = 'Bat, Ball, Glove, Cap, Helmet, Water Battle'
then these will both fail:
...where Required_Items like '%Bat%' and Required_Items like '%Water Bottle%'
...where Required_Items like '%Glove%' and Required_Items like '%Water Bottle%'
(Because 'Water Bottle' is spelled incorrectly in the database)
You can troubleshoot for this kind of problem by having one item at a time in your where clause until you find the one that fails.
Regarding a variable number of items, using the data the way you have it set up (all items in one csv field), your code might be cleanest if you used dynamic SQL. That's where you build a query in a string variable and execute the variable. Search for "Dynamic SQL".
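As a sketch of the dynamic-SQL idea (in Python with sqlite3 here, but the shape is the same in any client language): build one LIKE clause per item and AND them together, keeping the item values parameterized:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE The_Table (name TEXT, Required_Items TEXT);
    INSERT INTO The_Table VALUES
        ('Baseball', 'Ball;Bat;Water Bottle'),
        ('Football', 'Ball;Shoulder Pads;Water Bottle');
""")

def find_with_all_items(items):
    # One LIKE per item, ANDed together; the values stay parameterized
    # so odd characters in an item name can't break the SQL.
    where = " AND ".join("Required_Items LIKE ?" for _ in items)
    sql = f"SELECT name FROM The_Table WHERE {where} ORDER BY name"
    params = [f"%{item}%" for item in items]
    return [row[0] for row in conn.execute(sql, params)]

print(find_with_all_items(["Ball", "Bat"]))           # ['Baseball']
print(find_with_all_items(["Ball", "Water Bottle"]))  # ['Baseball', 'Football']
```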
All that said, the preferred method of storing this kind of data in a relational database is to create maintainable relationships between entities. Your data would be much friendlier if you broke the items out into a structure like this:
__The_Table_______    __Thing_Items____    __Items_______________
Thing_ID  Thing       Thing_ID  Item_ID    Item_ID  Item
--------  --------    --------  -------    -------  --------------
T1        Baseball    T1        i1         i1       Ball
T2        Football    T1        i2         i2       Bat
T3        Fishing     T1        i5         i3       Shoulder Pads
                      T2        i1         i4       Worms
                      T2        i3         i5       Water Bottle
                      T2        i5
                      T3        i4
                      T3        i5
This structure would make handling unknown numbers of items very easy to deal with.
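For example, with that layout, "find every Thing that has ALL of a variable list of items" becomes a GROUP BY / HAVING count check (sometimes called relational division). A minimal sqlite3 sketch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Thing_Items (thing TEXT, item TEXT);
    INSERT INTO Thing_Items VALUES
        ('Baseball', 'Ball'), ('Baseball', 'Bat'), ('Baseball', 'Water Bottle'),
        ('Football', 'Ball'), ('Football', 'Shoulder Pads'), ('Football', 'Water Bottle'),
        ('Fishing', 'Worms'), ('Fishing', 'Water Bottle');
""")

def things_with_all(items):
    # Keep only rows whose item is in the wanted list, then demand that
    # each thing matched every distinct wanted item.
    placeholders = ", ".join("?" for _ in items)
    sql = f"""
        SELECT thing
        FROM Thing_Items
        WHERE item IN ({placeholders})
        GROUP BY thing
        HAVING COUNT(DISTINCT item) = ?
        ORDER BY thing
    """
    return [row[0] for row in conn.execute(sql, [*items, len(items)])]

print(things_with_all(["Ball", "Water Bottle"]))  # ['Baseball', 'Football']
print(things_with_all(["Worms"]))                 # ['Fishing']
```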
If you can't solve this, post actual code and actual data if you can.
Scott
You can't just throw "any number" of arguments into a fixed SQL statement; otherwise it's all one string, and you require an exact or partial match on the whole string. If you need to query for a different number of items per query, you'll need a software middle layer that can count the number of search terms and construct the appropriate SQL statement on the fly.
This query looks right.
You can use a full-text index and full-text queries to get the result too.
eg:
Select * from table where contains(Columns_list,'item1')
Consider reading up on full-text search to get results faster.