Select row with all related child rows as array in one query - sql

I'm using Postgres (latest) with Node (latest) and the pg driver (latest). An endpoint is receiving JSON which looks like this:
{
  "id": 12345,
  "total": 123.45,
  "items": [
    {
      "name": "blue shirt",
      "url": "someurl"
    },
    {
      "name": "red shirt",
      "url": "someurl"
    }
  ]
}
So I'm storing this in two tables:
CREATE TABLE orders (
    id    INT NOT NULL,
    total NUMERIC(10, 2) DEFAULT 0 NOT NULL,
    PRIMARY KEY (id)
);
CREATE INDEX index_orders_id ON orders(id);

CREATE TABLE items (
    id       BIGSERIAL NOT NULL,
    order_id INT NOT NULL,
    name     VARCHAR(128) NOT NULL,
    url      VARCHAR(128) DEFAULT '' NOT NULL,
    PRIMARY KEY (id),
    FOREIGN KEY (order_id) REFERENCES orders(id) ON DELETE CASCADE
);
CREATE INDEX index_items_id ON items(id);
The items table has an order_id foreign key that relates each item to its order.
Now, the issue is I almost always need to fetch the order along with the items.
How do I get an output similar to my input json in one query?
I know it can be done in two queries, but this pattern will be all over the place and needs to be efficient. My last resort would be to store the items as a JSONB column directly in the orders table, but then if I need to query on the items or do joins with them it won't be as easy.

One of many ways:
SELECT jsonb_pretty(
          to_jsonb(o.*)  -- taking whole row
       || (SELECT jsonb_build_object('items', jsonb_agg(i))
           FROM  (
              SELECT name, url  -- picking columns
              FROM   items i
              WHERE  i.order_id = o.id
              ) i
          )
       )
FROM   orders o
WHERE  o.id = 12345;
This returns formatted text similar to the displayed input. (But keys are sorted, so 'total' comes after 'items'.)
If an order has no items, you get "items": null.
For a jsonb value, strip the jsonb_pretty() wrapper.
I chose jsonb for its additional functionality - like the jsonb || jsonb → jsonb operator and the jsonb_pretty() function.
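If you would rather get an empty array than null for an order without items, one option (a sketch of my own, not part of the original answer) is to wrap the aggregate in COALESCE:

SELECT jsonb_pretty(
          to_jsonb(o.*)
       || (SELECT jsonb_build_object('items', COALESCE(jsonb_agg(i), '[]'::jsonb))
           FROM  (
              SELECT name, url
              FROM   items i
              WHERE  i.order_id = o.id
              ) i
          )
       )
FROM   orders o
WHERE  o.id = 12345;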
Related:
Return multiple columns of the same row as JSON array of objects
If you want a json value instead, you can cast the jsonb directly (without format) or the formatted text (with format). Or build a json value with rudimentary formatting directly (faster):
SELECT row_to_json(sub, true)
FROM  (
   SELECT o.*
        , (SELECT json_agg(i)
           FROM  (
              SELECT name, url  -- pick columns to report
              FROM   items i
              WHERE  i.order_id = o.id
              ) i
          ) AS items
   FROM   orders o
   WHERE  o.id = 12345
   ) sub;
db<>fiddle here
It all depends on what you need exactly.
Aside:
Consider type text (or varchar) instead of the seemingly arbitrary varchar(128). See:
Should I add an arbitrary length limit to VARCHAR columns?

Related

How to get data on arrays of values

I have pairs of values (test_name, test_surname).
How can I delete the matching rows from the table in one query? I assumed it could be done this way, but it doesn't work:
DELETE FROM test_info
WHERE id_name = ($1::uuid[])
AND id_surname = ($2::uuid[])
This is the schema:
create table test_info
(
    id_name uuid not null,
    id_surname uuid not null
);
Example of unnesting 2 arrays in one query (arrays must have the same size and dimension)
select unnest(array['1','2']),unnest(array['3','4']);
Delete rows
delete from test_info
where (id_name, id_surname) in (select unnest($1::uuid[]), unnest($2::uuid[]))
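An alternative sketch (not from the original answer) using the multi-argument form of unnest in the FROM clause, available since PostgreSQL 9.4; it pairs the two arrays positionally and keeps set-returning functions out of the select list:

delete from test_info t
using unnest($1::uuid[], $2::uuid[]) as u(id_name, id_surname)
where t.id_name = u.id_name
  and t.id_surname = u.id_surname;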

How to select many rows from one table and insert them into a specific JSONB field of a specific row in another table, in a single raw SQL query

postgres 10.3
I have about 1000 rows inside a table called sites
If I query like this
SELECT id, name from sites;
I will get the 1000 rows.
I also have another table called jsonindexdocument with a single row where the id is 1 and a field called index that is JSONB
Is it possible, in a single query, to take out all the 1000 rows in the sites table and then update the field called index in the row with id 1?
The format of the json would be
[
  {
    "id": 10,
    "name": "somename"
  },
  {
    "id": 11,
    "name": "another name"
  }
  // ... and the rest of the 1000 rows
]
I am also okay if it uses more than 1 raw SQL statement.
UPDATE: I want to add that if the result is an empty set, the JSON field should default to an empty array.
Assuming you're OK with fully replacing the index value in the jsonindexdocument table:
UPDATE jsonindexdocument
SET    index = (
    -- using json_agg(row_to_json(sites.*)) would also work here, if you want to copy
    -- all columns from the sites table into the json value
    SELECT COALESCE(json_agg(json_build_object(
        'id',   id,
        'name', name
    )), '[]'::json)
    FROM sites
)
WHERE id = 1;
As an example:
CREATE TEMP TABLE sites (
    id   INT,
    name TEXT
);

CREATE TEMP TABLE jsonindexdocument (
    id    INT,
    index JSON
);

INSERT INTO sites
VALUES (1, 'name1')
     , (2, 'name2');

INSERT INTO jsonindexdocument
VALUES (1, NULL);

UPDATE jsonindexdocument
SET    index = (
    SELECT COALESCE(json_agg(json_build_object(
        'id',   id,
        'name', name
    )), '[]'::json)
    FROM sites
)
WHERE id = 1;

SELECT * FROM jsonindexdocument;
returns
+--+------------------------------------------------------------+
|id|index |
+--+------------------------------------------------------------+
|1 |[{"id" : 1, "name" : "name1"}, {"id" : 2, "name" : "name2"}]|
+--+------------------------------------------------------------+
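If the index column is actually jsonb (as described in the question) rather than json, the same approach works; a sketch using the jsonb variants of the aggregate and constructor functions:

UPDATE jsonindexdocument
SET    index = (
    SELECT COALESCE(jsonb_agg(jsonb_build_object(
        'id',   id,
        'name', name
    )), '[]'::jsonb)
    FROM sites
)
WHERE id = 1;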

Json query vs SQL query using JSON in Oracle 12c (Performance)

I am using Oracle 12c and SQL Developer with JSON.
For this example I have the following JSON:
{
  "id": "12",
  "name": "zhelon"
}
So I have created the following table for this:
create table persons (
    id number primary key,
    person clob,
    constraint person check (person is JSON)
);
The idea is to persist the previous JSON in the person column and use the following query to get that data:
SELECT p.person FROM persons p WHERE json_textvalue('$name', 'zhelon')
Talking about performance, I am interested in extracting some JSON field and adding a new column to the table to improve the response time (I don't know if that is possible):
create table persons (
    id number primary key,
    name varchar(2000),
    person clob,
    constraint person check (person is JSON)
);
To do this:
SELECT p.person FROM persons p WHERE p.name = 'zhelon';
My question is:
What's the best way to make a query to get the data? I want to reduce the response time.
Which query gets the data faster?
SELECT p.person FROM persons p WHERE json_textvalue('$name', 'zhelon')
or
SELECT p.person FROM persons p WHERE p.name = 'zhelon';
You can create a virtual column like this:
ALTER TABLE persons ADD (NAME VARCHAR2(100)
    GENERATED ALWAYS AS (JSON_VALUE(person, '$.name' returning VARCHAR2)) VIRTUAL);
I don't know the correct syntax of JSON_VALUE, but I think you get the idea.
If needed you can also define an index on such a column, like on any other column.
However, when you run SELECT p.person FROM persons p WHERE p.name = 'zhelon';
I don't know which value takes precedence, the value from the JSON in p.person or the column.
Better use a different name in order to be on the safe side:
ALTER TABLE persons ADD (NAME_VAL VARCHAR2(100)
    GENERATED ALWAYS AS (JSON_VALUE(person, '$.name' returning VARCHAR2)) VIRTUAL);
SELECT p.person FROM persons p WHERE p.NAME_VAL = 'zhelon';
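As mentioned above, the virtual column can be indexed like any other column; a sketch (the index name is my own):

CREATE INDEX persons_name_val_ix ON persons (name_val);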

Tricky PostgreSQL join and order query

I've got four tables in a PostgreSQL 9.3.6 database:
sections
fields (child of sections)
entries (child of sections)
data (child of entries)
CREATE TABLE section (
    id serial PRIMARY KEY,
    title text,
    "group" integer
);

CREATE TABLE fields (
    id serial PRIMARY KEY,
    title text,
    section integer,
    type text,
    "default" json
);

CREATE TABLE entries (
    id serial PRIMARY KEY,
    section integer
);

CREATE TABLE data (
    id serial PRIMARY KEY,
    data json,
    field integer,
    entry integer
);
I'm trying to generate a page that looks like this:
section title
field 1 title | field 2 title | field 3 title
entry 1 | data 'as' json | data 1 json | data 3 json <-- table
entry 2 | data 'df' json | data 5 json | data 6 json
entry 3 | data 'gh' json | data 8 json | data 9 json
The way I have it set up right now each piece of 'data' has an entry it's linked to, a corresponding field (that field has columns that determine how the data's json field should be interpreted), a json field to store different types of data, and an id (1-9 here in the table).
In this example there are 3 entries, and 3 fields and there is a data piece for each of the cells in between.
It's set up like this because one section can have different field types and quantity than another section and therefore different quantities and types of data.
Challenge 1:
I'm trying to join the table together in a way that it's sortable by any of the columns (contents of the data for that field's json column). For example I want to be able to sort field 3 (the third column) in reverse order, the table would look like this:
section title
field 1 title | field 2 title | field 3 title
entry 3 | data 'gh' json | data 8 json | data 9 json
entry 2 | data 'df' json | data 5 json | data 6 json
entry 1 | data 'as' json | data 1 json | data 3 json <-- table
I'm open to doing it another way too if there's a better one.
Challenge 2:
Each field has a 'default value' column - Ideally I only have to create 'data' entries when they have a value that isn't that default value. So the table might actually look like this if field 2's default value was 'asdf':
section title
field 1 title | field 2 title | field 3 title
entry 3 | data 'gh' json | data 8 json | data 9 json
entry 2 | data 'df' json | 'asdf' | data 6 json
entry 1 | data 'as' json | 'asdf' | data 3 json <-- table
The key to writing this query is understanding that you just need to fetch all the data for a single section, and the rest you just join. With your schema you also can't filter data by section directly, so you'll need to join entries just for that:
SELECT d.*
FROM data d
JOIN entries e ON (d.entry = e.id)
WHERE e.section = ?
You can then join fields to each row to get defaults, types and titles:
SELECT d.*, f.title, f.type, f."default"
FROM data d
JOIN entries e ON (d.entry = e.id)
JOIN fields f ON (d.field = f.id)
WHERE e.section = ?
Or you can select the fields in a separate query to save some network traffic, as sketched below.
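For instance, that separate query could look like this (a sketch against the schema from the question):

SELECT id, title, type, "default"
FROM fields
WHERE section = ?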
So this was the answer; here come the bonuses:
Use foreign keys instead of integers to refer to other tables, it will make database check consistency for you.
Relations (tables) should be called in singular by convention, so it's section, entry and field.
Referencing columns are called <name>_id, e.g. field_id or section_id, also by convention.
The whole point of JSON fields is to store a collection of data that isn't statically defined, so it would make much more sense not to use the entries and data tables, but a single table with JSON containing all the fields instead.
Like this:
CREATE TABLE row ( -- less generic name would be even better
    id int primary key,
    section_id int references section (id),
    data json
);
With data fields containing something like:
{
  "title": "iPhone 6",
  "price": 650,
  "available": true,
  ...
}
@Suor has provided good advice, some of which you already accepted. I am building on the updated schema.
Schema
CREATE TABLE section (
    section_id serial PRIMARY KEY,
    title      text,
    grp        integer
);

CREATE TABLE field (
    field_id    serial PRIMARY KEY,
    section_id  integer REFERENCES section,
    title       text,
    type        text,
    default_val json
);

CREATE TABLE entry (
    entry_id   serial PRIMARY KEY,
    section_id integer REFERENCES section
);

CREATE TABLE data (
    data_id  serial PRIMARY KEY,
    field_id integer REFERENCES field,
    entry_id integer REFERENCES entry,
    data     json
);
I changed two more details:
section_id instead of id, etc. "id" as column name is an anti-pattern that's gotten popular since a couple of ORMs use it. Don't. Descriptive names are much better. Identical names for identical content is a helpful guideline. It also allows using the shortcut USING in join clauses.
Don't use reserved words as identifiers. Use legal, lower-case, unquoted names exclusively to make your life easier.
Are PostgreSQL column names case-sensitive?
Referential integrity?
There is another inherent weakness in your design. What stops entries in data from referencing a field and an entry that don't go together? Closely related question on dba.SE
Enforcing constraints “two tables away”
Query
Not sure if you need the complex design at all. But to answer the question, this is the base query:
SELECT entry_id, field_id, COALESCE(d.data, f.default_val) AS data
FROM   entry e
JOIN   field f USING (section_id)
LEFT   JOIN data d USING (field_id, entry_id)  -- can be missing
WHERE  e.section_id = 1
ORDER  BY 1, 2;
The LEFT JOIN is crucial to allow for missing data entries and use the default instead.
SQL Fiddle.
crosstab()
The final step is cross tabulation. Cannot show this in SQL Fiddle since the additional module tablefunc is not installed.
Basics for crosstab():
PostgreSQL Crosstab Query
SELECT * FROM crosstab(
   $$
   SELECT entry_id, field_id, COALESCE(d.data, f.default_val) AS data
   FROM   entry e
   JOIN   field f USING (section_id)
   LEFT   JOIN data d USING (field_id, entry_id)  -- can be missing
   WHERE  e.section_id = 1
   ORDER  BY 1, 2
   $$
 , $$SELECT field_id FROM field WHERE section_id = 1 ORDER BY field_id$$
   ) AS ct (entry int, f1 json, f2 json, f3 json)  -- static
ORDER  BY f3->>'a';  -- static
The tricky part here is the return type of the function. I provided a static type for 3 fields, but you really want that dynamic. Also, I am referencing a field in the json type that may or may not be there ...
So build that query dynamically and execute it in a second call.
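For instance, the column definition list for that crosstab call could be generated like this (a sketch; the f<field_id> alias pattern matches the static example above):

SELECT 'entry int, '
    || string_agg('f' || field_id || ' json', ', ' ORDER BY field_id) AS col_def_list
FROM   field
WHERE  section_id = 1;

The resulting text (e.g. entry int, f1 json, f2 json, f3 json) is then spliced into the AS ct (...) clause before executing the crosstab query.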
More about that:
Dynamic alternative to pivot with CASE and GROUP BY

SQLite, Many to many relations, How to aggregate?

I have the classic arrangement for a many-to-many relation in a small flashcard-like application built using SQLite. Every card can have multiple tags, and every tag can have multiple cards. These two entities each have their own table, with a third table to link the records.
This is the table for Cards:
CREATE TABLE Cards (CardId INTEGER PRIMARY KEY AUTOINCREMENT,
                    Text TEXT NOT NULL,
                    Answer INTEGER NOT NULL,
                    Success INTEGER NOT NULL,
                    Fail INTEGER NOT NULL);
This is the table for Tags:
CREATE TABLE Tags (TagId INTEGER PRIMARY KEY AUTOINCREMENT,
                   Name TEXT UNIQUE NOT NULL);
This is the cross reference table:
CREATE TABLE CardsRelatedToTags (CardId INTEGER,
                                 TagId INTEGER,
                                 PRIMARY KEY (CardId, TagId));
I need to get a table of cards with their associated tags in a column separated by commas.
I can already get what I need for a single row knowing its Id with the following query:
SELECT Cards.CardId, Cards.Text,
       (SELECT group_concat(Tags.Name, ', ') FROM Tags
        JOIN CardsRelatedToTags ON CardsRelatedToTags.TagId = Tags.TagId
        WHERE CardsRelatedToTags.CardId = 1) AS TagsList
FROM Cards
WHERE Cards.CardId = 1
This will result in something like this:
CardId | Text | TagsList
1 | Some specially formatted text | Tag1, Tag2, TagN...
How can I get this type of result (TagsList from group_concat) for every row in Cards using a SQL query? Is it advisable to do so from a performance point of view? Or do I need to do this sort of "presentation" work in application code using a simpler request to the DB?
Answering your code question:
SELECT
    c.CardId,
    c.Text,
    GROUP_CONCAT(t.Name, ', ') AS TagsList
FROM
    Cards c
    JOIN CardsRelatedToTags crt ON c.CardId = crt.CardId
    JOIN Tags t ON crt.TagId = t.TagId
WHERE
    c.CardId = 1
GROUP BY c.CardId, c.Text
Now, to the matter of performance. Databases are a powerful tool and do not end at simple SELECT statements. You can definitely do what you need inside the DB (even SQLite). It is a bad practice to use a SELECT statement as a feed for one column inside another SELECT, as it would require scanning the table to get a result for each row in your input.
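If you want the tag list for every card rather than just CardId = 1, a sketch (my variation, not part of the original answer) drops the WHERE clause and uses LEFT JOINs so that cards without any tags still appear (their TagsList will be NULL):

SELECT
    c.CardId,
    c.Text,
    GROUP_CONCAT(t.Name, ', ') AS TagsList
FROM
    Cards c
    LEFT JOIN CardsRelatedToTags crt ON c.CardId = crt.CardId
    LEFT JOIN Tags t ON crt.TagId = t.TagId
GROUP BY c.CardId, c.Text;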