How to apply LIMIT only to parent rows - sql

In my Postgres database I have a table that holds a simple hierarchy, something like this:
id | parent_id
---------------
When an item in the table is a "top-level" item, its parent_id is set to NULL.
When I query my table, I retrieve both the top-level items and the child items that belong to them. E.g. if there is a single top-level item with two children, my query returns three rows. My query is super simple; it looks something like this:
SELECT *
FROM my_table
LIMIT _limit
OFFSET _offset;
When the above returns the three rows, in my business logic I then transform that result into a JSON structure that is then serialized to the client. It looks something like this:
items: [
  {
    id: 1,
    parent_id: null,
    items: [
      {
        id: 2,
        parent_id: 1
      },
      {
        id: 3,
        parent_id: 1
      }
    ]
  }
]
However, as you can see, my query has OFFSET and LIMIT for, you guessed it, pagination. The table is quite large and I want to restrict the number of items that can be requested in a single request.
The problem, continuing with my single top-level item as an example, is that if the LIMIT is set to 1, the children of the top-level item will never be returned.
What I am basically looking for is a way to exclude child rows from counting towards the LIMIT, or to expand the LIMIT by the total number of child rows found.

You're going to have to do two things:
Get the top-level entries to include (paginated).
Run another query for the descendants of those top-level entries.
This is a fully recursive example that does both in a single statement:
create table t (id int primary key, parent_id int);
insert into t (id, parent_id) values
  (1, null), (2, null), (3, null), (4, 1),
  (5, 1), (6, 4), (7, 2), (8, 2),
  (9, 8), (10, 3), (11, null), (12, null);
with recursive entries (id, parent_id) as (
  (
    select id, parent_id
    from t
    where parent_id is null
    order by id
    limit 2 -- add offset N here
  )
  union all
  (
    select t.id, t.parent_id
    from entries
    inner join t on (t.parent_id = entries.id)
  )
)
select * from entries;
https://www.db-fiddle.com/f/g3G2t3mVo7fBhQa9QCA71P/0
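The same idea can be tried locally without a Postgres server. The sketch below is only a stand-in under stated assumptions: it uses Python's built-in sqlite3 module with the sample rows from the answer, and because SQLite does not accept parenthesized members in a compound SELECT, the paginated top-level rows are moved into a separate CTE. The CTE name page, the function name fetch_page and its parameters are hypothetical and do not appear in the original answer.

```python
import sqlite3

# Sample rows copied from the answer: (id, parent_id).
ROWS = [(1, None), (2, None), (3, None), (4, 1), (5, 1), (6, 4),
        (7, 2), (8, 2), (9, 8), (10, 3), (11, None), (12, None)]

QUERY = """
with recursive
page (id, parent_id) as (
    -- LIMIT/OFFSET apply to top-level rows only
    select id, parent_id
    from t
    where parent_id is null
    order by id
    limit :limit offset :offset
),
entries (id, parent_id) as (
    select id, parent_id from page
    union all
    -- pull in every descendant of the rows already selected
    select t.id, t.parent_id
    from entries inner join t on t.parent_id = entries.id
)
select id, parent_id from entries order by id
"""


def fetch_page(limit, offset):
    """Return one page of top-level rows plus all of their descendants."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (id integer primary key, parent_id integer)")
        conn.executemany("insert into t (id, parent_id) values (?, ?)", ROWS)
        return conn.execute(QUERY, {"limit": limit, "offset": offset}).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in fetch_page(limit=2, offset=0):
        print(row)
```

With these rows, a page of two top-level items starting at offset 0 contains items 1 and 2 plus all of their descendants, so only parent rows count towards the limit.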

Related

Can you sort the result in GROUP BY?

I have two tables. One is objects, with the attributes id and is_green. The other is object_closure, with the attributes ancestor_id, descendant_id, and created_at, i.e.:
Objects: id, is_green
Object_closure: ancestor_id, descendant_id, created_at
The objects table has more attributes, but they are not relevant to this question.
I have a query like this:
-- create a table
CREATE TABLE objects (
id INTEGER PRIMARY KEY,
is_green boolean
);
CREATE TABLE object_Closure (
ancestor_id INTEGER ,
descendant_id INTEGER,
created_at date
);
-- insert some values
INSERT INTO objects VALUES (1, 1 );
INSERT INTO objects VALUES (2, 1 );
INSERT INTO objects VALUES (3, 1 );
INSERT INTO objects VALUES (4, 0 );
INSERT INTO objects VALUES (5, 1 );
INSERT INTO objects VALUES (6, 1 );
INSERT INTO object_Closure VALUES (1, 2, '2020-12-12');
INSERT INTO object_Closure VALUES (1, 3, '2020-12-13');
INSERT INTO object_Closure VALUES (2, 3, '2020-12-14');
INSERT INTO object_Closure VALUES (4, 5, '2020-12-15');
INSERT INTO object_Closure VALUES (4, 6, '2020-12-16');
INSERT INTO object_Closure VALUES (5, 6, '2020-12-17');
-- fetch some values
SELECT
O.id,
P.id,
group_concat(DISTINCT P.id ) as p_ids
FROM objects O
LEFT JOIN object_Closure OC on O.id=OC.descendant_id
LEFT JOIN objects P on OC.ancestor_id=P.id AND P.is_green=1
GROUP BY O.id
The result is:
[image: query result]
I would like P.id for O.id=6 to be 5 instead of null. After all, 5 is still a parent ID (P.id). More importantly, if there is more than one parent, I want P.id to show the one that was created first (see P.created_at).
I understand why this happens: the first row the system picks is null, and that null comes from the join with the is_green condition. However, I need P.id to contain only objects that are green.
I cannot do an inner join, because I need the other attributes of the table, and rows where both P.id and p_ids are null must still appear in the result. I cannot restructure the database; it already exists and cannot be changed. I also cannot just use a MIN() or MAX() aggregate, because I want the ID of the parent that was created first.
So is there a way to skip the nulls in the join?
Or is there a way to filter the selection in the SELECT clause?
Or to do an ORDER BY before the grouping?
P.S. My original code concatenates the P.id values in order of P.created_at. For some reason, I cannot replicate that in the online SQL simulator.

Query for value matching in multiple arrays

I have a table containing user experiences. The table contains multiple records for the same user.
JSON example of the data:
{
  user_id : 1,
  location: 'india',
  company_id: 5,
  ...other fields
}
{
  user_id : 1,
  location: 'united kingdom',
  company_id: 6,
  ...other fields
}
I want to run a query that returns the users who have worked at companies satisfying an IN condition for each of several arrays.
E.g.
Array1 of company IDs: 1,2,4,5,6,7,8,10
Array2 of company IDs: 2,6,50,100,12,4
The query should return users who have worked at at least one company from each array, so the IN conditions for both arrays must be satisfied.
I tried the following query with no luck:
select * from <table> where company_id IN(5,7,8) and company_id IN(1,4,3)
even though the table contains two records for one user, with company_id 5 and 4.
create table my_table (user_id int, company_id int);
insert into my_table (user_id, company_id)
values (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 5);
select user_id from my_table where company_id in (5, 7, 8)
intersect
select user_id from my_table where company_id in (1, 4, 3);
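As you described, you need the intersection of the users who have worked in each of the two sets of companies. Your query returns nothing because both IN conditions are tested against the same row, and no single company_id appears in both lists.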
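To check the INTERSECT approach without a database server, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the answer. SQLite is an assumption made for convenience, and the function name users_in_both is hypothetical; the SQL itself is the same shape as the query above.

```python
import sqlite3

# Sample rows copied from the answer: (user_id, company_id).
ROWS = [(1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 5)]


def users_in_both(first_ids, second_ids):
    """Return users with at least one company in each of the two lists."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table my_table (user_id int, company_id int)")
        conn.executemany("insert into my_table values (?, ?)", ROWS)
        # Build one "?" placeholder per company id so values are bound safely.
        first = ", ".join("?" for _ in first_ids)
        second = ", ".join("?" for _ in second_ids)
        sql = (
            f"select user_id from my_table where company_id in ({first}) "
            "intersect "
            f"select user_id from my_table where company_id in ({second})"
        )
        rows = conn.execute(sql, [*first_ids, *second_ids]).fetchall()
        return sorted(r[0] for r in rows)
    finally:
        conn.close()


if __name__ == "__main__":
    print(users_in_both([5, 7, 8], [1, 4, 3]))
```

Each SELECT is evaluated over all rows independently, so a user qualifies when one of their rows matches the first list and another (or the same) row matches the second.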

SQL - Select column and add records based on enum

I'm having some trouble figuring out how to set up my query.
I have a simple 2-column table matching an object id(int) to a tag(string). There's also a legacy data-type, object type(int) that I would like to convert into a tag from the query. For example:
TAG TABLE := { ID, TAG } : (1, FOO), (1, MINT), (2, BAR), (3, FOOBAR), (5, SAUCY)
OBJECT TABLE := { ID, ..., TYPE } : (1, ..., 0), (2, ..., 0), (3, ..., 1),(4, ..., SAUCY)
And the types transfer to tags in the following way (again, an example)
[ 0 -> AWESOME ], [ 1 -> SUPER]
So my goal is to make a query that, using this data, returns:
RETURN TABLE := { ID, TAG_NAME } : (1, AWESOME), (1, FOO), (1, MINT), (2, AWESOME), (2, BAR), (3, FOOBAR), (3, SUPER), (4, SAUCY), (5, SAUCY)
How would I go about setting this up? I tried using case statements for the object type but couldn't get the query to compile... I'm hoping this isn't too tough to create.
Looks to me like a simple UNION ALL:
SELECT ID, TAG FROM TagTable
UNION ALL
SELECT ID, CASE
    WHEN TYPE=0 THEN 'AWESOME'
    WHEN TYPE=1 THEN 'SUPER'
    {etc}
  END AS TAG
FROM ObjectTable
Although maybe you need to do some extra join to get your TypeName using the Type in the Object Table. You don't mention where "Awesome" and "Super" come from in your database.
Assuming that
TRANSER_TABLE := {ID, Name} : (0, AWESOME), (1, SUPER)
you can write this:
select ID, TAG
from TAG_TABLE
UNION ALL
select o.ID, t.Name
from OBJECT_TABLE o
join TRANSER_TABLE t on o.TYPE = t.ID
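For a quick sanity check of the lookup-table variant, the following sketch runs it against the example data from the question using Python's built-in sqlite3 module. Several things here are assumptions: SQLite as the engine, the column types, the function name all_tags, and the omission of object 4, whose type in the question is not one of the integer codes.

```python
import sqlite3

SETUP = """
create table TAG_TABLE (ID int, TAG text);
create table OBJECT_TABLE (ID int, TYPE int);
create table TRANSER_TABLE (ID int, Name text);
insert into TAG_TABLE values
    (1, 'FOO'), (1, 'MINT'), (2, 'BAR'), (3, 'FOOBAR'), (5, 'SAUCY');
-- object 4 from the question is left out (assumption: only integer types)
insert into OBJECT_TABLE values (1, 0), (2, 0), (3, 1);
insert into TRANSER_TABLE values (0, 'AWESOME'), (1, 'SUPER');
"""

QUERY = """
select ID, TAG
from TAG_TABLE
union all
select o.ID, t.Name
from OBJECT_TABLE o
join TRANSER_TABLE t on o.TYPE = t.ID
order by 1, 2
"""


def all_tags():
    """Return stored tags plus tags derived from each object's legacy type."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in all_tags():
        print(row)
```

The ORDER BY is only there to make the combined output easy to compare with the expected table; UNION ALL itself does not guarantee any order.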

Querying Parent-Child relationship in a consecutive way

I'm trying to write an import tool to convert my database from one schema to another.
So now I've come across a table that uses a Parent-Child relationship (via PK ID FK ParentID) and I want to select all records consecutively.
The risk is that I might try to import a child element whose parent has not been imported yet. Such a record would fail to import, so I need to avoid that.
The query I've been working on is as follows:
SELECT * FROM Table a INNER JOIN Table b ON (b.ParentID=a.ID and a.ID= b.ParentID)
Unfortunately that doesn't work (it doesn't give me all the records in the table), so I need a query that returns all rows in the table, ordered so that parents come before their children, which I can then simply loop over to import.
Can someone point me in the right direction?
What you're looking for is a recursive common table expression, which is documented at this link:
http://technet.microsoft.com/en-us/library/ms186243%28v=sql.105%29.aspx
You can use this to tell your downstream ETL the sequence things should be loaded in. For instance, all rows with Depth 1 go first, rows with Depth 2 second, and so on.
DECLARE @Table TABLE (
    ID INT,
    ParentId INT)
INSERT INTO @Table
VALUES
    (1, 0),
    (2, 1),
    (3, 1),
    (4, 0),
    (5, 4),
    (6, 4),
    (7, 1),
    (8, 7)
;WITH cte_Recursive AS (
    -- Anchor query: selects the top-level records
    SELECT ID, ParentId, 1 [Depth]
    FROM @Table
    WHERE ParentId = 0
    UNION ALL
    -- Recursive member: each child is one level deeper than its parent
    SELECT T.ID, T.ParentId, R.Depth + 1 [Depth]
    FROM @Table T
    INNER JOIN cte_Recursive R ON R.ID = T.ParentId
)
SELECT *
FROM cte_Recursive
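As a rough illustration of how the Depth column could drive the import order, the sketch below reruns the same recursive CTE against SQLite through Python's built-in sqlite3 module and sorts by depth. SQLite, the table name items, the function name load_order and the explicit ORDER BY are assumptions added for the example; the sample rows are the ones from the answer.

```python
import sqlite3

# Sample rows copied from the answer: (ID, ParentId), 0 meaning "no parent".
ROWS = [(1, 0), (2, 1), (3, 1), (4, 0), (5, 4), (6, 4), (7, 1), (8, 7)]

QUERY = """
with recursive cte_recursive (id, parent_id, depth) as (
    -- anchor: top-level records
    select id, parent_id, 1
    from items
    where parent_id = 0
    union all
    -- recursive member: children sit one level below their parent
    select t.id, t.parent_id, r.depth + 1
    from items t
    join cte_recursive r on r.id = t.parent_id
)
select id, parent_id, depth
from cte_recursive
order by depth, id
"""


def load_order():
    """Return (id, parent_id, depth) rows with every parent before its children."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table items (id integer, parent_id integer)")
        conn.executemany("insert into items values (?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in load_order():
        print(row)
```

Because a child's depth is always its parent's depth plus one, looping over the rows in this order never reaches a child before its parent.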

Selecting leaf id + root name from a table in oracle

I have a table that is self referencing, with id, parentid (referencing id), name, ordering as columns.
What I want to do is to select the first leaf node of each root and have a pairing of the id of the leaf node with the name of the root node.
The data can have unbounded levels, and siblings have an order (assigned by the "ordering" column). "First leaf node" means the first child's first child's first child's (etc..) child.
The data looks something like this, siblings ordered by ordering:
A
--a
--b
----b.1
----b.2
----b.3
B
--c
----c.1
----c.2
--d
C
--e
----e.1
------e.1.1
I want to be able to produce a mapping as follows:
name of A, id of a
name of B, id of c.1
name of C, id of e.1.1
This is the sql I'm using to achieve this, but I'm not too sure if it will recurse correctly for unbounded levels:
select id,
connect_by_root name name
from table
where connect_by_isleaf = 1
and ((level = 2 and ordering = 1)
or (level > 2 and ordering = 1 and prior ordering = 1))
start with parentid is null
connect by prior id = parentid;
Is there any way I can rewrite the SQL so that it works for unbounded levels?
I would use a subquery:
SELECT root_name, MIN(leaf_name) first_leaf
FROM (SELECT id, connect_by_root(r.NAME) root_name, r.NAME leaf_name
      FROM recurse r
      WHERE connect_by_isleaf = 1
      START WITH parentid IS NULL
      CONNECT BY PRIOR id = parentid)
GROUP BY root_name;
ROOT_NAME  FIRST_LEAF
---------- ----------
A          a
B          c.1
C          e.1.1
This will give you the first leaf (ordered by the leaf name) for each root.
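For readers without an Oracle instance, the same result can be approximated with a standard recursive CTE. The sketch below is an assumption-based translation, not the answer's own code: it uses Python's built-in sqlite3 module, carries the root name down the tree in place of connect_by_root, and treats a row as a leaf when no other row references it as parent, in place of connect_by_isleaf. The names tree and first_leaf_by_name are hypothetical; the rows match the data script the answer provides.

```python
import sqlite3

# Rows matching the answer's data script: (id, name, parentid).
ROWS = [
    (1, "A", None), (3, "b", 1), (4, "b.1", 3), (5, "b.2", 3), (6, "b.3", 3),
    (7, "B", None), (8, "c", 7), (9, "c.1", 8), (10, "c.2", 8), (11, "d", 7),
    (12, "C", None), (13, "e", 12), (14, "e.2", 13), (15, "e.1", 13),
    (16, "a", 1), (20, "e.1.1", 15),
]

QUERY = """
with recursive tree (id, name, root_name) as (
    -- roots carry their own name as root_name
    select id, name, name
    from recurse
    where parentid is null
    union all
    -- children inherit root_name from their parent
    select r.id, r.name, t.root_name
    from recurse r
    join tree t on r.parentid = t.id
)
select root_name, min(name) as first_leaf
from tree
-- a leaf is a row that no other row points to as its parent
where not exists (select 1 from recurse c where c.parentid = tree.id)
group by root_name
order by root_name
"""


def first_leaf_by_name():
    """Return (root_name, first_leaf) pairs, leaves ordered by name."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table recurse (id integer primary key, name text, parentid integer)"
        )
        conn.executemany("insert into recurse values (?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_leaf_by_name():
        print(row)
```

As in the Oracle version, "first" here means first by leaf name across the whole subtree of a root, not the first child of the first child.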
Update
This is the script I used to generate your data:
CREATE TABLE recurse (
ID NUMBER PRIMARY KEY,
name VARCHAR2(10),
parentid NUMBER REFERENCES recurse (ID));
INSERT INTO recurse VALUES (1, 'A', '');
INSERT INTO recurse VALUES (3, 'b', 1);
INSERT INTO recurse VALUES (4, 'b.1', 3);
INSERT INTO recurse VALUES (5, 'b.2', 3);
INSERT INTO recurse VALUES (6, 'b.3', 3);
INSERT INTO recurse VALUES (7, 'B', '');
INSERT INTO recurse VALUES (8, 'c', 7);
INSERT INTO recurse VALUES (9, 'c.1', 8);
INSERT INTO recurse VALUES (10, 'c.2', 8);
INSERT INTO recurse VALUES (11, 'd', 7);
INSERT INTO recurse VALUES (12, 'C', '');
INSERT INTO recurse VALUES (13, 'e', 12);
INSERT INTO recurse VALUES (14, 'e.2', 13);
INSERT INTO recurse VALUES (15, 'e.1', 13);
INSERT INTO recurse VALUES (16, 'a', 1);
INSERT INTO recurse VALUES (20, 'e.1.1', 15);
As you can see, I anticipated that your ordering would not be by name (though this is really unclear from your question).
Now suppose you want to order by ID (or any other column; it doesn't matter which). In that case you can use analytic functions, for example:
SELECT DISTINCT root_name,
       first_value(leaf_name)
         over(PARTITION BY root_name ORDER BY ID) AS first_leaf_name
FROM (SELECT id, connect_by_root(r.NAME) root_name, r.NAME leaf_name
      FROM recurse r
      WHERE connect_by_isleaf = 1
      START WITH parentid IS NULL
      CONNECT BY PRIOR id = parentid)
ORDER BY root_name;
ROOT_NAME  FIRST_LEAF_NAME
---------- ---------------
A          b.1
B          c.1
C          e.2