How to concatenate field values with recursive query in postgresql? - sql

I have a table in PostgreSQL database that contains parts of addresses in a form of a tree and looks like this:
Id | Name | ParentId
1 | London | 0
2 | Hallam Street| 1
3 | Bld 26 | 2
4 | Office 5 | 3
I would like to make a query to return an address, concatenated from all ancestor names. I need the result table to be like this:
Id | Address
1 | London
2 | London, Hallam Street
3 | London, Hallam Street, Bld 26
4 | London, Hallam Street, Bld 26, Office 5
I guess I have to use WITH RECURSIVE query, but all the examples I've found use the where clause, so I have to put WHERE name='Office 5' to get the result only for that particular row. But I need a concatenated address for each row of my initial table. How can this be done?

The trick with recursive queries is that you need to specify a seed query. This is the query that determines your root node, or the starting point to descend or ascend the tree that you are building.
The reason the WHERE clause is there is to establish the seed ID=1 or Name=Bld 26. If you want every record to have the tree ascended or descended (depending on what you specify in the unioned select), then you should just scrap the WHERE statement so all records are seeded.
Although, the example you give... you might want to start with WHERE ID=1 in the seed, write out the child ID and parent ID. Then in the Union'd SELECT join your derived Recursive table with your table from which you are selecting and join on the Derived Recursive table's Child to your table's parent.
Something like:
WITH RECURSIVE my_tree AS (
-- Seed
SELECT
ID as Child,
ParentID as Parent,
Name,
Name as Address
FROM <table>
WHERE <table>.ID = 1
UNION
-- Recursive Term
SELECT
table.id as Child,
table.parent_id as Parent,
table.name,
t.address || ', ' || table.name as Address
FROM my_tree as t
INNER JOIN <table> table ON
t.Child = table.Parent_Id
)
SELECT Child, Address from my_tree;
I've not used PostgreSQL before, so you might have to fuss a bit with the syntax, but I think this is pretty accurate for that RDBMS.

Related

Is chaining rows in the same table a bad pattern?

I want to create a tree structure of categories and need to find a proper way to store it into the database. Think of the following animal tree, which pretty accurately describes how it should look like:
My question now is whether chaining those entries within the same table is a good idea or not. SQLite doesn't allow me to add a FOREIGN KEY constraint to a value in the same table, so I have to make sure manually that I don't create inconsistencies. This is what I currently plan to have:
id | parent | name
---+--------+--------
1 | null | Animal
2 | 1 | Reptile
3 | 2 | Lizard
4 | 1 | Mammal
5 | 4 | Equine
6 | 4 | Bovine
parent references to an id in the same table, going up all the way until null is found, which is the root. Is this a bad pattern? And if so, what are common alternatives to put a tree structure into a relational database?
If your version of SQLite supports recursive CTE, then this is one option:
WITH RECURSIVE cte (n) AS (
SELECT id FROM yourTable WHERE parent IS NULL
UNION ALL
SELECT t1.id
FROM yourTable t1
INNER JOIN cte t2
ON t1.parent = t2.n AND t1.name NOT LIKE '%Lizard%'
)
SELECT *
FROM yourTable
WHERE id IN cte;
This is untested, but the check on t1.name in the recursive portion of the above CTE (hopefully) should stop the recursion as soon we reach a record which matches the name in the LIKE expression. In the case of searching for Lizard, the recursion should stop one level above Lizard, meaning that every record above it in the hierarchy should be returned.

MS Access Database tables comparison

I am trying to compare three MS Access tables for any given field. For example, I have a Main Table, which holds the record for school children. It has the fields Student ID and Name. Then there are 3 sub-tables schools, but they have some data discrepancy. So lets call these schools, A, B and C. These schools have somehow mixed up Student ID with Name, so I need a way to return any Student ID, which has a mismatch for Name. The Main table has student ID as the PKey, and the other; A, B & C have student ID as PKey as well. But the problem is that when I build relationships in Access, it only returns IDs that are common in all 3 tables - INNER JOIN. I need an efficient way to match schools, A -> B & A -> C and concatenate the results. I think JOINING each of these in pairs might take far too long. Please let me know if you have any other alternatives.
So, you have two problems:
You have bad data that needs to be fixed Student_ID and NAme mixed
up
Your schema is not good.
Addressing the data issue:
If your student_ids are all numeric, you could try something like:
UPDATE subA SET student_id = [name], [name]=student_id WHERE isnumeric([name]);
And repeat for the other mixed up sub tables.
Addressing the schema issue:
You have three "Subtables" one for each school. These three tables should be a single table, and "School" should be a field in that table. So your data looks something like:
+--------+------------+---------+
| School | Student_Id | Name |
+--------+------------+---------+
| A | 1 | John |
| A | 2 | Jasmine |
| B | 3 | Fred |
| C | 5 | Harold |
| C | 6 | Donna |
+--------+------------+---------+
This way you only join in a single table, and your data only grows in rows as new schools are brought into your database.
Second, if I'm reading your question correctly, you have both student_id and name in the main table as well as the three sub-tables? It seems like you should only keep these in a single table, maybe named student.
Lastly, you can combine the three subtables into a single view that will make it 9000% (guesstimate) easier to join for future queries, using a UNION query:
SELECT 'A' as school, student_id, name FROM subA
UNION ALL
SELECT 'B', student_id, name FROM subB
UNION ALL
SELECT 'C', student_id, name FROM subC
This will stack all three tables on top of each other and give you a schema similar to the example above. You can join to your main table like:
SELECT *
FROM mainTable
INNER JOIN
(
SELECT 'A' as school, student_id, name FROM subA
UNION ALL
SELECT 'B', student_id, name FROM subB
UNION ALL
SELECT 'C', student_id, name FROM subC
) AS subs ON
mainTable.student_id = subs.student_id

Selecting only rows containing at least the given elements in SQL

Using PostgreSQL, and given the following sample table, how do I select all parents that have at least a child 10 and a child 20?
parent | child
--------+-------
1 | 10
1 | 20
1 | 30
2 | 10
2 | 20
3 | 10
In other words, this is the expected result:
parent
--------
1
2
In general, how do I select all parents that have at least all of the given children x1, x2, ..., xn? What is the most efficient way to do this?
Thanks!
SELECT parent FROM table WHERE child IN(10,20)
GROUP BY parent
HAVING COUNT(DISTINCT child)>=2
Fiddle
It's not completely clear what your asking. However, I shall give it a crack.
If you're going to manually define the children you can do a simple select statement:
SELECT DISTINCT parent
FROM table1
WHERE child IN ('10', '20')
This would select all Parents that have 10 or 20 as there child. To add more, just add the number to the IN() part.
If however you want to do this for a large number of children or perhaps an unknown number of children then you can create a temp table to store the children search values and join it to your main table. Something like:
CREATE TABLE #SearchChildren
(
Child int
)
Then input your search values into #SearchChildren. Need to know more about what your doing to do this bit.
SELECT DISTINCT a.parent
FROM table1 as a
JOIN #SearchChildren as s
ON a.child = s.Child
Without knowing more about what your trying to do it's difficult to give a full answer but hopefully this helps.

How to add aggregate value to SELECT?

I'm selecting data from multiple tables and I also need to get maximum "timestamp" on those tables. I will need that to create custom cache control.
tbl_name tbl_surname
id | name id | surname
--------- ------------
0 | John 0 | Doe
1 | Jane 1 | Tully
... ...
I have following query:
SELECT name, surname FROM tbl_name, tbl_surname WHERE tbl_name.id = tbl_surname.id
and I need to add following info to result set:
SELECT MAX(ora_rowscn) FROM (SELECT ora_rowscn FROM tbl_name
UNION ALL
SELECT ora_rowscn FROM tbl_surname);
I was trying to use UNION but I get error - mixing group and not single group data - or something like that, I know why I cannot use the union.
I don't want to split this into 2 calls, because I need the timestamp of the current snapshot I took from DB for my cache management. And between select and the call for MAX the DB could change.
Here is result I want:
John | Doe | 123456
Jane | Tully | 123456
where 123456 is approximate time of last change (insert, update, delete) of tables tbl_name and tbl_surname.
I have read only access to DB, so I cannot create triggers, stored procedures, extra tables etc...
Thanks for any suggestions.
EDIT: The value *ora_rowscn* is assigned per block of rows. So in one table this value can differ per row. I need the maximal value from both (all) tables involved in query.
Try:
SELECT name,
surname,
max(greatest(tbl_name.ora_rowscn, tbl_surname.ora_rowscn)) over () as max_rowscn
FROM tbl_name, tbl_surname
WHERE tbl_name.id = tbl_surname.id
There's no need to aggregate here - just include both ora_rowscn values in your query and take the max:
SELECT
n.name,
n.ora_rowscn as n_ora_rowscn,
s.surname,
s.ora_rowscn as s_ora_rowscn,
greatest(n.ora_rowscn, s.ora_rowscn) as last_ora_rowscn
FROM tbl_name n
join tbl_surname s on n.id = s.id
BTW, I've replaced your old-style joins with ANSI style - better readable, IMHO.

I DISTINCTly hate MySQL (help building a query)

This is staight forward I believe:
I have a table with 30,000 rows. When I SELECT DISTINCT 'location' FROM myTable it returns 21,000 rows, about what I'd expect, but it only returns that one column.
What I want is to move those to a new table, but the whole row for each match.
My best guess is something like SELECT * from (SELECT DISTINCT 'location' FROM myTable) or something like that, but it says I have a vague syntax error.
Is there a good way to grab the rest of each DISTINCT row and move it to a new table all in one go?
SELECT * FROM myTable GROUP BY `location`
or if you want to move to another table
CREATE TABLE foo AS SELECT * FROM myTable GROUP BY `location`
Distinct means for the entire row returned. So you can simply use
SELECT DISTINCT * FROM myTable GROUP BY 'location'
Using Distinct on a single column doesn't make a lot of sense. Let's say I have the following simple set
-id- -location-
1 store
2 store
3 home
if there were some sort of query that returned all columns, but just distinct on location, which row would be returned? 1 or 2? Should it just pick one at random? Because of this, DISTINCT works for all columns in the result set returned.
Well, first you need to decide what you really want returned.
The problem is that, presumably, for some of the location values in your table there are different values in the other columns even when the location value is the same:
Location OtherCol StillOtherCol
Place1 1 Fred
Place1 89 Fred
Place1 1 Joe
In that case, which of the three rows do you want to select? When you talk about a DISTINCT Location, you're condensing those three rows of different data into a single row, there's no meaning to moving the original rows from the original table into a new table since those original rows no longer exist in your DISTINCT result set. (If all the other columns are always the same for a given Location, your problem is easier: Just SELECT DISTINCT * FROM YourTable).
If you don't care which values come from the other columns you can use a (bad, IMHO) MySQL extension to SQL and do:
SELECT * FROM YourTable GROUP BY Location
which will give a result set with one row per location and values for the other columns derived from the original data in an undefined fashion.
Multiple rows with identical values in all columns don't have any sense. OK - the question might be a way to correct exactly that situation.
Considering this table, with id being the PK:
kram=# select * from foba;
id | no | name
----+----+---------------
2 | 1 | a
3 | 1 | b
4 | 2 | c
5 | 2 | a,b,c,d,e,f,g
you may extract a sample for every single no (:=location) by grouping over that column, and selecting the row with minimum PK (for example):
SELECT * FROM foba WHERE id IN (SELECT min (id) FROM foba GROUP BY no);
id | no | name
----+----+------
2 | 1 | a
4 | 2 | c