Find out what group id contains all relevant attributes in SQL - sql

In this case, let's say the groups are groups of animals.
Let's say I have the following tables:
animal_id | attribute_id | animal
----------------------------------
1 | 1 | dog
1 | 4 | dog
2 | 1 | cat
2 | 3 | cat
3 | 2 | fish
3 | 5 | fish
id | attribute
------------------
1 | four legs
2 | no legs
3 | feline
4 | canine
5 | aquatic
The first table contains the attributes that define an animal, and the second table keeps track of what each attribute is. Now let's say that we run a query on some data and get the following result table:
attribute_id
------------
1
4
This data would describe a dog, since it is the only animal_id that has both attributes 1 and 4. I want to be able to somehow get the animal_id (which in this case would be 1) based on the third table, which is essentially a table that has already been generated that contains the attributes of an animal.
EDIT
So the third table that has 1 and 4 doesn't have to be 1 and 4. It could return 2 and 5 (for fish), or 1 and 3 (cat). We can assume that its result will always match one animal completely, but we don't know which one.

You can use group by and having:
with a as (
select 1 as attribute_id from dual union all
select 4 as attribute_id from dual
)
select t.animal_id, t.animal
from t join
a
on t.attribute_id = a.attribute_id
group by t.animal_id, t.animal
having count(*) = (select count(*) from a);
The above will find all animals that have those attributes and any others. If you want animals that have exactly those 2 attributes:
with a as (
select 1 as attribute_id from dual union all
select 4 as attribute_id from dual
)
select t.animal_id, t.animal
from t left join
a
on t.attribute_id = a.attribute_id
group by t.animal_id, t.animal
having count(*) = (select count(*) from a) and
count(*) = count(a.attribute_id);
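To try both queries without an Oracle instance, here is a minimal sketch that runs them against the sample data in SQLite through Python's sqlite3 module. It assumes the table name t from the answer, replaces the "from dual" rows with a VALUES list, and uses two hypothetical helper names (animals_with_all and animals_with_exactly); none of this comes from the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (animal_id INTEGER, attribute_id INTEGER, animal TEXT);
INSERT INTO t VALUES
  (1, 1, 'dog'), (1, 4, 'dog'),
  (2, 1, 'cat'), (2, 3, 'cat'),
  (3, 2, 'fish'), (3, 5, 'fish');
""")


def _run(join_kind, extra_having, attribute_ids):
    # Build the CTE "a" from the wanted attribute ids (must be non-empty).
    values = ", ".join("(?)" for _ in attribute_ids)
    sql = f"""
        WITH a(attribute_id) AS (VALUES {values})
        SELECT t.animal_id, t.animal
        FROM t {join_kind} JOIN a ON t.attribute_id = a.attribute_id
        GROUP BY t.animal_id, t.animal
        HAVING COUNT(*) = (SELECT COUNT(*) FROM a) {extra_having}
        ORDER BY t.animal_id
    """
    return conn.execute(sql, list(attribute_ids)).fetchall()


def animals_with_all(attribute_ids):
    # Animals that have every wanted attribute, and possibly others.
    return _run("INNER", "", attribute_ids)


def animals_with_exactly(attribute_ids):
    # Animals whose attribute set is exactly the wanted set.
    return _run("LEFT", "AND COUNT(*) = COUNT(a.attribute_id)", attribute_ids)


print(animals_with_all([1, 4]))
print(animals_with_exactly([1]))
```

With the sample data, asking for attribute 1 alone matches both dog and cat in the first variant, and nothing in the exact variant, because each of those animals also has a second attribute.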

Related

How to find count differences for IDs in two large tables

I have two large tables, each containing around 17M rows. They should have exactly the same number of rows, but I am finding that the counts differ by 343. I want to find out where the counts are different. The tables look like this:
Table A
ID | color
---| ---------
1 | red
1 | green
1 | blue
2 | white
3 | black
3 | red
Table B
ID | sale_dates
---| ----------
1 | 2020-10-01
1 | 2020-01-10
2 | 2018-01-09
3 | 2017-08-08
Based on above I would like an output like below:
ID | Table A | Table B | Difference
---| --------| --------| ----------
1 | 3 | 2 | 1
2 | 1 | 1 | 0
3 | 2 | 1 | 1
Or even just the IDs where the difference is not 0.
If the two tables will always have the same set of ID values, you can just JOIN two derived tables of COUNT(*) values to get your desired output:
SELECT A.ID,
"Table A",
"Table B",
"Table A" - "Table B" AS Difference
FROM (
SELECT ID, COUNT(*) AS "Table A"
FROM A
GROUP BY ID
) A
JOIN (
SELECT ID, COUNT(*) AS "Table B"
FROM B
GROUP BY ID
) B ON A.ID = B.ID
ORDER BY A.ID
Output:
id Table A Table B difference
1 3 2 1
2 1 1 0
3 2 1 1
If you only want the ID values which have a non-zero difference, add
WHERE "Table A" - "Table B" <> 0
before the ORDER BY clause.
This is a tweak on Nick's answer. I think a full join is very important in this type of situation, because it is possible that some ids are missing from one table or the other:
SELECT ID, a.cnt, b.cnt,
(COALESCE(a.cnt, 0) - COALESCE(b.cnt, 0)) as difference
FROM (SELECT UPPER(ID) as id, COUNT(*) AS cnt
FROM A
GROUP BY UPPER(ID)
) A FULL JOIN
(SELECT UPPER(ID) as id, COUNT(*) AS cnt
FROM B
GROUP BY UPPER(ID)
) B
USING (ID)
ORDER BY difference DESC;
Add:
WHERE COALESCE(a.cnt, 0) <> COALESCE(b.cnt, 0)
before the ORDER BY clause if you only want ids where the counts are not the same.
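The point about ids missing from one side can be checked with a small sketch. It runs in SQLite through Python's sqlite3 module and, because FULL JOIN is only available in newer SQLite versions, it swaps in a different technique: UNION ALL of the per-table counts followed by a second GROUP BY. The row with ID 4 in table B is a hypothetical addition to the sample data, included only to show an id that exists in one table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (ID INTEGER, color TEXT);
CREATE TABLE B (ID INTEGER, sale_dates TEXT);
INSERT INTO A VALUES
  (1, 'red'), (1, 'green'), (1, 'blue'),
  (2, 'white'),
  (3, 'black'), (3, 'red');
INSERT INTO B VALUES
  (1, '2020-10-01'), (1, '2020-01-10'),
  (2, '2018-01-09'),
  (3, '2017-08-08'),
  (4, '2021-01-01');  -- hypothetical row: ID 4 exists only in B
""")

# Count per ID in each table, stack the two results, then sum per ID.
# An ID missing from one table simply contributes 0 on that side.
QUERY = """
SELECT ID,
       SUM(a_cnt) AS a_cnt,
       SUM(b_cnt) AS b_cnt,
       SUM(a_cnt) - SUM(b_cnt) AS difference
FROM (
    SELECT ID, COUNT(*) AS a_cnt, 0 AS b_cnt FROM A GROUP BY ID
    UNION ALL
    SELECT ID, 0 AS a_cnt, COUNT(*) AS b_cnt FROM B GROUP BY ID
) AS counts
GROUP BY ID
HAVING SUM(a_cnt) <> SUM(b_cnt)
ORDER BY ID
"""

mismatches = conn.execute(QUERY).fetchall()
for row in mismatches:
    print(row)
```

An inner join of the two count subqueries would silently drop ID 4 here, which is the case the full join version is meant to catch.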

SQL recursive query for complex table

I have a pretty complex table structure with parent-child relations.
The idea behind the structure is that some object in child_id can trigger a parent_id.
Assume this data:
Table 1 - map
map_id | parent_id | child_id
1 | 1 | 2
2 | 1 | 3
3 | 1 | 4
Table 2 - attributes
attribute_id | child_id | id_to_trigger
1 | 2 | 5
2 | 5 | 6
Example: A questionnaire system is a master. It can contain sub groups to be answered; in which case the sub groups become child of the master. Some answers in the sub groups can trigger an additional sub group within it.
I want to now be able to fetch all the sub group id's for a given master. A sub group can be triggered from multiple sub groups but that isn't a problem since I need just the sub group id's.
As you can tell, master with id 1 has 3 sub groups 2, 3, 4. In the attributes table we can see that sub group 2 can trigger sub group 5; similarly 5 can trigger 6 and so on.
I need 2, 3, 4, 5, 6 in my output. How do I achieve this?
Think about your design. I suggest that you don't need 2 tables if you add these 2 records to your table 1:
map_id | parent_id | child_id
1 | 1 | 2
2 | 1 | 3
3 | 1 | 4
4 | 2 | 5
5 | 5 | 6
You can now use a standard recursive CTE to walk the tree, like this:
with Tree as (select child_id from table_1 where parent_id = 1
union all
select table_1.child_id from table_1
inner join Tree on Tree.child_id = table_1.parent_id)
select * from Tree
If you can't change the schema, this will work:
with
table_1 as ( select Parent_id , child_id from map
union all
select child_id as Parent_id, id_to_trigger as child_id from attributes)
,Tree as (select child_id from table_1 where parent_id = 1
union all
select table_1.child_id from table_1
inner join Tree on Tree.child_id = table_1.parent_id)
select * from Tree
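For a quick check of the two-table variant, here is a minimal sketch that runs it in SQLite through Python's sqlite3 module. It assumes SQLite's WITH RECURSIVE spelling (some engines omit the keyword), renames the combined CTE to edges, and wraps the query in a hypothetical helper called sub_groups. It also uses UNION rather than UNION ALL in the recursive part, so a sub group reached from several parents is listed once; that is a deliberate deviation from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE map (map_id INTEGER, parent_id INTEGER, child_id INTEGER);
CREATE TABLE attributes (attribute_id INTEGER, child_id INTEGER, id_to_trigger INTEGER);
INSERT INTO map VALUES (1, 1, 2), (2, 1, 3), (3, 1, 4);
INSERT INTO attributes VALUES (1, 2, 5), (2, 5, 6);
""")

QUERY = """
WITH RECURSIVE
-- Treat "child triggers sub group" as just another parent/child edge.
edges(parent_id, child_id) AS (
    SELECT parent_id, child_id FROM map
    UNION ALL
    SELECT child_id, id_to_trigger FROM attributes
),
tree(child_id) AS (
    SELECT child_id FROM edges WHERE parent_id = ?
    UNION
    SELECT edges.child_id
    FROM edges
    JOIN tree ON tree.child_id = edges.parent_id
)
SELECT child_id FROM tree ORDER BY child_id
"""


def sub_groups(master_id):
    # Return every sub group id reachable from the given id.
    return [row[0] for row in conn.execute(QUERY, (master_id,))]


print(sub_groups(1))
```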
Try this:
SELECT
map.parent_id,
map.child_id
FROM
map
UNION
SELECT
attributes.child_id,
attributes.id_to_trigger
FROM
map
Inner JOIN attributes ON map.child_id = attributes.child_id
UNION
SELECT
T1.child_id,
T1.id_to_trigger
FROM
attributes T1
Inner JOIN attributes T2 ON T1.child_id = T2.id_to_trigger
Result :
parent_id | child_id
1 | 2
1 | 3
1 | 4
2 | 5
5 | 6

Fill table with data based on other table?

I have 2 tables in Oracle database: document and document_closure.
document:
- id
- name
- parent_id
document_closure:
- id
- parent_id
- child_id
- level
document table has a lot of data (10k~20k). document_closure is empty.
Question: How can I fill the document_closure table with data based on the document table? What SQL script do I need for that task?
Let's say I have a tree like this. Example:
A
|
- B
|
- C
document table:
id | parent_id | name
1 | | A
2 | 1 | B
3 | 2 | C
Finally document_closure must be:
id | parent_id | child_id | level
1 | 1 | 1 | 0
2 | 2 | 2 | 0
3 | 3 | 3 | 0
4 | 1 | 2 | 1
5 | 2 | 3 | 1
6 | 1 | 3 | 2
While parent_id and name fields in document_closure are the same coming from document table, you didn't mention what to put in the name field...
However assuming you can use the same (or null) value for the name field you just need an INSERT INTO SELECT statement like this
INSERT INTO table2 (column1, column2, column3, ...) SELECT column1, column2, column3, ... FROM table1 WHERE condition;
that copies data from document table and inserts them into document_closure table. In your case you just need the following SQL:
INSERT INTO document_closure (id,parent_id) SELECT id,parent_id FROM document;
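and you'll have all your records from the document table copied into the document_closure table. Note that this copies only id and parent_id; child_id and level stay NULL, so on its own it does not produce the ancestor/descendant rows shown in the expected output.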
This can be done using Oracle's connect by for hierarchical queries. This comes with a number of handy functions, including level to indicate how far down the hierarchy you are and connect_by_root() which returns the root value of the hierarchy (i.e. the top level value).
The query to generate the data based on the documents table looks something like:
WITH documents AS (SELECT 1 ID, NULL parent_id, 'A' NAME FROM dual UNION ALL
SELECT 2 ID, 1 parent_id, 'B' NAME FROM dual UNION ALL
SELECT 3 ID, 2 parent_id, 'C' NAME FROM dual)
-- end of mimicking a table with your sample data in it.
-- Since you already have this table, you don't need to bother defining the above subquery.
SELECT row_number() OVER (ORDER BY LEVEL, connect_by_root(ID), ID) ID,
connect_by_root(ID) parent_id,
ID child_id,
LEVEL -1 lvl
FROM documents d
CONNECT BY PRIOR ID = parent_id;
ID PARENT_ID CHILD_ID LVL
---------- ---------- ---------- ----------
1 1 1 0
2 2 2 0
3 3 3 0
4 1 2 1
5 2 3 1
6 1 3 2
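For readers without Oracle at hand, the same closure rows can be produced with a recursive CTE. The sketch below is an assumption-laden stand-in, not the CONNECT BY query itself: it uses SQLite through Python's sqlite3 module, seeds the recursion with every document as its own ancestor at level 0, and numbers the rows in Python instead of using row_number().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
CREATE TABLE document_closure (
    id INTEGER PRIMARY KEY, parent_id INTEGER, child_id INTEGER, level INTEGER
);
INSERT INTO document (id, parent_id, name) VALUES
  (1, NULL, 'A'), (2, 1, 'B'), (3, 2, 'C');
""")

CLOSURE_QUERY = """
WITH RECURSIVE closure(parent_id, child_id, lvl) AS (
    -- Every document is its own ancestor at distance 0.
    SELECT id, id, 0 FROM document
    UNION ALL
    -- Extend each known ancestor/descendant pair by one more generation.
    SELECT c.parent_id, d.id, c.lvl + 1
    FROM closure AS c
    JOIN document AS d ON d.parent_id = c.child_id
)
SELECT parent_id, child_id, lvl
FROM closure
ORDER BY lvl, parent_id, child_id
"""

rows = conn.execute(CLOSURE_QUERY).fetchall()
conn.executemany(
    "INSERT INTO document_closure (id, parent_id, child_id, level) VALUES (?, ?, ?, ?)",
    [(i, p, c, lvl) for i, (p, c, lvl) in enumerate(rows, start=1)],
)

closure = conn.execute(
    "SELECT id, parent_id, child_id, level FROM document_closure ORDER BY id"
).fetchall()
for row in closure:
    print(row)
```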

Treat Multiple Columns as 1 in SQL to Get Aggregates

Is it possible to get counts, too? My UNIONs get me all distinct values across all 4 columns but now I need to know how many times each value appears across all 4 columns. Need to stay with stock SQL, if possible.
(SELECT DISTINCT classify1 AS classified FROM class) UNION
(SELECT DISTINCT classify2 AS classified FROM class) UNION
(SELECT DISTINCT classify3 AS classified FROM class) UNION
(SELECT DISTINCT classify4 AS classified FROM class)
ORDER BY classified
Returns:
A
B
C
D
E
F
H
Need:
A | 3
B | 3
C | 4
D | 3
E | 1
F | 1
H | 1
SELECT a.classified, COUNT(*)
FROM
(
(SELECT classify1 AS classified FROM class) UNION ALL
(SELECT classify2 AS classified FROM class) UNION ALL
(SELECT classify3 AS classified FROM class) UNION ALL
(SELECT classify4 AS classified FROM class)) a
GROUP BY a.classified
Result
| CLASSIFIED | COLUMN_1 |
-------------------------
| A | 3 |
| B | 3 |
| C | 4 |
| D | 3 |
| E | 1 |
| F | 1 |
| H | 1 |
When you use DISTINCT (or UNION without ALL) you eliminate the extra 'A' in classify3.
Use UNION ALL instead of UNION, then embed the result in a sub-query to perform your aggregate on.
SELECT
classified,
COUNT(*)
FROM
(
(SELECT classify1 AS classified FROM class) UNION ALL
(SELECT classify2 AS classified FROM class) UNION ALL
(SELECT classify3 AS classified FROM class) UNION ALL
(SELECT classify4 AS classified FROM class)
)
AS unified_data
GROUP BY
classified
ORDER BY
classified
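The difference between UNION and UNION ALL is easy to see on a small table. The sketch below uses SQLite through Python's sqlite3 module; the four rows are hypothetical data chosen only so that the counts match the desired output in the question (the original data is not shown in the post). The parentheses around each SELECT are dropped because SQLite does not accept them in a compound query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE class (classify1 TEXT, classify2 TEXT, classify3 TEXT, classify4 TEXT);
-- Hypothetical rows, picked to reproduce the counts asked for in the question.
INSERT INTO class VALUES
  ('A', 'B', 'A', 'B'),
  ('B', 'C', 'A', 'C'),
  ('C', 'D', 'C', 'D'),
  ('D', 'E', 'F', 'H');
""")

# UNION ALL keeps every value, so COUNT(*) sees all occurrences.
QUERY = """
SELECT classified, COUNT(*) AS occurrences
FROM (
    SELECT classify1 AS classified FROM class
    UNION ALL
    SELECT classify2 FROM class
    UNION ALL
    SELECT classify3 FROM class
    UNION ALL
    SELECT classify4 FROM class
) AS unified_data
GROUP BY classified
ORDER BY classified
"""

counts = conn.execute(QUERY).fetchall()
for value, occurrences in counts:
    print(value, occurrences)
```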

Returning a row if and only if a sibling row doesn't exist

I'm having an Idiot Day today. I'm sure this is relatively simple, but my brain just isn't giving me the answer.
I have a table whose rows are types of object. Looks something like this:
id name foo bar house_id
1 Cat 12 4 1
2 Cat 9 4 2
3 Dog 8 23 1
4 Bird 9 54 1
5 Bird 78 2 2
6 Bird 29 32 3
This isn't how I'd choose to implement it, but it's what I'm working with. Objects (cats, dogs and birds, in real life they're actual business things) have been added to the table on an ad-hoc basis. When house_id 1 needs cats in it, a record for cats gets put in. When house_id 3 gets dogs, a record gets put in for dogs.
I now need to update this table so every type of object (Cat, Dog, Bird) has a record for a given house_id. I want to do this by inserting the result from a select query that returns a single record for each type, with the earliest values for 'foo' and 'bar' from a row of that type, if and only if there is no existing record for that type with the given house_id.
So for the above example data, where the given house_id = 3, the select query would return the following:
name foo bar house_id
Cat 12 4 3
Dog 8 23 3
which I can then insert straight into the table.
Basically, return the first row of each distinct name if there are no rows of that name with a given house_id.
Suggestions welcome. DB engine is postgres if that helps.
SET search_path= 'tmp';
DROP TABLE dogcat CASCADE;
CREATE TABLE dogcat
( id serial NOT NULL
, zname varchar
, foo INTEGER
, bar INTEGER
, house_id INTEGER NOT NULL
, PRIMARY KEY (zname,house_id)
);
INSERT INTO dogcat(zname,foo,bar,house_id) VALUES
('Cat',12,4,1)
,('Cat',9,4,2)
,('Dog',8,23,1)
,('Bird',9,54,1)
,('Bird',78,2,2)
,('Bird',29,32,3)
;
-- Cartesian product of the {name,house_id} domains
WITH cart AS (
WITH beast AS (
SELECT distinct zname AS zname
FROM dogcat
)
, house AS (
SELECT distinct house_id AS house_id
FROM dogcat
)
SELECT beast.zname AS zname
,house.house_id AS house_id
FROM beast , house
)
INSERT INTO dogcat(zname,house_id, foo,bar)
SELECT ca.zname, ca.house_id
,fb.foo, fb.bar
FROM cart ca
-- find the animal with the lowest id
JOIN dogcat fb ON fb.zname = ca.zname AND NOT EXISTS
( SELECT * FROM dogcat nx
WHERE nx.zname = fb.zname
AND nx.id < fb.id
)
WHERE NOT EXISTS (
SELECT * FROM dogcat dc
WHERE dc.zname = ca.zname
AND dc.house_id = ca.house_id
)
;
SELECT * FROM dogcat;
Result:
SET
DROP TABLE
NOTICE: CREATE TABLE will create implicit sequence "dogcat_id_seq" for serial column "dogcat.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "dogcat_pkey" for table "dogcat"
CREATE TABLE
INSERT 0 6
INSERT 0 3
id | zname | foo | bar | house_id
----+-------+-----+-----+----------
1 | Cat | 12 | 4 | 1
2 | Cat | 9 | 4 | 2
3 | Dog | 8 | 23 | 1
4 | Bird | 9 | 54 | 1
5 | Bird | 78 | 2 | 2
6 | Bird | 29 | 32 | 3
7 | Cat | 12 | 4 | 3
8 | Dog | 8 | 23 | 2
9 | Dog | 8 | 23 | 3
(9 rows)
As is usually the case, I struggle with a question all morning, post it to Stack Overflow and figure it out myself within the next half hour:
select name, foo, bar, 3
from table
where id in
(
select min(id) from table where name not in
(
select name from table where house_id = 3
)
group by name
);
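Here is a minimal sketch of that query run against the sample data, using SQLite through Python's sqlite3 module rather than Postgres. The table name objects and the helper missing_rows are hypothetical (the query above uses the placeholder name table), and the house_id is passed as a parameter instead of being hard-coded to 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (
    id INTEGER PRIMARY KEY, name TEXT, foo INTEGER, bar INTEGER, house_id INTEGER
);
INSERT INTO objects (id, name, foo, bar, house_id) VALUES
  (1, 'Cat', 12, 4, 1),
  (2, 'Cat', 9, 4, 2),
  (3, 'Dog', 8, 23, 1),
  (4, 'Bird', 9, 54, 1),
  (5, 'Bird', 78, 2, 2),
  (6, 'Bird', 29, 32, 3);
""")

# For each name with no row for the given house, take the row with the
# lowest id and relabel it with that house_id.
QUERY = """
SELECT name, foo, bar, :house_id AS house_id
FROM objects
WHERE id IN (
    SELECT MIN(id)
    FROM objects
    WHERE name NOT IN (SELECT name FROM objects WHERE house_id = :house_id)
    GROUP BY name
)
ORDER BY id
"""


def missing_rows(house_id):
    # Rows that would need inserting so every name exists for house_id.
    return conn.execute(QUERY, {"house_id": house_id}).fetchall()


print(missing_rows(3))
```

The result can be fed to an INSERT ... SELECT on the same table. If name could ever be NULL, NOT IN would need replacing with NOT EXISTS, since a NULL in the subquery makes NOT IN match nothing.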