Select postgresql for combinations - sql

I have the following 2 tables:
SQL Tables
I have a list of ids (1,2,3,4,5,6,7,8, ..., 10000).
Unique combinations of those ids are inserted into another table.
So, how can I find these unique combinations if I pass a list of ids to search for?
E.g., I search for ARRAY([2,3,4]). The combination exists only for unique_comb 1, so the result will be as follows:
1 3
1 2
1 4
There is no other unique_comb which contains ids ARRAY([2,3,4]).
If I search for [1,4], the results will be as follows:
1 3
1 2
1 4
2 2
2 4
2 5
How can I do it? I know how to do it in a bad way:
CREATE TEMPORARY TABLE t1
Iterate over given ids: SELECT * FROM .. where id = ANY(ARRAY[1,4]) and get all rows, insert into t1 all rows.
Then group everything by unique_comb.
Then count the number of groups. If the number of unique combinations is not more than 1, then return the id of the unique combination, else (unique combinations > 1) return nothing
Is there a way to do it in 1-2 lines of SQL? I am using PostgreSQL 9.3.
select unique_comb from t2 where id = ANY(ARRAY[1, 4]) group by unique_comb ...
The answer below is correct. I modified the query just a little bit and it began to work.
It will choose several ids from table unique thing.
The result of select unique_comb, array_agg(id) from t2 where id = ANY(ARRAY[1, 4]) group by unique_comb will be as follows:

The process that you describe seems to be something like a group by:
select unique_comb
from t2
where id = ANY(ARRAY[1, 4])
group by unique_comb
having count(*) = array_length(ARRAY[1, 4], 1);
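To see how the HAVING filter behaves, here is a minimal self-contained sketch. The rows in t2 are an assumption: they are reconstructed from the (unique_comb, id) pairs listed in the question, since the original table image is not available. It also assumes each (unique_comb, id) pair appears only once.

```sql
-- Hypothetical sample data, reconstructed from the question's result rows.
with t2(unique_comb, id) as (
    values (1, 2), (1, 3), (1, 4),
           (2, 2), (2, 4), (2, 5)
)
select unique_comb
from t2
where id = any(array[2, 3, 4])          -- keep only rows for the searched ids
group by unique_comb
-- a combination qualifies only if every searched id was found in it
having count(*) = array_length(array[2, 3, 4], 1);
```

With this data only unique_comb 1 has all three ids. Note that the filter checks "contains all of the given ids", so a combination with extra ids still matches: searching for [2, 4] would match both combinations here.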

For big tables, and long arrays (not for your examples with just 2 or 3 elements), a more sophisticated query with a recursive CTE would be faster.
In any case you need an index on (id, unique_comb) - in this order! A primary key serves nicely.
WITH RECURSIVE cte AS (
SELECT unique_comb, id, 2 AS i -- start with index for 2nd array elem
FROM tbl
WHERE id = 5 -- *first* element of array
UNION ALL
SELECT t.unique_comb, t.id, c.i + 1
FROM cte c
JOIN tbl t USING (unique_comb)
WHERE t.id = ('{5, ... long array ... , 4}'::int[])[c.i] -- your array here
)
SELECT unique_comb
FROM cte
WHERE id = 4; -- *last* element of array
The advantage of this approach is to rule out most (or all) rows early in the game. If you have information on value frequencies, you would put the rarest elements first.
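A concrete instance of the recursive query may make the array indexing easier to follow. This is only a sketch: the tbl rows are hypothetical sample data and the search array {3,2,4} is chosen for illustration, with distinct elements assumed.

```sql
with recursive tbl(unique_comb, id) as (
    -- hypothetical sample data
    values (1, 2), (1, 3), (1, 4),
           (2, 2), (2, 4), (2, 5)
), cte as (
    select unique_comb, id, 2 as i       -- i = index of the next array element
    from tbl
    where id = 3                         -- first element of the array
    union all
    select t.unique_comb, t.id, c.i + 1
    from cte c
    join tbl t using (unique_comb)
    -- indexing past the end of the array yields NULL, which ends the recursion
    where t.id = ('{3,2,4}'::int[])[c.i]
)
select unique_comb
from cte
where id = 4;                            -- last element of the array
```

Here unique_comb 2 is dropped in the very first step because it has no row with id = 3, so only unique_comb 1 reaches the last element.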

Related

How I can improve SQL gaps searching code for long tables?

How can I improve (speed up) my code for long tables (1M rows)?
I have a table named names. The id column contains 1, 2, 5, 7.
ID | NAME
1 | Homer
2 | Bart
5 | March
7 | Lisa
I need to find the missing sequence numbers from the table.
My SQL query found the missing sequence numbers from my table.
It is similar to the problem asked here, but my solution is different. I am expecting results like:
id
----
3
4
6
(3 rows)
So, my code (for PostgreSQL):
SELECT series AS id
FROM generate_series(1, (SELECT ID FROM names ORDER BY ID DESC LIMIT 1), 1)
series LEFT JOIN names ON series = names.id
WHERE id IS NULL;
Use max(id) to get the largest id instead of ORDER BY ... LIMIT 1:
SELECT series AS id
FROM generate_series(1, (select max(id) from names), 1)
series LEFT JOIN names ON series = names.id
WHERE id IS NULL;
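For long tables, a different technique that is sometimes used is to compare each id with the next one via a window function, so that no series has to be generated. The sketch below is an alternative, not the method from the answer above; it reports gap ranges rather than individual numbers, and it does not report a gap before the smallest id. The sample rows are copied from the question.

```sql
with names(id, name) as (
    values (1, 'Homer'), (2, 'Bart'), (5, 'March'), (7, 'Lisa')
)
select id + 1      as gap_start,
       next_id - 1 as gap_end
from (
    -- pair every id with the next larger id
    select id, lead(id) over (order by id) as next_id
    from names
) s
where next_id - id > 1                   -- keep only places where ids are skipped
order by gap_start;
```

For the sample data this gives the ranges 3 to 4 and 6 to 6, which cover the same missing ids 3, 4 and 6.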

Why does this sql snippet return 8 or 1 always?

What is the result of:
WITH Tbl AS (SELECT 5 AS A UNION SELECT 6 AS A)
SELECT COUNT(*) AS Tbl FROM Tbl AS A, Tbl AS B, Tbl AS C;
I know the result is supposed to be 8 but I don't know why. Also, when I change both values (the 5 and the 6) to the same thing, it returns 1 instead of 8; in all other cases it returns 8, no matter what the numbers are, as long as they are different. I tested it with an online SQL executor.
Here is what the query does:
the common table expression (the subquery within the with clause) generates a derived table made of two rows
then, in the outer query, the from clause generates a cartesian product of three copies of this resultset: that's a total of 8 rows (2 * 2 * 2)
the select clause counts the number of rows - that's 8
The content of the rows in the with clause does not matter: this 5 and 6 could very well be foo and bar, or null and null, the result would be the same.
What makes a difference is the number of rows that the with clause generates. If it was generating just one row, you would get 1 as a result (1 * 1 * 1). If it was generating 3 rows, you would get 27 - and so on.
This expression:
WITH Tbl AS (SELECT 5 AS A UNION SELECT 6 AS A)
creates a (derived) table with two rows.
This expression:
WITH Tbl AS (SELECT 5 AS A UNION SELECT 5 AS A)
creates a (derived) table with one row, because UNION removes duplicates.
The rest of the query just counts the number of rows in the 3-way Cartesian product, which is either 1 * 1 * 1 = 1 or 2 * 2 * 2 = 8.
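The effect of duplicate removal can be checked side by side. In this small sketch the CTE names dedup and keep_all are made up for illustration; the only difference between them is UNION versus UNION ALL.

```sql
with dedup as (
    select 5 as a union select 5         -- UNION removes the duplicate: 1 row
), keep_all as (
    select 5 as a union all select 5     -- UNION ALL keeps both: 2 rows
)
select
    (select count(*) from dedup x, dedup y, dedup z)          as with_union,
    (select count(*) from keep_all x, keep_all y, keep_all z) as with_union_all;
```

This returns 1 for with_union (1 * 1 * 1) and 8 for with_union_all (2 * 2 * 2), even though both CTEs select the same values.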

How to check the possibility of groups union in a sequence order

I have a table with the following columns:
ID_group, ID_elements
For example with the following records:
1, 1
1, 2
2, 2
2, 4
2, 5
2, 6
3, 7
And I have sets of the elements, for example: 1,2,5; 1,5,2; 1,2,4; 2,7;
I need to check (true or false) whether a common group exists for each pair of adjacent elements.
For example, for the elements:
1,2,5 -> true [i.e. elements 1,2 have common group 1 and elements 2,5 have common group 2]
1,5,2 -> false [i.e. 1,5 do not have a common group, unlike 5,2 (but the result is false because 1,5 is false)]
1,2,4 -> true
2,7 -> false
First, we need a list of pairs. We can get this by taking your set as an array, turning each element into a row with unnest and then making pairs by matching each row with its previous row using lag.
with nums as (
select *
from unnest(array[1,2,5]) i
)
select lag(i) over() a, i b
from nums
offset 1;
a | b
---+---
1 | 2
2 | 5
(2 rows)
Then we join each pair with each matching row. To avoid counting duplicate data rows twice, we count only the distinct rows.
with nums as (
select *
from unnest(array[1,2,5]) i
), pairs as (
select lag(i) over() a, i b
from nums
offset 1
)
select
count(distinct(id_group,id_elements)) = (select count(*) from pairs)
from pairs
join foo on foo.id_group = a and foo.id_elements = b;
This works on any size array.
Checking whether the elements of a set evaluate to true can also be done with a procedure or function: take the set as a string, split it into substrings, and return the required result, using a record for multiple entries. As a plain SQL query, below is a sample that can be used as a workaround; you can try changing it based on your requirements.
select case when ( Select count(*)
from ( SELECT
id_group, count(distinct id_elements)
from table where
id_group
in (1,2,5)
group by ID_group having
id_elements
in (1,2,5)) =3 ) then "true" else "false"
end) from table;
@Schwern, thank you, it helped. But I have changed the condition join foo on foo.id_group = a because, as I understand it, a is an element's ID, not a group's. I changed that section to the following:
join foo fA on fA.id_elements = a
join foo fB on fB.id_elements = b and fA.id_group = fB.id_group;
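Putting the pieces together, one possible complete query is sketched below. It is an assumption-laden variant rather than the exact query from either post: it uses WITH ORDINALITY (PostgreSQL 9.4 or later) so that lag has an explicit order, it uses EXISTS instead of counting distinct rows, and the foo rows are copied from the question's example records.

```sql
with foo(id_group, id_elements) as (
    values (1, 1), (1, 2), (2, 2), (2, 4), (2, 5), (2, 6), (3, 7)
), nums as (
    -- keep the array position so the pairing order is explicit
    select i, ord
    from unnest(array[1, 2, 5]) with ordinality as u(i, ord)
), pairs as (
    select lag(i) over (order by ord) as a, i as b
    from nums
)
select bool_and(exists (
           -- is there a group that contains both elements of the pair?
           select 1
           from foo fa
           join foo fb on fb.id_group = fa.id_group
           where fa.id_elements = p.a
             and fb.id_elements = p.b
       )) as all_pairs_share_a_group
from pairs p
where p.a is not null;                   -- skip the first row, which has no predecessor
```

For the set 1,2,5 both pairs (1,2) and (2,5) share a group, so the result is true; replacing the array with array[1,5,2] makes the pair (1,5) fail and the result false. For a single-element array there are no pairs and bool_and returns NULL.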

Alternative to INTERSECT Given Arbitrary Number of conditions

If I have a table similar to the following:
MyIds MyValues
----- --------
1 Meat
1 Fruit
1 Veggies
2 Fruit
2 Meat
3 Meat
How do I create a query such that, given an arbitrary list of distinct MyValues, it will give me all the MyIds that match all of MyValues?
Example: If the list of MyValues contained [Meat, Fruit, Veggies], I'd like to get MyIds of 1 back because 1 has an entry in the table for each value in MyValues.
I know that this can be done with an INTERSECT if I'm given a specific list of MyValues. But I don't know how it can be done with an arbitrary number of MyValues.
You need to count the distinct matching values for each MyIds and check that this count equals the number of values supplied in the IN clause.
SELECT MyIds
FROM tableName
WHERE MyValues IN ('Meat', 'Fruit', 'Veggies')
GROUP BY MyIds
HAVING COUNT(DISTINCT MyValues) = 3
A big question is how the list is being represented. The following gives one approach, representing the list in a table:
with l as (
select 'Meat' as el union all
select 'Fruit' union all
select 'Veggies'
)
select MyIds
from t join
l
on t.MyValues = l.el
group by MyId
having count(distinct t.myvalues) = (select count(*) from l)
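For a quick check, the list-in-a-table approach can be run against the question's sample rows. The sketch below uses PostgreSQL-style VALUES lists inside the CTEs, which is an assumption about the database; other systems may need a different way to write the inline tables.

```sql
with t(MyIds, MyValues) as (
    values (1, 'Meat'), (1, 'Fruit'), (1, 'Veggies'),
           (2, 'Fruit'), (2, 'Meat'),
           (3, 'Meat')
), l(el) as (
    -- the arbitrary search list; add or remove rows freely
    values ('Meat'), ('Fruit'), ('Veggies')
)
select t.MyIds
from t
join l on t.MyValues = l.el
group by t.MyIds
-- the required count follows the list, so nothing is hard-coded
having count(distinct t.MyValues) = (select count(*) from l);
```

With the three-value list only MyIds 1 is returned. Because the comparison uses the size of l, the query itself does not change when the list grows or shrinks.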

Select values in SQL that do not have other corresponding values except those that i search for

I have a table in my database:
Name | Element
1 2
1 3
4 2
4 3
4 5
I need to make a query that for a number of arguments will select the value of Name that has on the right side these and only these values.
E.g.:
arguments are 2 and 3, the query should return only 1 and not 4 (because 4 also has 5). For arguments 2,3,5 it should return 4.
My query looks like this:
SELECT name FROM aggregations WHERE (element=2 and name in (select name from aggregations where element=3))
What do I have to add to this query to make it not return 4?
A simple way to do it:
SELECT name
FROM aggregations
WHERE element IN (2,3)
GROUP BY name
HAVING COUNT(element) = 2
If you want to add more, you'll need to change both the IN (2,3) part and the HAVING part:
SELECT name
FROM aggregations
WHERE element IN (2,3,5)
GROUP BY name
HAVING COUNT(element) = 3
A more robust way would be to also check that the name has no element outside your set:
SELECT name
FROM aggregations
WHERE NOT EXISTS (
SELECT DISTINCT a.element
FROM aggregations a
WHERE a.element NOT IN (2,3,5)
AND a.name = aggregations.name
)
GROUP BY name
HAVING COUNT(element) = 3
It's not very efficient, though.
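Another way to express "these and only these" in a single pass is conditional aggregation. This is an alternative sketch, not the query from the answer above, and it assumes each (name, element) pair occurs at most once. The sample rows are copied from the question.

```sql
with aggregations(name, element) as (
    values (1, 2), (1, 3), (4, 2), (4, 3), (4, 5)
)
select name
from aggregations
group by name
having count(*) = 2                                          -- exactly as many rows as arguments
   and count(case when element in (2, 3) then 1 end) = 2;    -- and all of them are in the set
```

For the arguments 2 and 3 this returns only name 1, since name 4 has a third row. For the arguments 2, 3 and 5, both the IN list and the two counts would change to match, and name 4 would be returned.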
Create a temporary table, fill it with your values and query like this:
SELECT name
FROM (
SELECT DISTINCT name
FROM aggregations
) n
WHERE NOT EXISTS
(
SELECT 1
FROM (
SELECT element
FROM aggregations aii
WHERE aii.name = n.name
) ai
FULL OUTER JOIN
temptable tt
ON tt.element = ai.element
WHERE ai.element IS NULL OR tt.element IS NULL
)
This is more efficient than using COUNT(*), since it will stop checking a name as soon as it finds the first row that doesn't have a match (either in aggregations or in temptable)
This isn't tested, but usually I would do this with a query in my where clause for a small amount of data. Note that this is not efficient for large record counts.
SELECT ag1.Name FROM aggregations ag1
WHERE ag1.Element IN (2,3)
AND 0 = (select COUNT(ag2.Name)
FROM aggregations ag2
WHERE ag1.Name = ag2.Name
AND ag2.Element NOT IN (2,3)
)
GROUP BY ag1.name;
This says "Give me all of the names that have the elements I want, but have no records with elements I don't want"