T-SQL Compare a group of rows with other groups of rows - sql

I'm trying to filter one table by the values of multiple rows grouped by one column, where those values must match the rows of a group in another table (also grouped by a column). For example:
###Table1###
+--------+-------+
| Symbol | Value |
+--------+-------+
| A | 1 |
| A | 2 |
| A | 3 |
| B | 9 |
| B | 8 |
+--------+-------+
###Table2###
+--------+-------+
| Symbol | Value |
+--------+-------+
| C | 9 |
| C | 8 |
| D | 1 |
| D | 2 |
| D | 4 |
| E | 9 |
| E | 8 |
| F | 1 |
| F | 2 |
| F | 3 |
+--------+-------+
The query needs to return C, E, and F but not D because the values for A match the values of F, and the values of B match the values of C and E.
I hope this makes sense.

You can get the match by joining the tables on the value and then counting the symbols. For your data, this should work:
select t2.symbol, t1.symbol
from (select t1.*, count(*) over (partition by symbol) as cnt
from table1 t1
) t1 join
table2 t2
on t1.value = t2.value
group by t1.symbol, t2.symbol, t1.cnt
having count(*) = t1.cnt;
This assumes:
No duplicates in either table.
You are looking for rows in table2 that match table1, but table2 could have additional values not in table1.
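As a rough check of the approach, here is a minimal sketch that runs the query against the question's sample data. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, which is an assumption; the dialects differ in places. The extra group G (9, 8, 7) is hypothetical and is only there to show the difference between the subset match above and a stricter variant that also compares against the size of each table2 group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (symbol TEXT, value INTEGER);
    CREATE TABLE table2 (symbol TEXT, value INTEGER);
    INSERT INTO table1 VALUES ('A',1),('A',2),('A',3),('B',9),('B',8);
    -- Rows for G are hypothetical: a superset of B's values.
    INSERT INTO table2 VALUES ('C',9),('C',8),('D',1),('D',2),('D',4),
                              ('E',9),('E',8),('F',1),('F',2),('F',3),
                              ('G',9),('G',8),('G',7);
""")

# Subset match: every value of the table1 group appears in the table2 group.
subset_sql = """
    SELECT t2.symbol AS t2_symbol, t1.symbol AS t1_symbol
    FROM (SELECT symbol, value,
                 COUNT(*) OVER (PARTITION BY symbol) AS cnt
          FROM table1) t1
    JOIN table2 t2 ON t1.value = t2.value
    GROUP BY t1.symbol, t2.symbol, t1.cnt
    HAVING COUNT(*) = t1.cnt
    ORDER BY t2.symbol, t1.symbol
"""

# Exact match: the matched rows must also cover the whole table2 group.
exact_sql = """
    SELECT t2.symbol AS t2_symbol, t1.symbol AS t1_symbol
    FROM (SELECT symbol, value,
                 COUNT(*) OVER (PARTITION BY symbol) AS cnt
          FROM table1) t1
    JOIN (SELECT symbol, value,
                 COUNT(*) OVER (PARTITION BY symbol) AS cnt
          FROM table2) t2 ON t1.value = t2.value
    GROUP BY t1.symbol, t2.symbol, t1.cnt, t2.cnt
    HAVING COUNT(*) = t1.cnt AND COUNT(*) = t2.cnt
    ORDER BY t2.symbol, t1.symbol
"""

subset_matches = conn.execute(subset_sql).fetchall()
exact_matches = conn.execute(exact_sql).fetchall()
print("subset:", subset_matches)
print("exact: ", exact_matches)
```

With this data the subset version also reports G as a match for B, while the exact version keeps only C, E, and F. Both still rely on there being no duplicate rows within a group.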

Related

How to join a grouped table in sql?

Novice in SQL here but hopefully someone can help. I have two tables. For the simplicity here is how the tables are structured.
Table 1:
+------------+-------+-----------+------------+
| department | sales | date | sales_code |
+------------+-------+-----------+------------+
| 1 | 50 | 5/26/2021 | A |
+------------+-------+-----------+------------+
| 2 | 150 | 5/26/2021 | B |
+------------+-------+-----------+------------+
| 1 | 200 | 5/25/2021 | C |
+------------+-------+-----------+------------+
| 2 | 250 | 5/24/2021 | D |
+------------+-------+-----------+------------+
Table 2:
+------+------------+-------+-----------+-----------------------+
| item | department | sales | date | column I want to join |
+------+------------+-------+-----------+-----------------------+
| 31 | 1 | 50 | 5/26/2021 | x |
+------+------------+-------+-----------+-----------------------+
| 30 | 2 | 150 | 5/26/2021 | x |
+------+------------+-------+-----------+-----------------------+
| 29 | 1 | 200 | 5/25/2021 | x |
+------+------------+-------+-----------+-----------------------+
| 28 | 2 | 250 | 5/24/2021 | x |
+------+------------+-------+-----------+-----------------------+
I need to join table 2 to table 1 - however it needs to be aggregated by department sales first, this is because table 2 is already aggregated by department sales. Here is what I was thinking but cannot seem to get it to work.
SELECT t1.*, t2.*
FROM table1 as t1
JOIN (
SELECT department, date, column_i_want, sum(sales)
FROM table2
GROUP BY department ) as t2
ON t2.department = t1.department AND t1.date = t2.date
Desired Output:
+------------+-------+-----------+------------+-----------------------+
| department | sales | date | sales_code | column I want to join |
+------------+-------+-----------+------------+-----------------------+
| 1 | 50 | 5/26/2021 | A | x |
+------------+-------+-----------+------------+-----------------------+
| 2 | 150 | 5/26/2021 | B | x |
+------------+-------+-----------+------------+-----------------------+
| 1 | 200 | 5/25/2021 | C | x |
+------------+-------+-----------+------------+-----------------------+
| 2 | 250 | 5/24/2021 | D | x |
+------------+-------+-----------+------------+-----------------------+
Any help would be appreciated.
There are several ways to do that; the easiest is to create a view. Every non-aggregated column has to appear in the GROUP BY, and the sum needs a name:
CREATE VIEW t2 AS
SELECT department, date, column_i_want, sum(sales) AS total_sales
FROM table2
GROUP BY department, date, column_i_want;
Then it's easier to join them (you can also use a WITH clause instead of a view, but it can get messy):
SELECT *
FROM table1 NATURAL JOIN t2
Here is what you want:
select t2.*, t1.sales_code
from table2 t2
join table1 t1
on t1.department = t2.department
and t1.date = t2.date
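For comparison, the derived-table form the question was attempting can be sketched as below, again using Python's sqlite3 module purely as a test harness (an assumption; the original database is not stated). The column name column_i_want is the placeholder from the question's own attempt, and the subquery groups by every non-aggregated column it selects.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (department INTEGER, sales INTEGER,
                         date TEXT, sales_code TEXT);
    CREATE TABLE table2 (item INTEGER, department INTEGER, sales INTEGER,
                         date TEXT, column_i_want TEXT);
    INSERT INTO table1 VALUES (1, 50, '5/26/2021', 'A'),
                              (2, 150, '5/26/2021', 'B'),
                              (1, 200, '5/25/2021', 'C'),
                              (2, 250, '5/24/2021', 'D');
    INSERT INTO table2 VALUES (31, 1, 50, '5/26/2021', 'x'),
                              (30, 2, 150, '5/26/2021', 'x'),
                              (29, 1, 200, '5/25/2021', 'x'),
                              (28, 2, 250, '5/24/2021', 'x');
""")

# Aggregate table2 to one row per (department, date) first, then join.
join_sql = """
    SELECT t1.department, t1.sales, t1.date, t1.sales_code, t2.column_i_want
    FROM table1 AS t1
    JOIN (SELECT department, date, column_i_want, SUM(sales) AS total_sales
          FROM table2
          GROUP BY department, date, column_i_want) AS t2
      ON t2.department = t1.department
     AND t2.date = t1.date
    ORDER BY t1.sales_code
"""

joined_rows = conn.execute(join_sql).fetchall()
for row in joined_rows:
    print(row)
```

If table2 really has one row per department and date, as in the sample, the aggregation is a no-op and the plain join above gives the same rows.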

How to sort rows with duplicated id's by specific value in PostgreSQL?

I made a select with joins from DB table and as a result I have records with duplicated IDs. For ranking records (pagination purposes) I use dense_rank() over () in PostgreSQL.
The question is: how can I order results by specific column value and its appropriate value from another column?
SELECT *
FROM
(SELECT crm_leads.id,
f.name,
fv.value,
dense_rank() OVER (ORDER BY crm_leads.id) AS offset_
FROM crm_leads
INNER JOIN crm_modules AS m ON crm_leads.module_id = m.id
INNER JOIN crm_fields AS f ON f.module_id = m.id
LEFT JOIN crm_field_values AS fv ON fv.lead_id = crm_leads.id
AND fv.field_id = f.id
LEFT JOIN crm_field_type_values AS ftv ON ftv.field_id = f.id
WHERE crm_leads.domain_uuid = '6191af69-9cb5-44f7-b455-3eae6f81d01d'
AND m.id = 41 )
AS result_offset
What I have after select:
| ID | NAME | VALUE |
| 3 | name1 | 13 |
| 3 | name2 | 23 |
| 3 | name3 | 44 |
| 4 | name2 | 55 |
| 4 | name1 | 12 |
| 5 | name2 | 89 |
| 5 | name1 | 14 |
For example, I want to order by NAME: name1 value and its appropriate value. What I expect after sorting (values: 12, 13, 14):
| ID | NAME | VALUE |
| 4 | name2 | 55 |
| 4 | name1 | 12 |
| 3 | name1 | 13 |
| 3 | name2 | 23 |
| 3 | name3 | 44 |
| 5 | name2 | 89 |
| 5 | name1 | 14 |
You can use window functions in the order by. So I think you want an order by on the outer query:
order by min(value) over (partition by id), id
Note that you do not have an order by in the outermost query. Hence, you are not guaranteed that the results will be in any particular order. order by in subqueries does not guarantee ordering in the outer query.
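Note that min(value) only gives the name1 ordering when name1 happens to hold the smallest value for each id, as it does in the sample. A possible variant, sketched here with Python's sqlite3 module rather than PostgreSQL (an assumption), restricts the window aggregate to the name1 rows. The table name leads is hypothetical and stands for the flattened result of the joins in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened result of the question's joins.
    CREATE TABLE leads (id INTEGER, name TEXT, value INTEGER);
    INSERT INTO leads VALUES (3, 'name1', 13), (3, 'name2', 23),
                             (3, 'name3', 44), (4, 'name2', 55),
                             (4, 'name1', 12), (5, 'name2', 89),
                             (5, 'name1', 14);
""")

# Compute the sort key per id from the name1 row only, then order the
# outer query by it. Rows without a name1 value get a NULL key.
sort_sql = """
    SELECT id, name, value
    FROM (SELECT id, name, value,
                 MIN(CASE WHEN name = 'name1' THEN value END)
                     OVER (PARTITION BY id) AS sort_key
          FROM leads) AS keyed
    ORDER BY sort_key, id, name
"""

sorted_rows = conn.execute(sort_sql).fetchall()
sorted_ids = [row[0] for row in sorted_rows]
for row in sorted_rows:
    print(row)
```

The extra name in the ORDER BY only makes the order within each id deterministic; drop it if that order does not matter.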

SQL: For each item in one column, count its corresponding values in another

I have a table Pair(id1, id2) that looks like this:
+---------+---------+
| id1 | id2 |
+---------+---------+
| 1 | 2 |
| 2 | 2 |
| 3 | 1 |
| 4 | 1 |
| 5 | 3 |
| 6 | 2 |
+---------+---------+
I need to create a statement that will print each distinct value from id2, along with a count of the rows that have that value. The output should look like this:
+---------+---------+
| id2 | Count |
+---------+---------+
| 1 | 2 |
| 2 | 3 |
| 3 | 1 |
+---------+---------+
You can use this:
select id2, count(*) cnt
from mytable group by id2
Here's how you do it (replace #temp with your table name). Note that this self-join only returns id2 values that also appear in id1:
select a.id1 id2, count(*) [count] from #temp a
join #temp b on a.id1=b.id2
group by a.id1
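The two answers agree on the sample data but not in general. The sketch below, run through Python's sqlite3 module as a stand-in (an assumption), adds one hypothetical row (7, 9) whose id2 value never occurs in id1, to show where they diverge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pair (id1 INTEGER, id2 INTEGER);
    INSERT INTO pair VALUES (1, 2), (2, 2), (3, 1), (4, 1), (5, 3), (6, 2);
    -- Hypothetical extra row: id2 = 9 has no matching id1.
    INSERT INTO pair VALUES (7, 9);
""")

# Plain aggregation: one output row per distinct id2.
group_by_counts = conn.execute("""
    SELECT id2, COUNT(*) AS cnt
    FROM pair
    GROUP BY id2
    ORDER BY id2
""").fetchall()

# Self-join version: only id2 values that also exist as an id1 survive.
self_join_counts = conn.execute("""
    SELECT a.id1 AS id2, COUNT(*) AS cnt
    FROM pair a
    JOIN pair b ON a.id1 = b.id2
    GROUP BY a.id1
    ORDER BY a.id1
""").fetchall()

print("group by: ", group_by_counts)
print("self-join:", self_join_counts)
```

The GROUP BY version counts the value 9 once, while the self-join drops it, so the plain aggregation is the safer choice unless id2 is known to reference id1.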

join two lists with columns showing values from each table without duplicates

I have two tables. Each table has two columns. The first column of each table is the matching/mapping column. I have no idea how to explain what I am trying to do so I'll use an example.
table 1
| col1 | col2 |
|------|-------|
| a | one |
| a | two |
| b | three |
| c | four |
table 2
| col1 | col2 |
|------|-------|
| a | five |
| b | six |
| b | seven |
| d | eight |
desired output
| col1 | table1 | table2 |
|------|--------|--------|
| a | one | five |
| a | two | |
| b | three | six |
| b | | seven |
| c | four | |
| d | | eight |
(the empty cells are null)
Basically I am looking for a summary table that shows all the col2 options for that col1 from each table. I hope this makes sense...
You need a FULL OUTER JOIN and ROW_NUMBER:
SELECT COALESCE(a.col1, b.col1),
COALESCE(a.col2, ''),
COALESCE(b.col2, '')
FROM (SELECT *,
Rn = Row_number() OVER (partition BY col1 ORDER BY @@SPID)
FROM table1) a
FULL JOIN (SELECT *,
Rn = Row_number() OVER (partition BY col1 ORDER BY @@SPID)
FROM table2) b
FROM table2) b
ON a.col1 = b.col1
AND a.Rn = b.Rn
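The same row-numbering idea can be tried out with the sketch below. It runs on Python's sqlite3 module (an assumption, not SQL Server), and because older SQLite versions lack FULL JOIN it emulates one by building the set of (col1, rn) keys from both tables and left-joining each side to it. Ordering the row numbers by rowid is also a SQLite-specific stand-in for the arbitrary ordering in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 TEXT, col2 TEXT);
    CREATE TABLE table2 (col1 TEXT, col2 TEXT);
    INSERT INTO table1 VALUES ('a','one'),('a','two'),('b','three'),('c','four');
    INSERT INTO table2 VALUES ('a','five'),('b','six'),('b','seven'),('d','eight');
""")

summary_sql = """
    WITH a AS (SELECT col1, col2,
                      ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY rowid) AS rn
               FROM table1),
         b AS (SELECT col1, col2,
                      ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY rowid) AS rn
               FROM table2),
         -- Every (col1, rn) slot that exists on either side.
         keys AS (SELECT col1, rn FROM a
                  UNION
                  SELECT col1, rn FROM b)
    SELECT k.col1, a.col2 AS table1, b.col2 AS table2
    FROM keys k
    LEFT JOIN a ON a.col1 = k.col1 AND a.rn = k.rn
    LEFT JOIN b ON b.col1 = k.col1 AND b.rn = k.rn
    ORDER BY k.col1, k.rn
"""

summary_rows = conn.execute(summary_sql).fetchall()
for row in summary_rows:
    print(row)
```

Unmatched cells come back as NULL (None in Python), which is what the question asked for; wrap them in COALESCE if empty strings are preferred.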

Access Multiple SQL Connection

I have two queries in Access which are returning two tables like:
(The tables have both about 1000 lines)
SELECT
(select count(*)
from Table1 T2
where T1.Name=T2.Name and T1.Variable1 >= T2.Variable1) as Rank,
T1.Name,
T1.Variable1
FROM Table1 T1
Results:
+-------+---------+------------+
| Rank | Name | Variable1 |
+-------+---------+------------+
| 1 | Tim | x |
| 2 | Tim | y |
| 3 | Tim | z |
| 1 | Susan | x |
| 2 | Susan | w |
+-------+---------+------------+
Second query:
SELECT (select count(*)
from Table2 T2
where T1.Name=T2.Name and T1.Variable2 >= T2.Variable2) as Rank,
T1.Name,T1.Variable2
FROM Table2 T1
Results:
+--------+---------+------------+
| Rank | Name | Variable2 |
+--------+---------+------------+
| 1 | Tim | a |
| 2 | Tim | b |
| 3 | Tim | c |
| 1 | Susan | a |
| 2 | Susan | c |
+--------+---------+------------+
I want to link them:
Select distinct Table1.Name, Table1.Variable1, Table2.Variable2
from Table1, Table2
where Table1.Name=Table2.Name and Table1.Rank=Table2.Rank
Results:
+-----------+---------+-------------+------------+
| Rank | Name | Variable1 | Variable2 |
+-----------+---------+-------------+------------+
| 1 | Tim | x | a |
| 2 | Tim | y | b |
| 3 | Tim | z | c |
| 1 | Susan | x | a |
| 2 | Susan | w | c |
+-----------+---------+-------------+------------+
But that link isn't performing well in Access.
I also tried to link them with a JOIN, but the performance isn't getting better.
These ranking queries are expensive (the subquery has to be executed for each row of the main table).
Stacking / cascading expensive queries in Access often performs badly.
Your best option is to change your 1st and 2nd query into "Create table" (SELECT INTO) queries, storing the results in intermediate tables.
E.g.
SELECT
(select count(*)
from Table1 T2
where T1.Name=T2.Name and T1.Variable1 >= T2.Variable1) as Rank,
T1.Name,
T1.Variable1
INTO Result1
FROM Table1 T1
Then use these tables (Result1, Result2) as input for the JOIN.
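The whole flow can be sketched as follows. This runs on Python's sqlite3 module, not Access (an assumption), so CREATE TABLE ... AS SELECT stands in for Access's SELECT ... INTO, and the column is called rnk instead of Rank. The sample values are illustrative and chosen so that plain string comparison produces the ranks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, variable1 TEXT);
    CREATE TABLE table2 (name TEXT, variable2 TEXT);
    INSERT INTO table1 VALUES ('Tim','x'),('Tim','y'),('Tim','z'),
                              ('Susan','w'),('Susan','x');
    INSERT INTO table2 VALUES ('Tim','a'),('Tim','b'),('Tim','c'),
                              ('Susan','a'),('Susan','c');

    -- Step 1: run each expensive ranking query once and store the result.
    CREATE TABLE result1 AS
    SELECT (SELECT COUNT(*) FROM table1 t2
            WHERE t1.name = t2.name
              AND t1.variable1 >= t2.variable1) AS rnk,
           t1.name, t1.variable1
    FROM table1 t1;

    CREATE TABLE result2 AS
    SELECT (SELECT COUNT(*) FROM table2 t2
            WHERE t1.name = t2.name
              AND t1.variable2 >= t2.variable2) AS rnk,
           t1.name, t1.variable2
    FROM table2 t1;
""")

# Step 2: join the stored results; no correlated subquery runs here.
linked_rows = conn.execute("""
    SELECT r1.rnk, r1.name, r1.variable1, r2.variable2
    FROM result1 r1
    JOIN result2 r2 ON r1.name = r2.name AND r1.rnk = r2.rnk
    ORDER BY r1.name, r1.rnk
""").fetchall()

for row in linked_rows:
    print(row)
```

The intermediate tables are snapshots, so they would need to be rebuilt whenever Table1 or Table2 changes.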