GROUP BY Aggregation with COUNT - sql

I would like to know how to join tables in order to get a sort of a pivot.
table1
col1 | col2
------+------
1 | A
1 | B
1 | C
2 | A
2 | A
2 | A
2 | B
table2
col1
-----
A
B
C
D
E
This code returns the first two columns the way I want (it lists every table2 entry for each table1 col1 value), but I don't know how to get the count of how many times each col2 value occurs in table1. I would like to list the zeros as well.
SELECT table1.col1, table2.col1
FROM table1, table2
GROUP BY table1.col1, table2.col1;
Expected result:
col1 | col2 | col3
-------+-------+----
1 | A | 1
1 | B | 1
1 | C | 1
1 | D | 0
1 | E | 0
2 | A | 3
2 | B | 1
2 | C | 0
2 | D | 0
2 | E | 0

You can use a Cartesian query:
SELECT
table1.col1,
table2.col1 as col2,
Abs(Sum([table1].[col2]=[table2].[col1])) AS col3
FROM
table1,
table2
GROUP BY
table1.col1,
table2.col1;
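The Abs(Sum(...)) expression above appears to be written for a dialect such as MS Access, where a true comparison evaluates to -1. A more portable way to express the same count is a conditional SUM over the cross join. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in database loaded with the question's sample rows; the aliases t1/t2 and the variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for the asker's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE table2 (col1 TEXT);
    INSERT INTO table1 VALUES
        (1, 'A'), (1, 'B'), (1, 'C'),
        (2, 'A'), (2, 'A'), (2, 'A'), (2, 'B');
    INSERT INTO table2 VALUES ('A'), ('B'), ('C'), ('D'), ('E');
""")

# Pair every table1 row with every table2 value, then count only the
# pairs whose values match; non-matching pairs contribute 0 to the sum.
rows = conn.execute("""
    SELECT t1.col1,
           t2.col1 AS col2,
           SUM(CASE WHEN t1.col2 = t2.col1 THEN 1 ELSE 0 END) AS col3
    FROM table1 t1
    CROSS JOIN table2 t2
    GROUP BY t1.col1, t2.col1
    ORDER BY t1.col1, t2.col1
""").fetchall()

for row in rows:
    print(row)
```

On the sample data this should reproduce the expected result from the question, including the zero rows for D and E.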

You can use a query with joins, aggregation and a correlated subquery, as below:
SELECT t1.col1,t1.col2,
(SELECT count(*) FROM table1 WHERE col2=t1.col2 AND col1=t1.col1) as col3
FROM (SELECT t1.col1,max(col2) as col2,count(col2) as ct
FROM table2 t2
JOIN table1 t1
ON t2.col1=t1.col2
GROUP BY t1.col1) t2
RIGHT JOIN ( SELECT tt1.col1,t2.col1 as col2
FROM table2 t2
CROSS JOIN (SELECT distinct col1 FROM table1 ) tt1
) t1
ON t2.col2=t1.col2;

Related

How to find rows that have only one value in another table

I want to find the rows of table1 that join to table2 and whose values are all the same for a given table2 row.
Example:
The row with id 4 in table2 is not valid because it has different values in table1 (value1, value2).
The row with id 5 in table2 is valid because all of its values in table1 are the same (value3).
table1
+----+--------+----------+
| id |table2Id| value |
+----+--------+----------+
| 1 | 4 |value1 |
| 2 | 4 |value2 |
| 3 | 5 |value3 |
| 4 | 5 |value3 |
| 5 | 5 |value3 |
+----+--------+----------+
table2
+----+
| id |
+----+
| 4 |
| 5 |
+----+
I'm not sure why a join is necessary. You can just use the information in table1:
select t1.*
from (select t1.*,
min(value) over (partition by table2id) as min_value,
max(value) over (partition by table2id) as max_value
from table1 t1
) t1
where min_value = max_value;
Or use not exists:
select t1.*
from table1 t1
where not exists (select 1
from table1 tt1
where tt1.table2id = t1.table2id and
tt1.value <> t1.value
);
In either case, you can join to table2 if you need to for filtering or for other columns, but based on the information in your question the join is not needed.
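A quick way to check this kind of query is to run it against the sample rows. The sketch below assumes Python's built-in sqlite3 module as a stand-in database, and uses the NOT EXISTS form with an inequality on value, so a row is kept only when no other row for the same table2Id carries a different value. The variable name valid_rows is illustrative.

```python
import sqlite3

# In-memory SQLite database holding the question's table1 sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, table2Id INTEGER, value TEXT);
    INSERT INTO table1 VALUES
        (1, 4, 'value1'), (2, 4, 'value2'),
        (3, 5, 'value3'), (4, 5, 'value3'), (5, 5, 'value3');
""")

# Keep a row only if no sibling row (same table2Id) has a different value.
valid_rows = conn.execute("""
    SELECT t1.id, t1.table2Id, t1.value
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1
                      FROM table1 tt1
                      WHERE tt1.table2Id = t1.table2Id
                        AND tt1.value <> t1.value)
    ORDER BY t1.id
""").fetchall()

print(valid_rows)
```

Only the three rows for table2Id 5 should come back, since the rows for table2Id 4 disagree on value.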
You can do the following:
SELECT t2.*
FROM (
SELECT table2Id
FROM table1
GROUP BY table2Id
HAVING COUNT(DISTINCT value) = 1) t1
JOIN table2 t2 ON t1.table2Id = t2.id
Thanks
Try this -
Select t1.* from table1 t1 JOIN table2 t2 on t1.id = t2.id and t1.table2Id = t2.id

Case when column a from tableX is present in column b of tableY

Below is my use case. I'm querying Redshift tables using CASE WHEN, but I get an error in the CASE WHEN statement:
ERROR: Statement 2 is not valid. ERROR: syntax error at or near "b"
MY SQL query:
CREATE TEMP TABLE TABLE1 AS
(SELECT
COL1
,COL2
,COL3
FROM XYZ_TABLE
WHERE CONDITION1
AND CONDITION2);
CREATE TEMP TABLE TABLE2 AS
(SELECT DISTINCT
COL1
FROM ABC_TABLE
WHERE CONDITION1
AND CONDITION2);
SELECT
COL1
,COL2
,COL3
,CASE WHEN (a.COL1 IN b.COL1) THEN 1 ELSE 0 END AS IN_TABLE_B
FROM TABLE1 a
LEFT JOIN TABLE2 b
WHERE a.COL1 = b.COL1
What I want to achieve:
TABLE1
-----------------
ID | NAME | COL1
-----------------------
123 | A | BLA
234 | B | BLAA
345 | C | BLAH
456 | X | XXX
567 | N | FLS
TABLE2
-----------------
ID | COL1 | COL2
-----------------------
123 | SKLJF | BLA
345 | DKLUF | BLAH
567 | KKBDL | FLS
DESIRED OUTPUT ( IS ID IN TABLE1 PRESENT IN TABLE2, IF YES THEN 1 ELSE 0 END AS COLUMN_NAME)
----------------------------------------------
ID | IN TABLE B |
----------------------------------------------
123 | 1 |
234 | 0 |
345 | 1 |
456 | 0 |
567 | 1 |
You wouldn't use IN for this, even though your description mentions the word "in".
You're performing a left join, which yields a value for b.ID where the join succeeded and NULL where it failed, hence:
SELECT
a.ID
,CASE WHEN b.ID IS NULL THEN 0 ELSE 1 END AS IN_TABLE_B
FROM TABLE1 a
LEFT JOIN TABLE2 b
ON a.ID = b.ID
I might use exists logic here:
SELECT
t1.ID,
CASE WHEN EXISTS (SELECT 1 FROM TABLE2 t2 WHERE t1.COL1 = t2.COL1)
THEN 1 ELSE 0 END AS IN_TABLE_B
FROM TABLE1 t1
ORDER BY t1.ID;
This approach is robust to a record from the first table having multiple matches: that record is still returned only once.
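The difference between the two styles shows up when the second table contains repeated keys. The sketch below is an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for Redshift, matches on ID as in the desired output, and deliberately inserts ID 123 twice into table2 (that duplicate is not in the question's data).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER);
    INSERT INTO table1 VALUES
        (123, 'A'), (234, 'B'), (345, 'C'), (456, 'X'), (567, 'N');
    -- Hypothetical duplicate: 123 appears twice on purpose.
    INSERT INTO table2 VALUES (123), (123), (345), (567);
""")

# LEFT JOIN flag: every matching table2 row produces its own output row.
join_rows = conn.execute("""
    SELECT a.id, CASE WHEN b.id IS NULL THEN 0 ELSE 1 END AS in_table_b
    FROM table1 a
    LEFT JOIN table2 b ON a.id = b.id
    ORDER BY a.id
""").fetchall()

# EXISTS flag: exactly one output row per table1 row, whatever table2 holds.
exists_rows = conn.execute("""
    SELECT t1.id,
           CASE WHEN EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id)
                THEN 1 ELSE 0 END AS in_table_b
    FROM table1 t1
    ORDER BY t1.id
""").fetchall()

print(len(join_rows), len(exists_rows))
```

With the duplicate present, the join version returns ID 123 twice, while the EXISTS version keeps one row per table1 record.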

Select data from multiple tables using one common table in SQL Server 2005

I have many tables and one common table that holds the ids of all these tables,
e.g.:
Table1
| ID | VALUE | DATE |
---------------------------
| 1 | 200 | 25/04/2013 |
| 2 | 250 | 26/05/2013 |
Table2
| ID | VALUE | DATE |
---------------------------
| 1 | 300 | 25/05/2013 |
| 2 | 100 | 12/02/2013 |
Table3
| ID | VALUE | DATE |
---------------------------
| 1 | 500 | 5/04/2013 |
| 2 | 100 | 1/01/2013 |
and one common table
| ID | TABLE | TABLEID |
-------------------------
| 1 | table1 | 1 |
| 2 | table3 | 1 |
| 3 | table2 | 1 |
| 4 | table1 | 2 |
| 5 | table2 | 2 |
| 6 | table3 | 2 |
Using this common table, I need to select all the data in the above 3 tables,
e.g.:
output
id table tableid value date
1 table1 1 200 25/04/2013
2 table3 1 500 5/04/2013
3 table2 1 300 25/05/2013
4 table1 2 250 26/05/2013
5 table2 2 100 12/02/2013
6 table3 2 100 1/01/2013
If you don't want to use UNION ALL, you can get the same result with LEFT JOIN and COALESCE, like this:
SELECT c.*
, COALESCE(t1.Value, t2.Value,t3.Value) AS Value
, COALESCE(t1.Date, t2.Date,t3.Date) AS Date
FROM Common c
LEFT JOIN Table1 t1 ON c.tableid = t1.[id]
AND [Table] = 'table1'
LEFT JOIN Table2 t2 ON c.tableid = t2.[id]
AND [Table] = 'table2'
LEFT JOIN Table3 t3 ON c.tableid = t3.[id]
AND [Table] = 'table3'
ORDER BY ID;
This way you avoid combining the per-table results with UNION ALL, but with this data structure you still have to join every table.
You need to join each table with the common table separately, then combine the results using UNION ALL:
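The LEFT JOIN plus COALESCE idea can be tried on the question's sample data. This is a minimal sketch, assuming Python's built-in sqlite3 module in place of SQL Server 2005; the common table is called Common here, and each left join targets its own table (Table1, Table2, Table3), so for any given row only one of the three joins can supply non-NULL values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, value INTEGER, date TEXT);
    CREATE TABLE Table2 (id INTEGER, value INTEGER, date TEXT);
    CREATE TABLE Table3 (id INTEGER, value INTEGER, date TEXT);
    CREATE TABLE Common (id INTEGER, "table" TEXT, tableid INTEGER);
    INSERT INTO Table1 VALUES (1, 200, '25/04/2013'), (2, 250, '26/05/2013');
    INSERT INTO Table2 VALUES (1, 300, '25/05/2013'), (2, 100, '12/02/2013');
    INSERT INTO Table3 VALUES (1, 500, '5/04/2013'), (2, 100, '1/01/2013');
    INSERT INTO Common VALUES
        (1, 'table1', 1), (2, 'table3', 1), (3, 'table2', 1),
        (4, 'table1', 2), (5, 'table2', 2), (6, 'table3', 2);
""")

# Each join only matches when the row's "table" column names that table,
# so COALESCE picks the single non-NULL value and date per row.
rows = conn.execute("""
    SELECT c.id, c."table", c.tableid,
           COALESCE(t1.value, t2.value, t3.value) AS value,
           COALESCE(t1.date, t2.date, t3.date) AS date
    FROM Common c
    LEFT JOIN Table1 t1 ON c.tableid = t1.id AND c."table" = 'table1'
    LEFT JOIN Table2 t2 ON c.tableid = t2.id AND c."table" = 'table2'
    LEFT JOIN Table3 t3 ON c.tableid = t3.id AND c."table" = 'table3'
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

The six rows printed should line up with the output table in the question.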
SELECT *
FROM Common c
JOIN Table1 t1 ON c.tableid = t1.[id]
AND [Table] = 'table1'
UNION ALL
SELECT *
FROM Common c
JOIN Table2 t2 ON c.tableid = t2.[id]
AND [Table] = 'table2'
UNION ALL
SELECT *
FROM Common c
JOIN Table3 t3 ON c.tableid = t3.[id]
AND [Table] = 'table3';
You can UNION ALL the tables, adding a flag column in the process, and then JOIN the result with the common table:
WITH CTE_Tables AS
(
SELECT 'Table1' AS Tab, * FROM Table1
UNION ALL
SELECT 'Table2' AS Tab, * FROM Table2
UNION ALL
SELECT 'Table3' AS Tab, * FROM Table3
)
SELECT *
FROM CommonTable c1
LEFT JOIN CTE_Tables cte ON cte.ID = c1.TableID AND cte.Tab = c1.[Table]

When joining a table to itself to identify duplicate data in a column, how do you keep it from returning the inverse in the results?

I am trying to write a SQL statement that returns the list of duplicate items I find in a table. For the sake of simplicity, imagine a table named TEST with a rowid column and a text column called column1 with the following data:
rowid | column1
---------------
1 | A
2 | B
3 | C
4 | A
5 | B
6 | C
7 | D
The query I currently have is:
select t1.rowid, t1.column1, t2.rowid, t2.column1
from test t1
inner join test t2 on t1.column1 = t2.column1 and t1.rowid <> t2.rowid
It gives me the following results, as I would expect it to do:
rowid | column1 | rowid | column1
---------------------------------
1 | A | 4 | A
2 | B | 5 | B
3 | C | 6 | C
4 | A | 1 | A
5 | B | 2 | B
6 | C | 3 | C
What I really want is just:
rowid | column1 | rowid | column1
---------------------------------
1 | A | 4 | A
2 | B | 5 | B
3 | C | 6 | C
What black SQL magic do I need to call upon in order to get my desired result?
select t1.rowid, t1.column1, t2.rowid, t2.column1
from test t1
inner join test t2 on t1.column1 = t2.column1 and t1.rowid < t2.rowid
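Replacing <> with < works because each unordered pair of rows satisfies the strict inequality in exactly one direction. A minimal sketch that compares the two join conditions on the question's data, assuming Python's built-in sqlite3 module and SQLite's implicit rowid column standing in for the asker's rowid:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (column1 TEXT);
    INSERT INTO test (rowid, column1) VALUES
        (1, 'A'), (2, 'B'), (3, 'C'), (4, 'A'), (5, 'B'), (6, 'C'), (7, 'D');
""")

# Template with a placeholder for the comparison operator under test.
query = """
    SELECT t1.rowid, t1.column1, t2.rowid, t2.column1
    FROM test t1
    INNER JOIN test t2
        ON t1.column1 = t2.column1 AND t1.rowid {op} t2.rowid
    ORDER BY t1.rowid
"""

both_directions = conn.execute(query.format(op="<>")).fetchall()
one_direction = conn.execute(query.format(op="<")).fetchall()

print(len(both_directions))  # every pair appears twice
print(one_direction)         # every pair appears once
```

The < version should return the three rows the question asks for, while the <> version returns each of them a second time with the sides swapped.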
Another approach to produce results in the same form as the original table:
SELECT t.rowid, t.column1
FROM (SELECT column1
FROM test
GROUP BY column1
HAVING COUNT(*) > 1) q
INNER JOIN test t
ON q.column1 = t.column1
ORDER BY t.column1, t.rowid
Have you tried this?
select min(rowid), column1, max(rowid), column1
from test
group by column1
having count(*)>1
This avoids self-joins and subqueries, so it is likely to be faster, but it only reports the lowest and highest rowid for each value.
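One caveat with the min/max form is worth checking before relying on it: when a value occurs more than twice, the rows between the lowest and highest rowid are not listed. The sketch below assumes Python's built-in sqlite3 module and adds a hypothetical third 'A' row (rowid 8) that is not in the question's data; a COUNT(*) column is included so the gap is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (column1 TEXT);
    INSERT INTO test (rowid, column1) VALUES
        (1, 'A'), (2, 'B'), (3, 'C'), (4, 'A'), (5, 'B'), (6, 'C'), (7, 'D'),
        (8, 'A');  -- hypothetical extra duplicate, added for illustration
""")

# One row per duplicated value: lowest rowid, highest rowid, and how many
# rows share the value. Rowid 4 never shows up for 'A'.
grouped = conn.execute("""
    SELECT MIN(rowid), column1, MAX(rowid), COUNT(*)
    FROM test
    GROUP BY column1
    HAVING COUNT(*) > 1
    ORDER BY column1
""").fetchall()

print(grouped)
```

If every duplicate needs to be listed, the self-join or the GROUP BY subquery joined back to the table is the safer choice.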

MySQL get data from another table with duplicate ID/data

How can I query the rows of table_1 whose ID does not appear in table_2, given that table_2 contains duplicate IDs? See the example below.
I want to get IDs 5 and 6 from Table 1, because they are missing from Table 2.
Table 1
-------------
| ID | Name |
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
| 5 | e |
| 6 | f |
-------------
Table 2
-------------
| Table 1 ID |
| 1 |
| 1 |
| 2 |
| 2 |
| 2 |
| 3 |
| 4 |
-------------
Thanks!
A MINUS-style query would help here; since MySQL has no MINUS operator, it can be replaced with a LEFT JOIN.
For your data this would look like this:
SELECT table_1.id FROM table_1 LEFT JOIN table_2 ON table_2.id = table_1.id WHERE table_2.id IS NULL
Use:
SELECT t1.id
FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2 ON t2.id = t1.id
WHERE t2.id IS NULL
Using NOT EXISTS:
SELECT t1.id
FROM TABLE_1 t1
WHERE NOT EXISTS(SELECT NULL
FROM TABLE_2 t2
WHERE t2.id = t1.id)
Using NOT IN:
SELECT t1.id
FROM TABLE_1 t1
WHERE t1.id NOT IN (SELECT t2.id
FROM TABLE_2 t2)
Because there shouldn't be NULL values in table_2's id column, LEFT JOIN/IS NULL is the fastest of these in MySQL, according to this comparison: http://explainextended.com/2009/09/18/not-in-vs-not-exists-vs-left-join-is-null-mysql/
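The NULL caveat matters for correctness as well as speed. The following sketch, assuming Python's built-in sqlite3 module rather than MySQL, runs all three forms on the question's data and then adds a hypothetical NULL id to table_2 to show how NOT IN reacts. It checks results only and says nothing about performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (id INTEGER, name TEXT);
    CREATE TABLE table_2 (id INTEGER);
    INSERT INTO table_1 VALUES
        (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e'), (6, 'f');
    INSERT INTO table_2 VALUES (1), (1), (2), (2), (2), (3), (4);
""")

queries = {
    "left_join": """SELECT t1.id FROM table_1 t1
                    LEFT JOIN table_2 t2 ON t2.id = t1.id
                    WHERE t2.id IS NULL ORDER BY t1.id""",
    "not_exists": """SELECT t1.id FROM table_1 t1
                     WHERE NOT EXISTS (SELECT NULL FROM table_2 t2
                                       WHERE t2.id = t1.id)
                     ORDER BY t1.id""",
    "not_in": """SELECT t1.id FROM table_1 t1
                 WHERE t1.id NOT IN (SELECT t2.id FROM table_2 t2)
                 ORDER BY t1.id""",
}

# Without NULLs in table_2, all three forms agree.
results = {name: [r[0] for r in conn.execute(sql)]
           for name, sql in queries.items()}
print(results)

# Hypothetical NULL id: NOT IN now evaluates to NULL for every candidate
# row, so it returns nothing, while NOT EXISTS is unaffected.
conn.execute("INSERT INTO table_2 VALUES (NULL)")
not_in_with_null = [r[0] for r in conn.execute(queries["not_in"])]
not_exists_with_null = [r[0] for r in conn.execute(queries["not_exists"])]
print(not_in_with_null, not_exists_with_null)
```

This is why the choice between the three forms usually comes down to whether the compared column is nullable.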
If I am understanding you correctly you want to do an outer join. In this case it would be:
SELECT * FROM
table_1 LEFT JOIN table_2
ON table_1.id = table_2.id
WHERE table_2.id is NULL
This one does what you want:
Select t1.id
From table1 t1
Left Join table2 t2
On t2.id = t1.id
Where t2.id Is Null
Result:
id
--
5
6