Join number of pairs in a single table using SQL - sql

I have two tables of events in BigQuery that look as follows. The main idea is to count the number of events in each table (rows are always pairs of event_id and user_id) and join the counts into a single table that, for each pair appearing in either table, gives the number of events.
table 1:
| event_id | user_id |
| -------- | ------- |
| 1 | 1 |
| 2 | 1 |
| 2 | 3 |
| 2 | 5 |
| 1 | 1 |
| 4 | 7 |
table 2:
| event_id | user_id |
| -------- | ------- |
| 1 | 1 |
| 3 | 1 |
| 2 | 3 |
I would like to get a table which has the number of events of each table:
| event_id | user_id | num_events_table1 | num_events_table2 |
| -------- | ------- | ----------------- | ----------------- |
| 1 | 1 | 2 | 1 |
| 2 | 1 | 1 | 0 |
| 2 | 3 | 1 | 1 |
| 2 | 5 | 1 | 0 |
| 4 | 7 | 1 | 0 |
| 3 | 1 | 0 | 1 |
Any idea of how to do this with sql? I have tried this:
SELECT i1, e1, num_viewed, num_displayed FROM
(SELECT id as i1, event as e1, count(*) as num_viewed
FROM table_1
group by id, event) a
full outer JOIN (SELECT id as i2, event as e2, count(*) as num_displayed
FROM table_2
group by id, event) b
on a.i1 = b.i2 and a.e1 = b.e2
This is not giving exactly what I want. I am getting i1 values that are null and e1 values that are null.

Consider below
#standardSQL
with `project.dataset.table1` as (
select 1 event_id, 1 user_id union all
select 2, 1 union all
select 2, 3 union all
select 2, 5 union all
select 1, 1 union all
select 4, 7
), `project.dataset.table2` as (
select 1 event_id, 1 user_id union all
select 3, 1 union all
select 2, 3
)
select event_id, user_id,
countif(source = 1) as num_events_table1,
countif(source = 2) as num_events_table2
from (
select 1 source, * from `project.dataset.table1`
union all
select 2, * from `project.dataset.table2`
)
group by event_id, user_id
If applied to the sample data in your question, the output matches the expected table shown there (row order may differ).

If I understand correctly, the simplest method is to modify your query via a USING clause along with COALESCE():
SELECT id, event, COALESCE(num_viewed, 0), COALESCE(num_displayed, 0)
FROM (SELECT id, event, count(*) as num_viewed
FROM table_1
GROUP BY id, event
) t1 FULL JOIN
(SELECT id , event, COUNT(*) as num_displayed
FROM table_2
GROUP BY id, event
) t2
USING (id, event);
Note: This requires that the two columns used for the JOIN have the same name. If this is not the case, then you might still need column aliases in the subqueries.

One way is to aggregate the union:
select event_id, user_id, sum(cnt1) cnt1, sum(cnt2) cnt2
from (
select event_id, user_id, 1 cnt1, 0 cnt2
from table_1
union all
select event_id, user_id, 0 cnt1, 1 cnt2
from table_2 ) t
group by event_id, user_id
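To check the union-then-aggregate idea against the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The in-memory database and the function name count_events_per_pair are assumptions made for illustration; the original tables live in BigQuery, where the same SQL shape should apply.

```python
import sqlite3


def count_events_per_pair():
    """Count events per (event_id, user_id) pair in each table."""
    conn = sqlite3.connect(":memory:")
    # Sample rows copied from the question.
    conn.executescript("""
        CREATE TABLE table_1 (event_id INTEGER, user_id INTEGER);
        CREATE TABLE table_2 (event_id INTEGER, user_id INTEGER);
        INSERT INTO table_1 VALUES (1,1),(2,1),(2,3),(2,5),(1,1),(4,7);
        INSERT INTO table_2 VALUES (1,1),(3,1),(2,3);
    """)
    # Tag each row with a 1 in the column of its source table, then sum.
    rows = conn.execute("""
        SELECT event_id, user_id,
               SUM(cnt1) AS num_events_table1,
               SUM(cnt2) AS num_events_table2
        FROM (
            SELECT event_id, user_id, 1 AS cnt1, 0 AS cnt2 FROM table_1
            UNION ALL
            SELECT event_id, user_id, 0 AS cnt1, 1 AS cnt2 FROM table_2
        ) t
        GROUP BY event_id, user_id
        ORDER BY event_id, user_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in count_events_per_pair():
        print(row)
```

Because every pair from either table ends up in the union, no outer join is needed and pairs missing from one table get a count of 0 rather than NULL.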

Related

Grouping data using PostgreSQL based on 2 fields

I have a problem with grouping data in PostgreSQL. Let's say that I have a table called my_table:
some_id | description | other_id
---------|-----------------|-----------
1 | description-1 | a
1 | description-2 | b
2 | description-3 | a
2 | description-4 | a
3 | description-5 | a
3 | description-6 | b
3 | description-7 | b
4 | description-8 | a
4 | description-9 | a
4 | description-10 | a
...
I would like to group my rows by some_id, then distinguish the groups whose rows all share the same other_id from the groups that contain different other_id values.
I expect two queries: one returning the groups with the same other_id and one returning the groups with different other_id values.
Expected result
some_id | description | other_id
---------|-----------------|-----------
2 | description-3 | a
2 | description-4 | a
4 | description-8 | a
4 | description-9 | a
4 | description-10 | a
AND
some_id | description | other_id
---------|-----------------|-----------
1 | description-1 | a
1 | description-2 | b
3 | description-5 | a
3 | description-6 | b
3 | description-7 | b
I am open to suggestions using either Sequelize or a raw query.
Thank you.
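One approach, using MIN and MAX as analytic functions:
WITH cte AS (
SELECT *, MIN(other_id) OVER (PARTITION BY some_id) min_other_id,
MAX(other_id) OVER (PARTITION BY some_id) max_other_id
FROM my_table
)
-- every row for this some_id has the same other_id
SELECT some_id, description, other_id
FROM cte
WHERE min_other_id = max_other_id;
-- a CTE is scoped to a single statement, so it is repeated for the second query
WITH cte AS (
SELECT *, MIN(other_id) OVER (PARTITION BY some_id) min_other_id,
MAX(other_id) OVER (PARTITION BY some_id) max_other_id
FROM my_table
)
-- this some_id has at least two different other_id values
SELECT some_id, description, other_id
FROM cte
WHERE min_other_id <> max_other_id;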
You can also do this using exists and not exists:
-- all same
select t.*
from my_table t
where not exists (select 1
from my_table t2
where t2.some_id = t.some_id and t2.other_id <> t.other_id
);
-- any different
select t.*
from my_table t
where exists (select 1
from my_table t2
where t2.some_id = t.some_id and t2.other_id <> t.other_id
);
Note that this ignores NULL values. If you want them treated as a "different" value then use is distinct from rather than <>.
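The exists / not exists pair can be tried on the sample rows from the question. The following is a sketch using Python's built-in sqlite3 module instead of PostgreSQL; the names ROWS and split_by_other_id are hypothetical, and ordering by SQLite's rowid is only there to keep the insertion order in the output.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (1, "description-1", "a"), (1, "description-2", "b"),
    (2, "description-3", "a"), (2, "description-4", "a"),
    (3, "description-5", "a"), (3, "description-6", "b"),
    (3, "description-7", "b"),
    (4, "description-8", "a"), (4, "description-9", "a"),
    (4, "description-10", "a"),
]


def split_by_other_id(rows):
    """Return (same, different): rows whose some_id group has one other_id,
    and rows whose some_id group has more than one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (some_id INTEGER, description TEXT, other_id TEXT)"
    )
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    template = """
        SELECT t.some_id, t.description, t.other_id
        FROM my_table t
        WHERE {negate} EXISTS (
            SELECT 1 FROM my_table t2
            WHERE t2.some_id = t.some_id AND t2.other_id <> t.other_id
        )
        ORDER BY t.some_id, t.rowid
    """
    same = conn.execute(template.format(negate="NOT")).fetchall()
    different = conn.execute(template.format(negate="")).fetchall()
    conn.close()
    return same, different


if __name__ == "__main__":
    same, different = split_by_other_id(ROWS)
    print("same:", same)
    print("different:", different)
```

On this data the first list holds the rows for some_id 2 and 4, and the second the rows for some_id 1 and 3, which is the split the question asks for.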

SQL query for finding records where count < 2

I have a table called Customer:
|device_id|user_id|
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 5 | 2 |
| 6 | 2 |
| 7 | 3 |
Now I want to return only the entries which have only 1 device per user. In this case only
|device_id|user_id|
| 7 | 3 |
Should be returned because user_id 3 is the only one with only 1 device (user_id 1 has 4, user_id 2 has 2)
How would I do that with a query?
One method is not exists:
select t.*
from t
where not exists (select 1
from t t2
where t2.user_id = t.user_id and t2.device_id <> t.device_id
);
You can also use aggregation, grouping by user_id so that the count is per user:
select max(device_id) as device_id, user_id
from t
group by user_id
having count(*) = 1;
We can use GROUP BY to group the data by user_id, followed by an aggregate function to get the count of devices:
SELECT device_id, user_id FROM customer WHERE user_id IN
(
SELECT user_id FROM
(
SELECT user_id, count(*) AS device_count FROM customer GROUP BY user_id HAVING count(*) < 2
) single_device_users
);
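As a quick check of the aggregation idea, here is a small sketch that groups by user_id and keeps only groups with exactly one row, run with Python's built-in sqlite3 module on the sample data. The function name single_device_rows is made up for the example, and the table is called customer as in the question.

```python
import sqlite3


def single_device_rows():
    """Return (device_id, user_id) for users that own exactly one device."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (device_id INTEGER, user_id INTEGER)")
    # Sample rows copied from the question.
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?)",
        [(1, 1), (2, 1), (3, 1), (4, 1), (5, 2), (6, 2), (7, 3)],
    )
    # With one row per group, MAX(device_id) is simply that row's device_id.
    rows = conn.execute("""
        SELECT MAX(device_id) AS device_id, user_id
        FROM customer
        GROUP BY user_id
        HAVING COUNT(*) = 1
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(single_device_rows())
```

Grouping by user_id is the important part: grouping by device_id would put every row in its own group and return the whole table.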

Copy one table to another table with different columns

I have a TableA whose columns are (id, name, A, B, C, p_id).
I want to convert TableA to TableB, whose columns are (id, name, alphabets, alphabets_value, p_id).
Records in TableA:
id | name | A | B | C | p_id
1 | xyz | a | b | | 1
2 | opq | a`| b`| c`| 1
Expected In TableB
u_id | id | name | alphabets | alphabets_value | p_id
1 | 1 | xyz | A | a | 1
2 | 1 | xyz | B | b | 1
3 | 2 | opq | A | a` | 1
4 | 2 | opq | B | b` | 1
5 | 2 | opq | C | c` | 1
I want the TableB output. I am currently using Microsoft SQL.
This is an unpivot, probably most easily explained by a UNION ALL:
SELECT id, name, 'A' as alphabets, a as alphabets_value, p_id FROM TableA
UNION ALL
SELECT id, name, 'B' as alphabets, b as alphabets_value, p_id FROM TableA
UNION ALL
SELECT id, name, 'C' as alphabets, c as alphabets_value, p_id FROM TableA
You can then WHERE to remove the nulls from this, and ROW_NUMBER to give yourself a fake U_id:
SELECT ROW_NUMBER() OVER(ORDER BY id, alphabets) as u_id, x.*
FROM
(
SELECT id, name, 'A' as alphabets, a as alphabets_value, p_id FROM TableA
UNION ALL
SELECT id, name, 'B' as alphabets, b as alphabets_value, p_id FROM TableA
UNION ALL
SELECT id, name, 'C' as alphabets, c as alphabets_value, p_id FROM TableA
) x
WHERE
x.alphabets_value IS NOT NULL
Once you get to having a result set you want, INSERT INTO, UPDATE FROM or MERGE to get it into table B is quite trivial
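The unpivot can be tried end to end with a small sketch. This one uses Python's built-in sqlite3 module rather than SQL Server, and assumes an SQLite build with window function support (3.25 or later) for ROW_NUMBER. The function name unpivot_table_a is hypothetical, and the empty C cell of the first record is assumed to be NULL.

```python
import sqlite3


def unpivot_table_a():
    """Turn the A/B/C columns of TableA into one row per non-null value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TableA "
        "(id INTEGER, name TEXT, A TEXT, B TEXT, C TEXT, p_id INTEGER)"
    )
    # Sample rows from the question; the blank C cell is stored as NULL.
    conn.executemany(
        "INSERT INTO TableA VALUES (?, ?, ?, ?, ?, ?)",
        [(1, "xyz", "a", "b", None, 1), (2, "opq", "a`", "b`", "c`", 1)],
    )
    rows = conn.execute("""
        SELECT ROW_NUMBER() OVER (ORDER BY id, alphabets) AS u_id, x.*
        FROM (
            SELECT id, name, 'A' AS alphabets, A AS alphabets_value, p_id FROM TableA
            UNION ALL
            SELECT id, name, 'B' AS alphabets, B AS alphabets_value, p_id FROM TableA
            UNION ALL
            SELECT id, name, 'C' AS alphabets, C AS alphabets_value, p_id FROM TableA
        ) x
        WHERE x.alphabets_value IS NOT NULL
        ORDER BY u_id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in unpivot_table_a():
        print(row)
```

If the blank cells are empty strings rather than NULL in the real table, the WHERE clause would need to filter on that value as well.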

How to get count from one table which is mutually dependent to another table

I have two tables.
The first table is QC_Meeting_Master.
The second table is QC_Project_Master. I want to calculate the count of Problem_ID values in the second table for each Problems_ID in the first table.
QC_Meeting_Master:
ID | QC_ID | Problems_ID |
___|_______|_____________|
1 | 1 | 2 |
2 | 1 | 7 |
QC_Project_Master:
ID | QC_ID | Problem_ID |
___|_______|_____________|
1 | 1 | 7 |
2 | 1 | 7 |
3 | 1 | 7 |
4 | 1 | 7 |
5 | 1 | 2 |
6 | 1 | 2 |
7 | 1 | 2 |
select COUNT(Problem_ID) from [QC_Project_Master] where Problem_ID in
(select Problems_ID from QC_Meeting_Master QMM join QC_Project_Master QPM on QMM.Problems_ID = QPM.Problem_ID)
I have to calculate the count of QC_Project_Master (Problem_ID) on the basis of QC_Meeting_Master (Problems_ID).
That is, for QC_Meeting_Master (Problems_ID) = 2,
the count should be 3,
and for QC_Meeting_Master (Problems_ID) = 7,
the count should be 4.
Use conditional aggregation:
select sum(case when t2.Problem_ID=2 then 1 else 0 end),
sum(case when t2.Problem_ID=7 then 1 else 0 end) from
QC_Meeting_Master t1 join QC_Project_Master t2 on t1.QC_ID=t2.QC_ID and t1.Problems_ID=t2.Problem_ID
If you need the count for every group, use the query below:
select t2.QC_ID,t2.Problem_ID, count(*) from
QC_Meeting_Master t1 join QC_Project_Master t2
on t1.QC_ID=t2.QC_ID and t1.Problems_ID=t2.Problem_ID
group by t2.QC_ID,t2.Problem_ID
As far as I understand your problem, this is a simple aggregation and JOIN, as below:
SELECT mm.QC_ID, mm.Problems_ID, pm.cnt
FROM QC_Meeting_Master mm
INNER JOIN
(
SELECT QC_ID, Problem_ID, COUNT(*) cnt
FROM QC_Project_Master
GROUP BY QC_ID, Problem_ID
) pm
ON pm.QC_ID = mm.QC_ID AND pm.Problem_ID = mm.Problems_ID;
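To see the aggregate-then-join approach produce the counts the question expects (3 for problem 2, 4 for problem 7), here is a sketch run with Python's built-in sqlite3 module. It uses the column names from the question's tables (Problems_ID in QC_Meeting_Master, Problem_ID in QC_Project_Master); the function name problem_counts is an assumption for the example.

```python
import sqlite3


def problem_counts():
    """Count QC_Project_Master rows per problem listed in QC_Meeting_Master."""
    conn = sqlite3.connect(":memory:")
    # Sample rows copied from the question.
    conn.executescript("""
        CREATE TABLE QC_Meeting_Master (ID INTEGER, QC_ID INTEGER, Problems_ID INTEGER);
        CREATE TABLE QC_Project_Master (ID INTEGER, QC_ID INTEGER, Problem_ID INTEGER);
        INSERT INTO QC_Meeting_Master VALUES (1, 1, 2), (2, 1, 7);
        INSERT INTO QC_Project_Master VALUES
            (1, 1, 7), (2, 1, 7), (3, 1, 7), (4, 1, 7),
            (5, 1, 2), (6, 1, 2), (7, 1, 2);
    """)
    # Aggregate the project table first, then join the counts to the meetings.
    rows = conn.execute("""
        SELECT mm.QC_ID, mm.Problems_ID, pm.cnt
        FROM QC_Meeting_Master mm
        INNER JOIN (
            SELECT QC_ID, Problem_ID, COUNT(*) AS cnt
            FROM QC_Project_Master
            GROUP BY QC_ID, Problem_ID
        ) pm
        ON pm.QC_ID = mm.QC_ID AND pm.Problem_ID = mm.Problems_ID
        ORDER BY mm.Problems_ID
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(problem_counts())
```

Aggregating before the join keeps the counts correct even if QC_Meeting_Master ever lists the same problem more than once for a QC_ID, since each meeting row simply picks up the precomputed count.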

SQL select distinct when one column in and another column greater than

Consider the following dataset:
+---------------------+
| ID | NAME | VALUE |
+---------------------+
| 1 | a | 0.2 |
| 1 | b | 8 |
| 1 | c | 3.5 |
| 1 | d | 2.2 |
| 2 | b | 4 |
| 2 | c | 0.5 |
| 2 | d | 6 |
| 3 | a | 2 |
| 3 | b | 4 |
| 3 | c | 3.6 |
| 3 | d | 0.2 |
+---------------------+
I'm trying to develop a SQL select statement that returns the top or distinct ID where NAME 'a' and 'b' both exist and both of the corresponding VALUEs are >= 1. Thus, the desired output would be:
+---------------------+
| ID | NAME | VALUE |
+---------------------+
| 3 | a | 2 |
+----+-------+--------+
Appreciate any assistance anyone can provide.
You can try the MIN window function with a conditional expression:
SELECT * FROM (
SELECT *,
MIN(CASE WHEN NAME = 'a' THEN [value] end) OVER(PARTITION BY ID) aVal,
MIN(CASE WHEN NAME = 'b' THEN [value] end) OVER(PARTITION BY ID) bVal
FROM T
) t1
WHERE aVal >= 1 and bVal >= 1 and NAME = 'a'
This seems like a group by and having query:
select id
from t
where name in ('a', 'b')
group by id
having count(*) = 2 and
min(value) >= 1;
No subqueries or joins are necessary.
The where clause filters the data to only look at the "a" and "b" records. The count(*) = 2 checks that both exist. If you can have duplicates, then use count(distinct name) = 2.
Then, you want the minimum value to be at least 1, so that is the final condition.
I am not sure why your desired results have the "a" row, but if you really want it, you can change the select to:
select id, 'a' as name,
max(case when name = 'a' then value end) as value
You can use IN with a subquery:
select top 1 * from t
where t.id in
(
select id from t
where name in ('a','b')
group by id
having sum(case when value >= 1 then 1 else 0 end) >= 2
)
order by id, name
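The group by / having filter can be checked against the sample dataset with a short sketch. It runs on Python's built-in sqlite3 module, so it uses plain GROUP BY and HAVING and avoids the SQL Server specific TOP; the function name ids_with_a_and_b is hypothetical.

```python
import sqlite3


def ids_with_a_and_b():
    """Return IDs that have both an 'a' and a 'b' row with VALUE >= 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, name TEXT, value REAL)")
    # Sample rows copied from the question.
    conn.executemany(
        "INSERT INTO t VALUES (?, ?, ?)",
        [
            (1, "a", 0.2), (1, "b", 8), (1, "c", 3.5), (1, "d", 2.2),
            (2, "b", 4), (2, "c", 0.5), (2, "d", 6),
            (3, "a", 2), (3, "b", 4), (3, "c", 3.6), (3, "d", 0.2),
        ],
    )
    rows = conn.execute("""
        SELECT id
        FROM t
        WHERE name IN ('a', 'b')
        GROUP BY id
        HAVING COUNT(DISTINCT name) = 2 AND MIN(value) >= 1
        ORDER BY id
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(ids_with_a_and_b())
```

ID 1 is rejected because its 'a' value is 0.2, and ID 2 is rejected because it has no 'a' row, which leaves only ID 3.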