Microsoft SQL with more than one distinct COUNT and WHERE clause - sql

I have data in Microsoft SQL Server in the following format:
id1 id2 month quantA quantB
1 10 1 5 15
1 10 1 10 20
1 10 2 5 10
1 10 2 10 NULL
1 11 1 NULL NULL
1 11 2 5 NULL
1 11 2 10 5
2 10 1 10 20
2 10 1 5 NULL
2 11 2 NULL NULL
I need to construct a table grouped by id1 and month with the following columns:
id1
month
var1 = number of *distinct* id2 values, per month and id1, for which quantA is not NULL
var2 = number of *distinct* id2 values, per month and id1, for which quantB is not NULL

You can write the query almost exactly as you described it:
select id1, month,
count(distinct case when quantA is not null then id2 end) as var1,
count(distinct case when quantB is not null then id2 end) as var2
from t
group by id1, month
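COUNT(DISTINCT ...) ignores NULLs, and a CASE expression with no ELSE branch returns NULL, so an id2 is only counted when the corresponding quantity is not NULL.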
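To check the query against the sample rows from the question, here is a minimal T-SQL sketch. The temp table name #t is an assumption made for illustration, and [month] is bracketed purely as a precaution.

```sql
-- Sample data from the question, loaded into a temp table (#t is an assumed name).
CREATE TABLE #t (id1 int, id2 int, [month] int, quantA int, quantB int);

INSERT INTO #t (id1, id2, [month], quantA, quantB) VALUES
(1, 10, 1, 5,    15),
(1, 10, 1, 10,   20),
(1, 10, 2, 5,    10),
(1, 10, 2, 10,   NULL),
(1, 11, 1, NULL, NULL),
(1, 11, 2, 5,    NULL),
(1, 11, 2, 10,   5),
(2, 10, 1, 10,   20),
(2, 10, 1, 5,    NULL),
(2, 11, 2, NULL, NULL);

-- Each CASE yields id2 when the quantity is present and NULL otherwise;
-- COUNT(DISTINCT ...) then counts only the non-NULL id2 values.
SELECT id1, [month],
       COUNT(DISTINCT CASE WHEN quantA IS NOT NULL THEN id2 END) AS var1,
       COUNT(DISTINCT CASE WHEN quantB IS NOT NULL THEN id2 END) AS var2
FROM #t
GROUP BY id1, [month]
ORDER BY id1, [month];

-- Result for the sample rows:
-- id1  month  var1  var2
-- 1    1      1     1
-- 1    2      2     2
-- 2    1      1     1
-- 2    2      0     0

DROP TABLE #t;
```

SQL Server may print a warning that NULL values were eliminated by an aggregate; that is expected here and does not affect the result.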

row_number() but only increment value after a specific value in a column

Query: SELECT (row_number() OVER ()) as grp, * from tbl
Edit: the rows below are returned by a pgRouting shortest-path function, and they do have a sequence column (seq).
seq grp id
1 1 8
2 2 3
3 3 2
4 4 null
5 5 324
6 6 82
7 7 89
8 8 null
9 9 1
10 10 2
11 11 90
12 12 null
How do I make the grp column increment only after a null value in id, while keeping the same order of rows? Expected output:
seq grp id
1 1 8
2 1 3
3 1 2
4 1 null
5 2 324
6 2 82
7 2 89
8 2 null
9 3 1
10 3 2
11 3 90
12 3 null
Using a cumulative SUM aggregation is a possible approach:
SELECT
SUM( -- 2
CASE WHEN id IS NULL THEN 1 ELSE 0 END -- 1
) OVER (ORDER BY seq) as grp,
id
FROM mytable
1. If the current (ordered!) value is NULL, make it 1, otherwise 0. This gives you runs of zeros, delimited by a 1 at each NULL record. If you sum these values cumulatively, the sum increases at each NULL record.
2. The cumulative SUM() is executed as a window function, ordered by seq.
This yields:
0 8
0 3
0 2
1 null
1 324
1 82
1 89
2 null
2 1
2 2
2 90
3 null
As you can see, each group starts at a NULL record, but you expect each group to end with one.
This can be achieved by adding another window function, LAG(), which makes the previous row's value available on the current row:
SELECT
    SUM(
        CASE WHEN prev_id IS NULL THEN 1 ELSE 0 END
    ) OVER (ORDER BY seq) as grp,
    id
FROM (
    SELECT
        -- LAG() returns the id of the previous row (NULL for the first row)
        LAG(id) OVER (ORDER BY seq) as prev_id,
        seq,
        id
    FROM mytable
) s
The result matches your expected output:
1 8
1 3
1 2
1 null
2 324
2 82
2 89
2 null
3 1
3 2
3 90
3 null
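The same grouping can also be sketched without a subquery by counting the NULL ids strictly before the current row. This is a different technique from the LAG() approach above, shown here for PostgreSQL with the question's sample rows inlined as a CTE (the CTE stands in for the real table, which is an assumption).

```sql
-- Sample rows from the question, inlined so the query runs on its own.
WITH mytable (seq, id) AS (
    VALUES (1, 8),   (2, 3),   (3, 2),   (4, NULL),
           (5, 324), (6, 82),  (7, 89),  (8, NULL),
           (9, 1),   (10, 2),  (11, 90), (12, NULL)
)
SELECT
    seq,
    -- Count the NULL ids in all rows before the current one
    -- (the frame excludes the current row), then add 1 so the
    -- numbering starts at 1. COUNT over an empty frame is 0.
    1 + COUNT(CASE WHEN id IS NULL THEN 1 END) OVER (
            ORDER BY seq
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
        ) AS grp,
    id
FROM mytable
ORDER BY seq;

-- grp for seq 1..12: 1 1 1 1 2 2 2 2 3 3 3 3
```

Because the frame ends one row before the current row, a NULL record still belongs to the group it closes, which is what the expected output asks for.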

SQL: subset data: select id when time_id for id satisfy a condition from another column

I have data (dt) in SQL like the following:
ID time_id act rd
11 1 1 1
11 2 4 1
11 3 7 0
12 1 8 1
12 2 2 0
12 3 4 1
12 4 3 1
12 5 4 1
13 1 4 1
13 2 1 0
15 1 3 1
16 1 8 0
16 2 8 0
16 3 8 0
16 4 8 0
16 5 8 0
I want to take the subset of this data such that only the IDs that have a row with time_id = 5 are retained, together with all of their rows (time_id, act, rd). The desired output is the following:
ID time_id act rd
12 1 8 1
12 2 2 0
12 3 4 1
12 4 3 1
12 5 4 1
16 1 8 0
16 2 8 0
16 3 8 0
16 4 8 0
16 5 8 0
I know I should use a HAVING clause somehow, but I have not been successful so far (my queries return empty output). Below is my attempt:
SELECT * FROM dt
GROUP BY ID
Having min(time_id) == 5;
This query:
select id from tablename where time_id = 5
returns all the ids that you want in the results.
Use it with the operator IN:
select *
from tablename
where id in (select id from tablename where time_id = 5)
You can use a correlated subquery with exists:
select t.*
from t
where exists (select 1 from t t2 where t2.id = t.id and t2.time_id = 5);
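Another option is a common table expression that collects the matching ids, joined back to the table:
WITH temp AS
(
SELECT id FROM tab WHERE time_id = 5
)
SELECT t.* FROM tab t JOIN temp tp ON t.id = tp.id;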
You can also join against the distinct ids that have time_id = 5:
select t1.* from tablename t1 join (select distinct id from tablename where time_id = 5) t2 on t1.id = t2.id;
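Since the question asks about HAVING, here is a hedged sketch of how a HAVING-based version could look. The attempt in the question returns nothing because MIN(time_id) is 1 for every ID in the sample, and most engines also reject SELECT * combined with GROUP BY ID. The sketch assumes the table is named dt as in the question and that the engine accepts multi-row INSERT ... VALUES.

```sql
-- Sample data from the question.
CREATE TABLE dt (ID int, time_id int, act int, rd int);

INSERT INTO dt (ID, time_id, act, rd) VALUES
(11, 1, 1, 1), (11, 2, 4, 1), (11, 3, 7, 0),
(12, 1, 8, 1), (12, 2, 2, 0), (12, 3, 4, 1), (12, 4, 3, 1), (12, 5, 4, 1),
(13, 1, 4, 1), (13, 2, 1, 0),
(15, 1, 3, 1),
(16, 1, 8, 0), (16, 2, 8, 0), (16, 3, 8, 0), (16, 4, 8, 0), (16, 5, 8, 0);

-- The inner query groups by ID and keeps the IDs that have
-- at least one row with time_id = 5; the outer query returns
-- every row belonging to those IDs.
SELECT d.*
FROM dt d
WHERE d.ID IN (
    SELECT ID
    FROM dt
    GROUP BY ID
    HAVING SUM(CASE WHEN time_id = 5 THEN 1 ELSE 0 END) > 0
)
ORDER BY d.ID, d.time_id;

-- Returns the five rows of ID 12 and the five rows of ID 16.
```

The grouping happens only in the subquery, so the outer SELECT is free to return the ungrouped rows.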

How to count more than one value in ms sql

I have a table with a column named direction. This column only holds the int values 1 or 0. Each 1 or 0 belongs to a pair of IDs. For example:
ID1 ID2 direction
1 2 1
2 3 0
2 4 1
4 1 1
1 2 0
2 3 1
I need a select query that returns the number of 0s and the number of 1s for every ID1 and ID2 pair. How can I do that?
Edit:
The result table should look like this (the numbers do not match the example above):
ID1 ID2 0count 1count
1 2 1 4
2 3 2 2
2 2 1 1
Conditional aggregation is your friend:
SELECT ID1,
       ID2,
       SUM(CASE WHEN direction = 0 THEN 1 ELSE 0 END) AS CountDirection0,
       SUM(CASE WHEN direction = 1 THEN 1 ELSE 0 END) AS CountDirection1
FROM YourTable -- placeholder: replace with your table name
GROUP BY ID1, ID2
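To see conditional aggregation on the sample rows from the question, here is a small T-SQL sketch. The temp table name #pairs is an assumption, and the query uses COUNT over a CASE without ELSE as an equivalent alternative to the SUM form above.

```sql
-- Sample data from the question (#pairs is an assumed name).
CREATE TABLE #pairs (ID1 int, ID2 int, direction int);

INSERT INTO #pairs (ID1, ID2, direction) VALUES
(1, 2, 1),
(2, 3, 0),
(2, 4, 1),
(4, 1, 1),
(1, 2, 0),
(2, 3, 1);

-- COUNT(expr) counts only non-NULL values, and CASE without ELSE
-- returns NULL for non-matching rows, so each COUNT tallies one direction.
SELECT ID1,
       ID2,
       COUNT(CASE WHEN direction = 0 THEN 1 END) AS CountDirection0,
       COUNT(CASE WHEN direction = 1 THEN 1 END) AS CountDirection1
FROM #pairs
GROUP BY ID1, ID2
ORDER BY ID1, ID2;

-- Result for the sample rows:
-- ID1  ID2  CountDirection0  CountDirection1
-- 1    2    1                1
-- 2    3    1                1
-- 2    4    0                1
-- 4    1    0                1

DROP TABLE #pairs;
```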

SQL Aggregate on Two tables

Table A has millions of records, starting from 2014 (I am using Oracle):
ID Sales_Amount Sales_Date
1 10 20/11/2014
1 10 22/11/2014
1 10 22/12/2014
1 10 22/01/2015
1 10 22/02/2015
1 10 22/03/2015
1 10 22/04/2015
1 10 22/05/2015
1 10 22/06/2015
1 10 22/07/2015
1 10 22/08/2015
1 10 22/09/2015
1 10 22/10/2015
1 10 22/11/2015
Table B
ID ID_Date
1 22/11/2014
2 01/12/2014
I want the sum of sales for 6 months as well as for 1 year for ID 1, taking the start date from Table B (22/11/2014).
Output:
ID Sales_Amount_6Months Sales_Amount_12Months
1 70 130
Shall I use add_months in this case?
Yes, you can use ADD_MONTHS() and conditional aggregation:
SELECT b.id,
SUM(CASE WHEN a.sales_date between b.id_date AND ADD_MONTHS(b.id_date,6) THEN a.sales_amount ELSE 0 END) as sales_6_month,
SUM(CASE WHEN a.sales_date between b.id_date AND ADD_MONTHS(b.id_date,12) THEN a.sales_amount ELSE 0 END) as sales_12_month
FROM TableB b
JOIN TableA a
ON(b.id = a.id)
GROUP BY b.id
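Because Table A holds millions of rows, it may help to restrict the join to the 12-month window before aggregating. The following Oracle sketch inlines the sample rows as CTEs named TableA and TableB (stand-ins for the real tables, which is an assumption) and adds a WHERE filter that is not part of the answer above.

```sql
WITH TableA (id, sales_amount, sales_date) AS (
    SELECT 1, 10, DATE '2014-11-20' FROM dual UNION ALL
    SELECT 1, 10, DATE '2014-11-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2014-12-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-01-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-02-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-03-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-04-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-05-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-06-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-07-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-08-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-09-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-10-22' FROM dual UNION ALL
    SELECT 1, 10, DATE '2015-11-22' FROM dual
),
TableB (id, id_date) AS (
    SELECT 1, DATE '2014-11-22' FROM dual UNION ALL
    SELECT 2, DATE '2014-12-01' FROM dual
)
SELECT b.id,
       -- Only rows up to 6 months after the start date count here.
       SUM(CASE WHEN a.sales_date <= ADD_MONTHS(b.id_date, 6)
                THEN a.sales_amount ELSE 0 END) AS sales_6_month,
       -- Every row that survives the WHERE filter is within 12 months.
       SUM(a.sales_amount) AS sales_12_month
FROM TableB b
JOIN TableA a
  ON a.id = b.id
-- BETWEEN is inclusive on both ends, matching the expected 70 / 130.
WHERE a.sales_date BETWEEN b.id_date AND ADD_MONTHS(b.id_date, 12)
GROUP BY b.id;

-- Result for the sample rows: id = 1, sales_6_month = 70, sales_12_month = 130
```

ID 2 does not appear in the result because it has no rows in Table A; a LEFT JOIN with the date condition moved into the ON clause would keep it, if that is wanted.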

MySQL - Selecting records based on maximum secondary ID

Here's part of my table:
id team_id log_id
1 12 1
2 12 1
3 12 1
4 12 1
5 1 2
6 1 2
7 1 3
8 1 3
What query would produce this output (so only the records with the highest log_id values are returned that correspond to team_id)?
id team_id log_id
1 12 1
2 12 1
3 12 1
4 12 1
7 1 3
8 1 3
SELECT *
FROM mytable t
WHERE log_id = (SELECT MAX(log_id) FROM mytable WHERE team_id = t.team_id)
Alternatively, join against the per-team maximum log_id:
SELECT t1.id, t1.team_id, t1.log_id
FROM table1 t1
JOIN (SELECT team_id, MAX(log_id) max_log_id
      FROM table1
      GROUP BY team_id) t2 ON t1.team_id = t2.team_id
                          AND t1.log_id = t2.max_log_id
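On MySQL 8.0 or later, a window function is another way to express this. The sketch below is not from the answers above; it assumes the table is named mytable and recreates the sample rows so it can be run on its own.

```sql
-- Sample data from the question.
CREATE TABLE mytable (id INT PRIMARY KEY, team_id INT, log_id INT);

INSERT INTO mytable (id, team_id, log_id) VALUES
(1, 12, 1), (2, 12, 1), (3, 12, 1), (4, 12, 1),
(5, 1, 2),  (6, 1, 2),  (7, 1, 3),  (8, 1, 3);

-- Attach each team's highest log_id to every row of that team,
-- then keep only the rows whose log_id equals that maximum.
SELECT id, team_id, log_id
FROM (
    SELECT t.*,
           MAX(log_id) OVER (PARTITION BY team_id) AS max_log_id
    FROM mytable t
) ranked
WHERE log_id = max_log_id
ORDER BY id;

-- Returns ids 1, 2, 3, 4 (team 12, log_id 1) and 7, 8 (team 1, log_id 3).
```

Window functions are not available before MySQL 8.0, so on older versions the subquery or join forms above are the ones to use.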