SQL sum with group by - sql

Below is the sample data:
GROUPNUMBER|COUNTRY|COUNT|AMOUNT
BD2 |US |4 |2.00
BD2 |US |10 |8.00
BD2 |CANADA |12 |10.00
BD5 |UK |2 |1.00
BD5 |US |6 |4.00
BD5 |UK |8 |6.00
Result output needed:
GROUPNUMBER|US_COUNT|US_AMOUNT|NON_US_COUNT|NON_US_AMOUNT
BD2 |14 |10.00 |12 |10.00
BD5 |6 |4.00 |10 |7.00
I need to split the count and amount into US and NON_US columns, grouped by the GroupNumber field.
Is this possible in MS SQL?
Thanks,
Subs

This should be doable with a CASE expression inside SUM:
SELECT
GROUPNUMBER,
SUM(CASE WHEN COUNTRY = 'US' THEN [COUNT] ELSE 0 END) AS US_COUNT,
SUM(CASE WHEN COUNTRY = 'US' THEN AMOUNT ELSE 0 END) AS US_AMOUNT,
SUM(CASE WHEN COUNTRY != 'US' THEN [COUNT] ELSE 0 END) AS NON_US_COUNT,
SUM(CASE WHEN COUNTRY != 'US' THEN AMOUNT ELSE 0 END) AS NON_US_AMOUNT
FROM theTable
GROUP BY GROUPNUMBER
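To check the query against the sample rows without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MS SQL here (an assumption for illustration only); the table name theTable comes from the answer, and the COUNT column is quoted with double quotes instead of T-SQL brackets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "COUNT" is quoted because it is also the name of an aggregate function.
conn.execute(
    'CREATE TABLE theTable (GROUPNUMBER TEXT, COUNTRY TEXT, "COUNT" INTEGER, AMOUNT REAL)'
)
conn.executemany(
    "INSERT INTO theTable VALUES (?, ?, ?, ?)",
    [
        ("BD2", "US", 4, 2.00),
        ("BD2", "US", 10, 8.00),
        ("BD2", "CANADA", 12, 10.00),
        ("BD5", "UK", 2, 1.00),
        ("BD5", "US", 6, 4.00),
        ("BD5", "UK", 8, 6.00),
    ],
)

# Conditional aggregation: each SUM only picks up the rows its CASE lets through.
rows = conn.execute(
    """
    SELECT GROUPNUMBER,
           SUM(CASE WHEN COUNTRY = 'US'  THEN "COUNT" ELSE 0 END) AS US_COUNT,
           SUM(CASE WHEN COUNTRY = 'US'  THEN AMOUNT  ELSE 0 END) AS US_AMOUNT,
           SUM(CASE WHEN COUNTRY != 'US' THEN "COUNT" ELSE 0 END) AS NON_US_COUNT,
           SUM(CASE WHEN COUNTRY != 'US' THEN AMOUNT  ELSE 0 END) AS NON_US_AMOUNT
    FROM theTable
    GROUP BY GROUPNUMBER
    ORDER BY GROUPNUMBER
    """
).fetchall()

for row in rows:
    print(row)
```

One thing to keep in mind: a row whose COUNTRY is NULL matches neither COUNTRY = 'US' nor COUNTRY != 'US', so it would add 0 to both the US and the NON_US columns.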

Related

How to use Dense_rank() order by time but ignore duplicates from another column?

I have a dataframe like this:
network|Value1|Value2|datetime
---------------------------------------
1 |A |null |2021-07-16 15:59:56.133
1 |B |null |2021-07-15 11:00:05.633
1 |B |null |2021-07-15 10:59:59.100
1 |C |null |2021-07-15 06:03:49.000
1 |null |A |2021-07-16 15:59:56.133
1 |null |B |2021-07-16 14:45:00.309
1 |null |C |2021-07-16 09:19:26.580
I want to create two ranks:
for each network, I want to rank [Value1] by datetime desc
for each network, I want to rank [Value2] by datetime desc
But for each rank, I don't want duplicate values of [Value1] or [Value2] to be counted again.
The expected outcome should be:
network|Value1|Value2|datetime |rank_Value1 |rank_Value2
-------------------------------------------------------------
1 |A |null |2021-07-16 15:59:56.133 |1 |null
1 |B |null |2021-07-15 11:00:05.633 |2 |null
1 |B |null |2021-07-15 10:59:59.100 |2 |null
1 |C |null |2021-07-15 06:03:49.000 |3 |null
1 |null |A |2021-07-16 15:59:56.133 |null |1
1 |null |B |2021-07-16 14:45:00.309 |null |2
1 |null |C |2021-07-16 09:19:26.580 |null |3
Since I want the rank to be the same when [Value] is duplicated and I want the rank to be incremented 1 by 1, I use DENSE_RANK() for that and I tried this:
SELECT *,
CASE WHEN Value1 is null THEN NULL ELSE DENSE_RANK() OVER (PARTITION BY network, CASE WHEN Value1 is not null THEN 1 ELSE 0 END order by datetime desc) END as rank_Value1,
CASE WHEN Value2 is null THEN NULL ELSE DENSE_RANK() OVER (PARTITION BY network, CASE WHEN Value2 is not null THEN 1 ELSE 0 END order by datetime desc) END as rank_Value2
FROM df
But the outcome is as follows:
network|Value1|Value2|datetime |rank_Value1 |rank_Value2
-------------------------------------------------------------
1 |A |null |2021-07-16 15:59:56.133 |1 |null
1 |B |null |2021-07-15 11:00:05.633 |2 |null
1 |B |null |2021-07-15 10:59:59.100 |3 |null
1 |C |null |2021-07-15 06:03:49.000 |4 |null
1 |null |A |2021-07-16 15:59:56.133 |null |1
1 |null |B |2021-07-16 14:45:00.309 |null |2
1 |null |C |2021-07-16 09:19:26.580 |null |3
I feel like I am almost there but I don't know how to do this...
I am not comfortable with TSQL so if someone can help me, I would really appreciate it!
Reading between the lines here, I suspect this is a gaps-and-islands problem. If so, then I think this might be what you are after:
WITH YourTable AS(
SELECT *
FROM (VALUES(1,'A ',null,CONVERT(datetime,'2021-07-16T15:59:56.133')),
(1,'B ',null,CONVERT(datetime,'2021-07-15T11:00:05.633')),
(1,'B ',null,CONVERT(datetime,'2021-07-15T10:59:59.100')),
(1,'C ',null,CONVERT(datetime,'2021-07-15T06:03:49.000')),
(1,null,'A ',CONVERT(datetime,'2021-07-16T15:59:56.133')),
(1,null,'B ',CONVERT(datetime,'2021-07-16T14:45:00.309')),
(1,null,'C ',CONVERT(datetime,'2021-07-16T09:19:26.580')))V(network,Value1,Value2,datetime)),
Grps AS(
SELECT network,
Value1,
Value2,
datetime,
ROW_NUMBER() OVER (PARTITION BY network ORDER BY datetime) -
ROW_NUMBER() OVER (PARTITION BY network, Value1 ORDER BY datetime) AS Group1,
ROW_NUMBER() OVER (PARTITION BY network ORDER BY datetime) -
ROW_NUMBER() OVER (PARTITION BY network, Value2 ORDER BY datetime) AS Group2
FROM YourTable)
SELECT network,
Value1,
Value2,
datetime,
CASE WHEN Value1 IS NOT NULL THEN DENSE_RANK() OVER (PARTITION BY network, CASE WHEN Value1 IS NOT NULL THEN 1 ELSE 0 END ORDER BY group1 DESC) END AS rank_Value1,
CASE WHEN Value2 IS NOT NULL THEN DENSE_RANK() OVER (PARTITION BY network, CASE WHEN Value2 IS NOT NULL THEN 1 ELSE 0 END ORDER BY group2 DESC) END AS rank_Value2
FROM Grps
ORDER BY CASE WHEN Value1 IS NULL THEN 1 ELSE 0 END,
datetime DESC;
How about measuring the change in values?
SELECT df.*,
(CASE WHEN value1 IS NOT NULL
THEN SUM(CASE WHEN next_value1 = value1 THEN 0 ELSE 1 END) OVER (ORDER BY datetime DESC)
END) AS rank_Value1,
(CASE WHEN value2 IS NOT NULL
THEN SUM(CASE WHEN next_value2 = value2 THEN 0 ELSE 1 END) OVER (ORDER BY datetime DESC)
END) AS rank_Value2
FROM (SELECT df.*,
-- LEAD over ascending time is the row just before this one in descending order
LEAD(value1) OVER (ORDER BY datetime) AS next_value1,
LEAD(value2) OVER (ORDER BY datetime) AS next_value2
FROM df
) df;
Note that the above assumes the value1 rows and the value2 rows are not interleaved in time. If they are, you need a CASE expression in the PARTITION BY clauses to separate out the rows for each value (and network, if there is more than one).
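The note about interleaved rows can be made concrete. The following is only a sketch of the change-counting idea, not the answerer's exact query: it runs in Python's sqlite3 module (assuming a bundled SQLite of 3.25 or later, which is when window functions were added), uses LAG over descending time, and partitions by network and by whether the value is null. The aliases prev_value1 and prev_value2 are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE df (network INTEGER, value1 TEXT, value2 TEXT, datetime TEXT)")
conn.executemany(
    "INSERT INTO df VALUES (?, ?, ?, ?)",
    [
        (1, "A", None, "2021-07-16 15:59:56.133"),
        (1, "B", None, "2021-07-15 11:00:05.633"),
        (1, "B", None, "2021-07-15 10:59:59.100"),
        (1, "C", None, "2021-07-15 06:03:49.000"),
        (1, None, "A", "2021-07-16 15:59:56.133"),
        (1, None, "B", "2021-07-16 14:45:00.309"),
        (1, None, "C", "2021-07-16 09:19:26.580"),
    ],
)

# Inner query: fetch the previous value (in descending time) among rows of the
# same network that have the same null-ness for that column.
# Outer query: a running count of "value changed" flags gives the dense rank.
query = """
SELECT value1, value2, datetime,
       CASE WHEN value1 IS NOT NULL
            THEN SUM(CASE WHEN prev_value1 = value1 THEN 0 ELSE 1 END)
                 OVER (PARTITION BY network, value1 IS NOT NULL ORDER BY datetime DESC)
       END AS rank_value1,
       CASE WHEN value2 IS NOT NULL
            THEN SUM(CASE WHEN prev_value2 = value2 THEN 0 ELSE 1 END)
                 OVER (PARTITION BY network, value2 IS NOT NULL ORDER BY datetime DESC)
       END AS rank_value2
FROM (SELECT df.*,
             LAG(value1) OVER (PARTITION BY network, value1 IS NOT NULL
                               ORDER BY datetime DESC) AS prev_value1,
             LAG(value2) OVER (PARTITION BY network, value2 IS NOT NULL
                               ORDER BY datetime DESC) AS prev_value2
      FROM df) AS d
ORDER BY value1 IS NULL, datetime DESC
"""
ranks = conn.execute(query).fetchall()

for row in ranks:
    print(row)
```

Like the gaps-and-islands answer, this starts a new rank whenever the value changes, so a value that reappears later after a different value would get a new rank and not reuse its old one.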

Output several counts of one table records in one row

Here is an example table CALLRECORD:
+--------+------------+
|callid | rating |
|1 | |
|2 | 5 |
|3 | |
|4 | 1 |
|5 | |
+--------+------------+
It is no problem to output the total number of calls, the number of rated calls, the average rating, and the number of unrated calls:
select count(*) as total from callrecord;
select count(*) as rated, avg(rating) as average_rating from callrecord where rating is not null;
select count(*) as unrated from callrecord where rating is null;
+--------+
|total |
|5 |
+--------+
+--------+------------+
|rated |average |
|2 |3 |
+--------+------------+
+--------+
|unrated |
|3 |
+--------+
I'm looking for a way to output all of the above in one row with a single SQL query:
+--------+--------+------------+---------+
|total |rated |average |unrated |
|5 |2 |3 |3 |
+--------+--------+------------+---------+
Most aggregate functions ignore null values, so what you want is simpler than you may think:
select
count(*) total, -- total number of rows
count(rating) as rated, -- count of non-null ratings
avg(rating) average, -- avg ignores nulls
count(*) - count(rating) unrated -- count of null ratings
from callrecord
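As a quick check of the null-ignoring behaviour, here is a small sketch that loads the five CALLRECORD rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here; the question does not say which database is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE callrecord (callid INTEGER, rating INTEGER)")
conn.executemany(
    "INSERT INTO callrecord VALUES (?, ?)",
    [(1, None), (2, 5), (3, None), (4, 1), (5, None)],
)

# COUNT(*) counts rows; COUNT(rating) and AVG(rating) skip NULL ratings.
summary = conn.execute(
    """
    SELECT COUNT(*)                 AS total,
           COUNT(rating)            AS rated,
           AVG(rating)              AS average,
           COUNT(*) - COUNT(rating) AS unrated
    FROM callrecord
    """
).fetchone()

print(summary)
```

Note that if no call is rated at all, AVG(rating) returns NULL; it does not raise an error.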
Try using the SUM aggregation with a CASE statement inside of it. Example below.
Select
COUNT(*) AS 'Total',
SUM(CASE WHEN rating IS NULL THEN 0 ELSE 1 END) AS 'Rated',
(SUM(CASE WHEN rating IS NULL THEN 0 ELSE rating END)/SUM(CASE WHEN rating IS NULL THEN 0 ELSE 1 END)) AS 'Avg',
SUM(CASE WHEN rating IS NULL THEN 1 ELSE 0 END) AS 'Unrated'
From callrecord

Produce a per customer per month sales report via SQL eg group by two columns and sum the third

I rarely write SQL (Azure SQL), but I am trying to generate a per-month sales total for each customer.
Customer:
|Username |ID |
|user1 |1 |
|user2 |2 |
|user3 |3 |
Order:
|CustomerId |Month |Total |
|1 |1 |275 |
|1 |1 |10 |
|2 |1 |100 |
|1 |3 |150 |
|2 |2 |150 |
|2 |2 |65 |
|3 |2 |150 |
I want to produce
|Username |Month1Total |Month2Total | Month3Total |
|user1 |285 |0 | 150 |
|user2 |100 |215 | 0 |
|user3 |0 |150 | 0 |
I can do the following
SELECT customerTable.Username Username, SUM(orderTable.OrderTotal) TotalMay
FROM "Order" orderTable
JOIN Customer customerTable ON orderTable.CustomerId = customerTable.Id
WHERE DATENAME(Month, (orderTable.PaidDateUTC)) = 'May'
GROUP BY Username
This gives me the total for a single month. However, I don't know how to repeat it for every month and still group by username.
If you want a separate column for each month, then try this:
SELECT customerTable.Username Username
,SUM(iif(ordertable.[month] = 1,orderTable.OrderTotal,0)) TotalJan
,SUM(iif(ordertable.[month] = 2,orderTable.OrderTotal,0)) TotalFeb
,SUM(iif(ordertable.[month] = 3,orderTable.OrderTotal,0)) TotalMar
,SUM(iif(ordertable.[month] = 4,orderTable.OrderTotal,0)) TotalApr
,SUM(iif(ordertable.[month] = 5,orderTable.OrderTotal,0)) TotalMay
FROM "Order" orderTable
JOIN Customer customerTable ON orderTable.CustomerId = customerTable.Id
GROUP BY Username
It should be easy to add the remaining months.
You can use CASE WHEN:
with t1 as
( select o.CustomerId, o.Month, sum(o.Total) as total
from [Order] o
group by o.CustomerId, o.Month
) select c.Username,
sum(case when t1.month=1 then t1.total else 0 end) as month1,
sum(case when t1.month=2 then t1.total else 0 end) as month2,
sum(case when t1.month=3 then t1.total else 0 end) as month3
from t1 join Customer c on t1.CustomerId=c.ID
group by c.Username
Or you can use PIVOT:
select c.username, t.* from
(
select * from
(select * from [Order]
) src
pivot
( sum(Total) FOR Month IN ([1],[2],[3])
) pvt
) as t join Customer c on t.CustomerId=c.ID
Something like this:
SELECT DATENAME(Month, (orderTable.PaidDateUTC)) MonthName,
customerTable.Username Username,
SUM(orderTable.OrderTotal) MonthTotal
FROM "Order" orderTable
JOIN Customer customerTable ON orderTable.CustomerId = customerTable.Id
GROUP BY DATENAME(Month, (orderTable.PaidDateUTC)), Username
You just need to move the month name from the WHERE clause to the GROUP BY.
I would simply do a JOIN with conditional aggregation:
SELECT c.Username,
SUM(CASE WHEN o.Month = 1 THEN o.Total ELSE 0 END) AS [Month1Total],
SUM(CASE WHEN o.Month = 2 THEN o.Total ELSE 0 END) AS [Month2Total],
SUM(CASE WHEN o.Month = 3 THEN o.Total ELSE 0 END) AS [Month3Total],
. . .
FROM Customer c INNER JOIN
[Order] o
ON o.CustomerId = c.ID
GROUP BY c.Username;
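For anyone who wants to try the conditional-aggregation version on the sample data from the question, here is a minimal sketch using Python's sqlite3 module. SQLite is an assumption standing in for Azure SQL, so the reserved table name is quoted as "Order" and not as [Order]; only the three months present in the sample are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (Username TEXT, ID INTEGER)")
conn.execute('CREATE TABLE "Order" (CustomerId INTEGER, Month INTEGER, Total INTEGER)')
conn.executemany(
    "INSERT INTO Customer VALUES (?, ?)",
    [("user1", 1), ("user2", 2), ("user3", 3)],
)
conn.executemany(
    'INSERT INTO "Order" VALUES (?, ?, ?)',
    [(1, 1, 275), (1, 1, 10), (2, 1, 100), (1, 3, 150),
     (2, 2, 150), (2, 2, 65), (3, 2, 150)],
)

# One SUM per month; the CASE zeroes out orders that belong to other months.
report = conn.execute(
    """
    SELECT c.Username,
           SUM(CASE WHEN o.Month = 1 THEN o.Total ELSE 0 END) AS Month1Total,
           SUM(CASE WHEN o.Month = 2 THEN o.Total ELSE 0 END) AS Month2Total,
           SUM(CASE WHEN o.Month = 3 THEN o.Total ELSE 0 END) AS Month3Total
    FROM Customer c
    INNER JOIN "Order" o ON o.CustomerId = c.ID
    GROUP BY c.Username
    ORDER BY c.Username
    """
).fetchall()

for row in report:
    print(row)
```

With an INNER JOIN, a customer with no orders at all would not appear in the report. Switching to a LEFT JOIN from Customer would keep such a customer, with 0 in every month column.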

SQLite How to stack groups horizontally?

For example, say I have a table (id is letter):
letter|color |number
a |green |2
a |blue |3
b |red |3
b |blue |4
b |yellow|1
c |red |9
c |blue |5
What I want is to transform it to:
letter|color_1|color_2|color_3|number_1|number_2|number_3
a |green |blue | |2 |3 |
b |red |blue |yellow |3 |4 |1
c |red |blue | |9 |5 |
What type of SQL transformation is this? My boss said it is something done frequently, but I've never seen it before. Also, how would you do it?
This is a pivot query. If you know that you want three sets of columns, then you can use conditional aggregation.
The problem in SQLite is that you don't have an easy way to enumerate things. For this, you can use a subquery:
select t.letter,
max(case when seqnum = 1 then color end) as color_1,
max(case when seqnum = 2 then color end) as color_2,
max(case when seqnum = 3 then color end) as color_3,
max(case when seqnum = 1 then number end) as number_1,
max(case when seqnum = 2 then number end) as number_2,
max(case when seqnum = 3 then number end) as number_3
from (select t.*,
(select count(*) from t t2 where t2.letter = t.letter and t2.color <= t.color) as seqnum
from t
) t
group by t.letter;
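The correlated subquery above numbers the colors alphabetically within each letter, so for letter a it would put blue before green. If the order in which the rows were inserted is what matters, one option is a window function. This is a sketch under two assumptions: the SQLite version is 3.25 or later (when ROW_NUMBER() became available), and rowid reflects insertion order, which holds for a plain table filled by simple inserts like this one. It is run here through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (letter TEXT, color TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("a", "green", 2), ("a", "blue", 3),
     ("b", "red", 3), ("b", "blue", 4), ("b", "yellow", 1),
     ("c", "red", 9), ("c", "blue", 5)],
)

# Number the rows inside each letter, then spread them over fixed columns.
stacked = conn.execute(
    """
    SELECT letter,
           MAX(CASE WHEN seqnum = 1 THEN color  END) AS color_1,
           MAX(CASE WHEN seqnum = 2 THEN color  END) AS color_2,
           MAX(CASE WHEN seqnum = 3 THEN color  END) AS color_3,
           MAX(CASE WHEN seqnum = 1 THEN number END) AS number_1,
           MAX(CASE WHEN seqnum = 2 THEN number END) AS number_2,
           MAX(CASE WHEN seqnum = 3 THEN number END) AS number_3
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY letter ORDER BY rowid) AS seqnum
          FROM t) AS numbered
    GROUP BY letter
    ORDER BY letter
    """
).fetchall()

for row in stacked:
    print(row)
```

Missing slots come back as NULL (None in Python), which corresponds to the blank cells in the desired output.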

How to query multiple COUNT(*) with good performance

I have this table:
CREATE TABLE schedule (
schedule_id serial NOT NULL,
start_date date,
CONSTRAINT schedule_id PRIMARY KEY (schedule_id)
)
And this table:
CREATE TABLE schedule_user (
schedule_user_id serial NOT NULL,
schedule_id integer,
state int,
CONSTRAINT fk_schedule_id FOREIGN KEY (schedule_id)
REFERENCES schedule (schedule_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
schedule
-------------------------
|schedule_id |date |
|------------+------------|
|1 |'2013-10-10'|
|2 |'2013-10-20'|
|3 |'2013-08-13'|
-------------------------
schedule_user
-----------------------------------
|schedule_user_id|schedule_id |state|
|----------------+------------+-----|
|1 | 1 |0 |
|2 | 1 |1 |
|3 | 1 |2 |
|4 | 1 |0 |
|5 | 1 |1 |
|6 | 1 |1 |
|4 | 2 |0 |
|5 | 2 |1 |
|7 | 2 |0 |
|2 | 3 |1 |
-----------------------------------
And I want a table like this:
characteristic
---------------------------------------
|schedule_id |state0|state1|state2|total|
|------------+------+------+------+-----|
|1 |2 |3 |1 |6 |
|2 |2 |1 |0 |3 |
|3 |0 |1 |0 |1 |
---------------------------------------
I've made this query, which looks as horrible as its performance.
SELECT
schedule.schedule_id AS id,
(( SELECT count(*) AS count
FROM schedule_user
WHERE schedule_user.schedule_id = schedule.schedule_id
AND state=0))::integer AS state0,
(( SELECT count(*) AS count
FROM schedule_user
WHERE schedule_user.schedule_id = schedule.schedule_id
AND state=1))::integer AS state1,
(( SELECT count(*) AS count
FROM schedule_user
WHERE schedule_user.schedule_id = schedule.schedule_id
AND state=2))::integer AS state2,
(( SELECT count(*) AS count
FROM schedule_user
WHERE schedule_user.schedule_id = schedule.schedule_id))::integer
AS total
FROM schedule
Is there a better way to perform such a query?
Should I create an index on the 'state' column? If so, what should it look like?
You want to make a pivot table. An easy way to make one in SQL if you know all of the possible values of state beforehand is using sum and case statements.
select schedule_id,
sum(case state when 0 then 1 else 0 end) as state0,
sum(case state when 1 then 1 else 0 end) as state1,
sum(case state when 2 then 1 else 0 end) as state2,
count(*) as total
from schedule_user
group by schedule_id;
Another way is to use the crosstab table function.
Neither of these will let you get away with not knowing the set of values of state (and hence the columns in the result set).
I would try
SELECT s.schedule_id,
COUNT(CASE WHEN su.state = 0 THEN 1 END) AS state0,
COUNT(CASE WHEN su.state = 1 THEN 1 END) AS state1,
COUNT(CASE WHEN su.state = 2 THEN 1 END) AS state2,
COUNT(su.state) AS total
FROM schedule s
LEFT OUTER JOIN schedule_user su
ON su.schedule_id = s.schedule_id
GROUP BY s.schedule_id
;
The standard approach is to use SUM() with a CASE over a JOIN with a GROUP BY:
SELECT
schedule.schedule_id AS id,
SUM (case when state=0 then 1 else 0 end) AS state0,
SUM (case when state=1 then 1 else 0 end) AS state1,
SUM (case when state=2 then 1 else 0 end) AS state2,
count(schedule_user.schedule_id) AS total -- counts matched rows only, so a schedule with no users gets 0, not 1
FROM schedule
LEFT JOIN schedule_user
ON schedule_user.schedule_id = schedule.schedule_id
GROUP BY 1
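One detail worth checking when the query starts from schedule with a LEFT JOIN is what happens to a schedule that has no users. The sketch below uses Python's sqlite3 module as a stand-in for PostgreSQL (an assumption for illustration) and loads the sample rows from the question plus a hypothetical schedule 4 with no schedule_user rows. The joined_rows column is added only to show the difference between COUNT(*) and counting a column from the joined table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedule (schedule_id INTEGER PRIMARY KEY, start_date TEXT)")
conn.execute(
    "CREATE TABLE schedule_user (schedule_user_id INTEGER, schedule_id INTEGER, state INTEGER)"
)
conn.executemany(
    "INSERT INTO schedule VALUES (?, ?)",
    [(1, "2013-10-10"), (2, "2013-10-20"), (3, "2013-08-13"),
     (4, "2013-11-01")],  # schedule 4 is hypothetical: it has no users
)
conn.executemany(
    "INSERT INTO schedule_user VALUES (?, ?, ?)",
    [(1, 1, 0), (2, 1, 1), (3, 1, 2), (4, 1, 0), (5, 1, 1), (6, 1, 1),
     (4, 2, 0), (5, 2, 1), (7, 2, 0), (2, 3, 1)],
)

# A single pass over the join replaces the four correlated subqueries.
counts = conn.execute(
    """
    SELECT s.schedule_id,
           COUNT(CASE WHEN su.state = 0 THEN 1 END) AS state0,
           COUNT(CASE WHEN su.state = 1 THEN 1 END) AS state1,
           COUNT(CASE WHEN su.state = 2 THEN 1 END) AS state2,
           COUNT(su.schedule_id)                    AS total,
           COUNT(*)                                 AS joined_rows
    FROM schedule s
    LEFT JOIN schedule_user su ON su.schedule_id = s.schedule_id
    GROUP BY s.schedule_id
    ORDER BY s.schedule_id
    """
).fetchall()

for row in counts:
    print(row)
```

For schedule 4, total is 0 while joined_rows is 1, because the LEFT JOIN still produces one all-NULL row for it. If only schedules that have users are needed, grouping schedule_user alone avoids the join entirely.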