Is there a way to select sum on one column based on other DISTINCT column, while grouping by third column(date) only - sql

I have three columns
year | money | id
2020 100 01
2020 100 01
2019 50 02
2018 50 03
2020 40 04
The results should be:
Year | Money | total people
2020 | 240 | 4
(The first two rows have the same id, so they should only be counted once.) I tried the following:
select year, sum(money), Count( Distinct id) from table
group by year
The result shows 4 people, which is correct, but the sum is wrong, because it adds up the money from every row, including the duplicate.

You can aggregate and then aggregate again:
select max(year), sum(money), count(*)
from (select distinct year, money, id
from t
) t;

You can use SUM() and COUNT(DISTINCT x).
For example:
select
year,
sum(money) as money,
(select count(distinct id) from t) as total_people
from t
where year = 2020
group by year;
Result:
YEAR MONEY TOTAL_PEOPLE
----- ------ ------------
2020 240 4
See running example at db<>fiddle.

Not the most performant, but if you wish to avoid a derived table, you can do
select distinct
max(year) over (),
sum(money) over (),
count(*) over ()
from t
group by year, money, id;
If you want this grouped by year, you can define the partitions in the OVER clause (for example, PARTITION BY year).
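As a rough sketch of the per-year variant, the snippet below runs the window-function query with PARTITION BY year against the sample data from the question. It uses Python's built-in sqlite3 module purely as a convenient test harness (an assumption, since the question does not name a database), and the alias names total_money and total_people are made up for illustration.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (year INTEGER, money INTEGER, id TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (2020, 100, "01"),
        (2020, 100, "01"),  # duplicate row that must be counted once
        (2019, 50, "02"),
        (2018, 50, "03"),
        (2020, 40, "04"),
    ],
)

# GROUP BY collapses duplicate (year, money, id) rows; the window
# functions then total the remaining rows within each year.
per_year = conn.execute(
    """
    SELECT DISTINCT
           year,
           SUM(money) OVER (PARTITION BY year) AS total_money,
           COUNT(*)   OVER (PARTITION BY year) AS total_people
    FROM t
    GROUP BY year, money, id
    ORDER BY year
    """
).fetchall()

for row in per_year:
    print(row)
```

Window functions need SQLite 3.25 or later; on an older build, the derived-table form (select distinct ... in a subquery, then group by year) should give the same per-year totals.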

Related

Using subquery in conjunction with a WHERE clause

There exists the following table:
practice=# select * from percentages;
letter | value | year
--------+---------+------
A | 5000.00 | 2021
B | 6000.00 | 2021
C | 6000.00 | 2021
B | 8000.00 | 2022
A | 9000.00 | 2022
C | 7000.00 | 2022
A | 2000.00 | 2021
B | 1000.00 | 2022
C | 3000.00 | 2021
(9 rows)
In order to calculate the percentages of A, B, and C relative to the total value (i.e. the sum of A values in the table divided by the sum of all values in the table), I am using a subquery as follows:
practice=# select letter, cast((group_values/(select sum(value) from percentages)*100) as decimal(4,2)) as group_values from (select letter, sum(value) as group_values from percentages group by letter order by letter) as subquery order by group_values desc;
letter | group_values
--------+--------------
A | 34.04
C | 34.04
B | 31.91
(3 rows)
However, I now want to be able to filter the results by year, e.g. calculate the above only for entries where the year is 2022.
I have tried incorporating a WHERE clause within the subquery to filter by year.
select letter, cast((group_values/(select sum(value) from percentages)*100) as decimal(4,2)) as group_values from (select letter, sum(value) as group_values from percentages where year='2022' group by letter order by letter) as subquery order by group_values desc;
However, we can see that this does not update the total to only include the entries for 2022. Instead, it seems that SQL is calculating the percentage entries for 2022 across A, B, and C for the total across all years.
letter | group_values
--------+--------------
A | 19.15
B | 19.15
C | 14.89
(3 rows)
Similarly, using the WHERE clause outside the subquery results in an error:
select letter, cast((group_values/(select sum(value) from percentages)*100) as decimal(4,2)) as group_values from (select letter, sum(value) as group_values from percentages group by letter order by letter) as subquery where year='2022' order by group_values desc;
ERROR: column "year" does not exist
The subquery for the total covers all the years; adding the WHERE clause there as well restricts it to the numbers for 2022.
I also removed the unnecessary ORDER BY letter.
SELECT
letter,
CAST((group_values / (SELECT
SUM(value)
FROM
percentages
WHERE
year = '2022') * 100)
AS DECIMAL (4 , 2 )) AS group_values
FROM
(SELECT
letter, SUM(value) AS group_values
FROM
percentages
WHERE
year = '2022'
GROUP BY letter) AS subquery
ORDER BY group_values DESC;
select letter, year, sum(value) * 100.0 / sum(sum(value)) over (partition by year)
from T
where year = 2022
group by letter, year;
The partition is redundant when the filter leaves only one year, but it means the query still works when the filter is removed.
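To illustrate the point about removing the filter, here is a small sketch that computes each letter's share of its own year's total for all years at once. It uses Python's sqlite3 module as a stand-in for the PostgreSQL session in the question (an assumption made only so the example is self-contained); the alias pct is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE percentages (letter TEXT, value REAL, year INTEGER)")
conn.executemany(
    "INSERT INTO percentages VALUES (?, ?, ?)",
    [
        ("A", 5000.0, 2021), ("B", 6000.0, 2021), ("C", 6000.0, 2021),
        ("B", 8000.0, 2022), ("A", 9000.0, 2022), ("C", 7000.0, 2022),
        ("A", 2000.0, 2021), ("B", 1000.0, 2022), ("C", 3000.0, 2021),
    ],
)

# SUM(value) is the per-letter total inside each year; the outer
# SUM(...) OVER (PARTITION BY year) adds those totals up per year,
# so every row is divided by its own year's total.
shares = conn.execute(
    """
    SELECT letter,
           year,
           ROUND(SUM(value) * 100.0
                 / SUM(SUM(value)) OVER (PARTITION BY year), 2) AS pct
    FROM percentages
    GROUP BY letter, year
    ORDER BY year, pct DESC, letter
    """
).fetchall()

for row in shares:
    print(row)
```

Adding WHERE year = 2022 before the GROUP BY would restrict the output to that year without any other change to the query.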

SQL: How to return revenue for specific year

I would like to show the revenue for a specific year for all customers, regardless of whether they have revenue data for that year. (Where they don't have data for that year, a filler like 'no data' would work.)
Sample Data looks like:
Table 1
Customer | Price | Quantity | Order Date
xxx | 12 | 5 | 1990/03/25
yyy | 15 | 7 | 1991/05/35
xxx | 34 | 2 | 1990/08/21
Desired Output would look a little something like this:
Customer | Revenue (for 1990)
xxx | 128
yyy | no data
Getting the total revenue for each would be:
SELECT Customer,
SUM(quantity*price) AS Revenue
But how would I go about listing it for a specific year for all customers (including customers that don't have data for that year)?
We can use a CTE or a subquery to build a list of all customers and another to get all years, then cross join them and left join the result onto revenue.
This gives a row for each customer for each year. If you add a WHERE clause on y, you will only get the year requested.
CREATE TABLE revenue(
Customer varchar(10),
Price int,
Quantity int,
OrderDate date);
insert into revenue values
('xxx', 12,5,'2021-03-25'),
('yyy', 15,7,'2021-05-15'),
('xxx', 34,2,'2022-08-21');
with cust as
(select distinct customer c from revenue),
years as
(select distinct year(OrderDate) y from revenue)
select
y "year",
c customer ,
sum(price*quantity) revenue
from years
cross join cust
left join revenue r
on cust.c = r.customer and years.y = year(OrderDate)
group by
c,y,
year(OrderDate)
order by y,c
year | customer | revenue
---: | :------- | ------:
2021 | xxx | 60
2021 | yyy | 105
2022 | xxx | 68
2022 | yyy | null
db<>fiddle here
You would simply use GROUP BY, do the sum in a subquery, and left join it to your customers table, i.e.:
select customers.Name, totals.Revenue
from Customers
Left join
( select customerId, sum(quantity*price) as revenue
from myTable
where year(orderDate) = 1990
group by customerId) totals on customers.CustomerId = totals.customerId;
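The left-join idea can be sketched end to end as follows, including the 'no data' filler the question asked for. This is only an illustration under several assumptions: it uses Python's sqlite3 module, the table name orders and its column names are hypothetical, the dates are rewritten in ISO format (with the invalid day in the second sample row replaced by an arbitrary valid one), and the customer list is derived from the orders themselves because the question shows no separate customers table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names; dates stored as ISO text.
conn.execute(
    "CREATE TABLE orders (customer TEXT, price INTEGER, quantity INTEGER, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("xxx", 12, 5, "1990-03-25"),
        ("yyy", 15, 7, "1991-05-15"),  # assumed valid date for the second row
        ("xxx", 34, 2, "1990-08-21"),
    ],
)

# Every customer appears once on the left side; customers without
# orders in 1990 get NULL from the join, which COALESCE turns into
# the 'no data' filler.
revenue_1990 = conn.execute(
    """
    SELECT c.customer,
           COALESCE(CAST(t.revenue AS TEXT), 'no data') AS revenue
    FROM (SELECT DISTINCT customer FROM orders) AS c
    LEFT JOIN (
        SELECT customer, SUM(quantity * price) AS revenue
        FROM orders
        WHERE strftime('%Y', order_date) = '1990'
        GROUP BY customer
    ) AS t ON t.customer = c.customer
    ORDER BY c.customer
    """
).fetchall()

for row in revenue_1990:
    print(row)
```

Casting the revenue to text is what allows a number and the string 'no data' to share one column; if the consumer can handle NULL, dropping the COALESCE keeps the column numeric.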

SQL query: get total values for each month

I have a table that stores the number of fruits sold on each date.
CREATE TABLE data
(
code VARCHAR2(50) NOT NULL,
amount NUMBER(5) NOT NULL,
DATE VARCHAR2(50) NOT NULL
);
Sample data
code |amount| date
------+------+------------
aple | 1 | 01/01/2010
aple | 2 | 02/02/2010
orange| 3 | 03/03/2010
orange| 4 | 04/04/2010
I need to write a query that lists how many apples and oranges were sold in January and February.
--total apple for jan
select sum(amount) from mg.drum d where date >='01/01/2010' and cdate < '01/02/2020' and code = 'aple';
--total apple for feb
select sum(amount) from mg.drum d where date >='01/02/2010' and cdate < '01/03/2020' and code = 'aple';
--total orange for jan
select sum(amount) from mg.drum d where date >='01/01/2010' and cdate < '01/02/2020' and code = 'orange';
--total orange for feb
select sum(amount) from mg.drum d where date >='01/02/2010' and cdate < '01/03/2020' and code = 'orange';
If I need to calculate this for more months and more fruits, it's tedious. Is there a shorter query to write?
Can I combine at least for the months into 1 query? So 1 query to get total for each month for 1 fruit?
You can use conditional aggregation such as
SELECT TO_CHAR("date",'MM/YYYY') AS "Month/Year",
SUM( CASE WHEN code = 'apple' THEN amount END ) AS apple_sold,
SUM( CASE WHEN code = 'orange' THEN amount END ) AS orange_sold
FROM data
WHERE "date" BETWEEN date'2020-01-01' AND date'2020-02-29'
GROUP BY TO_CHAR("date",'MM/YYYY')
Note that date is a reserved keyword and cannot be used as a column name unless it is quoted.
Demo
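For a version of the conditional aggregation that can be run without an Oracle instance, here is a minimal sketch using Python's sqlite3 module. Several things are assumptions for the sake of the example: the column is named sale_date rather than date, dates are stored as ISO text so that strftime can extract the month, and the fruit codes are spelled 'apple' and 'orange'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: sale_date holds ISO-formatted text dates.
conn.execute("CREATE TABLE data (code TEXT, amount INTEGER, sale_date TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("apple", 1, "2010-01-01"),
        ("apple", 2, "2010-02-02"),
        ("orange", 3, "2010-03-03"),
        ("orange", 4, "2010-04-04"),
    ],
)

# One output row per month, one column per fruit. ELSE 0 makes a
# month with no sales of a fruit show 0 instead of NULL.
monthly = conn.execute(
    """
    SELECT strftime('%Y-%m', sale_date) AS sale_month,
           SUM(CASE WHEN code = 'apple'  THEN amount ELSE 0 END) AS apple_sold,
           SUM(CASE WHEN code = 'orange' THEN amount ELSE 0 END) AS orange_sold
    FROM data
    GROUP BY strftime('%Y-%m', sale_date)
    ORDER BY sale_month
    """
).fetchall()

for row in monthly:
    print(row)
```

Each additional fruit needs one more CASE column, so this shape suits a small, fixed set of codes; grouping by month and code scales better when the set of fruits is open-ended.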
select sum(amount), //date.month
from mg.drum
group by //date.month
//date.month: here you can give an expression that returns the month number or name.
If you are dealing with months, then you should include the year as well. I would recommend:
SELECT TRUNC(date, 'MON') as yyyymm, code,
SUM(amount)
FROM t
GROUP BY TRUNC(date, 'MON'), code;
You can add a WHERE clause if you want only some dates or codes.
This will return a separate row for each month and code that has data. That is pretty close to the results from your four queries, but it does not return 0 values.
select to_char(date_col,'MONTH') as month, code, sum(amount)
from mg.drum
group by to_char(date_col,'MONTH'), code

Oracle sql: Order by with GROUP BY ROLLUP

I've looked everywhere for an answer, but nothing seems to match my problem. So, using rollup with this query:
select year, month, count (sale_id) from sales
group by rollup (year, month);
Will give the result like:
YEAR MONTH TOTAL
2015 1 200
2015 2 415
2015 null 615
2016 1 444
2016 2 423
2016 null 867
null null 1482
I would like to sort by total descending, but with the year that has the biggest total on top (importantly, together with all the records belonging to that year), followed by the records for the other years. So I would like it to look like:
YEAR MONTH TOTAL
null null 1482
2016 null 867
2016 1 444
2016 2 423
2015 null 615
2015 2 415
2015 1 200
Or something like that. The main goal is to keep the records belonging to one year together while sorting by total. Can somebody help me with that?
Try using window function max to get max of total for each year in the order by clause:
select year, month, count(sale_id) total
from sales
group by rollup(year, month)
order by max(total) over (partition by year) desc, total desc;
Hmmm. I think this does what you want:
select year, month, count(sale_id) as cnt
from sales
group by rollup (year, month)
order by sum(count(sale_id)) over (partition by year) desc, year;
Actually, I've never used window functions in an ORDER BY with a rollup query. I wouldn't be surprised if a subquery were necessary.
I think you need to use GROUPING SETS and GROUP_IDs. These will help you identify a NULL caused by a subtotal. Take a look at the docs: https://docs.oracle.com/cd/B19306_01/server.102/b14223/aggreg.htm

Generate year to date by month report in SQL [duplicate]

I am trying to put together an SQL statement that returns the SUM of a value by month, but on a year to date basis. In other words, for the month of March, I am looking to get the sum of a value for the months of January, February, and March.
I can easily do a group by to get a total for each month by itself, and potentially calculate the year to date value I need in my application from this data by looping through the results set. However, I was hoping to have some of this work handled with my SQL statement.
Has anyone ever tackled this type of problem with an SQL statement, and if so, what is the trick that I am missing?
My current sql statement for monthly data is similar to the following:
Select month, year, sum(value) from mytable group by month, year
If I include a where clause on the month, and only group by the year, I can get the result for a single month that I am looking for:
select year, sum(value) from mytable where month <= selectedMonth group by year
However, this requires me to have a particular month pre-selected or to utilize 12 different SQL statements to generate one clean result set.
Any guidance that can be provided would be greatly appreciated!
Update: The data is stored on an IBM iSeries.
declare @Q as table
(
mmonth INT,
value int
)
insert into @Q
values
(1,10),
(1,12),
(2,45),
(3,23)
select sum(January) as UpToJanuary,
sum(February)as UpToFebruary,
sum(March) as UpToMarch from (
select
case when mmonth<=1 then sum(value) end as [January] ,
case when mmonth<=2 then sum(value) end as [February],
case when mmonth<=3 then sum(value) end as [March]
from @Q
group by mmonth
) t
Produces:
UpToJanuary UpToFebruary UpToMarch
22 67 90
You get the idea, right?
NOTE: This could be done more easily with PIVOT, but I don't know whether you are using SQL Server.
As far as I know, DB2 does support windowing functions, although I don't know whether the iSeries version supports them too.
If windowing functions are supported (I believe IBM calls them OLAP functions), then the following should return what you want (provided I understood your question correctly):
select month,
year,
value,
sum(value) over (partition by year order by month asc) as sum_to_date
from mytable
order by year, month
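When the table holds several rows per month, the window can be applied on top of the monthly GROUP BY so that each month appears once with its own total and the year-to-date total. The sketch below shows this using Python's sqlite3 module in place of DB2 (an assumption made only to keep the example runnable); the sample rows and the aliases month_total and sum_to_date are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (year INTEGER, month INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (2011, 1, 10),
        (2011, 1, 12),  # second row for the same month
        (2011, 2, 45),
        (2011, 3, 23),
        (2012, 1, 5),   # a new year restarts the running total
    ],
)

# SUM(value) is the monthly total; wrapping it in a window SUM
# ordered by month accumulates those totals within each year.
ytd = conn.execute(
    """
    SELECT year,
           month,
           SUM(value) AS month_total,
           SUM(SUM(value)) OVER (PARTITION BY year ORDER BY month) AS sum_to_date
    FROM mytable
    GROUP BY year, month
    ORDER BY year, month
    """
).fetchall()

for row in ytd:
    print(row)
```

With an ORDER BY inside the window and no explicit frame, the sum runs from the start of the partition up to the current month, which is the year-to-date behaviour the question describes.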
create table mon
(
[y] int not null,
[m] int not null,
[value] int not null,
primary key (y,m))
select a.y, a.m, a.value, sum(b.value)
from mon a, mon b
where a.y = b.y and a.m >= b.m
group by a.y, a.m, a.value
2011 1 120 120
2011 2 130 250
2011 3 500 750
2011 4 10 760
2011 5 140 900
2011 6 100 1000
2011 7 110 1110
2011 8 90 1200
2011 9 70 1270
2011 10 150 1420
2011 11 170 1590
2011 12 600 2190
Try joining the table to itself on the condition that one month is not later than the other, and generate a synthetic month-group code to group by, as follows:
select
sum(value),
year,
up_to_month
from (
select a.value,
a.year,
b.month as up_to_month
from my.rep as a join my.rep as b on a.year = b.year and b.month >= a.month
)
group by up_to_month, year
This gives:
db2 => select * from my.rep
VALUE YEAR MONTH
----------- ----------- -----------
100 2011 1
200 2011 2
300 2011 3
400 2011 4
db2 -t -f rep.sql
1 YEAR UP_TO_MONTH
----------- ----------- -----------
100 2011 1
300 2011 2
600 2011 3
1000 2011 4