SQL SUM OVER PARTITION BY 2 columns not working


Common question, I know. I just haven't been able to find a solution to my question, so hit me with the removal and/or downvotes if you must.
(Oracle 12c)
I have data that looks like this:
Date ITEM QTY
01-MAR-20 STS 6920
01-MAR-20 STS 2581
01-MAR-20 S01 22606
01-MAR-20 S01 22312
01-MAR-20 S01 56000
....
I want to get QTY to aggregate (sum) at the Date and item level with only one record for each unique item on each date, so it looks like this:
Date ITEM QTY
01-MAR-20 STS 9501
01-MAR-20 S01 100918
The query I'm trying to use to do this is:
SELECT
WO.DATE,
D.ITEM,
SUM(WO.QUANTITY) OVER (PARTITION BY WO.DATE, D.ITEM) AS QTY
FROM SCHEMA_1.WO,
SCHEMA_2.D
WHERE WO.ITEM_DIM_KEY = D.ITEM_DIM_KEY AND
(DATE > '01 MAR 2020' AND DATE < '01 JUL 2020')
ORDER BY WO.COMPLETED_DATE;

A date data type in Oracle can have a time component, so you need to be careful. Unless you know that you have no time component, trunc() is safer. Also, you can use the date keyword to handle date constants:
SELECT TRUNC(WO.DATE), D.ITEM,
SUM(WO.QUANTITY) AS QTY
FROM SCHEMA_1.WO JOIN
SCHEMA_2.D
ON WO.ITEM_DIM_KEY = D.ITEM_DIM_KEY
WHERE WO.DATE >= DATE '2020-03-01' AND
WO.DATE < DATE '2020-07-01'
GROUP BY TRUNC(WO.DATE), D.ITEM
ORDER BY TRUNC(WO.DATE);
Notes:
You don't need an analytic function. Aggregation should be sufficient.
Use proper, explicit, standard, readable JOIN syntax.
I assume the column in the ORDER BY is intended to be the first column in the result set.

You just need a basic GROUP BY query:
SELECT
WO.DATE,
D.ITEM,
SUM(WO.QUANTITY) AS QTY
FROM SCHEMA_1.WO
INNER JOIN SCHEMA_2.D
ON WO.ITEM_DIM_KEY = D.ITEM_DIM_KEY
WHERE
WO.DATE > '01 MAR 2020' AND WO.DATE < '01 JUL 2020'
GROUP BY
WO.DATE,
D.ITEM
ORDER BY
WO.DATE;
Using SUM as a window function would make sense if you wanted to retain every record in the result set of the join. However, in your case, you want to report aggregate sums for each date/item group. Using GROUP BY is what you want here.
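To make the difference concrete, here is a minimal sketch that runs both forms against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for Oracle (window functions need SQLite 3.25 or later), and the table and column names (wo, work_date, item, qty) are simplified assumptions, not the real schema.

```python
import sqlite3

# SQLite stands in for Oracle here; wo/work_date/item/qty are illustrative names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wo (work_date TEXT, item TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO wo VALUES (?, ?, ?)",
    [
        ("2020-03-01", "STS", 6920),
        ("2020-03-01", "STS", 2581),
        ("2020-03-01", "S01", 22606),
        ("2020-03-01", "S01", 22312),
        ("2020-03-01", "S01", 56000),
    ],
)

# Window function: every input row survives, each carrying its group's total.
windowed = conn.execute(
    """
    SELECT work_date, item,
           SUM(qty) OVER (PARTITION BY work_date, item) AS qty
    FROM wo
    ORDER BY item
    """
).fetchall()

# GROUP BY: exactly one output row per (work_date, item) group.
grouped = conn.execute(
    """
    SELECT work_date, item, SUM(qty) AS qty
    FROM wo
    GROUP BY work_date, item
    ORDER BY item
    """
).fetchall()

print(len(windowed), "rows from the window query")
print(grouped)
```

The window query returns five rows with the totals repeated, while the GROUP BY query returns the two rows the question asks for.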

I think you just need a simple aggregation:
select trunc(wo.date), d.item, sum(wo.quantity) as qty
from schema_1.wo
join schema_2.d on wo.item_dim_key = d.item_dim_key
where wo.date > date '2020-03-01' and wo.date < date '2020-07-01'
group by trunc(wo.date), d.item
order by trunc(wo.date)
Incidentally, I upgraded the JOIN to SQL-92 syntax and changed the date strings to ISO date literals instead of VARCHARs.

Related

Group Data by Year, Oracle SQL

I am trying to create a query that counts records that existed within a year. The table looks like this:
Title_ID ISSUE_DATE EXPIRY_DATE CLIENT_NUMBER
123 '26-JUN-19' '17-AUG-20' 8529
124 '04-APR-19' '17-SEP-22' 8529
125 '09-MAY-15' '11-SEP-19' 3654
126 '31-DEC-19' '25-NOV-22' 9852
127 '27-OCT-18' '26-FEB-21' 2254
128 '05-OCT-11' '01-JAN-19' 9852
Specifically, I want to count the number of distinct CLIENT_NUMBERS of the records that existed in a given calendar year.
The record (title) exists from the ISSUE_DATE until the EXPIRY_DATE. If the record existed at any point within a year (Let's say 2019), then we are interested in including it in our client count.
So, if the record was issued in 2019 or if the record expired in 2019 or if the record was issued before 2019 and expired after 2019, then we are interested in including it in the client count for the year it existed.
I have built the following query that does this, but only for one specific year (2019). I'd like to build the query further so it looks at each calendar year and counts the distinct client numbers when the client has an active title:
SELECT *
-- count(distinct client_number)
FROM
TITLE
WHERE
issue_date between '01-Jan-19' and '31-Dec-19'
or expiry_date between '01-Jan-19' and '31-Dec-19'
or (issue_date < '01-Jan-19' and expiry_date > '31-Dec-19')
Where I am having trouble is, my data is much larger than the subset I have provided. I would like to recursively get counts of distinct client numbers by year using the same kind of logic to include a record within a calendar year as I have outlined above. So, I'd like to have a table like this:
YEAR COUNT_OF_CLIENT_NUMBERS
2020 5469
2019 5587
2018 4852
2017 4501
2016 3265
etc
I think I've stretched my current SQL abilities at this point, so I thought I'd ask to see if there are any suggestions to make this happen?
Thanks.
EDIT: to clarify, the issue date and the expiry date apply to the title, not the client. So, the title is issued on the issue date and expires on the expiry date. A client can own one or more title(s).
So, I am looking to get a count of how many distinct clients own active titles within a given year if one or more of their titles is active within that year. So the key is, a title is considered active if it was issued in that year OR it expired within that year OR it was issued before that year and expired after that year. A title CAN be active in multiple years (i.e. if it was issued on Feb. 4, 2014 and expires on Apr. 7, 2017, I want to include the client in the count for each year that title exists: 2014, 2015, 2016 and 2017).
So, I created a table to join to (thanks @GMB for the suggestion):
with calendar_year (y) as
(
select 2010 from dual
union all select y + 1 from calendar_year where y < 2020
)
select * from calendar_year
Which returns:
2010
2011
2012
2013
2014
etc
I want to join that to my titles table, but I am having issues recursively looking at the issue date and expiry date to join up the title to each year it existed in. Any help in that area, would be great!

You can use a recursive query to generate the years, then bring in the table with a left join, and aggregate:
with dates (dt) as (
select date '2016-01-01' from dual
union all select add_months(dt, 12) from dates where dt < date '2020-01-01'
)
select d.dt, count(distinct t.client_number) count_of_client_numbers
from dates d
left join title t
on t.issue_date < add_months(d.dt, 12) -- issued before the year ends
and t.expiry_date >= d.dt -- and not expired before the year starts
group by d.dt
The upside of this approach is that you get results for each and every year, even those where no title started or ended.
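As a runnable sketch of the year-by-year count, the snippet below loads the six sample titles from the question into SQLite (via Python's sqlite3, standing in for Oracle) and joins them to a recursively generated list of years. Dates are stored as ISO strings, and the overlap test (issued before the next year starts, expired on or after the year starts) is my reading of the "active at any point in the year" rule, so treat it as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE title (title_id INTEGER, issue_date TEXT, "
    "expiry_date TEXT, client_number INTEGER)"
)
conn.executemany(
    "INSERT INTO title VALUES (?, ?, ?, ?)",
    [
        (123, "2019-06-26", "2020-08-17", 8529),
        (124, "2019-04-04", "2022-09-17", 8529),
        (125, "2015-05-09", "2019-09-11", 3654),
        (126, "2019-12-31", "2022-11-25", 9852),
        (127, "2018-10-27", "2021-02-26", 2254),
        (128, "2011-10-05", "2019-01-01", 9852),
    ],
)

yearly = conn.execute(
    """
    WITH RECURSIVE calendar_year(y) AS (
        SELECT 2010
        UNION ALL
        SELECT y + 1 FROM calendar_year WHERE y < 2020
    )
    SELECT c.y, COUNT(DISTINCT t.client_number) AS clients
    FROM calendar_year c
    LEFT JOIN title t
           -- title overlaps the year: issued before next Jan 1,
           -- and expiring on or after this Jan 1
           ON t.issue_date  <  printf('%04d-01-01', c.y + 1)
          AND t.expiry_date >= printf('%04d-01-01', c.y)
    GROUP BY c.y
    ORDER BY c.y
    """
).fetchall()

counts = dict(yearly)
for year, clients in yearly:
    print(year, clients)
```

Because of the left join, a year with no active titles (2010 in this sample) still appears, with a count of 0.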

You can get number of clients on any day by unpivoting the data, so there is one row per date. Then keep track of the "ins" and "outs".
You don't specify the database, but here is one approach:
select dte, sum(inc),
sum(sum(inc)) over (order by dte) as active_on_date
from ((select issue_date as dte, 1 as inc
from t
) union all
(select expiry_date as dte, -1 as inc
from t
)
) t
group by dte
order by dte;
EDIT:
Hmmm, the above may not do exactly what you want. If you want to count distinct client numbers rather than overall rows, then it might be simpler to just list the dates and join:
select d.dte, count(distinct t.client_number)
from (select date '2020-01-01' as dte from dual union all
select date '2019-01-01' as dte from dual union all
select date '2018-01-01' as dte from dual union all
. . .
) d left join
t
on d.dte between t.issue_date and t.expiry_date
group by d.dte
order by d.dte;
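The "ins" and "outs" idea from the start of this answer can be tried out with a small sketch. It uses Python's sqlite3 as a stand-in database (SQLite 3.25+ for window functions) and the six sample titles from the question, with dates as ISO strings. The aggregation is done in a subquery before the running sum, which is a simplification on my part; note that it counts active titles, not distinct clients.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE title (issue_date TEXT, expiry_date TEXT, client_number INTEGER)"
)
conn.executemany(
    "INSERT INTO title VALUES (?, ?, ?)",
    [
        ("2019-06-26", "2020-08-17", 8529),
        ("2019-04-04", "2022-09-17", 8529),
        ("2015-05-09", "2019-09-11", 3654),
        ("2019-12-31", "2022-11-25", 9852),
        ("2018-10-27", "2021-02-26", 2254),
        ("2011-10-05", "2019-01-01", 9852),
    ],
)

running = conn.execute(
    """
    SELECT dte, delta,
           SUM(delta) OVER (ORDER BY dte) AS active_titles  -- running total
    FROM (
        -- one +1 event per issue date, one -1 event per expiry date
        SELECT dte, SUM(inc) AS delta
        FROM (
            SELECT issue_date  AS dte,  1 AS inc FROM title
            UNION ALL
            SELECT expiry_date AS dte, -1 AS inc FROM title
        )
        GROUP BY dte
    )
    ORDER BY dte
    """
).fetchall()

for dte, delta, active in running:
    print(dte, delta, active)
```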

Oracle sql query not showing specific date

This is my first time with an Oracle database. I saved data with the date 30/04/20 and I want to retrieve it. I use SELECT * FROM USER_ACTION WHERE ACTION_DATE_TIME <= '30-APR-20' order by ACTION_DATE_TIME desc, but no rows dated 30/04/20 are shown. However, when I use SELECT * FROM USER_ACTION WHERE ACTION_DATE_TIME <= '01-MAY-20' order by ACTION_DATE_TIME desc, I can see the data. Is there any way to get the rows for the exact date, without having to add an extra day?
This is result when use 30-APR-20:
This is result when use 01-MAY-20:

Given that your ACTION_DATE_TIME column is a datetime with a time component, if you want to include all of 30th April 2020, you should be using this inequality:
SELECT *
FROM USER_ACTION
WHERE ACTION_DATE_TIME < date '2020-05-01'
ORDER BY ACTION_DATE_TIME DESC;
This will include all dates strictly less than 1st May 2020, which includes all of 30th April 2020.
If the date value is coming from the outside, then just add one day to it:
SELECT *
FROM USER_ACTION
WHERE ACTION_DATE_TIME < date '2020-04-30' + 1
ORDER BY ACTION_DATE_TIME DESC;
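The effect is easy to reproduce in a small sketch. Here SQLite (through Python's sqlite3) stands in for Oracle, timestamps are stored as ISO strings, and the sample rows are made up for illustration. A `<=` comparison against the bare date behaves like a comparison against midnight, so rows later on that day are missed, while the half-open "less than the next day" form picks them up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_action (id INTEGER, action_date_time TEXT)")
conn.executemany(
    "INSERT INTO user_action VALUES (?, ?)",
    [
        (1, "2020-04-29 09:00:00"),
        (2, "2020-04-30 14:30:00"),  # on the target day, but after midnight
        (3, "2020-05-01 08:15:00"),
    ],
)

target_day = "2020-04-30"  # value supplied from outside the query

# "<= the date" stops at midnight and misses row 2.
inclusive = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM user_action WHERE action_date_time <= ? ORDER BY id",
        (target_day,),
    )
]

# "< the date plus one day" covers the whole target day and nothing after it.
half_open = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM user_action "
        "WHERE action_date_time < date(?, '+1 day') ORDER BY id",
        (target_day,),
    )
]

print(inclusive)
print(half_open)
```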

Use trunc to convert the datetime to a date, and compare against a date literal rather than a string, as below:
SELECT *
FROM USER_ACTION
WHERE TRUNC(ACTION_DATE_TIME) <= date '2020-04-30'
order by ACTION_DATE_TIME desc

How to get unique dates based on from_date and to_date in SQL Server

from_date to_date duration
-------------------------------------
2018-10-01 2018-10-10 9
2018-10-05 2018-10-07 3
If I provide input @from_date = 2018-10-01, to_date = 2018-10-11, I want to display the count as 9

How about that:
SELECT DATEDIFF(DAY,'20181001','20181011')-1

--To select a single value per row
SELECT
DATEDIFF(DAY,from_date,to_date) as duration
FROM
SomeTable
You could apply a WHERE clause to filter to just the specific row whose duration you want returned, or wrap the DATEDIFF function in an AVG() or SUM() to get the average or total of all the durations in the table. You can do all kinds of very complex things with T-SQL. For instance, the query below gets the average duration, grouped by start month (from_date), for everything that started in 2017.
E.G. -
SELECT
DATEPART(Month, from_date) as Month,
AVG(DATEDIFF(DAY, from_date, to_date)) as AvgDuration
FROM SomeTable
WHERE
DATEPART(Year, from_date) = 2017
GROUP BY
DATEPART(Month, from_date)
Hope this helps. If not, feel free to try again. :)

Grouping all dates as one field and showing the sum of sales

I have converted all dates within my table to reflect as YYYY/MM/01 but I am left with 25 or so of these dates that are all the same and I just want to group them together and I can't figure out how to do it. I'm newish to SQL and was hoping someone could point me in the right direction for this.
Much appreciated!
SELECT
DATEFROMPARTS(YEAR(ReportedDate), MONTH(ReportedDate), 1) AS Date, SUM(Sales) Sales
FROM
dbo.Sales
WHERE
YEAR(ReportedDate) = 2018 AND MONTH(ReportedDate) = 01
GROUP BY
ReportedDate

Because you are grouping by ReportedDate, for every ReportedDate you will get a record, even though you didn't select ReportedDate in your SELECT clause. Think of it as a hidden column in your data. Instead, try grouping by the functions in your select statement.
SELECT
DATEFROMPARTS(YEAR(ReportedDate), MONTH(ReportedDate), 1) AS Date, SUM(Sales) Sales
FROM
dbo.Sales
WHERE
YEAR(ReportedDate) = 2018 AND MONTH(ReportedDate) = 01
GROUP BY
DATEFROMPARTS(YEAR(ReportedDate), MONTH(ReportedDate), 1)
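A quick way to see the "hidden column" effect is to run both groupings on a few rows. This sketch uses SQLite through Python's sqlite3 rather than SQL Server, so `strftime('%Y-%m-01', ...)` plays the role of DATEFROMPARTS(YEAR(...), MONTH(...), 1); the sample sales figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (reported_date TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("2018-01-05", 100),
        ("2018-01-05", 50),
        ("2018-01-20", 25),
    ],
)

# Grouping by the raw date: one row per distinct day,
# even though every row displays the same month label.
by_raw_date = conn.execute(
    """
    SELECT strftime('%Y-%m-01', reported_date) AS month_start, SUM(sales)
    FROM sales
    GROUP BY reported_date
    ORDER BY reported_date
    """
).fetchall()

# Grouping by the same expression that is selected: one row per month.
by_month = conn.execute(
    """
    SELECT strftime('%Y-%m-01', reported_date) AS month_start, SUM(sales)
    FROM sales
    GROUP BY strftime('%Y-%m-01', reported_date)
    """
).fetchall()

print(by_raw_date)
print(by_month)
```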

As an alternative to your query, I suggest using the EOMONTH function, so you would not need the extra date functions. I also think it's better to show the last day of the month than the first day when showing totals per month.
SELECT
EOMONTH(ReportedDate) AS Date, SUM(Sales) Sales
FROM
dbo.Sales
WHERE
EOMONTH(ReportedDate) = EOMONTH(GETDATE(), -1)
GROUP BY
EOMONTH(ReportedDate)
Notes:
EOMONTH(GETDATE(), -1) gets last day of previous month
Use DATEADD(DD, 1, EOMONTH(ReportedDate, -1)) to get first day of month

how to display the total sum of all individual dates in Oracle

I have a query that should display the total sum of sales for each individual date, not the separate sales within each day. Below is the query I have tried, and I am attaching a sample image of the output that I have gotten from this query. Your help would be appreciated.
SELECT sc_cd,Mon,sum(NET_SAL) SALE
FROM (SELECT TO_CHAR(to_date(deli_DT),'Mm') mm,
sc_cd,
TO_CHAR(to_date(deli_DT),'dd-Mon-yy') Mon,
sum(sale_net) NET_SAL
from bill_mas
where sc_cd not in ('22')
AND deli_dt BETWEEN '01-aug-15' and '31-aug-15'
AND CANCL IS NULL
AND sc_cd='01'
GROUP BY TO_CHAR(to_date(deli_DT),'Mm'),
SC_cd,
TO_CHAR(to_date(deli_DT),'dd-Mon-yy'),
sale_net)
ORDER BY 3;

You have the sale_net column in the GROUP BY clause, so you will still see one row for each value - you'll only see any actual aggregation if you have two source rows with the same value. Remove that from the GROUP BY. It's also not clear why you're using a subquery; if you don't want MM in the output, just don't select it in the first place:
SELECT sc_cd,TO_CHAR(deli_DT,'dd-Mon-yy') Mon,sum(sale_net) NET_SAL
from bill_mas
where sc_cd not in('22')
and deli_dt BETWEEN '01-aug-15' and '31-aug-15'
and CANCL IS NULL
AND sc_cd='01'
GROUP BY SC_cd,TO_CHAR(deli_DT,'dd-Mon-yy')
order by 3
You should perhaps be selecting and grouping by trunc(deli_DT) rather than TO_CHAR(to_date(deli_DT),'dd-Mon-yy'), but if you need to format it anyway then it might not matter. But if deli_DT is a date field - as it seems to be, though it isn't entirely clear - then you should not be doing to_date() on it at all, as Boneist commented. You're really doing to_date(to_char(deli_dt)), with two implicit conversions using your NLS_DATE_FORMAT.
Using strings for your filter isn't a good idea though, and neither is using two-digit years; and you won't be seeing any rows which are from 2015-08-31 but after midnight; you should use explicit date conversions or literals, and use greater than/less than instead of between:
and deli_dt >= to_date('01-aug-2015', 'DD-mon-YYYY')
and deli_dt < to_date('01-sep-2015', 'DD-mon-YYYY')
Or:
and deli_dt >= date '2015-08-01'
and deli_dt < date '2015-09-01'

The issue is simple, you must eliminate sale_net column from group by clause.
Additionally, if deli_DT is a Date datatype, you should write the query without the to_date function. Also, you don't need two group by clauses:
SELECT
TO_CHAR(deli_DT,'Mm') mm,
sc_cd,
TO_CHAR(deli_DT,'dd-Mon-yy') Mon,
sum(sale_net) NET_SAL
from bill_mas
where sc_cd not in('22')
and deli_dt BETWEEN '01-aug-15' and '31-aug-15'
and CANCL IS NULL
AND sc_cd='01'
GROUP BY TO_CHAR(deli_DT,'Mm'), sc_cd, TO_CHAR(deli_DT,'dd-Mon-yy')
order by 3;

If you are using Oracle and want sales by day, you can use an analytic function like-
select mon, sum(net_sale) over (partition by mon) as net_sale from your_table;
It will give you the day's total sale on every row. For example, I have made a temp1 table like this-
id mon net_sale
1 05-08-15 123
1 05-08-15 23
1 05-08-15 1
1 12-08-15 23
1 12-08-15 455
1 12-08-15 122
and the output is like-
Mon net_sale
05-08-15 147
05-08-15 147
05-08-15 147
12-08-15 600
12-08-15 600
12-08-15 600
