Oracle SQL Accumulated value for the date - sql

I have a table with 3 columns: id, date and amount. I would like to get the accumulated SUM for each date as an additional last column.
Is there an easy way to add this column?
I am trying with this:
SELECT date, sum(amount) as accumulated
FROM table group by date
WHERE max(date);
Should I use OVER() for this?

Use a window function to get the total for each day:
SELECT date,
amount,
sum(amount) over (partition by date) as accumulated
FROM the_table;
However, this only works if all your dates have the same time part (in Oracle, a DATE column also contains a time). To ignore the time part, use trunc(), which normalizes every time part to 00:00:00:
SELECT date,
amount,
sum(amount) over (partition by trunc(date)) as accumulated
FROM the_table;
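To try the idea without an Oracle instance, here is a minimal sketch that runs the same kind of query in SQLite through Python's sqlite3 module. The sample rows and the column name dt are made up for illustration, SQLite's date() stands in for Oracle's trunc(), and it assumes an SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Hypothetical sample data; table contents and the column name "dt" are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER, dt TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [
        (1, "2021-01-01 08:00:00", 10),
        (2, "2021-01-01 17:30:00", 20),
        (3, "2021-01-02 09:15:00", 5),
    ],
)

# date(dt) drops the time part here, playing the role of Oracle's trunc(dt).
rows = conn.execute(
    """
    SELECT id, dt, amount,
           SUM(amount) OVER (PARTITION BY date(dt)) AS accumulated
    FROM the_table
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

Both rows from 2021-01-01 get the same per-day total even though their time parts differ, which is the effect trunc() has in the Oracle query.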

Use This:
SELECT T.ID, T.DATE, T.AMOUNT, (SELECT SUM(S.AMOUNT) FROM TABLE S WHERE S.DATE=T.DATE) ACCUMULATED
from
table T
This will give you the records from the table with a sum for all records for the date.

Related

How do I get the second last date in proc sql?

I'm writing a query in SQL to get the First, Last and Second last Transaction date for Customers. I have added the first and last using the Min() and Max() functions, how can I add the second last date in my query?
select distinct Shoppers, Min(Date) as First_Txn, Max(Date) as Last_Txn,
sum(revenue_sale) as Revenue, sum(units) as Units, count(distinct invoice) as Invoices
from myTable
where Date between 20220101 and 20220131
group by 1;

Getting sum of a column that needs a distinct value from other column

I have this table where I wanted to get the sum of the balance column but each item should have a unique value from the date column.
I'm trying to find all the rows in the balance column that are the same and have the same date, and then find the sum of the balance column.
sample data with unique dates:
balance  date
700      2021-07-03
700      2021-09-03
300      2021-09-04
500      2021-09-05
query used goes like:
select distinct a.balance, a.date from table a where a.date between (some date) and (some other date)
I have tried:
select sum(a.balance), a.date from table a where a.date between (some date) and (some other date) group by a.date
but the balance column shows the sum of all of the values in the column but shows distinct dates as shown below.
balance  date
893938   2021-07-03
858585   2021-09-03
728366   2021-09-04
665322   2021-09-05
I guess this is a job for a subquery. So let's take your problem step by step.
I'm trying to find all the rows in the balance column that are the same and have the same date,
This subquery gets you that, I believe. It gives the same result as SELECT DISTINCT, but it also counts the duplicated rows.
SELECT COUNT(*) num_same_rows, balance, date
FROM `table`
WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
GROUP BY date, balance
and then find the sum of the balance column.
Nest the subquery like this.
SELECT SUM(balance) summed_balance, date
FROM (
SELECT COUNT(*) num_same_rows, balance, date
FROM `table`
WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
GROUP BY date, balance
) subquery
GROUP BY date
If you only want to consider rows that actually have duplicates, change your subquery to
SELECT COUNT(*) num_same_rows, balance, date
FROM `table`
WHERE date BETWEEN '2021-01-01' AND '2021-09-01'
GROUP BY date, balance
HAVING COUNT(*) > 1
Be careful here, though. You didn't tell us what you want to do, only how you want to do it. The way you described your problem calls for discarding duplicated data before doing the sums. Is that right? Do you want to discard data?
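To make the "discarding data" point concrete, here is a small sketch, run in SQLite through Python's sqlite3 module, that compares the nested query with a plain GROUP BY. The table name balances, the column name txn_date and the rows are invented for illustration.

```python
import sqlite3

# Hypothetical data: the 700 row for 2021-07-03 is duplicated on purpose.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (balance INTEGER, txn_date TEXT)")
conn.executemany(
    "INSERT INTO balances VALUES (?, ?)",
    [
        (700, "2021-07-03"),
        (700, "2021-07-03"),
        (300, "2021-07-03"),
        (500, "2021-09-05"),
    ],
)

# Collapse identical (date, balance) pairs first, then sum per date.
deduped = conn.execute(
    """
    SELECT SUM(balance) AS summed_balance, txn_date
    FROM (
        SELECT COUNT(*) AS num_same_rows, balance, txn_date
        FROM balances
        GROUP BY txn_date, balance
    ) AS subquery
    GROUP BY txn_date
    ORDER BY txn_date
    """
).fetchall()

# Plain per-date sum that keeps every row, duplicates included.
plain = conn.execute(
    """
    SELECT SUM(balance) AS summed_balance, txn_date
    FROM balances
    GROUP BY txn_date
    ORDER BY txn_date
    """
).fetchall()

print("deduplicated:", deduped)
print("plain:", plain)
```

The two results differ for 2021-07-03 because the nested version counts the repeated 700 only once. Whether that is the right answer depends on whether the duplicates are really redundant.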
The 2nd query you posted looks OK, sort of.
However, I think the problem is that the date column contains not only a date but also a time (as the DATE datatype in Oracle does), so each distinct timestamp forms its own group. Therefore, I'd say that TRUNC is what you need. Something like this:
SELECT TRUNC (a.datum) datum,
SUM (a.balance) sum_balance
FROM table_a a
WHERE a.datum BETWEEN DATE '2021-01-01' AND DATE '2021-09-01'
GROUP BY TRUNC (a.datum)

Find the average lowest item in a collection grouped by date in SQL

My SQL isn't the best - I can get this working in C# but it seems more efficient to get it in my data layer - I've got a table Prices:
ID
Price
DateTime
Each row is exactly 1 hour from the next, so I have a snapshot of a price every hour.
I'm trying to work out which hour in a day over the entire dataset has the lowest price (on average).
So ideally I'm after a list of each hour in the day ranked by how cheap on average that hour is over the entire dataset - so a maximum of 24 rows (one for each hour in the day).
Any help would be greatly appreciated!
Thanks :D
Which database are you on?
Different DBs have different ways to extract the date from a timestamp column.
Postgres has date(timestamp); in Oracle, you can use trunc(timestamp). Most DBs also have to_char/to_date, so you can try that.
Once you have extracted the date, you can try something like this -
select ID,
Price,
DateTime,
trunc(DateTime) as day,
rank() over (partition by trunc(DateTime) order by Price asc) as least_for_day
from Prices
Now you can use the "least_for_day" ranked column and select by day.
Again, depending on the DB, you can either directly qualify on the ranked column in the same SQL or use the above as a sub-query and filter for the rank.
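As an illustration of the sub-query variant, here is a rough sketch in SQLite via Python's sqlite3 module. The sample prices and the column name dt are made up, SQLite's date() stands in for trunc(), and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical hourly snapshots for two days (three hours each, for brevity).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER, price INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        (1, 5, "2022-01-01 00:00:00"),
        (2, 3, "2022-01-01 01:00:00"),
        (3, 4, "2022-01-01 02:00:00"),
        (4, 2, "2022-01-02 00:00:00"),
        (5, 6, "2022-01-02 01:00:00"),
        (6, 7, "2022-01-02 02:00:00"),
    ],
)

# Rank prices within each day in a sub-query, then keep only rank 1 outside it.
cheapest = conn.execute(
    """
    SELECT price_day, dt, price
    FROM (
        SELECT id, price, dt,
               date(dt) AS price_day,
               RANK() OVER (PARTITION BY date(dt) ORDER BY price ASC) AS least_for_day
        FROM prices
    ) AS ranked
    WHERE least_for_day = 1
    ORDER BY price_day
    """
).fetchall()

for row in cheapest:
    print(row)
```

The filter has to sit in the outer query because a window function result cannot be referenced in the WHERE clause of the same SELECT.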
You can use a query like below
select
hour,
avg(daily_rank) avg_rank
from
(
select *,
hour = format(cast([DateTime] as datetime), 'HH'),
daily_rank = dense_rank() over (partition by cast([DateTime] as date) order by price asc)
from Prices
) t
group by hour
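Here is a minimal sketch of the same "average daily rank per hour" idea, run in SQLite through Python's sqlite3 module rather than SQL Server. The data and the names dt and hour_of_day are invented; strftime('%H', ...) replaces format(..., 'HH') and date() replaces the cast to date.

```python
import sqlite3

# Hypothetical hourly snapshots for two days (three hours each, for brevity).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (id INTEGER, price INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        (1, 5, "2022-01-01 00:00:00"),
        (2, 3, "2022-01-01 01:00:00"),
        (3, 4, "2022-01-01 02:00:00"),
        (4, 2, "2022-01-02 00:00:00"),
        (5, 6, "2022-01-02 01:00:00"),
        (6, 7, "2022-01-02 02:00:00"),
    ],
)

# Rank each hour within its day, then average those ranks per hour of day.
hourly = conn.execute(
    """
    SELECT hour_of_day, AVG(daily_rank) AS avg_rank
    FROM (
        SELECT strftime('%H', dt) AS hour_of_day,
               DENSE_RANK() OVER (PARTITION BY date(dt) ORDER BY price ASC) AS daily_rank
        FROM prices
    ) AS t
    GROUP BY hour_of_day
    ORDER BY avg_rank
    """
).fetchall()

for row in hourly:
    print(row)
```

Averaging ranks instead of raw prices keeps one unusually expensive day from dominating the result, at the cost of ignoring how large the price differences are.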
Thank you very much to @Many Manjunath and @DhruvJoshi. Final solution below:
WITH prices AS
(
SELECT
[Price],
[DateTime],
CAST([DateTime] AS TIME) 'Time',
CAST([DateTime] as date) 'Date',
rank() over (partition by cast([DateTime] as date) order by [Price] asc) as least_for_day
FROM [dbo].[Prices]
)
SELECT [Time], count(*) 'Qty Cheapest' FROM prices
WHERE least_for_day = 1
GROUP BY [Time]
ORDER BY 2 DESC
That returns 24 rows, one per hour of the day.

Selecting max date of each month

I have a table with a lot of cumulative columns; these columns reset to 0 at the end of each month. If I sum this data, I'll end up double counting. Instead, with Hive, I'm trying to select the max date of each month.
I've tried this:
SELECT
yyyy_mm_dd,
id,
name,
cumulative_metric1,
cumulative_metric2
FROM
mytable
WHERE
yyyy_mm_dd = last_day(yyyy_mm_dd)
mytable has daily data from the start of the year. In the output of the above, I only see the last date for January but not February. How can I select the last day of each month?
February is not over yet. Perhaps a window function does what you want:
SELECT yyyy_mm_dd, id, name, cumulative_metric1, cumulative_metric2
FROM (SELECT t.*,
MAX(yyyy_mm_dd) OVER (PARTITION BY last_day(yyyy_mm_dd)) as last_yyyy_mm_dd
FROM mytable t
) t
WHERE yyyy_mm_dd = last_yyyy_mm_dd;
This selects the rows for the last day present in the data for each month, even if the month is not over yet.
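The window-function approach can be tried on a tiny data set. This is only a sketch in SQLite through Python's sqlite3 module, not Hive: the rows are invented, and strftime('%Y-%m', ...) is used as the month key in place of Hive's last_day().

```python
import sqlite3

# Hypothetical daily rows: January is complete, February stops on the 11th.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (yyyy_mm_dd TEXT, id INTEGER, cumulative_metric1 INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("2020-01-30", 1, 90),
        ("2020-01-31", 1, 100),
        ("2020-02-10", 1, 30),
        ("2020-02-11", 1, 35),
    ],
)

# For every row, compute the latest date seen in its month, then keep matches.
last_rows = conn.execute(
    """
    SELECT yyyy_mm_dd, id, cumulative_metric1
    FROM (
        SELECT t.*,
               MAX(yyyy_mm_dd) OVER (
                   PARTITION BY strftime('%Y-%m', yyyy_mm_dd)
               ) AS last_yyyy_mm_dd
        FROM mytable t
    ) AS x
    WHERE yyyy_mm_dd = last_yyyy_mm_dd
    ORDER BY yyyy_mm_dd
    """
).fetchall()

for row in last_rows:
    print(row)
```

February is represented by its latest available date rather than being dropped, which is the behaviour the original WHERE clause was missing.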
Use a correlated subquery and the month() function in Hive:
SELECT
yyyy_mm_dd,
id,
name,
cumulative_metric1,
cumulative_metric2
FROM
mytable t1
WHERE
yyyy_mm_dd = (select max(yyyy_mm_dd) from mytable t2 where
month(t1.yyyy_mm_dd) = month(t2.yyyy_mm_dd))

Last day of the month with a twist in SQLPLUS

I would appreciate a little expert help please.
In a SQL SELECT statement, I am trying to get the last day with data per month for the last year.
For example, I can easily get the last day of each month and join that to my data table, but if the last day of the month has no data, nothing is returned for that month. What I need is for the SELECT to return the last day with data for the month.
This is probably easy to do, but to be honest, my brain fart is starting to hurt.
I've attached the select below that works for returning the data for only the last day of the month for the last 12 months.
Thanks in advance for your help!
SELECT fd.cust_id,fd.server_name,fd.instance_name,
TRUNC(fd.coll_date) AS coll_date,fd.column_name
FROM super_table fd,
(SELECT TRUNC(daterange,'MM')-1 first_of_month
FROM (
select TRUNC(sysdate-365,'MM') + level as DateRange
from dual
connect by level<=365)
GROUP BY TRUNC(daterange,'MM')) fom
WHERE fd.cust_id = :CUST_ID
AND fd.coll_date > SYSDATE-400
AND TRUNC(fd.coll_date) = fom.first_of_month
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,
TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)
You probably need to group your data so that each month's data is in the group, and then within the group select the maximum date present. The sub-query might be:
SELECT MAX(coll_date) AS last_day_of_month
FROM Super_Table AS fd
GROUP BY YEAR(coll_date) * 100 + MONTH(coll_date);
This presumes that the functions YEAR() and MONTH() exist to extract the year and month from a date as an integer value. Clearly, this doesn't constrain the range of dates - you can do that, too. If you don't have the functions in Oracle, then you do some sort of manipulation to get the equivalent result.
Using information from Rhose (thanks):
SELECT MAX(coll_date) AS last_day_of_month
FROM Super_Table fd
GROUP BY TO_CHAR(coll_date, 'YYYYMM');
This achieves the same net result, putting all dates from the same calendar month into a group and then determining the maximum value present within that group.
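A quick way to check the grouping is a sketch like the following, run in SQLite through Python's sqlite3 module. The rows are invented, strftime('%Y%m', ...) plays the role of TO_CHAR(coll_date, 'YYYYMM'), and date() is added only to drop the time part from the output.

```python
import sqlite3

# Hypothetical collection timestamps; neither month has data on its last day.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE super_table (cust_id INTEGER, coll_date TEXT)")
conn.executemany(
    "INSERT INTO super_table VALUES (?, ?)",
    [
        (1, "2023-01-15 10:00:00"),
        (1, "2023-01-28 09:00:00"),
        (1, "2023-02-10 07:00:00"),
        (1, "2023-02-27 12:00:00"),
    ],
)

# One group per calendar month, keeping the latest day that has data.
last_days = conn.execute(
    """
    SELECT MAX(date(coll_date)) AS last_day_with_data
    FROM super_table
    GROUP BY strftime('%Y%m', coll_date)
    ORDER BY last_day_with_data
    """
).fetchall()

for row in last_days:
    print(row)
```

The result can then be used as a sub-query to filter the main table to those dates.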
Here's another approach, if ANSI row_number() is supported:
with RevDayRanked(itemDate,rn) as (
select
cast(coll_date as date),
row_number() over (
partition by datediff(month,coll_date,'2000-01-01') -- rewrite datediff as needed for your platform
order by coll_date desc
)
from super_table
)
select itemDate
from RevDayRanked
where rn = 1;
Rows numbered 1 will be nondeterministically chosen among rows on the last active date of the month, so you don't need distinct. If you want information out of the table for all rows on these dates, use rank() over days instead of row_number() over coll_date values, so a value of 1 appears for any row on the last active date of the month, and select the additional columns you need:
with RevDayRanked(cust_id, server_name, coll_date, rk) as (
select
cust_id, server_name, coll_date,
rank() over (
partition by datediff(month,coll_date,'2000-01-01')
order by cast(coll_date as date) desc
)
from super_table
)
select cust_id, server_name, coll_date
from RevDayRanked
where rk = 1;
If row_number() and rank() aren't supported, another approach is this (for the second query above). Select all rows from your table for which there's no row in the table from a later day in the same month.
select
cust_id, server_name, coll_date
from super_table as ST1
where not exists (
select *
from super_table as ST2
where datediff(month,ST1.coll_date,ST2.coll_date) = 0
and cast(ST2.coll_date as date) > cast(ST1.coll_date as date)
)
If you have to do this kind of thing a lot, see if you can create an index over computed columns that hold cast(coll_date as date) and a month indicator like datediff(month,'2001-01-01',coll_date). That'll make more of the predicates SARGs.
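The NOT EXISTS variant above can be sketched as follows, using SQLite through Python's sqlite3 module. The rows are invented, and since SQLite has no datediff(month, ...), the "same month" test compares strftime('%Y-%m', ...) values instead.

```python
import sqlite3

# Hypothetical rows: two servers report on the last active day of January.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE super_table (cust_id INTEGER, server_name TEXT, coll_date TEXT)"
)
conn.executemany(
    "INSERT INTO super_table VALUES (?, ?, ?)",
    [
        (1, "srv_a", "2023-01-15 10:00:00"),
        (1, "srv_a", "2023-01-28 09:00:00"),
        (1, "srv_b", "2023-01-28 18:00:00"),
        (1, "srv_a", "2023-02-10 07:00:00"),
    ],
)

# Keep a row only if no other row exists on a later day of the same month.
last_active = conn.execute(
    """
    SELECT cust_id, server_name, coll_date
    FROM super_table AS st1
    WHERE NOT EXISTS (
        SELECT 1
        FROM super_table AS st2
        WHERE strftime('%Y-%m', st2.coll_date) = strftime('%Y-%m', st1.coll_date)
          AND date(st2.coll_date) > date(st1.coll_date)
    )
    ORDER BY coll_date
    """
).fetchall()

for row in last_active:
    print(row)
```

Comparing whole days rather than full timestamps is what keeps both January 28 rows, so every row from the last active day survives.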
Putting the above pieces together, would something like this work for you?
SELECT fd.cust_id,
fd.server_name,
fd.instance_name,
TRUNC(fd.coll_date) AS coll_date,
fd.column_name
FROM super_table fd
WHERE fd.cust_id = :CUST_ID
AND TRUNC(fd.coll_date) IN (
SELECT MAX(TRUNC(coll_date))
FROM super_table
WHERE coll_date > SYSDATE - 400
AND cust_id = :CUST_ID
GROUP BY TO_CHAR(coll_date,'YYYYMM')
)
GROUP BY fd.cust_id,fd.server_name,fd.instance_name,TRUNC(fd.coll_date),fd.column_name
ORDER BY fd.server_name,fd.instance_name,TRUNC(fd.coll_date)