Select sum by date distinction in Oracle SQL

Transaction Table
No Date Amount
1 06-07-2017 1000
2 06-07-2017 1500
3 08-07-2017 2000
4 09-07-2017 2000
5 09-07-2017 2000
6 09-07-2017 2000
Is it possible to achieve this result with a single query (no query loop)?
No Date Total Amount
1 06-07-2017 2500
2 08-07-2017 2000
3 09-07-2017 6000

Are you looking for Group By?
select trunc("Date"),        -- "Date" has to be quoted since DATE is a keyword in Oracle
       sum(Amount) as "Total Amount"
from MyTable
group by trunc("Date")
order by trunc("Date")
Edit: it seems that the Date field contains both date and time, so the time part should be truncated with TRUNC when aggregating (see the comments).
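For clarity, a minimal sketch of what TRUNC does to a DATE value that carries a time component:
-- TRUNC drops the time portion, so rows from the same calendar day group together
select to_char(trunc(to_date('06-07-2017 14:35', 'DD-MM-YYYY HH24:MI')),
               'DD-MM-YYYY HH24:MI') as truncated
from dual;
-- returns 06-07-2017 00:00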

For the exact results:
select row_number() over (order by date) as No,
date, sum(amount) as "Total Amount"
from t
group by date
order by date;
Note: In Oracle, the date data type can contain a time component -- and this might not be visible in the output. If so, the aggregation doesn't do what you expect. If this is the case, then:
select row_number() over (order by trunc(date)) as No,
trunc(date) as date, sum(amount) as "Total Amount"
from t
group by trunc(date)
order by trunc(date);

Related

Retrieve Customers with a Monthly Order Frequency greater than 4

I am trying to optimize the query below to fetch all customers from the last three months who have a monthly order frequency of 4 or more for each of those months.
Customer ID  Feb  Mar  Apr
0001         4    5    6
0002         3    2    4
0003         4    2    3
In the above table, only the customer with Customer ID 0001 should be picked, as they consistently have 4 or more orders per month.
Below is a query I have written, which pulls all customers with an average purchase frequency of 4 in the last 90 days, but it does not check for a consistent purchase frequency of 4 or more in each of the last three months.
Query:
SELECT distinct lines.customer_id Customer_ID, (COUNT(lines.order_id)/90) PurchaseFrequency
from fct_customer_order_lines lines
LEFT JOIN product_table product
ON lines.entity_id= product.entity_id
AND lines.vendor_id= product.vendor_id
WHERE LOWER(product.country_code)= "IN"
AND lines.date >= DATE_SUB(CURRENT_DATE() , INTERVAL 90 DAY )
AND lines.date < CURRENT_DATE()
GROUP BY Customer_ID
HAVING PurchaseFrequency >=4;
I tried to use window functions, but I am not sure whether they are needed in this case.
I would sum the orders per month instead of computing the average, and then retrieve those customers whose sum is greater than 4 in each of the last three months.
Also, I think you should select your interval using "month(CURRENT_DATE()) - 3" instead of a window of 90 days. Of course, if needed, you should handle the case where CURRENT_DATE falls in Jan-Feb-Mar and go back to Oct-Nov-Dec of the previous year.
I'm not familiar with Google BigQuery, so I can't write your query, but I hope this helps.
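A rough sketch of that idea in BigQuery-style SQL, assuming the question's fct_customer_order_lines table and a threshold of 4 orders per month (the month arithmetic is simplified and untested):
-- count orders per customer per month for the last three full months,
-- then keep customers who reach the threshold in every one of those months
WITH monthly AS (
  SELECT customer_id,
         DATE_TRUNC(date, MONTH) AS order_month,
         COUNT(order_id) AS orders_in_month
  FROM fct_customer_order_lines
  WHERE date >= DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 3 MONTH), MONTH)
    AND date < DATE_TRUNC(CURRENT_DATE(), MONTH)
  GROUP BY customer_id, order_month
)
SELECT customer_id
FROM monthly
GROUP BY customer_id
HAVING COUNT(*) = 3
   AND MIN(orders_in_month) >= 4;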
So I've found the solution to this using the WITH clause, as below:
WITH filtered_orders AS (
  select distinct
    customer_id ID,
    extract(MONTH from date) Order_Month,
    count(order_id) CountofOrders
  from customer_order_lines lines
  where EXTRACT(YEAR FROM date) = 2022 AND EXTRACT(MONTH FROM date) IN (2,3,4)
  group by ID, Order_Month
  having CountofOrders >= 4)
select distinct ID
from filtered_orders
group by ID
having count(Order_Month) = 3;
Hope this helps!
An option could be to first count the orders by month and then filter the users whose purchases in all months are above your threshold:
WITH ORDERS_BY_MONTH AS (
SELECT
DATE_TRUNC(lines.date, MONTH) PurchaseMonth,
lines.customer_id Customer_ID,
COUNT(lines.order_id) PurchaseFrequency
FROM fct_customer_order_lines lines
LEFT JOIN product_table product
ON lines.entity_id= product.entity_id
AND lines.vendor_id= product.vendor_id
WHERE LOWER(product.country_code)= "IN"
AND lines.date >= DATE_SUB(CURRENT_DATE() , INTERVAL 90 DAY )
AND lines.date < CURRENT_DATE()
GROUP BY PurchaseMonth, Customer_ID
)
SELECT
Customer_ID,
AVG(PurchaseFrequency) AvgPurchaseFrequency
FROM ORDERS_BY_MONTH
GROUP BY Customer_ID
HAVING COUNT(1) = COUNTIF(PurchaseFrequency >= 4)

Past 7 days running amounts average as progress per each date

So, the query is simple but I am facing issues in implementing the SQL logic. Here's the scenario: suppose I have records like
Phoneno Company Date Amount
83838 xyz 20210901 100
87337 abc 20210902 500
47473 cde 20210903 600
The output expected is the past 7 days' progress as a running average of amount for each date (the current date and the 6 days before):
Date amount avg
20210901 100 100
20210902 500 300
20210903 600 400
I tried
Select date, amount,
  (select avg(lg)
   from (Select case when lag(amount) over (order by NULL) IS NULL
                     THEN amount
                     ELSE lag(amount) over (order by NULL)
                END AS lg
         From table
         WHERE DATE >= t.date - 7)
  ) as avg
From table t;
But I am getting wrong avg values. Could anyone please help?
Note: I've tried without lag too, and it gives the wrong averages as well.
You could use a self join to group the dates
select distinct
a.dt,
b.dt as preceding_dt, --just for QA purpose
a.amt,
b.amt as preceding_amt,--just for QA purpose
avg(b.amt) over (partition by a.dt) as avg_amt
from t a
join t b on a.dt-b.dt between 0 and 6
group by a.dt, b.dt, a.amt, b.amt; --to dedupe the data after the join
If you want to make your correlated subquery approach work, you don't really need the lag.
select dt,
amt,
(select avg(b.amt) from t b where a.dt-b.dt between 0 and 6) as avg_lg
from t a;
If you don't have multiple rows per date, this gets even simpler
select dt,
amt,
avg(amt) over (order by dt rows between 6 preceding and current row) as avg_lg
from t;
Also, the condition DATE >= t.date - 7 you used is only bounded on one side, meaning it will qualify a lot of dates that shouldn't have been qualified.
DEMO
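If you want to keep that style of filter, a minimal sketch of the same predicate bounded on both sides (using the a/b aliases from the queries above):
-- include only the current date and the 6 days before it
where b.dt between a.dt - 6 and a.dt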
You can use an analytic function with a windowing clause to get your results:
SELECT DISTINCT BillingDate,
AVG(amount) OVER (ORDER BY BillingDate
RANGE BETWEEN TO_DSINTERVAL('7 00:00:00') PRECEDING
AND TO_DSINTERVAL('0 00:00:00') FOLLOWING) AS RUNNING_AVG
FROM accounts
ORDER BY BillingDate;
Here is a DBFiddle showing the query in action (LINK)

Oracle SQL - create select statement which will retrieve every end date of the month but compare the values the day or two after the end of month

Let's say I have a customer table.
customer_name || date || amount
-----------------------------------
A 31-OCT-20 100
A 01-NOV-20 100
A 02-NOV-20 200
B 31-OCT-20 300
B 01-NOV-20 325
B 02-NOV-20 350
I need to create a select statement which will retrieve every end-of-month date and compare its amount with the amounts on the day or two after. If the amount on the day or two after is different from the end-of-month amount, display the most recently changed amount.
Example 1 - Retrieve customer A for 31-OCT-20, compare to 01-NOV-20 and 02-NOV-20, output 200 for the amount.
Example 2 - Retrieve customer B for 31-OCT-20, compare to 01-NOV-20 and 02-NOV-20, output 350 for the amount.
Hmmm . . .
select t.*,
(case when next_amount <> amount or next2_amount <> amount
then greatest(next_amount, next2_amount)
else next_amount
end) as imputed_next_2_days
from (select t.*,
lead(amount) over (partition by customer_name order by date) as next_amount,
lead(amount, 2) over (partition by customer_name order by date) as next2_amount
from t
) t
where date = last_day(date);
You can use the following query:
Select * from
(Select t.*,
row_number() over (partition by customer_name order by dt desc) as rn
From your_table t
Where extract(day from t.dt + 2) between 2 and 4)
Where rn = 1
Tip of the day: don't use Oracle reserved keywords (like DATE) as column names.
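For illustration, a minimal sketch of the two usual workarounds, assuming the column was created with the quoted name "DATE" and a hypothetical your_table:
-- either reference the reserved word with a quoted, case-sensitive identifier ...
select "DATE", amount from your_table;
-- ... or rename the column once and use a plain identifier afterwards
alter table your_table rename column "DATE" to dt;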

Finding when requests are met or exceeded by customer by month

I have a table that has customers and I want to find what month the customer met or exceeded a certain number of requests.
The table has customer_id and a timestamp of each request.
What I am looking for is the month (or day) that the customer met or exceeded 10000 requests. I've tried to get a running total in place but this just isn't working for me. I've left it in the code in case someone knows how I can do this.
What I have is the following:
SELECT
customer_id
, DATE_TRUNC(CAST(TIMESTAMP_MILLIS(created_timestamp) AS DATE), MONTH) as cMonth
, COUNT(created_timestamp) as searchCount
-- , SUM(COUNT (DISTINCT(created_timestamp))) OVER (ROWS UNBOUNDED PRECEDING) as RunningTotal2
FROM customer_requests.history.all
GROUP BY distributor_id, cMonth
ORDER BY 2 ASC, 1 DESC;
The representation I am after is something like this.
customer requests cMonth totalRequests
cust1 6000 2017-10-01 6000
cust1 4001 2017-11-01 10001
cust2 4000 2017-10-01 4000
cust2 4000 2017-11-01 8000
cust2 4000 2017-12-01 12000
cust2 3000 2017-12-01 3000
cust2 3000 2017-12-01 6000
cust2 3000 2017-12-01 9000
cust2 3000 2017-12-01 12000
Assuming SQL Server, try this (adjusting the cutoff at the top to get the number of transactions you need; right now it looks for the thousandth transaction per customer).
Note that this will not return customers who have not exceeded your cutoff, and assumes that each transaction has a unique date (or is issued a sequential ID number to break ties if there can be ties on date).
DECLARE @cutoff INT = 1000;
WITH CTE
AS (SELECT customer_id,
transaction_ID,
transaction_date,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY transaction_date, transaction_ID) AS RN,
COUNT(transaction_ID) OVER (PARTITION BY customer_id) AS TotalTransactions
FROM #test)
SELECT DISTINCT
customer_id,
transaction_date as CutoffTransactionDate,
TotalTransactions
FROM CTE
WHERE RN = @cutoff;
How it works:
row_number assigns a unique sequential identifier to each of a customer's transactions, in the order in which they were made. count tells you the total number of transactions a person made (assuming again one record per transaction - otherwise you would need to calculate this separately, since distinct won't work with the partition).
Then the second select returns the 1,000th (or however many you specify) row for each customer and its date, along with the total for that customer.
This is my solution:
SELECT
customerid
,SUM(requests) sumDay
,created_timestamp
FROM yourTable
GROUP BY
customerid,
created_timestamp
HAVING SUM(requests) >= 10000;
It's pretty simple. You just group according to your needs, sum up the requests, and select the rows that meet your HAVING clause.
You can try the query here.
If you want a cumulative sum, you can use window functions. In Standard SQL, this looks like:
SELECT customer_id,
DATE_TRUNC(CAST(TIMESTAMP_MILLIS(created_timestamp) AS DATE), MONTH) as cMonth,
COUNT(*) as searchCount,
SUM(COUNT(*)) OVER (ORDER BY MIN(created_timestamp)) as runningtotal
FROM customer_requests.history.all
GROUP BY distributor_id, cMonth
ORDER BY 2 ASC, 1 DESC;

query to display additional column based on aggregate value

I've been mulling over this problem for a couple of hours now with no luck, so I thought people on SO might be able to help :)
I have a table with data regarding processing volumes at stores. The first three columns shown below can be queried from that table. What I'm trying to do is to add a 4th column that's basically a flag indicating whether a store has processed >= $150, and if so, it displays the corresponding date. The way this works is that the first instance where the store has surpassed $150 is the date that gets displayed. Subsequent processing volumes don't count after the first instance the activated date is hit. For example, for store 4, there's just one instance of the activated date.
store_id sales_volume date activated_date
----------------------------------------------------
2 5 03/14/2012
2 125 05/21/2012
2 30 11/01/2012 11/01/2012
3 100 02/06/2012
3 140 12/22/2012 12/22/2012
4 300 10/15/2012 10/15/2012
4 450 11/25/2012
5 100 12/03/2012
Any insights as to how to build out this fourth column? Thanks in advance!
The solution starts by calculating the cumulative sales. Then, you want the activation date only when the cumulative sales first pass through the $150 level. This happens when adding the current sales amount pushes the cumulative amount over the threshold. The following case expression handles this.
select t.store_id, t.sales_volume, t.date,
(case when 150 > cumesales - t.sales_volume and 150 <= cumesales
then date
end) as ActivationDate
from (select t.*,
sum(sales_volume) over (partition by store_id order by date) as cumesales
from t
) t
If you have an older version of Postgres that does not support cumulative sum, you can get the cumulative sales with a subquery like:
(select sum(sales_volume) from t t2 where t2.store_id = t.store_id and t2.date <= t.date) as cumesales
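A minimal sketch of how that subquery slots into the full statement on an older Postgres version (same table t as above):
select t.store_id, t.sales_volume, t.date,
       (case when 150 > cumesales - t.sales_volume and 150 <= cumesales
             then t.date
        end) as ActivationDate
from (select t.*,
             (select sum(t2.sales_volume)
              from t t2
              where t2.store_id = t.store_id
                and t2.date <= t.date) as cumesales
      from t
     ) t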
Variant 1
You can LEFT JOIN to a derived table that calculates the first date surpassing the $150 limit per store:
SELECT t.*, b.activated_date
FROM tbl t
LEFT JOIN (
SELECT store_id, min(thedate) AS activated_date
FROM (
SELECT store_id, thedate
,sum(sales_volume) OVER (PARTITION BY store_id
ORDER BY thedate) AS running_sum
FROM tbl
) a
WHERE running_sum >= 150
GROUP BY 1
) b ON t.store_id = b.store_id AND t.thedate = b.activated_date
ORDER BY t.store_id, t.thedate;
The calculation of the first day has to be done in two steps, since the window function accumulating the running sum has to be applied in a separate SELECT.
Variant 2
Another window function instead of the LEFT JOIN. May or may not be faster. Test with EXPLAIN ANALYZE.
SELECT *
,CASE WHEN running_sum >= 150 AND thedate = first_value(thedate)
OVER (PARTITION BY store_id, running_sum >= 150 ORDER BY thedate)
THEN thedate END AS activated_date
FROM (
SELECT *
,sum(sales_volume)
OVER (PARTITION BY store_id ORDER BY thedate) AS running_sum
FROM tbl
) b
ORDER BY store_id, thedate;
->sqlfiddle demonstrating both.