select column values based on other column date - sql

I have a dataset being returned that has monthly values for different 'Goals.' The goals have unique ID's and the month/date values will always be the same for the goals. The difference is sometimes one goal doesn't have values for all the same months as the other goal because it might start at a later date, and i want to 'consolidate' the results and sum them together based on the 'First' startBalance for each goal. Example dataset would be;
goalID monthDate startBalance
1 1/1/2014 10
1 2/1/2014 15
1 3/1/2014 22
1 4/1/2014 30
2 4/1/2014 13
2 5/1/2014 29
What i want to do is display these consolidated (summed) values in a table based on the 'First' (earliest Month/Year) value for each goal. The result would look like;
Year startBalance
2014 23
This is because the 'First' value for goalID of 1 is 10 and the 'First' value for goalID of 2 is '13'
I am trying to ultimately use this dataset in an SSRS report through Report Builder, but the groupings are not working correctly for me so i figured if i could achieve this through my queries and just display the data that would be a viable solution.
An example of real result data would be
so i'd want the overall resultset to be;
Year startBalance
2014 876266.00
2015 888319.92
2016 ---------
and so on, i understand for 2015 in that result set there is a value of 0.00 for ID 71, but usually that will contain an actual dollar amount, which would automatically adjust.

WITH balances AS (
SELECT ROW_NUMBER() OVER (PARTITION BY goalID ORDER BY monthDate ASC) n, startBalance, DATEPART(year, monthDate) [year]
FROM Goals
)
SELECT [year], SUM(startBalance) startBalance
FROM balances
WHERE n = 1
GROUP BY [year]

Related

Oracle SQL: How to fill Null value with data from most recent previous date that is not null?

Essentially date field is updated every month along with other fields, however one field is only updated ~6 times throughout the year. For months where that field is not updated, looking to show the most recent previous data
Date
Emp_no
Sales
Group
Jan
1234
100
Med
Feb
1234
200
---
Mar
1234
170
---
Apr
1234
150
Low
May
1234
180
---
Jun
1234
90
High
Jul
1234
100
---
Need it to show:
Date
Emp_no
Sales
Group
Jan
1234
100
Med
Feb
1234
200
Med
Mar
1234
170
Med
Apr
1234
150
Low
May
1234
180
Low
Jun
1234
90
High
Jul
1234
100
High
This field is not updated at set intervals, could be 1-4 months of Nulls in a row
Tried something like this to get the second most recent date but unsure how to deal with the fact that i could need between 1-4 months prior
LAG(Group)
OVER(PARTITION BY emp_no
ORDER BY date)
Thanks!
This is the traditional "gaps and islands" problem.
There are various ways to solve it, a simple version will work for you.
First, create a new identifier that splits the rows in to "groups", where only the first row in each group is NOT NULL.
SUM(CASE WHEN "group" IS NOT NULL THEN 1 ELSE 0 END) OVER (PARTION BY emp_no ORDER BY "date") AS emp_group_id
Then you can use MAX() in another window function, as all "groups" will only have one NOT NULL value.
WITH
gaps
AS
(
SELECT
t.*,
SUM(
CASE WHEN "group" IS NOT NULL
THEN 1
ELSE 0
END
)
OVER (
PARTITION BY emp_no
ORDER BY "date"
)
AS emp_group_id
FROM
your_table t
)
SELECT
"date",
emp_no,
sales,
MAX("group")
OVER (
PARTITION BY emp_no, emp_group_id
)
AS "group"
FROM
gaps
Edit
Ignore all that.
Oracle has IGNORE NULLS.
LAST_VALUE("group" IGNORE NULLS)
OVER (
PARTITION BY emp_no
ORDER BY "date"
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW
)
AS "group"

SQL Create Column Headers by Month ID

I am trying to extract itemised sales data for the past 12 months and build a dynamic table with column headers for each month ID. Extracting the data as below works, however when I get to the point of creating a SUM column for each month ID, I get stuck. I have tried to find similar questions but I'm not sure of the best approach.
Select Item, Qty, format(Transaction Date,'MMM-yy')
from Transactions
Data Extract:
Item
Qty
Month ID
A123
50
Apr-22
A123
30
May-22
A123
50
Jun-22
A321
50
Apr-22
A999
25
May-22
A321
10
Jun-22
Desired Output:
Item
Apr-22
May-22
Jun-22
A123
50
30
50
A321
50
Null
10
A999
Null
25
Null
Any advice would be greatly appreciated.
This is a typical case of pivot operation, where you
first filter every value according to your "Month_ID" value
then aggregate on common "Item"
WITH cte AS (
SELECT Item, Qty, FORMAT(Transaction Date,'MMM-yy') AS Month_ID
FROM Transactions
)
SELECT Item,
MAX(CASE WHEN Month_ID = 'Apr-22' THEN Qty END) AS [Apr-22],
MAX(CASE WHEN Month_ID = 'May-22' THEN Qty END) AS [May-22],
MAX(CASE WHEN Month_ID = 'Jun-22' THEN Qty END) AS [Jun-22]
FROM cte
GROUP BY Item
Note: you don't need the SUM as long as there's only one value for each couple <"Item", "Month-Year">.

Running Total by Year in SQL

I have a table broken out into a series of numbers by year, and need to build a running total column but restart during the next year.
The desired outcome is below
Amount | Year | Running Total
-----------------------------
1 2000 1
5 2000 6
10 2000 16
5 2001 5
10 2001 15
3 2001 18
I can do an ORDER BY to get a standard running total, but can't figure out how to base it just on the year such that it does the running total for each unique year.
SQL tables represent unordered sets. You need a column to specify the ordering. One you have this, it is a simple cumulative sum:
select amount, year, sum(amount) over (partition by year order by <ordering column>)
from t;
Without a column that specifies ordering, "cumulative sum" does not make sense on a table in SQL.

SQL How to calculate Average time between Order Purchases? (do sql calculations based on next and previous row)

I have a simple table that contains the customer email, their order count (so if this is their 1st order, 3rd, 5th, etc), the date that order was created, the value of that order, and the total order count for that customer.
Here is what my table looks like
Email Order Date Value Total
r2n1w#gmail.com 1 12/1/2016 85 5
r2n1w#gmail.com 2 2/6/2017 125 5
r2n1w#gmail.com 3 2/17/2017 75 5
r2n1w#gmail.com 4 3/2/2017 65 5
r2n1w#gmail.com 5 3/20/2017 130 5
ation#gmail.com 1 2/12/2018 150 1
ylove#gmail.com 1 6/15/2018 36 3
ylove#gmail.com 2 7/16/2018 41 3
ylove#gmail.com 3 1/21/2019 140 3
keria#gmail.com 1 8/10/2018 54 2
keria#gmail.com 2 11/16/2018 65 2
What I want to do is calculate the time average between purchase for each customer. So lets take customer ylove. First purchase is on 6/15/18. Next one is 7/16/18, so thats 31 days, and next purchase is on 1/21/2019, so that is 189 days. Average purchase time between orders would be 110 days.
But I have no idea how to make SQL look at the next row and calculate based on that, but then restart when it reaches a new customer.
Here is my query to get that table:
SELECT
F.CustomerEmail
,F.OrderCountBase
,F.Date_Created
,F.Total
,F.TotalOrdersBase
FROM #FullBase F
ORDER BY f.CustomerEmail
If anyone can give me some suggestions, that would be greatly appreciated.
And then maybe I can calculate value differences (in percentage). So for example, ylove spent $36 on their first order, $41 on their second which is a 13% increase. Then their second order was $140 which is a 341% increase. So on average, this customer increased their purchase order value by 177%. Unrelated to SQL, but is this the correct way of calculating a metric like this?
looking to your sample you clould try using the diff form min and max date divided by total
select email, datediff(day, min(Order_Date), max(Order_Date))/(total-1) as avg_days
from your_table
group by email
and for manage also the one order only
select email,
case when total-1 > 0 then
datediff(day, min(Order_Date), max(Order_Date))/(total-1)
else datediff(day, min(Order_Date), max(Order_Date)) end as avg_days
from your_table
group by email
The simplest formulation is:
select email,
datediff(day, min(Order_Date), max(Order_Date)) / nullif(total-1, 0) as avg_days
from t
group by email;
You can see this is the case. Consider three orders with od1, od2, and od3 as the order dates. The average is:
( (od2 - od1) + (od3 - od2) ) / 2
Check the arithmetic:
--> ( od2 - od1 + od3 - od2 ) / 2
--> ( od3 - od1 ) / 2
This pretty obviously generalizes to more orders.
Hence the max() minus min().

How to sum total amount for every month in a year?

I have a database in SQL Server 2012 and there is a table with dates in D.M.YYYY format like below:
ID | Date(date type) | Amount(Numeric)
1 3.4.2013 16.00
1 12.4.2013 13.00
1 2.5.2013 9.50
1 18.5.2013 10.00
I need to sum the total amount for every month in a given year. For example:
ID | Month | TotalAmount
1 1 0.00
...
1 4 29.00
1 5 19.50
I thought what I needed was to determine the number of days in a month, so I created a function which is described in determine the number of days, and it worked. After that I tried to compare two dates(date type) and got stuck; there are some examples out there, but all of them about datetime.
Is this wrong? How can I accomplish this?
I think you just want an aggregation:
select id, month(date) as "month", sum(amount) as TotalAmount
from t
where year(date) = 2013
group by id, month(date)