I have a simple query which does the below:
SELECT
B.WEEK_DT WEEK_DT,
SUM(A.PROFIT) PROFIT
FROM
CUSTOMERS A
INNER JOIN WEEK_TABLE B
ON A.WEEK_ID = B.WEEK_ID
Now I want to extend this query to also get the sum of profit for all of 2013. The query above gives me values at the weekly level, and I also want a separate column, 2013_Profit, that sums up all weeks of the previous year.
week_dt is in the format mm-dd-yyyy.
We also have an offset in the week table, if that helps:
WK_OFFSET WK_DT
-13 February 22, 2014
-12 March 1, 2014
-11 March 8, 2014
-10 March 15, 2014
-9 March 22, 2014
-8 March 29, 2014
-7 April 5, 2014
-6 April 12, 2014
-5 April 19, 2014
-4 April 26, 2014
-3 May 3, 2014
-2 May 10, 2014
-1 May 17, 2014
Please let me know how I can get another column for each customer that gives the sum of the previous year's profits.
Something like the below:
Customer Curr_WK_Profit Prev_YR_Profit
AAA 10 520
BBB 20 1040
CCC 30 1560
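One possible way to get both columns is conditional aggregation: sum profit only for rows in the current week, and separately only for rows whose week falls in 2013. The sketch below uses Python's sqlite3 with made-up rows. The `customer` column, the assumption that `wk_offset = 0` marks the current week, and all sample values are hypothetical and not taken from the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE week_table (week_id INTEGER, week_dt TEXT, wk_offset INTEGER);
CREATE TABLE customers (customer TEXT, week_id INTEGER, profit REAL);
-- week_dt is stored as mm-dd-yyyy, as described in the question
INSERT INTO week_table VALUES
  (1, '12-21-2013', -22), (2, '12-28-2013', -21), (3, '05-24-2014', 0);
-- hypothetical profits
INSERT INTO customers VALUES
  ('AAA', 1, 5), ('AAA', 2, 7), ('AAA', 3, 10),
  ('BBB', 1, 20), ('BBB', 3, 20);
""")

query = """
SELECT a.customer,
       -- assumption: wk_offset = 0 identifies the current week
       SUM(CASE WHEN b.wk_offset = 0 THEN a.profit ELSE 0 END) AS curr_wk_profit,
       -- the last four characters of mm-dd-yyyy are the year
       SUM(CASE WHEN substr(b.week_dt, -4) = '2013' THEN a.profit ELSE 0 END) AS prev_yr_profit
FROM customers a
INNER JOIN week_table b ON a.week_id = b.week_id
GROUP BY a.customer
ORDER BY a.customer
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

In another database the year test would likely use that dialect's date functions instead of `substr`, and the hard-coded '2013' could be derived from the current date.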
I have a table of names with two different dates. I want to know the count of names that occur on both dates and the overlap percentage.
This is the desired output format. I am not looking for dates in between; I am looking for records that are present on July 05 and also on August 10. The overlap percentage for each id would be (count of records on both July 5 and August 10) / (count of records on July 5). (The actual table stores dates in a date datatype.)
The overlap % will always be less than or equal to 100, since the count of records existing on both July 5 and August 10 will always be <= the count of records on July 5.
id    Count on July 05    Count of IDs from July 05 included in August 10    % overlap
ABC
BCD
CDE
DEF
EFG
Rough version of the input table:
id type Group date
ABC Mobile 1 July 5
BCD Mobile 1 July 5
ABC Desktop 1 August 10
CDE Mobile 2 July 5
BCD Mobile 2 August 10
As I understood from your comments, the overlap will be the minimum count value of the two dates, i.e. for ABC if we have 6 in July and 2 in August the overlap will be 2, and if we have 3 in July and 5 in August the overlap will be 3.
If that is the case then you may use the following query tested on MS SQL Server 2019:
SELECT t.id, t.[Count on July 05],
  CASE
    WHEN t.[Count on July 05] <= t.[Count of August 10] THEN t.[Count on July 05]
    WHEN t.[Count on July 05] > t.[Count of August 10] THEN t.[Count of August 10]
  END AS [Count of IDs from July 05 included in August 10],
  -- NULLIF turns a zero July 5 count into NULL, so the division cannot raise a divide-by-zero error
  CASE
    WHEN t.[Count on July 05] <= t.[Count of August 10] THEN CAST(t.[Count on July 05]*1.00/NULLIF(t.[Count on July 05], 0) * 100 AS DECIMAL(18, 2))
    WHEN t.[Count on July 05] > t.[Count of August 10] THEN CAST(t.[Count of August 10]*1.00/NULLIF(t.[Count on July 05], 0) * 100 AS DECIMAL(18, 2))
  END AS [% overlap]
FROM (
  SELECT id,
    COUNT(CASE WHEN [tdate] IN ('July 5') THEN 1 END) AS [Count on July 05],
    COUNT(CASE WHEN [tdate] IN ('August 10') THEN 1 END) AS [Count of August 10]
  FROM [Tbl]
  GROUP BY id) t
I hope that is what you are looking for.
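To see the "minimum of the two counts" rule on a few rows, here is a small sketch using Python's sqlite3. The table name `tbl`, the column `tdate`, and the sample rows are assumptions made up for illustration. SQLite's two-argument `MIN` stands in for the CASE expression used above, and ids with no July 5 rows are filtered out to avoid dividing by zero.

```python
import sqlite3

# Hypothetical rows: (id, date label)
rows = [
    ("ABC", "July 5"), ("ABC", "July 5"), ("ABC", "July 5"),
    ("ABC", "August 10"),
    ("BCD", "July 5"),
    ("BCD", "August 10"), ("BCD", "August 10"),
    ("CDE", "July 5"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id TEXT, tdate TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

query = """
SELECT id,
       july_count,
       MIN(july_count, august_count) AS overlap_count,
       ROUND(100.0 * MIN(july_count, august_count) / july_count, 2) AS overlap_pct
FROM (
    SELECT id,
           SUM(CASE WHEN tdate = 'July 5' THEN 1 ELSE 0 END) AS july_count,
           SUM(CASE WHEN tdate = 'August 10' THEN 1 ELSE 0 END) AS august_count
    FROM tbl
    GROUP BY id
)
WHERE july_count > 0      -- skip ids with nothing on July 5
ORDER BY id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With these rows, ABC has 3 records on July 5 and 1 on August 10, so its overlap count is 1, while BCD is capped at its single July 5 record.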
I have the following code, which gets me how many rows were written on each day on which anything was written.
SELECT
ingestion_time,
COUNT(ingestion_time) AS Rows_Written,
FROM
`workday.ingestions`
GROUP BY
ingestion_time
ORDER BY
ingestion_time
Which will give me something that looks like the following:
Ingestion_Time Rows_Written
Jan 2, 2021 8
Jan 5, 2021 5
Jan 8, 2021 9
Jan 9, 2021 2
However, I want to be able to add in the missing dates so the table looks like this instead:
Ingestion_Time Rows_Written
Jan 2, 2021 8
Jan 3, 2021 0
Jan 4, 2021 0
Jan 5, 2021 5
Jan 6, 2021 0
Jan 7, 2021 0
Jan 8, 2021 9
Jan 9, 2021 2
How can I go about doing this? Do I need to create a whole table with all dates and join it somehow, or is there another way? Thanks in advance.
Consider the approach below:
select date(Ingestion_Time) Ingestion_Time, Rows_Written
from your_current_query union all
select day, 0 from (
select *, lead(Ingestion_Time) over(order by Ingestion_Time) next_time
from your_current_query
), unnest(generate_date_array(date(Ingestion_Time) + 1, date(next_time) - 1)) day
Applied to the sample data in your question, this returns the existing rows plus a zero row for each missing date.
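The same idea, pairing each date with the next one and generating the days strictly in between, can be sketched outside of SQL. The Python below is only an illustration of that logic on the question's sample counts; the function name `gap_rows` is made up.

```python
from datetime import date, timedelta

def gap_rows(counts):
    """Return (day, 0) for every day strictly between consecutive dates in counts."""
    days = sorted(counts)
    gaps = []
    # zip pairs each date with the following one, like LEAD() in the query
    for current, following in zip(days, days[1:]):
        day = current + timedelta(days=1)
        while day < following:
            gaps.append((day, 0))
            day += timedelta(days=1)
    return gaps

# Sample counts from the question
sample = {
    date(2021, 1, 2): 8,
    date(2021, 1, 5): 5,
    date(2021, 1, 8): 9,
    date(2021, 1, 9): 2,
}

# Existing rows "union all" the generated zero rows, then sorted by date
filled = sorted(list(sample.items()) + gap_rows(sample))
for day, rows_written in filled:
    print(day.isoformat(), rows_written)
```

The last date has no successor, so it contributes no gap rows, which mirrors how the NULL `next_time` yields nothing from the unnest.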
Using Oracle SQL, I’m trying to calculate total unique visits to a website. The table I’m querying does not have a timestamp with minutes and seconds, just a date (DDMMYY), and every row in the table represents a customer click on the page. The table designates a new “session” every hour, regardless of whether that actually reflects a new visit from the customer’s POV.
What I must do is use non-consecutive sessions as a proxy for unique visits: if there is an hour break between sessions, the preceding run of consecutive sessions counts as one visit. I start from the unique combination of customer ID + session day + session hour, and where session hours are consecutive within a customer + day combination, I count them as a single visit.
The HOUR field contains string values that concatenate the date with the hour. To calculate the visit count, I need to parse out the hour and subtract the previous (lag) row's hour to determine whether there is a break of more than an hour.
Example of Raw Data:
TRANS_TO_DATE CUSTOMER_ID HOUR
10/21/17 1007589445 October 21, 2017, Hour 1
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 2
10/21/17 1007589445 October 21, 2017, Hour 3
10/21/17 1007589445 October 21, 2017, Hour 5
10/21/17 1007589445 October 21, 2017, Hour 6
10/21/17 1007589445 October 21, 2017, Hour 23
10/21/17 1007589445 October 21, 2017, Hour 23
10/21/17 1007589445 October 21, 2017, Hour 23
11/1/17 1007589445 November 1, 2017, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 10
1/1/18 1007589445 January 1, 2018, Hour 11
1/1/18 1007589445 January 1, 2018, Hour 14
1/1/18 1007589445 January 1, 2018, Hour 20
1/1/18 1007589445 January 1, 2018, Hour 22
The visit count is actually this:
Customer_id Day Hour Visit Grouping
1007589445 October 21, 2017 1 Visit 1
1007589445 October 21, 2017 2 Visit 1
1007589445 October 21, 2017 3 Visit 1
1007589445 October 21, 2017 5 Visit 2
1007589445 October 21, 2017 6 Visit 2
1007589445 October 21, 2017 23 Visit 3
1007589445 November 1, 2017 10 Visit 1
1007589445 January 1, 2018 10 Visit 1
1007589445 January 1, 2018 11 Visit 1
1007589445 January 1, 2018 14 Visit 2
1007589445 January 1, 2018 20 Visit 3
1007589445 January 1, 2018 22 Visit 4
Customer 1007589445 had:
- 3 visits on October 21, 2017
- 1 visit on November 1, 2017
- 4 visits on January 1, 2018
Total visits: 8
Below is the SQL code I have so far, which needs to be modified to satisfy the criteria above.
select
CUSTOMER_ID,
TRANS_TO_DATE,
HOUR,
count (HOUR) as visits
from mstr_clickstream_vw
where trans_to_date between start_date and end_date
and web_store_ind='US'
group by CUSTOMER_ID, TRANS_TO_DATE,HOUR
You can get the hour with:
cast(trim(substr(hour, -2)) as int)
Then use this to assign sessions with lag() and a cumulative conditional aggregation:
select cs.*,
sum(case when trans_to_date = prev_ttd and prev_hh = hh then 0
when trans_to_date = prev_ttd and prev_hh = hh - 1 then 0
when hh = 0 and prev_hh = 23 and trans_to_date = prev_ttd + interval '1' day then 0
else 1
end) over (partition by customer_id order by trans_to_date, hh) as grouping
from (select cs.*,
lag(trans_to_date) over (partition by customer_id order by trans_to_date, hh) as prev_ttd,
lag(hh) over (partition by customer_id order by trans_to_date, hh) as prev_hh
from (select cs.*,
cast(trim(substr(hour, -2)) as int) as hh
from mstr_clickstream_vw cs
) cs
) cs;
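The counting rule can be checked against the question's sample with a short Python sketch. It parses the hour from the HOUR string, then counts runs of consecutive hours per customer and day. The function names are made up, the rows are a condensed copy of the sample, and unlike the query above this sketch treats each day separately, so it does not join hour 23 to hour 0 of the next day.

```python
from collections import defaultdict

def hour_of(label):
    """Return the trailing hour number from e.g. 'October 21, 2017, Hour 23'."""
    return int(label.rsplit(" ", 1)[1])

def visits_per_day(clicks):
    """Count runs of consecutive hours for each (customer_id, day) pair."""
    hours = defaultdict(set)
    for customer_id, day, label in clicks:
        hours[(customer_id, day)].add(hour_of(label))
    counts = {}
    for key, hour_set in hours.items():
        visits, prev = 0, None
        for hh in sorted(hour_set):
            # a new visit starts at the first hour or after a gap in hours
            if prev is None or hh - prev > 1:
                visits += 1
            prev = hh
        counts[key] = visits
    return counts

cust = 1007589445
clicks = [
    (cust, "10/21/17", "October 21, 2017, Hour 1"),
    (cust, "10/21/17", "October 21, 2017, Hour 2"),
    (cust, "10/21/17", "October 21, 2017, Hour 2"),
    (cust, "10/21/17", "October 21, 2017, Hour 3"),
    (cust, "10/21/17", "October 21, 2017, Hour 5"),
    (cust, "10/21/17", "October 21, 2017, Hour 6"),
    (cust, "10/21/17", "October 21, 2017, Hour 23"),
    (cust, "11/1/17", "November 1, 2017, Hour 10"),
    (cust, "1/1/18", "January 1, 2018, Hour 10"),
    (cust, "1/1/18", "January 1, 2018, Hour 11"),
    (cust, "1/1/18", "January 1, 2018, Hour 14"),
    (cust, "1/1/18", "January 1, 2018, Hour 20"),
    (cust, "1/1/18", "January 1, 2018, Hour 22"),
]

counts = visits_per_day(clicks)
total_visits = sum(counts.values())
print(counts, total_visits)
```

On these rows the per-day counts are 3, 1 and 4, which matches the total of 8 visits stated in the question.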
I have been asked to create a trendline in SSRS. The trendline should show predicted future values based on the current year's data.
I have data for 2018 and need to predict the ClaimVolume trend for 2019.
Here is the data:
Month Month Name Year ClaimVolume
1 January 2018 13746
2 February 2018 13412
3 March 2018 15143
4 April 2018 15655
5 May 2018 15190
6 June 2018 15365
7 July 2018 18943
8 August 2018 24305
9 September 2018 18893
10 October 2018 26659
11 November 2018 18696
12 December 2018 22367
Please help me with a SQL query for the above task.
I am having issues getting a SQL statement to work how I need it to. To be honest, I'm pretty green when it comes to SQL so the tries I've attempted have come from copy/paste code that I've tried to edit to make work and it's not running. So what I need is a query to be used for a report in ACCESS.
Here is what the data looks like:
ID TechID OccurrenceDate OccurrenceName OccurrenceAmt
247 9991 Friday, February 15, 2013 Coaching 4.50
242 9991 Friday, February 08, 2013 Con't Occurrence 0.00
241 9991 Thursday, February 07, 2013 Unscheduled Absense 1.00
240 9991 Wednesday, February 06, 2013 Shift Int less 2 hrs 0.50
243 9991 Monday, February 04, 2013 Unscheduled Absense 1.00
246 9991 Monday, January 21, 2013 Unscheduled Absense 1.00
245 9991 Wednesday, January 16, 2013 Con't Occurrence 0.00
244 9991 Tuesday, January 15, 2013 Unscheduled Absense 1.00
239 9999 Friday, February 08, 2013 Unscheduled Absense 1.00
237 9999 Wednesday, February 06, 2013 Unscheduled Absense 1.00
238 9999 Saturday, February 02, 2013 Coaching 7.00
236 9999 Tuesday, September 11, 2012 Other 6.00
235 9999 Tuesday, September 11, 2012 Other 0.00
228 9999 Thursday, August 23, 2012 Unscheduled Absense 1.00
227 9999 Friday, August 10, 2012 Unscheduled Absense 1.00
226 9999 Wednesday, August 08, 2012 Con't Occurrence 0.00
223 9999 Wednesday, February 29, 2012 Unscheduled Absense 1.00
249 9998 Saturday, February 02, 2013 Unscheduled Absense 1.00
251 9998 Monday, January 21, 2013 Unscheduled Absense 1.00
So basically, if there is an "OccurrenceName" of either "Coaching" or "Other" within the last 6 months, that amount plus any other occurrences within the previous 6 months should be their Tech Total. If there are no "Coaching" or "Other" occurrences within the last 6 months, then I need to sum the OccurrenceAmt for just the rolling previous 6 months.
Hopefully my very well explained scenario makes sense.
EDIT #1:
Okay, my expected output for this data should be:
TechID Total
9991 4.5
9999 9.0
9998 2.0
So as you can see, for TechID 9991 calculates 4.5 because there was a "Coaching" occurrence and nothing since in the previous 6 months. 9999 would have 9 because there was a Coaching for 7 and two more since then within the previous 6 months bringing that total to 9. 9998 has 2 because that tech has no coaching or anything within the last 6 months so the total is 2.
EDIT #2:
So the only lines that should be counted are the lines that are indented. For 9999, there was a coaching for 7 and 2 more regular occurrences bringing his total to 9. Is that more clear?
EDIT #3:
Okay, got a little further down the road.
@lance - through trial and error I am getting closer... I have this for now, but can't get it working:
SELECT tblEmployeeData.TechID, tblEmployeeData.LName, tblEmployeeData.FName, Sum(tblOccurrence.OccurrenceAmt), Last(tblOccurrence.CoachingDate) AS LastOfCoachingDate, tblEmployeeData.SupLName
FROM tblEmployeeData RIGHT JOIN tblOccurrence ON tblEmployeeData.TechID = tblOccurrence.TechID
GROUP BY tblEmployeeData.TechID, tblEmployeeData.LName, tblEmployeeData.FName, tblEmployeeData.SupLName
HAVING (((tblOccurrence.OccurrenceAmt))=IIf([tblOccurrence].[CoachingDate]="",[tblOccurrence].[OccurrenceDate] Between Date() And DateAdd('m',-6,Date()),IIf([tblOccurrence].[CoachingDate]<=DateAdd('m',-6,Date()),[tblOccurrence].[OccurrenceDate] Between Date() And DateAdd('d',[tblOccurrence].[CoachingDate],Date()))));
EDIT #4:
This query is the "best" beginning query I have gotten to work. It pulls over ALL employee data then populates MaxCoaching and MaxDate. So I tried connecting this query to your second query to get totals onto a query and can't get it working.
Query:
SELECT tblEmployeeData.TechID, tblEmployeeData.LName, tblEmployeeData.FName, Max(tblOccurrence.CoachingDate) AS LastCoachingDate, Max([OccurrenceDate]) AS MaxDate, tblEmployeeData.SupLName
FROM tblEmployeeData LEFT JOIN tblOccurrence ON tblEmployeeData.TechID = tblOccurrence.TechID
GROUP BY tblEmployeeData.SupLName, tblEmployeeData.TechID, tblEmployeeData.LName, tblEmployeeData.FName
So these results get the most recent Coaching Date (if any) and the most recent event date, so I need to sum Occurrences based on 2 conditions:
- If there is a coaching/other date within the last 6 months, it needs that occurrence total from that line PLUS any other dates that have occurred after their coaching/other date.
- If no coaching/other date has occurred within the last 6 months, then I need the total of occurrences within the last 6 months.
Moving closer to getting a working query! Thanks for your help
@Zamael, if I understand your question and the clarifications, the reason 9999 sums to 9 is that the September dates are more than 6 months ago, and that there is another line item with an OccurrenceAmt of 2 that we don't see in the table.
Assuming that's correct, I suggest that you create a query with the following SQL:
Select TechID,
SUM(IIF([OccurrenceName] in ('Coaching','Other'),1,0)) as CoachingOtherCount,
SUM(IIF([OccurrenceName] in ('Coaching','Other'),OccurrenceAmt,0)) as CoachingAmt,
SUM(IIF([OccurrenceName] in ('Coaching','Other'),0,OccurrenceAmt)) as MiscAmt
From TableName
Where OccurrenceDate >= DateAdd("m",-6,Date())
Group By TechID;
Then, Create a second query with following SQL:
select TechID,
IIF(Nz(CoachingOtherCount,0) > 0, CoachingAmt, MiscAmt) as SumOccurenceAmt
from Query1;
EDIT 1 - After Clarification
What I understand now is:
Step 1: Find "Coaching" or "Other" in past 6 Months. If exists goto 2, else goto 3.
Step 2: Sum All lines occurring on or after the earliest from Step 1.
Step 3: Sum all lines from past 6 months.
If this is what you're looking for then the following code will work.
SELECT TechID,
MIN(IIF([OccurrenceName] in ('Coaching','Other'),OccurrenceDate,NULL)) AS MinCoachingOtherDate,
MIN([OccurrenceDate]) AS MinDate
FROM tblOccurrence
WHERE OccurrenceDate >= DateAdd("m",-6,Date())
GROUP BY TechID;
Then, in the second query:
Select tbo.TechID,
SUM([OccurrenceAmt]) as Amount
From tblOccurrence tbo
LEFT JOIN QUERY1 q on q.techid = tbo.techid
WHERE tbo.OccurrenceDate >= IIF(q.MinCoachingOtherDate IS NULL, q.MinDate, q.MinCoachingOtherDate)
GROUP BY tbo.TechID;
The first query finds the earliest Coaching/Other date within the past 6 months in the first column, if one exists, and simply the earliest date within the past 6 months in the second. In the second query we use the Coaching/Other date unless it is null, in which case we default to the second date. Because the WHERE clause constrains the dates dynamically for each TechID, you can just sum all of the OccurrenceAmt values.
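As a rough check of that two-step logic, the Python sketch below applies it to the sample rows from the question. The as-of date of March 15, 2013 is an assumption (chosen so that the September 2012 rows fall outside the 6-month window), and the function names are made up for illustration.

```python
from datetime import date

def months_before(d, months):
    """Shift a date back by whole months, clamping the day if needed."""
    total = d.year * 12 + (d.month - 1) - months
    year, month = divmod(total, 12)
    day = d.day
    while True:
        try:
            return date(year, month + 1, day)
        except ValueError:
            day -= 1

def tech_totals(rows, as_of):
    cutoff = months_before(as_of, 6)
    totals = {}
    for tech in sorted({r[0] for r in rows}):
        mine = [r for r in rows if r[0] == tech]
        recent = [r for r in mine if r[1] >= cutoff]
        if not recent:
            continue
        # Query 1: earliest Coaching/Other date, else earliest date, in the window
        coaching = [r[1] for r in recent if r[2] in ("Coaching", "Other")]
        start = min(coaching) if coaching else min(r[1] for r in recent)
        # Query 2: sum everything on or after that start date
        totals[tech] = sum(r[3] for r in mine if r[1] >= start)
    return totals

UA, CO = "Unscheduled Absense", "Con't Occurrence"
rows = [  # (TechID, OccurrenceDate, OccurrenceName, OccurrenceAmt)
    (9991, date(2013, 2, 15), "Coaching", 4.5), (9991, date(2013, 2, 8), CO, 0.0),
    (9991, date(2013, 2, 7), UA, 1.0),
    (9991, date(2013, 2, 6), "Shift Int less 2 hrs", 0.5),
    (9991, date(2013, 2, 4), UA, 1.0), (9991, date(2013, 1, 21), UA, 1.0),
    (9991, date(2013, 1, 16), CO, 0.0), (9991, date(2013, 1, 15), UA, 1.0),
    (9999, date(2013, 2, 8), UA, 1.0), (9999, date(2013, 2, 6), UA, 1.0),
    (9999, date(2013, 2, 2), "Coaching", 7.0),
    (9999, date(2012, 9, 11), "Other", 6.0), (9999, date(2012, 9, 11), "Other", 0.0),
    (9999, date(2012, 8, 23), UA, 1.0), (9999, date(2012, 8, 10), UA, 1.0),
    (9999, date(2012, 8, 8), CO, 0.0), (9999, date(2012, 2, 29), UA, 1.0),
    (9998, date(2013, 2, 2), UA, 1.0), (9998, date(2013, 1, 21), UA, 1.0),
]

totals = tech_totals(rows, as_of=date(2013, 3, 15))  # assumed report date
print(totals)
```

With that as-of date the sketch gives 4.5 for 9991, 9.0 for 9999 and 2.0 for 9998, in line with the expected output in the question. An earlier as-of date that still includes September 11, 2012 would pull the "Other" rows in and change the 9999 total.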