How to add custom YoY field to output?

How to add custom YoY field to output? - sql

I'm attempting to determine the YoY growth by month, 2017 to 2018, for number of Company bookings per property.
I've tried casting and windowed functions but am not obtaining the correct result.
Example Table 1: Bookings
BookID Amnt BookType InDate OutDate PropertyID Name Status
-----------------------------------------------------------------
789555 $1000 Company 1/1/2018 3/1/2018 22111 Wendy Active
478141 $1250 Owner 1/1/2017 2/1/2017 35825 John Cancelled
There are only two book types (e.g., Company, Owner) and two Book Status (e.g., Active and Cancelled).
Example Table 2: Properties
Property ID State Property Start Date Property End Date
---------------------------------------------------------------------
33111 New York 2/3/2017
35825 Michigan 7/21/2016
The Property End Date is blank when the company still owns it.
Example Table 3: Months
Start of Month End of Month
-------------------------------------------
1/1/2018 1/31/2018
The previous developer created this table which includes a row for each month from 2015-2020.
I've tried many various iterations of my current code and can't even come close.
Desired Outcome
I need to find the YoY growth by month, 2017 to 2018, for number of Company bookings per property. The stakeholder has requested the output to have the below columns:
Month Name Bookings_Per_Property_2017 Bookings_Per_Property_2018 YoY
-----------------------------------------------------------------------
The number of Company bookings per property in a month should be calculated by counting the total number of active Company bookings made in a month divided by the total number of properties active in the month.

Here is a solution that should be close to what you need. It works by:
LEFT JOINing the three tables; the important part is to properly check the overlaps in date ranges between months(StartOfMonth, EndOfMonth), bookings(InDate, OutDate) and properties(PropertyStartDate, PropertyEndDate): you can have a look at this reference post for general discussion on how to proceed efficiently
aggregating by month, and using conditional COUNT(DISTINCT ...) to count the number of properties and bookings in each month and year. The logic implicitly relies on the fact that this aggregate function ignores NULL values. Since we are using LEFT JOINs, we also need to handle the possibility that a denominator could have a 0 value.
Notes:
you did not provide expected results so this cannot be tested
also, you did not explain how to compute the YoY column, so I left it alone; I assume that you can easily compute it from the other columns
Query:
SELECT
MONTH(m.StartOfMonth) AS [Month],
COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2017 THEN b.BookID END)
/ NULLIF(COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2017 THEN p.PropertyID END), 0)
AS Bookings_Per_Property_2017,
COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2018 THEN b.BookID END)
/ NULLIF(COUNT(DISTINCT CASE WHEN YEAR(StartOfMonth) = 2018 THEN p.PropertyID END), 0)
AS Bookings_Per_Property_2018
FROM months m
LEFT JOIN bookings b
ON m.StartOfMonth <= b.OutDate
AND m.EndOfMonth >= b.InDate
AND b.status = 'Active'
AND b.BookType = 'Company'
LEFT JOIN properties p
ON m.StartOfMonth <= COLAESCE(p.PropertyEndDate, m.StartOfMonth)
AND m.EndOfMonth >= p.PropertyStartDate
GROUP BY MONTH(m.StartOfMonth)

Related

How to find if there is a match in a certain time interval between two different date fields SQL?

I have a column in my fact table that defines whether a Supplier is old or new based on the following case-statement:
CASE
WHEN (SUBSTRING([Reg date], 1, 6) = SUBSTRING([Invoice date], 1, 6)
THEN ('New supplier')
ELSE('Old supplier')
END as [Old/New supplier]
So for example, if a Supplier was registered 201910 and Invoice date was 201910 then the Supplier would be considered a 'New supplier' that month. Now I want to calculate the number of Old/New suppliers for each month by doing an distinct count on Supplier no, which is not a problem. The last step is where it gets tricky, now I want to count the number of New/Old suppliers over a 12-month period(if there has been a match on Invoice date and reg date in any of the lagging 12 months). So I create the following mdx expression:
aggregate(parallelperiod([D Time].[Year-Month-Day].[Year],1,[D Time].[Year-Month-Day].currentmember).lead(1) : [D Time].[Year-Month-Day].currentmember ,[Measures].[Supplier No Distinct Count])
The issue I am facing is that it will count Supplier no "1234" twice since it has been both new and old during that time period. What I wish is that, if it finds one match it would be considered a "New" Supplier for that 12- month period.
This is how the result ends up looking but I want it to be zero for "Old" since Reg date and Invoice date matched once during that 12-month period it should be considered new for the whole Rolling 12 month on 201910
Any help, possible approaches or ideas are highly appreciated.
Best regards,
Rubrix

Aggregate first at the supplier level and then at the type level:
select type, count(*)
from (select supplierid,
(case when min(substring(regdate, 1, 6)) = min(substring(invoicedate, 1, 6))
then 'new' else 'old'
end) as type
from t
group by supplierid
) s
group by type;
Note: I assume your date columns are in some obscure string format for your code to work. Otherwise, you should be using appropriate date functions.

SELECT COUNT(*) OVER () AS TotalCount
FROM Facts
WHERE Regdate BETWEEN(olddate, newdate) OR InvoiceDate BETWEEN(olddate, newdate)
GROUP BY
Supplier
The above query will return all the suppliers within that time period and then group them. Thus COUNT(*) will only include unique subscribers.
You might wanna change the WHERE clause because I didn't quite understand how you are getting the 12 month period. Generally if your where clause returns the suppliers within that time period(they don't have to be unique) then the group by and count will handle the rest.

Subtracting results from two queries / sub queries

I'm trying to create a query which returns the results of customer orders (created in a month e.g. January) - the cancelled customer orders in that exact month (cancelled customer orders in month of January) and display the results grouped by location (Rows) and by year with month (Columns).
Currently I have a table containing all the customer order information both created and cancelled. Each customer order has a unique order number, location where it was sold, creation date and cancellation date. If the customer order is still valid, then the cancellation date will be null or "//". If the customer order is cancelled then it will have a cancellation date. As some additional information a customer order can be created in January 2019 and cancelled in July or August, or December etc. What I would like to obtain is the net customer orders for all the months by doing gross customer orders for a month - cancelled customer orders for that month and for a specific location = net customer orders for that month for that location.
In order to achieve this what I have tried, was to create two separate queries from the table, first one containing all the valid customer orders and the second one containing all the cancellations. Then i tried creating a cross-tab between the two other queries, trying to count what I mentioned above, grouping by location and then pivoting the of the year and month.
First query with valid customer orders named cust_valid (simplified):
SELECT cust_ords.[SaleLoc], cust_ords.[OrderNum], cust_ords.[CreationDate], cust_ords.[CancelDate]
FROM cust_ords
WHERE cust_ords.[CancelDate] = "" OR cust_ords.[CancelDate] = "//";
Second query with cancelled customer orders named cust_cancelled (simplified):
SELECT cust_ords.[SaleLoc], cust_ords.[OrderNum], cust_ords.[CreationDate], cust_ords.[CancelDate]
FROM cust_ords
WHERE cust_ords.[CancelDate] <> "" OR cust_ords.[CancelDate] <> "//";
Last, a cross-tab between them:
TRANSFORM Count(cust_valid.[OrderNum]) AS [NetOrderCount]
SELECT cust_valid.[SaleLoc]
FROM cust_valid LEFT JOIN cust_cancelled ON cust_valid.[CreationDate] = cust_cancelled.[CancelDate]
WHERE cust_valid.[CreationDate] = cust_cancelled.[CancelDate]
GROUP BY cust_ords.[SaleLoc]
PIVOT cust_valid.[CreationDate];
In this sense, I am trying to obtain (count) the net customer orders (total created for a month - what was cancelled in that month) for every given location and display the results per month (basically the columns names should be the year and the month). So for example if i have 10 customer orders in January, 5 in February and 15 in March, if 3 of the ones in January get cancelled in March, then I would like to count for the month of March 15 - 3, thus ending up with January 10, February 5, March 12.

First of all, you say an order is valid is valid if cancellation date is null or //, however you test for:
WHERE cust_ords.[CancelDate] = "" OR cust_ords.[CancelDate] = "//";
To test for null use [CancelDate] is null, or shorthand the test to ISNULL([CancelDate],'//')='//'
Second, in your second query you test for cancelled orders, with
WHERE cust_ords.[CancelDate] <> "" OR cust_ords.[CancelDate] <> "//";
That is not the negation of your test for cancelled orders!
!(A or B) => !A and !B
So you should use
WHERE cust_ords.[CancelDate] <> "" and cust_ords.[CancelDate] <> "//";
Or rather ISNULL(cust_ords.[CancelDate],'//')!='//'
Noow to your query itself, you are joining on dates, that is you join orders on a given date with cancellations on the same date. However you want to see orders and cancellation pr month. Since you left join cancellations you will only ever count cancellations that happen on the same date as orders!
SELECT
cust_valid.[SaleLoc]
, Format(
iif(isnull(cust_ords.[CancelDate],'//')='//'
,cust_ords.[CreationDate]
,cust_ords.[CancelDate]
,'MMMM yy') Mnth
, sum(iif(isnull(cust_ords.[CancelDate],'//'='//',1,0)) ValidOrders
, sum(iif(isnull(cust_ords.[CancelDate],'//'='//',0,1)) CancelledOrders
, sum(iif(isnull(cust_ords.[CancelDate],'//'='//',1,0))
- sum(iif(isnull(cust_ords.[CancelDate],'//'='//',0,1)) NetOrderCount
FROM
cust_ords
group by
cust_valid.[SaleLoc]
, Format(
iif(isnull(cust_ords.[CancelDate],'//')='//'
,cust_ords.[CreationDate]
,cust_ords.[CancelDate]
,'MMMM yy')
This should give you the basic data useful for pivoting, at least in SQL Server

THank you #Søren Kongstad for your useful explanations. I have modified / corrected the code accordingly to provide me with the needed results:
SELECT CustOrders.[Grupp namn], Format(IIf(CustOrders.[DatAnnulCde]="/ /",CustOrders.[DatCreatCde],CustOrders.[DatAnnulCde]),"yyyy-mm") AS year_month,
Sum(IIf(IsNull(CustOrders.DatCreatCde),0,1)) AS GrossOrders, Sum(IIf(CustOrders.DatAnnulCde<>"/ /",1,0)) AS CancelledOrders,
Sum(IIf(IsNull(CustOrders.DatCreatCde),0,1)) - Sum(IIf(CustOrders.DatAnnulCde<>"/ /",1,0)) AS NetOrders
FROM CustOrders
WHERE CustOrders.[CarType] = "Renault PC" And CustOrders.[DatCreatCde] >= Format(Year(Now())&"-01-01","yyyy-mm-dd")
GROUP BY CustOrders.[Grupp namn], Format(IIf(CustOrders.[DatAnnulCde]="/ /",CustOrders.[DatCreatCde],CustOrders.[DatAnnulCde]),"yyyy-mm");
The fields have different names then in the initial examples
DatAnnulCde = CancelDate
DatCreatCde = CreationDate

How many customers upgraded from Product A to Product B?

I have a "daily changes" table that records when a customer "upgrades" or "downgrades" their membership level. In the table, let's say field 1 is customer ID, field 2 is membership type and field 3 is the date of change. Customers 123 and ABC each have two rows in the table. Values in field 1 (ID) are the same, but values in field 2 (TYPE) and 3 (DATE) are different. I'd like to write a SQL query to tell me how many customers "upgraded" from membership type 1 to membership type 2 how many customers "downgraded" from membership type 2 to membership type 1 in any given time frame.
The table also shows other types of changes. To identify the records with changes in the membership type field, I've created the following code:
SELECT *
FROM member_detail_daily_changes_new
WHERE customer IN (
SELECT customer
FROM member_detail_daily_changes_new
GROUP BY customer
HAVING COUNT(distinct member_type_cd) > 1)
I'd like to see an end report which tells me:
For Fiscal 2018,
X,XXX customers moved from Member Type 1 to Member Type 2 and
X,XXX customers moved from Member Type 2 to Member type 1

Sounds like a good time to use a LEAD() analytical function to look ahead for a given customer's member_Type; compare it to current record and then evaluate if thats an upgrade/downgrade then sum results.
DEMO
CTE AS (SELECT case when lead(Member_Type_Code) over (partition by Customer order by date asc) > member_Type_Code then 1 else 0 end as Upgrade
, case when lead(Member_Type_Code) over (partition by Customer order by date asc) < member_Type_Code then 1 else 0 end as DownGrade
FROM member_detail_daily_changes_new
WHERE Date between '20190101' and '20190201')
SELECT sum(Upgrade) upgrades, sum(downgrade) downgrades
FROM CTE
Giving us: using my sample data
+----+----------+------------+
| | upgrades | downgrades |
+----+----------+------------+
| 1 | 3 | 2 |
+----+----------+------------+
I'm not sure if SQL express on rex tester just doesn't support the sum() on the analytic itself which is why I had to add the CTE or if that's a rule in non-SQL express versions too.
Some other notes:
I let the system implicitly cast the dates in the where clause
I assume the member_Type_Code itself tells me if it's an upgrade or downgrade which long term probably isn't right. Say we add membership type 3 and it goes between 1 and 2... now what... So maybe we need a decimal number outside of the Member_Type_Code so we can handle future memberships and if it's an upgrade/downgrade or a lateral...
I assumed all upgrades/downgrades are counted and a user can be counted multiple times if membership changed that often in time period desired.
I assume an upgrade/downgrade can't occur on the same date/time. Otherwise the sorting for lead may not work right. (but if it's a timestamp field we shouldn't have an issue)
So how does this work?
We use a Common table expression (CTE) to generate the desired evaluations of downgrade/upgrade per customer. This could be done in a derived table as well in-line but I find CTE's easier to read; and then we sum it up.
Lead(Member_Type_Code) over (partition by customer order by date asc) does the following
It organizes the data by customer and then sorts it by date in ascending order.
So we end up getting all the same customers records in subsequent rows ordered by date. Lead(field) then starts on record 1 and Looks ahead to record 2 for the same customer and returns the Member_Type_Code of record 2 on record 1. We then can compare those type codes and determine if an upgrade or downgrade occurred. We then are able to sum the results of the comparison and provide the desired totals.
And now we have a long winded explanation for a very small query :P

You want to use lag() for this, but you need to be careful about the date filtering. So, I think you want:
SELECT prev_membership_type, membership_type,
COUNT(*) as num_changes,
COUNT(DISTINCT member) as num_members
FROM (SELECT mddc.*,
LAG(mddc.membership_type) OVER (PARTITION BY mddc.customer_id ORDER BY mddc.date) as prev_membership_type
FROM member_detail_daily_changes_new mddc
) mddc
WHERE prev_membership_type <> membership_type AND
date >= '2018-01-01' AND
date < '2019-01-01'
GROUP BY membership_type, prev_membership_type;
Notes:
The filtering on date needs to occur after the calculation of lag().
This takes into account that members may have a certain type in 2017 and then change to a new type in 2018.
The date filtering is compatible with indexes.
Two values are calculated. One is the overall number of changes. The other counts each member only once for each type of change.

With conditional aggregation after self joining the table:
select
2018 fiscal,
sum(case when m.member_type_cd > t.member_type_cd then 1 else 0 end) upgrades,
sum(case when m.member_type_cd < t.member_type_cd then 1 else 0 end) downgrades
from member_detail_daily_changes_new m inner join member_detail_daily_changes_new t
on
t.customer = m.customer
and
t.changedate = (
select max(changedate) from member_detail_daily_changes_new
where customer = m.customer and changedate < m.changedate
)
where year(m.changedate) = 2018
This will work even if there are more than 2 types of membership level.

SQL Server - Need to obtain duplicate records based on mutiple criteria of the same column

I work with a huge dataset of hospital activity records. Each record represents something done on behalf of a patient. My focus is on patients that have experienced 'outpatient' activity, such as attended an appointment or clinic.
In the data, we get records that are duplicates in that; a patient is shown to have attended their first out patient appointment more than once in a six month period. This is an error on the part of the hospital who send their data. We have to identify these records to send back as challenges.
I have the following SQL statement which is finding records where the 'Patient Code' appears more than once.
SELECT * FROM dbo.Z_ForQueries a
JOIN (SELECT PatientCode
FROM dbo.Z_ForQueries
GROUP BY PatientCode
HAVING COUNT (*) > 1 ) b
ON a.PatientCode = b.PatientCode
WHERE [Multiple OPFA in month] = 'y'
I cannot for the life of me figure out how to syntax the next bit; For each set of duplicated patient codes, I only want to see the records where one of the records has a 'Month' of 7 (that's the just the current month I'm working on). If non of the groups of duplicated records have '7' in the month, then I don't need to see them.
For example, patient code L000066715 has 4 records, I can see that each record represents the same initial outpatient appointment in the same hospital speciality. Obviously you can only 'first attend' once. Each record has a month number; 3,4,6 & 7. Because this patient code has one of their duplicate records in month 7, I need it to be returned in the results along with the other 3 records.
Other patient codes exist in duplicate but none of their records are from month 7, so they don't need to be returned.
I hope I've set the scene properly for some help! Thanks.

Something like this should work:
SELECT *
FROM dbo.Z_ForQueries a
JOIN (
SELECT PatientCode,
MAX(CASE WHEN MONTH(dateColumn) = 7 THEN 1 ELSE 0 END) As InMonth
FROM dbo.Z_ForQueries
GROUP BY PatientCode
HAVING COUNT (*) > 1
) b ON a.PatientCode = b.PatientCode
And InMonth = 1
WHERE [Multiple OPFA in month] = 'y'
Explanation:
The CASE expression returns 1 for rows where Month=7, and 0 in all other cases. The MAX(..) around this CASE expressions thus returns 1 if any rows in the GROUP had a Month=7 and a 0 only if none of them did.

select all records from one table and return null values where they do not have another record in second table

I have looked high and low for this particular query and have not seen it.
We have two tables; Accounts table and then Visit table. I want to return the complete list of account names and fill in the corresponding fields with either null or the correct year etc. this data is used in a matrix report in SSRS.
sample:
Acounts:
AccountName AccountGroup Location
Brown Jug Brown Group Auckland
Top Shop Top Group Wellington
Super Shop Super Group Christchurch
Visit:
AcccountName VisitDate VisitAction
Brown Jug 12/12/2012 complete
Super Shop 1/10/2012 complete
I need to select weekly visits and show those that have had a complete visit and then the accounts that did not have a visit.
e.g.
Year Week AccountName VisitStatus for week 10/12/2012 should show
2012 50 Brown Jug complete
2012 50 Top Group not complete
2012 50 Super Shop not complete
e.g.
Year Week AccountName VisitStatus for week 1/10/2012 should show
2012 2 Brown Jug not complete
2012 2 Top Group not complete
2012 2 Super Shop complete

please correct me if am worng
select to_char(v.visitdate,'YYYY') year,
to_char(v.visitdate,'WW') WEAK,a.accountname,v.visitaction
from accounts a,visit v
where a.accountname=v.ACCCOUNTNAME
and to_char(v.visitdate,'WW')=to_char(sysdate,'WW')
union all
select to_char(sysdate,'YYYY') year,
to_char(sysdate,'WW') WEAK,a.accountname,'In Complete'
from accounts a
where a.accountname not in ( select v.ACCCOUNTNAME
from visit v where to_char(v.visitdate,'WW')=to_char(sysdate,'WW'));

The following answer assumes that
A) You want to see every week within a given range, whether any accounts were visited in that week or not.
B) You want to see all accounts for each week
C) For accounts that were visited in a given week, show their actual VisitAction.
D) For accounts that were NOT visited in a given week, show "not completed" as the VisitAction.
If all those are the case then the following query may do what you need. There is a functioning sqlfiddle example that you can play with here: http://sqlfiddle.com/#!3/4aac0/7
--First, get all the dates in the current year.
--This uses a Recursive CTE to generate a date
--for each week between a start date and an end date
--In SSRS you could create report parameters to replace
--these values.
WITH WeekDates AS
(
SELECT CAST('1/1/2012' AS DateTime) AS WeekDate
UNION ALL
SELECT DATEADD(WEEK,1,WeekDate) AS WeekDate
FROM WeekDates
WHERE DATEADD(WEEK,1,WeekDate) <= CAST('12/31/2012' AS DateTime)
),
--Next, add meta data to the weeks from above.
--Get the WeekYear and WeekNumber for each week.
--Note, you could skip this as a separate query
--and just included these in the next query,
--I've included it this way for clarity
Weeks AS
(
SELECT
WeekDate,
DATEPART(Year,WeekDate) AS WeekYear,
DATEPART(WEEK,WeekDate) AS WeekNumber
FROM WeekDates
),
--Cross join the weeks data from above with the
--Accounts table. This will make sure that we
--get a row for each account for each week.
--Be aware, this will be a large result set
--if there are a lot of weeks & accounts (weeks * account)
AccountWeeks AS
(
SELECT
*
FROM Weeks AS W
CROSS JOIN Accounts AS A
)
--Finally LEFT JOIN the AccountWeek data from above
--to the Visits table. This will ensure that we
--see each account/week, and we'll get nulls for
--the visit data for any accounts that were not visited
--in a given week.
SELECT
A.WeekYear,
A.WeekNumber,
A.AccountName,
A.AccountGroup,
IsNull(V.VisitAction,'not complete') AS VisitAction
FROM AccountWeeks AS A
LEFT JOIN Visits AS V
ON A.AccountName = V.AccountName
AND A.WeekNumber = DATEPART(WEEK,V.VisitDate)
--Set the maxrecursion number to a number
--larger than the number of weeks you will return
OPTION (MAXRECURSION 200);
I hope that helps.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to add custom YoY field to output? - sql

Related

How to find if there is a match in a certain time interval between two different date fields SQL?

Subtracting results from two queries / sub queries

How many customers upgraded from Product A to Product B?

SQL Server - Need to obtain duplicate records based on mutiple criteria of the same column

select all records from one table and return null values where they do not have another record in second table

Categories

Resources