filtering where start & end Dates between selected Date - powerpivot

I have a fact table that contains people and their qualifications. The fact table has two dates, StartStudyDate and EndStudyDate, which represent the period during which the students were studying.
I also have a Person dimension, a Qualification dimension, a Grouping dimension and one Date dimension.
I'm trying to find a count of students who were actively studying on a particular date.
In SQL it's relatively simple:
select a.PaygroupDescription, a.[Qualification], count(a.[PersonID])
from (
select distinct p.[PersonID], PaygroupDescription, q.[Qualification]
from [hr].[Appointment Detail] ad
join hr.Paygroup pg on ad.PaygroupID = pg.PaygroupID
join hr.Qualification q on q.QualificationID = ad.QualificationID
join hr.Person p on p.PersonID = ad.PersonID
join dimdate sd on sd.DateID = ad.StartDateID
join dimDate ed on ed.DateID = ad.EndDateID
where sd.date <= 20150101 and ed.date >= 20150101
) as a
group by a.PaygroupDescription, a.[Qualification]
The problem is I can't figure out how to do this in DAX.
I started out by adding two calculated columns to the fact table in the tabular model:
ActualStartDate
=LOOKUPVALUE(
'Date'[Date],
'Date'[DateID],
'Appointment Detail'[StartDateID])
ActualEndDate
=LOOKUPVALUE(
'Date'[Date],
'Date'[DateID],
'Appointment Detail'[EndDateID])
I then wrote a measure that checks whether a single date is selected from DimDate and, if so, counts the distinct people whose ActualStartDate <= the selected date and whose ActualEndDate >= the selected date.
The problem is that it performs terribly. If I try to add any attributes to break the data down, I run out of memory (at least in 32-bit Excel). I know I could try 64-bit Excel, but my dataset is small, so memory should not be an issue. This is before I even add filters to the calculation for specific qualifications:
EmployeeCount:=if(HASONEVALUE('Date'[Date]),
CALCULATE
(distinctcount('Appointment Detail'[PersonID]),
DATESBETWEEN('Date'[Date], min('Appointment Detail'[ActualStartDate]),Max( 'Appointment Detail'[ActualEndDate]))
)
,BLANK())
I'd appreciate help in understanding the problem correctly, as I'm obviously missing something here and my DAX experience is very light.

I would remove the relationships to DimDate and then use something like the following pattern for your measure:
EmployeeCount := CALCULATE( COUNTROWS ( 'Person' ),
FILTER('Appointment Detail',
'Appointment Detail'[ActualStartDate] <= MAX(DimDate[Date])
&& 'Appointment Detail'[ActualEndDate] >= MIN(DimDate[Date])
))
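The filter in the measure above is the usual interval-overlap test: a row counts when it starts on or before the end of the selection and ends on or after the start of the selection. Here is a minimal sketch of the same test in SQL, run through Python's sqlite3 so it can be tried without a tabular model. The table name study, its columns and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample data: (person_id, start_date, end_date) as ISO strings.
rows = [
    (1, "2014-09-01", "2015-06-30"),
    (2, "2015-02-01", "2015-12-15"),
    (3, "2013-01-10", "2014-12-31"),
    (1, "2014-10-01", "2015-03-31"),  # same person, second qualification
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE study (person_id INTEGER, start_date TEXT, end_date TEXT)")
conn.executemany("INSERT INTO study VALUES (?, ?, ?)", rows)


def active_count(range_start, range_end):
    """Count distinct people whose study period overlaps [range_start, range_end]."""
    sql = """
        SELECT COUNT(DISTINCT person_id)
        FROM study
        WHERE start_date <= ?   -- started on or before the end of the selection
          AND end_date   >= ?   -- finished on or after the start of the selection
    """
    return conn.execute(sql, (range_end, range_start)).fetchone()[0]


single_day = active_count("2015-01-01", "2015-01-01")
whole_year = active_count("2015-01-01", "2015-12-31")
print(single_day, whole_year)
```

With a single date selected, both bounds collapse to that date, which matches the WHERE clause of the SQL in the question.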

SQL Query to Find if (Count of date > X) = 0 For Group of ID

I apologize if the title is not correct; I'm not sure what to ask for, since I don't know how to build the query.
I have the following query built to return a list of chemicals and other related fields.
SELECT DISTINCT
RDB.Chemical_Record.[Chemical_ID],
RDB.Chemical_Record.[Expires_Date],
RDB.Assay_Group.[Assay_Group_Name] AS [Assay Group],
RDB.Chemical.[Chemical_Name],
RDB.Chemical.[Product_Number],
RDB.Chemical_Record.[Lot_Number],
RDB.Storage_Location.[Location_Name]
FROM RDB.Chemical_Record
LEFT JOIN RDB.Chemical ON Chemical_Record.[Chemical_ID] = Chemical.[ID_Chemical]
LEFT JOIN RDB.Storage_Location ON Storage_Location.[ID_Storage_Location] = Chemical_Record.[Storage_Location_ID]
LEFT JOIN RDB.Chemical_To_AGroup ON Chemical_To_AGroup.[Chemical_ID] = Chemical_Record.[Chemical_ID]
LEFT JOIN RDB.Assay_Group ON Assay_Group.[ID_Assay_Group] = Chemical_To_AGroup.[Assay_Group_ID]
WHERE RDB.Chemical_Record.[Expires_Date] >= DATEADD(day,-60, GETDATE())
ORDER BY RDB.Chemical_Record.[Chemical_ID], RDB.Chemical_Record.[Expires_Date], RDB.Assay_Group.[Assay_Group_Name]
I am using this query in a VB.Net application where it exports the results to an Excel worksheet and then performs additional actions to delete the rows I don't need. The process to query is quick, but working with Excel from .Net is painful and slow.
Instead I'd like to build the query to return the exact results I want, which I think is possible, I just can't figure out how. I have tried using a combination of Count, Group and Having, but since I've never worked with those I can't get them to work for me.
Example:
SELECT
COUNT(RDB.Chemical_Record.[Chemical_ID]) Count_ID,
RDB.Chemical_Record.[Chemical_ID],
RDB.Chemical_Record.[Expires_Date]
FROM RDB.Chemical_Record
WHERE RDB.Chemical_Record.[Expires_Date] > DATEADD(day,30,GETDATE())
GROUP BY RDB.Chemical_Record.[Chemical_ID], RDB.Chemical_Record.[Expires_Date]
ORDER BY RDB.Chemical_Record.[Chemical_ID]
As you can see from this example, it doesn't return the count of IDs where Expires_Date > DATEADD(day,30,GETDATE()), nor does it return the IDs that I actually wanted.
What I need to return is all chemicals (IDs) that DO NOT have any record with an expiration date > Today + 30. The screenshot below shows an example of the data that gets pulled. The yellow highlighted rows are the only two in that set that should be returned, as there are no other records for those two IDs with an expiration date > Today + 30. All the other IDs should not show up, since for them COUNT(Expiration Date > Today + 30) > 0.
If someone could help me build the query using the appropriate Aggregate functions, it would be MUCH appreciated.
What I need to return is all chemicals (ID) that DO NOT have an expiration date > Today + 30 for that specific ID.
For this question, you can use a HAVING clause. No WHERE is needed:
SELECT COUNT(*) as Count_ID, cr.[Chemical_ID]
FROM RDB.Chemical_Record cr
GROUP BY cr.[Chemical_ID]
HAVING MAX(cr.Expires_Date) <= DATEADD(day, 30, GETDATE())
ORDER BY cr.[Chemical_ID]
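To see why HAVING MAX(...) answers "no record beyond the cutoff", here is a small sketch using Python's sqlite3. The table contents are invented, a fixed reference date stands in for GETDATE() so the result is reproducible, and SQLite's date(?, '+30 days') plays the role of DATEADD(day, 30, ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chemical_record (chemical_id INTEGER, expires_date TEXT)")
conn.executemany(
    "INSERT INTO chemical_record VALUES (?, ?)",
    [
        (10, "2024-01-10"),
        (10, "2024-06-01"),  # later lot: chemical 10 is still covered
        (20, "2024-01-05"),
        (20, "2024-01-20"),  # latest lot expires inside the 30-day window
        (30, "2023-12-15"),  # only lot has already expired
    ],
)

today = "2024-01-01"  # assumed reference date, replaces GETDATE()

sql = """
    SELECT chemical_id, COUNT(*) AS lot_count
    FROM chemical_record
    GROUP BY chemical_id
    HAVING MAX(expires_date) <= date(?, '+30 days')
    ORDER BY chemical_id
"""
expiring = conn.execute(sql, (today,)).fetchall()
print(expiring)
```

If the latest expiry for an ID is within the cutoff, then every record for that ID is, so the group-level test replaces a per-row count.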
Using the HAVING MAX solved my problem and I was then able to work out exactly what I needed. I had to do some more research to figure out how to bring all my columns back, but that wasn't as difficult.
Here is my final solution:
WITH CHEM AS (
SELECT RDB.Chemical_Record.[Chemical_ID]
FROM RDB.Chemical_Record
GROUP BY RDB.Chemical_Record.[Chemical_ID]
HAVING MAX(RDB.Chemical_Record.Expires_Date) <= DATEADD(day, 60, GETDATE())
)
SELECT DISTINCT
RDB.Chemical_Record.[Chemical_ID],
RDB.Chemical_Record.[Expires_Date],
RDB.Assay_Group.[Assay_Group_Name] AS [Assay Group],
RDB.Chemical.[Chemical_Name],
RDB.Chemical.[Product_Number],
RDB.Chemical_Record.[Lot_Number],
RDB.Storage_Location.[Location_Name]
FROM RDB.Chemical_Record
INNER JOIN CHEM ON CHEM.Chemical_ID = RDB.Chemical_Record.Chemical_ID
LEFT JOIN RDB.Chemical ON Chemical_Record.[Chemical_ID] = Chemical.[ID_Chemical]
LEFT JOIN RDB.Storage_Location ON Storage_Location.[ID_Storage_Location] = Chemical_Record.[Storage_Location_ID]
LEFT JOIN RDB.Chemical_To_AGroup ON Chemical_To_AGroup.[Chemical_ID] = Chemical_Record.[Chemical_ID]
LEFT JOIN RDB.Assay_Group ON Assay_Group.[ID_Assay_Group] = Chemical_To_AGroup.[Assay_Group_ID]
WHERE Expires_Date >= DATEADD(day, -60, GETDATE())
ORDER BY RDB.Chemical_Record.[Chemical_ID], RDB.Chemical_Record.Expires_Date
And a screenshot showing the resulting search:

Expand Join to not limit data

I have a weird question. I understand that joins return matching data based on the ON condition. The problem I am facing is that I need the business date back from both tables, but at the same time I need to join on the date in order to get the totals correct.
See the code below:
Select
o.Resort,
o.Business_Date,
Occupied,
Comps,
House,
ADR,
Room_Revenue,
Occupied-(Comps+House) AS DandT,
Coalesce(gd.Projected_Occ1,0) AS Projected_Occ1,
Occupied-(Comps+House)+Coalesce(gd.Projected_Occ1,0) as Total
from Occupancy o
left join Group_Details_HF gd
on o.Business_Date = gd.Business_Date
and o.Resort = gd.resort
UNION ALL
select
o.Resort,
o.Business_Date,
Occupied,
Comps,
House,
ADR,
Room_Revenue,
Occupied-(Comps+House) AS DandT,
Coalesce(gd.Projected_Occ1,0) AS Projected_Occ1,
Coalesce(Occupied-(Comps+House),0)+Coalesce(gd.Projected_Occ1,0) as Total
from Occupancy_Forecast o
FULL OUTER JOIN Group_Details_HF gd
on o.Business_Date = gd.Business_Date
and o.Resort = gd.resort
Currently, this gives me the desired results from the Occupancy and Occupancy_Forecast tables. However, when a business date does not exist in the Occupancy_Forecast table, the Group_Details_HF data for that date is ignored. I need the results to combine the dates when they exist in both tables, or give the unique results for each when there is no match.
I have decided to create another pivot table storing the details from Group_Details_HF and then UNION the two tables together, which has given me the desired result without fiddling with the join :)
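One way to read the union approach described above: stack both sources with zero-filled columns, then group by resort and date, so a date present in only one table still produces a row. The sketch below uses Python's sqlite3 with simplified, hypothetical tables and columns; it is not the poster's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE occupancy_forecast (resort TEXT, business_date TEXT, occupied INTEGER);
    CREATE TABLE group_details (resort TEXT, business_date TEXT, projected_occ INTEGER);
    INSERT INTO occupancy_forecast VALUES ('A', '2024-03-01', 50), ('A', '2024-03-02', 40);
    INSERT INTO group_details VALUES ('A', '2024-03-02', 10), ('A', '2024-03-03', 5);
""")

sql = """
    SELECT resort, business_date,
           SUM(occupied) AS occupied,
           SUM(projected_occ) AS projected_occ,
           SUM(occupied) + SUM(projected_occ) AS total
    FROM (
        -- each source contributes 0 for the column it does not own
        SELECT resort, business_date, occupied, 0 AS projected_occ FROM occupancy_forecast
        UNION ALL
        SELECT resort, business_date, 0 AS occupied, projected_occ FROM group_details
    ) AS combined
    GROUP BY resort, business_date
    ORDER BY resort, business_date
"""
combined = conn.execute(sql).fetchall()
for row in combined:
    print(row)
```

Because the date column comes from whichever table supplied the row, it is never NULL, which is the part a plain outer join on o.Business_Date does not give you.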

Can't Make Crosstab Query on a query containing SubQuery

I have a query that contains a subquery to calculate the interval between departure and arrival times from my table "Timetable".
This query works fine on its own, but when I try to use it as the source of a crosstab, I get an error that it cannot find table "a", which is the alias I used for "Timetable".
SELECT a.VesselID, a.MovementID, a.MovementTime, (SELECT TOP 1
Timetable.MovementTime
FROM Timetable
WHERE (((Timetable.MovementID)="Arrival") AND
((Timetable.VesselID)=a.[VesselID]) AND ((Timetable.MovementTime)>a.[MovementTime]))
ORDER BY Timetable.MovementTime) AS Arrival1,
DateDiff('h',[a].[MovementTime],[Arrival1]) AS [Interval]
FROM Timetable AS a INNER JOIN Timetable ON a.ID = Timetable.ID
WHERE (((a.MovementID)="Departure"));
I think this question is very similar, and the solution is to split my query as @DHW said, but I couldn't do that.
This is my attempt at splitting:
[Departure_Query]
SELECT Timetable.VesselID, Timetable.MovementTime AS mymov,
Timetable.MovementID
FROM Timetable
WHERE (((Timetable.MovementID)="Departure"));
[Main]
SELECT Timetable.MovementTime, Timetable.MovementID, Timetable.VesselID, Departure_Query.mymov, DateDiff('h',[mymov],[MovementTime]) AS [Interval]
FROM Timetable INNER JOIN Departure_Query ON Timetable.VesselID = Departure_Query.VesselID
WHERE (((Timetable.MovementTime)>[Departure_Query].[mymov]) AND ((Timetable.MovementID)="Arrival") AND ((Timetable.VesselID)=[Departure_Query].[VesselID]))
ORDER BY Timetable.MovementTime;
I think the problem is:
In the working query I could put SELECT TOP 1, but in the split attempt I don't know where to put it.
Update: Actually, right now I want to split it anyway, because when I try to build a report on top of it, Access tells me that it can't do grouping on this field.
Anyway, this is my crosstab attempt:
TRANSFORM DateDiff('h',[a].[MovementTime],[Arrival1]) AS [Interval]
SELECT a.MovementTime
FROM Timetable AS a INNER JOIN Timetable ON a.ID = Timetable.ID
WHERE (((a.MovementID)="Departure"))
GROUP BY a.MovementID, a.MovementTime, (SELECT TOP 1 Timetable.MovementTime
FROM Timetable
WHERE (((Timetable.MovementID)="Arrival") AND ((Timetable.VesselID)=a.[VesselID]) AND ((Timetable.MovementTime)>a.[MovementTime]))
ORDER BY Timetable.MovementTime)
PIVOT a.VesselID;
Screenshots: the results and the Design View.
Consider a crosstab with a domain aggregate, DMin(), to replace the subquery:
TRANSFORM DateDiff('h', main.[MovementTime], main.[Arrival1]) AS [Interval]
SELECT main.MovementID, main.MovementTime
FROM
(SELECT t.VesselID, t.MovementID, t.MovementTime,
DMin("MovementTime", "Timetable", "MovementID = 'Arrival'
AND VesselID = " & t.VesselID & "
AND MovementTime > #" & t.MovementTime & "#") As Arrival1
FROM Timetable AS t
WHERE (((t.MovementID) = 'Departure'))
) As main
GROUP BY main.MovementID, main.MovementTime
PIVOT main.VesselID;
Thank you @Parfait and @June7. I am adding this answer so that anyone in the future can benefit from it.
The Problem
I figured out the problem: the query was subtracting every earlier departure date for a given vessel from each arrival.
For example, Vessel 1 departed 6/1, 6/3, 6/6 and arrived 6/2, 6/2, 6/8, so for the last arrival it was calculating 6/8-6/6, 6/8-6/3 and 6/8-6/1. Of course, only the first one (6/8-6/6) is the right one.
The Solution
SELECT Min(Timetable.MovementTime) AS MinOfMovementTime, Departure_Query.mymov AS DeptDate, Min(DateDiff('h',[mymov],[MovementTime])) AS WorkingH, Timetable.MovementID, Timetable.VesselID
FROM Timetable LEFT JOIN Departure_Query ON Timetable.VesselID = Departure_Query.VesselID
WHERE (((Timetable.MovementID)="Arrival") AND ((Timetable.VesselID)=[Departure_Query].[VesselID]) AND ((Timetable.MovementTime)>[mymov]))
GROUP BY Departure_Query.mymov, Timetable.MovementID, Timetable.VesselID
ORDER BY Min(Timetable.MovementTime);
The only change here is Min(DateDiff('h',[mymov],[MovementTime])), which returns only the smallest difference. Within each departure group that pairs the departure with its next arrival, so every arrival ends up matched to its latest preceding departure.
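The idea of pairing each departure with the earliest later arrival can be sketched outside Access as well. The example below uses Python's sqlite3 with invented timetable rows; julianday() arithmetic stands in for Access's DateDiff('h', ...), and the table and column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE timetable (vessel_id INTEGER, movement_id TEXT, movement_time TEXT)")
conn.executemany(
    "INSERT INTO timetable VALUES (?, ?, ?)",
    [
        (1, "Departure", "2024-06-01 08:00:00"),
        (1, "Arrival",   "2024-06-02 08:00:00"),
        (1, "Departure", "2024-06-03 08:00:00"),
        (1, "Arrival",   "2024-06-04 20:00:00"),
        (1, "Departure", "2024-06-06 08:00:00"),
        (1, "Arrival",   "2024-06-08 08:00:00"),
    ],
)

sql = """
    SELECT d.vessel_id,
           d.movement_time AS departed,
           MIN(a.movement_time) AS arrived,   -- earliest arrival after this departure
           CAST(ROUND((julianday(MIN(a.movement_time))
                       - julianday(d.movement_time)) * 24) AS INTEGER) AS hours
    FROM timetable AS d
    JOIN timetable AS a
      ON a.vessel_id = d.vessel_id
     AND a.movement_id = 'Arrival'
     AND a.movement_time > d.movement_time
    WHERE d.movement_id = 'Departure'
    GROUP BY d.vessel_id, d.movement_time
    ORDER BY d.movement_time
"""
trips = conn.execute(sql).fetchall()
for trip in trips:
    print(trip)
```

Grouping by the departure row and taking MIN over the matching arrivals does the job that SELECT TOP 1 ... ORDER BY did in the original subquery.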

Include missing years in Group By query

I am fairly new to Access and SQL programming. I am trying to do the following:
Sum(SO_SalesOrderPaymentHistoryLineT.Amount) AS [Sum Of PaymentPerYear]
and group by year even when there is no amount in some of the years. I would like to have these years listed as well for a report with charts. I'm not certain if this is possible, but every bit of help is appreciated.
My code so far is as follows:
SELECT
Base_CustomerT.SalesRep,
SO_SalesOrderT.CustomerId,
Base_CustomerT.Customer,
SO_SalesOrderPaymentHistoryLineT.DatePaid,
Sum(SO_SalesOrderPaymentHistoryLineT.Amount) AS [Sum Of PaymentPerYear]
FROM
Base_CustomerT
INNER JOIN (
SO_SalesOrderPaymentHistoryLineT
INNER JOIN SO_SalesOrderT
ON SO_SalesOrderPaymentHistoryLineT.SalesOrderId = SO_SalesOrderT.SalesOrderId
) ON Base_CustomerT.CustomerId = SO_SalesOrderT.CustomerId
GROUP BY
Base_CustomerT.SalesRep,
SO_SalesOrderT.CustomerId,
Base_CustomerT.Customer,
SO_SalesOrderPaymentHistoryLineT.DatePaid,
SO_SalesOrderPaymentHistoryLineT.PaymentType,
Base_CustomerT.IsActive
HAVING
(((SO_SalesOrderPaymentHistoryLineT.PaymentType)=1)
AND ((Base_CustomerT.IsActive)=Yes))
ORDER BY
Base_CustomerT.SalesRep,
Base_CustomerT.Customer;
You need another table with all years listed -- you can create this on the fly or have one in the db... join from that. So if you had a table called alltheyears with a column called y that just listed the years then you could use code like this:
WITH minmax as
(
select min(year(SO_SalesOrderPaymentHistoryLineT.DatePaid)) as minyear,
max(year(SO_SalesOrderPaymentHistoryLineT.DatePaid)) as maxyear
from SO_SalesOrderPaymentHistoryLineT
), yearsused as
(
select y
from alltheyears, minmax
where alltheyears.y >= minyear and alltheyears.y <= maxyear
)
select *
from yearsused
left join ( /* your query above goes here */ ) T
ON year(T.DatePaid) = yearsused.y
You need a data source that will provide the year numbers. You cannot manufacture them out of thin air. Supposing you had a table Interesting_year with a single column year, populated, say, with every distinct integer between 2000 and 2050, you could do something like this:
SELECT
base.SalesRep,
base.CustomerId,
base.Customer,
base.year,
Sum(NZ(data.Amount, 0)) AS [Sum Of PaymentPerYear]
FROM
(SELECT * FROM Base_CustomerT, Interesting_year) AS base
LEFT JOIN
(SELECT * FROM
SO_SalesOrderT
INNER JOIN SO_SalesOrderPaymentHistoryLineT
ON (SO_SalesOrderPaymentHistoryLineT.SalesOrderId = SO_SalesOrderT.SalesOrderId)
WHERE (SO_SalesOrderPaymentHistoryLineT.PaymentType = 1)
) AS data
ON ((base.CustomerId = data.CustomerId)
AND (base.year = Year(data.DatePaid)))
WHERE
(base.IsActive = Yes)
AND (base.year BETWEEN
(SELECT Min(year(DatePaid)) FROM SO_SalesOrderPaymentHistoryLineT)
AND (SELECT Max(year(DatePaid)) FROM SO_SalesOrderPaymentHistoryLineT))
GROUP BY
base.SalesRep,
base.CustomerId,
base.Customer,
base.year
ORDER BY
base.SalesRep,
base.Customer;
Note the following:
The revised query first forms the Cartesian product of Base_CustomerT with Interesting_year in order to have base customer data associated with each year (this is sometimes called a CROSS JOIN; Access has no CROSS JOIN keyword, so the two tables are simply listed with a comma between them).
In order to have result rows for years with no payments, you must perform an outer join (in this case a LEFT JOIN). Where a (base customer, year) combination has no associated orders, the rest of the columns of the join result will be NULL.
I'm selecting the CustomerId from Base_CustomerT because you would sometimes get a NULL if you selected from SO_SalesOrderT as in the starting query.
I'm using the Access Nz() function to convert NULL payment amounts to 0 (from rows corresponding to years with no payments).
I converted your HAVING clause to WHERE clauses. That's semantically equivalent in this particular case, and it will be more efficient because the WHERE filter is applied before groups are formed, and because it allows some columns to be omitted from the GROUP BY clause. The PaymentType condition sits inside the data subquery rather than in the outer WHERE, because a filter on a column from the outer-joined side would discard the NULL rows that represent years with no payments.
Following Hogan's example, I filter out data for years outside the overall range covered by your data. Alternatively, you could achieve the same effect without that filter condition and its subqueries by ensuring that table Interesting_year contains only the year numbers for which you want results.
Update: modified the query to a different, but logically equivalent "something like this" that I hope Access will like better. Aside from adding a bunch of parentheses, the main difference is making both the left and the right operand of the LEFT JOIN into a subquery. That's consistent with the consensus recommendation for resolving Access "ambiguous outer join" errors.
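The pattern described in the notes (Cartesian product of customers and years, then an outer join to the payments) can be tried on a toy dataset. This is a sketch in Python's sqlite3, not Access SQL: the tables customer, interesting_year and payment are hypothetical, COALESCE stands in for Nz(), and strftime('%Y', ...) stands in for Year().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER, name TEXT);
    CREATE TABLE interesting_year (year INTEGER);
    CREATE TABLE payment (customer_id INTEGER, date_paid TEXT, amount REAL);
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO interesting_year VALUES (2019), (2020), (2021);
    INSERT INTO payment VALUES
        (1, '2019-03-01', 100.0),
        (1, '2021-07-15', 40.0),
        (1, '2021-09-01', 10.0);
""")

sql = """
    SELECT c.name, y.year, COALESCE(SUM(p.amount), 0) AS paid
    FROM customer AS c
    CROSS JOIN interesting_year AS y          -- every customer paired with every year
    LEFT JOIN payment AS p                    -- keeps (customer, year) pairs with no payments
      ON p.customer_id = c.customer_id
     AND CAST(strftime('%Y', p.date_paid) AS INTEGER) = y.year
    GROUP BY c.name, y.year
    ORDER BY c.name, y.year
"""
per_year = conn.execute(sql).fetchall()
for row in per_year:
    print(row)
```

The year 2020 appears with a total of 0 even though no payment row exists for it, which is the behaviour the report needs.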
Thank you, John, for your help. I found a solution that works for me. It looks quite different, but I learned a lot from it. If you are interested, here is how it looks now:
SELECT DISTINCTROW
Base_Customer_RevenueYearQ.SalesRep,
Base_Customer_RevenueYearQ.CustomerId,
Base_Customer_RevenueYearQ.Customer,
Base_Customer_RevenueYearQ.RevenueYear,
CustomerPaymentPerYearQ.[Sum Of PaymentPerYear]
FROM
Base_Customer_RevenueYearQ
LEFT JOIN CustomerPaymentPerYearQ
ON (Base_Customer_RevenueYearQ.RevenueYear = CustomerPaymentPerYearQ.[RevenueYear])
AND (Base_Customer_RevenueYearQ.CustomerId = CustomerPaymentPerYearQ.CustomerId)
GROUP BY
Base_Customer_RevenueYearQ.SalesRep,
Base_Customer_RevenueYearQ.CustomerId,
Base_Customer_RevenueYearQ.Customer,
Base_Customer_RevenueYearQ.RevenueYear,
CustomerPaymentPerYearQ.[Sum Of PaymentPerYear]
;

SQL SUM function doubling the amount it should using multiple tables

My query below is doubling the amount on the last record it returns. I have 3 tables - activities, bookings and tempbookings. The query needs to list the activities and attached information and pull the total number (using the SUM) of places booked (as BookingTotal) from the booking table by each activity and then it needs to calculate the same for tempbookings (as tempPlacesReserved) providing the reservedate field inside that table is in the future.
However, the first issue is that if there are no records for an activity in the tempbookings table, the query does not return any records for that activity at all. To get around this I created dummy records in the past so that it still returns the record, but if I can make it so I don't have to do this, I would prefer it!
The main issue I have is that on the final record of the returned results it doubles the booking total and the places reserved which of course makes the whole query useless.
I know that I am doing something wrong I just haven't been able to sort it, I have searched similar issues online but am unable to apply them to my situation correctly.
Any help would be appreciated.
P.S. I'm aware that normally you wouldn't need to fully label all the paths to the databases, tables and fields as I have but for the program I am planning to use it in I have to do it this way.
Code:
SELECT [LeisureActivities].[dbo].[activities].[activityID],
[LeisureActivities].[dbo].[activities].[activityName],
[LeisureActivities].[dbo].[activities].[activityDate],
[LeisureActivities].[dbo].[activities].[activityPlaces],
[LeisureActivities].[dbo].[activities].[activityPrice],
SUM([LeisureActivities].[dbo].[bookings].[bookingPlaces]) AS 'bookingTotal',
SUM (CASE WHEN[LeisureActivities].[dbo].[tempbookings].[tempReserveDate] > GetDate() THEN [LeisureActivities].[dbo].[tempbookings].[tempPlaces] ELSE 0 end) AS 'tempPlacesReserved'
FROM [LeisureActivities].[dbo].[activities],
[LeisureActivities].[dbo].[bookings],
[LeisureActivities].[dbo].[tempbookings]
WHERE ([LeisureActivities].[dbo].[activities].[activityID]=[LeisureActivities].[dbo].[bookings].[activityID]
AND [LeisureActivities].[dbo].[activities].[activityID]=[LeisureActivities].[dbo].[tempbookings].[tempActivityID])
AND [LeisureActivities].[dbo].[activities].[activityDate] > GetDate ()
GROUP BY [LeisureActivities].[dbo].[activities].[activityID],
[LeisureActivities].[dbo].[activities].[activityName],
[LeisureActivities].[dbo].[activities].[activityDate],
[LeisureActivities].[dbo].[activities].[activityPlaces],
[LeisureActivities].[dbo].[activities].[activityPrice];
Your current query is using an INNER JOIN between each of the tables so if the tempBookings table has no records, you will not return anything.
I would advise that you start to use JOIN syntax. You might also need to use subqueries to get the totals.
SELECT a.[activityID],
a.[activityName],
a.[activityDate],
a.[activityPlaces],
a.[activityPrice],
coalesce(b.bookingTotal, 0) bookingTotal,
coalesce(t.tempPlacesReserved, 0) tempPlacesReserved
FROM [LeisureActivities].[dbo].[activities] a
LEFT JOIN
(
select activityID,
SUM([bookingPlaces]) AS bookingTotal
from [LeisureActivities].[dbo].[bookings]
group by activityID
) b
ON a.[activityID]=b.[activityID]
LEFT JOIN
(
select tempActivityID,
SUM(CASE WHEN [tempReserveDate] > GetDate() THEN [tempPlaces] ELSE 0 end) AS tempPlacesReserved
from [LeisureActivities].[dbo].[tempbookings]
group by tempActivityID
) t
ON a.[activityID]=t.[tempActivityID]
WHERE a.[activityDate] > GetDate();
Note: I am using aliases because it is easier to read
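A likely reason for doubled sums in queries of this shape is join fan-out: when one activity has several rows in both child tables, joining them all together repeats each bookings row once per tempbookings row. The sketch below, in Python's sqlite3 with made-up tables and values, compares a direct three-table join with the pre-aggregated subquery form used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activities (activity_id INTEGER, name TEXT);
    CREATE TABLE bookings (activity_id INTEGER, places INTEGER);
    CREATE TABLE tempbookings (activity_id INTEGER, places INTEGER);
    INSERT INTO activities VALUES (1, 'Kayaking');
    INSERT INTO bookings VALUES (1, 3);
    INSERT INTO tempbookings VALUES (1, 2), (1, 4);
""")

# Direct join: the single bookings row is repeated for each tempbookings row.
naive = conn.execute("""
    SELECT a.activity_id, SUM(b.places), SUM(t.places)
    FROM activities a
    JOIN bookings b ON b.activity_id = a.activity_id
    JOIN tempbookings t ON t.activity_id = a.activity_id
    GROUP BY a.activity_id
""").fetchone()

# Pre-aggregated: each child table is summed on its own before joining.
fixed = conn.execute("""
    SELECT a.activity_id, COALESCE(b.total, 0), COALESCE(t.total, 0)
    FROM activities a
    LEFT JOIN (SELECT activity_id, SUM(places) AS total
               FROM bookings GROUP BY activity_id) b
      ON b.activity_id = a.activity_id
    LEFT JOIN (SELECT activity_id, SUM(places) AS total
               FROM tempbookings GROUP BY activity_id) t
      ON t.activity_id = a.activity_id
""").fetchone()

print(naive, fixed)
```

Here the booking total of 3 is reported as 6 by the direct join, because there are two tempbookings rows, while the subquery version returns the real totals.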
Use the newer SQL-92 join syntax, and make the join to tempbookings an outer join. Also clean up your SQL with table aliases, which makes it easier to read. As to why the last row has doubled values, I don't know, but on the off chance that it is caused by the extra dummy records you entered, get rid of them; the outer join to tempbookings makes them unnecessary. The other possibility is that the join condition you had to the tempbookings table (t.tempActivityID = a.activityID) is insufficient to guarantee that it will match only one record in the activities table. If, for example, it matches two records in activities, then the rows from tempbookings would be repeated twice in the output, causing the sum to be doubled.
SELECT a.activityID, a.activityName, a.activityDate,
a.activityPlaces, a.activityPrice,
SUM(b.bookingPlaces) bookingTotal,
SUM (CASE WHEN t.tempReserveDate > GetDate()
THEN t.tempPlaces ELSE 0 end) tempPlacesReserved
FROM LeisureActivities.dbo.activities a
Join LeisureActivities.dbo.bookings b
On b.activityID = a.activityID
Left Join LeisureActivities.dbo.tempbookings t
On t.tempActivityID = a.activityID
WHERE a.activityDate > GetDate ()
GROUP BY a.activityID, a.activityName,
a.activityDate, a.activityPlaces,
a.activityPrice;