SSAS Attribute relationship behavior in Hierarchy

I am using SQL Server 2012 with Visual Studio 2015.
(While I am an excellent SSRS developer, I am only just getting into SSAS.)
I mainly want to use the hierarchy because it tidies up the Excel front end. At the same time, I want the attributes to behave the same way when browsed on their own from within the hierarchy as they do outside it.
I have a customer dimension, for example, where the PK is an int called ID, with Customer ID and Delivery Postcode attributes.
The attribute relationships are set to:
ID to Customer ID
ID to Delivery Postcode
The full hierarchy displays like this:
CustomerID Delivery PostCode Order Quantity
1 FY5 3TG 1
2 CH2 1NF 1
3 SK22 4DT 1
4 L20 6LT 1
5 FY6 8BX 1
6 WA8 7XP 1
7 L18 3JN 1
8 L12 3HB 1
9 M28 0SX 1
10 BB7 9AZ 1
10 ZZ99 9ZZ 1
9 ZZ99 9ZZ 1
If you pull just the Delivery Postcode level from the hierarchy, it returns:
Delivery PostCode Order Quantity
FY5 3TG 1
CH2 1NF 1
SK22 4DT 1
L20 6LT 1
FY6 8BX 1
WA8 7XP 1
L18 3JN 1
L12 3HB 1
M28 0SX 1
BB7 9AZ 1
ZZ99 9ZZ 1
ZZ99 9ZZ 1
I would expect the duplicate postcodes at the bottom (ZZ99 9ZZ) to roll up into a single row, as they would in a normal Excel pivot table. It is also worth noting that I have left the attributes visible outside the hierarchy. If I use that standalone Delivery Postcode attribute, it does roll up and returns the following:
Delivery PostCode Order Quantity
FY5 3TG 1
CH2 1NF 1
SK22 4DT 1
L20 6LT 1
FY6 8BX 1
WA8 7XP 1
L18 3JN 1
L12 3HB 1
M28 0SX 1
BB7 9AZ 1
ZZ99 9ZZ 2
I don't understand why the same attribute is acting differently inside and outside the hierarchy. My only conclusion, which is probably wrong(!), is that the hierarchy itself is forcing the Delivery Postcode attribute to act differently.
Any help would be great.

Is it allowed to have multiple empty parentid's?

So first I have a typical table that looks like this:
SectionID  SectionName  DivisionID    DivisionName  IndustryID     IndustryName
SectorID1  SecNam1      DivisionID11  DivName11     IndustryID111  InduNam111
SectorID1  SecNam1      DivisionID11  DivName11     IndustryID112  InduNam112
SectorID1  SecNam1      DivisionID12  DivName12     IndustryID121  InduNam121
SectorID2  SecNam2      DivisionID21  DivName21     IndustryID211  InduNam211
SectorID3  SecNam3      DivisionID31  DivName31     IndustryID311  InduNam311
Now I want to transform it into a more dynamic hierarchy structure with parent IDs, which looks like this:
ID  ParentID  Level  Type          Name
1   NULL      1      SectorID1     SecNam1
2   NULL      1      SectorID2     SecNam2
3   NULL      2      SectorID3     SecNam3
4   1         2      DivisionID11  DivName11
5   1         2      DivisionID12  DivName12
5   1         2      DivisionID21  DivName21
6   2         2      DivisionID31  DivName31
7   2         3      IndustryID    InduNam111
8   2         3      IndustryID    InduNam112
9   2         3      IndustryID    InduNam121
10  2         3      IndustryID    InduNam211
11  2         3      IndustryID    InduNam311
Is it okay to have multiple empty parentId's?
Is this transformation ok?
Having the parent for the root node as null is fine - though in some designs this is populated with the same value as the record itself rather than having to deal with nulls e.g.
ID  ParentID  Level  Type       Name
1   SecNam1   1      SectorID1  SecNam1
Also, having different versions of the same value in the TYPE column looks odd. I would have expected just "SectorID" not "SectorID1", "SectorID2", "SectorID3". The difference between different sectors is handled by the name of the sector e.g. like you have done with "Division".
Did you mean the level of "SectorID3" to be 2 rather than 1 - or is that just a typo?
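As a rough illustration of the transformation discussed above, here is a minimal Python sketch that turns the flat rows into parent/child rows and leaves the parent empty (`None`) for every top-level node. The function name, the generic type labels (`Sector`, `Division`, `Industry`) and the way IDs are numbered are assumptions made for this example, not part of the question's schema.

```python
# Flat sample rows from the question:
# (sector key, sector name, division key, division name, industry key, industry name)
SAMPLE_ROWS = [
    ("SectorID1", "SecNam1", "DivisionID11", "DivName11", "IndustryID111", "InduNam111"),
    ("SectorID1", "SecNam1", "DivisionID11", "DivName11", "IndustryID112", "InduNam112"),
    ("SectorID1", "SecNam1", "DivisionID12", "DivName12", "IndustryID121", "InduNam121"),
    ("SectorID2", "SecNam2", "DivisionID21", "DivName21", "IndustryID211", "InduNam211"),
    ("SectorID3", "SecNam3", "DivisionID31", "DivName31", "IndustryID311", "InduNam311"),
]

# Hypothetical type labels, one per level (the answer suggests one label per level).
LEVEL_TYPES = ("Sector", "Division", "Industry")


def flatten_to_adjacency(rows):
    """Return (id, parent_id, level, type, name) tuples; roots get parent_id None."""
    nodes = []
    seen = {}  # (depth, source key) -> generated id, so each node is emitted once
    for row in rows:
        parent_id = None  # every row starts again at the top level
        for depth, type_name in enumerate(LEVEL_TYPES):
            key, name = row[2 * depth], row[2 * depth + 1]
            if (depth, key) not in seen:
                seen[(depth, key)] = len(nodes) + 1
                nodes.append((seen[(depth, key)], parent_id, depth + 1, type_name, name))
            parent_id = seen[(depth, key)]  # this node is the parent of the next level
    return nodes


if __name__ == "__main__":
    for node in flatten_to_adjacency(SAMPLE_ROWS):
        print(node)
```

With the five sample rows this produces twelve nodes, three of which are roots with no parent, so several empty parent IDs coexist without any conflict. IDs are handed out in the order nodes are first encountered rather than level by level.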

increase rank based on particular value in column

I would appreciate some help with the issue below. I have the following table:
id  items
1   Product
2   Tea
3   Coffee
4   Sugar
5   Product
6   Rice
7   Wheat
8   Product
9   Beans
10  Oil
I want output like the table below. Basically, I want to increase the rank each time the item is 'Product'. How can I do that? For data privacy and compliance purposes I have modified the data and column names.
id  items    ranks
1   Product  1
2   Tea      1
3   Coffee   1
4   Sugar    1
5   Product  2
6   Rice     2
7   Wheat    2
8   Product  3
9   Beans    3
10  Oil      3
I have tried the LAG and LEAD functions but was unable to get the expected output.
Here is a solution that derives a value of 1 or 0 to mark the data boundaries and SUMs it up as a running total. The ROWS UNBOUNDED PRECEDING option is the key here.
SELECT
id,
items,
SUM(CASE WHEN items='Product' THEN 1 ELSE 0 END) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) as ranks
FROM
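To try the running-sum idea end to end, here is a small Python sketch using the standard-library `sqlite3` module. The table name `item_list` is hypothetical (the query above does not show one), and SQLite stands in for whatever database the question uses; window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample data from the question.
SAMPLE_ROWS = [
    (1, "Product"), (2, "Tea"), (3, "Coffee"), (4, "Sugar"),
    (5, "Product"), (6, "Rice"), (7, "Wheat"),
    (8, "Product"), (9, "Beans"), (10, "Oil"),
]


def ranked_items(rows):
    """Return (id, items, ranks) where ranks increases at every 'Product' row."""
    con = sqlite3.connect(":memory:")
    # 'item_list' is an assumed table name used only for this sketch.
    con.execute("CREATE TABLE item_list (id INTEGER PRIMARY KEY, items TEXT)")
    con.executemany("INSERT INTO item_list VALUES (?, ?)", rows)
    result = con.execute(
        """
        SELECT id,
               items,
               SUM(CASE WHEN items = 'Product' THEN 1 ELSE 0 END)
                   OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS ranks
        FROM item_list
        ORDER BY id
        """
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in ranked_items(SAMPLE_ROWS):
        print(row)
```

Each 'Product' row contributes 1 to the running total and every other row contributes 0, so the total only steps up at a group boundary.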

Select maximum value where another column is used for the Grouping

I'm trying to join several tables, where one of the tables is acting as a
key-value store, and then after the joins find the maximum value in a
column less than another column. As a simplified example, I have the following three tables:
Documents:
DocumentID  Filename      LatestRevision
1           D1001.SLDDRW  18
2           P5002.SLDPRT  10
Variables:
VariableID  VariableName
1           DateReleased
2           Change
3           Description
VariableValues:
DocumentID  VariableID  Revision  Value
1           2           1         Created
1           3           1         Drawing
1           2           3         Changed Dimension
1           1           4         2021-02-01
1           2           11        Corrected typos
1           1           16        2021-02-25
2           3           1         Generic part
2           3           5         Screw
2           2           4         2021-02-24
I can use the LEFT JOIN/IS NULL thing to get the latest version of
variables relatively easily (see http://sqlfiddle.com/#!7/5982d/3/0).
What I want is the latest version of variables that are less than or equal
to a revision which has a DateReleased, for example:
DocumentID  Filename      Variable     Value              VariableRev  DateReleased  ReleasedRev
1           D1001.SLDDRW  Change       Changed Dimension  3            2021-02-01    4
1           D1001.SLDDRW  Description  Drawing            1            2021-02-01    4
1           D1001.SLDDRW  Description  Drawing            1            2021-02-25    16
1           D1001.SLDDRW  Change       Corrected Typos    11           2021-02-25    16
2           P5002.SLDPRT  Description  Generic Part       1            2021-02-24    4
How do I do this?
I figured this out. Add another JOIN at the start that brings in a second copy of the VariableValues table, restricted to the DateReleased variable, then make sure that every VariableValues revision selected is lower than the revision of that DateReleased row. I think the LEFT JOIN has to be added after this table.
The example at http://sqlfiddle.com/#!9/bd6068/3/0 shows this better.
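For readers who cannot open the fiddle, the following Python sketch shows one way to read that description, using `sqlite3` and only the Document 1 rows from the question. It is an independent reconstruction, not the query from the linked fiddle; the aliases `rel`, `vv` and `newer` and the function name are made up for this example.

```python
import sqlite3


def latest_values_per_release():
    """For each DateReleased row, return the latest value of every other variable
    whose revision is not later than the released revision."""
    con = sqlite3.connect(":memory:")
    con.executescript(
        """
        CREATE TABLE Documents (DocumentID INTEGER, Filename TEXT, LatestRevision INTEGER);
        CREATE TABLE Variables (VariableID INTEGER, VariableName TEXT);
        CREATE TABLE VariableValues (DocumentID INTEGER, VariableID INTEGER,
                                     Revision INTEGER, Value TEXT);
        INSERT INTO Documents VALUES (1, 'D1001.SLDDRW', 18);
        INSERT INTO Variables VALUES (1, 'DateReleased'), (2, 'Change'), (3, 'Description');
        INSERT INTO VariableValues VALUES
            (1, 2, 1, 'Created'),
            (1, 3, 1, 'Drawing'),
            (1, 2, 3, 'Changed Dimension'),
            (1, 1, 4, '2021-02-01'),
            (1, 2, 11, 'Corrected typos'),
            (1, 1, 16, '2021-02-25');
        """
    )
    rows = con.execute(
        """
        SELECT d.DocumentID, d.Filename, v.VariableName, vv.Value,
               vv.Revision AS VariableRev,
               rel.Value AS DateReleased, rel.Revision AS ReleasedRev
        FROM Documents d
        -- rel: the extra copy of VariableValues holding only DateReleased rows
        JOIN VariableValues rel ON rel.DocumentID = d.DocumentID
        JOIN Variables rv ON rv.VariableID = rel.VariableID
                         AND rv.VariableName = 'DateReleased'
        -- vv: candidate values of the other variables, up to the released revision
        JOIN VariableValues vv ON vv.DocumentID = d.DocumentID
                              AND vv.VariableID <> rel.VariableID
                              AND vv.Revision <= rel.Revision
        JOIN Variables v ON v.VariableID = vv.VariableID
        -- newer: LEFT JOIN / IS NULL keeps only the latest candidate per variable
        LEFT JOIN VariableValues newer ON newer.DocumentID = vv.DocumentID
                                      AND newer.VariableID = vv.VariableID
                                      AND newer.Revision > vv.Revision
                                      AND newer.Revision <= rel.Revision
        WHERE newer.DocumentID IS NULL
        ORDER BY d.DocumentID, rel.Revision, v.VariableName
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in latest_values_per_release():
        print(row)
```

The important detail is that the "is there a newer row?" check is bounded by the released revision as well, so a later change (revision 11) does not hide the value that was current at revision 4.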

Check constraint for multiple conditions

Our teacher gave us a team assignment, and my teammate and I are really struggling with it (especially since we need to use things like TRIGGERS and PROCEDURES, which we haven't covered in class yet …).
We need to implement an arc-relationship, and we fail to understand how …
But before I tell you guys what I need to accomplish, I will give you part of the description of the task, so you guys can understand the situation a bit better …
We basically need to make an ERD for a VLSI CAD-system and we need to implement it. Now, we have our CELL entity, the attributes of which aren't really relevant … The only thing you guys need to know in order to help us is that it has a primary key, CELL_CODE, which is a VARCHAR.
Each CELL has many (I think at least four, I don't think you can have triangular CELLS, but doesn't matter anyways) SIDES. A SIDE can be logically identified by its CELL, and to make matters ridiculously difficult, each SIDE has to be numbered by its CELL, like so:
CELLS:
CELL_CODE
1
2
SIDES:
SEQUENCE_NUMBER CELL_CODE
1 1
2 1
3 1
1 2
2 2
3 2
Now, each SIDE has its CONNECTION_PINS. A CONNECTION_PIN is likewise identified by its SIDE, and the pins are numbered in a similar manner:
CELLS:
CELL_CODE
1
2
SIDES:
SEQUENCE_NUMBER CELL_CODE
1 1
2 1
3 1
1 2
2 2
3 2
CONNECTION_PINS:
SEQUENCE_NUMBER SIE_SEQUENCE_NUMBER CELL_CODE
1 1 1
2 1 1
1 2 1
2 2 1
1 3 1
2 3 1
1 1 2
2 1 2
1 2 2
2 2 2
1 3 2
2 3 2
I tried to explain the numbering issue we have here: Data model - PRIMARY KEY numbering issue, but yeah, I didn't really explain it the way it should be explained ...
Now, we have one final entity, which is where the arc comes in: CONNECTIONS. A CONNECTION has two CONNECTION_PINS: one for START_FROM and one for END_OF. Logically, the start pin can't also be the end pin for a given connection, and that's our struggle. Basically, this shouldn't be allowed:
CELLS:
CELL_CODE
1
2
SIDES:
SEQUENCE_NUMBER CELL_CODE
1 1
2 1
3 1
1 2
2 2
3 2
CONNECTION_PINS:
SEQUENCE_NUMBER SIE_SEQUENCE_NUMBER CELL_CODE
1 1 1
2 1 1
1 2 1
2 2 1
1 3 1
2 3 1
1 1 2
2 1 2
1 2 2
2 2 2
1 3 2
2 3 2
CONNECTIONS:
(you shouldn't be able to put this in …)
CPI_SEQNUM_START SIE_SEQNUM_START CELL_CODE_START CPI_SEQNUM_END SIE_SEQNUM_END CELL_CODE_END
1 1 1 1 1 1
Now, this is basically the ERD for this part:
[Image: ERD with barred relationships and the arc-relationship in question]
and this is the physical model:
[Image: Physical model]
I basically thought a simple CHECK might do (CHECK (CPI_SEQNUM_START <> CPI_SEQNUM_END AND CELL_CODE_START <> CELL_CODE_END AND SIE_SEQNUM_START <> SIE_SEQNUM_END) ), but that prevented us from inserting anything somehow … Any advice?
Using a CHECK constraint was the right approach, but the logic of the constraint was wrong. You need an OR condition: the start and end pins are different as soon as at least one of the three fields differs, whereas your AND version required all three to differ.
CPI_SEQNUM_START <> CPI_SEQNUM_END OR
CELL_CODE_START <> CELL_CODE_END OR
SIE_SEQNUM_START <> SIE_SEQNUM_END
... assuming all three fields are not nullable.
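The difference between the AND and OR versions is easy to check with a throwaway table. The sketch below uses Python's `sqlite3` as a stand-in for the real database (an assumption; the assignment's DBMS is not stated), declares all six columns NOT NULL as the answer assumes, and omits the foreign keys to CONNECTION_PINS for brevity. The helper names are made up for this example.

```python
import sqlite3


def make_connections_table():
    """Create an in-memory CONNECTIONS table with the OR-based CHECK constraint."""
    con = sqlite3.connect(":memory:")
    con.execute(
        """
        CREATE TABLE CONNECTIONS (
            CPI_SEQNUM_START INTEGER NOT NULL,
            SIE_SEQNUM_START INTEGER NOT NULL,
            CELL_CODE_START  TEXT    NOT NULL,
            CPI_SEQNUM_END   INTEGER NOT NULL,
            SIE_SEQNUM_END   INTEGER NOT NULL,
            CELL_CODE_END    TEXT    NOT NULL,
            -- The pins differ if at least one part of the composite key differs.
            CHECK (CPI_SEQNUM_START <> CPI_SEQNUM_END
                OR SIE_SEQNUM_START <> SIE_SEQNUM_END
                OR CELL_CODE_START  <> CELL_CODE_END)
        )
        """
    )
    return con


def try_insert(con, row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        con.execute("INSERT INTO CONNECTIONS VALUES (?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    con = make_connections_table()
    print(try_insert(con, (1, 1, "1", 1, 1, "1")))  # same pin at both ends
    print(try_insert(con, (1, 1, "1", 2, 1, "1")))  # different pin, same side and cell
    con.close()
```

The AND version rejected the second kind of row too, because two pins on the same side share SIE_SEQNUM and CELL_CODE. A connection between two pins of the same cell, or between pins with the same sequence number on different sides, could never pass it.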

Query: Employee Training Schedules Based on Position/Workrole

My company sends folks to training. Based on projected new hires/transfers, I was asked to generate a report that estimates the number of seats we need in each course broken out by quarter.
Question: My question is two-fold:
What is the best way to represent a sequence of courses (i.e. prerequisites) in a relational DB?
How do I create the query(-ies) necessary to produce the following desired output:
Desired Output:
ID PersonnelID CourseID ProjectedStartDate ProjectedEndDate
1 1 1 1/14/2017 1/14/2017
2 2 1 2/17/2017 2/17/2017
3 2 2 2/18/2017 2/19/2017
4 2 3 2/20/2017 2/20/2017
5 3 49 1/18/2017 2/03/2017
6 …
Background Info: The courses are taken in-sequence: the first few courses are orientation courses for the company, and later courses are more specific to the employee's workrole. There are over 50 different courses, 40 different workroles and we're projecting ~1k new hires/transfers. Each work role must take a sequence of courses in a prescribed order, but I'm having trouble representing this ordering and subsequently writing the necessary query.
Existing Tables:
I have several tables that I've used to store the data: Personnel, LnkPersonnelToWorkroles, Workroles, LnkWorkrolesToCourses, and Courses (there are many others as well, but I omit them to keep this question in scope). Here's some notional data from these tables:
Personnel (These are the projected new hires and their estimated arrival date.)
ID DisplayName RequiredCompletionDate
1 Kristel Bump 10/1/2016
2 Shelton Franke 3/11/2017
3 Shaunda Launer 4/16/2017
4 Clarinda Kestler 3/13/2017
5 My Wimsatt 6/6/2017
6 Gillian Bramer 10/25/2016
7 ...
Workroles (These are the positions in the company)
ID Workrole
1 Manager
2 Secretary
3 Admin Asst.
4 ...
LnkPersonnelToWorkroles (Links projected new hires to their projected workrole)
ID PersonnelID WorkroleID
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1
6 6 1
7 ...
Courses (All courses available)
ID CourseName LengthInDays
1 Orientation 1
2 Email Etiquette 2
3 Workplace Safety 1
4 ...
LnkWorkrolesToCourses
(Links workroles to their required courses in a Many-to-Many relationship)
ID WorkroleID CourseID
1 1 1
2 2 1
3 2 2
4 2 3
5 3 49
6 ...
Thoughts: My approach is to first develop a person-by-person schedule based upon the new hire's target completion date and workrole. Then for each class, I could sum the number of new hires starting in that quarter.
I've considered trying to represent the courses in the most general way I could think of (i.e. using a directed acyclic graph), but since most of the courses have only a single prerequisite course, I think it's much easier to represent the prerequisites using the Prerequisites table below; however, I don't know how I would use this in a query.
Prerequisites (Is this a good idea?)
ID CourseID PrereqCourseID
1 2 1
2 3 1
3 4 1
4 5 4
5 ...
Note: I am not currently concerned with whether or not the courses are actually offered on those days; we will figure out the course schedules once we know approximately how many we need each quarter. Right now, we're trying to estimate the demand for each course.
Edit 1: To clarify the Desired Output table: if a person begins course 1 on day D, they can't start course 2 until after they finish course 1, i.e. until the next day. For a course with a length of L > 1 days, the start date of the subsequent course is delayed by L days. Notice this effect playing out for PersonnelID 2 in the Desired Output table: he is expected to arrive on 2/17, start and complete course 1 the same day, begin course 2 the next day (on 2/18), and finish course 2 the day after that (on 2/19).
I'm posting this answer because it gives me an approximate solution; other answers are still welcome.
I avoided a prerequisite table altogether and opted for a simpler approach: a partial ordering of the courses.
First, I drew the course prerequisite tree; it looked similar to this image:
I defined a partial ordering of the courses based on their depth in the prerequisite tree. In the picture above, CHM124 and High School Chem w/ Lab are priority 1, CHM152 is priority 2, CHM 153 is priority 3, CHM260 and CHM 270 are priority 4, and so on... This partial ordering was stored in the CoursePriority table:
CoursePriority:
ID CourseID Priority
1 1 1
2 2 2
3 3 3
4 4 3
5 5 4
6 6 3
7 ...
So that no two courses would ever be taken at the same time, I perturbed each course's priority by a small random number using the following UPDATE query:
UPDATE CoursePriority SET CoursePriority.Priority = [Priority]+Rnd([ID])/1000;
(I used [ID] as input to the Rnd method to ensure each course was perturbed by a different random number.) I ended up with this:
ID CourseID Priority
1 1 1.000005623
2 2 2.000094955
3 3 3.000036401
4 4 3.000052486
5 5 4.000076711
6 6 3.00000535
7 ...
The approach above answers my first question "What is the best [sensible] way to represent a sequence of courses (i.e. prerequisites) in a relational DB?" Now as for generating the course schedule...
First, I created a query, qryLnkCoursePriorities, to link Courses to the CoursePriority table:
SELECT Courses.ID AS CourseID, Courses.DurationInDays, CoursePriority.Priority
FROM Courses INNER JOIN CoursePriority ON Courses.ID = CoursePriority.CourseID;
Result:
CourseID DurationInDays Priority
1 35 1.000076177
2 21 2.000148297
3 28 3.000094352
4 14 3.000081442
5...
Second, I created the qryWorkrolePriorityDelay query:
SELECT LnkWorkrolesToCourses.WorkroleID, qryLnkCoursePriorities.CourseID AS CourseID, qryLnkCoursePriorities.Priority, qryLnkCoursePriorities.DurationInDays, ([DurationInDays]+Nz(DSum("DurationInDays","qryLnkCoursePriorities","[Priority]>" & [Priority] & ""))) AS LeadTimeInDays
FROM LnkWorkrolesToCourses INNER JOIN qryLnkCoursePriorities ON LnkWorkrolesToCourses.CourseID = qryLnkCoursePriorities.CourseID
ORDER BY LnkWorkrolesToCourses.WorkroleID, qryLnkCoursePriorities.Priority;
Simply put: The qryWorkrolePriorityDelay query tells me how many days in advance each course should be taken to ensure the new hire can complete all subsequent courses prior to their required training completion deadline. It looks like this:
WorkroleID CourseID Priority DurationInDays LeadTimeInDays
1 7 1.000060646 7 147
1 1 1.000076177 35 140
1 2 2.000148297 21 105
1 4 3.000081442 14 84
1 6 3.000082824 14 70
1 3 3.000094352 28 56
1 5 4.000106905 28 28
2...
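The lead-time arithmetic in that table can be reproduced outside Access with a few lines of Python. This is only a sketch of the same idea (a course's own duration plus the durations of all higher-priority courses), restricted to the courses of a single workrole, which is an assumption of this example; the function and constant names are made up.

```python
# Courses for workrole 1 as (CourseID, Priority, DurationInDays), from the table above.
WORKROLE_1_COURSES = [
    (7, 1.000060646, 7),
    (1, 1.000076177, 35),
    (2, 2.000148297, 21),
    (4, 3.000081442, 14),
    (6, 3.000082824, 14),
    (3, 3.000094352, 28),
    (5, 4.000106905, 28),
]


def lead_times(courses):
    """Map CourseID -> days before the deadline at which the course must start."""
    result = {}
    days_needed = 0
    # Walk from the last course to the first, accumulating the remaining workload.
    for course_id, _priority, duration in sorted(
        courses, key=lambda course: course[1], reverse=True
    ):
        days_needed += duration
        result[course_id] = days_needed
    return result


if __name__ == "__main__":
    for course_id, lead in lead_times(WORKROLE_1_COURSES).items():
        print(course_id, lead)
```

Because the perturbed priorities are all distinct, sorting by priority gives a strict order, and the running total matches the LeadTimeInDays column (28 for course 5 up to 147 for course 7).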
Finally, I was able to bring this all together to create the qryCourseSchedule query:
SELECT Personnel.ID AS PersonnelID, LnkWorkrolesToCourses.CourseID, [ProjectedHireDate]-[leadTimeInDays] AS ProjectedStartDate, [ProjectedHireDate]-[leadTimeInDays]+[Courses].[DurationInDays] AS ProjectedEndDate
FROM Personnel INNER JOIN (((LnkWorkrolesToCourses INNER JOIN (Courses INNER JOIN qryWorkrolePriorityDelay ON Courses.ID = qryWorkrolePriorityDelay.CourseID) ON (Courses.ID = LnkWorkrolesToCourses.CourseID) AND (LnkWorkrolesToCourses.WorkroleID = qryWorkrolePriorityDelay.WorkroleID)) INNER JOIN LnkPersonnelToWorkroles ON LnkWorkrolesToCourses.WorkroleID = LnkPersonnelToWorkroles.WorkroleID) INNER JOIN CoursePriority ON Courses.ID = CoursePriority.CourseID) ON Personnel.ID = LnkPersonnelToWorkroles.PersonnelID
ORDER BY Personnel.ID, [ProjectedHireDate]-[leadTimeInDays]+[Courses].[DurationInDays];
This query gives me the following output:
PersonnelID CourseID ProjectedStartDate ProjectedEndDate
1 7 5/7/2016 5/14/2016
1 1 5/14/2016 6/18/2016
1 2 6/18/2016 7/9/2016
1 4 7/9/2016 7/23/2016
1 6 7/23/2016 8/6/2016
1 3 8/6/2016 9/3/2016
1 5 9/3/2016 10/1/2016
2...
With this output, I created a pivot table, where course start dates were grouped by quarter and counted. This gave me exactly what I needed.
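The last two steps (turning a lead time into dates, then counting course starts per quarter) can also be sketched in Python. The function names and the `(course_id, start_date)` input shape are assumptions for illustration; the original work was done with an Access query and a pivot table.

```python
from collections import Counter
from datetime import date, timedelta


def course_dates(deadline, lead_time_days, duration_days):
    """Return (start, end) for a course that must start lead_time_days before the deadline."""
    start = deadline - timedelta(days=lead_time_days)
    return start, start + timedelta(days=duration_days)


def seats_per_quarter(schedule_rows):
    """Count course starts per (course_id, year, quarter).

    schedule_rows is an iterable of (course_id, start_date) pairs.
    """
    counts = Counter()
    for course_id, start in schedule_rows:
        quarter = (start.month - 1) // 3 + 1  # Jan-Mar -> 1, ..., Oct-Dec -> 4
        counts[(course_id, start.year, quarter)] += 1
    return counts


if __name__ == "__main__":
    # Personnel 1 has a deadline of 10/1/2016; course 7 has a lead time of 147 days.
    print(course_dates(date(2016, 10, 1), 147, 7))
    print(seats_per_quarter([(7, date(2016, 5, 7)), (1, date(2016, 5, 14))]))
```

Feeding every person's schedule rows into `seats_per_quarter` gives the same kind of per-course, per-quarter counts that the pivot table produced.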