How can I loop through all records from Jan to Dec and assign a value based on hierarchy - SQL

I have a data set of members and their health conditions. Every health condition has a value associated with it, and every health condition falls into a group. Within each group there is a hierarchy: for example, group A contains (8, 9, 10, 11, 12), and the highest in the hierarchy is 8. Each of these numbers has a value assigned to it, for example 8 = 0.71, 9 = 0.61, and so on. If 8 is captured first, it should take the value assigned to 8, which is 0.71. If 8 (or any other condition) is captured later, it should take only the difference between the two values. For example, if 9 is captured first and 8 is captured afterwards, the value for 8 will be 0.71 - 0.61 = 0.10.
[enter image description here][1]
[1]: https://i.stack.imgur.com/kg5YD.png
In the picture, in scenario 1, you can see that when health condition '10' is captured on 1/15, column F takes the full value of the health condition. Then on 1/16 another health condition is captured that falls in the same disease group, so it takes the difference between the two values (0.65 - 0.55 = 0.1).
In scenario 2, you can see that the same health condition, 10, is captured again for the same member on 02/18, so this time column F shows zero because the condition was already captured.
Please help.
Thanks
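One possible reading of the rule above, sketched in Python rather than SQL: keep, per member and group, the highest value credited so far, and credit each new capture only with the amount by which it exceeds that running maximum. The values for 8 and 9 come from the question; the group label, member ID, dates, and the choice to give zero to a lower-ranked condition captured later are assumptions.

```python
from collections import defaultdict

# Values for conditions 8 and 9 come from the question; the group label "A"
# and the sample captures below are made up for illustration.
CONDITION_VALUE = {8: 0.71, 9: 0.61}
CONDITION_GROUP = {8: "A", 9: "A"}


def incremental_values(captures):
    """Return one incremental value per capture, in date order.

    captures: iterable of (member_id, capture_date, condition) tuples,
    where capture_date is an ISO 'YYYY-MM-DD' string so it sorts correctly.
    """
    best_so_far = defaultdict(float)  # (member, group) -> highest value credited
    result = []
    for member, day, condition in sorted(captures, key=lambda c: (c[0], c[1])):
        key = (member, CONDITION_GROUP[condition])
        value = CONDITION_VALUE[condition]
        # Assumption: a repeat or a lower-ranked condition adds nothing.
        increment = max(0.0, value - best_so_far[key])
        best_so_far[key] = max(best_so_far[key], value)
        result.append((member, day, condition, round(increment, 2)))
    return result


rows = incremental_values([
    ("M1", "2021-01-15", 9),  # first capture in the group: full value 0.61
    ("M1", "2021-01-16", 8),  # higher in the hierarchy: 0.71 - 0.61 = 0.10
    ("M1", "2021-02-18", 8),  # already captured: 0
])
```

In SQL the same idea would probably be expressed as a running maximum per member and group over the capture date, with each row's value compared against the maximum of the preceding rows.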

Related

SQL new variable using multiple conditions (count of occurrences in 6 month look-back period using timestamp for each unique ID)

I am trying to achieve the following:
Attached is what my data looks like.
I want to create 2 new variables that count the number of times 'Target' (variable 1) and 'Competitor' (variable 2) appear within the 6 months before a given date_of_prescription. This would be done for every unique D_PRESCRIBER_ID.
So for example:
Take ID 1003000902 prescribing the COMPETITOR drug on 2020-03-18. Looking at the rows before that, within the 6 months prior to 2020-03-18 there are 2 Target drugs prescribed and 0 Competitor drugs prescribed. So my variable values will be 2 (variable 1) and 0 (variable 2).
My data is much larger than what the screenshot shows. It has more variables and thousands of unique D_PRESCRIBER_IDs. Each row is not a unique ID; there are duplicates in the data for various date_of_prescription timestamps. These variables need to be created in my SELECT statement so that the rest of the data stays the same.
Any help here would be awesome. Thanks!
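The look-back count described above could be sketched as follows, in Python rather than SQL. The prescriber ID and the 2020-03-18 date come from the question; the other rows, the tuple layout, and the decision to exclude the current row from its own window are assumptions.

```python
import calendar
from datetime import date


def months_before(d, months):
    """Return the date `months` calendar months before d, clamping the day."""
    year, month0 = divmod(d.year * 12 + d.month - 1 - months, 12)
    day = min(d.day, calendar.monthrange(year, month0 + 1)[1])
    return date(year, month0 + 1, day)


def lookback_counts(rows, months=6):
    """rows: list of (prescriber_id, date_of_prescription, drug_type)."""
    out = []
    for pid, day, drug in rows:
        start = months_before(day, months)
        # Assumption: the window is [start, day), so the current row is excluded.
        prior = [r for r in rows if r[0] == pid and start <= r[1] < day]
        n_target = sum(1 for r in prior if r[2] == "Target")
        n_competitor = sum(1 for r in prior if r[2] == "Competitor")
        out.append((pid, day, drug, n_target, n_competitor))
    return out


# Hypothetical rows; only the ID and the 2020-03-18 date come from the question.
sample = [
    ("1003000902", date(2019, 8, 1), "Target"),  # older than 6 months
    ("1003000902", date(2019, 10, 2), "Target"),
    ("1003000902", date(2020, 1, 20), "Target"),
    ("1003000902", date(2020, 3, 18), "Competitor"),
]
counts = lookback_counts(sample)
# The last row gets 2 Target and 0 Competitor prescriptions in its window.
```

This version compares every row with every other row, so it only illustrates the logic. On thousands of prescribers a set-based query (for example a self-join or a correlated subquery restricted to the same D_PRESCRIBER_ID and date range) would likely be the practical form.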

How can I reduce complexity? Data preparation, SQL + Tableau

I need to prepare some data to connect to Tableau, and I'm struggling because the data is too large for Tableau to handle, so I'm looking for ideas to code this efficiently in SQL.
Setup:
I have 2 million users
There are 30 different categories, and each user can fall into many. For example:
User 1 - Category A, B and C
User 2 - Category F
User 3 - Category A, B
What I want:
Select three categories and assign priority 1, priority 2 and priority 3
This selection is not static, so today I may choose A, B, C but tomorrow those categories could be D, G, A
So if I have:
Priority 1: A
Priority 2: B
Priority 3: C
I want the number of users who fall into category A
I want the number of users who fall into category B AND are not in category A
I want the number of users who fall into category C AND are not in category A or B
My original idea was to create a table with one row per user and one yes/no column per category, and then aggregate, but the final table is still too large for Tableau to handle.
Any ideas?
Update: My idea is to prepare a table with aggregated numbers and a few thousand rows at most, so that it can be processed with Tableau
You can assign each of the 30 categories a unique position from 1 to 30. Each user is then assigned a 30-digit binary number based on the categories they fall into. This binary number can be converted into a decimal number, which is at most 2^30 - 1 (a 10-digit number), so it can be stored without exponent format.
To see which categories a user falls into, apply the reverse conversion: decimal to binary, then to a string padded with zeros on the left. In this string you can look for 1s at the desired positions.
I think you can try this methodology.
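A minimal sketch of this encoding, in Python rather than SQL, combined with the priority counts from the question. Users 1 to 3 come from the question; users 4 and 5, the shortened category list, and all function names are hypothetical. Instead of converting back to a padded string, the sketch tests bits directly, which is equivalent.

```python
from collections import Counter

# Hypothetical category list; the real setup would have 30 entries.
CATEGORIES = ["A", "B", "C", "D", "E", "F", "G"]
BIT = {name: 1 << i for i, name in enumerate(CATEGORIES)}


def to_mask(categories):
    """Encode a user's categories as an integer with one bit per category."""
    mask = 0
    for name in categories:
        mask |= BIT[name]
    return mask


def priority_counts(mask_counts, priorities):
    """Count users per priority, excluding users matched by a higher priority."""
    result = {}
    seen = 0  # bits of all higher-priority categories
    for name in priorities:
        result[name] = sum(
            n for mask, n in mask_counts.items()
            if mask & BIT[name] and not mask & seen
        )
        seen |= BIT[name]
    return result


users = {1: ["A", "B", "C"], 2: ["F"], 3: ["A", "B"], 4: ["B", "C"], 5: ["C"]}
# Aggregated table: one row per distinct mask, with a user count.
mask_counts = Counter(to_mask(cats) for cats in users.values())
counts = priority_counts(mask_counts, ["A", "B", "C"])
# counts == {"A": 2, "B": 1, "C": 1}
```

The aggregated table has one row per distinct combination of categories that actually occurs, so the priorities can change without rebuilding it. Whether that stays within a few thousand rows depends on how many combinations appear in the data.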

SSRS stacked graph with missing x-axis values going to zero

I have 3 columns (DateTime, GroupName, Value), some of these groups are closely related and I would like to display these in a stacked graph. The problem I am facing (I THINK) is that I don't have entries for all groups at all times.
(cannot find a decent way to add a table, so here is some code)
Datetime Groupname Value
1 a whatever
1 b whatever
1 c whatever
2 a whatever
2 b whatever
3 a whatever
3 b whatever
3 c whatever
4 b whatever
So in the example I don't have an entry for C at time 2, and I also don't have entries for A and C at time 4.
Resulting in:
edit: added to onedrive link
With my limited SQL skills I am not sure how to fix this. How do I get the graph to connect the points from the DateTime points where we do have data, and ignore DateTime points where we do not have data?
11-07-2016 Edit
Ok, so here some pictures of the actual data
No rep - Onedrive it is
https://1drv.ms/f/s!AhKMFQBAmZ7GgYEMzdpTBvuXTi5gAQ
The graph looks different from my first example because I set the X-axis to scalar.
On 7/5/2016 and 7/6/2016 (month/day/year notation) the CH3 is low and 0. If I remove CH3 from the results the graph looks OK.
This is the query. Very basic.
The data comes from a database where the datetime is not in a proper datetime format, so it gets converted in WimsView.
SELECT
    WimsView.TagID
    ,WimsView.SampleDateTime
    ,WimsView.SampleValue
    ,WimsView.TagName
FROM
    WimsView
WHERE
    WimsView.SampleDateTime > N'07/4/2016 00:00:00'
    AND (
        WimsView.TagName LIKE N'%MBA%'
        OR WimsView.TagName LIKE N'%MBB%'
        OR WimsView.TagName LIKE N'%GF1_DPC%'
        OR WimsView.TagName LIKE N'%KF1_DPC%'
        OR WimsView.TagName LIKE N'%CH3_DPC%'
    )
    AND WimsView.SampleValue IS NOT NULL
You can't ignore specific values on the graph. Your options are:
You can change your SELECT statement so that it does not include them.
You can calculate an "average" (if possible) for each missing value in order to fill the missing points in your graph.
Or you can use another calculation (e.g. the same value as the previous point on the graph).
Whatever you decide, it should be handled at the query level, not at the drawing level.
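The "same value as the previous one" option could look like this, sketched in Python rather than in the query itself. The time and group layout follows the example table in the question; the numeric values and the function name are made up.

```python
def fill_missing(rows):
    """rows: list of (time, group, value). Returns a dense, sorted list.

    Every group gets a point at every time; a gap repeats the previous value
    of that group, or None when the group has no earlier point.
    """
    times = sorted({t for t, _, _ in rows})
    groups = sorted({g for _, g, _ in rows})
    known = {(t, g): v for t, g, v in rows}
    filled = []
    for g in groups:
        last = None
        for t in times:
            last = known.get((t, g), last)
            filled.append((t, g, last))
    return sorted(filled)


# Same shape as the example table; the values are hypothetical.
rows = [
    (1, "a", 10), (1, "b", 20), (1, "c", 30),
    (2, "a", 11), (2, "b", 21),
    (3, "a", 12), (3, "b", 22), (3, "c", 32),
    (4, "b", 23),
]
filled = fill_missing(rows)
# (2, "c") now repeats 30, (4, "a") repeats 12, and (4, "c") repeats 32.
```

In the query this would amount to building every combination of time and group and joining the existing rows onto it, then carrying the last known value forward for the gaps.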

SSRS - Is there a way to have a table split by a page break to appear on the same page?

I am creating a report using MS Report Builder 3.0. For this report, I have a stored procedure built that filters down to the specific rows needed, and then I use a row group to group on a particular field (pass_no). The table that is displayed is 2 columns and 3 rows within the row group. The basic description of what I want to accomplish is instead of the rows running onto the next page, I want the rows to continue on the same page in a new set of 2 columns. Think of it like a newspaper where the text continues in a new column rather than running down onto the next page.
For the example I'm going to use here, there are 12 rows of data returned by the SP, and 8 unique values in the pass_no column which is what my row group is grouped on. So in the report I end up with 8 groups of 3 rows. I'm aiming to have the table display 6 pass_no values (so 6 groups of 3 rows) before, for lack of better terminology, starting a new table.
My first approach at this has been to create a column group and set the grouping expression to the following:
=Floor((RowNumber(Nothing) - 1) / 6)
While this works in creating a new set of 2 columns, the split for the new columns is based on the row number from the raw data returned by the SP rather than the number of rows sets created by the row group. So because there are 12 rows returned, and the 6th and 7th rows have the same pass_no value, the second set of columns duplicates that 1 set of data. Also, the top 6 rows of the second column set are blank with the second set of values appearing below the first set.
If I add an additional column group where it is also grouping by pass_no, then I don't get the duplicate values, but I do get a pair of columns for each pass_no as well (as would be expected). I've tried modifying the expression above a bit and changed Nothing to the row group name and have tried the table name, but neither has yielded the desired result.
I can't alter the SP to do the grouping there because there are other column values that are not identical and I pull that data into a cell value expression within the table using Join(LookupSet()).
I have also considered creating 2 tables and applying a filter to the table so the first table only displays the first 6 results and the second table displays the remaining results, but that also looks at the raw data rather than the groupings and TOP N can't be used on pass_no as it's a text value, not an integer. This would also cause problems if I need to go to 3 tables.
So long story short, is there a way to do a table break rather than a page break or to overflow columns onto the same page rather than onto a new page?
Here are the pertinent portions of the Dataset:
http://sqlfiddle.com/#!2/5082b/1
PASS_NO MASTERTRAN TRANS_NO DESCRIPTION IS_MOD
7913019000 4931019000 4931019000 General Admission Adult 0
7914019000 4932019000 4932019000 Sea Turtle Hosp Adult 0
7914019000 4932019000 4933019000 2:00 PM SEA TURTLE HOSP 1
7916019000 4934019000 4934019000 Sea Turtle Hosp Child 0
7916019000 4934019000 4935019000 2:00 PM SEA TURTLE HOSP 1
7917019000 4934019000 4934019000 Sea Turtle Hosp Child 0
7917019000 4934019000 4935019000 2:00 PM SEA TURTLE HOSP 1
7918019000 4934019000 4934019000 Sea Turtle Hosp Child 0
7918019000 4934019000 4935019000 2:00 PM SEA TURTLE HOSP 1
7922019000 4936019000 4936019000 General Admission Child 0
7923019000 4936019000 4936019000 General Admission Child 0
7924019000 4936019000 4936019000 General Admission Child 0
I think your data presents a bit of a problem.
As you've already figured out, typically for this sort of setup you'd set up a row group with an expression like:
=(RowNumber(Nothing) - 1) Mod 6
And a column group expression like:
=Ceiling(RowNumber(Nothing) / 6)
This would create a six row tablix that would grow horizontally as required.
See this SO question for a similar example.
However, you currently have the requirement of also grouping by another column - pass_no in your case. Normally you can approximate a group-level row number with an expression like:
=RunningValue(Fields!pass_no.Value, CountDistinct, "DataSet1")
Unfortunately, when you try to add this into one of the grouping expressions like:
=Ceiling(RunningValue(Fields!pass_no.Value, CountDistinct, "DataSet1") / 6)
You get the following error:
A group expression for the tablix 'Tablix1' includes the aggregate
function RunningValue. RunningValue cannot be used in group
expressions.
Based on all this, my recommendation is to try and get a Dataset that has one row per pass_no value and base the tablix on this, with the above row/column grouping expressions, i.e. no need to group on multiple pass_no rows. So in your example it would have eight rows. You could then have a separate Dataset with all the individual rows and use a lookupset function to concatenate the description, etc.
Your other option is to try and get everything on one Dataset only, including the aggregates as required. This might not be possible, but for description at least you can leverage any of the various techniques here to get a delimited list. Once you have this list you can replace the delimiter with vbCrLf to split it back over multiple rows.
All this is a very long-winded way of saying that I don't know if your requirement is possible with your data, but if you look at having at least one Dataset with one row per pass_no you should be able to make it work.
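To illustrate why one row per pass_no makes the two grouping expressions work, here is a small Python sketch (not SSRS code) that numbers the distinct pass_no values and applies the same Mod and Ceiling arithmetic. The pass_no values are taken from the dataset in the question; the function name and the dictionary output are hypothetical.

```python
import math


def layout(pass_numbers, rows_per_column=6):
    """Map each distinct pass_no to a (row, column) cell, filling down first."""
    distinct = list(dict.fromkeys(pass_numbers))  # keeps first-seen order
    cells = {}
    for n, pass_no in enumerate(distinct, start=1):
        row = (n - 1) % rows_per_column          # like =(RowNumber - 1) Mod 6
        column = math.ceil(n / rows_per_column)  # like =Ceiling(RowNumber / 6)
        cells[pass_no] = (row, column)
    return cells


# The 12 PASS_NO values from the dataset, including repeats.
pass_numbers = [
    "7913019000", "7914019000", "7914019000", "7916019000",
    "7916019000", "7917019000", "7917019000", "7918019000",
    "7918019000", "7922019000", "7923019000", "7924019000",
]
cells = layout(pass_numbers)
# Six pass_no values land in column 1; the remaining two start column 2.
```

Because the numbering is applied after the duplicates are collapsed, the sixth and seventh positions belong to different pass_no values, which avoids the duplicated group described in the question.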

Dynamic use of MDX AVG function

Anyone have advice on how to build an average measure that is dynamic -- it doesn't specify a particular slice but instead uses your current view? I'm working within a front-end OLAP viewer (Strategy Companion) and I need a "dynamic" implementation based on the dimensions that are currently filtered in the data view.
My fact table looks something like this:
Key AmountA IndicatorA AmountB Other Data
1 5 1 null 25
2 6 1 null 52
3 7 1 2 106
4 null 0 4 108
Now I can specify a simple average for "[Measures].[AmountA]" with "[Measures].[AmountA] / [Measures].[IndicatorA]" which works great - "[IndicatorA]" sums up to the number of non-null values of "[AmountA]". And this also works great no matter what dimensions are selected in the view - it always divides by the count of rows that have been filtered in.
But what about [AmountB]? I don't have a null indicator column. I want to get an average value of [AmountB] for whatever rows have been filtered in for my current view. If I try to use the count of rows as a simple formula (pseudo-code "[Measures].[AmountB] / Count([Measures].[Key])") I get the wrong result, because it is counting all the null rows in the average.
So, I need a way to use the AVG function to specify the average of [AmountB] over the set of "whatever rows I'm currently filtering in, based on whatever dimensions I'm currently using". How do I specify this dynamic set?
I've tried several different uses of the AVG function and they have either returned null or summed up to huge numbers, clearly not the average I'm looking for.
Thanks-
Matt
Sorry, my first suggestion was wrong. If you don't have access to the OLAP cube, you can't write an MDX query for this purpose (IMHO), because at this access level you have no detailed data from your fact table; you can only use aggregated data and dimensions from your cube.
Otherwise (if you have access to the OLAP database), you can create this metric (a count of non-NULL rows) in your measure group and then use it for the AVG calculation, either as a calculated member in your cube or in the WITH section of your MDX query.
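The arithmetic behind that suggestion, as a short Python illustration rather than MDX: dividing the sum by a count of non-null rows gives the intended average, while dividing by the count of all rows does not. The AmountB values are the four rows from the question's fact table; the function name is hypothetical.

```python
def average_non_null(values):
    """Average over the non-null entries only; None if there are none."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# AmountB for keys 1 to 4 in the question's fact table.
amount_b = [None, None, 2, 4]

# Dividing by the count of all rows gives the wrong result: 6 / 4 = 1.5
naive = sum(v for v in amount_b if v is not None) / len(amount_b)

# Dividing by the count of non-null rows gives the intended one: 6 / 2 = 3.0
avg = average_non_null(amount_b)
```

A non-null count measure in the cube plays the role of `len(present)` here, and because it aggregates like any other measure it follows whatever slice the viewer currently has selected.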