How to create a set based on a measure value in Tableau? - data-visualization

This is a very simple thing and I can't believe Tableau makes it so hard. I have a bunch of fields, and one measure has many zeroes. I just want to create a subset of the data where this measure > 0.
I can do it with a filter, but since I will use it several times, it makes sense to create a set once and keep reusing it. Am I wrong to want to do that? So far I am finding it easier to keep recreating the filter in different sheets than to figure out the set.
I keep referring to this page, but they start out by telling you to right click on a dimension and create a set.
https://help.tableau.com/current/pro/desktop/en-us/sortgroup_sets_create.htm
I keep ending up here. What does it mean to apply a condition where the sum > 0? I want a set with any value > 0. That's not the same as a sum.

Actually, your use case is not a good fit for sets. Sets in Tableau work on an IN/OUT principle: a set can be used as a true/false condition, and it can also be used to distinguish the members IN the set from those OUT of it, e.g. by coloring them differently.
From what I understand, you just want a calculated field that you can use as a ready-made filter and also for differentiating values.
To illustrate let's take the following sample data
Now create a calculated field with the following calculation
[Measure] > 0
(Note: this also excludes negative values. If your data set has negative values and you only want to exclude zeroes, use [Measure] <> 0 instead.)
This calculated field will serve your purpose. The effect is easier to see with an average (screenshots not reproduced here).
Good Luck.
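To make the difference between the row-level test and the "sum > 0" condition from the question concrete, here is a small sketch in Python rather than Tableau. The sample rows and category names are made up for illustration, and the aggregate part only approximates how a set condition on SUM([Measure]) is evaluated per dimension member.

```python
# Hypothetical sample rows: (category, measure). Not taken from the original post.
rows = [("A", 5), ("A", 0), ("B", 0), ("B", 0), ("C", -2), ("C", 3)]

# Row-level test, analogous to the calculated field [Measure] > 0.
positive_rows = [r for r in rows if r[1] > 0]

# Row-level test that only drops zeroes, analogous to [Measure] <> 0.
nonzero_rows = [r for r in rows if r[1] != 0]

# Aggregate test, analogous to a set condition SUM([Measure]) > 0:
# it keeps or drops whole categories, not individual rows.
totals = {}
for category, value in rows:
    totals[category] = totals.get(category, 0) + value
categories_in_set = sorted(c for c, total in totals.items() if total > 0)

print(positive_rows)
print(nonzero_rows)
print(categories_in_set)
```

The row-level versions decide row by row, while the aggregate version keeps every row of a category whose total is positive, including its zero and negative rows. That is why the set dialog's sum condition does not behave like a simple "value > 0" filter.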

Related

Dynamic measure that responds to dynamic dimension

I'll try to describe this scenario without introducing too much irrelevant info, but keeping it simple.
Using the newish Field Parameter feature in PowerBI, I created a Parameter called _Dimensions and another one called _Measures, selecting common columns in the former and common measures in the latter.
I then build a bar chart with [_Dimension Fields] for X-Axis, [_Measure Fields] for Y-axis, and a single-select slicer for each. Now when user selects a measure and a column, it draws a bar chart of their selected measure, sliced by their selected dimension.
What I'd like to do is actually make this a Pareto chart, which would entail putting in a second measure on Y-axis, but rather than having a pareto counterpart to every possible measure a user may select, I'd like to create a single measure that calculates running percent of total of [selected measure] along [selected dimension].
I was hopeful I could call the [_Dimension Fields] column that PowerBI created with its special properties from DAX, but that doesn't seem to treat them any different than any other column. I also tried NAMEOF, but that just returns a string. I was hoping it would act like INDIRECT does in Excel, treating the string as a reference, but alas.
Does the above problem statement make sense? Can anyone describe an elegant design approach to do this dynamically that does not involve just writing a version of every possible measure a user could select and then use a switch?
imagining the combo chart to look like this (pareto measure in line chart part)
edit: secondary question, but equally important to the end goal of a fully functional dynamic pareto: when user selects measure, I want the selected dimension to always be sorted desc by selected measure. This is how you do a pareto analysis, but PBI does not default to sort descending always, and each time you change the dimension (via slicer click) the chart resets sorting. Any way to ensure that the sort order is fixed correctly?
Calculation groups are the way to go and Tabular Editor is used to create these.
After much exploration, here is my solution. It's not 100% dynamic in that it requires writing custom DAX for each dimension and measure that you need to be available for dynamic use, but gets the job done for the scope of the report in question.
create field parameter from columns that I will want to dynamically use in viz: name it _Dimension
In my example, I will be using two columns from two tables: Carrier[CarrierNumber] and ShipmentLane[LaneCity]
create field parameter from measures that I will want to dynamically use in viz: name it _Measure
in my example, I have two measures I will want to be able to toggle between: Events_Late and Events_Late2. Both exist on OnTimePerformanceDetail table.
create measure to dynamically return value based on the selection of _Measure in slicer on canvas. This seems like it should be unnecessary with the field parameter feature, but it is necessary for reasons that will be clear if you try to do this without a custom measure.
create a pareto measure for each of the dimensions that may be dynamically passed to viz. Each of these dynamically evaluates the base measure, but is specific to the single column over which the measure is evaluated:
create a dynamic pareto measure that chooses the correct pareto calculation based on the selection on _Dimension
create single select Slicers for _Dimension and _Measure
create combo chart, using _Dimension for X-axis, _Measure for Y-axis, and DynamicPareto for line Y-Axis. I have aliased DynamicPareto on the viz to Running% so that it shows nicely and clearly on legend
set the sort order of the chart to be ASC by Dynamic Pareto measure. This ensures that the dimension on X-axis is always sorted correctly
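For readers who want to check the numbers the chart should show, the running percent of total behind the steps above can be sketched outside Power BI. This is plain Python, not the DAX used in the report, and the carrier names and counts are invented for illustration.

```python
def pareto(values_by_member):
    """Return (member, value, running_pct) tuples sorted descending by value."""
    ordered = sorted(values_by_member.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(value for _, value in ordered)
    result = []
    running = 0
    for member, value in ordered:
        running += value
        # Guard against an all-zero measure to avoid dividing by zero.
        pct = running / total if total else 0.0
        result.append((member, value, pct))
    return result


# Hypothetical late-event counts per carrier.
late_events = {"Carrier 1": 10, "Carrier 2": 40, "Carrier 3": 30, "Carrier 4": 20}
table = pareto(late_events)
for member, value, pct in table:
    print(f"{member}: {value} ({pct:.0%})")
```

Because the running percent only grows as you move down the descending list, sorting ascending by the running percent gives the same order as sorting descending by the measure, which is why the last step sorts the chart by the Pareto measure.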
A few notes:
I named the dynamic pareto as "Discrete" because this only works as designed when doing pareto on a discrete dimension, where the bars are meant to be sorted desc by [measure]. If you are doing a Percentile chart, which is basically the same thing, but the dimension is sorted by dimension value instead of measure value, the Pareto calculation needs to work slightly differently.
There are lots of Pareto measure patterns out there. I used the one from this blog, because it's concise and performs well: https://janizajcbi.com/2018/08/22/pareto-rule-abc-class-in-dax/
it is important that the slicers be set to single select
I discovered there is a Pareto 3rd party viz that is simple and dynamic, but has very limited formatting features. Fine for quick analysis, but if you have branding or formatting standards, it may prove unusable, as in my case
in my production use case, I have a lot more dimensions and a lot more measures that will be available. Started with just 2+2 to prove out functionality. Just need to follow same pattern to add more available dimensions and measures to mix.
my naming convention of * suffix is because this report is built on a centralized data model. The * makes it easy to find measures that are local to this report and not a base measure in the model I am connected to.
the field parameter feature can only be used with a remote model like this if the preview feature of Use Direct Query for AAS and PBI datasets is enabled OR the field parameters are added to the base model. In my case, I'm adding the field parameters to the base model, and all of the measures here are local to the report, connected to remote model.

Dynamically creating a pivot table using fuzzy matching

So, I'm constantly being given data in new and different formats. I'm on a crusade to get my work to standardize data for easy use, and if I managed to convince the powers that be to standardize data, this problem becomes entirely moot. Until then, I have the following problem:
I get data in a variety of ways. Sometimes my gross sales are called total sales. Sometimes gross sales before discounts, total sales before discounts, Gross_Sales, etc. Discounts, deductions, exempt amounts, etc. form another column. So on and so forth. I'd like to be able to do the following:
1) Figure out what columns I want,
2) Turn those columns into a pivot table.
For part 1, I have two options, and I'm wondering if there are any more. The first is to use Microsoft's fuzzy-matching add-in to help me match. I'd have a separate tab dedicated to fuzzy matching each column I need. The second is to just generate a long list of all the variants, and to test each one until I find a hit, assign it, and move on to testing the next one.
The second part is turning all of this into a pivot table. The resources I have so far are https://www.thespreadsheetguru.com/blog/2014/9/27/vba-guide-excel-pivot-tables and How to Create a Pivot Table in VBA
Is there a better method? Is there another way?
Edit: Slightly better method - Grab the data columns, place them into a table, and pivot everything off of that table - it removes the need to re-create pivot tables, just need to move the data over.
Having the same problem, I use a mix of your two methods.
My data consists of a bunch of logs for rejected x-ray images, and the reject reason is a free text field. My solution was to create a table where the first column contains my desired output categories, and then each subsequent column contains a different variation of it.
For example, a row might have (column one/output first entry):
Positioning, POS, Positioning Error, Patient Positioning
Note that these are all fairly different from each other. This is where the fuzzy matching comes in: it is used to capture all the smaller differences and misspellings around those other columns. When the fuzzy matching section decides a given reason matches a column's entry, it is then replaced with the appropriate desired output reason from column 1 of the table. In my example, a reason of 'Possitioning Err' [sic] would match to column 3 (Positioning Error) and then get converted to Positioning.
Then wash, rinse, repeat over the rest of your data as needed. This approach was super useful and fairly flexible in helping standardize my data. It was also computationally more expensive, but you'd only need to run the matching portion once, I guess.
As for the actual mechanics of going about doing this - I use 2010, so no inbuilt functionality. I run the fuzzy matching code on a temporary worksheet until best percentage matches are found, and then overwrite the actual source data afterwards.
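The variant-table idea can be sketched in a few lines. This is not the VBA I actually run; it is a Python illustration using the standard library's difflib, and the table contents, the function name standardize, and the 0.8 cutoff are all assumptions chosen for the example.

```python
import difflib

# Hypothetical variant table: desired output first, then known variants of it.
VARIANTS = {
    "Positioning": ["Positioning", "POS", "Positioning Error", "Patient Positioning"],
    "Gross Sales": ["Gross Sales", "Total Sales", "Gross_Sales",
                    "Total sales before discounts"],
}


def standardize(raw, variants=VARIANTS, cutoff=0.8):
    """Return the desired output for a raw label, or None if nothing is close."""
    # Flatten the table into variant -> desired output (case-insensitive).
    lookup = {
        name.lower(): canonical
        for canonical, names in variants.items()
        for name in names
    }
    # Fuzzy step: pick the single closest known variant above the cutoff.
    matches = difflib.get_close_matches(raw.lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


print(standardize("Possitioning Err"))
print(standardize("Gross_Sales"))
print(standardize("Region"))
```

The explicit variants handle labels that are genuinely different words, and the fuzzy step only has to absorb small spelling differences around them, so the cutoff can stay fairly strict.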

How can I limit the number of columns in a Matrix

Unlike with a Tablix, I was not able to use a limiting expression such as =ceiling(rownumber(nothing)/6) in a Matrix.
Do you have any ideas for limiting the number of columns in a matrix, in the design only, without touching the dataset?
Or should I create it in a Tablix?
Any suggestions, please?
I think the only way you could achieve this would be to specify the column and row that you want your data to appear in from within your source dataset.
This could be achieved by taking a row_number and then doing integer division (<row_number value>/6) to get the row it should fall into, and then taking the remainder (<row_number value>%6) to get the column it should fall into.
From here you can build up your tablix grouping on your row and column fields.
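As a rough illustration of that arithmetic, here it is in Python rather than in the dataset query. The function name is made up, and subtracting 1 first is my assumption: with a 1-based row number, dividing directly would push the sixth item onto the next row.

```python
def grid_position(row_number, columns=6):
    """Map a 1-based row number to a 0-based (row, column) pair in a grid."""
    index = row_number - 1          # shift to 0-based so item 6 stays on row 0
    return index // columns, index % columns


# Thirteen items laid out six per row.
positions = [grid_position(n) for n in range(1, 14)]
for n, (row, col) in enumerate(positions, start=1):
    print(n, row, col)
```

The two resulting values are what the row group and column group would be built on.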

Creating a Calculated Field between 2 datasets in SSRS 2008 R2 Express

I have a very simple task. I have two datasets. The first is a single value that returns the number of calls that have been made. The second is a list of targets. I want to be able to divide the number of calls by its call target in order to calculate the total %.
The lookups of these are dependent on the criteria selected in the report itself. I know it's possible to do inside SQL, it's just a huge pain in the ass. I want to be able to calculate across two fields, and I thought something so simple would be a breeze for SSRS, but apparently not. So the error I get is:
*The expression used for the calculated field'=fields!DMCP.Value / FIRST(Fields!DMCTarget.Value,"AllTargets")' includes an aggregate,rownumber,runningvalue, previous or lookup function. aggregate,rownumber,runningvalue, previous and lookup functions cannot be used in calculated field expressions.*
I can take the FIRST function out, but it doesn't help me. It says it can only calculate within the same dataset; the problem is that these values can never co-exist in the same dataset. Is there anything I can do?
Thanks in advance

Reporting Services - Two filters on the same chart Category Group?

I have sales data that I'd like to plot on my chart. However, at a specific point in time, we had a change taking place I'd like to ensure is clearly visible in the chart, preferably by dividing the sales data (which is stored in a single SQL Server column) into two different chunks, which would allow me to then treat them as different data series.
I used to solve this in Excel by storing the post-event data in a different column (by simply dragging them to a different column), and thus I was able to treat them as a different series (the blue and green lines in the chart below; the red and orange lines are the pre-event and post-event averages):
I'd like to reproduce this effect in SSRS, but am not sure how to tackle it. I've tried using an approach where I added two category groups, both pointing to the date-time column, and applying filters to them (one <= the cutoff date, the other >=).
I then added my sales data twice, with the idea I could somehow connect them to the individual category groups, but that does not seem possible.
Has anyone tried anything like this before, or would have a different approach to achieve what I'm trying to get?
Thanks!
I managed to get this to work, and figured I'd share how to do it.
My dataset contains a field called DATEKEY, which stores the date in the format YYYYMMDD. It's possible to use this in an expression and evaluate the date for a specific row. In case the expression evaluates to true, we display the value. If not, we display a blank string.
In case we want to show the values prior to the date, the expression would be:
=IIF(Fields!DATEKEY.Value <= 20130601, Avg(Fields!My_NUMBER.Value), "")
The second series can then be made by reversing the symbol:
=IIF(Fields!DATEKEY.Value >= 20130601, Avg(Fields!My_NUMBER.Value), "")
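The effect of those two expressions can be sketched outside SSRS. This is a Python illustration with invented data points, and it uses None for the blank entries where the expressions above use an empty string.

```python
CUTOFF = 20130601

# Hypothetical (DATEKEY, value) pairs in YYYYMMDD form.
points = [(20130530, 10.0), (20130531, 12.0), (20130601, 11.0), (20130602, 20.0)]


def split_series(points, cutoff=CUTOFF):
    """Split one column of values into a pre-cutoff and a post-cutoff series."""
    # Each series keeps the full length of the data; out-of-range slots are blank.
    before = [value if key <= cutoff else None for key, value in points]
    after = [value if key >= cutoff else None for key, value in points]
    return before, after


before, after = split_series(points)
print(before)
print(after)
```

Because both comparisons include the cutoff date itself, that one point appears in both series, so the two lines meet at the cutoff instead of leaving a gap.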
The graph then looks like this: