In SSAS, how to exclude NULL values from distinct count measures

I have a column in a fact table, and some rows have NULL in that column. I have a measure based on this column with its aggregate function set to DistinctCount.
This measure counts the NULL value too.
I don't want NULL to be counted. What should I do?

The most efficient approach would be to filter out the NULL values in the data source view (using a named query, for example). This won't affect performance much, as a distinct count measure is calculated in a separate measure group anyway.
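A minimal sketch of the kind of filter such a named query would apply. It runs against an in-memory SQLite database through Python's sqlite3 module only so that it is self-contained; the table FactSales and the column CustomerKey are hypothetical names, and in SSAS only the SELECT text itself would go into the data source view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table; CustomerKey is the column behind the distinct count.
conn.execute("CREATE TABLE FactSales (SaleId INTEGER, CustomerKey INTEGER)")
conn.executemany(
    "INSERT INTO FactSales VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 20), (4, None), (5, None)],
)

# Text that would serve as the named query: NULL keys never reach the measure group.
named_query = (
    "SELECT SaleId, CustomerKey FROM FactSales WHERE CustomerKey IS NOT NULL"
)
rows = conn.execute(named_query).fetchall()

# Python sets treat None as one more distinct value, which stands in here
# for the behaviour described in the question.
unfiltered_keys = {key for (key,) in conn.execute("SELECT CustomerKey FROM FactSales")}
filtered_keys = {key for _, key in rows}
print(len(unfiltered_keys), len(filtered_keys))
```

With the filter in place, the NULL rows are simply absent from the source of the distinct count measure group, so there is nothing left to subtract in a calculation.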

One popular solution is to count from a view of the table that filters out the NULLs. This works, but I would bet that it requires another scan of the fact table.
Another solution is like fighting fire with fire.
Add a computed column that is 0 if the value is NULL and 1 if it is not:
CASE WHEN _DollarsLY IS NULL THEN 0 ELSE 1 END AS _DistinctCountHackLY
Then you can do something like this in a cube calculation:
iif(_DistinctCountHackLY=2 or _DollarsLY=null,_DistinctUPCLY-1,_DistinctUPCLY)

Related

Count number of null column in sql query result

I have a table that is queried using PIVOT, and the SELECT statement returns values for Jan, Feb, Mar, ..., Dec. Along with this result, I need one more column that displays the number of columns (from Jan to Dec) whose value is NULL.
This can be done using CASE WHEN and adding 1 for each month that is NULL. Is there a more efficient way to achieve this?
Thanks
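For reference, the CASE-based approach described in the question can be sketched as follows. This is only an illustration under assumptions: the table name pivoted, the item column and the three sample months are hypothetical, and SQLite (via Python's sqlite3 module) stands in for whatever database produces the pivot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pivoted result, cut down to three months for brevity.
conn.execute("CREATE TABLE pivoted (item TEXT, Jan INTEGER, Feb INTEGER, Mar INTEGER)")
conn.executemany(
    "INSERT INTO pivoted VALUES (?, ?, ?, ?)",
    [("a", 1, None, 3), ("b", None, None, None), ("c", 4, 5, 6)],
)

months = ["Jan", "Feb", "Mar"]  # extend through Dec for the real pivot
# One CASE term per month, summed into a single NullCount column.
null_count = " + ".join(f"CASE WHEN {m} IS NULL THEN 1 ELSE 0 END" for m in months)
sql = f"SELECT item, {null_count} AS NullCount FROM pivoted ORDER BY item"
result = conn.execute(sql).fetchall()
print(result)
```

Generating the twelve CASE terms from a list of month names keeps the statement short to maintain, even though the SQL that is executed is the same sum of CASE expressions.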

Get latest data for all people in a table and then filter based on some criteria

I am attempting to return the row of the highest value for timestamp (an integer) for each person (that has multiple entries) in a table. Additionally, I am only interested in rows with the field containing ABCD, but this should be done after filtering to return the latest (max timestamp) entry for each person.
SELECT table."person", max(table."timestamp")
FROM table
WHERE table."type" = 1
HAVING table."field" LIKE '%ABCD%'
GROUP BY table."person"
For some reason, I am not receiving the data I expect. The returned table is nearly twice the expected size. Is there some step here that I am not getting correct?
You can first build a derived table holding MAX(timestamp) per person and then join it back to the table in an outer SELECT, applying the filters there. The query follows:
SELECT t."person", t."timestamp"
FROM table t
JOIN (SELECT "person", MAX("timestamp") AS max_ts FROM table GROUP BY "person") latest
ON latest."person" = t."person" AND latest.max_ts = t."timestamp"
WHERE t."type" = 1 AND t."field" LIKE '%ABCD%'
Direct answer: as I understand your end goal, just move the HAVING clause to the WHERE section:
SELECT
table."person", MAX(table."timestamp")
FROM table
WHERE
table."type" = 1
AND table."field" LIKE '%ABCD%'
GROUP BY table."person";
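This should return no more than 1 row per table."person", with their associated maximum timestamp. Because WHERE is applied before grouping, that maximum is taken over the rows that match the filter, not over all of the person's rows.
As an aside, I am surprised your query worked at all. Your HAVING clause referenced a column not in your query. From the documentation (and my experience):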
The fundamental difference between WHERE and HAVING is this: WHERE selects input rows before groups and aggregates are computed (thus, it controls which rows go into the aggregate computation), whereas HAVING selects group rows after groups and aggregates are computed.
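To see how much the order of filtering and aggregation matters for this question, here is a small sketch with made-up data. The table events and the column ts are hypothetical stand-ins for the question's table and timestamp, and SQLite through Python's sqlite3 module is used only to make the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (person TEXT, ts INTEGER, type INTEGER, field TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("ann", 1, 1, "xxABCDxx"),
        ("ann", 2, 1, "other"),  # ann's latest row does not match
        ("bob", 1, 1, "other"),
        ("bob", 3, 1, "ABCD"),   # bob's latest row matches
    ],
)

# Filter first (WHERE), then aggregate: latest *matching* row per person.
filter_first = conn.execute(
    """SELECT person, MAX(ts) FROM events
       WHERE type = 1 AND field LIKE '%ABCD%'
       GROUP BY person ORDER BY person"""
).fetchall()

# Aggregate first, then filter: a person is kept only if the latest row matches.
latest_first = conn.execute(
    """SELECT e.person, e.ts
       FROM events AS e
       JOIN (SELECT person, MAX(ts) AS max_ts FROM events
             WHERE type = 1 GROUP BY person) AS latest
         ON latest.person = e.person AND latest.max_ts = e.ts
       WHERE e.field LIKE '%ABCD%'
       ORDER BY e.person"""
).fetchall()
print(filter_first, latest_first)
```

The first query returns a row for ann even though her most recent entry does not contain ABCD, while the second drops her. Which of the two is wanted depends on how "latest entry, then filter" is meant.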

SSAS: Show distinct count measure with unknown member

I have a measure that counts distinct IDs in a fact table.
Let's say it looks like this:
[id] [linkedtableid] [datecolumn]
1 someid date1
2 someid date1
3 someid date1
4 someid date1
5 null date1
You can see that for date1 there are 5 distinct rows, but in my case the resulting count is 4. I thought this might be connected with UnknownMember processing, but that assumption led nowhere. I've already tried everything in my cube solution, but can't find the reason for this behavior. It seems that the row with the null value is simply not counted by the distinct count function.
Also, if I fill in this null value in the relational DB and then reprocess the cube, everything is counted correctly.
I probably missed something, maybe some option somewhere.
Resolved by removing unneeded relationships between the distinct count measure and dimensions. There were 2 other dimensions, one connected through a direct link and one through a referenced relationship. I don't know why the nulls were not counted there; maybe because a referenced relationship cannot link through a null-valued field.

How to not count the null value from a fact table in SSAS?

I have many distinct count measures in a cube. My problem is that those measures count the NULL value as well. I've found two solutions to eliminate the NULL value:
First, I created named queries in the data source view for each measure, with the condition that the column I need does not contain NULL [WHERE column IS NOT NULL]. This solution is not very practical: if you have many measures that should not count NULL, you have to create many fact tables as named queries to eliminate it.
Second, I created an additional column as a named calculation in the fact table that tests whether the column I need is NULL, yielding 1 if so and 0 otherwise (CASE WHEN Column IS NULL THEN 1 ELSE 0 END). I then created a Maximum measure on this additional column and a distinct count measure on the column I need. Finally, I created a calculation that tests the following: IIF([measure that i need]- [Maximum of additional column]<0,null,[measure that i need]- [Maximum of additional column])
Both solutions work, but my question is whether there is a simpler solution than these two, or an option in SSAS.
If someone knows, please share the information.
In SQL it is possible to use
select count(column_name) from table
This doesn't count the NULL values.
count(*) counts every row, including those where the column is NULL.
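A quick way to check these COUNT variants side by side is sketched below, using SQLite through Python's sqlite3 module with a hypothetical fact table. Note that this shows relational SQL behaviour only; it does not by itself change how an SSAS DistinctCount measure treats NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: four rows, one of them with a NULL customer.
conn.execute("CREATE TABLE fact (id INTEGER, customer TEXT)")
conn.executemany(
    "INSERT INTO fact VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b"), (4, None)],
)

# COUNT(*) counts rows; COUNT(col) and COUNT(DISTINCT col) skip NULLs.
count_all, count_col, count_distinct = conn.execute(
    "SELECT COUNT(*), COUNT(customer), COUNT(DISTINCT customer) FROM fact"
).fetchone()
print(count_all, count_col, count_distinct)
```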

How not to display columns which are NULL in a view

I've set up a view which combines all the data across several tables. Is there a way to write this so that only columns which contain non-null data are displayed, and those columns which contain all NULL values are not included?
ADDED:
Sorry, still studying and working on my first big project so every day seems to be a new experience at the minute. I haven't been very clear, and that's partly because I'm not sure I'm going about things the right way! The client is an academic library, and the database records details of specific collections. The view I mentioned is to display all the data held about an item, so it is bringing together tables on publication, copy, author, publisher, language and so on. A small number of items in the collection are papers, so have additional details over and above the standard bibliographic details. What I didn't want was a user to get all the empty fields relating to papers if what was returned only consisted of books, therefore the paper table fields were all null. So I thought perhaps there would be a way to not show these. Someone has commented that this is the job of the client application rather than the database itself, so I can leave this until I get to that phase of the project.
There is no way to do this in SQL.
CREATE VIEW dbo.YourView
AS
SELECT (list of fields)
FROM dbo.Table1 t1
INNER JOIN dbo.Table2 t2 ON t1.ID = t2.FK_ID
WHERE t1.SomeColumn IS NOT NULL
AND t2.SomeOtherColumn IS NOT NULL
In your view definition, you can include WHERE conditions which can exclude rows that have certain columns that are NULL.
Update: you cannot really filter out columns - you define the list of columns that are part of your view in your view definition, and this list is fixed and cannot be dynamically changed.
What you might be able to do is use an ISNULL(column, '') construct to replace those NULLs with an empty string. Otherwise you need to handle excluding those columns in your display front end - not in the SQL view definition...
The only thing I see you could do is make sure to select only those columns from the view that you know aren't NULL:
SELECT (list of non-null fields) FROM dbo.YourView
WHERE (column1 IS NOT NULL)
and so forth - but there's no simple or magic way to select all columns that aren't NULL in one SELECT statement...
You cannot do this in a view, but you can do it fairly easily using dynamic SQL in a stored procedure.
Of course, having a schema which shifts is not necessarily good for clients who consume the data, but it can be efficient if you have very sparse data AND the consuming client understands the varying schema.
If you have to have a view, you can put a "header" row in it, which the client inspects on the first row of its loop to decide whether to bother with a column in its grid or whatever. You can do something like this:
SELECT * FROM (
-- This is the view code
SELECT 'data' as typ
,int_col
,varchar_col
FROM TABLE
UNION ALL
SELECT 'hdr' as typ
-- note that different types have to be handled differently
,CASE WHEN COUNT(int_col) = 0 THEN NULL ELSE 0 END
,CASE WHEN COUNT(varchar_col) = 0 THEN NULL ELSE '' END
FROM TABLE
) AS X
-- have to get header row first
ORDER BY typ DESC -- add other sort criteria here
If we're reading your question right, there won't be a way to do this in SQL. The output of a view must be a relation - in (over-)simplified terms, it must be rectangular. That is, each row must have the same number of columns.
If you can tell us more about your data and give us some idea of what you want to do with the output, we can perhaps offer more positive suggestions.
In general, add a WHERE clause to your query, e.g.
WHERE a IS NOT NULL AND b IS NOT NULL AND c IS NOT NULL
Here, a b c are your column names.
If you are joining tables together on potentially NULL columns, then use an INNER JOIN, and NULL values will not be included.
EDIT: I may have misunderstood - the above filters out rows, but you may be asking to filter out columns, e.g. you have several columns and you only want to display columns that contain at least one non-null value across all the rows you are returning. Using dynamic SQL offers a solution, since the set of columns varies depending upon your data.
Here's a SQL query that builds another SQL query containing the appropriate columns. You could run this query, and then submit its result as another query. It assumes 'pk' is some column that is always non-null, e.g. a primary key - this means we can prefix additional column names with a comma.
SELECT CONCAT('SELECT pk',
CASE (count(columnA)) WHEN 0 THEN '' ELSE ',columnA' END,
CASE (count(columnB)) WHEN 0 THEN '' ELSE ',columnB' END,
-- etc.
' FROM (YourQuery) base')
FROM
(YourQuery) As base
The query works using Count(column) - the aggregate function ignores NULL values, and so returns 0 for a column consisting entirely of NULLs. The query builder assumes that YourQuery uses aliases to ensure there are no duplicate column names.
While you can't put this into a view, you could wrap it up as a stored procedure that copies the data to another table - the result table. You may also set up a trigger so that the result table is updated whenever the base tables change.
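The same two-step idea (count first, then build the real query) can be sketched from client code as well. This is an illustration under assumptions, not the answer's exact method: a table named base stands in for YourQuery, the column names are taken from the example above, and SQLite through Python's sqlite3 module is used so the sketch runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'base' stands in for YourQuery; columnB is NULL in every row.
conn.execute("CREATE TABLE base (pk INTEGER PRIMARY KEY, columnA TEXT, columnB TEXT)")
conn.executemany("INSERT INTO base VALUES (?, ?, ?)", [(1, "x", None), (2, None, None)])

# Column names are trusted identifiers here, never user input.
candidates = ["columnA", "columnB"]

# Step 1: COUNT(col) ignores NULLs, so 0 means the column holds no data.
counts_sql = "SELECT " + ", ".join(f"COUNT({c})" for c in candidates) + " FROM base"
counts = conn.execute(counts_sql).fetchone()

# Step 2: keep pk plus every column with at least one non-NULL value.
kept = ["pk"] + [c for c, n in zip(candidates, counts) if n > 0]
final_sql = f"SELECT {', '.join(kept)} FROM base ORDER BY pk"
rows = conn.execute(final_sql).fetchall()
print(final_sql)
print(rows)
```

Because the column list is decided at run time, the consumer has to read the column names from the result rather than assume a fixed shape.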
I suspect what's going on is that an end user is running CrystalReports and complaining about all the empty columns that have to be removed manually.
It would actually be possible to create a stored procedure that would create a view on the fly, leaving out dataless columns. But then you would have to run this proc before using the view.
Is that acceptable?