How to handle Null or Blank in a filter activity logic expressions - azure-synapse

I am trying to filter out null records returned by a preceding Lookup activity in an Azure Synapse workspace. I used the expressions below in a Filter activity, with and without the coalesce() function, but the null records still appear in the Filter activity output.
FYI, I know there is an isNull expression in the Data Flow activity, but I don't want to use a data flow because the pipeline is small and data flow execution adds extra cost.
Details are below.
Approach used: Filter activity to filter null records, with and without the coalesce() function.
Expressions:
@if(equals(coalesce(activity('LookupID').output.value[0].ID,''),''), true,false)
@if(equals(activity('LookupID').output.value[0].ID, null), false,true)
@if(equals(activity('LookupID').output.value[0].ID, ''),''), false,true)
With any of the above expressions, the filter output has all the records including null value records.
Please share your suggestion if I am missing anything.

The problem is the filter condition. Your expressions reference activity('LookupID').output.value[0].ID, which always reads the first row, so the condition evaluates to the same result for every item. Use item() instead, which refers to the row currently being evaluated.
I have a Lookup activity called Lookup id that returns id and gname; one record has a null id and two do not.
I used the following dynamic content as the items and condition for the Filter activity:
items: @activity('Lookup id').output.value
condition: @if(equals(item().id, null), false,true)
With this condition the filter works as expected and the output contains only the two records whose id is not null.
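
To see why the per-item condition matters, here is a minimal Python sketch that models the Filter activity as "keep each item for which the condition is true". It is only a model of the semantics, not how the service is implemented; the sample rows and the helper name filter_activity are made up for illustration.

```python
# Rows as the Lookup activity might return them (illustrative data only).
lookup_output = [
    {"id": 1, "gname": "a"},
    {"id": None, "gname": "b"},
    {"id": 3, "gname": "c"},
]


def filter_activity(items, condition):
    """Hypothetical model of the Filter activity: keep items where condition(item) is true."""
    return [item for item in items if condition(item)]


# Analogue of a condition on output.value[0].ID: it only ever looks at row 0,
# so it gives the same answer for every item and nothing is filtered per row.
first_row_check = filter_activity(
    lookup_output, lambda item: lookup_output[0]["id"] is not None
)

# Analogue of if(equals(item().id, null), false, true): evaluated per item.
per_item_check = filter_activity(
    lookup_output, lambda item: item["id"] is not None
)

print(len(first_row_check))                # all rows survive
print([row["id"] for row in per_item_check])  # only rows with a non-null id
```

With the first-row condition all three rows pass (or none would, if the first row's id were null); with the per-item condition only the two non-null rows remain.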

Related

How to use Kusto to return a max() row from a table, while showing other columns not used in the max grouping

Given the following Log analytics KQL query :
SigninLogs
| where ResultType == 0
| summarize max(TimeGenerated) by UserPrincipalName
I need to display other columns from those selected rows in the SigninLogs table. I've tried different approaches with no success. Joining back to the same table again seems unfeasible as joins appear to only be available using a single column. Other approaches using in failed because the needed columns weren't available in the above source query.
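You can use the arg_max() aggregation function, which returns the requested columns from the row that maximizes an expression. For example, summarize arg_max(TimeGenerated, *) by UserPrincipalName returns every column of each user's most recent row: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/arg-max-aggfunction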
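
The behaviour of arg_max() grouped by a key can be sketched in plain Python: for each key, keep the whole row whose chosen column is largest. This is only an illustration of the semantics, not Kusto code; the helper name arg_max_by, the Location column, and all values are assumptions made for the example.

```python
from datetime import datetime

# Made-up rows standing in for SigninLogs (values are illustrative only).
signin_logs = [
    {"UserPrincipalName": "alice@example.com", "TimeGenerated": datetime(2023, 1, 1, 9), "ResultType": 0, "Location": "US"},
    {"UserPrincipalName": "alice@example.com", "TimeGenerated": datetime(2023, 1, 3, 9), "ResultType": 0, "Location": "DE"},
    {"UserPrincipalName": "alice@example.com", "TimeGenerated": datetime(2023, 1, 5, 9), "ResultType": 50126, "Location": "FR"},
    {"UserPrincipalName": "bob@example.com", "TimeGenerated": datetime(2023, 1, 2, 9), "ResultType": 0, "Location": "UK"},
]


def arg_max_by(rows, key_column, max_column):
    """For each key, keep the entire row with the largest max_column value."""
    best = {}
    for row in rows:
        key = row[key_column]
        if key not in best or row[max_column] > best[key][max_column]:
            best[key] = row
    return list(best.values())


# Equivalent of: where ResultType == 0, then arg_max(TimeGenerated, *) per user.
successful = [row for row in signin_logs if row["ResultType"] == 0]
latest = arg_max_by(successful, "UserPrincipalName", "TimeGenerated")

for row in latest:
    print(row["UserPrincipalName"], row["TimeGenerated"], row["Location"])
```

Because the whole row is kept, columns that take no part in the grouping (here Location) stay available in the result.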

Query with multiple column conditions in single table

My table has multiple columns containing a mix of data and null values. I want to display only the rows where the data is not null, so I wrote the query below.
select Capability
, BusinessStrategyMessaging
from tb_MBUPSheetData
where (Capability is not null)
and (BusinessStrategyMessaging is not null)
But it returns an empty result instead of the table data. I want both column conditions to be satisfied and the data displayed when I run the select.
Please give me suggestions.
If I understand you correctly, you want to show the rows of your table that have values in the columns you select. There is nothing wrong with your query. The thing is that your table has no rows that satisfy these two conditions:
(Capability is not null)
and (BusinessStrategyMessaging is not null)
At least not at the same time. So I think what you want is one or the other. To get that, write your conditions like this:
( (Capability is not null)
OR (BusinessStrategyMessaging is not null) )
Note that I added an extra pair of parentheses so that any other conditions you add with AND are not affected by the OR.
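
The difference between the two WHERE clauses can be checked with a small Python sketch using the standard sqlite3 module. The table and column names come from the question; the three sample rows are assumptions chosen so that no row has both columns populated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb_MBUPSheetData (Capability TEXT, BusinessStrategyMessaging TEXT)"
)
# Illustrative data: no row has both columns filled in.
conn.executemany(
    "INSERT INTO tb_MBUPSheetData VALUES (?, ?)",
    [("cap1", None), (None, "msg1"), (None, None)],
)

# Original query: both columns must be non-null in the same row.
both = conn.execute(
    "SELECT Capability, BusinessStrategyMessaging FROM tb_MBUPSheetData "
    "WHERE (Capability IS NOT NULL) AND (BusinessStrategyMessaging IS NOT NULL)"
).fetchall()

# Suggested query: at least one of the columns must be non-null.
either = conn.execute(
    "SELECT Capability, BusinessStrategyMessaging FROM tb_MBUPSheetData "
    "WHERE ((Capability IS NOT NULL) OR (BusinessStrategyMessaging IS NOT NULL)) "
    "ORDER BY rowid"
).fetchall()

print(both)    # empty: no row satisfies both conditions
print(either)  # the two rows with at least one value
```

With this data the AND version returns nothing, which matches the empty result described in the question, while the OR version returns the two partially filled rows and still drops the row that is entirely null.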

SSIS Subtracting from 2 sources

I currently have two Excel sources. I'm hoping to subtract the row count of one Excel source (SourceA) from a row value in the other Excel source (SourceB).
I've used a conditional split on each to specify which rows I want to use. SourceA returns one row, which is what I wanted. SourceB returns a number of rows, which is what I expected.
From the SourceA data flow, I've now added an Aggregate transformation to count the number of rows.
I then use a Union All, a data conversion transformation and then a Derived Column transformation. In this Derived Column transformation, I use the column from SourceB - the aggregate count of SourceA.
Then I link it to a SQL Server Destination and configure the mappings. I run the data flow and everything works. However, when I look at the results, it only gives me a NULL value (it did not calculate it for me).
How can I achieve this subtraction of a row value and an aggregate count?
The Aggregate transformation is asynchronous, meaning that it won't produce an output row for each input row. You probably need a Merge Join instead of a Union All. You may also want to consider using a variable to hold the aggregated value.
You can achieve what you are looking for as follows:
1) Get the row counts from Source A and Source B and store them in variable1 and variable2.
2) Add a Derived Column, or another variable, that computes variable1 - variable2.
3) Map the final variable or derived column to the destination.
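
A likely reason for the NULL result is that a Union All stacks the two inputs as separate rows, so the count and the SourceB value never sit in the same row. The sketch below reproduces that with a SQL analogy run through Python's sqlite3 module. It is not SSIS itself, and the table names source_a and source_b, the column names, and the values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_a (name TEXT)")
conn.execute("CREATE TABLE source_b (amount INTEGER)")
conn.execute("INSERT INTO source_a VALUES ('only row')")
conn.executemany("INSERT INTO source_b VALUES (?)", [(10,), (25,)])

# Union All analogy: the count lives in one row and the amounts in others,
# so every row has NULL on one side of the subtraction.
stacked = conn.execute(
    "SELECT amount - row_count FROM ("
    "  SELECT NULL AS amount, COUNT(*) AS row_count FROM source_a"
    "  UNION ALL"
    "  SELECT amount, NULL FROM source_b"
    ")"
).fetchall()

# Join analogy: the single count row is placed beside every amount row.
joined = conn.execute(
    "SELECT b.amount - a.row_count "
    "FROM source_b AS b "
    "CROSS JOIN (SELECT COUNT(*) AS row_count FROM source_a) AS a "
    "ORDER BY b.amount"
).fetchall()

print(stacked)  # every value is None
print(joined)   # each amount minus the row count
```

Holding the count in a variable, as the steps above suggest, has the same effect as the join: the count becomes available to every SourceB row.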

How to not count the null value from a fact table in SSAS?

I have many distinct count measures in a cube. My problem is that these measures count the null value as well. I've found two ways to eliminate the null value:
I created named queries in the data source view for each measure, with a condition that the column I need does not contain null [where column is not null]. This is not very practical: if many measures must ignore the null value, you have to create many fact tables as named queries.
I created an additional column as a named calculation in the fact table that is 1 if the column I need contains null and 0 otherwise (CASE WHEN Column IS NULL THEN 1 ELSE 0 END). I then created a maximum measure on this additional column and a distinct count measure on the column I need. Finally, I created a calculation that tests the following: IIF([measure that i need]- [Maximum of additional column]<0,null,[measure that i need]- [Maximum of additional column])
Both solutions work, but my question is whether there is a simpler solution than these two, or an option in SSAS itself.
If someone knows, please share the information.
In SQL it is possible to use
select count(column_name) from table
This doesn't count the null values.
count(*), by contrast, counts every row, including rows where the column is null.
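
The three counting behaviours can be compared side by side with a short sqlite3 sketch in Python. It shows plain SQL semantics only, not how an SSAS distinct count measure is configured; the table name fact and the column customer_id are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact (customer_id INTEGER)")
# Illustrative data: two distinct values, one duplicate, one NULL.
conn.executemany("INSERT INTO fact VALUES (?)", [(1,), (1,), (2,), (None,)])

count_all, count_col, count_distinct = conn.execute(
    "SELECT COUNT(*), COUNT(customer_id), COUNT(DISTINCT customer_id) FROM fact"
).fetchone()

print(count_all)       # counts every row, including the NULL one
print(count_col)       # skips the NULL value
print(count_distinct)  # skips NULL and collapses duplicates
```

COUNT(DISTINCT column) is the closest SQL counterpart to a distinct count measure, and like COUNT(column) it leaves NULL out.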

How not to display columns which are NULL in a view

I've set up a view which combines all the data across several tables. Is there a way to write this so that only columns which contain non-null data are displayed, and those columns which contain all NULL values are not included?
ADDED:
Sorry, still studying and working on my first big project so every day seems to be a new experience at the minute. I haven't been very clear, and that's partly because I'm not sure I'm going about things the right way! The client is an academic library, and the database records details of specific collections. The view I mentioned is to display all the data held about an item, so it is bringing together tables on publication, copy, author, publisher, language and so on. A small number of items in the collection are papers, so have additional details over and above the standard bibliographic details. What I didn't want was a user to get all the empty fields relating to papers if what was returned only consisted of books, therefore the paper table fields were all null. So I thought perhaps there would be a way to not show these. Someone has commented that this is the job of the client application rather than the database itself, so I can leave this until I get to that phase of the project.
There is no way to do this in SQL.
CREATE VIEW dbo.YourView
AS
SELECT (list of fields)
FROM dbo.Table1 t1
INNER JOIN dbo.Table2 t2 ON t1.ID = t2.FK_ID
WHERE t1.SomeColumn IS NOT NULL
AND t2.SomeOtherColumn IS NOT NULL
In your view definition, you can include WHERE conditions that exclude rows in which certain columns are NULL.
Update: you cannot really filter out columns. You define the list of columns that are part of your view in the view definition, and this list is fixed and cannot be changed dynamically.
What you might be able to do is use an ISNULL(column, '') construct to replace those NULLs with an empty string. Otherwise you need to handle excluding those columns in your display front end, not in the SQL view definition.
The only other thing I see you could do is make sure to select only those columns from the view that you know aren't NULL:
SELECT (list of non-null fields) FROM dbo.YourView
WHERE (column1 IS NOT NULL)
and so forth, but there's no simple or magic way to select all columns that aren't NULL in one SELECT statement.
You cannot do this in a view, but you can do it fairly easily using dynamic SQL in a stored procedure.
Of course, having a schema which shifts is not necessarily good for clients who consume the data, but it can be efficient if you have very sparse data AND the consuming client understands the varying schema.
If you have to have a view, you can put a "header" row in it and inspect that row client-side, as the first row in your loop, to decide whether to bother with each column in your grid or whatever. You can do something like this:
SELECT * FROM (
-- This is the view code
SELECT 'data' as typ
,int_col
,varchar_col
FROM TABLE
UNION ALL
SELECT 'hdr' as typ
-- note that different types have to be handled differently
,CASE WHEN COUNT(int_col) = 0 THEN NULL ELSE 0 END
,CASE WHEN COUNT(varchar_col) = 0 THEN NULL ELSE '' END
FROM TABLE
) AS X
-- have to get header row first
ORDER BY typ DESC -- add other sort criteria here
If we're reading your question right, there won't be a way to do this in SQL. The output of a view must be a relation - in (over-)simplified terms, it must be rectangular. That is, each row must have the same number of columns.
If you can tell us more about your data and give us some idea of what you want to do with the output, we can perhaps offer more positive suggestions.
In general, add a WHERE clause to your query, e.g.
WHERE a IS NOT NULL AND b IS NOT NULL AND c IS NOT NULL
Here a, b, and c are your column names.
If you are joining tables together on potentially NULL columns, then use an INNER JOIN, and NULL values will not be included.
EDIT: I may have misunderstood. The above filters out rows, but you may be asking to filter out columns, e.g. you have several columns and you only want to display columns that contain at least one non-null value across all the rows you are returning. Dynamic SQL offers a solution, since the set of columns varies depending upon your data.
Here's a SQL query that builds another SQL query containing the appropriate columns. You could run this query, and then submit its result as another query. It assumes 'pk' is some column that is always non-null, e.g. a primary key. This means we can prefix additional column names with a comma.
SELECT CONCAT('SELECT pk',
CASE COUNT(columnA) WHEN 0 THEN '' ELSE ',columnA' END,
CASE COUNT(columnB) WHEN 0 THEN '' ELSE ',columnB' END,
-- etc.
' FROM (YourQuery) base')
FROM
(YourQuery) AS base
The query works using COUNT(column): the aggregate function ignores NULL values, and so returns 0 for a column consisting entirely of NULLs. The query builder assumes that YourQuery uses aliases to ensure there are no duplicate column names.
While you can't put this into a view, you could wrap it up as a stored procedure that copies the data to another table, the result table. You may also set up a trigger so that the result table is updated whenever the base tables change.
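
The query-building idea above can be tried end to end with a small Python sketch using sqlite3: count the non-null values per column, then build a second SELECT from the columns whose count is above zero. The table items, its columns, and the data are hypothetical, and the column names are interpolated from a fixed list in the code, never from user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (pk INTEGER PRIMARY KEY, title TEXT, paper_size TEXT)"
)
# Illustrative data: paper_size is NULL in every row.
conn.executemany(
    "INSERT INTO items (pk, title, paper_size) VALUES (?, ?, ?)",
    [(1, "Book A", None), (2, "Book B", None)],
)

# Hypothetical list of nullable columns to test; pk is assumed never NULL.
candidate_columns = ["title", "paper_size"]

# COUNT(col) ignores NULLs, so 0 means the column is NULL in every row.
count_sql = (
    "SELECT " + ", ".join(f"COUNT({col})" for col in candidate_columns) + " FROM items"
)
counts = conn.execute(count_sql).fetchone()

# Keep pk plus every column that holds at least one non-null value.
kept = ["pk"] + [col for col, n in zip(candidate_columns, counts) if n > 0]
final_sql = "SELECT " + ", ".join(kept) + " FROM items ORDER BY pk"
rows = conn.execute(final_sql).fetchall()

print(final_sql)
print(rows)
```

The shape of the result now depends on the data, which is why this belongs in client code or a stored procedure and not in a view.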
I suspect what's going on is that an end user is running Crystal Reports and complaining about all the empty columns that have to be removed manually.
It would actually be possible to create a stored procedure that would create a view on the fly, leaving out dataless columns. But then you would have to run this proc before using the view.
Is that acceptable?