How to count records from multiple columns eliminating null values in hive table

I'm using the command below to find the sum of records from 8 columns, but I get NULL in the output, as shown below.
[Screenshots: command part 1, command part 2, output]
How can this be fixed?

Yes, the thing is that NULL + something results in NULL. To solve this, wrap each sum() in NVL(sum(), 0), so that if a particular sum() is NULL, it is converted to 0 and the whole total is not NULL:
nvl(sum(case when col1='something' then 1 else 0 end),0)+ ...
Or always use else 0, like in the first expression (H).
Wrapping with NVL() solves the problem even if a column comes from a join, the rows are absent, and the sum is NULL.
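To see the effect without a Hive cluster, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. The table t, its columns, and the rows are made up for illustration, and COALESCE plays the role of NVL because SQLite has no NVL (Hive accepts COALESCE as well).

```python
import sqlite3

# In-memory SQLite database as a stand-in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("something", None), ("other", None), ("something", "x")],
)

# The second SUM has no ELSE branch and never matches, so it is NULL,
# and NULL added to the first SUM makes the whole total NULL.
bare = conn.execute(
    "SELECT SUM(CASE WHEN col1 = 'something' THEN 1 ELSE 0 END)"
    " + SUM(CASE WHEN col2 = 'missing' THEN 1 END) FROM t"
).fetchone()[0]

# Wrapping each SUM turns a NULL sum into 0 before the addition.
wrapped = conn.execute(
    "SELECT COALESCE(SUM(CASE WHEN col1 = 'something' THEN 1 ELSE 0 END), 0)"
    " + COALESCE(SUM(CASE WHEN col2 = 'missing' THEN 1 END), 0) FROM t"
).fetchone()[0]

print(bare, wrapped)
```

The unwrapped total comes back as NULL (None in Python), while the wrapped total is a plain number.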

Related

T-SQL COUNT(*) counts NULLS but I don't want to count NULLS in row count

I have a scenario where I'm trying not to count a row when it has a zero, blank, or NULL, but I'm not sure how to. I have used ISNULL to replace it with a blank, but the result shows zero. I don't want zero because it messes up my averages, etc. The screenshot below should show 17 at the bottom as the total, but it shows 18 because it counts the NULL as a row. This NULL row has N/A as a value in the data set, but my count counts it as a row. I'm using count() because I have many other columns, so I cannot change the count(). Any ideas on how to show the total as 17 instead of 18?
Thank you

Some SQL implementations (I think this is also proper ANSI standard but don't know that for sure) exhibit different behaviour for COUNT(*) vs. COUNT(field).
The former will include NULLs, the latter will exclude them.
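A quick way to check the difference is the sketch below. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, and the scores table and its rows are assumptions made up for the demo.

```python
import sqlite3

# SQLite stand-in; table name and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (TestQ4 INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?)", [(3,), (5,), (None,)])

# COUNT(*) counts rows; COUNT(TestQ4) counts only non-NULL values of TestQ4.
count_star, count_field = conn.execute(
    "SELECT COUNT(*), COUNT(TestQ4) FROM scores"
).fetchone()

print(count_star, count_field)
```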

You can use SUM like this:
SELECT SUM(CASE WHEN ISNULL(testeq4,'') <> '' THEN 1 ELSE 0 end)
FROM YourTable

Try this:
SELECT SUM(CASE WHEN ISNULL(TestQ4,0)=0 THEN 0 ELSE 1 END)
FROM YourTable
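Here is a small sketch of the same idea that can be run locally. It uses SQLite via Python's sqlite3 module, so the two-argument T-SQL ISNULL is swapped for COALESCE (which T-SQL also supports); the table contents are invented for the demo.

```python
import sqlite3

# SQLite stand-in for SQL Server; data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (TestQ4 INTEGER)")
conn.executemany("INSERT INTO YourTable VALUES (?)", [(4,), (0,), (None,), (2,)])

# Rows whose value is NULL or 0 add 0 to the sum; every other row adds 1.
all_rows, counted = conn.execute(
    "SELECT COUNT(*),"
    " SUM(CASE WHEN COALESCE(TestQ4, 0) = 0 THEN 0 ELSE 1 END)"
    " FROM YourTable"
).fetchone()

print(all_rows, counted)
```

Only the rows holding 4 and 2 are counted, even though the table has four rows.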

Two different conditions for two different columns using case statement in SQL

Given a table of random numbers as follows:
** Person table schema **
Name
Marks1
Marks2
I want to return a table with similar structure and headings, where if the sum of a column is odd, the column shows the maximum value for that column, and when the sum is even, it shows the minimum value by using a case statement.
** output table schema **
Marks1
Marks2
I've tried the following code.
select Marks1,Marks2 ,
(case
when mod(sum(Marks1),2)=0 then
min(Marks1)
else max(Marks1)
end) as Marks1 ,
(case
when mod(sum(Marks2),2)=0 then
min(Marks2)
else max(Marks2)
end) as Marks2
from numbers
group by Marks1;
Sample table -
Ash 56 45
David 45 35
Output -
56 35
In the Marks1 column, 56+45 = 101 is an odd number, so the output is 56 (the max). In the Marks2 column, 45+35 = 80 is an even number, so the output is 35 (the min).
Can anyone tell me what's wrong with it? Thanks in advance.

Use a CTE to get your min(), max(), and sum() values. Then use case to determine what values to display.
Since your problem statement and sample results do not match, I followed your sample results to return max() on an odd sum(). You can switch this by changing the two case statements from 1 to 0.
with totals as (
select sum(marks1) as marks1sum,
min(marks1) as marks1min,
max(marks1) as marks1max,
sum(marks2) as marks2sum,
min(marks2) as marks2min,
max(marks2) as marks2max
from numbers
)
select case mod(marks1sum, 2)
when 1 then marks1max
else marks1min
end as marks1,
case mod(marks2sum, 2)
when 1 then marks2max
else marks2min
end as marks2
from totals;
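To try the CTE approach against the sample rows from the question, here is a sketch on SQLite through Python's sqlite3 module. The database engine in the question is not stated, so this is only a stand-in, and the % operator replaces mod() because mod() is not available in every SQLite build.

```python
import sqlite3

# SQLite stand-in; the rows are the two sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (name TEXT, marks1 INTEGER, marks2 INTEGER)")
conn.executemany(
    "INSERT INTO numbers VALUES (?, ?, ?)",
    [("Ash", 56, 45), ("David", 45, 35)],
)

query = """
WITH totals AS (
    SELECT SUM(marks1) AS marks1sum, MIN(marks1) AS marks1min, MAX(marks1) AS marks1max,
           SUM(marks2) AS marks2sum, MIN(marks2) AS marks2min, MAX(marks2) AS marks2max
    FROM numbers
)
SELECT CASE marks1sum % 2 WHEN 1 THEN marks1max ELSE marks1min END AS marks1,
       CASE marks2sum % 2 WHEN 1 THEN marks2max ELSE marks2min END AS marks2
FROM totals
"""

# Odd sum (101) picks the max of marks1; even sum (80) picks the min of marks2.
result = conn.execute(query).fetchone()
print(result)
```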

You are reusing marks1 and marks2 as aliases for your third and fourth columns, which collides with the first two columns. Try using different names.

SQL - concatenate values in columns but keep only first non-null value

I have the following table in Postgresql. The first 4 records are the base data and the others were generated with the ROLLUP function.
I want to add a column "grp1" that displays the first non-null value of the columns grp1_l1, grp1_l2 and grp1_l3.
I can get the desired result by nesting 3 "case" expressions using the SQL below, but my real table has 4 groups with 8 to 10 columns each (so a lot of nested "case" expressions).
SELECT grp1_l1, grp1_l2, grp1_l3,
       case when grp1_l1 is not null then grp1_l1
            else case when grp1_l2 is not null then grp1_l2
                 else case when grp1_l3 is not null then grp1_l3
                      else null end end end as grp1,
       value
FROM public.query_test;
Is there a better and more scalable way to handle this requirement? Any suggestions are welcome.
The id will not always have 3 digits, that is just the case in my example here

Use coalesce(). It is defined as "returns the first of its arguments that is not null", which is exactly what you want.
coalesce(grp1_l1, grp1_l2, grp1_l3)
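As a rough illustration, the sketch below runs the same coalesce() call on SQLite via Python's sqlite3 module instead of PostgreSQL. The rows are invented, since the original table was only shown as an image.

```python
import sqlite3

# SQLite stand-in for PostgreSQL; rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE query_test (grp1_l1 TEXT, grp1_l2 TEXT, grp1_l3 TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO query_test VALUES (?, ?, ?, ?)",
    [
        ("a", "b", "c", 1),
        (None, "b", "c", 2),
        (None, None, "c", 3),
        (None, None, None, 4),
    ],
)

# COALESCE returns its first non-NULL argument, or NULL if all are NULL.
rows = conn.execute(
    "SELECT COALESCE(grp1_l1, grp1_l2, grp1_l3) AS grp1, value"
    " FROM query_test ORDER BY value"
).fetchall()

print(rows)
```

Adding more columns to a group only means adding more arguments to the one call, with no extra nesting.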

Count() Specifying Uncounted Value?

Using Microsoft SQL Server, if you use COUNT(column name) it returns the number of rows in that column which have a non-null value (i.e., it counts the rows, ignoring nulls).
Is there any way to do something similar, but allowing you to tell it which values to ignore? For example, if I wanted to count all the rows in a table which have a value which is NOT 1, I could do something like COUNTNOT(column name,1). That would count all the rows in the specified column which have a value NOT 1.

You may use conditional aggregation:
SELECT COUNT(CASE WHEN some_val <> 1 THEN 1 END) AS cnt
FROM yourTable;
The logic above is that COUNT counts a row whenever some_val is not equal to 1. Otherwise, the CASE falls through to the ELSE branch, which defaults to NULL when it is not present. Since NULL is not counted, any row whose value is 1 (or NULL) contributes zero to the count.
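A minimal sketch of this, run on SQLite through Python's sqlite3 module as a stand-in for SQL Server, with made-up rows:

```python
import sqlite3

# SQLite stand-in; yourTable contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (some_val INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?)", [(1,), (2,), (3,), (None,), (1,)]
)

# The CASE yields 1 only when some_val <> 1 is true; for 1 and for NULL
# it yields NULL, which COUNT skips.
cnt = conn.execute(
    "SELECT COUNT(CASE WHEN some_val <> 1 THEN 1 END) AS cnt FROM yourTable"
).fetchone()[0]

print(cnt)
```

Note that the NULL row is skipped too, because NULL <> 1 is not true.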

Why not put what you want to exclude in a WHERE clause?
SELECT COUNT(some_val) AS cnt
FROM yourTable
WHERE some_val <> 1

You need to be careful about NULL values. I would recommend:
select sum(case when column in (<values to ignore>) then 0 else 1 end)
This will count NULL values as not in the list (even if NULL is in the list). To ignore NULL values (as well), switch the logic to:
select sum(case when column not in (<values to ignore>) then 1 else 0 end)
and be sure NULL is not in the list.
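The difference between the two forms is easy to check with a sketch like the one below. It uses SQLite via Python's sqlite3 module as a stand-in, the column is named some_val, and the ignore list is just the value 1; all of these are assumptions for the demo.

```python
import sqlite3

# SQLite stand-in; table, column name, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (some_val INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?)", [(1,), (2,), (3,), (None,), (1,)]
)

# IN form: NULL IN (1) is not true, so the NULL row falls to ELSE 1 and is counted.
# NOT IN form: NULL NOT IN (1) is not true either, so the NULL row adds 0.
counts_null, ignores_null = conn.execute(
    "SELECT SUM(CASE WHEN some_val IN (1) THEN 0 ELSE 1 END),"
    "       SUM(CASE WHEN some_val NOT IN (1) THEN 1 ELSE 0 END)"
    " FROM yourTable"
).fetchone()

print(counts_null, ignores_null)
```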

Avoiding the null values to replace 0 values in report

I am using SQL Server 2005 with BOXIR2.
My question: in the universe table there is an EventCode column holding different types of codes, such as Enquiry, FollowUp, LostofSales, Contact, etc.
I created a measure whose formula in the object properties is count(Tablename.EventCode), then saved and exported it. When I use this EventCode measure in a Webi report, it shows values for each particular EventCode, but where the count is zero it shows a blank (null) instead, as in the example below.
I want to get zero values where it is currently blank (null).
count(Tablename.EventCode)
Enquiry,FollowUp,LostofSales,Contact
10 20 15
5 12 5
6 4 3
Can you please help me with a formula to get zero values instead of null?

I'm not sure exactly what you are asking, but I think you may be looking for ISNULL()
SELECT ISNULL(table_name.column_name, 0)
will return 0 if table_name.column_name is null

If you're getting NULLs when performing an aggregate, it's probably because one of the elements is NULL. All you need to do is coalesce those entries to a known value (such as zero).
SELECT COUNT(COALESCE(Tablename.EventCode, 0)) FROM Tablename
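To see what this expression changes, here is a small sketch on SQLite via Python's sqlite3 module, used as a stand-in for SQL Server. The rows are invented, and whether this resolves the blank cells in the report depends on how the universe measure is grouped, which the question does not show.

```python
import sqlite3

# SQLite stand-in; Tablename contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tablename (EventCode TEXT)")
conn.executemany(
    "INSERT INTO Tablename VALUES (?)", [("Enquiry",), (None,), ("FollowUp",)]
)

# COUNT(EventCode) skips the NULL row; after COALESCE no value is NULL,
# so every row is counted.
plain, coalesced = conn.execute(
    "SELECT COUNT(Tablename.EventCode),"
    "       COUNT(COALESCE(Tablename.EventCode, 0))"
    " FROM Tablename"
).fetchone()

print(plain, coalesced)
```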
