PostgreSQL - Removing NULL rows and columns from conditional aggregation results

I have a query on a multidimensional table that uses conditional aggregation:
select A,
SUM(case when D = 3 then D end) as SUM_D1,
SUM(case when D = 4 then D end) as SUM_D2
The result:
A SUM_D1 SUM_D2
-------------------
a1 100 NULL
a1 200 NULL
a3 NULL NULL
a4 NULL NULL
However, I would like to hide all NULL rows and columns as follows:
A SUM_D1
-----------
a1 100
a1 200
I have looked at similar questions, but none of them gives the answer I am looking for.
Any help is much appreciated.
Thank you

I think this does what you want:
select A,
coalesce(sum(case when D = 3 then D end),
sum(case when D = 4 then D end)
) as sum_d
from t
group by A
having sum(case when d in (3, 4) then 1 else 0 end) > 0;
Note that this returns only one column -- as in your example. If both "3" and "4" are in the data, then the value is for the "3"s.
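To see how the HAVING filter drops the groups that have no "3" or "4" rows, here is a minimal sketch. It runs the same query shape on SQLite through Python's sqlite3 module rather than on PostgreSQL, and the sample rows in t are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A TEXT, D INTEGER)")
# Hypothetical sample data: a3 has no 3/4 rows, a4 only has a NULL.
conn.executemany(
    "INSERT INTO t (A, D) VALUES (?, ?)",
    [("a1", 3), ("a1", 3), ("a2", 4), ("a3", 5), ("a4", None)],
)

query = """
SELECT A,
       COALESCE(SUM(CASE WHEN D = 3 THEN D END),
                SUM(CASE WHEN D = 4 THEN D END)) AS sum_d
FROM t
GROUP BY A
HAVING SUM(CASE WHEN D IN (3, 4) THEN 1 ELSE 0 END) > 0
ORDER BY A
"""
rows = conn.execute(query).fetchall()
print(rows)  # groups a3 and a4 are filtered out by HAVING
```

The groups a3 and a4 never reach the result because their count of matching rows is 0.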
If you want a query that returns a variable number of columns, then you need to use dynamic SQL -- or some other method. SQL queries return a fixed number of columns.
One method would be to return the values as an array:
select a,
array_agg(d order by d) as ds,
array_agg(sumd order by d) as sumds
from (select a, d, sum(d) as sumd
from t
where d in (3, 4)
group by a, d
) d
group by a;

To filter all-NULL rows you can wrap the query in a derived table and keep only the rows where at least one sum is not NULL:
select *
from
(
select A,
SUM(case when D = 3 then D end) as SUM_D1,
SUM(case when D = 4 then D end) as SUM_D2
...
) as dt
where SUM_D1 is not null
or SUM_D2 is not null
Of course, if you have simple conditions like the ones in your example, you are better off filtering before aggregation:
select A,
SUM(case when D = 3 then D end) as SUM_D1,
SUM(case when D = 4 then D end) as SUM_D2
...
where D in (3,4)
Now at least one calculation returns a value for every remaining group, so there is no need to check for all-NULL rows.
To filter all-NULL columns you need some dynamic SQL:
1. Materialize the data in a temporary table using Insert/Select.
2. Scan each column for all-NULL, e.g. select 1 from temp having count(SUM_D1) > 0
3. Dynamically create the Select list based on this.
4. Run the Select.
But why do you think you need this? It will be confusing for a user to run the same Stored Procedure and receive a different number of columns for each run.
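The dynamic SQL steps above could be sketched roughly as follows. This is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module instead of a PostgreSQL stored procedure, and the temporary table name agg and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A TEXT, D INTEGER)")
# Hypothetical sample data: no row has D = 4, so SUM_D2 is all-NULL.
conn.executemany("INSERT INTO t VALUES (?, ?)", [("a1", 3), ("a2", 3), ("a3", 5)])

# Step 1: materialize the aggregated result in a temporary table.
conn.execute("""
CREATE TEMP TABLE agg AS
SELECT A,
       SUM(CASE WHEN D = 3 THEN D END) AS SUM_D1,
       SUM(CASE WHEN D = 4 THEN D END) AS SUM_D2
FROM t
WHERE D IN (3, 4)
GROUP BY A
""")

# Step 2: COUNT(col) ignores NULLs, so a count of 0 means the column is all-NULL.
candidates = ["SUM_D1", "SUM_D2"]  # fixed names, never taken from user input
kept = [
    col for col in candidates
    if conn.execute(f"SELECT COUNT({col}) FROM agg").fetchone()[0] > 0
]

# Steps 3 and 4: build the select list from the surviving columns and run it.
select_list = ", ".join(["A"] + kept)
rows = conn.execute(f"SELECT {select_list} FROM agg ORDER BY A").fetchall()
print(select_list, rows)
```

The column names are interpolated into the SQL text, which is only safe here because they come from a fixed list in the code.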

I may have misinterpreted your question because the solution seems so simple:
select A,
SUM(case when D = 3 then D end) as SUM_D1,
SUM(case when D = 4 then D end) as SUM_D2
where D is not null
This is not what you want, is it? :-)

NULLs appear because rows that match none of the CASE conditions fall through to the implicit ELSE NULL, and a SUM over only NULLs is NULL.
select A,
SUM(case when D = 3 then D end) as SUM_D1,
SUM(case when D = 4 then D end) as SUM_D2
from
Table1
group by
A
having
SUM(case when D = 3 or D = 4 then D end) is not null
As the comment said, if you want to suppress the NULL rows, you can use HAVING with IS NOT NULL on the aggregate.

Related

Group By on Column A and calculate value count on column B in SQL

I want to know about a query where I can perform a group by on Column A and can calculate the count of the values in column B and create a new table from it.
Column A and B have limited types of values. (Categories)
Table:
A B
-----
a X
b Y
a X
a Z
b Z
a X
Result:
  X Y Z
---------
a 3 0 1
b 0 1 1
This is a pivot-type query: it converts row values into columns.
-- MySQL
SELECT A ""
, COUNT(CASE WHEN B = 'X' THEN 1 END) "X"
, COUNT(CASE WHEN B = 'Y' THEN 1 END) "Y"
, COUNT(CASE WHEN B = 'Z' THEN 1 END) "Z"
FROM test
GROUP BY A
See the fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=33f2f4bcf423dccf7c79d6a8b2d64197
In PostgreSQL you can use the FILTER clause instead:
-- PostgreSQL (v11)
SELECT A " "
, COUNT(B) FILTER (WHERE B = 'X') "X"
, COUNT(B) FILTER (WHERE B = 'Y') "Y"
, COUNT(B) FILTER (WHERE B = 'Z') "Z"
FROM test
GROUP BY A
See the fiddle: https://dbfiddle.uk/?rdbms=postgres_11&fiddle=fec763cb0e5fed99b96e055dd587a235
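As a quick check of the pivot, the COUNT(CASE ...) form can be run against the sample table from the question. This sketch uses SQLite through Python's sqlite3 module (an assumption, chosen only because it needs no server) and sticks to the CASE variant, which is portable across engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (A TEXT, B TEXT)")
# The six rows from the question's table.
conn.executemany(
    "INSERT INTO test (A, B) VALUES (?, ?)",
    [("a", "X"), ("b", "Y"), ("a", "X"), ("a", "Z"), ("b", "Z"), ("a", "X")],
)

# COUNT ignores NULLs, and CASE without ELSE yields NULL for non-matching rows.
pivot = conn.execute("""
SELECT A,
       COUNT(CASE WHEN B = 'X' THEN 1 END) AS X,
       COUNT(CASE WHEN B = 'Y' THEN 1 END) AS Y,
       COUNT(CASE WHEN B = 'Z' THEN 1 END) AS Z
FROM test
GROUP BY A
ORDER BY A
""").fetchall()
print(pivot)
```

Because COUNT returns 0 rather than NULL for an empty set, the missing combinations show up as 0, as in the expected result.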

select 2 columns same table with different where clause

I have data like
a-b where c = 1
a-b where c = 2
How can I select two columns from the same table with different WHERE clauses?
I have tried
select (select a-b from t where c = 1),
(select a-b from t where c = 2)
from t
Thank you
Your code should work, but you can use conditional aggregation:
select max(case when c = 1 then a - b end),
max(case when c = 2 then a - b end)
from t;
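A small sketch of the conditional aggregation above, run on SQLite through Python's sqlite3 module. The rows in t are invented for illustration; note that when several rows share the same c, MAX keeps only the largest difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
# Hypothetical rows: two with c = 1 (differences 6 and 3), one with c = 2 (difference 5).
conn.executemany(
    "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
    [(10, 4, 1), (8, 5, 1), (7, 2, 2)],
)

# One pass over the table; each MAX only sees the rows its CASE lets through.
result = conn.execute("""
SELECT MAX(CASE WHEN c = 1 THEN a - b END) AS diff_c1,
       MAX(CASE WHEN c = 2 THEN a - b END) AS diff_c2
FROM t
""").fetchone()
print(result)
```

Unlike the two scalar subqueries, this returns exactly one row regardless of how many rows t holds.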

invalid identifier : sum of multiple column in sql

I'm trying to calculate multiple columns in this query
SELECT
SUM (CASE WHEN B.ID = 1 THEN 1 END) AS OPD,
SUM (CASE WHEN B.ID = 2 THEN 1 END) AS IPD,
SUM (CASE WHEN B.ID = 3 THEN 1 END) AS DC,
SUM (CASE WHEN B.ID = 4 THEN 1 END) AS PROC,
SUM (CASE WHEN B.ID = 5 THEN 1 END) AS SUR,
(OPD + IPD + PROC) as Total
FROM REF_TB_APP_TRANSACTIONS A,
REF_VW_VISIT_TYPE B
WHERE A.REQ_VISIT_TYPE = B.ID
AND A.TO_EST_CODE = 20068;
but I got this error: "PROC": invalid identifier
You can't add the three SUMS in the Total column in the SELECT directly, since you're using the aliases of those columns. You could just do your Total column with another SUM CASE.
SELECT
SUM (CASE WHEN B.ID = 1 THEN 1 END) AS OPD,
SUM (CASE WHEN B.ID = 2 THEN 1 END) AS IPD,
SUM (CASE WHEN B.ID = 3 THEN 1 END) AS DC,
SUM (CASE WHEN B.ID = 4 THEN 1 END) AS [PROC],
SUM (CASE WHEN B.ID = 5 THEN 1 END) AS SUR,
SUM (CASE WHEN B.ID IN (1,2,4) THEN 1 END) AS Total
FROM REF_TB_APP_TRANSACTIONS A,
REF_VW_VISIT_TYPE B
WHERE A.REQ_VISIT_TYPE = B.ID
AND A.TO_EST_CODE = 20068;
This depends on the DBMS you are using. You can't sum aliased columns like that in the same SELECT; you have to use a subselect and do the sum from there. If you confirm your DBMS, we can tailor the query.
If it is MS SQL, the query below will work. A couple of things:
PROC is a reserved word, so either change the alias or put brackets around it (I went for brackets). Also, explicit JOINs are preferred over the comma-separated FROM list you used.
SELECT OPD, IPD, DC, [PROC], SUR, (OPD + IPD + [PROC]) as Total
FROM (
SELECT
SUM (CASE WHEN B.ID = 1 THEN 1 END) AS OPD,
SUM (CASE WHEN B.ID = 2 THEN 1 END) AS IPD,
SUM (CASE WHEN B.ID = 3 THEN 1 END) AS DC,
SUM (CASE WHEN B.ID = 4 THEN 1 END) AS [PROC],
SUM (CASE WHEN B.ID = 5 THEN 1 END) AS SUR
FROM REF_TB_APP_TRANSACTIONS A
INNER JOIN REF_VW_VISIT_TYPE B ON A.REQ_VISIT_TYPE = B.ID
WHERE A.TO_EST_CODE = 20068
) SUB
You can't reference the aliased columns as part of the select because in the order of query execution, they don't exist yet.
You simply wrap your query so it becomes a derived table and then you can refer to them in an outer select, see:
select OPD, IPD, DC, [PROC], SUR, OPD + IPD + [PROC] as Total from (
SELECT
SUM (CASE WHEN B.ID = 1 THEN 1 END) AS OPD,
SUM (CASE WHEN B.ID = 2 THEN 1 END) AS IPD,
SUM (CASE WHEN B.ID = 3 THEN 1 END) AS DC,
SUM (CASE WHEN B.ID = 4 THEN 1 END) AS [PROC],
SUM (CASE WHEN B.ID = 5 THEN 1 END) AS SUR
FROM REF_TB_APP_TRANSACTIONS A
join REF_VW_VISIT_TYPE B on B.ID=A.REQ_VISIT_TYPE
where A.TO_EST_CODE = 20068
)x
I'm guessing from the semicolon that this is SQL Server, in which case you will need [] around the reserved word PROC.
I've also properly joined your tables, as it's not 1989 any more :-0
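The derived-table approach can be tried out with a minimal sketch like the one below. It is only an illustration under assumptions: it runs on SQLite through Python's sqlite3 module, the tables transactions and visit_type are simplified stand-ins for the real ones, the rows are made up, and PROC is quoted with standard double quotes instead of brackets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-ins for the two tables in the question.
conn.executescript("""
CREATE TABLE visit_type (id INTEGER PRIMARY KEY);
CREATE TABLE transactions (req_visit_type INTEGER, to_est_code INTEGER);
INSERT INTO visit_type (id) VALUES (1), (2), (3), (4), (5);
INSERT INTO transactions (req_visit_type, to_est_code) VALUES
    (1, 20068), (1, 20068), (2, 20068), (3, 20068), (4, 20068), (1, 999);
""")

# The inner query defines the aliases; the outer query is allowed to reuse them.
totals = conn.execute("""
SELECT OPD, IPD, DC, "PROC", SUR, OPD + IPD + "PROC" AS Total
FROM (
    SELECT SUM(CASE WHEN B.id = 1 THEN 1 END) AS OPD,
           SUM(CASE WHEN B.id = 2 THEN 1 END) AS IPD,
           SUM(CASE WHEN B.id = 3 THEN 1 END) AS DC,
           SUM(CASE WHEN B.id = 4 THEN 1 END) AS "PROC",
           SUM(CASE WHEN B.id = 5 THEN 1 END) AS SUR
    FROM transactions A
    JOIN visit_type B ON B.id = A.req_visit_type
    WHERE A.to_est_code = 20068
) sub
""").fetchone()
print(totals)
```

SUR comes back as NULL here because no row has visit type 5. If one of the three added columns were NULL, the Total would be NULL as well, so wrapping each term in COALESCE(..., 0) may be worth considering.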

Easiest way to select distinct with least number of null

I want to create a view over a table that has 500k rows and 10 columns. The table contains duplicate ids with different amounts of information, because some of the columns are NULL. My objective is to keep one row per id in case of duplicates, namely the one with the fewest NULL values.
Let me explain it with a quick example. I am working with a query similar to this.
CREATE TABLE test (ID INT, b char(1), c char (1), d char(1))
INSERT INTO test(ID,b,c,d) VALUES
(1,NULL,NULL,NULL),
(1,'B', NULL,NULL),
(1,'B','C',NULL),
(1,'B','C','D'),
(2,'E','F',NULL),
(2,'E',NULL,NULL),
(3,NULL,NULL,NULL),
(3,'G',NULL,NULL)
SELECT DISTINCT ID,b,c,d FROM test
DROP TABLE test
The result is
ID b c d
--------------------
1 NULL NULL NULL
1 B NULL NULL
1 B C NULL
1 B C D
2 E F NULL
2 E NULL NULL
3 NULL NULL NULL
3 G NULL NULL
However, the output I want to see is
ID b c d
--------------------
1 B C D
2 E F NULL
3 G NULL NULL
So, based on the id and if there are duplicates, I want to have the row with the least number of nulls. How is it possible?
Thank you very much
If you want the row with the least number of NULLs, then you would basically count them and sort ascending:
select t.*
from test t
order by ( (case when b is null then 1 else 0 end) +
(case when c is null then 1 else 0 end) +
(case when d is null then 1 else 0 end)
) asc
fetch first 1 row only;
However, if you want one row per id with a non-NULL value in each column (if available) then @maSTAShuFu's answer is appropriate.
EDIT:
If you want one row per id, then simply use row_number():
select t.*
from (select t.*,
row_number() over (partition by id
order by ( (case when b is null then 1 else 0 end) +
(case when c is null then 1 else 0 end) +
(case when d is null then 1 else 0 end)
) asc
) as seqnum
from test t
) t
where seqnum = 1;
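Here is a sketch of the row_number() approach on the sample data from the question, ranking each id's rows by NULL count in ascending order so that the most complete row gets rank 1. It assumes SQLite (3.25 or newer, for window functions) through Python's sqlite3 module; the derived-table alias ranked is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (ID INTEGER, b TEXT, c TEXT, d TEXT)")
# The eight rows from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO test (ID, b, c, d) VALUES (?, ?, ?, ?)",
    [
        (1, None, None, None), (1, "B", None, None),
        (1, "B", "C", None), (1, "B", "C", "D"),
        (2, "E", "F", None), (2, "E", None, None),
        (3, None, None, None), (3, "G", None, None),
    ],
)

# Rank rows within each ID by how many of b, c, d are NULL (fewest first).
best = conn.execute("""
SELECT ID, b, c, d
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY ID
               ORDER BY (CASE WHEN b IS NULL THEN 1 ELSE 0 END) +
                        (CASE WHEN c IS NULL THEN 1 ELSE 0 END) +
                        (CASE WHEN d IS NULL THEN 1 ELSE 0 END) ASC
           ) AS seqnum
    FROM test t
) ranked
WHERE seqnum = 1
ORDER BY ID
""").fetchall()
print(best)
```

If two rows of the same id tie on NULL count, which one wins is not defined unless a tie-breaker is added to the ORDER BY.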
Using MAX, which ignores NULLs, grouped by ID:
SELECT
ID,
MAX(B) B,
MAX(C) C,
MAX(D) D
FROM test
GROUP BY ID
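A minimal sketch of the MAX-per-column idea, grouped by ID, on the question's sample data. It assumes SQLite through Python's sqlite3 module. Keep in mind that MAX picks each column independently, so the output row may combine values from different source rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (ID INTEGER, b TEXT, c TEXT, d TEXT)")
# The eight rows from the question; None is stored as NULL.
conn.executemany(
    "INSERT INTO test (ID, b, c, d) VALUES (?, ?, ?, ?)",
    [
        (1, None, None, None), (1, "B", None, None),
        (1, "B", "C", None), (1, "B", "C", "D"),
        (2, "E", "F", None), (2, "E", None, None),
        (3, None, None, None), (3, "G", None, None),
    ],
)

# MAX skips NULLs, so each column keeps a non-NULL value when one exists.
merged = conn.execute("""
SELECT ID, MAX(b) AS b, MAX(c) AS c, MAX(d) AS d
FROM test
GROUP BY ID
ORDER BY ID
""").fetchall()
print(merged)
```

For this data the result matches the desired output, because within each id the non-NULL values never conflict.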

Sql - conditional subqueries

Say I have a table with three columns: a, b and c.
I want to query this table with three values, in the following way:
1. Select all rows where a = value 1.
2. If there are fewer than 10 such rows, return those; otherwise select all rows within those where b = value 2.
3. Repeat for c and value 3.
Is it possible to do this within a single query?
Try this (assuming SQL Server):
SELECT TOP 10 *
FROM YourTable X
ORDER BY (CASE WHEN X.a = value1 THEN 0 ELSE 1 END),
(CASE WHEN X.b = value2 THEN 0 ELSE 1 END),
(CASE WHEN X.c = value3 THEN 0 ELSE 1 END)
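To get a feel for what this ordering does, here is a rough sketch on SQLite through Python's sqlite3 module, with LIMIT 10 standing in for TOP 10. The table contents and the three values are hypothetical. The query ranks rows by how many leading conditions they satisfy and keeps the first ten, so it prioritizes matches rather than strictly filtering them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (a INTEGER, b INTEGER, c INTEGER)")
# Hypothetical data: 12 rows match a, 4 of those match b, 2 of those match c.
data = [(1, 2, 3)] * 2 + [(1, 2, 0)] * 2 + [(1, 0, 0)] * 8 + [(9, 2, 3)] * 3
conn.executemany("INSERT INTO YourTable (a, b, c) VALUES (?, ?, ?)", data)

value1, value2, value3 = 1, 2, 3  # the three search values (assumed)

# Rows matching a sort first, then those also matching b, then c.
top = conn.execute("""
SELECT a, b, c
FROM YourTable X
ORDER BY (CASE WHEN X.a = ? THEN 0 ELSE 1 END),
         (CASE WHEN X.b = ? THEN 0 ELSE 1 END),
         (CASE WHEN X.c = ? THEN 0 ELSE 1 END)
LIMIT 10
""", (value1, value2, value3)).fetchall()
print(top)
```

If fewer than ten rows match a, this form also returns non-matching rows to fill the limit; adding WHERE X.a = value1 would avoid that.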