Count per category - sql

I have a table as below:
COL1 | COL2 | COL3
1 1 1
1 1 2
1 2 0
1 2 1
2 3 1
2 3 2
2 4 0
2 4 1
3 1 0
3 2 0
.
.
.
I want to select the COL1 values for which every COL2 has sum(COL3) > 0. If I am sure there are 20 distinct values in COL2, how can I pull all COL1 values that have all 20 COL2 values filled with COL3 > 0? The end result should be:
COL1 | COL2 | COL3
1 1 3
1 2 1
2 3 3
2 4 1
I have tried a lot of ways to do this but no success.

Just use group by and having.
select col1,col2,sum(col3)
from tbl
group by col1,col2
having sum(col3)>0

select t1.*
from yourTable t1
inner join
(
select t.col1
from
(
select col1, col2, sum(col3) as col_sum
from yourTable
group by col1, col2
) t
group by t.col1
having sum(case when t.col_sum = 0 then 1 else 0 end) = 0
) t2
on t1.col1 = t2.col1

I use a CTE and a Group by with a where condition
;WITH CTE as (
select COL1,COL2,SUM(COL3) as COL3 FROM table1
Group By
COL1,COL2
)
select * from CTE
where COL3>0

Just group by col1 and col2 and check whether the sum is greater than 0:
select col1,col2,sum(col3)
from tbl
group by col1,col2
having sum(col3)>0
http://sqlfiddle.com/#!9/537f8c/1

See if the below gives you the result that you are after. It selects col1, col2 and the sum of col3 from a derived table that excludes the rows where col3 is 0:
select col1, col2, sum(col3)
from
(
select col1, col2, col3 from tbl where col3 <> 0
) as ds
group by col1, col2
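Grouping with HAVING alone keeps any (COL1, COL2) pair whose sum is positive, even when another COL2 of the same COL1 sums to 0. Below is a minimal sketch of the stricter reading of the question, run through Python's sqlite3 module so it can be tried locally. It assumes the table is named tbl, and REQUIRED_COL2_COUNT is a hypothetical parameter (20 in the question, 2 for the sample rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 2), (1, 2, 0), (1, 2, 1),
     (2, 3, 1), (2, 3, 2), (2, 4, 0), (2, 4, 1),
     (3, 1, 0), (3, 2, 0)],
)

# Hypothetical parameter: how many COL2 groups a COL1 must have.
# The question says 20; the sample rows only have 2 per COL1.
REQUIRED_COL2_COUNT = 2

query = """
WITH sums AS (              -- one row per (col1, col2) with its total
    SELECT col1, col2, SUM(col3) AS col3_sum
    FROM tbl
    GROUP BY col1, col2
),
good AS (                   -- col1 values where every group is positive
    SELECT col1
    FROM sums
    GROUP BY col1
    HAVING MIN(col3_sum) > 0 AND COUNT(*) = ?
)
SELECT s.col1, s.col2, s.col3_sum
FROM sums s
JOIN good g ON g.col1 = s.col1
ORDER BY s.col1, s.col2
"""
result = conn.execute(query, (REQUIRED_COL2_COUNT,)).fetchall()
print(result)
```

MIN(col3_sum) > 0 rejects a COL1 as soon as one of its COL2 groups sums to 0 or less, and the COUNT(*) check enforces the "all N values present" part.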

Related

SQL Postgres union data with missed values

I have two results of queries:
id | col1 | col2 | col3
1 1 null 3j
2 2 12 35
3 null 32 31
4 null 43 33
5 null 44 4
id | col1 | col2 | col3
6 1 null 3j
7 2 null 35
8 3 null 31
9 4 null 33
10 5 null null
I need to do union:
id | col1 | col2 | col3
6 1 null 3j
7 2 12 35
8 3 32 31
9 4 43 33
10 5 null null
5 null 44 4
The problem is that some values are missing.
I wrote this big SQL query to solve the problem:
select *
from (
select max(id) as id,
max(col1) as col1,
max(col2) as col2,
max(col3) as col3
from (
select max(id) as id,
max(col1) as col1,
max(col2) as col2,
max(col3) as col3
from (
select max(id) as id,
max(col1) as col1,
max(col2) as col2,
max(col3) as col3
from (
select *
from t1
where id = 1
union
select *
from t2
where id = 2
) t
group by case
when col1 is null
or
length(col1) =
0 then id
else col1 end
) t1
group by case
when col2 is null
or length(col2) = 0
then id
else col2 end
) t2
group by case
when col3 is null
or length(col3) = 0 then id
else col3 end
) t3
Are there any ideas for simplifying it? Or are there other approaches to enrich the data efficiently? I also need to do intersection and right, left and inner unions, and I don't want to build such monster queries.
Well, you can try something like this:
union
select max(col1),
max(col2),
max(col3)
from t1
where id = 1
or id = 2
group by coalesce(nullif(col1, ''),
nullif(col2, ''),
nullif(col3, ''));
upd:
outer union
select max(col1),
max(col2),
max(col3)
from t1
where id = 1
or id = 2
group by coalesce(nullif(col1, ''),
nullif(col2, ''),
nullif(col3, ''))
having count(*) = 1;
inner union
select max(col1),
max(col2),
max(col3)
from t1
where id = 1
or id = 2
group by coalesce(nullif(col1, ''),
nullif(col2, ''),
nullif(col3, ''))
having count(*) > 1;
The left and right variants are the outer union intersected with the common query, using a 'where' condition.
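For comparison, here is a sketch of a different technique from the grouping above: a join-based merge. It assumes col3 is the matching key between the two result sets (that assumption happens to reproduce the desired output for the sample rows, but it is not stated in the question). The table names t1 and t2 and the TEXT column types are also assumptions. PostgreSQL supports FULL OUTER JOIN directly; the sketch runs on Python's sqlite3 module and emulates it with a LEFT JOIN plus the unmatched rows from the other side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("t1", "t2"):
    conn.execute(
        f"CREATE TABLE {name} (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT)"
    )
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?)", [
    (1, "1", None, "3j"), (2, "2", "12", "35"), (3, None, "32", "31"),
    (4, None, "43", "33"), (5, None, "44", "4"),
])
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?, ?)", [
    (6, "1", None, "3j"), (7, "2", None, "35"), (8, "3", None, "31"),
    (9, "4", None, "33"), (10, "5", None, None),
])

query = """
-- rows of t1, enriched with the matching t2 row when there is one
SELECT COALESCE(b.id, a.id)     AS id,
       COALESCE(b.col1, a.col1) AS col1,
       COALESCE(b.col2, a.col2) AS col2,
       COALESCE(b.col3, a.col3) AS col3
FROM t1 a
LEFT JOIN t2 b ON b.col3 = a.col3
UNION ALL
-- rows of t2 that matched nothing in t1
SELECT b.id, b.col1, b.col2, b.col3
FROM t2 b
WHERE NOT EXISTS (SELECT 1 FROM t1 a WHERE a.col3 = b.col3)
"""
# ids are unique, so sorting the tuples never compares the NULL columns
merged = sorted(conn.execute(query).fetchall())
for row in merged:
    print(row)
```

Dropping the second branch gives the left variant, and adding `WHERE b.id IS NOT NULL` to the first branch alone gives the inner one.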

How to create a pivot table where columns and rows are the same in Snowflake SQL?

I have a table like
col1 | col2 | col3 | col4 | col5
id1 | 1 0 0 1 0
id2 | 1 1 0 0 0
id3 | 0 1 0 1 0
id4 | 0 0 1 0 1
id5 | 1 0 1 0 0
id6 | 0 0 0 1 0
.
.
.
idN
How would I create a query such that I get a table like
col1 | col2 | col3 | col4 | col5
col1 | 3 1 1 1 0
col2 | 1 2 0 1 0
col3 | 1 1 2 0 1
col4 | 1 1 1 2 0
col5 | 0 0 1 0 1
where each entry in the result is the number of times that some value of 1 in one column occurred with another column that had a value of 1?
I can get the diagonal values by doing the following:
SELECT
sum(col1), sum(col2), sum(col3), sum(col4), sum(col5)
FROM (
SELECT
col1, col2, col3, col4, col5, (col1 + col2 + col3 + col4 + col5) AS total
FROM (
SELECT
ROW_NUMBER()OVER(PARTITION BY id ORDER BY date) row_num, *
FROM (
SELECT DISTINCT(id), date, col1, col2, col3, col4, col5
FROM db.schema.table)
)
WHERE row_num = 1 AND total <= 1
ORDER BY total DESC);
I assume that I have to do some kind of pivot or various union all's but I can't seem to figure it out.
I think I would approach this by unpivoting the data and re-aggregating. The following gets the pairs and counts:
with u as (
select t.id, v.col
from t cross join lateral
(values ('col1', col1),
('col2', col2),
('col3', col3),
('col4', col4),
('col5', col5)
) v(col, val)
where val = 1
)
select u1.col, u2.col, count(*)
from u u1 join
u u2
on u1.id = u2.id
group by u1.col, u2.col;
This seems good enough for me, but you can use conditional aggregation:
select u1.col,
sum(case when u2.col = 'col1' then 1 else 0 end) as col1,
sum(case when u2.col = 'col2' then 1 else 0 end) as col2,
sum(case when u2.col = 'col3' then 1 else 0 end) as col3,
sum(case when u2.col = 'col4' then 1 else 0 end) as col4,
sum(case when u2.col = 'col5' then 1 else 0 end) as col5
from u u1 join
u u2
on u1.id = u2.id
group by u1.col;
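The unpivot-and-self-join idea can be tried on the six sample rows from the question. This is only a sketch on Python's sqlite3 module, not Snowflake: it assumes a table named t with an id column, and it unpivots with UNION ALL instead of a lateral VALUES clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id TEXT, col1 INT, col2 INT, col3 INT, col4 INT, col5 INT)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)", [
    ("id1", 1, 0, 0, 1, 0), ("id2", 1, 1, 0, 0, 0), ("id3", 0, 1, 0, 1, 0),
    ("id4", 0, 0, 1, 0, 1), ("id5", 1, 0, 1, 0, 0), ("id6", 0, 0, 0, 1, 0),
])

query = """
WITH u AS (   -- unpivot: one row per (id, column name) where the flag is 1
    SELECT id, 'col1' AS col FROM t WHERE col1 = 1
    UNION ALL SELECT id, 'col2' FROM t WHERE col2 = 1
    UNION ALL SELECT id, 'col3' FROM t WHERE col3 = 1
    UNION ALL SELECT id, 'col4' FROM t WHERE col4 = 1
    UNION ALL SELECT id, 'col5' FROM t WHERE col5 = 1
)
SELECT u1.col,
       SUM(CASE WHEN u2.col = 'col1' THEN 1 ELSE 0 END) AS col1,
       SUM(CASE WHEN u2.col = 'col2' THEN 1 ELSE 0 END) AS col2,
       SUM(CASE WHEN u2.col = 'col3' THEN 1 ELSE 0 END) AS col3,
       SUM(CASE WHEN u2.col = 'col4' THEN 1 ELSE 0 END) AS col4,
       SUM(CASE WHEN u2.col = 'col5' THEN 1 ELSE 0 END) AS col5
FROM u u1
JOIN u u2 ON u1.id = u2.id   -- pairs of flagged columns on the same row
GROUP BY u1.col
ORDER BY u1.col
"""
# Map each row label to its tuple of co-occurrence counts.
matrix = {row[0]: row[1:] for row in conn.execute(query)}
for name, counts in matrix.items():
    print(name, counts)
```

Because every pair is counted in both directions, the resulting matrix is symmetric, and the diagonal holds the plain per-column counts of 1s.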
Here is one approach that showcases one of Snowflake's powerful semi-structured functions (namely, OBJECT_CONSTRUCT(*)) and also exploits two meta-attributes (SEQ and KEY) that are returned by the FLATTEN function so that there is no need for a unique business key on the original (source) table:
WITH CTE_ROW AS (
SELECT OBJECT_CONSTRUCT(*) AS COL_DICT
FROM T
)
,CTE_ROW_COL AS (
SELECT F.SEQ - 1 AS ROW_OFFSET
,F.KEY AS COL_NAME
,COL_DICT[F.KEY]::INTEGER AS VAL
FROM CTE_ROW R
,LATERAL FLATTEN(R.COL_DICT) F
)
,CTE_CALC AS (
SELECT RC1.COL_NAME AS COL_NAME_1
,RC2.COL_NAME AS COL_NAME_2
,COUNT(*) AS COUNT_VAL
FROM CTE_ROW_COL RC1
JOIN CTE_ROW_COL RC2
ON RC2.ROW_OFFSET = RC1.ROW_OFFSET
AND RC2.VAL = 1
WHERE RC1.VAL = 1
GROUP BY RC1.COL_NAME
,RC2.COL_NAME
)
SELECT COL_NAME_1 AS COL_NAME
,SUM(IFF(COL_NAME_2='COL1', COUNT_VAL, 0)) AS COL1
,SUM(IFF(COL_NAME_2='COL2', COUNT_VAL, 0)) AS COL2
,SUM(IFF(COL_NAME_2='COL3', COUNT_VAL, 0)) AS COL3
,SUM(IFF(COL_NAME_2='COL4', COUNT_VAL, 0)) AS COL4
,SUM(IFF(COL_NAME_2='COL5', COUNT_VAL, 0)) AS COL5
FROM CTE_CALC
GROUP BY COL_NAME_1
ORDER BY COL_NAME_1
;

Can I change column order in SQL table based on a value that appears in different columns?

I have a table that looks like this:
Column1 | Column2 | Column3| Column4
4 | 3 | 2 | 1
2 | 1
3 | 2 | 1
I want to flip the columns so that 1 always starts in column 1 and the rest of the values follow to the right. Like this:
Column1 | Column2 | Column3 | Column4
1 | 2 | 3 | 4
1 | 2
1 | 2 | 3
This is an example table. The real table is a hierarchy of a company so 1 = CEO and 2 = SVP for example. 1 is always the same name but as the number gets higher (lower in chain of command) the more names that are in that level. I'm hoping for an automated solution that looks for 1, makes that the first column and then populates the columns. I am struggling because the value that 1 represents is in different columns so I can't just change the order of the columns.
I was able to accomplish this using VBA but I would prefer to keep it in SQL.
I don't have any useful code that I have tried so far.
You can use a CASE expression:
WITH CTE1 AS
(SELECT 4 AS COL1, 3 AS COL2 , 2 AS COL3, 1 AS COL4 FROM DUAL
UNION ALL
SELECT 2, 1, NULL, NULL FROM DUAL
UNION ALL
SELECT 3, 2, 1, NULL FROM DUAL
)
SELECT CASE WHEN COL1 <> 1 THEN 1 ELSE COL1 END AS COL1,
CASE WHEN COL2 <> 2 THEN 2 ELSE COL2 END AS COL2,
CASE WHEN COL3 <> 3 THEN 3 ELSE COL3 END AS COL3,
CASE WHEN COL4 <> 4 THEN 4 ELSE COL4 END AS COL4
FROM CTE1;
You can apply some CASE expressions that check all the possibilities. This assumes NULLs for missing data:
COALESCE(col4,col3,col2,col1) AS c1,
CASE
WHEN col4 IS NOT NULL THEN col3
WHEN col3 IS NOT NULL THEN col2
WHEN col2 IS NOT NULL THEN col1
END AS c2,
CASE
WHEN col4 IS NOT NULL THEN col2
WHEN col3 IS NOT NULL THEN col1
END AS c3,
CASE
WHEN col4 IS NOT NULL THEN col1
END AS c4
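To see what these expressions do, here is a small sketch that runs them on the three sample rows using Python's sqlite3 module. The table name hierarchy is an assumption; the question does not name its table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hierarchy (col1 INT, col2 INT, col3 INT, col4 INT)")
conn.executemany("INSERT INTO hierarchy VALUES (?, ?, ?, ?)", [
    (4, 3, 2, 1),
    (2, 1, None, None),
    (3, 2, 1, None),
])

query = """
SELECT COALESCE(col4, col3, col2, col1) AS c1,   -- last filled column
       CASE WHEN col4 IS NOT NULL THEN col3
            WHEN col3 IS NOT NULL THEN col2
            WHEN col2 IS NOT NULL THEN col1 END AS c2,
       CASE WHEN col4 IS NOT NULL THEN col2
            WHEN col3 IS NOT NULL THEN col1 END AS c3,
       CASE WHEN col4 IS NOT NULL THEN col1 END AS c4
FROM hierarchy
ORDER BY rowid                                   -- keep insertion order
"""
flipped = conn.execute(query).fetchall()
for row in flipped:
    print(row)
```

The expressions reverse the filled prefix of each row, so they rely on the values being contiguous from column 1 with NULLs only at the end.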
You want to sort the values. A generic SQL solution would use:
select max(case when seqnum = 1 then col end) as col1,
max(case when seqnum = 2 then col end) as col2,
max(case when seqnum = 3 then col end) as col3,
max(case when seqnum = 4 then col end) as col4
from (select col1, col2, col3, col4, col,
row_number() over (partition by col1, col2, col3, col4 order by col) as seqnum
from ((select col1 as col, 1 as which, col1, col2, col3, col4 from t) union all
(select col2 as col, 2 as which, col1, col2, col3, col4 from t) union all
(select col3 as col, 3 as which, col1, col2, col3, col4 from t) union all
(select col4 as col, 4 as which, col1, col2, col3, col4 from t)
) t
where col is not null
) t
group by col1, col2, col3, col4;
This would be simpler in a database that supports lateral joins. And a unique id on each row would also help.

selection based on certain condition

select col1, col2, col3 from tab1
rownum col1 col2 col3
1 1 10 A
2 1 15 B
3 1 0 A
4 1 0 C
5 2 0 B
6 3 20 C
7 3 0 D
8 4 10 B
9 5 0 A
10 5 0 B
Output required is
col1 col2 col3
1 10 A
1 15 B
2 0 B
3 20 C
4 10 B
5 0 A
5 0 B
col1 and col2 are my lookup/joining columns. If col2 has "non-zero" data for a col1 value, then I need to filter out the records with 0 (in the above example, the records with rownum 3, 4 and 7). If col2 has no "non-zero" data, only then should the records with 0 be selected (in the above example, col1 with values 2 and 5).
I am trying to write SQL for this. I hope I have stated the requirement clearly; please let me know if you need anything more from my side. I seem to have gone blank on this one.
Database - Oracle 10g
SELECT col1,
col2,
col3
FROM (SELECT col1,
col2,
col3,
sum(col2) OVER (PARTITION BY col1) sum_col2
FROM tab1)
WHERE ( ( sum_col2 <> 0
AND col2 <> 0)
OR sum_col2 = 0)
If col2 can be negative and the requirement is that the sum of col2 is "non-zero", then the above is OK. However, if the requirement is that any col2 value is "non-zero", then it should be changed to:
SELECT col1,
col2,
col3
FROM (SELECT col1,
col2,
col3,
sum(abs(col2)) OVER (PARTITION BY col1) sum_col2
FROM tab1)
WHERE ( ( sum_col2 <> 0
AND col2 <> 0)
OR sum_col2 = 0)
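The difference between the two variants only shows up when negative values cancel out. The sketch below uses Python's sqlite3 module rather than Oracle and adds a hypothetical group col1 = 6 with col2 values 5, -5 and 0, which is not in the original data, to illustrate it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (col1 INT, col2 INT, col3 TEXT)")
conn.executemany("INSERT INTO tab1 VALUES (?, ?, ?)", [
    (1, 10, "A"), (1, 15, "B"), (1, 0, "A"), (1, 0, "C"), (2, 0, "B"),
    (3, 20, "C"), (3, 0, "D"), (4, 10, "B"), (5, 0, "A"), (5, 0, "B"),
    # hypothetical extra group whose values cancel out to a sum of 0
    (6, 5, "A"), (6, -5, "B"), (6, 0, "C"),
])

def select_rows(agg_expr):
    """Run the filter with the given aggregate over each col1 partition."""
    sql = f"""
    SELECT col1, col2, col3
    FROM (SELECT col1, col2, col3,
                 {agg_expr} OVER (PARTITION BY col1) AS sum_col2
          FROM tab1)
    WHERE (sum_col2 <> 0 AND col2 <> 0) OR sum_col2 = 0
    ORDER BY col1, col3
    """
    return conn.execute(sql).fetchall()

plain = select_rows("SUM(col2)")          # first variant
with_abs = select_rows("SUM(ABS(col2))")  # second variant

print([r for r in plain if r[0] == 6])     # the 0 row is kept
print([r for r in with_abs if r[0] == 6])  # the 0 row is filtered out
```

For groups 1 to 5 both variants return the rows listed in the question's required output; only the cancelling group behaves differently.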
SELECT t1.*
FROM tab1 t1
JOIN (SELECT "col1", MAX("col2") AS max2
FROM tab1
GROUP BY "col1") t2
ON t1."col1" = t2."col1"
WHERE ((max2 = 0 AND "col2" = 0)
OR
(max2 != 0 AND "col2" != 0))
ORDER BY "rownum"

how to get the maximum occurrence value from a table for a combination?

I have the following table:
column 1 column 2 column 3
1 2 X
1 2 X
1 2 Y
1 3 Z
1 3 X
I need to write an SQL query to get the output as:
1 2 X (because X has the maximum occurrence)
1 3 Z or X (because Z and X occur the same number of times)
How do I do this?
I think I have a solution for you. Try this script, which uses the functions RANK(), ROW_NUMBER() and DENSE_RANK(); choose the function that fits your needs:
with temp as (
select 1 as col1, 2 AS col2, 'X' as col3 union all
select 1 as col1, 2 AS col2, 'Y' as col3 union all
select 1 as col1, 2 AS col2, 'X' as col3 union all
select 1 as col1, 3 AS col2, 'Z' as col3 union all
select 1 as col1, 3 AS col2, 'T' as col3 union all
select 1 as col1, 3 AS col2, 'Y' as col3 union all
select 1 as col1, 3 AS col2, 'Y' as col3 union all
select 1 as col1, 4 AS col2, 'Y' as col3 union all
select 1 as col1, 4 AS col2, 'W' as col3)
,temp2 AS (
select
col1
,col2
,col3
,COUNT(1) nb_occurence
,RANK() OVER(PARTITION BY col1,col2 ORDER BY COUNT(1) DESC) Ordre_RANK
,ROW_NUMBER() OVER(PARTITION BY col1,col2 ORDER BY COUNT(1) DESC) Ordre_ROW_NUMBER
,DENSE_RANK() OVER(PARTITION BY col1,col2 ORDER BY COUNT(1) DESC) Ordre_DENSE_RANK
from temp
GROUP BY
col1
,col2
,col3 )
SELECT *
FROM temp2
--WHERE Ordre_RANK = 1
--WHERE Ordre_ROW_NUMBER = 1
--WHERE Ordre_DENSE_RANK = 1
I hope this will help you.
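As a rough illustration of how the choice of function matters for ties, the sketch below runs on the five rows from the question using Python's sqlite3 module. The table name occurrences and the tie-breaker on col3 are assumptions. The counting is done in a separate CTE before the window functions are applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occurrences (col1 INT, col2 INT, col3 TEXT)")
conn.executemany("INSERT INTO occurrences VALUES (?, ?, ?)", [
    (1, 2, "X"), (1, 2, "X"), (1, 2, "Y"), (1, 3, "Z"), (1, 3, "X"),
])

query = """
WITH counts AS (
    SELECT col1, col2, col3, COUNT(*) AS nb
    FROM occurrences
    GROUP BY col1, col2, col3
),
ranked AS (
    SELECT col1, col2, col3, nb,
           -- RANK gives tied values the same rank
           RANK() OVER (PARTITION BY col1, col2 ORDER BY nb DESC) AS rnk,
           -- ROW_NUMBER picks exactly one; col3 is an assumed tie-breaker
           ROW_NUMBER() OVER (PARTITION BY col1, col2
                              ORDER BY nb DESC, col3) AS rn
    FROM counts
)
SELECT col1, col2, col3, nb, rnk, rn
FROM ranked
ORDER BY col1, col2, col3
"""
rows = conn.execute(query).fetchall()

with_ties = [r[:3] for r in rows if r[4] == 1]      # keeps both Z and X
one_per_group = [r[:3] for r in rows if r[5] == 1]  # exactly one per group
print(with_ties)
print(one_per_group)
```

Filtering on the RANK column returns every tied value, while filtering on the ROW_NUMBER column returns a single row per (col1, col2) combination, which matches the "Z or X" wording in the question.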