Can I modify below GroupBy clause into better one

Consider the following cases:

CASE 1: tbl_a
---------------
| colA | colB |
---------------
| 1    | 0    |
---------------
Expected output: 1 0

CASE 2: tbl_a
---------------
| colA | colB |
---------------
| 1    | 1    |
---------------
Expected output: 1 1

CASE 3: tbl_a
---------------
| colA | colB |
---------------
| 1    | 0    |
| 1    | 1    |
---------------
Expected output: 1 1

CASE 4: tbl_a
---------------
| colA | colB |
---------------
| null | null |
---------------
Expected output: NULL NULL

The requirement is simple: if there is a record where colA = 1 and colB = 1, return it; if no such record exists, return the existing record for colA = 1.
I have tried various ways and came up with a GROUP BY clause, but is there a simpler way to do it?
If I filter on colA = 1 AND colB = 1, it fails for case 1 because it returns no rows.
SELECT colA, MAX(colB) FROM tbl_a GROUP BY colA
Is this a valid query? Any help is greatly appreciated.

Please try the following. It provides the desired results for the posted cases.
SELECT TOP 1 colA, colB
FROM tbl_a
WHERE colA = 1 OR colA IS NULL
ORDER BY colA DESC, colB DESC;
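
To check this answer against the four posted cases, a small harness helps. This is only a sketch: it assumes SQLite through Python's sqlite3 module rather than SQL Server, so LIMIT 1 stands in for TOP 1, and the helper name pick_row is made up for illustration.

```python
import sqlite3

# SQLite stand-in for the T-SQL above: LIMIT 1 replaces TOP 1.
# Both engines sort NULL below any number, so DESC puts colA = 1 rows first.
TOP_ONE_QUERY = """
SELECT colA, colB
FROM tbl_a
WHERE colA = 1 OR colA IS NULL
ORDER BY colA DESC, colB DESC
LIMIT 1
"""


def pick_row(rows):
    """Load rows into a fresh in-memory tbl_a and return the single chosen row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl_a (colA INTEGER, colB INTEGER)")
    con.executemany("INSERT INTO tbl_a VALUES (?, ?)", rows)
    chosen = con.execute(TOP_ONE_QUERY).fetchone()
    con.close()
    return chosen


# The four cases from the question.
cases = {
    "case 1": [(1, 0)],
    "case 2": [(1, 1)],
    "case 3": [(1, 0), (1, 1)],
    "case 4": [(None, None)],
}
for name, rows in cases.items():
    print(name, pick_row(rows))
```

The sort does the prioritising: rows with colA = 1 come before the NULL row, and within them colB = 1 comes before colB = 0.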

A different approach uses row_number with an ordering based on your priority, then keeps the row with the minimum row number:
select colA, colB
from (select x.*, min(rn) over () as minrn
      from (select t.*,
                   row_number() over (order by case when colA = 1 and colB = 1 then 1
                                                    when colA = 1 then 2
                                                    else 3 end) as rn
            from t
           ) x
     ) y
where rn = minrn
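
The same priority idea can be tried out in a few lines. This sketch assumes SQLite 3.25 or later (the first release with window functions) driven from Python's sqlite3 module; the helper name best_row is hypothetical. Because row_number always starts at 1, the sketch filters on rn = 1 directly instead of computing min(rn).

```python
import sqlite3

# Rank rows by priority: (1, 1) first, then any colA = 1 row, then the rest.
PRIORITY_QUERY = """
SELECT colA, colB
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               ORDER BY CASE WHEN colA = 1 AND colB = 1 THEN 1
                             WHEN colA = 1 THEN 2
                             ELSE 3 END
           ) AS rn
    FROM t
) AS ranked
WHERE rn = 1
"""


def best_row(rows):
    """Load rows into a fresh in-memory table t and return the top-priority row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (colA INTEGER, colB INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    chosen = con.execute(PRIORITY_QUERY).fetchone()
    con.close()
    return chosen


for rows in ([(1, 0)], [(1, 1)], [(1, 0), (1, 1)], [(None, None)]):
    print(rows, "->", best_row(rows))
```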

I think the query that you wrote fits the data too well, but doesn't work for the general case which you described.
Try the script below and play with the value of the @scenario variable to see what it returns for different data.
Use / adapt the last query for your table structure.
declare @tbl table (colA int, colB int)
declare @scenario char(1) = 'D'
if @scenario = 'A'
insert @tbl values (1, 0)
else if @scenario = 'B'
insert @tbl values (1, 1)
else if @scenario = 'C'
insert @tbl values (1, 0), (1, 1)
else if @scenario = 'D'
insert @tbl values (null, null)
select *
from @tbl
where (colA = 1 and colB = 1)
or (colA = 1 and not exists (select 1 from @tbl where colA = 1 and colB = 1))
or (colA is null and colB is null and not exists (select 1 from @tbl where colA = 1 and colB = 1))
You can also test the query with "more random data" in each scenario, like below:
declare @tbl table (colA int, colB int)
declare @scenario char(1) = 'B'
if @scenario = 'A'
insert @tbl values (1, 0), (0, 1), (0, 0), (0, null)
else if @scenario = 'B'
insert @tbl values (1, 1), (1, 0), (null, null)
else if @scenario = 'C'
insert @tbl values (1, 0), (1, 1), (0, 0), (1, 0), (null, 0)
else if @scenario = 'D'
insert @tbl values (null, null)
select *
from @tbl
where (colA = 1 and colB = 1)
or (colA = 1 and not exists (select 1 from @tbl where colA = 1 and colB = 1))
or (colA is null and colB is null and not exists (select 1 from @tbl where colA = 1 and colB = 1))
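
For anyone without a SQL Server instance at hand, the scenarios can also be replayed as a sketch in SQLite via Python's sqlite3 module. The table name tbl and the helper matching_rows are assumptions for illustration; the WHERE clause is the one from the script.

```python
import sqlite3

NOT_EXISTS_QUERY = """
SELECT colA, colB
FROM tbl
WHERE (colA = 1 AND colB = 1)
   OR (colA = 1 AND NOT EXISTS (SELECT 1 FROM tbl WHERE colA = 1 AND colB = 1))
   OR (colA IS NULL AND colB IS NULL
       AND NOT EXISTS (SELECT 1 FROM tbl WHERE colA = 1 AND colB = 1))
"""


def matching_rows(rows):
    """Load rows into a fresh in-memory tbl and return every row the filter keeps."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (colA INTEGER, colB INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    kept = con.execute(NOT_EXISTS_QUERY).fetchall()
    con.close()
    return kept


# The "more random data" scenarios from the script.
scenarios = {
    "A": [(1, 0), (0, 1), (0, 0), (0, None)],
    "B": [(1, 1), (1, 0), (None, None)],
    "C": [(1, 0), (1, 1), (0, 0), (1, 0), (None, 0)],
    "D": [(None, None)],
}
for name, rows in scenarios.items():
    print(name, matching_rows(rows))
```

Note that this filter returns every qualifying row rather than exactly one, so a table holding (1, 0) twice gives back both copies.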

Related

Dynamic bit-based flattening of multiple rows by pivoting into additional columns

I have data that looks like this:
ID | Value
-----------
1 | a
1 | b
2 | a
2 | c
3 | a
3 | d
And I would like it to look like this:
ID | Value_a | Value_b | Value_c | Value_d
---------------------------------------------
1 | 1 | 1 | 0 | 0
2 | 1 | 0 | 1 | 0
3 | 1 | 0 | 0 | 1
I think a dynamic conditional aggregation is required. Any help would be appreciated.

Conditional aggregation goes like:
select
id,
max(case when value = 'a' then 1 else 0 end) value_a,
max(case when value = 'b' then 1 else 0 end) value_b,
max(case when value = 'c' then 1 else 0 end) value_c,
max(case when value = 'd' then 1 else 0 end) value_d
from mytable
group by id

Here is a sample implementation of dynamic conditional aggregation:
--create test table
create table #values (
[ID] int
,[Value] char(1))
--populate test table
insert into #values
values
(1, 'a')
,(1, 'b')
,(2, 'a')
,(2, 'c')
,(3, 'a')
,(3, 'd')
--declare variable that will hold dynamic query
declare @query nvarchar(max) = ' select [ID] '
--build dynamic query and assign it to variable
select
@query = @query + max(',max(case when [value] = '''
+ [value] + ''' then 1 else 0 end) as Value_' + [value] )
from
#values
group by
[value]
--add group by clause to dynamic query
set @query = @query + ' from #values group by [id]'
--execute dynamic query
exec (@query)
The result matches the table requested in the question.
Now you can add a value (for example id = 4 and value = 'e') by replacing the original insert with this one:
insert into #values
values
(1, 'a')
,(1, 'b')
,(2, 'a')
,(2, 'c')
,(3, 'a')
,(3, 'd')
,(4, 'a')
,(4, 'e')
The new output gains a Value_e column without any change to the query.
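
The build-then-execute idea is not tied to T-SQL. Here is a rough sketch of the same dynamic conditional aggregation done from Python against SQLite; the table name vals and the function pivot_flags are invented for the example, and the alphanumeric check on column suffixes is my own precaution, not something from the answer.

```python
import sqlite3


def pivot_flags(con):
    """Build and run one MAX(CASE ...) flag column per distinct Value."""
    values = [r[0] for r in
              con.execute("SELECT DISTINCT Value FROM vals ORDER BY Value")]
    select_list = ["ID"]
    for v in values:
        # The value becomes part of a column name, so only allow plain
        # alphanumerics; the comparison itself uses a bound parameter.
        if not v.isalnum():
            raise ValueError(f"unsafe column suffix: {v!r}")
        select_list.append(
            f"MAX(CASE WHEN Value = ? THEN 1 ELSE 0 END) AS Value_{v}")
    sql = ("SELECT " + ", ".join(select_list)
           + " FROM vals GROUP BY ID ORDER BY ID")
    cur = con.execute(sql, values)
    header = [d[0] for d in cur.description]
    return header, cur.fetchall()


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE vals (ID INTEGER, Value TEXT)")
con.executemany("INSERT INTO vals VALUES (?, ?)",
                [(1, "a"), (1, "b"), (2, "a"), (2, "c"), (3, "a"), (3, "d")])
header, rows = pivot_flags(con)
print(header)
for row in rows:
    print(row)

# Adding a new value creates a new column on the next run.
con.executemany("INSERT INTO vals VALUES (?, ?)", [(4, "a"), (4, "e")])
header2, rows2 = pivot_flags(con)
print(header2)
for row in rows2:
    print(row)
```

Binding the compared value as a parameter and validating the alias keeps the generated SQL from being steered by the data itself, which plain string concatenation does not guarantee.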

Get list of records having name starting with specific value and ignore in case of other values for different rows

I'm trying to get the Col A records based on the first character of Col B. If the first character of Col B matches the given value for every row of a Col A value, then that Col A value should be returned. If the first character of any Col B row does not match, the Col A value should not be returned. This is in Oracle SQL.
select T1.colA,count(*) from colTables T1 where T1.colA in (
select T2.colA from colTables T2 where T2.colB like '1%'
group by T2.colA)
group by T1.colA;
Col A | Col B
101|12541
101|15475
101|19874
102|12544
102|22549
102|12537
103|22549
103|28747
104|72549
104|82549
104|82549
105|12549
105|12531
105|12589
106|75448
106|71544
My query gives the following output
ColA | Count
101|3
102|3
105|3
but I want the output to be
ColA| Count
101|3
105|3
Also, is there any way to omit T2.colB like '1%' and get the output in the following form?
ColA| Count
101|3 -- All values in col B start with 1
103|2 -- All values in col B start with 2
105|3 -- All values in col B start with 1
106|2 -- All values in col B start with 7

If you want all colB values to start with a 1, then:
select t.colA, count(*)
from colTables t
group by t.colA
having sum(case when t.colB like '1%' then 1 else 0 end) = count(*);
You could phrase this in other ways as well, such as:
having min(t.colB) >= '1' and
max(t.colB) < '2'
If you just want all colB values within each colA to start with the same character, use:
having min(substr(t.colB, 1, 1)) = max(substr(t.colB, 1, 1))
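
To see both HAVING variants against the question's data, here is a sketch that assumes SQLite through Python's sqlite3 module instead of Oracle (substr and LIKE behave the same way for this data). The table name tbl and the helper run are placeholders.

```python
import sqlite3

DATA = [
    ("101", "12541"), ("101", "15475"), ("101", "19874"),
    ("102", "12544"), ("102", "22549"), ("102", "12537"),
    ("103", "22549"), ("103", "28747"),
    ("104", "72549"), ("104", "82549"), ("104", "82549"),
    ("105", "12549"), ("105", "12531"), ("105", "12589"),
    ("106", "75448"), ("106", "71544"),
]

# Groups where every colb starts with '1'.
ALL_START_WITH_1 = """
SELECT cola, COUNT(*)
FROM tbl
GROUP BY cola
HAVING SUM(CASE WHEN colb LIKE '1%' THEN 1 ELSE 0 END) = COUNT(*)
ORDER BY cola
"""

# Groups where every colb starts with the same character, whatever it is.
SAME_FIRST_CHAR = """
SELECT cola, COUNT(*)
FROM tbl
GROUP BY cola
HAVING MIN(substr(colb, 1, 1)) = MAX(substr(colb, 1, 1))
ORDER BY cola
"""


def run(query):
    """Run a query against a fresh in-memory copy of the sample data."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tbl (cola TEXT, colb TEXT)")
    con.executemany("INSERT INTO tbl VALUES (?, ?)", DATA)
    result = con.execute(query).fetchall()
    con.close()
    return result


only_ones = run(ALL_START_WITH_1)
same_char = run(SAME_FIRST_CHAR)
print(only_ones)
print(same_char)
```

The MIN = MAX comparison works because a group has a single distinct first character exactly when its smallest and largest first characters coincide.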

Schema
CREATE TABLE tbl (
"cola" varchar(100),
"colb" varchar(100)
);
INSERT INTO tbl
("cola", "colb")
VALUES
('101', '12541'),
('101', '15475'),
('101', '19874'),
('102', '12544'),
('102', '22549'),
('102', '12537'),
('103', '22549'),
('103', '28747'),
('104', '72549'),
('104', '82549'),
('104', '82549'),
('105', '12549'),
('105', '12531'),
('105', '12589'),
('106', '75448'),
('106', '71544');
Query
Live test: https://www.db-fiddle.com/f/iquHToVTGz8JxnWSaT2ChD/4
select cola, count(*)
from tbl
group by cola
having (cola, count(*))
in (
select
cola, count(*)
from tbl
where colb like '1%'
group by cola
)
order by cola;
Output:
| cola | count |
| ---- | ----- |
| 105 | 3 |
| 101 | 3 |
Another approach, which works on RDBMSs that do not support tuples in the IN clause, uses EXISTS:
select y.cola, count(*) as y_count
from tbl y
group by y.cola
having exists
(
select
null -- does not matter
from tbl x
-- this matters
where x.cola = y.cola
and x.colb like '1%'
group by x.cola
having count(x.cola) = count(y.cola)
)
order by y.cola;

Is there a way to perform operations on groups in SQL?

So I am very new to SQL and am probably not describing what I want to do accurately. I have a table with three columns and I want to group by one column and see what percentage of each group has a certain value in the other column. For example in the table:
id col1 col2
----------------
0 A 1
1 A 2
2 B 2
3 B 2
4 A 1
I would want to group by col1 and see what percentage of each group (A or B) has value 1 in col2. The result I want from this is:
col1 percentage_col2_equals_1
------------------------------
A 66.7
B 0.0
So far I have:
SELECT col1,
((SELECT COUNT(*) FROM my_table
WHERE col2 = 1
GROUP BY col1) /
(SELECT COUNT(*) FROM my_table
GROUP BY col1) * 100)
FROM my_table
GROUP BY col1;
But this does not work. Any help would be appreciated!

Use CASE WHEN:
SELECT col1,(coalesce(count(case when col2=1 then col2 end),0)*100.00)/count(*)
from tablename
group by col1
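
A runnable sketch of this CASE WHEN idea, assuming SQLite via Python's sqlite3 module and the question's table name my_table; the rounding to one decimal is added here to match the output format the question asks for.

```python
import sqlite3

# COUNT ignores NULL, so the CASE without ELSE counts only col2 = 1 rows.
# Multiplying by 100.0 first forces floating-point division.
PERCENT_QUERY = """
SELECT col1,
       ROUND(100.0 * COUNT(CASE WHEN col2 = 1 THEN 1 END) / COUNT(*), 1)
           AS percentage_col2_equals_1
FROM my_table
GROUP BY col1
ORDER BY col1
"""

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my_table (id INTEGER, col1 TEXT, col2 INTEGER)")
con.executemany("INSERT INTO my_table VALUES (?, ?, ?)",
                [(0, "A", 1), (1, "A", 2), (2, "B", 2), (3, "B", 2), (4, "A", 1)])
result = con.execute(PERCENT_QUERY).fetchall()
con.close()

for col1, pct in result:
    print(col1, pct)
```

The attempt in the question fails because its scalar subqueries are not tied to the outer group and each returns one row per col1; a single pass with a conditional count avoids both problems.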

Same answer as everyone, just putting this here due to Postgres' expressiveness :)
Live test: https://www.db-fiddle.com/f/goL488VaPuZYii7Wik3pFk/4
select
col1,
count(*) filter(where col2 = 1) ::numeric / count(*)
from tbl
group by col1;
Output:
| col1 | ?column? |
| ---- | ---------------------- |
| A | 0.66666666666666666667 |
| B | 0.00000000000000000000 |
To present it as percentage with 1 decimal place, multiply it by 100 and round to 1:
Live test: https://www.db-fiddle.com/f/goL488VaPuZYii7Wik3pFk/5
select
col1,
round(
count(*) filter(where col2 = 1) ::numeric / count(*) * 100,
1
) as p_a
from tbl
group by col1;
select
col1,
(
count(*) filter(where col2 = 1) ::numeric / count(*) * 100
)::numeric(100,1) as p_b
from tbl
group by col1;
Output:
| col1 | p_a |
| ---- | ---- |
| A | 66.7 |
| B | 0.0 |
| col1 | p_b |
| ---- | ---- |
| A | 66.7 |
| B | 0.0 |

The following query will return your expected result:
SELECT col1,
CAST(((SUM(IIF(col2 = 1, 1, 0))) * 100.0) / COUNT(*) AS DECIMAL(5, 1)) AS percentage_col2_equals_1
FROM my_table
GROUP BY col1;
Sample execution with sample data:
DECLARE @my_table TABLE (id INT, col1 CHAR(1), col2 INT);
INSERT INTO @my_table (id, col1, col2) VALUES
(0, 'A', 1),
(1, 'A', 2),
(2, 'B', 2),
(3, 'B', 2),
(4, 'A', 1);
SELECT col1, CAST(((SUM(IIF(col2 = 1, 1, 0))) * 100.0) / COUNT(*) AS DECIMAL(5, 1)) AS percentage_col2_equals_1
FROM @my_table
GROUP BY col1;
Output:
col1 percentage_col2_equals_1
---------------------------------
A 66.7
B 0.0

CREATE TABLE #TEMP
(ID INT,
COL1 VARCHAR(10),
COL2 INT
);
INSERT INTO #TEMP
SELECT 0, 'A',1
UNION
SELECT 1, 'A',2
UNION
SELECT 2, 'B',2
UNION
SELECT 3, 'B',2
UNION
SELECT 4, 'A',1;
SELECT T.COL1,
ROUND((CAST(COUNT(CASE
WHEN T.COL2 = 1
THEN T.COL2
ELSE NULL
END) AS DECIMAL) / (S.COL2)) * 100.0, 2) AS Percentage_1
FROM #TEMP T
JOIN
(
SELECT COUNT(COL2) COL2,
COL1
FROM #TEMP
GROUP BY COL1
) S ON S.COL1 = T.COL1
GROUP BY T.COL1,
S.COL2;

This will work:
CREATE TABLE Table1
("id" int, "col1" varchar2(1), "col2" int)
;
-- do inserts
select aa."col1",((select count(*) from Table1 b
where b."col1"=aa."col1" and b."col2"=1 )*100/(select count(*)
from Table1 c where c."col1"=aa."col1" )) percentage
from Table1 aa
group by aa."col1"
;
output:
A 66.66666666666666666666666666666666666667
B 0

In SQLite the expression col2 = 1 returns 1 when true and 0 when false.
So you just need the average of col2 = 1 and then round it to 1 decimal:
select
col1,
round(100.0 * avg(col2 = 1), 1) percentage_col2_equals_1
from tablename
group by col1
See the demo.
Results:
| col1 | percentage_col2_equals_1 |
| ---- | ------------------------ |
| A | 66.7 |
| B | 0 |

How to subtract two rows from one another if they share a value in another column

I am currently working in a database with the following structure:
Var | Value | ID
--------------
A | 1 | 1
B | 2 | 1
C | 3 | 1
A | 2 | 2
B | 4 | 2
C | 6 | 2
What I am trying to achieve is to subtract the value of Var C from the other Vars (A and B) sharing the same ID as Var C. In this case the output would be:
Var | Value | ID
--------------
A | -2 | 1
B | -1 | 1
C | 3 | 1
A | -4 | 2
B | -2 | 2
C | 6 | 2
To be honest I have absolutely no idea how to start on achieving this. I am familiar with many other programming languages, but SQL is still a challenge with difficult/specific queries.

Do a self join:
select t1.var,
case when t1.var = 'C' then t1.value
else t1.value - t2.value
end as value,
t1.id
from tablename t1
join tablename t2 ON t1.id = t2.id
where t2.var = 'C'
Note that value is a reserved word in ANSI SQL, so it should be delimited as "Value".
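
The self join can be checked against the sample data with a short sketch. It assumes SQLite via Python's sqlite3 module, keeps the placeholder table name tablename from the answer, and adds an ORDER BY so the output order is predictable.

```python
import sqlite3

# Each row is paired with the 'C' row that shares its id; 'C' itself is kept as is.
SELF_JOIN_QUERY = """
SELECT t1.var,
       CASE WHEN t1.var = 'C' THEN t1."value"
            ELSE t1."value" - t2."value"
       END AS "value",
       t1.id
FROM tablename t1
JOIN tablename t2 ON t1.id = t2.id
WHERE t2.var = 'C'
ORDER BY t1.id, t1.var
"""

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE tablename (var TEXT, "value" INTEGER, id INTEGER)')
con.executemany("INSERT INTO tablename VALUES (?, ?, ?)",
                [("A", 1, 1), ("B", 2, 1), ("C", 3, 1),
                 ("A", 2, 2), ("B", 4, 2), ("C", 6, 2)])
adjusted = con.execute(SELF_JOIN_QUERY).fetchall()
con.close()

for row in adjusted:
    print(row)
```

Because this is an inner join, an id with no 'C' row drops out of the result entirely; a LEFT JOIN with the var = 'C' condition moved into the ON clause would keep such rows with a NULL difference.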

You could pre-analyse the "C" Values and then use this to remove them?
DECLARE @Data TABLE (
[Var] VARCHAR(1),
Value INT,
ID INT);
INSERT INTO @Data SELECT 'A', 1, 1;
INSERT INTO @Data SELECT 'B', 2, 1;
INSERT INTO @Data SELECT 'C', 3, 1;
INSERT INTO @Data SELECT 'A', 2, 2;
INSERT INTO @Data SELECT 'B', 4, 2;
INSERT INTO @Data SELECT 'C', 6, 2;
WITH CValues AS (
SELECT
ID,
Value
FROM
@Data
WHERE
[Var] = 'C')
SELECT
d.[Var],
CASE WHEN d.[Var] != 'C' THEN d.Value - c.Value ELSE d.Value END AS Value,
d.ID
FROM
@Data d
LEFT JOIN CValues c ON c.ID = d.ID;
...but yes, a self-join is probably a better solution:
DECLARE @Data TABLE (
[Var] VARCHAR(1),
Value INT,
ID INT);
INSERT INTO @Data SELECT 'A', 1, 1;
INSERT INTO @Data SELECT 'B', 2, 1;
INSERT INTO @Data SELECT 'C', 3, 1;
INSERT INTO @Data SELECT 'A', 2, 2;
INSERT INTO @Data SELECT 'B', 4, 2;
INSERT INTO @Data SELECT 'C', 6, 2;
SELECT
d.[Var],
CASE WHEN d.[Var] != 'C' THEN d.Value - c.Value ELSE d.Value END AS Value,
d.ID
FROM
@Data d
LEFT JOIN @Data c ON c.[Var] = 'C' AND c.ID = d.ID;

T-SQL: "Compress" rows with equal values to one

Is there a way to show only one row when several rows have the same values?
I have the following scenario:
ID | Column A | Column B | Column C
1  | 2        | 'test'   | 5
2  | 3        | 'test'   | 6
3  | 2        | 'test'   | 5
In this scenario I want to show only the following result set:
ID | Column A | Column B | Column C
1  | 2        | 'test'   | 5
2  | 3        | 'test'   | 6
Thanks for your help.
Regards, pro

Your rows are not exact duplicates, because of the id column. If you don't care which value of the id appears, you can do what you want as:
select max(id) as id, ColumnA, ColumnB, ColumnC
from t
group by ColumnA, ColumnB, ColumnC
If you don't need the id at all, this is simpler:
select distinct ColumnA, ColumnB, ColumnC
from t
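
Both variants can be tried on the question's three rows with a small sketch, assuming SQLite via Python's sqlite3 module and the table name t from the answer. It uses min(id) rather than max(id), since that is what reproduces the ids 1 and 2 shown in the question; max(id) would report 3 for the duplicated row.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE t (id INTEGER, ColumnA INTEGER, ColumnB TEXT, ColumnC INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)",
                [(1, 2, "test", 5), (2, 3, "test", 6), (3, 2, "test", 5)])

# One row per distinct (ColumnA, ColumnB, ColumnC), keeping the lowest id.
grouped = con.execute("""
    SELECT MIN(id) AS id, ColumnA, ColumnB, ColumnC
    FROM t
    GROUP BY ColumnA, ColumnB, ColumnC
    ORDER BY id
""").fetchall()

# Without the id, DISTINCT is enough.
distinct = con.execute("""
    SELECT DISTINCT ColumnA, ColumnB, ColumnC
    FROM t
    ORDER BY ColumnA
""").fetchall()
con.close()

print(grouped)
print(distinct)
```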

Try this :
With cte As
( Select * , row_number() over (partition by ColumnA, ColumnB,ColumnC
order by ID ) as myrownumber from myTable
)
Select * from cte where myrownumber=1

select id, column1, column2, column3 from
(
select *, row_number() over (partition by column1, column2, column3 order by id) as sno
from myTable
) as t
where sno = 1

CREATE TABLE #test
(
ID TINYINT NOT NULL,
colA TINYINT NOT NULL,
colB VARCHAR(10) NOT NULL,
colC TINYINT NOT NULL
);
INSERT INTO #test VALUES (1,2, 'test', 5);
INSERT INTO #test VALUES (2,3, 'test', 6);
INSERT INTO #test VALUES (3,2, 'test', 5);
SELECT
ID,
ColA,
ColB,
ColC
FROM
(
SELECT
ID,
ColA,
ColB,
ColC,
ROW_NUMBER() OVER(PARTITION BY ColA, ColB, ColC ORDER BY ID) AS RowNum
FROM #test
) AS WorkTable
WHERE RowNum = 1
