Transform SQL Server table based on calculation - sql

I have a table like below
Column1  Column2
-------  -------
A        200
A        200
A        0
B        300
B        200
C        100
I would like to transform this table into the following table
With calculation: for each element of Column1, SUM(Column2) / COUNT(non-zero Column2)
Column1  Column2
-------  -------------------------
A        (200 + 200 + 0) / 2 = 200
B        (300 + 200) / 2 = 250
C        100 / 1 = 100
The only thing I can think of is looping through the distinct elements of Column1 and running, for each element i:
SELECT (SELECT SUM(Column2)
FROM Table
WHERE Column1 = i)
/ (SELECT COUNT(Column2)
FROM Table
WHERE Column1 = i AND Column2 <> 0)
and generating a table from the results.
Is there a better way of doing this?

Use aggregation:
SELECT Column1,
SUM(Column2) / COUNT(CASE WHEN Column2 <> 0 THEN 1 END) AS Column2
FROM yourTable
GROUP BY Column1
HAVING COUNT(CASE WHEN Column2 <> 0 THEN 1 END) > 0;
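To try the aggregation without a SQL Server instance, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. SQLite as the engine is an assumption made only for illustration; it also performs integer division on integer operands, so the result matches the question's sample data, but other behavior may differ from SQL Server.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (Column1 TEXT, Column2 INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [("A", 200), ("A", 200), ("A", 0), ("B", 300), ("B", 200), ("C", 100)],
)

# COUNT ignores NULLs, so the CASE expression counts only non-zero rows.
query = """
SELECT Column1,
       SUM(Column2) / COUNT(CASE WHEN Column2 <> 0 THEN 1 END) AS Column2
FROM yourTable
GROUP BY Column1
HAVING COUNT(CASE WHEN Column2 <> 0 THEN 1 END) > 0
ORDER BY Column1
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The HAVING clause drops groups that contain only zeros, which is what keeps the divisor from being 0.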

You can use a WHERE clause to remove the rows with 0 in Column2 and then aggregate to get your desired result (Query1). But it will also remove those Column1 values which have 0 in every Column2 row.
Query2 keeps those Column1 values and returns 0 for them instead of removing the row.
Schema and insert statements:
create table testTable (Column1 varchar(50), Column2 int);
insert into testTable values('A', 200);
insert into testTable values('A', 200);
insert into testTable values('A', 0);
insert into testTable values('B', 300);
insert into testTable values('B', 200);
insert into testTable values('C', 100);
insert into testTable values('D', 0);
Query1:
SELECT Column1,
SUM(Column2) / COUNT(*) AS Column2
FROM testTable where column2<>0
GROUP BY Column1;
Output:
Column1  Column2
-------  -------
A        200
B        250
C        100
Query2:
SELECT Column1,
Coalesce(SUM(Column2) / nullif(COUNT(CASE WHEN Column2 <> 0 THEN 1 END),0),0) AS Column2
FROM testTable
GROUP BY Column1;
Output:
Column1  Column2
-------  -------
A        200
B        250
C        100
D        0

You can use a derived table to filter out the rows where Column2 is 0. Then, you can apply GROUP BY.
declare @table table (Column1 char(1), Column2 int)
insert into @table values
('A',200),
('A',200),
('A',0 ),
('B',300),
('B',200),
('C',100);
SELECT Column1, (sum(column2) / count(column2) ) as column2
from
(
SELECT * FROM @table where Column2 <> 0) as t
group by Column1
Column1  column2
-------  -------
A        200
B        250
C        100

Group by column1 and divide the total sum of column2 by the number of rows whose column2 is not zero. The CASE expression in the divisor also guards against a divide-by-zero error.
-- SQL Server
SELECT t.column1
, t.column2 / (CASE WHEN (t.total - t.total_zero) = 0 THEN 1 ELSE (t.total - t.total_zero) END)
FROM (SELECT column1
, SUM(column2) column2
, COUNT(CASE WHEN column2 = 0 THEN 1 END) total_zero
, COUNT(1) total
FROM test
GROUP BY column1) t
Please check this url https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=3853456941909ffffb8792415adc1f6f

Use AVG() window function because you want the average value of Column2.
If you want all the values of Column1 in the results even if they have only 0s in Column2:
SELECT DISTINCT Column1,
AVG(CASE WHEN Column2 <> 0 THEN Column2 END) OVER (PARTITION BY Column1) Column2
FROM tablename;
If you want results for the values of Column1 that have at least 1 Column2 not 0:
SELECT DISTINCT Column1,
AVG(Column2) OVER (PARTITION BY Column1) Column2
FROM tablename
WHERE Column2 <> 0;
Note that SQL Server truncates the average of integers to an integer, so if you want the result as a floating point number you should multiply Column2 by 1.0, like:
AVG(CASE WHEN Column2 <> 0 THEN 1.0 * Column2 END)
or:
AVG(1.0 * Column2)
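The effect of multiplying by 1.0 can be seen in a small sketch. It uses Python's sqlite3 module as a stand-in engine (an assumption), and because SQLite's AVG() always returns a floating point value, the sketch shows the truncation with SUM / COUNT integer division instead. The group 'E' with values 1, 2 and 0 is hypothetical data chosen so that the two results differ.

```python
import sqlite3

# In-memory SQLite database; the table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (Column1 TEXT, Column2 INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [("E", 1), ("E", 2), ("E", 0)],
)

# First expression divides integers (truncates); the second promotes
# Column2 to a floating point value before summing.
int_avg, float_avg = conn.execute("""
SELECT SUM(Column2) / COUNT(*),
       SUM(1.0 * Column2) / COUNT(*)
FROM tablename
WHERE Column2 <> 0
""").fetchone()
print(int_avg, float_avg)
```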

Related

SQL: check if multiple values in the column SQL

Is it possible to check if multiple values are in the column and based on that filter out one of them using a WHERE clause?
Obviously this code won't work, but here is the logical example of what I'd like to achieve:
SELECT *
FROM table
WHERE IF column includes ('value1', 'value2') THEN NOT IN 'value1'
example with conditions True:
column
value1
value1
value2
value2
value1
value3
value4
value4
result:
column
value2
value2
value3
value4
value4
Side note: process has to be automated as in one upload, dataset might contain value1 which should remain in place and in the next one both of them will be populated and only value2 will be valid.
If both val1 and val2 exist then exclude val1 otherwise no filter...
declare @t table (col varchar(10))
insert into @t
values
('val1'),('val1'),('val2'),('val3')
select *
from @t
where col <> case when 2 = (select count(*) from (select col from @t where col in('val1','val2') group by col)a)
then 'val1'
else '' end
Results:
col
val2
val3
This is an example when both are not present
declare @t2 table (col varchar(10))
insert into @t2
values
('val1'),('val1'),('val3')
select *
from @t2
where col <> case when 2 = (select count(*) from (select col from @t2 where col in('val1','val2') group by col)a)
then 'val1'
else '' end
Results:
col
val1
val1
val3
Note: the else value needs to be a value that cannot exist in the column col
Note2: This is answered using t-sql
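The same idea can be checked against both datasets with a short sketch. It runs on Python's sqlite3 module rather than SQL Server (an assumption), and it counts the distinct matching values with COUNT(DISTINCT ...) instead of the grouped derived table, which should be equivalent. The helper name filter_rows is hypothetical.

```python
import sqlite3

def filter_rows(values):
    """Drop 'val1' only when both 'val1' and 'val2' occur in the column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (col TEXT)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    query = """
    SELECT col
    FROM t
    WHERE col <> CASE
        WHEN (SELECT COUNT(DISTINCT col) FROM t
              WHERE col IN ('val1', 'val2')) = 2
        THEN 'val1'
        ELSE ''          -- must be a value that cannot occur in col
    END
    ORDER BY rowid
    """
    return [row[0] for row in conn.execute(query)]

both_present = filter_rows(["val1", "val1", "val2", "val3"])
only_val1 = filter_rows(["val1", "val1", "val3"])
print(both_present, only_val1)
```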
Using QUALIFY. The idea is to compare the value of the column against an array generated ad hoc with a CASE expression that handles the substitution logic:
SELECT *
FROM tab
QUALIFY NOT ARRAY_CONTAINS(col::VARIANT,
ARRAY_AGG(DISTINCT CASE WHEN col IN ('value1', 'value2') THEN 'value1' END) OVER());
For sample data:
CREATE OR REPLACE TABLE tab AS
SELECT $1 AS col
FROM (VALUES
('value1'), ('value1'), ('value2'),
('value2'), ('value1'), ('value3'),
('value4'), ('value4')
)s;
A more explicit approach is using windowed COUNT_IF:
SELECT *
FROM tab
QUALIFY col NOT IN (CASE WHEN COUNT_IF(col IN ('value1', 'value2')) OVER() > 1
THEN 'value1'
ELSE ''
END);

Oracle SQL show null value

I am trying to do the following query:
SELECT column1,column2,column3
FROM tablename
WHERE
column1 IN
('string1','string2','string3','string4')
AND column2='some_value';
The results show only:
String1, string2,string3 because string4 does not have an equivalent some_value in column2.
How can I show all 4 values with the 4th showing as null like:
Column1|column2|column3
+----------------------+
|string1 value value |
|string2 value value |
|string3 value value |
|string4 null null |
+----------------------+
SELECT column1,column2,column3
FROM tablename
WHERE
column1 IN
('string1','string2','string3','string4')
AND column2='some_value'
OR (column2 is null or column3 is null);
You should say column2 is null at the end, as that row doesn't match your WHERE condition.
Add OR
SELECT column1,column2,column3
FROM tablename
WHERE
column1 IN ('string1','string2','string3','string4')
AND (
column2='some_value'
OR column2 is null
);
In this case the treatment is different: rows whose column2 differs from "some value" (the parameter) are returned as well, so all lines come back.
- In this case:
- rows whose column2 differs from the parameter are returned with an empty value in column2 and column3.
- rows whose column2 is null are returned.
SELECT column1, CASE WHEN column2 = 'some value' THEN COLUMN2 ELSE NULL END column2, CASE WHEN column2 = 'some value' THEN column3 ELSE NULL END column3
FROM TABLENAME
WHERE
column1 IN
('string1','string2','string3','string4')
AND (column2='value' OR column2 = COLUMN2 OR column2 IS null);
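Another way to get one row per requested string, with nulls where nothing matches, is to outer join the list of strings to the table. The sketch below is an illustration under assumptions: it runs on Python's sqlite3 module rather than Oracle, the column3 values and 'other_value' are made-up sample data, and the CTE name wanted is hypothetical. In Oracle the value list would likely need to be built differently, for example with SELECT ... FROM dual UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (column1 TEXT, column2 TEXT, column3 TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        ("string1", "some_value", "x1"),
        ("string2", "some_value", "x2"),
        ("string3", "some_value", "x3"),
        ("string4", "other_value", "x4"),  # no matching column2
    ],
)

# The filter on column2 sits in the ON clause, so unmatched strings
# survive the LEFT JOIN with NULLs instead of being filtered out.
query = """
WITH wanted(column1) AS (
    VALUES ('string1'), ('string2'), ('string3'), ('string4')
)
SELECT w.column1, t.column2, t.column3
FROM wanted AS w
LEFT JOIN tablename AS t
       ON t.column1 = w.column1
      AND t.column2 = 'some_value'
ORDER BY w.column1
"""
rows = conn.execute(query).fetchall()
print(rows)
```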

find max value in a row and update new column with the max column name

I have a table like this
number col1 col2 col3 col4 max
---------------------------------------
0 200 150 300 80
16 68 250 null 55
I want to find max value between col1,col2,col3,col4 in every row and update the last column "max" with the max value column name!
for example in first row max value is 300 the "max" column value will be "col3"
result like this:
number col1 col2 col3 col4 max
------------------------------------------
0 200 150 300 80 col3
16 68 250 null 55 col2
How can I do this?
QUERY
SELECT *,(
SELECT MAX(n)
FROM
(
VALUES(col1),(col2),(col3),(col4)
) AS t(n)
) AS maximum_value
FROM #tmp
Update statement
with MaxValues
as (select [number], [max] = (
select (
select max ([n])
from (values ([col1]) , ([col2]) , ([col3]) , ([col4])) as [t] ([n])
) as [maximum_value])
from [#tmpTable])
update [#tmpTable]
set [max] = [mv].[max]
from [MaxValues] [mv]
join [#tmpTable] on [mv].[number] = [#tmpTable].[number];
assuming number is a key column
Schema
DECLARE @temp table ([number] int NOT NULL, [col1] int, [col2] int, [col3] int, [col4] int, [colmax] int);
INSERT @temp VALUES (0, 200, 150, 300, 80, null), (16, 68, 250, null, 55, null);
Query
SELECT number
,(
SELECT MAX(col) maxCol
FROM (
SELECT t.col1 AS col
UNION
SELECT t.col2
UNION
SELECT t.col3
UNION
SELECT t.col4
) a
) col
FROM @temp t
and the update statement is -
UPDATE t
SET colmax = (
SELECT MAX(col) maxCol
FROM (
SELECT t.col1 AS col
UNION
SELECT t.col2
UNION
SELECT t.col3
UNION
SELECT t.col4
) a
)
FROM tempCol t
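The queries above return the maximum value, while the question asks for the name of the column that holds it. One possible way to get the name is to unpivot the four columns into (number, name, val) rows and pick the name with the largest value per row. The sketch below is an illustration only: it runs on Python's sqlite3 module instead of SQL Server, the table name tmp and the text column colmax are assumptions, and ties are broken by column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tmp (number INTEGER PRIMARY KEY, "
    "col1 INT, col2 INT, col3 INT, col4 INT, colmax TEXT)"
)
conn.executemany(
    "INSERT INTO tmp VALUES (?, ?, ?, ?, ?, NULL)",
    [(0, 200, 150, 300, 80), (16, 68, 250, None, 55)],
)

# Unpivot the columns, then store the name of the largest non-NULL value.
conn.execute("""
WITH unpivoted AS (
    SELECT number, 'col1' AS name, col1 AS val FROM tmp
    UNION ALL SELECT number, 'col2', col2 FROM tmp
    UNION ALL SELECT number, 'col3', col3 FROM tmp
    UNION ALL SELECT number, 'col4', col4 FROM tmp
)
UPDATE tmp
SET colmax = (
    SELECT u.name
    FROM unpivoted AS u
    WHERE u.number = tmp.number
      AND u.val IS NOT NULL
    ORDER BY u.val DESC, u.name
    LIMIT 1
)
""")
rows = conn.execute("SELECT number, colmax FROM tmp ORDER BY number").fetchall()
print(rows)
```

As with the update statements above, this assumes number is a key column.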

combining resultset of many select queries

I have four Select queries for four different tables, each extracting only one record. For example:
Select * from table where col1 = 'something'
gives one row having 3 columns.
The second select query also gives one record having two columns(fields). Same for third and fourth select query.
I want to combine all four result sets into one having one row. How is it possible?
I will write the queries for you.
1st one:
Select Top 1 column1, column2
from table 1
where column 1 = 'something'
and col1 = (Select max(col1) where column 1 = 'something')
2nd query:
Select Top 1 column1, column3
from table 2
where column 1 = 'something'
and column3 = (Select max(column3) where column 1 = 'something')
3rd query uses the result obtained from query 2:
Select column4
from table 3
where column3 = (obtained from 2nd query) (there is only one row)
4th:
Select column5
from table 4
where column3 = (obtained from 2nd query) (there is only one row)
This means I have to join 2nd, 3rd, 4th query, then resulting set in 1st.
I can't use union since columns are different.
So only problem is with joining the result set.
You can use CROSS JOINs to accomplish this.
CREATE TABLE table1 (id int, column1 varchar(5), column2 varchar(15));
CREATE TABLE table2 (column3 varchar(5), column4 varchar(15));
CREATE TABLE table3 (id int, column5 varchar(5), column6 varchar(15));
INSERT INTO table1 VALUES (1, 'aaa', 'row1')
INSERT INTO table2 VALUES ('bbb', 'table2')
INSERT INTO table3 VALUES (1, 'ccc', 'table3')
INSERT INTO table1 VALUES (1, 'ddd', 'table1')
SELECT * FROM (SELECT * FROM table1) a
CROSS JOIN (SELECT * FROM table2) b
CROSS JOIN (SELECT * FROM table3) c
Result:
id  column1  column2  column3  column4  id  column5  column6
1   aaa      row1     bbb      table2   1   ccc      table3
1   ddd      table1   bbb      table2   1   ccc      table3
Update after clarification:
CREATE TABLE table1
(
id int IDENTITY(1,1)
, searchstring nvarchar(25)
);
CREATE TABLE table2
(
id2 int IDENTITY(10, 10)
, searchstring2 nvarchar(25)
, newsearchstring nvarchar(50)
);
CREATE TABLE table3
(
id3 int IDENTITY(100, 100)
, id2 int
, table3srow nvarchar(25)
)
INSERT INTO table1 VALUES ('something');
INSERT INTO table1 VALUES ('something else');
INSERT INTO table1 VALUES ('something'); -- ID = 3, this row will be selected by 1st query
INSERT INTO table2 VALUES ('something', 'newvalue1');
INSERT INTO table2 VALUES ('something else', 'this will not be shown');
INSERT INTO table2 VALUES ('something', 'this will be returned by query 2'); -- ID = 30, this row will be selected by 2nd query
INSERT INTO table3 VALUES (10, 'not relevant');
INSERT INTO table3 VALUES (20, 'not relevant');
INSERT INTO table3 VALUES (30, 'This is from table 3'); -- This row will be returned by 3rd query
SELECT * FROM
(SELECT TOP 1 id, searchstring FROM table1 WHERE searchstring = 'something' and id = (SELECT MAX(id) FROM table1 WHERE searchstring = 'something')) AS query1,
(SELECT TOP 1 id2, newsearchstring FROM table2 WHERE searchstring2 = 'something' and id2 = (SELECT MAX(id2) FROM table2 WHERE searchstring2 = 'something')) AS query2,
(SELECT id2, table3srow FROM table3) as query3
WHERE query3.id2 = query2.id2
Use the same approach for table4 as indicated for table3.
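For a quick check of the combined query, here is a minimal sketch on Python's sqlite3 module (an assumption; the answer targets SQL Server). SQLite has no TOP or IDENTITY, so the sketch inserts the ids explicitly and uses ORDER BY ... DESC LIMIT 1 to pick the row with the highest id, and it writes the joins explicitly instead of using the comma syntax. The aliases q1, q2 and q3 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, searchstring TEXT);
CREATE TABLE table2 (id2 INTEGER, searchstring2 TEXT, newsearchstring TEXT);
CREATE TABLE table3 (id3 INTEGER, id2 INTEGER, table3srow TEXT);
INSERT INTO table1 VALUES (1, 'something'), (2, 'something else'), (3, 'something');
INSERT INTO table2 VALUES (10, 'something', 'newvalue1'),
                          (20, 'something else', 'this will not be shown'),
                          (30, 'something', 'this will be returned by query 2');
INSERT INTO table3 VALUES (100, 10, 'not relevant'),
                          (200, 20, 'not relevant'),
                          (300, 30, 'This is from table 3');
""")

# Two single-row subqueries are cross joined; table3 is then joined on
# the id2 value produced by the second subquery.
query = """
SELECT q1.id, q1.searchstring, q2.id2, q2.newsearchstring, q3.table3srow
FROM (SELECT id, searchstring FROM table1
      WHERE searchstring = 'something'
      ORDER BY id DESC LIMIT 1) AS q1
CROSS JOIN (SELECT id2, newsearchstring FROM table2
            WHERE searchstring2 = 'something'
            ORDER BY id2 DESC LIMIT 1) AS q2
JOIN table3 AS q3 ON q3.id2 = q2.id2
"""
rows = conn.execute(query).fetchall()
print(rows)
```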

SQL to find rows where two columns have the same value

I have a table mytable with 3 columns in an Oracle database, and I want only the records that have the same value in the 2nd and 3rd columns.
SQL> select * from mytable ;
column1 column2 column3
A 50 50----required output
A 10 20----have different values i.e. 10 and 20
A 50 50----required output
A 30 70----have different values i.e. 30 and 70
B 20 20----required output
B 40 30----have different values i.e. 40 and 30
I want the following output with count(*):
column1 column2 column3
A 50 50
A 50 50
B 20 20
Any help is much appreciated
select column1, count (*)
from mytable
where column2 = column3
group by column1, column2;
From your question it is not clear about primary key as A in First Column is being repeated many times.
You can try the following:
select column1, column2, column3, count(*) from
mytable where column2 = column3 group by column1, column2, column3;
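The grouped query can be tried on the question's sample rows with a short sketch. It uses Python's sqlite3 module as a stand-in for Oracle (an assumption); the alias cnt and the ORDER BY clause are additions for readability and stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (column1 TEXT, column2 INTEGER, column3 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [("A", 50, 50), ("A", 10, 20), ("A", 50, 50),
     ("A", 30, 70), ("B", 20, 20), ("B", 40, 30)],
)

# Keep rows where the two columns agree, then count identical rows.
rows = conn.execute("""
SELECT column1, column2, column3, COUNT(*) AS cnt
FROM mytable
WHERE column2 = column3
GROUP BY column1, column2, column3
ORDER BY column1, column2
""").fetchall()
print(rows)
```

Note that grouping collapses the two identical A/50/50 rows into one row with a count of 2; dropping the GROUP BY and COUNT(*) returns them as separate rows.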
Here is a sample. I am doing this in SQL Server, but I am sure this query works in Oracle also.
Example:
Create table #Test (colA int not null, colB int not null, colC int not null, id int not null identity) on [Primary]
GO
INSERT INTO #Test (colA,colB,colC) VALUES (1,1,1)
INSERT INTO #Test (colA,colB,colC) VALUES (1,1,1)
INSERT INTO #Test (colA,colB,colC) VALUES (1,1,1)
INSERT INTO #Test (colA,colB,colC) VALUES (1,2,3)
INSERT INTO #Test (colA,colB,colC) VALUES (1,2,3)
INSERT INTO #Test (colA,colB,colC) VALUES (1,2,3)
INSERT INTO #Test (colA,colB,colC) VALUES (4,5,6)
GO
Select * from #Test
GO
select count(colA) as tot_duplicate_count , colA ,colB ,colC from #Test where id <=
(Select Max(id) from #Test t where #Test.colA = t.colA and
#Test.colB = t.colB and
#Test.colC = t.colC)
group by colA ,colB ,colC
having count(colA) > 1
This query returns the total count of duplicate records per data row.