How to get unique combinations - sql

This is a sample table, and I want to fetch the unique combinations from it. I need help with the SQL query.
a z
b x
c w
d s
e t
z a
x b
w c
s d
t e
Required output:
a z
b x
c w
d s
e t

It looks like you want to select distinct pairs. You first need to normalize each pair so that (x, y) and (y, x) are treated as identical, then apply DISTINCT:
CREATE TABLE #t (col1 CHAR(1), col2 CHAR(1));
INSERT INTO #t VALUES
('a', 'z'),
('b', 'x'),
('c', 'w'),
('d', 's'),
('e', 't'),
('z', 'a'),
('x', 'b'),
('w', 'c'),
('s', 'd'),
('t', 'e');
SELECT DISTINCT
CASE WHEN col1 < col2 THEN col1 ELSE col2 END,
CASE WHEN col1 < col2 THEN col2 ELSE col1 END
FROM #t

You seem to have complete duplicates, so just use <:
select a, b
from t
where a < b;
If you don't have complete duplicates and you want to preserve the original values, I recommend union all:
select a, b
from t
where a < b
union all
select a, b
from t
where a > b and
not exists (select 1 from t t2 where t2.a = t.b and t2.b = t.a);
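The difference between normalizing the pairs and preserving the original values can be checked with a small sketch. It uses Python's built-in sqlite3 module rather than the SQL Server setup above, and the extra unpaired row ('q', 'p') is an assumption added only for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    # ('q', 'p') is a hypothetical row that has no reversed partner
    [("a", "z"), ("z", "a"), ("b", "x"), ("x", "b"), ("q", "p")],
)

# Normalizing approach: the smaller value always goes first, then DISTINCT
normalized = conn.execute("""
    SELECT DISTINCT
        CASE WHEN a < b THEN a ELSE b END,
        CASE WHEN a < b THEN b ELSE a END
    FROM t
    ORDER BY 1, 2
""").fetchall()

# Preserving approach: keep rows with a < b, plus rows with a > b
# whose reversed pair does not exist in the table
preserved = conn.execute("""
    SELECT a, b FROM t WHERE a < b
    UNION ALL
    SELECT a, b FROM t
    WHERE a > b
      AND NOT EXISTS (SELECT 1 FROM t t2 WHERE t2.a = t.b AND t2.b = t.a)
    ORDER BY 1, 2
""").fetchall()

print(normalized)  # the unpaired row comes back swapped, as ('p', 'q')
print(preserved)   # the unpaired row keeps its stored order, ('q', 'p')
```

Both queries return one row per combination; they differ only in how a row without a reversed partner is reported.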

Or use a combination of the attempts:
select col1, col2 from t where col1 < col2
union
select col2, col1 from t where col1 >= col2;
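This works whether every combination appears in both orders or some combinations appear only once in your table. Note that a combination stored only with col1 > col2 is returned with its columns swapped.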

Related

Filtering data by comparing tables

I have two tables, A and B. I want to extract column 1 from table A, keeping only the values that do not exist in either column 1 or column 2 of table B. How can I achieve this?
Additionally, column 2 in table B contains aggregated data, which means that the data looks like [a,b,c,d].
Example:
Table A
column_1
a
b
c
d
e
Table B
column_1  column_2
a         [a,b,c,d]
z         [c,b,s,f]
x         [g,h,i,j]
y         [k,l,m,n]
z         [o,p,q,r]
In this example, I want to extract only 'e' from table A as it is not in either column_1 or column_2 from table B.
One way is to concatenate the two columns of Table B, unnest the resulting array, and then use a join or NOT IN to filter the values in Table A:
-- sample data
WITH dataset(column_1) AS (
values ('a'),
('b'),
('c'),
('d'),
('e')
),
dataset2(column_1, column_2) as (
values ('a', array['a', 'b' , 'c', 'd']),
('z', array['c', 'b' , 's', 'f']),
('x', array['g', 'h' , 'i', 'j']),
('y', array['k', 'l' , 'm', 'n']),
('z', array['o', 'p' , 'q', 'r'])
)
-- query
select *
from dataset
where column_1 not in (select distinct col
from dataset2,
unnest(column_1 || column_2) as t(col) );
Output:
column_1
e
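The same filtering logic can be sketched outside the database to make the steps explicit. This is only an illustration in plain Python, assuming the sample rows from the question; the variable names are hypothetical:

```python
# Sample data from the question
table_a = ["a", "b", "c", "d", "e"]
table_b = [
    ("a", ["a", "b", "c", "d"]),
    ("z", ["c", "b", "s", "f"]),
    ("x", ["g", "h", "i", "j"]),
    ("y", ["k", "l", "m", "n"]),
    ("z", ["o", "p", "q", "r"]),
]

# Equivalent of unnest(column_1 || column_2): collect every value that
# appears in either column of table B into one set
seen_in_b = set()
for column_1, column_2 in table_b:
    seen_in_b.add(column_1)
    seen_in_b.update(column_2)

# Equivalent of NOT IN: keep the values of table A that never appear in B
result = [value for value in table_a if value not in seen_in_b]
print(result)
```

With the sample data only 'e' survives, which matches the query output above.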

How to insert two or more columns' data into one column from one table to another table using an INSERT statement

I have two tables, SRC and TARGET. The value of Col1 in the SRC table determines which columns of the TARGET table the remaining SRC values should be inserted into.
SRC                      TARGET
Col1   Col2 Col3 Col4    Tcol1  Tcol2 Tcol3 Tcol4
Test1  A    B    C       Test1  A     B     C
Test2  X    Y    Z       Test2  Z     X     Y
Test3  L    M    N       Test3  M     L     N
Test3  L    M    N       Test3  M     L     N
Test2  D    E    F       Test2  F     D     E
I want to insert the data as shown above: depending on Col1 in the SRC table, the source columns should be mapped to different target columns.
Insert into TARGET(Tcol1,Tcol2,Tcol3)
select Col1 , Col2, Col3
from src;
But I don't know how to handle this situation, because the target table is fixed.
In the first scenario, the first row of the SRC table maps as is, as in the SQL above. For the second row, however, I have to insert the first value column into the second value column of the target table, and similarly for the third row.
I'm writing a procedure, but it will only work for fixed target and source tables. How could I write a SQL script for this scenario?
Thanks in advance.
Insert into TARGET(Tcol1,Tcol1,Tcol1)
select Tcol1,Tcol2,Tcol3
from
select Col1 as Tcol1,
Col2 as Tcol2,
Col3 as Tcol3
from src;
Is there any way that I can use a function to map the values based on Col1 in the SRC table?
Not an answer, but trying to understand the need.
This exactly gives your expected result, with hardcoded values based on the value of Col1:
with src (Col1, Col2, Col3, Col4) as (
select 'Test1', 'A', 'B', 'C' from dual union all
select 'Test2', 'X', 'Y', 'Z' from dual union all
select 'Test3', 'L', 'M', 'N' from dual union all
select 'Test3', 'L', 'M', 'N' from dual union all
select 'Test2', 'D', 'E', 'F' from dual
)
select Col1 as Tcol1,
case (Col1)
when 'Test1' then Col2
when 'Test2' then Col4
when 'Test3' then Col3
end as Tcol2,
case (Col1)
when 'Test1' then Col3
when 'Test2' then Col2
when 'Test3' then Col2
end as Tcol3,
case (Col1)
when 'Test1' then Col4
when 'Test2' then Col3
when 'Test3' then Col4
end as Tcol4
from src
TCOL1 TCOL2 TCOL3 TCOL4
----- ----- ----- -----
Test1 A B C
Test2 Z X Y
Test3 M L N
Test3 M L N
Test2 F D E
Is this correct? Does this logic apply to all the rows of your table? How to edit it?

COALESCE function won't return CHAR(1)

I am using the COALESCE function but getting the following error:
Conversion failed when converting the varchar value 'X' to data type int.
I have to join two tables on two conditions. If the second condition doesn't hold but Table 1 has a row with a blank cell (not null but blank '') for that key, I want to join to that row. If no row matches at all, I want to return a zero.
Join Table 1 and Table 2, and return Table 2 plus column 3 from Table 1.
Table 1
(A, 1, X),
(A, 2, Y),
(A, 3, Z),
(A, , X),
(B, 1, X),
(B, 2, Z),
(B, 3, Y),
Table 2
(A, 1),
(A, 2),
(A, 3),
(A, 5),
(B, 1),
(B, 2),
(B, 3),
(B, 5)
I want to get a return of
(A, 1, X),
(A, 2, Y),
(A, 3, Z),
(A, 5, X),
(B, 1, X),
(B, 2, Z),
(B, 3, Y),
(B, 5, NULL)
Code:
CREATE TABLE #table1 (letter1 CHAR(1), num1 INT, letter2 CHAR(1));
CREATE TABLE #table2 (letter1 CHAR(1), num1 INT);
INSERT INTO #table1 VALUES
('A', 1, 'X'),
('A', 2, 'Y'),
('A', 3, 'Z'),
('A', null, 'X'),
('B', 1, 'X'),
('B', 2, 'Y'),
('B', 3, 'Z')
INSERT INTO #table2 VALUES
('A', 1),
('A', 2),
('A', 3),
('A', 5),
('B', 1),
('B', 2),
('B', 3),
('B', 5)
SELECT t2.*,
COALESCE(
(SELECT TOP 1 letter2 FROM #table1 WHERE letter1 = t2.letter1 AND num1 = t2.num1),
(SELECT TOP 1 letter2 FROM #table1 WHERE letter1 = t2.letter1 AND num1 IS NULL),
0
) AS missing_letter
FROM #table2 t2
Perhaps you need:
select t1.*, t2.*
from table1 t1 outer apply
( select top (1) t2.*
from table2 t2
where t1.col1 = t2.col1 and t1.col2 in ('', t2.col2)
order by t2.col2 desc
) t2;
If I understand correctly, this has less to do with coalesce() and more to do with the joins:
select t2.*, coalesce(t1.letter2, t1def.letter2) as letter2
from table2 t2 left join
table1 t1
on t2.letter1 = t1.letter1 and t2.num1 = t1.num1 left join
table1 t1def
on t2.letter1 = t1def.letter1 and t1def.num1 is null;
The problem here is your data type. COALESCE is shorthand for a CASE expression. For example, COALESCE('a',1,'c') would be shorthand for:
CASE WHEN 'a' IS NOT NULL THEN 'a'
WHEN 1 IS NOT NULL THEN 1
ELSE 'c'
END
The documentation (COALESCE (Transact-SQL)) describes this as well:
The COALESCE expression is a syntactic shortcut for the CASE
expression. That is, the code COALESCE(expression1,...n) is
rewritten by the query optimizer as the following CASE expression:
CASE
WHEN (expression1 IS NOT NULL) THEN expression1
WHEN (expression2 IS NOT NULL) THEN expression2
...
ELSE expressionN
END
A CASE expression follows data type precedence, and int has a higher data type precedence than varchar; thus every value will be implicitly cast to an int. This is why both the COALESCE and the CASE expression will fail: neither 'a' nor 'c' can be converted to an int.
You'll therefore need to explicitly CONVERT your int to a varchar:
COALESCE('a',CONVERT(char(1),1),'c')
The documentation (cited above), however, also goes on to state:
This means that the input values (expression1, expression2,
expressionN, etc.) are evaluated multiple times. Also, in compliance
with the SQL standard, a value expression that contains a subquery is
considered non-deterministic and the subquery is evaluated twice. In
either case, different results can be returned between the first
evaluation and subsequent evaluations.
For example, when the code COALESCE((subquery), 1) is executed, the
subquery is evaluated twice. As a result, you can get different
results depending on the isolation level of the query. For example,
the code can return NULL under the READ COMMITTED isolation level in a
multi-user environment. To ensure stable results are returned, use the
SNAPSHOT ISOLATION isolation level, or replace COALESCE with the
ISNULL function.
Considering you are using a subquery, a (nested) ISNULL might be the better choice here.
It's worth noting, since people often confuse them because they are functionally similar, that COALESCE and ISNULL do not behave the same. COALESCE uses data type precedence, whereas ISNULL implicitly casts the second value to the data type of the first parameter. Thus ISNULL('a',1) works fine, but COALESCE('a',1) does not.
Just change the zero to a null. You can't mix datatypes in a coalesce:
SELECT t2.*,
COALESCE(
(SELECT TOP 1 letter2 FROM #table1 WHERE letter1 = t2.letter1 AND num1 = t2.num1),
(SELECT TOP 1 letter2 FROM #table1 WHERE letter1 = t2.letter1 AND num1 IS NULL),
null
) AS missing_letter
FROM #table2 t2
The query works if the 0 in the COALESCE is replaced by '0'.
That way the COALESCE doesn't contain mixed data types.
SELECT t2.*,
COALESCE(
(SELECT TOP 1 letter2 FROM #table1 t1 WHERE t1.letter1 = t2.letter1 AND t1.num1 = t2.num1),
(SELECT TOP 1 letter2 FROM #table1 t1 WHERE t1.letter1 = t2.letter1 AND t1.num1 IS NULL),
'0'
) AS missing_letter
FROM #table2 t2
ORDER BY t2.letter1, t2.num1;
You can also avoid having to retrieve data from table1 twice by using an OUTER APPLY.
Since the expected results have a NULL for ('B',5), the COALESCE isn't even needed this way.
SELECT t2.letter1, t2.num1, t1.letter2 AS missing_letter
FROM #table2 AS t2
OUTER APPLY (
select top 1 t.letter2
from #table1 AS t
where t.letter1 = t2.letter1
and (t.num1 is null or t.num1 = t2.num1)
order by t.num1 desc
) AS t1
ORDER BY t2.letter1, t2.num1;
Result:
letter1 num1 missing_letter
------- ---- --------------
A 1 X
A 2 Y
A 3 Z
A 5 X
B 1 X
B 2 Y
B 3 Z
B 5 NULL

Delete rows in table that are sum of other rows per group

Group the rows by T. In each group, find the row whose value is the largest (or the smallest, if the values are negative) and equals the sum of the other rows in that group, and delete that row (one per group). If a group does not have enough elements to form a sum, or has enough but none of its rows equals the sum of the others, nothing happens.
CREATE TABLE Test (
T varchar(10),
V int
);
INSERT INTO Test
VALUES ('A', 4),
('B', -5),
('C', 5),
('A', 2),
('B', -1),
('C', 10),
('A', 2),
('B', -4),
('C', 5),
('D', 0);
expected result:
A 2
A 2
B -1
B -4
C 5
C 5
D 0
As the comments note, the requirements seem strange. The code below assumes that the sum row is already present in each group and merely removes the row with the largest absolute value, as long as that value is not 0.
if object_id('tempdb..#test') is not null
drop table #test
CREATE TABLE #Test (
T varchar(10),
V int
);
INSERT INTO #Test
VALUES ('A', 4), ('B', -5), ('C', 5), ('A', 2), ('B', -1), ('C', 10), ('A', 2), ('B', -4), ('C', 5), ('D', 0);
if object_id('tempdb..#test2') is not null
drop table #test2
SELECT
T,
V,
ABS(V) as absV
INTO #TEST2
FROM #TEST
SELECT * FROM #TEST2
if object_id('tempdb..#max') is not null
drop table #max
SELECT
T,
MAX(absV) AS MaxAbsV
INTO #Max
FROM #TEST2
GROUP BY T
HAVING MAX(AbsV) != 0
DELETE #TEST2
FROM #TEST2
INNER JOIN #MAX ON #TEST2.T = #MAX.T AND #TEST2.absV = #Max.MaxAbsV
SELECT * FROM #TEST2
ORDER BY T ASC
; with cte as
(
select T, V,
R = row_number() over (partition by T order by ABS(V) desc),
C = count(*) over (partition by T)
from Test
)
delete c
from cte c
inner join
(
select T, S = sum(V)
from cte
where R <> 1
group by T
) s on c.T = s.T
where c.C >= 3
and c.R = 1
and c.V = s.S
Using ABS and NOT EXISTS:
DECLARE @Test TABLE (
T varchar(10),
V int
);
INSERT INTO @Test
VALUES ('A', 4), ('B', -5), ('C', 5), ('A', 2), ('B', -1), ('C', 10), ('A', 2), ('B', -4), ('C', 5), ('D', 0);
-- largest absolute value per group, ignoring zero rows
;WITH CTE AS (
SELECT T, MAX(ABS(V)) AS MaxAbsV FROM @Test
WHERE V <> 0
GROUP BY T )
-- keep every row whose absolute value is not its group's maximum
SELECT x.T, x.V FROM @Test x WHERE NOT EXISTS (SELECT 1 FROM CTE m WHERE m.T = x.T AND m.MaxAbsV = ABS(x.V))
ORDER BY x.T
Determine first if the rows are positive or negative by checking if SUM(V) is positive. And then determine if the smallest or largest value is equal to the SUM of the other rows, by subtracting from SUM(V) the MIN(V) if negative or MAX(V) if positive:
DELETE t
FROM Test t
INNER JOIN (
SELECT
T,
SUM(V) - CASE WHEN SUM(V) >= 0 THEN MAX(V) ELSE MIN(V) END AS ToDelete
FROM Test
GROUP BY T
HAVING COUNT(*) >= 3
) a
ON a.T = t.T
AND a.ToDelete = t.V
You can use the query below to get the required output:
select * into #t1 from test
select * from
(
select TT.T as T,TT.V as V
from test TT
JOIN
(select T,max(abs(V)) as V from #t1
group by T) P
on TT.T=P.T
where abs(TT.V) <> P.V
UNION ALL
select A.T as T,A.V as V from test A
JOIN(
select T,count(T) as Tcount from test
group by T
having count(T)=1) B on A.T=B.T
) X order by T
drop table #t1
You are looking for a value per group that is the sum of all the group's other values. E.g. 4 of (2,2,4) or -5 of (-5,-4,-1).
This is usually only one record per group. But it can be multiple times the same number. Here are examples for ties: (0,0) or (-2,2,4,4), or (-2,-2,4,4,4) or (-10,3,3,3,3,4).
As you can see, in every case you are looking for values that equal half of the group's total sum. (Of course: we are looking for n+n, where one n is in one record and the other n is the sum of all the other records.)
The only special case is when there is only one value in the group which is zero. That we don't want to delete of course.
Here is a delete statement that cannot deal with ties: it would delete all tied candidate rows instead of just one:
delete from test
where 2 * v =
(
select case when count(*) = 1 then null else sum(v) end
from test fullgroup
where fullgroup.t = test.t
);
In order to deal with ties you would need artificial row numbers, so as to delete only one record of all candidates.
with candidates as
(
select t, v, row_number() over (partition by t order by t) as rn
from
(
select
t, v,
sum(v) over (partition by t) as sumv,
count(*) over (partition by t) as cnt
from test
) comparables
where sumv = 2 * v and cnt > 1
)
delete
from candidates
where rn = 1;
SQL fiddle: http://sqlfiddle.com/#!6/6d97e/1
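The "half of the group's total" rule can be tried on the question's sample data with a small sketch. It runs on Python's built-in sqlite3 module instead of SQL Server, and it uses SQLite's implicit rowid (an assumption specific to this sketch) in place of row_number() to pick a single candidate per group:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (t TEXT, v INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("A", 4), ("B", -5), ("C", 5), ("A", 2), ("B", -1),
     ("C", 10), ("A", 2), ("B", -4), ("C", 5), ("D", 0)],
)

# A row is a candidate when twice its value equals the group's total and
# the group has more than one row. MIN(rowid) keeps one candidate per
# group, so ties lose only a single row.
conn.execute("""
    DELETE FROM test
    WHERE rowid IN (
        SELECT MIN(cand.rowid)
        FROM test AS cand
        WHERE 2 * cand.v = (SELECT SUM(grp.v) FROM test AS grp
                            WHERE grp.t = cand.t)
          AND (SELECT COUNT(*) FROM test AS grp
               WHERE grp.t = cand.t) > 1
        GROUP BY cand.t
    )
""")

remaining = conn.execute(
    "SELECT t, v FROM test ORDER BY t, rowid"
).fetchall()
print(remaining)
```

The single zero row in group D is kept because of the count check, which is the special case described above.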
See if below query helps:
DELETE [Audit].[dbo].[Test] FROM [Audit].[dbo].[Test] as AA
INNER JOIN (select T,
CASE
WHEN MAX(V) < 0 THEN MIN(V)
WHEN MIN(V) > 0 THEN MAX(V) ELSE MAX(V)
END as MAX_V,
CASE
WHEN SUM(V) > 0 THEN SUM(V) - MAX(V)
WHEN SUM(V) < 0 THEN SUM(V) - MIN(V) ELSE SUM(V)
END as SUM_V_REST
from [Audit].[dbo].[Test]
Group by T
Having Count(V) > 1) as BB ON AA.T = BB.T and AA.V = BB.MAX_V

SQL difficult calculation

Assume we have a table with columns A, B, and C, where A is a string and B and C are both integers. I need to build another table from this one with 4 columns: A, B, sum(C), and D, where D is the sum of the values in C over the rows with the same value in column A and a value in B that is less by 1 than the current row's. For example, for the values "a" in A and 1 in B, column D should hold the sum of all values in C where A is "a" and B is zero.
I think you are looking for something similar to this query:
declare @t table (a varchar(1), b int, c int)
insert @t values
('a', 1, 1),
('a', 2, 2),
('b', 3, 3),
('a', 4, 4)
select t.a, t.b,
(
-- total of c over the whole table
select sum(t1.c)
from @t t1
) c,
(
-- sum of c for rows with the same a and b one less; 0 when there are none
select isnull(sum(t2.c), 0)
from @t t2
where t2.a = t.a and t2.b = t.b - 1
) d
from @t t
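A grouped variant of the same idea can be sketched as follows. It uses Python's built-in sqlite3 module rather than SQL Server, the sample rows are hypothetical, and it assumes that sum(C) is meant per (A, B) group and that D should be 0 when no row with B - 1 exists:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    # hypothetical rows; ('a', 0) appears twice to show the per-group sum
    [("a", 0, 10), ("a", 0, 5), ("a", 1, 1), ("a", 2, 2), ("b", 3, 3)],
)

rows = conn.execute("""
    SELECT g.a,
           g.b,
           SUM(g.c) AS sum_c,
           -- D: total of c over rows with the same a and b one less
           COALESCE((SELECT SUM(p.c)
                     FROM t AS p
                     WHERE p.a = g.a AND p.b = g.b - 1), 0) AS d
    FROM t AS g
    GROUP BY g.a, g.b
    ORDER BY g.a, g.b
""").fetchall()

for row in rows:
    print(row)
```

For the row with A = 'a' and B = 1, D is 15, the sum of C over the two rows where A is 'a' and B is 0.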