I have a table with:
A B
1 2
2 1
and I am trying to use a SQL command to get only one combination:
A B
1 2
How can I do that?
A canonical way in standard SQL is:
select a, b
from t
where a < b
union all
select a, b
from t
where a > b and not exists (select 1 from t t2 where t2.a = t.b and t2.b = t.a);
Note that this assumes no duplicates or equal values. You can easily handle these using select distinct and <= comparisons. In my experience, this problem often arises when there are at most two rows per pair.
This preserves the original values. So, if you start with:
1 2
5 4
You will get that in the result set.
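To see the first query in action, here is a minimal sketch that runs it through Python's sqlite3 module against an in-memory table. The sample rows (the original two pairs plus 5 4) are chosen for illustration, and SQLite is only an assumption; the query itself is unchanged.

```python
import sqlite3

# In-memory table holding a mirrored pair (1,2)/(2,1) and an unmatched pair (5,4).
conn = sqlite3.connect(":memory:")
conn.execute("create table t (a integer, b integer)")
conn.executemany("insert into t values (?, ?)", [(1, 2), (2, 1), (5, 4)])

query = """
select a, b
from t
where a < b
union all
select a, b
from t
where a > b
  and not exists (select 1 from t t2 where t2.a = t.b and t2.b = t.a)
"""

# Sort in Python only to make the printed order deterministic.
pairs = sorted(conn.execute(query).fetchall())
print(pairs)
```

The row 2 1 is dropped because its mirror image 1 2 exists, while 5 4 is kept with its original column order.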
If you don't care about ordering, then many databases support least()/greatest():
select least(a, b) as a, greatest(a, b) as b
from t
group by least(a, b), greatest(a, b);
You can do the same thing with case expressions. Or, more simply as:
select distinct least(a, b) as a, greatest(a, b) as b
from t;
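For databases without least()/greatest(), the case-expression variant mentioned above might look like the following sketch. It is run here through Python's sqlite3 module with illustrative sample rows; both are assumptions, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (a integer, b integer)")
conn.executemany("insert into t values (?, ?)", [(1, 2), (2, 1), (5, 4)])

# Put the smaller value in the first column and the larger in the second,
# then let DISTINCT collapse the mirrored pairs.
query = """
select distinct
       case when a <= b then a else b end as a,
       case when a <= b then b else a end as b
from t
"""

normalized = sorted(conn.execute(query).fetchall())
print(normalized)
```

Unlike the not exists version, this reorders values within a row, so 5 4 comes back as 4 5.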
Related
My input is:
a b c
-------
A 5 3
A 4 2
B 3 1
B 5 3
I would like to get all a values having the same values in b and c, so the output should be as:
{A,B} 5 3
I am using the group by, but I am not achieving my goal.
In standard SQL, this would look like:
select b, c, listagg(a, ',') within group (order by a)
from t
group by b, c;
Not all databases support listagg(), but most have a method for concatenating strings.
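As one example of such a method, SQLite offers group_concat(). The sketch below runs the same grouping through Python's sqlite3 module on the sample data from the question. SQLite does not promise a concatenation order here, so the names are sorted in Python afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (a text, b integer, c integer)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [("A", 5, 3), ("A", 4, 2), ("B", 3, 1), ("B", 5, 3)],
)

# group_concat() is SQLite's string aggregation function.
query = """
select b, c, group_concat(a, ',') as a_values
from t
group by b, c
"""

# Map each (b, c) pair to the sorted list of a values that share it.
groups = {(b, c): sorted(a_values.split(",")) for b, c, a_values in conn.execute(query)}
print(groups[(5, 3)])
```

If only pairs shared by more than one a value are wanted, adding having count(*) > 1 to the query would restrict the output to the 5 3 row.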
In Hive, you would use collect_list() or collect_set():
select b, c, collect_list(a)
from t
group by b, c;
You can convert the array back to a string, but I recommend keeping it as an array.
I have a table in the following format:
A B C D
7 7 2 12
2 2 3 4
2 2 2 4
2 2 2 3
5 5 2 7
I would like to calculate correlations between each of the columns using the built-in correlation function (https://prestodb.io/docs/current/functions/aggregate.html corr(y, x) → double)
I could run over all the columns and perform the corr calculation each time with:
select corr(A,B) from table
but I would like to reduce the number of times I access Presto and run it in one query if it's possible.
Would it be possible to get as a result the column names that pass a certain threshold or at least the correlation scores between all possible combinations in one query?
Thanks.
I would like to calculate correlations between each of the columns
Correlation involves two series of data (in SQL, two columns). So I understand your question as: how to compute the correlation for each and every possible combination of columns in the table. That would look like:
select
corr(a, b) corr_a_b,
corr(a, c) corr_a_c,
corr(a, d) corr_a_d,
corr(b, c) corr_b_c,
corr(b, d) corr_b_d,
corr(c, d) corr_c_d
from mytable
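With more columns, writing every pair by hand gets tedious. A small client-side sketch can generate the select list instead. The helper name corr_query is hypothetical, and it interpolates identifiers without quoting, so it assumes the table and column names come from a trusted source.

```python
from itertools import combinations


def corr_query(table, columns):
    """Build one query with a corr() expression for every pair of columns."""
    exprs = [
        f"corr({x}, {y}) as corr_{x}_{y}"
        for x, y in combinations(columns, 2)
    ]
    return "select\n  " + ",\n  ".join(exprs) + f"\nfrom {table}"


sql = corr_query("mytable", ["a", "b", "c", "d"])
print(sql)
```

The resulting string can then be sent to the database in a single round trip.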
You can use a lateral join to unpivot the table, then a self join and aggregation:
with v as (
select v.*, t.id
from (select t.*,
row_number() over (order by a) as id
from t
) t cross join lateral
(values ('a', a), ('b', b), ('c', c), ('d', d)
) v(col, val)
)
select v1.col, v2.col, corr(v1.val, v2.val)
from v v1 join
v v2
on v1.id = v2.id and v1.col < v2.col
group by v1.col, v2.col;
The row_number() is only to generate a unique id for each row, which is then used for the self-join. You may already have a column with this information, so that might not be necessary.
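The unpivot-and-self-join idea can be tried out locally with the sketch below, which uses Python's sqlite3 module and the sample data from the question. Several things here are substitutions, not the original approach: SQLite has no lateral join or corr(), so the unpivot uses union all, the row id comes from SQLite's implicit rowid, the Pearson coefficient is spelled out with avg(), and sqrt() is registered from Python in case the SQLite build lacks it.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.create_function("sqrt", 1, math.sqrt)  # not every SQLite build ships sqrt()
conn.execute("create table t (a real, b real, c real, d real)")
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [(7, 7, 2, 12), (2, 2, 3, 4), (2, 2, 2, 4), (2, 2, 2, 3), (5, 5, 2, 7)],
)

query = """
with v as (
    -- unpivot: one row per (original row id, column name, value)
    select rowid as id, 'a' as col, a as val from t
    union all select rowid, 'b', b from t
    union all select rowid, 'c', c from t
    union all select rowid, 'd', d from t
)
select v1.col, v2.col,
       -- Pearson correlation: cov(x, y) / (stddev(x) * stddev(y))
       (avg(v1.val * v2.val) - avg(v1.val) * avg(v2.val))
       / (sqrt(avg(v1.val * v1.val) - avg(v1.val) * avg(v1.val))
          * sqrt(avg(v2.val * v2.val) - avg(v2.val) * avg(v2.val))) as r
from v v1
join v v2 on v1.id = v2.id and v1.col < v2.col
group by v1.col, v2.col
"""

correlations = {(c1, c2): r for c1, c2, r in conn.execute(query)}
for pair, r in sorted(correlations.items()):
    print(pair, round(r, 4))
```

Each column pair comes back as its own row, which makes it easy to filter by a threshold with a having clause. Note that the hand-written formula divides by zero for a constant column, which the sample data does not contain.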
I need to filter data using different conditions. One is that I need to check if the values in one column (column d) are unique IF the values in another column (c) are greater than 1.
Let's assume:
Column a, b, c, d
So I don't want any entries where c is greater than 1 while d has non-unique values.
Select TOP 100 * From table
Where (a = 'Max' AND b = '2019') -- just an additional filter, which always applies
AND (c = 1 -- if c is one, that is fine
OR (c > 1 AND -- here I want to check if c is bigger than 1 AND if d is unique; but that's the part I need help with
);
Thank you very much in advance!
Create a CTE where you count the distinct values of column d and use it in the WHERE clause:
with cte as (
select count(distinct d) counter from tablename
)
...........................................
Where ....(c > 1 AND (select counter from cte) = 1)
I am seeking a way to SELECT rows conditionally without having only compound key A,B (refer to the picture).
Furthermore, I need to select rows where both a negative and a positive value of column C are present, skipping 0. An A, B group may have any number of rows; the minimum is 2, where C has a negative or positive row.
The data found below is already queried.
Note: I was able to add another column D, because we can't use actual values for C:
D = CASE WHEN C < 0 THEN 1 ELSE 2 end
So the logic could be SELECT * WHERE SUM(D) >= 3.
I am fully able to complete this task with another language such as C#, but I have to get this done using only SQL.
I would also like to avoid temporary tables. Column D is not required.
Would this work?
Select tblA.*
FROM tblA
INNER JOIN
(select A,B
from tblA
Group By A,B
HAVING
SUM(case when C<0 then 1 else 2 end) >=3
)X
on X.A=tblA.A and X.B=tblA.B
SQLFiddle
http://sqlfiddle.com/#!9/2078f/2
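One thing to check with the sum-based test: a group with two positive rows adds up to 4, and a group with one negative row and one zero adds up to 3, so both pass. The sketch below compares it with a stricter variant that counts negative and positive rows separately. It runs through Python's sqlite3 module on made-up sample rows, and it assumes that "skipping 0" means zero rows should be left out of the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tblA (A integer, B integer, C integer)")
conn.executemany(
    "insert into tblA values (?, ?, ?)",
    [
        (1, 1, -5), (1, 1, 5), (1, 1, 0),  # negative and positive present
        (2, 1, 3), (2, 1, 4),              # positives only
        (3, 1, -2), (3, 1, -1),            # negatives only
        (4, 1, -1), (4, 1, 0),             # negative and zero only
    ],
)

# Groups accepted by the sum-based HAVING clause from the answer above.
sum_based = """
select A, B
from tblA
group by A, B
having sum(case when C < 0 then 1 else 2 end) >= 3
"""
sum_groups = sorted(conn.execute(sum_based).fetchall())

# Stricter variant: require at least one negative and one positive row.
strict = """
select tblA.*
from tblA
inner join (
    select A, B
    from tblA
    group by A, B
    having sum(case when C < 0 then 1 else 0 end) > 0
       and sum(case when C > 0 then 1 else 0 end) > 0
) X on X.A = tblA.A and X.B = tblA.B
where tblA.C <> 0
"""
strict_rows = sorted(conn.execute(strict).fetchall())

print(sum_groups)
print(strict_rows)
```

On this data the sum-based test also accepts the positives-only group and the negative-plus-zero group, while the stricter variant returns only the two non-zero rows of group 1, 1.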
In Oracle SQL Developer, how do I compare three tables where A + B = C table? I have to validate if all the data of A and B is converted into C. Also table A is in a different database from B and C, which are in the same database.
Let me assume that the tables have one column, an id. You could use full outer join for this, assuming the id is never NULL. However, this is probably easier using union all and aggregation.
You can get a list of ids that differ using the following query:
select id, sum(inab) as inab, sum(inc) as inc
from ((select id, 1 as inab, 0 as inc
from a
) union all
(select id, 1 as inab, 0 as inc
from b
) union all
(select id, 0 as inab, 1 as inc
from c
)
) c
group by id
having sum(inab) <> 1 or sum(inc) <> 1;
In practice, you would probably have multiple columns. Note: if there are duplicates in A+B or C, this just guarantees that the duplicate appears in both (rather than in both with the same count).
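Here is a minimal sketch of the union all comparison, run through Python's sqlite3 module with made-up ids. All three tables sit in one in-memory database here, which sidesteps the cross-database part of the question, and the parentheses around each branch are dropped because SQLite does not accept them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("a", "b", "c"):
    conn.execute(f"create table {name} (id integer)")
conn.executemany("insert into a values (?)", [(1,), (2,)])
conn.executemany("insert into b values (?)", [(3,), (4,)])
# id 4 was never converted into c, and id 5 appears in c without a source.
conn.executemany("insert into c values (?)", [(1,), (2,), (3,), (5,)])

query = """
select id, sum(inab) as inab, sum(inc) as inc
from (select id, 1 as inab, 0 as inc from a
      union all
      select id, 1, 0 from b
      union all
      select id, 0, 1 from c
     ) x
group by id
having sum(inab) <> 1 or sum(inc) <> 1
order by id
"""

mismatches = conn.execute(query).fetchall()
print(mismatches)
```

Each returned row shows how often the id occurs in A+B and in C, so a missing conversion shows up as inc = 0 and an unexpected row in C as inab = 0.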