SQL: select rows where a group of elements occurs several times in the table

I am searching for an implementation of the following pseudo-code:
SELECT A, B, C
FROM X
HAVING COUNT(A,B) > 1
Here is an example of what the code should do:
Assume table X looks as follows:
A B C D
--------------
1 1 0 2
1 1 1 1
2 1 1 0
The first and the second row have the same entries in columns A and B; the third row is identical in column B but different in column A. The desired output is columns A, B, and C of rows 1 and 2:
1 1 0
1 1 1
How could this be implemented? The problem with my pseudo-code is that COUNT accepts either a single column or all columns (*), but it can't take two out of four columns. GROUP BY has the same property.

You can do this with an exists clause. This should work in all databases:
select a, b, c
from x
where exists (select 1
from x x2
where x.a = x2.a and x.b = x2.b and x.c <> x2.c
);
This assumes that the rows sharing the same a and b have different c values.
This will perform best with an index on x(a, b).

For an RDBMS that supports analytic functions, you can do:
SELECT a,b,c
FROM
(
SELECT a, b, c, count(1) OVER(PARTITION BY a,b) cnt
FROM X
)t1
WHERE t1.cnt >1
If analytic (window) functions are not available, a join should do the job:
SELECT t1.a, t1.b, t1.c
FROM X t1
INNER JOIN
(
SELECT a,b
FROM X
GROUP BY a,b
HAVING COUNT(1) >1
)t2 ON (t2.a=t1.a AND t2.b=t1.b)
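
As a quick check of the join approach, here is a minimal sketch that loads the sample table from the question into an in-memory SQLite database and runs the GROUP BY / HAVING join. Using Python's sqlite3 module is an assumption made only for illustration; the question does not name a database, and the ORDER BY is added just to make the output deterministic.

```python
import sqlite3

# In-memory database holding the sample table X from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (a INTEGER, b INTEGER, c INTEGER, d INTEGER)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?, ?)",
    [(1, 1, 0, 2), (1, 1, 1, 1), (2, 1, 1, 0)],
)

# Find the (a, b) pairs that occur more than once, then join back
# to the table to fetch columns a, b, c of the matching rows.
duplicate_rows = conn.execute(
    """
    SELECT t1.a, t1.b, t1.c
    FROM x AS t1
    INNER JOIN (
        SELECT a, b
        FROM x
        GROUP BY a, b
        HAVING COUNT(*) > 1
    ) AS t2 ON t2.a = t1.a AND t2.b = t1.b
    ORDER BY t1.a, t1.b, t1.c
    """
).fetchall()

print(duplicate_rows)  # [(1, 1, 0), (1, 1, 1)]
```

Note that GROUP BY a, b groups on both columns at once, which is what the pseudo-code COUNT(A,B) was trying to express.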

Related

Break out nested data within SQL, criteria across multiple rows (similar to dcast in R)

I'm trying to write a simple query to take a data set that looks like this:
ID | Col2
X B
X C
Y B
Y D
and return this:
ID | Col2 | Col3
X B C
Y B D
Essentially, I have an ID column that can have either B, C, or D in Col2. I am trying to identify which IDs only have B and D. I have a query to find both, but not only that combination. Query:
select ID, Col2
from Table1
where ID in (
select ID from Table1
group by ID
having count(distinct Col2) = 2)
order by ID
Alternatively, I could use help in finding a way to filter that query on B and D and leave off B and C. I have seen perhaps a self join, but am not sure how to implement that.
Thanks!
EDIT: Most of the data set has, for a given ID, all three of B, C, and D. The goal here is to isolate the IDs that are missing one, namely missing C.
"I am trying to identify which IDs only have B and D. I have a query to find both"
If this is what you want, you don't need multiple columns:
select id
from table1
where col2 in ('B', 'D')
group by id
having count(distinct col2) = 2;
If you want only 'B' and 'D' and no others, then:
select id
from table1
group by id
having sum(case when col2 = 'B' then 1 else 0 end) > 0 AND
sum(case when col2 = 'D' then 1 else 0 end) > 0 AND
sum(case when col2 not in ('B', 'D') then 1 else 0 end) = 0;
If each id has at most two distinct values, you can also easily pivot the values using aggregation:
select id, min(col2), nullif(max(col2), min(col2))
from table1
group by id;
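
The two aggregation queries can be tried against the four sample rows from the question. This is a sketch under assumptions: SQLite via Python's sqlite3 module stands in for whatever database the asker uses, and the "only B and D" filter tests for 'B' and 'D' being present with nothing else, matching the stated goal.

```python
import sqlite3

# Sample data from the question: X has B and C, Y has B and D.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("X", "B"), ("X", "C"), ("Y", "B"), ("Y", "D")],
)

# IDs that have both 'B' and 'D' and no other value.
only_b_and_d = conn.execute(
    """
    SELECT id
    FROM table1
    GROUP BY id
    HAVING SUM(CASE WHEN col2 = 'B' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN col2 = 'D' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN col2 NOT IN ('B', 'D') THEN 1 ELSE 0 END) = 0
    ORDER BY id
    """
).fetchall()

# Pivot the (at most two) values per id into two columns.
pivoted = conn.execute(
    """
    SELECT id, MIN(col2), NULLIF(MAX(col2), MIN(col2))
    FROM table1
    GROUP BY id
    ORDER BY id
    """
).fetchall()

print(only_b_and_d)  # [('Y',)]
print(pivoted)       # [('X', 'B', 'C'), ('Y', 'B', 'D')]
```

The NULLIF in the pivot returns NULL in the last column when an id has only a single distinct value.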

SQL aggregate and filter functions

Consider following table:
Number | Value
1 a
1 b
1 a
2 a
2 a
3 c
4 a
5 d
5 a
I want to select every number for which all values are the same, so my result should be:
Number | Value
2 a
3 c
4 a
I managed to get the right numbers by using nested SQL statements like the one below. I am wondering if there is a simpler solution to my problem.
SELECT
a.n,
COUNT(n)
FROM
(
SELECT number n , value k
FROM testtable
GROUP BY number, value
) a
GROUP BY n
HAVING COUNT(n) = 1
You can try this
SELECT NUMBER,MAX(VALUE) AS VALUE FROM TESTTABLE
GROUP BY NUMBER
HAVING MAX(VALUE)=MIN(VALUE)
You can also try this:
SELECT DISTINCT t.number, t.value
FROM testtable t
LEFT JOIN testtable t_other
ON t.number = t_other.number AND t.value <> t_other.value
WHERE t_other.number IS NULL
Another alternative using exists.
select distinct num, val from testtable a
where not exists (
select 1 from testtable b
where a.num = b.num
and a.val <> b.val
)
http://sqlfiddle.com/#!9/dd080dd/5
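
For reference, a minimal sketch that runs the MAX = MIN variant against the sample table from the question. SQLite through Python's sqlite3 module is an assumption used for illustration only, and the ORDER BY is added to keep the output order stable.

```python
import sqlite3

# Sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (number INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO testtable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "a"),
     (3, "c"), (4, "a"), (5, "d"), (5, "a")],
)

# A group has a single distinct value exactly when its largest
# and smallest values are equal.
single_valued = conn.execute(
    """
    SELECT number, MAX(value) AS value
    FROM testtable
    GROUP BY number
    HAVING MAX(value) = MIN(value)
    ORDER BY number
    """
).fetchall()

print(single_valued)  # [(2, 'a'), (3, 'c'), (4, 'a')]
```

One caveat: aggregates skip NULLs, so a group mixing one value with NULLs would also pass this test.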

How to sort the query result by the number of occurrences of a specific column value in ACCESS SQL?

For example:
This is the original result
Alpha Beta
A 1
B 2
B 3
C 4
After ordering by the number of occurrences of each Alpha value, this is the result I want:
Alpha Beta
B 2
B 3
A 1
C 4
I tried to use GroupBy and OrderBy, but ACCESS always asks me to include all columns.
Why is 'B' placed before 'A'? I don't understand this order.
Anyway, judging from your data sample it doesn't seem like you need a GROUP BY, but for your desired result you can use a CASE expression:
SELECT t.alpha,t.beta FROM YourTable t
ORDER BY CASE WHEN t.alpha = 'B' THEN 1 ELSE 0 END DESC,
t.alpha,
t.beta
EDIT: Use this query:
SELECT t.alpha,t.beta FROM YourTable t
INNER JOIN(SELECT s.alpha,count(*) as cnt
FROM YourTable s
GROUP BY s.alpha) t2
ON(t.alpha = t2.alpha)
ORDER BY t2.cnt DESC,t.alpha,t.beta
The query counts the number of rows for every distinct Alpha and sorts by that count. This is general SQL; tweak it for ACCESS if needed.
SELECT t1.alpha,t1.beta
FROM t t1
JOIN (
SELECT t2.alpha, count(*) AS n FROM t t2 GROUP BY t2.alpha
) t3 ON t3.alpha = t1.alpha
ORDER BY t3.n DESC, t1.alpha, t1.beta
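
A minimal sketch of the count-and-join idea, run on the sample rows from the question. It uses SQLite through Python's sqlite3 module rather than ACCESS, which is an assumption for illustration, so the syntax may need adjusting there. The descending sort on the count is my reading of the desired output, where the most frequent Alpha value ('B') comes first.

```python
import sqlite3

# Sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (alpha TEXT, beta INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 1), ("B", 2), ("B", 3), ("C", 4)],
)

# Count the rows per alpha in a derived table, join the counts back
# to every row, and sort by that count (largest first).
sorted_rows = conn.execute(
    """
    SELECT t1.alpha, t1.beta
    FROM t AS t1
    JOIN (
        SELECT alpha, COUNT(*) AS n
        FROM t
        GROUP BY alpha
    ) AS t3 ON t3.alpha = t1.alpha
    ORDER BY t3.n DESC, t1.alpha, t1.beta
    """
).fetchall()

print(sorted_rows)  # [('B', 2), ('B', 3), ('A', 1), ('C', 4)]
```

Because the grouping happens in the derived table, the outer query can select every column without listing them in a GROUP BY.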

Combine two (or multiple) columns of a table

I have a table
a b c
1 2
1 3
1 4 1
2 1 2
Columns a and c should be combined if their values are the same. If they are not the same, it is always the case that one of them is empty.
So the result should be:
a b
1 2
1 3
1 4
2 1
Is there any function that can be applied in PostgreSQL?
According to your description:
"Columns a and c should be combined if their values are the same. If they are not the same, it is always the case that one of them is empty."
all you need is an unconditional COALESCE.
SELECT COALESCE(a, c) AS a, b FROM tbl;
Assuming that by "empty" you mean NULL, not an empty string (''), in which case you'd add NULLIF:
SELECT COALESCE(NULLIF(a, ''), c) AS a, b FROM tbl;
COALESCE works for multiple parameters:
SELECT COALESCE(a, c, d, e, f, g) AS a, b FROM tbl;
Are you looking for something like this?
SELECT COALESCE(c, a), b
FROM your_table
WHERE COALESCE(c, a) = a
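
To illustrate the unconditional COALESCE, here is a small sketch. The sample table in the question does not show which cell is empty in its first two rows, so the rows below are hypothetical: one has c empty and one has a empty, with "empty" assumed to mean NULL. SQLite via Python's sqlite3 module is used instead of PostgreSQL purely so the example is self-contained; COALESCE behaves the same way for this case.

```python
import sqlite3

# Hypothetical rows: which column is NULL in the first two rows is an
# assumption, since the question's table does not make it clear.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [(1, 2, None), (None, 3, 1), (1, 4, 1), (2, 1, 2)],
)

# COALESCE returns its first non-NULL argument, so a is used when
# present and c fills in otherwise.
combined = conn.execute(
    "SELECT COALESCE(a, c) AS a, b FROM tbl ORDER BY 1, 2"
).fetchall()

print(combined)  # [(1, 2), (1, 3), (1, 4), (2, 1)]
```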

SQL (TSQL) - Select values in a column where another column is not null?

I will keep this simple: I would like to know if there is a good way to select all the values in a column when it never has a null in another column. For example:
A B
----- -----
1 7
2 7
NULL 7
4 9
1 9
2 9
From the above set I would just want 9 from B and not 7, because 7 has a NULL in A. Obviously I could wrap this as a subquery and use the IN clause etc., but this is already part of a pretty unique set and I am looking to keep this efficient.
I should note that for my purposes this would only be a one-way comparison... I would only be returning values in B and examining A.
I imagine there is an easy way to do this that I am missing, but being in the thick of things I don't see it right now.
You can do something like this:
select *
from t
where t.b not in (select b from t where a is null);
If you want only distinct b values, then you can do:
select b
from t
group by b
having sum(case when a is null then 1 else 0 end) = 0;
And, finally, you could use window functions:
select a, b
from (select t.*,
sum(case when a is null then 1 else 0 end) over (partition by b) as NullCnt
from t
) t
where NullCnt = 0;
The query below outputs only one column in the final result. The records are grouped by column B, and each record is tested for a NULL in column A. Every NULL adds 1 to the sum for its group. The HAVING clause keeps only the groups whose sum is 0.
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
If you want all columns of the matching rows, you can use a join:
SELECT a.*
FROM TableName a
INNER JOIN
(
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
) b ON a.b = b.b
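
Both queries can be checked against the sample data from the question. This is a sketch under assumptions: the table name t and the use of SQLite through Python's sqlite3 module (rather than SQL Server) are chosen only for illustration.

```python
import sqlite3

# Sample data from the question; None is stored as NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 7), (2, 7), (None, 7), (4, 9), (1, 9), (2, 9)],
)

# Values of b whose group contains no NULL in a.
clean_b = conn.execute(
    """
    SELECT b
    FROM t
    GROUP BY b
    HAVING SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) = 0
    """
).fetchall()

# Full rows for those b values, via a join on the grouped result.
clean_rows = conn.execute(
    """
    SELECT t.a, t.b
    FROM t
    INNER JOIN (
        SELECT b
        FROM t
        GROUP BY b
        HAVING SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) = 0
    ) AS g ON g.b = t.b
    ORDER BY t.a
    """
).fetchall()

print(clean_b)     # [(9,)]
print(clean_rows)  # [(1, 9), (2, 9), (4, 9)]
```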