How can I group by two rows in SQL? - sql

In the result of an SQL Select command I have two rows:
A | B
B | A
A|B and B|A mean the same to me, so I want only one of them to be returned by the query.
How can I do that?
My select statement joins the table with itself (a self join), like this:
SELECT a.coloumn ,b.coloumn
FROM table a,table b
where .... (not important)
and b.coloumn IN (
SELECT coloumn
FROM table
where ... (the same like above)
)
and b.coloumn != a.coloumn ;
And after that I have multiple columns.

You neither told us your column names nor your table name, but assuming you have two columns A and B in a table named the_table then the following will do:
select distinct least(a,b), greatest(a,b)
from the_table;

If you want to group by them using standard SQL:
select (case when a < b then a else b end) as a,
(case when a < b then b else a end) as b,
count(*) as cnt
from the_table t
group by (case when a < b then a else b end),
(case when a < b then b else a end);
Oracle supports the greatest() and least() functions, but not all databases do.
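A minimal sketch of the CASE-based grouping, run against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up for illustration, and the output aliases lo and hi are hypothetical names chosen so they do not clash with the real columns:

```python
import sqlite3

# Hypothetical sample data: (A, B) and (B, A) should count as the same pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (a TEXT, b TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?)",
    [("A", "B"), ("B", "A"), ("C", "D")],
)

# Put the smaller value first so both orderings fall into one group.
pairs = conn.execute(
    """
    SELECT CASE WHEN a < b THEN a ELSE b END AS lo,
           CASE WHEN a < b THEN b ELSE a END AS hi,
           COUNT(*) AS cnt
    FROM the_table
    GROUP BY CASE WHEN a < b THEN a ELSE b END,
             CASE WHEN a < b THEN b ELSE a END
    ORDER BY lo, hi
    """
).fetchall()

print(pairs)  # [('A', 'B', 2), ('C', 'D', 1)]
```

Note that a row with a NULL in either column takes the ELSE branch in both CASE expressions, so NULL handling would need a separate decision.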

Another possible solution is:
select a, b from the_table
union
select b, a from the_table
This would work fine even if there are NULL values.

Related

SQL code to count only duplicates where all instances of the duplicate are null

I have a large data set with duplicate reference numbers (reference duplications range from 0 to 37 times). I want to count the number of references only where all instances are null in two columns. So using the table below, the code should return 1 because only Reference Code 3 has all null values, and the duplicates should only be counted once.
I would be grateful for any help.
This involves two steps: (1) isolate all the distinct pairs of values that only have null; (2) count each one once. One way to express this in a query is:
SELECT COUNT(*) FROM
(
SELECT refnum FROM #ref
GROUP BY refnum
HAVING MIN(colA) IS NULL
AND MIN(colB) IS NULL
) AS x;
Use aggregation to get the codes:
select code
from t
group by code
having max(a) is null and max(b) is null;
If you want the count, use a subquery:
select count(*)
from (select code
from t
group by code
having max(a) is null and max(b) is null
) t;
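A minimal sketch of the aggregation approach in SQLite via Python. The question's table is not shown, so the rows below are invented: only code 3 has NULL in both columns on every row. It relies on MAX ignoring NULLs and returning NULL only when every value in the group is NULL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (code INTEGER, a INTEGER, b INTEGER)")
# Invented sample rows; None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 5, None),
        (1, None, None),
        (2, None, 8),
        (3, None, None),
        (3, None, None),
    ],
)

# Count each code once, and only if a and b are NULL on all of its rows.
all_null_codes = conn.execute(
    """
    SELECT COUNT(*)
    FROM (SELECT code
          FROM t
          GROUP BY code
          HAVING MAX(a) IS NULL AND MAX(b) IS NULL) AS x
    """
).fetchone()[0]

print(all_null_codes)  # 1
```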
With conditional aggregation:
select
refcode
from referencecodes
group by refcode
having sum(case when (a is null and b is null) then 0 else 1 end) = 0
The above will return the codes with only null values in a and b.
If you want the number of codes:
select count(r.refcode) from (
select
refcode
from referencecodes
group by refcode
having sum(case when (a is null and b is null) then 0 else 1 end) = 0
) r
Or with EXISTS:
select
count(distinct r.refcode)
from referencecodes r
where not exists (
select 1 from referencecodes
where (refcode = r.refcode) and (a is not null or b is not null)
)
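As a rough check that the conditional aggregation and the NOT EXISTS version agree, here is a sketch in SQLite via Python. The rows are invented, and the inner alias x is added only to make the correlation explicit:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE referencecodes (refcode INTEGER, a INTEGER, b INTEGER)")
# Invented sample rows: only refcode 3 is NULL in both columns everywhere.
conn.executemany(
    "INSERT INTO referencecodes VALUES (?, ?, ?)",
    [(1, 5, None), (1, None, None), (2, None, 8), (3, None, None), (3, None, None)],
)

# Conditional aggregation: every row that is not all-NULL adds 1 to the sum.
by_having = conn.execute(
    """
    SELECT COUNT(r.refcode)
    FROM (SELECT refcode
          FROM referencecodes
          GROUP BY refcode
          HAVING SUM(CASE WHEN a IS NULL AND b IS NULL THEN 0 ELSE 1 END) = 0) r
    """
).fetchone()[0]

# NOT EXISTS: keep a refcode only if no row of it has a non-NULL a or b.
by_exists = conn.execute(
    """
    SELECT COUNT(DISTINCT r.refcode)
    FROM referencecodes r
    WHERE NOT EXISTS (SELECT 1
                      FROM referencecodes x
                      WHERE x.refcode = r.refcode
                        AND (x.a IS NOT NULL OR x.b IS NOT NULL))
    """
).fetchone()[0]

print(by_having, by_exists)  # 1 1
```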

Filter if values provided otherwise return everything

Say I have a table t with 2 columns:
a int
b int
I can do a query such as:
select b
from t
where b > a
and a in(1,2,3)
order by b
where 1,2,3 is provided from the outside.
Obviously, the query can return no rows. In that case, I'd like to select everything as if the query did not have the and a in(1,2,3) part. That is, I'd like:
if exists (
select b
from t
where b > a
and a in(1,2,3)
)
select b
from t
where b > a
and a in(1,2,3)
order by b
else
select b
from t
where b > a
order by b
Is there a way to do this:
Without running two queries (one for exists, the other one the actual query)
That is less verbose than repeating queries (real queries are quite long, so DRY and all that stuff)
Use NOT EXISTS with a subquery to determine whether any row meets the condition:
SELECT b
FROM
t
WHERE
b > a
AND (
NOT EXISTS (SELECT 1 FROM t WHERE b > a AND a IN (1,2,3))
OR a IN (1,2,3)
)
ORDER BY
b
This works because of the two branches of the OR: if any row meets the condition, NOT EXISTS is false and only the rows with a IN (1,2,3) pass; if no row meets it, NOT EXISTS is true and ALL rows pass.
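A minimal sketch of the NOT EXISTS fallback in SQLite via Python. The helper name select_b and the sample rows are hypothetical. In this sketch the subquery repeats the b > a filter so that it mirrors the IF EXISTS test from the question, and the helper assumes a non-empty list of values:

```python
import sqlite3

def select_b(conn, wanted):
    """Return b for rows with b > a, limited to a IN wanted when that matches any row."""
    marks = ", ".join("?" * len(wanted))  # one placeholder per value
    sql = f"""
        SELECT b
        FROM t
        WHERE b > a
          AND (a IN ({marks})
               OR NOT EXISTS (SELECT 1 FROM t WHERE b > a AND a IN ({marks})))
        ORDER BY b
    """
    # The value list is bound twice: once per IN list.
    return [row[0] for row in conn.execute(sql, list(wanted) * 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, 5), (2, 1), (7, 9), (8, 10)])

print(select_b(conn, [1, 2, 3]))  # [5]
print(select_b(conn, [2, 3]))     # [5, 9, 10]
```

The second call falls back to all rows because the only row with a = 2 fails b > a, which is the case where leaving b > a out of the subquery would return nothing.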
Or with a common table expression and a window function with conditional aggregation:
WITH cte AS (
SELECT
b
,CASE WHEN a IN (1,2,3) THEN 1 ELSE 0 END as MeetsCondition
,COUNT(CASE WHEN a IN (1,2,3) THEN a END) OVER () as ConditionCount
FROM
t
WHERE
b > a
)
SELECT
b
FROM
cte
WHERE
(ConditionCount > 0 AND MeetsCondition = 1)
OR (ConditionCount = 0)
ORDER BY
b
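The window-function variant can be tried the same way. This is a sketch assuming SQLite 3.25 or later (needed for OVER), with the b > a filter placed inside the CTE so that the count only sees qualifying rows. The function name filtered_or_all and the sample rows are hypothetical:

```python
import sqlite3

QUERY = """
WITH cte AS (
    SELECT b,
           CASE WHEN a IN (1, 2, 3) THEN 1 ELSE 0 END AS meets_condition,
           COUNT(CASE WHEN a IN (1, 2, 3) THEN a END) OVER () AS condition_count
    FROM t
    WHERE b > a
)
SELECT b
FROM cte
WHERE (condition_count > 0 AND meets_condition = 1)
   OR condition_count = 0
ORDER BY b
"""

def filtered_or_all(rows):
    """Load rows into a fresh table t and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return [row[0] for row in conn.execute(QUERY)]

print(filtered_or_all([(1, 5), (7, 9), (8, 10)]))  # [5]
print(filtered_or_all([(2, 1), (7, 9), (8, 10)]))  # [9, 10]
```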
I find it a bit "ugly". It might be better to materialize the output of your query in a temp table and then, based on the row count of the temp table, run the first or the second query (this cuts access to the original table from 3 times to 2, and you can add a flag marking the rows that meet your condition so you don't have to repeat it). Other than that, read on.
Bear in mind, though, that an EXISTS query should execute pretty fast: it stops as soon as it finds a row that satisfies the condition.
You could achieve this by using UNION ALL to combine the result set of the constrained query with that of the full query (without the constraint on column a), and then using a CASE expression to decide which one to show, depending on the output of the first query.
How the CASE expression works: when the constrained part of your query returns any row, return the result set of the constrained query; otherwise return everything, omitting the constraint.
If your database supports CTEs, use this solution:
with tmp_data as (
select *
from (
select 'constraint' as type, b
from t
where b > a
and a in (1,2,3) -- here goes your constraint
union all
select 'full query' as type, b
from t
where b > a
) foo
)
SELECT b
FROM tmp_data
WHERE
CASE WHEN (select count(*) from tmp_data where type = 'constraint') > 0
THEN type = 'constraint'
ELSE type = 'full query'
END
;
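A sketch of the UNION ALL idea in SQLite via Python. Note one deliberate change: instead of a CASE that returns a boolean, which not every database accepts in a WHERE clause, this variant compares type against a CASE that returns the label to keep. The function name constrained_or_full and the sample rows are hypothetical:

```python
import sqlite3

QUERY = """
WITH tmp_data AS (
    SELECT 'constraint' AS type, b
    FROM t
    WHERE b > a
      AND a IN (1, 2, 3)   -- here goes your constraint
    UNION ALL
    SELECT 'full query' AS type, b
    FROM t
    WHERE b > a
)
SELECT b
FROM tmp_data
WHERE type = CASE
                 WHEN (SELECT COUNT(*) FROM tmp_data WHERE type = 'constraint') > 0
                 THEN 'constraint'
                 ELSE 'full query'
             END
ORDER BY b
"""

def constrained_or_full(rows):
    """Load rows into a fresh table t and run QUERY against it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return [row[0] for row in conn.execute(QUERY)]

print(constrained_or_full([(1, 5), (7, 9), (8, 10)]))  # [5]
print(constrained_or_full([(2, 1), (7, 9), (8, 10)]))  # [9, 10]
```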

SQL using CASE in SELECT with GROUP BY. Need CASE-value but get row-value

So basically there is 1 question and 1 problem:
1. Question - when I have like 100 columns in a table (and no key or unique index is set) and I want to join or subselect that table with itself, do I really have to write out every column name?
2. Problem - the example below shows the first question and my actual SQL statement problem
Example:
SELECT
A.FIELD1,
(SELECT CASE WHEN B.FIELD2 = 1 THEN B.FIELD3 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD1,
(SELECT CASE WHEN B.FIELD2 = 2 THEN B.FIELD4 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD2
FROM TABLE A
GROUP BY A.FIELD1
The story is: if I don't put the CASE into its own select statement, then I have to put the actual column name into the GROUP BY, and the GROUP BY then groups by the actual value in the row rather than by the NULL from the CASE. Because of that I would have to either join or subselect on all columns, since there is no key and no unique index, or somehow find another solution.
The database server is DB2.
Now to describe it in words, without SQL:
I have "order items" which can be divided into "ZD" and "EK" (1 = ZD, 2 = EK) and can be grouped by "distributor". Even though an "order item" has only one of the two "departments" (ZD, EK), the fields for "ZD" and "EK" are always both filled. I need the grouping to take the "department" into account: a new group should be created only when the value for the designated "department" (ZD or EK) changes.
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
ZD,
EK,
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
This worked, with the CASE expressions in the SELECT and ZD, EK in the GROUP BY. The only problem was that even when EK was not the designated DEPARTEMENT, a change in EK still opened a new group, because the GROUP BY used the real EK value and not the NULL from the CASE, as I explained above.
And here, ladies and gentlemen, is the solution to the problem:
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END),
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END),
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
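To see the effect of grouping by the CASE expressions rather than the raw columns, here is a small sketch in SQLite via Python (not DB2, so treat it as an illustration of the idea only). The table name order_items, the aliases zd_grp and ek_grp, and the rows are all hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items "
    "(departement INTEGER, zd INTEGER, ek INTEGER, distributor TEXT, something INTEGER)"
)
# Invented rows: for departement 1 the EK value changes, for departement 2 the ZD value changes.
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 100, "X", 5),
        (1, 10, 200, "X", 7),
        (2, 10, 300, "X", 1),
        (2, 20, 300, "X", 2),
    ],
)

# Group by the CASE results, so the non-designated column becomes NULL
# and can no longer split a group.
groups = conn.execute(
    """
    SELECT CASE WHEN departement = 1 THEN zd END AS zd_grp,
           CASE WHEN departement = 2 THEN ek END AS ek_grp,
           distributor,
           SUM(something) AS something_sum
    FROM order_items
    GROUP BY CASE WHEN departement = 1 THEN zd END,
             CASE WHEN departement = 2 THEN ek END,
             distributor,
             departement
    ORDER BY departement
    """
).fetchall()

print(groups)  # [(10, None, 'X', 12), (None, 300, 'X', 3)]
```

Each departement ends up as a single group here, even though the column that is not designated for it changes between rows.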
@t-clausen.dk: Thank you!
@others: ...
Actually there is a wildcard equality test.
I am not sure why you would group by field1, that would seem impossible in your example. I tried to fit it into your question:
SELECT FIELD1,
CASE WHEN FIELD2 = 1 THEN FIELD3 END AS CASEFIELD1,
CASE WHEN FIELD2 = 2 THEN FIELD4 END AS CASEFIELD2
FROM
(
SELECT * FROM A
INTERSECT
SELECT * FROM B
) C
UNION -- results in a distinct
SELECT
FIELD1,
null,
null
FROM
(
SELECT * FROM A
EXCEPT
SELECT * FROM B
) C
This will fail for data types that are not comparable.
No, there's no wildcard equality test. You'd have to list every field you want tested individually. If you don't want to test each individual field, you could use a hack such as concatenating all the fields, e.g.
WHERE (a.foo + a.bar + a.baz) = (b.foo + b.bar + b.baz)
but either way, you're listing all of the fields.
I might tend to solve it something like this
WITH q as
(SELECT
DEPARTEMENT
, (CASE WHEN DEPARTEMENT = 1 THEN ZD
WHEN DEPARTEMENT = 2 THEN EK
ELSE null
END) AS GRP
, DISTRIBUTOR
, SOMETHING
FROM mytable
)
SELECT
DEPARTEMENT
, Grp
, Distributor
, sum(SOMETHING) AS SumTHING
FROM q
GROUP BY
DEPARTEMENT
, GRP
, DISTRIBUTOR
If you need to find all rows in TableA that match in TableB, how about INTERSECT or INTERSECT DISTINCT?
select * from A
INTERSECT DISTINCT
select * from B
However, if you only want rows from A where the entire row matches the values in a row from B, then why does your sample code take some values from A and others from B? If the row matches on all columns, then that would seem pointless. (Perhaps your question could be explained a bit more fully?)
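For reference, a small sketch of whole-row matching with INTERSECT and EXCEPT in SQLite via Python. SQLite accepts plain INTERSECT (which already removes duplicates) rather than INTERSECT DISTINCT, and the table names table_a and table_b and their rows are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE table_b (id INTEGER, val TEXT)")
conn.executemany("INSERT INTO table_a VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")])
conn.executemany("INSERT INTO table_b VALUES (?, ?)", [(2, "y"), (3, "q")])

# Rows that are identical in every column in both tables.
matched = conn.execute(
    "SELECT * FROM table_a INTERSECT SELECT * FROM table_b ORDER BY 1"
).fetchall()

# Rows of table_a with no identical row in table_b.
only_in_a = conn.execute(
    "SELECT * FROM table_a EXCEPT SELECT * FROM table_b ORDER BY 1"
).fetchall()

print(matched)    # [(2, 'y')]
print(only_in_a)  # [(1, 'x'), (3, 'z')]
```

No column has to be listed by name, but both tables need the same number of columns with comparable types.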

SQL (TSQL) - Select values in a column where another column is not null?

I will keep this simple: I would like to know if there is a good way to select all the values in a column that never have a NULL in another column. For example:
A B
----- -----
1 7
2 7
NULL 7
4 9
1 9
2 9
From the above set I would want just 9 from B and not 7, because 7 has a NULL in A. Obviously I could wrap this in a subquery and use an IN clause, etc., but this is already part of a pretty unique set and I am looking to keep it efficient.
I should note that for my purposes this would only be a one-way comparison... I would only be returning values in B and examining A.
I imagine there is an easy way to do this that I am missing, but being in the thick of things I don't see it right now.
You can do something like this:
select *
from t
where t.b not in (select b from t where a is null);
If you want only distinct b values, then you can do:
select b
from t
group by b
having sum(case when a is null then 1 else 0 end) = 0;
And, finally, you could use window functions:
select a, b
from (select t.*,
sum(case when a is null then 1 else 0 end) over (partition by b) as NullCnt
from t
) t
where NullCnt = 0;
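A quick sketch running the GROUP BY and window-function versions against the sample data from the question, using SQLite via Python (window functions assume SQLite 3.25 or later). The derived-table alias s is added for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
# The six rows from the question; None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 7), (2, 7), (None, 7), (4, 9), (1, 9), (2, 9)],
)

# Distinct b values whose group contains no NULL in a.
clean_b = [
    row[0]
    for row in conn.execute(
        """
        SELECT b
        FROM t
        GROUP BY b
        HAVING SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END) = 0
        """
    )
]

# Full rows for those b values, using a per-b NULL count.
clean_rows = conn.execute(
    """
    SELECT a, b
    FROM (SELECT t.*,
                 SUM(CASE WHEN a IS NULL THEN 1 ELSE 0 END)
                     OVER (PARTITION BY b) AS null_cnt
          FROM t) AS s
    WHERE null_cnt = 0
    ORDER BY a
    """
).fetchall()

print(clean_b)     # [9]
print(clean_rows)  # [(1, 9), (2, 9), (4, 9)]
```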
The query below outputs only one column in the final result. The records are grouped by column B, and the CASE tests whether A is NULL in each record. Every NULL adds 1 to the sum for its group. The HAVING clause keeps only the groups whose sum is 0.
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
If you want to get all the rows for those values of B, you can use a join.
SELECT a.*
FROM TableName a
INNER JOIN
(
SELECT B
FROM TableName
GROUP BY B
HAVING SUM(CASE WHEN A IS NULL THEN 1 ELSE 0 END) = 0
) b ON a.b = b.b

How to count specific values in a table

I have a column that has 15 distinct values. I'd like to count how many there are of a few of them.
I've come up with, e.g.:
select a,COUNT(IFNULL(b != 1,NULL)),COUNT(IFNULL(b != 2,NULL)) from
mytable group by a
select a,SUM(CASE WHEN a = 1 THEN 1 ELSE 0 END),SUM(CASE WHEN a = 2 THEN 1 ELSE 0 END) from
mytable group by a
What's the best way of doing this ? (note, I need to pivot those values to columns,
a simple select a,b,count(*) from mytable where b=1 or b=2 group by a,b; won't do.)
Of the two methods suggested in the question, I recommend the second:
select a,
SUM(CASE WHEN b = 1 THEN 1 ELSE 0 END) b1,
SUM(CASE WHEN b = 2 THEN 1 ELSE 0 END) b2
from mytable
group by a
- as it is both simpler and (I think) easier to understand, and therefore to maintain. I recommend including column aliases, as they make the output easier to understand.
First of all you misunderstood the IFNULL function (you probably wanted IF). See the documentation http://dev.mysql.com/doc/refman/5.0/en/control-flow-functions.html .
The second query in your question will give you what you want, but SUM(a=x) is sufficient: in MySQL, true is equal to 1 and false is equal to 0.
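A small sketch comparing the CASE form with the shorter boolean sum. It runs on SQLite via Python rather than MySQL; SQLite also evaluates a comparison to 1 or 0, so the two forms can be checked side by side. The rows are invented, and b is assumed to hold no NULLs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (a TEXT, b INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("x", 1), ("x", 1), ("x", 2), ("y", 2), ("y", 3)],
)

# Pivot the counts of b = 1 and b = 2 into columns, one row per a.
with_case = conn.execute(
    """
    SELECT a,
           SUM(CASE WHEN b = 1 THEN 1 ELSE 0 END) AS b1,
           SUM(CASE WHEN b = 2 THEN 1 ELSE 0 END) AS b2
    FROM mytable
    GROUP BY a
    ORDER BY a
    """
).fetchall()

# Same result, summing the comparison directly.
with_bool = conn.execute(
    """
    SELECT a, SUM(b = 1) AS b1, SUM(b = 2) AS b2
    FROM mytable
    GROUP BY a
    ORDER BY a
    """
).fetchall()

print(with_case)  # [('x', 2, 1), ('y', 0, 1)]
print(with_bool)  # [('x', 2, 1), ('y', 0, 1)]
```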
Have you tried a cross join?
select *
from (
select a, sum(...) as aSum
from mytable
where a...
group
by a
) as forA
cross join (
select b, sum(...) as bsum
from (
select *
from mytable
where b...
group
by b
)
) as forB;