Counting Records with Unique Field Value - sql

Source Table
Assuming I have a table called MyTable with the content:
+----------+------+
| Category | Code |
+----------+------+
| A | A123 |
| A | B123 |
| A | C123 |
| B | A123 |
| B | B123 |
| B | D123 |
| C | A123 |
| C | E123 |
| C | F123 |
+----------+------+
I'm trying to count the number of Code values which are unique to each category.
Desired Result
For the above example, the result would be:
+----------+-------------+
| Category | UniqueCodes |
+----------+-------------+
| A | 1 |
| B | 1 |
| C | 2 |
+----------+-------------+
Since C123 is unique to A, D123 is unique to B, and E123 & F123 are unique to C.
What I've Tried
I'm able to obtain the result for a single category (e.g. C) using a query such as:
SELECT COUNT(a.Code) AS UniqueCodes
FROM
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category = "C"
) a
LEFT JOIN
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category <> "C"
) b
ON a.Code = b.Code
WHERE b.Code IS NULL
However, whilst I can hard-code a query for each category, I cannot seem to construct a single query to calculate this for every possible Category value.
Here is what I've tried:
SELECT c.Category,
(
SELECT COUNT(a.Code)
FROM
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category = c.Category
) a
LEFT JOIN
(
SELECT MyTable.Code
FROM MyTable
WHERE MyTable.Category <> c.Category
) b
ON a.Code = b.Code
WHERE b.Code IS NULL
) AS UniqueCodes
FROM
(
SELECT MyTable.Category
FROM MyTable
GROUP BY MyTable.Category
) c
Though, the c.Category is not defined within the scope of the nested SELECT query.
Could anyone advise how I could obtain the desired result?

I would use NOT EXISTS & do aggregation :
select category, count(*)
from MyTable t
where not exists (select 1 from MyTable t1 where t1.code = t.code and t1.category <> t.category)
group by category;

You can use two levels of aggregation:
select minc as category, count(*)
from (select code, min(category) as minc, max(category) as maxc
from t
group by code
) as c
where minc = maxc
group by minc;

This would also work:
select category, count(*) from(
select a.category, b.count from mytable a join (
select code, count(category) as count
from mytable
group by code
having count(category) = 1
) b on b.code = a.code
) c group by category

Learning from #isaace's answer, I also came up with this -
SELECT MyTable.Category, COUNT(*)
FROM
MyTable INNER JOIN
(SELECT Code FROM MyTable GROUP BY Code HAVING COUNT(Category) = 1) a
ON MyTable.Code = a.Code
GROUP BY MyTable.Category

Related

Select from multiple table, eliminating duplicates values

I have these tables and values:
Person Account
------------------ -----------------------
ID | CREATED_BY ID | TYPE | DATA
------------------ -----------------------
1 | 1 | T1 | USEFUL DATA
2 | 2 | T2 |
3 | 3 | T3 |
4 | 4 | T2 |
Person_account_link
--------------------------
ID | PERSON_ID | ACCOUNT_ID
--------------------------
1 | 1 | 1
2 | 1 | 2
3 | 2 | 3
4 | 3 | 4
I want to select all persons with T1 account type and get the data column, for the others persons they should be in the result without any account information.
(I note that person 1 has two accounts : account_id_1 and account_id_2 but only one row must be displayed (priority for T1 type if exist otherwise null)
The result should be :
Table1
-----------------------------------------------------
PERSON_ID | ACCOUNT_ID | ACCOUNT_TYPE | ACCOUNT_DATA
-----------------------------------------------------
1 | 1 | T1 | USEFUL DATA
2 | NULL | NULL | NULL
3 | NULL | NULL | NULL
4 | NULL | NULL | NULL
You can do conditional aggregation :
SELECT p.id,
MAX(CASE WHEN a.type = 'T1' THEN a.id END) AS ACCOUNT_ID,
MAX(CASE WHEN a.type = 'T1' THEN 'T1' END) AS ACCOUNT_TYPE,
MAX(CASE WHEN a.type = 'T1' THEN a.data END) AS ACCOUNT_DATA
FROM person p LEFT JOIN
Person_account_link pl
ON p.id = pl.person_id LEFT JOIN
account a
ON pl.account_id = a.id
GROUP BY p.id;
You would need an outer join, starting with Person and then to the other two tables. I would also aggregate with group by and min to tackle the situation where a person would have two or more T1 accounts. In that case one of the data is taken (the min of them):
select p.id person_id,
min(a.id) account_id,
min(a.type) account_type,
min(a.data) account_data
from Person p
left join Person_account_link pa on p.id = pa.person_id
left join Account a on pa.account_id = a.id and a.type = 'T1'
group by p.id
In Postgres, I like to use the FILTER keyword. In addition, the Person table is not needed if you only want persons with an account. If you want all persons:
SELECT p.id,
MAX(a.id) FILTER (a.type = 'T1') as account_id,
MAX(a.type) FILTER (a.type = 'T1') as account_type,
MAX(a.data) FILTER (a.type = 'T1') as account_data
FROM Person p LEFT JOIN
Person_account_link pl
ON pl.person_id = p.id LEFT JOIN
account a
ON pl.account_id = a.id
GROUP BY p.id;

Oracle SQL Get unique symbols from table

I have table with descriptions of smth. For example:
My_Table
id description
================
1 ABC
2 ABB
3 OPAC
4 APEЧ
I need to get all unique symbols from all "description" columns.
Result should look like that:
symbol
================
A
B
C
O
P
E
Ч
And it shoud work for all languages, so, as I see, regular expressions cant help.
Please help me. Thanks.
with cte (c,description_suffix) as
(
select substr(description,1,1)
,substr(description,2)
from mytable
where description is not null
union all
select substr(description_suffix,1,1)
,substr(description_suffix,2)
from cte
where description_suffix is not null
)
select c
,count(*) as cnt
from cte
group by c
order by c
or
with cte(n) as
(
select level
from dual
connect by level <= (select max(length(description)) from mytable)
)
select substr(t.description,c.n,1) as c
,count(*) as cnt
from mytable t
join cte c
on c.n <= length(description)
group by substr(t.description,c.n,1)
order by c
+---+-----+
| C | CNT |
+---+-----+
| A | 4 |
| B | 3 |
| C | 2 |
| E | 1 |
| O | 1 |
| P | 2 |
| Ч | 1 |
+---+-----+
Create a numbers table and populate it with all the relevant ids you'd need (in this case 1..maxlength of string)
SELECT DISTINCT
locate(your_table.description, numbers.id) AS symbol
FROM
your_table
INNER JOIN
numbers
ON numbers.id >= 1
AND numbers.id <= CHAR_LENGTH(your_table.description)
SELECT DISTINCT(SUBSTR(ll,LEVEL,1)) OP --Here DISTINCT(SUBSTR(ll,LEVEL,1)) is used to get all distinct character/numeric in vertical as per requirment
FROM
(
SELECT LISTAGG(DES,'')
WITHIN GROUP (ORDER BY ID) ll
FROM My_Table --Here listagg is used to convert all values under description(des) column into a single value and there is no space in between
)
CONNECT BY LEVEL <= LENGTH(ll);

How to get Order numbers where collection number has a top and a corresponding bottom?

This is how the main order table looks :-
| Order_num | Collection_Num |
+--------------+----------------+
| 20143045585 | 123456 |
| 20143045585 | 789012 |
| 20143045585 | 456897 |
| 20143758257 | 546465 |
+--------------+----------------+
These are the collections:-
| tops | bottom |
+--------------+----------------+
| 353735 | 745758 |
| 123456 | 789012 |
| 456456 | 456897 |
| 323456 | 546465 |
+--------------+----------------+
Desired Output:-
| Order_num |
+--------------+
| 20143045585 |
Here Order number 20143045585 has both a top and a bottom from the same row in table number 2 (each row in 2nd table forms a particular combination called 'A Collection' i.e. 1 top and corresponding bottom ).
What I want to know -
All the order numbers which have a top and a corresponding bottom in 'Collection_num' column.
Can anyone help me with a SQL code for this ?
Let me know if any of this is unclear.
select Order_num
From table_1 as A
where exists
(select tops from table_2 as B where B.tops = A.Collection_num)
AND
(select bottom from table2 as B where B.bottom = A.Collection_num)
I am assuming you just have the first table of data and each order can only have the two relevant collections or less. Perhaps:
select T1.Order_Num
,T1.Collection_Num AS Tops
,T2.Collection_Num AS Bottom
from Table1 T1
inner join Table1 T2
on T1.Order_Num = T2.Order_Num
and T1.Collection_Num < T2.Collection_Num
order by T1.Order_Num
You can try using subquery
select distinct order_num from #yourorder where collection_num
in (select tops from #yourcollections)
and order_num in
( select order_num from #yourorder where collection_num in
(select bottom from #yourcollections) )
Pretty sure that something like this should work for you. I am just using the ctes here to create the test data so it can be queried.
with Orders (OrderNum, CollectionNum) as
(
select 20143045585, 123456 union all
select 20143045585, 789012 union all
select 20143045585, 456897 union all
select 20143758257, 546465
)
, Collections (CollectionID, tops, bottoms) as
(
select 1, 353735, 745758 union all
select 2, 123456, 789012 union all
select 3, 456456, 456897 union all
select 4, 323456, 546465
)
select o.OrderNum
, t.tops
, b.bottoms
from Orders o
join Collections t on t.tops = o.CollectionNum
join
(
select o.OrderNum
, b.bottoms
, b.CollectionID
from Orders o
join Collections b on b.bottoms = o.CollectionNum
) b on b.CollectionID = t.CollectionID
Here is the query that I used:
Select *
From (select A.Order_num, B.Coll_ID, B.Bottoms from Orders_table as A
Join Collections_Table as B
on A.Collection_num = B.Bottoms
) as C
join
(select K.Order_num, M.Coll_ID, M.Tops from Orders_table as K
Join Collections_Table as M
on A.Collection_num = B.Tops
) as D
on C.Orders_B = D.Orders_Num AND C.Coll_ID = D.Coll_ID
)

SQL Server - Select first row that meets criteria

I have 2 tables that contain IDs. There will be duplicate IDs in one of the tables and I only want to return one row for each matching ID in table B. For example:
Table A
+-----------+-----------+
| objectIdA | objectIdB |
+-----------+-----------+
| 1 | A |
| 1 | B |
| 1 | D |
| 5 | F |
+-----------+-----------+
Table B
+-----------+
| objectIdA |
+-----------+
| 1 |
| 5 |
+-----------+
Would return:
+-----------+-----------+
| objectIdA | objectIdB |
+-----------+-----------+
| 1 | D |
| 5 | F |
+-----------+-----------+
I only need one entry from Table A that matches Table B. It doesn't matter which row of table A is returned.
I'm using SQL Server.
Thanks.
;WITH CTE
AS (
SELECT B.objectIdA
,A.objectIdB
,ROW_NUMBER() OVER (PARTITION BY B.objectIdA ORDER BY A.objectIdB DESC) rn
FROM TableA A
INNER JOIN TableB B ON A.objectIdA = B.objectIdA
)
SELECT C.objectIdA
,C.objectIdB
FROM CTE
WHERE rn = 1
You can do so,by using a subselect for table a to get one entry per objectIdA group
select b.*,a.[objectIdB]
from b
join
(select [objectIdA], max([objectIdB]) [objectIdB]
from a group by [objectIdA]
) a
on(b.[objectIdA] = a.[objectIdA])
Fiddle Demo
Edit deom comments to get a whole row from tablea you can use a self join for tablea
select b.*,a.*
from b
join a
on(b.[objectIdA] = a.[objectIdA])
join (select [objectIdA], max([objectIdB]) [objectIdB]
from a group by [objectIdA]) a1
on(a.[objectIdA] = a1.[objectIdA]
and
a.[objectIdB] = a1.[objectIdB])
Fiddle Demo 2
SELECT MAX(b.ID) AS ID
,MAX(Value) AS Value
,MAX(OtherCol1) AS OtherCol1
,MAX(OtherCol2) AS OtherCol2
,MAX(OtherCol3) AS OtherCol3
FROM TblA AS a
INNER JOIN TblB AS b ON a.TblBID = b.ID
GROUP BY TblBID
Table A
Table B
Table A Data
Table B Data
Query Result
You should use PARTITION OVER to achieve the results.
SELECT
t.objectIdA,
t.objectIdB
FROM (
SELECT
a.objectIdA,
a.objectIdB,
rowid = ROW_NUMBER() OVER (PARTITION BY a.objectIdA ORDER BY a.objectIdB DESC)
FROM TableA a
INNER JOIN TableB b ON (a.objectIdA = b.objectIdA)
) t
WHERE rowid <= 1
Fiddle Code: http://sqlfiddle.com/#!3/a2ccd/1

SQL query uses "wrong" join

I have an query which gives me the wrong result.
Tables:
A
+----+
| id |
+----+
| 1 |
| 2 |
+----+
B
+----+----+
| id | x | B.id = A.id
+----+----+
| 1 | 1 |
| 1 | 1 |
| 1 | 0 |
+----+----+
C
+----+----+
| id | y | C.id = A.id
+----+----+
| 1 | 1 |
| 1 | 2 |
+----+----+
What I want to do: Select all rows from A. For each row in A count in B all x with value 1 and all x with value 0 with B.id = A.id. For each row in A get the minimum y from C with C.id = A.id.
The result I am expecting is:
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 2 | 1 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
First Try:
This doesn't work.
SELECT a.id,
MIN(c.y),
SUM(IF(b.x = 1, 1, 0)),
SUM(IF(b.x = 0, 1, 0))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 4 | 2 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
Second Try:
This works but I am sure it has a bad performance.
SELECT a.id,
MIN(c.y),
b.x,
b.y
FROM a
LEFT JOIN (SELECT b.id, SUM(IF(b.x = 1, 1, 0)) x, SUM(IF(b.x = 0, 1, 0)) y FROM b) b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id
+----+------+--------+---------+
| id | min | count1 | count 2 |
+----+------+--------+---------+
| 1 | 1 | 2 | 1 |
| 2 | NULL | 0 | 0 |
+----+------+--------+---------+
Last Try:
This works too.
SELECT x.*,
SUM(IF(b.x = 1, 1, 0)),
SUM(IF(b.x = 0, 1, 0))
FROM (SELECT a.id,
MIN(c.y)
FROM a
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id) x
LEFT JOIN b
ON ( b.id = x.id )
GROUP BY x.id
Now my question is: Is the last one the best choise or is there a way to write this query with just one select statement (like in the first try)?
Your joins are doing cartesian products for a given value, because there are multiple rows in each table.
You can fix this by using count(distinct) rather than sum():
SELECT a.id, MIN(c.y),
count(distinct (case when b.x = 1 then b.id end)),
count(distinct (case when b.x = 0 then b.id end))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id;
You can also fix this by pre-aggregating b (and/or c). And you would need to take that approach if your aggregation function were something like the sum of a column in b.
EDIT:
You are correct. The above query counts the distinct values of B, but B contains rows that are exact duplicates. (Personally, I think having a column with the name id that has duplicates is a sign of poor design, but that is another issue.)
You could solve it by having a real id in the b table, because then the count(distinct) would count the correct values. You can also solve it by aggregating the two tables before joining them in:
SELECT a.id, c.y, x1, x0
FROM a
LEFT JOIN (select b.id,
sum(b.x = 1) as x1,
sum(b.x = 0) as x0
from b
group by b.id
) b
ON ( a.id = b.id )
LEFT JOIN (select c.id, min(c.y) as y
from c
group by c.id
) c
ON ( a.id = c.id );
Here is a SQL Fiddle for the problem.
EDIT II:
You can get it in one statement, but I'm not so sure that it would work on similar data. The idea is that you can count all the cases where x = 1 and then divide by the number of rows in the C table to get the real distinct count:
SELECT a.id, MIN(c.y),
coalesce(sum(b.x = 1), 0) / count(distinct coalesce(c.y, -1)),
coalesce(sum(b.x = 0), 0) / count(distinct coalesce(c.y, -1))
FROM a
LEFT JOIN b
ON ( a.id = b.id )
LEFT JOIN c
ON ( a.id = c.id )
GROUP BY a.id;
It is a little tricky, because you have to handle NULLs to get the right values. Note that this is counting the y value to get a distinct count from the C table. Your question re-enforces why it is a good idea to have a unique integer primary key in every table.