Select value in a SQL query based on count() being non-zero - sql

I am trying to write a SQL query that "calculates" the status column of a row based on a JOIN with a statuses table.
My record is basically: id | name | statusId, where statusId is a foreign key to the statuses table. That table has: id | statusName
I collect count() for each DISTINCT statusId. Now I need to return the id of one status based on the following idea: if count(status0) > 0, I need to return status0; otherwise I need to check status1, then status2, etc.
Can I write a SQL query that returns the status for each row using JOIN, WHERE, HAVING, etc., without if/else logic?

If you need the statuses that are being used, then use exists:
select s.*
from statuses s
where exists (select 1 from records r where r.statusid = s.id);
Actually, if you just want the ids, you can use:
select distinct r.statusid
from records r;
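A minimal sketch of the EXISTS approach, run against an in-memory SQLite database from Python so it can be tried without a server. The table layout follows the question; the status names and sample rows are made up for illustration. If the priority order is simply the order of the status ids (an assumption), the first row of the result is the "status0, else status1, else status2" fallback the question asks about.

```python
import sqlite3

def statuses_in_use():
    """Return (id, statusName) for every status referenced by at least one record."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data: status 0 is unused, statuses 1 and 2 are used.
    conn.executescript("""
        CREATE TABLE statuses (id INTEGER PRIMARY KEY, statusName TEXT);
        CREATE TABLE records  (id INTEGER PRIMARY KEY, name TEXT, statusId INTEGER);
        INSERT INTO statuses VALUES (0, 'new'), (1, 'open'), (2, 'closed');
        INSERT INTO records  VALUES (1, 'a', 1), (2, 'b', 1), (3, 'c', 2);
    """)
    rows = conn.execute("""
        SELECT s.id, s.statusName
        FROM statuses s
        WHERE EXISTS (SELECT 1 FROM records r WHERE r.statusId = s.id)
        ORDER BY s.id
    """).fetchall()
    conn.close()
    return rows

used = statuses_in_use()
# Assuming priority follows the id order, the first used status is the answer.
first_used = used[0] if used else None
```

Here status 0 has no records, so the fallback lands on status 1.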

To rephrase the question: do you just want the statuses that are present in the bigger table? How about this:
SELECT s.*
FROM statuses s
WHERE s.id IN (SELECT DISTINCT statusid FROM records);

Start with this; it should get you on the right track:
select a.*, b.*, c.cnt from table1 a
join table2 b on a.statusid=b.id
join (
select statusname, count(*) cnt
from table1 a join table2 b on a.statusid=b.id
group by statusname)c
on b.statusname=c.statusname

Related

Best way to group records with MAX revision

I have a source table like this:
table_a :
id revision status
1 0 APPROVED
1 1 PENDING
I am trying to get distinct records from table_a having the latest revision and show the latest approved revision for each one of them.
result table_b :
id latest_rev latest_approved_rev
1 1 0
I have written the following query :
SELECT a.id,
a.revision AS latest_rev,
b.latest_approved_rev
FROM table_a a
LEFT JOIN (SELECT id,
MAX(revision) AS latest_approved_rev
FROM table_a
WHERE status = 'APPROVED'
GROUP BY id) b ON b.id = a.id
WHERE a.revision = (SELECT MAX(revision)
FROM table_a
WHERE id = a.id)
My query seems to work fine, but I am wondering if I was missing something and/or if there is another way to make the query better/faster.
Seems you could achieve this with some (conditional) aggregation:
SELECT id,
MAX(revision) AS latest_rev,
MAX(CASE status WHEN 'APPROVED' THEN revision END) AS latest_approved_rev
FROM (VALUES(1,0,'APPROVED'),
(1,1,'PENDING'))V(id,revision,status)
GROUP BY id;
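As a quick check of the conditional aggregation, here is a small sketch that loads rows into an in-memory SQLite table and runs the same GROUP BY. The function name `latest_revisions` is hypothetical; the table and column names come from the question. The CASE expression yields NULL for non-approved rows, and MAX ignores NULLs, which is what makes this work.

```python
import sqlite3

def latest_revisions(rows):
    """rows: iterable of (id, revision, status). Returns (id, latest_rev, latest_approved_rev)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER, revision INTEGER, status TEXT)")
    conn.executemany("INSERT INTO table_a VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT id,
               MAX(revision) AS latest_rev,
               -- non-approved rows contribute NULL, which MAX skips
               MAX(CASE status WHEN 'APPROVED' THEN revision END) AS latest_approved_rev
        FROM table_a
        GROUP BY id
        ORDER BY id
    """).fetchall()
    conn.close()
    return result

# The sample data from the question.
print(latest_revisions([(1, 0, 'APPROVED'), (1, 1, 'PENDING')]))
```

With the question's two rows this gives latest_rev 1 and latest_approved_rev 0 for id 1, matching the expected table_b.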
You have a correct query (if I understand your requirement). You are very close to an ideal query. But your outer WHERE clause contains a correlated (dependent) subquery, and those don't always perform well.
You can think of this as the JOIN of two subqueries. The one you have.
SELECT id,
MAX(revision) AS revision
FROM table_a
WHERE status = 'APPROVED'
GROUP BY id
and this one.
SELECT id,
MAX(revision) AS revision
FROM table_a
GROUP BY id
You FULL JOIN them together. Like this.
SELECT latest.id,
latest.revision as latest_rev,
approved.revision as approved_rev
FROM (
SELECT id,
MAX(revision) AS revision
FROM table_a
GROUP BY id
) latest
FULL JOIN (
SELECT id,
MAX(revision) AS revision
FROM table_a
WHERE status = 'APPROVED'
GROUP BY id
) approved on latest.id = approved.id
In this case you can actually use LEFT JOIN because you know every id in the approved subquery is also present in the latest subquery.
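To illustrate the LEFT JOIN variant, the sketch below joins the "latest" subquery to the "approved" subquery in an in-memory SQLite database. The function name and the extra row for id 2 (which has no approved revision) are assumptions added for illustration; they show that such an id still appears, with a NULL approved revision.

```python
import sqlite3

def latest_and_approved(rows):
    """rows: iterable of (id, revision, status). Returns (id, latest_rev, approved_rev)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (id INTEGER, revision INTEGER, status TEXT)")
    conn.executemany("INSERT INTO table_a VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT latest.id,
               latest.revision   AS latest_rev,
               approved.revision AS approved_rev
        FROM (SELECT id, MAX(revision) AS revision
              FROM table_a
              GROUP BY id) AS latest
        LEFT JOIN (SELECT id, MAX(revision) AS revision
                   FROM table_a
                   WHERE status = 'APPROVED'
                   GROUP BY id) AS approved
               ON latest.id = approved.id
        ORDER BY latest.id
    """).fetchall()
    conn.close()
    return result

sample = [(1, 0, 'APPROVED'), (1, 1, 'PENDING'), (2, 0, 'PENDING')]
print(latest_and_approved(sample))
```

Both subqueries are evaluated once each rather than once per outer row, which is the point of replacing the correlated subquery.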

Compare the results of a ROW COUNT

I have 2 databases on the same server and I need to compare the records in each one, since one of the databases is not importing all the information.
I was trying to do a row count but it's not working.
Currently I am pulling batches of approximately 100,000 rows and looking at them in Excel.
Let's say I want a query that does a count for each ID in TABLE A and then compares the result with the TABLE B count for each ID. Since they are the same IDs, the counts should be the same, and I want the query to return the IDs for which there was any mismatch between the counts.
--this table variable will contain the count of occurrences of each ID in TableA
--(table variables are declared with @; the dbo schema below is an assumption)
declare @TableA_Results table(
ID bigint,
Total bigint
);
insert into @TableA_Results
select ID,count(*) from database1.dbo.TableA
group by ID;
--this table variable will contain the count of occurrences of each ID in TableB
declare @TableB_Results table(
ID bigint,
Total bigint
);
insert into @TableB_Results
select ID,count(*) from database2.dbo.TableB
group by ID;
--this table variable will contain the IDs that don't have the same count in both tables
declare @Discordances table(
ID bigint,
TotalA bigint,
TotalB bigint
);
insert into @Discordances
select TA.ID,TA.Total,TB.Total
from @TableA_Results TA
inner join @TableB_Results TB on TA.ID=TB.ID and TA.Total!=TB.Total;
--the final output
select * from @Discordances;
The question is vague, but maybe this SQL Code might help nudge you in the right direction.
It grabs the IDs and Counts of each ID from database one, the IDs and counts of IDs from database two, and compares them, listing out all the rows where the counts are DIFFERENT.
WITH DB1Counts AS (
SELECT ID, COUNT(ID) AS CountOfIDs
FROM DatabaseOne.dbo.TableOne
GROUP BY ID
), DB2Counts AS (
SELECT ID, COUNT(ID) AS CountOfIDs
FROM DatabaseTwo.dbo.TableTwo
GROUP BY ID
)
SELECT a.ID, a.CountOfIDs AS DBOneCount, b.CountOfIDs AS DBTwoCount
FROM DB1Counts a
INNER JOIN DB2Counts b ON a.ID = b.ID
WHERE a.CountOfIDs <> b.CountOfIDs
This SQL selects from the specific tables using the "Database.Schema.Table" notation, so replace "DatabaseOne" and "DatabaseTwo" with the names of your two databases, and replace TableOne and TableTwo with the names of your tables (I'm assuming they're the same). It sets up two selects, one for each database, each grouping by ID to get the count of each ID. It then joins these two selects on ID and returns all rows where the counts are different. Because it is an inner join, IDs that exist in only one of the tables are not reported.
You could full outer join two aggregate queries and pull out ids that are either missing in one table, or for which the record count is different:
select coalesce(ta.id, tb.id), ta.cnt, tb.cnt
from
(select id, count(*) cnt from tableA group by id) ta
full outer join (select id, count(*) cnt from tableB group by id) tb
on ta.id = tb.id
where
coalesce(ta.cnt, -1) <> coalesce(tb.cnt, -1)
You seem to want aggregation and a full join:
select coalesce(a.id, b.id) as id, a.cnt, b.cnt
from (select id, count(*) as cnt
from a
group by id
) a full join
(select id, count(*) as cnt
from b
group by id
) b
on a.id = b.id
where coalesce(a.cnt, 0) <> coalesce(b.cnt, 0);
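The per-ID count comparison can be tried out with the sketch below. Note that it swaps in a different technique: older SQLite builds have no FULL JOIN, so the two aggregates are stacked with UNION ALL and summed per id, which gives the same result (ids missing from one side show a count of 0). The function name, the table names a and b, and the sample ids are assumptions for illustration.

```python
import sqlite3

def count_mismatches(ids_a, ids_b):
    """Return (id, total_a, total_b) for every id whose row count differs between a and b."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("CREATE TABLE a (id INTEGER); CREATE TABLE b (id INTEGER);")
    conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in ids_a])
    conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in ids_b])
    rows = conn.execute("""
        SELECT id, SUM(cnt_a) AS total_a, SUM(cnt_b) AS total_b
        FROM (
            -- one row per id with its count in a (and 0 for b) ...
            SELECT id, COUNT(*) AS cnt_a, 0 AS cnt_b FROM a GROUP BY id
            UNION ALL
            -- ... and one row per id with its count in b (and 0 for a)
            SELECT id, 0 AS cnt_a, COUNT(*) AS cnt_b FROM b GROUP BY id
        ) AS counts
        GROUP BY id
        HAVING SUM(cnt_a) <> SUM(cnt_b)
        ORDER BY id
    """).fetchall()
    conn.close()
    return rows

# id 1 matches (2 vs 2); id 2 differs; id 3 is missing from b; id 4 is missing from a.
print(count_mismatches([1, 1, 2, 3], [1, 1, 2, 2, 4]))
```

On a server that supports FULL JOIN, the query from the answer above is the more direct form.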

Join table on Count

I have two tables in Access, one containing IDs (not unique) and some Name and one containing IDs (not unique) and Location. I would like to return a third table that contains only the IDs of the elements that appear more than 1 time in either Names or Location.
Table 1
ID Name
1 Max
1 Bob
2 Jack
Table 2
ID Location
1 A
2 B
Basically in this setup it should return only ID 1 because 1 appears twice in Table 1 :
ID
1
I have tried to do a JOIN on the tables and then apply a COUNT but nothing came out.
Thanks in advance!
Here is one method that I think will work in MS Access:
(select id
from table1
group by id
having count(*) > 1
) union -- note: NOT union all
(select id
from table2
group by id
having count(*) > 1
);
MS Access does not allow union/union all in the from clause. Nor does it support full outer join. Note that the union will remove duplicates.
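A small sketch of the GROUP BY ... HAVING plus UNION idea, using the sample data from the question in an in-memory SQLite database (not MS Access, so treat it as an illustration of the logic rather than of Access syntax). The function name is hypothetical.

```python
import sqlite3

def duplicated_ids():
    """Return ids that occur more than once in table1 or in table2."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INTEGER, name TEXT);
        CREATE TABLE table2 (id INTEGER, location TEXT);
        INSERT INTO table1 VALUES (1, 'Max'), (1, 'Bob'), (2, 'Jack');
        INSERT INTO table2 VALUES (1, 'A'), (2, 'B');
    """)
    rows = conn.execute("""
        SELECT id FROM table1 GROUP BY id HAVING COUNT(*) > 1
        UNION            -- UNION (not UNION ALL) removes ids found in both tables
        SELECT id FROM table2 GROUP BY id HAVING COUNT(*) > 1
        ORDER BY id
    """).fetchall()
    conn.close()
    return [row[0] for row in rows]

print(duplicated_ids())
```

Only id 1 is returned, because it appears twice in table1; id 2 appears once in each table and is not a duplicate within either.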
Simple Group By and Having clause should help you
select ID
From Table1
Group by ID
having count(1)>1
union
select ID
From Table2
Group by ID
having count(1)>1
Based on your description, you do not need to join tables to find duplicate records, if your table is what you gave above, simply use:
With A
as
(
select ID,count(*) as Times From table group by ID
)
select * From A where A.Times>1
Not sure I understand what query you already tried, but this should work:
select table1.ID
from table1 inner join table2 on table1.id = table2.id
group by table1.ID
having count(*) > 1
Or if you have ID's in one table but not the other
select table1.ID
from table1 full outer join table2 on table1.id = table2.id
group by table1.ID
having count(*) > 1

How do I just get the first matching row?

I have a fairly complex SQL query, part of which requires looking up a company_ID value found in the first table to obtain the company_Name in the second table. The second table may have variants of the company name, but that is OK: I just need the first match.
So, tableA looks something like this (approx 2 dozen columns and many rows)
company_ID (CHAR(12))
161012348876
561254435253
103929478273
141567643542
tableB looks something like this
company_ID (Integer) Company_name
161012348876 Watson & Jones Ltd
161012348876 Watson and Jones
561254435253 Fictional Co. plc
103929478273 Made Up Corp.
161012348876 Watson Jones Ltd
141567643542 Thingymajig Gmbh.
This query will return multiple rows for 161012348876. What're good ways just to get one row returned for each matching company_id (i.e. 4 rows instead of 6)?
SELECT *, t2.company_name
FROM tableA t1
JOIN tableB t2 ON t1.company_id = cast(t2.company_id as CHAR(12))
I am using Teradata SQL.
Any help much appreciated.
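SELECT t1.company_id, MIN(t2.company_name) AS company_name
FROM tableA t1
JOIN tableB t2 ON t1.company_id = cast(t2.company_id as CHAR(12))
GROUP BY t1.company_id
This will return 1 row for each unique t1.company_id. Every selected column that is not in the GROUP BY has to be wrapped in an aggregate, so MIN() is used here to pick one of the name variants.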
The following query will get one name for each company id. The GROUP BY t2.company_id and MAX(t2.company_name) produce a single name per id, which is then joined with tableA.
SELECT t1.company_id, t3.aName AS company_name
FROM tableA t1
JOIN (SELECT t2.company_id, MAX(t2.company_name) AS aName
FROM tableB t2 GROUP BY t2.company_id) AS t3
ON t1.company_id = cast(t3.company_id as CHAR(12))
Instead of user2989408's MAX subquery you can also use QUALIFY:
SELECT company_id , company_Name
FROM tableB
QUALIFY ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY company_name) = 1
--alternative to the QUALIFY line above, if you don't care about MIN/MAX or want a more random result:
QUALIFY COUNT(*) OVER (PARTITION BY company_id ROWS UNBOUNDED PRECEDING) = 1
But assuming that company_id is the PI (primary index) of tableB, the MAX will probably perform better.
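The ROW_NUMBER idea can be tried outside Teradata with the sketch below. SQLite has no QUALIFY clause, so the window function is computed in a subquery and filtered with WHERE rn = 1 instead; this assumes an SQLite build with window functions (3.25 or later). The function name is hypothetical and the data is the tableB sample from the question.

```python
import sqlite3

def first_company_names():
    """Return one (company_id, company_name) row per company_id from tableB."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tableB (company_id INTEGER, company_name TEXT);
        INSERT INTO tableB VALUES
            (161012348876, 'Watson & Jones Ltd'),
            (161012348876, 'Watson and Jones'),
            (561254435253, 'Fictional Co. plc'),
            (103929478273, 'Made Up Corp.'),
            (161012348876, 'Watson Jones Ltd'),
            (141567643542, 'Thingymajig Gmbh.');
    """)
    rows = conn.execute("""
        SELECT company_id, company_name
        FROM (
            SELECT company_id, company_name,
                   -- number the name variants within each company_id
                   ROW_NUMBER() OVER (PARTITION BY company_id
                                      ORDER BY company_name) AS rn
            FROM tableB
        ) AS ranked
        WHERE rn = 1          -- plays the role of QUALIFY ... = 1
        ORDER BY company_id
    """).fetchall()
    conn.close()
    return rows

for row in first_company_names():
    print(row)
```

This yields 4 rows instead of 6; which name variant wins depends on the ORDER BY inside the window and on the collation in use.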

How to compare tables and find duplicates and also find columns with different value

I have the following tables in Oracle 10g:
Table1
Name Status
a closed
b live
c live
Table2
Name Status
a final
b live
c live
Neither table has a primary key, and I am trying to write a query which will return identical rows without looping over both tables and comparing rows/columns. If the status column is different, then the row in Table2 takes precedence.
So in the above example my query should return this:
Name Status
a final
b live
c live
Since you have mentioned that there is no primary key on either table, I'm assuming that a row may exist in Table1, in Table2, or in both. The query below uses common table expressions and a window function to get that result.
WITH unionTable
AS
(
SELECT Name, Status, 1 AS ordr FROM Table1
UNION
SELECT Name, Status, 2 AS ordr FROM Table2
),
ranks
AS
(
SELECT Name, Status,
ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY ordr DESC) rn
FROM unionTable
)
SELECT Name, Status
FROM ranks
WHERE rn = 1
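Here is a sketch of the same UNION plus ROW_NUMBER query run against the question's sample data in an in-memory SQLite database (assuming a build with window functions, 3.25 or later, rather than Oracle). The row ('d', 'draft') in Table1 is a made-up addition to show that a name present only in Table1 still comes through.

```python
import sqlite3

def merged_status():
    """Return one (Name, Status) per name, preferring the Table2 row when both exist."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Table1 (Name TEXT, Status TEXT);
        CREATE TABLE Table2 (Name TEXT, Status TEXT);
        -- ('d', 'draft') is a hypothetical extra row that exists only in Table1
        INSERT INTO Table1 VALUES ('a', 'closed'), ('b', 'live'), ('c', 'live'), ('d', 'draft');
        INSERT INTO Table2 VALUES ('a', 'final'),  ('b', 'live'), ('c', 'live');
    """)
    rows = conn.execute("""
        WITH unionTable AS (
            SELECT Name, Status, 1 AS ordr FROM Table1
            UNION
            SELECT Name, Status, 2 AS ordr FROM Table2
        ),
        ranks AS (
            SELECT Name, Status,
                   -- ordr DESC ranks the Table2 row first for each name
                   ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ordr DESC) AS rn
            FROM unionTable
        )
        SELECT Name, Status
        FROM ranks
        WHERE rn = 1
        ORDER BY Name
    """).fetchall()
    conn.close()
    return rows

print(merged_status())
```

For name 'a' the Table2 status 'final' wins over 'closed', as the question requires.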
Something like this?
SELECT table1.Name, table2.Status
FROM table1
INNER JOIN table2 ON table1.Name = table2.Name
By always returning table2.Status you've covered both the case when they're the same and when they're different (essentially it doesn't matter what the value of table1.Status is).