two SQL COUNT() queries? - sql

I want to count both the total # of records in a table, and the total # of records that match certain conditions. I can do these with two separate queries:
SELECT COUNT(*) AS TotalCount FROM MyTable;
SELECT COUNT(*) AS QualifiedCount FROM MyTable
{possible JOIN(s) as well e.g. JOIN MyOtherTable mot ON MyTable.id=mot.id}
WHERE {conditions};
Is there a way to combine these into one query so that I get two fields in one row?
SELECT {something} AS TotalCount,
{something else} AS QualifiedCount
FROM MyTable {possible JOIN(s)} WHERE {some conditions}
If not, I can issue two queries and wrap them in a transaction so they are consistent, but I was hoping to do it with one.
edit: I'm most concerned about atomicity; if there are two sub-SELECT statements needed that's OK as long as if there's an INSERT coming from somewhere it doesn't make the two responses inconsistent.
edit 2: The CASE answers are helpful but in my specific instance, the conditions may include a JOIN with another table (forgot to mention that in my original post, sorry) so I'm guessing that approach won't work.

One way is to join the table against itself:
select
count(*) as TotalCount,
count(s.id) as QualifiedCount
from
MyTable a
left join
MyTable s on s.id = a.id and {some conditions}
Another way is to use subqueries:
select
(select count(*) from MyTable) as TotalCount,
(select count(*) from MyTable where {some conditions}) as QualifiedCount
Or you can put the conditions in a case:
select
count(*) as TotalCount,
sum(case when {some conditions} then 1 else 0 end) as QualifiedCount
from
MyTable
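As a rough check that the three forms agree, here is a minimal sketch using Python's built-in sqlite3 module. The `status` column and the `status = 'open'` test are hypothetical stand-ins for {some conditions}; they are not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO MyTable (id, status) VALUES (?, ?)",
    [(1, "open"), (2, "closed"), (3, "open"), (4, "closed"), (5, "open")],
)

# Self-join: COUNT(s.id) skips the NULLs produced for non-matching rows.
self_join = conn.execute(
    """
    SELECT COUNT(*) AS TotalCount, COUNT(s.id) AS QualifiedCount
    FROM MyTable a
    LEFT JOIN MyTable s ON s.id = a.id AND s.status = 'open'
    """
).fetchone()

# Two scalar subqueries evaluated inside one statement.
subqueries = conn.execute(
    """
    SELECT (SELECT COUNT(*) FROM MyTable) AS TotalCount,
           (SELECT COUNT(*) FROM MyTable WHERE status = 'open') AS QualifiedCount
    """
).fetchone()

# Conditional aggregation with CASE.
case_sum = conn.execute(
    """
    SELECT COUNT(*) AS TotalCount,
           SUM(CASE WHEN status = 'open' THEN 1 ELSE 0 END) AS QualifiedCount
    FROM MyTable
    """
).fetchone()

print(self_join, subqueries, case_sum)  # each is (5, 3)
```

On this sample data all three return the same single row, so the choice mostly comes down to readability and how the conditions are expressed.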
Related:
SQL Combining several SELECT results

In SQL Server or MySQL, you can do that with a CASE expression:
select
count(*) as TotalCount,
sum(case when {conditions} then 1 else 0 end) as QualifiedCount
from MyTable
Edit: This also works if you use a JOIN in the condition:
select
count(*) as TotalCount,
sum(case when {conditions} then 1 else 0 end) as QualifiedCount
from MyTable t
left join MyChair c on c.TableId = t.Id
group by t.id, t.[othercolumns]
The GROUP BY is there to ensure you only find one row from the main table.
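One thing to watch with a one-to-many JOIN is that each main-table row can appear several times, and a GROUP BY on the main table's key returns one row per group rather than a single row of totals. A possible alternative, sketched below with sqlite3, is to count distinct keys instead. This is not the answer's query; the `Color` column and the `Color = 'red'` condition are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MyTable (Id INTEGER PRIMARY KEY);
    CREATE TABLE MyChair (Id INTEGER PRIMARY KEY, TableId INTEGER, Color TEXT);
    INSERT INTO MyTable (Id) VALUES (1), (2), (3);
    -- Table 1 has two red chairs, table 2 has one blue chair, table 3 has none.
    INSERT INTO MyChair (Id, TableId, Color) VALUES
        (1, 1, 'red'), (2, 1, 'red'), (3, 2, 'blue');
    """
)

# Counting joined rows: table 1 is counted twice because it has two chairs.
naive = conn.execute(
    """
    SELECT COUNT(*) AS TotalCount,
           SUM(CASE WHEN c.Color = 'red' THEN 1 ELSE 0 END) AS QualifiedCount
    FROM MyTable t
    LEFT JOIN MyChair c ON c.TableId = t.Id
    """
).fetchone()

# Counting distinct main-table keys: each table is counted at most once.
distinct = conn.execute(
    """
    SELECT COUNT(DISTINCT t.Id) AS TotalCount,
           COUNT(DISTINCT CASE WHEN c.Color = 'red' THEN t.Id END) AS QualifiedCount
    FROM MyTable t
    LEFT JOIN MyChair c ON c.TableId = t.Id
    """
).fetchone()

print(naive)     # (4, 2): joined rows, not tables
print(distinct)  # (3, 1): three tables, one of which has a red chair
```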

If you are just counting rows, you could use nested queries:
select
(SELECT COUNT(*) AS TotalCount FROM MyTable) as a,
(SELECT COUNT(*) AS QualifiedCount FROM MyTable WHERE {conditions}) as b

In Oracle SQL Developer I had to add a * FROM to my select, or else I was getting a syntax error:
select * FROM
(select COUNT(*) as foo FROM TABLE1),
(select COUNT(*) as boo FROM TABLE2);

MySQL doesn't count NULLs, so this should work too:
SELECT count(*) AS TotalCount,
count( if( field = value, field, null)) AS QualifiedCount
FROM MyTable {possible JOIN(s)} WHERE {some conditions}
That works well if the QualifiedCount field comes from a LEFT JOIN and you only care whether it exists. To get the number of users, and the number of users that have filled in their address:
SELECT count( Users.id) as NumUsers, count( Address.id) as NumAddresses
FROM Users
LEFT JOIN Address on Users.address_id = Address.id;
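The NULL-skipping behaviour of COUNT(column) is standard SQL rather than MySQL-specific, so it can be tried out in SQLite as well. The sketch below uses made-up rows, and swaps MySQL's IF() for a CASE expression with no ELSE branch, which yields NULL for non-matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Address (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT, address_id INTEGER);
    INSERT INTO Address (id, city) VALUES (10, 'Oslo'), (11, 'Lima');
    INSERT INTO Users (id, name, address_id) VALUES
        (1, 'ann', 10), (2, 'bob', NULL), (3, 'cy', 11);
    """
)

# Address.id is NULL for users without an address, so COUNT skips those rows.
join_counts = conn.execute(
    """
    SELECT COUNT(Users.id) AS NumUsers, COUNT(Address.id) AS NumAddresses
    FROM Users
    LEFT JOIN Address ON Users.address_id = Address.id
    """
).fetchone()

# CASE without ELSE returns NULL when the condition is false.
conditional_counts = conn.execute(
    """
    SELECT COUNT(*) AS TotalCount,
           COUNT(CASE WHEN name = 'ann' THEN name END) AS QualifiedCount
    FROM Users
    """
).fetchone()

print(join_counts)         # (3, 2)
print(conditional_counts)  # (3, 1)
```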

Related

Two Table Comparison in HIVE

I have two different sets of tables. I want to compare the total counts of both and display whether they match: 'Pass' if they match, otherwise 'Fail'.
SELECT (SELECT COUNT (*)
FROM Table1 t1
INNER JOIN Table2 t2
ON TRIM (t1.mgac_ac_id) = TRIM (t2.account))
AS cnt1,
(SELECT COUNT (*) FROM t3) AS cnt2 where cnt1=cnt2;
The code shown above is incorrect. Could anyone help with it? Do I need to create any variables in Hive?
This is simple to do, like below:
select
case when tmp1.value = tmp2.value then 'Pass' else 'Fail' end as result
from
(select count(1) as value from table1) tmp1
join
(select count(1) as value from table2) tmp2 on 1=1
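The same shape of query can be tried locally. The sketch below runs it against SQLite instead of Hive (an assumption made purely so the example is self-contained), with hypothetical single-column tables.

```python
import sqlite3

QUERY = """
SELECT CASE WHEN tmp1.value = tmp2.value THEN 'Pass' ELSE 'Fail' END AS result
FROM (SELECT COUNT(1) AS value FROM table1) tmp1
JOIN (SELECT COUNT(1) AS value FROM table2) tmp2 ON 1 = 1
"""


def compare_counts(conn):
    """Return 'Pass' when table1 and table2 have the same row count."""
    return conn.execute(QUERY).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (x INTEGER);
    CREATE TABLE table2 (x INTEGER);
    INSERT INTO table1 (x) VALUES (1), (2), (3);
    INSERT INTO table2 (x) VALUES (7), (8), (9);
    """
)

before = compare_counts(conn)  # both tables have 3 rows
conn.execute("INSERT INTO table2 (x) VALUES (10)")
after = compare_counts(conn)   # table2 now has 4 rows

print(before, after)  # Pass Fail
```

Each derived table returns exactly one row, so the `on 1=1` join produces a single row to evaluate the CASE against.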

Many left joins on subqueries, need some way to increase performance

Below is an example of my query as it stands. I have at most, approximately 10 of these joins/subqueries all of basically the same format, but with different joins and where clauses.
SELECT DISTINCT mytable.label, tableA.counter, tableB.counter
FROM mytable
LEFT JOIN
(SELECT COUNT(id) as counter, label
FROM mytable
...joins...
...where...
GROUP BY label) tableA
ON tableA.label=mytable.label
LEFT JOIN
(SELECT COUNT(id) as counter, label
FROM mytable
...joins...
...where...
GROUP BY label) tableB
ON tableB.label=mytable.label
...
It's taking about 2-4 seconds and this is a high-traffic page, so that kind of speed isn't good enough. Can anyone recommend a way to improve performance here?
No need to GROUP here, as you're only returning 1 value. Try a subquery approach like this:
SELECT DISTINCT T.label,
(SELECT COUNT(id) as counter FROM tableA A WHERE A.blah = T.blah) as AValue,
(SELECT COUNT(id) as counter FROM tableB B WHERE B.blah = T.blah) as BValue
FROM mytable T
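To see what the correlated-subquery form returns, here is a small sqlite3 sketch. The table contents are invented, and `blah` is kept as the placeholder join column from the answer; whether this is faster than the original joins depends on the database and its indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE mytable (label TEXT, blah INTEGER);
    CREATE TABLE tableA (id INTEGER PRIMARY KEY, blah INTEGER);
    CREATE TABLE tableB (id INTEGER PRIMARY KEY, blah INTEGER);
    INSERT INTO mytable (label, blah) VALUES ('x', 1), ('y', 2);
    INSERT INTO tableA (id, blah) VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO tableB (id, blah) VALUES (1, 2);
    """
)

# Each scalar subquery is evaluated per outer row and always yields one value;
# COUNT returns 0 (not NULL) when nothing matches.
rows = conn.execute(
    """
    SELECT DISTINCT T.label,
           (SELECT COUNT(id) FROM tableA A WHERE A.blah = T.blah) AS AValue,
           (SELECT COUNT(id) FROM tableB B WHERE B.blah = T.blah) AS BValue
    FROM mytable T
    ORDER BY T.label
    """
).fetchall()

print(rows)  # [('x', 2, 0), ('y', 1, 1)]
```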
In addition to Jon Tirjan's solution, I'd like to share another one that uses UNION and PIVOT:
SELECT [A], [B]
FROM (
SELECT 'A' AS TableName, COUNT(id) as counter
FROM tableA
UNION ALL
SELECT 'B' AS TableName, COUNT(id) as counter
FROM tableB
) AS DT
PIVOT(SUM(counter) FOR TableName IN([A], [B])) AS PVT

Written a subquery that can return more than one field without using the Exists

The query below is supposed to pull records for fields with the max date.
I am getting an error
You have written a subquery that can return more than one field without using EXISTS reserved word in the Main query's FROM clause. Revise the SELECT statement of the subquery to request only one column.
Code:
SELECT *
FROM TableName
WHERE (((([Project_Name], [Date])) IN (SELECT Project_Name, MAX(Date)
FROM TableName
GROUP BY Project)));
You're probably thinking of a nested subquery used as a table, like the one below:
select a.*, b.Col1, b.Col2
from FirstTable A
join (Select Id, firstcolumn as Col1, secondcolumn as Col2
from SecondTable) B on b.ID = a.ID
Works pretty much like a regular join except you are using a subquery. Hope that helps.
SELECT A.*
FROM TableName A
INNER JOIN (select Project_Name, max(Date) MaxDate
from TableName
group by Project_Name) B
ON A.[Project_Name] = B.[Project_Name]
AND A.[Date] = B.MaxDate
A version using EXISTS() looks like this:
SELECT *
FROM TableName AS A
WHERE EXISTS(
SELECT * FROM (
SELECT B.Project_Name, MAX( B.Date ) AS MaxDate
FROM TableName AS B
GROUP BY B.Project_Name ) AS C
WHERE C.Project_Name = A.Project_Name AND C.MaxDate = A.Date
);
Although I have the feeling this will have poorer performance than a JOIN because the GROUP BY statement might have to be executed for each record and each call to the EXISTS() function...
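The JOIN and EXISTS variants can be compared on a toy table. This sketch uses SQLite rather than Access (an assumption for runnability), and the `Note` column and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (Project_Name TEXT, [Date] TEXT, Note TEXT)")
conn.executemany(
    "INSERT INTO TableName (Project_Name, [Date], Note) VALUES (?, ?, ?)",
    [
        ("Alpha", "2020-01-01", "a1"),
        ("Alpha", "2020-03-01", "a2"),
        ("Beta", "2020-02-01", "b1"),
    ],
)

# Join each row to its project's latest date.
join_rows = conn.execute(
    """
    SELECT A.*
    FROM TableName A
    INNER JOIN (SELECT Project_Name, MAX([Date]) AS MaxDate
                FROM TableName
                GROUP BY Project_Name) B
        ON A.Project_Name = B.Project_Name AND A.[Date] = B.MaxDate
    ORDER BY A.Project_Name
    """
).fetchall()

# Same filter expressed with EXISTS over the grouped derived table.
exists_rows = conn.execute(
    """
    SELECT *
    FROM TableName AS A
    WHERE EXISTS (
        SELECT * FROM (
            SELECT B.Project_Name, MAX(B.[Date]) AS MaxDate
            FROM TableName AS B
            GROUP BY B.Project_Name) AS C
        WHERE C.Project_Name = A.Project_Name AND C.MaxDate = A.[Date])
    ORDER BY A.Project_Name
    """
).fetchall()

print(join_rows)
# [('Alpha', '2020-03-01', 'a2'), ('Beta', '2020-02-01', 'b1')]
```

Both queries return the same rows here; which one performs better is up to the query planner of the database in use.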

ODBC Firebird Sql Query - Syntax

Trying to get a slightly more complex SQL statement structured but can't seem to get the syntax right. Trying to select counts, of various columns, in two different tables.
SELECT
SUM(ColumninTable1),
SUM(Column2inTable1),
COUNT(DISTINCT(Column3inTable1))
FROM TABLE1
This works; however, I can't for the life of me figure out the syntax for adding in a COUNT(DISTINCT(Column1inTable2)) FROM TABLE2.
There are several solutions you can take:
Disjunct FULL OUTER JOIN
SELECT
SUM(MYTABLE.ID) as theSum,
COUNT(DISTINCT MYTABLE.SOMEVALUE) as theCount,
COUNT(DISTINCT MYOTHERTABLE.SOMEOTHERVALUE) as theOtherCount
FROM MYTABLE
FULL OUTER JOIN MYOTHERTABLE ON 1=0
UNION two queries and leave the column for the other table null
SELECT
MAX(theSum) as theSum,
MAX(theCount) as theCount,
MAX(theOtherCount) AS theOtherCount
FROM (
SELECT
SUM(ID) as theSum,
COUNT(DISTINCT SOMEVALUE) as theCount,
NULL as theOtherCount
FROM MYTABLE
UNION ALL
SELECT
NULL,
NULL,
COUNT(DISTINCT SOMEOTHERVALUE)
FROM MYOTHERTABLE
)
Query 'with a query per column' against a single record table (eg RDB$DATABASE)
SELECT
(SELECT SUM(ID) FROM MYTABLE) as theSum,
(SELECT COUNT(DISTINCT SOMEVALUE) FROM MYTABLE) as theCount,
(SELECT COUNT(DISTINCT SOMEOTHERVALUE) FROM MYOTHERTABLE) as theOtherCount
FROM RDB$DATABASE
CTE per table + cross join
WITH query1 AS (
SELECT
SUM(ID) as theSum,
COUNT(DISTINCT SOMEVALUE) as theCount
FROM MYTABLE
),
query2 AS (
SELECT
COUNT(DISTINCT SOMEOTHERVALUE) as theOtherCount
FROM MYOTHERTABLE
)
SELECT
query1.theSum,
query1.theCount,
query2.theOtherCount
FROM query1
CROSS JOIN query2
There are probably more solutions. You might want to ask yourself whether it is worth the effort of coming up with a (convoluted, hard to understand) single query to get this data where two queries are sufficient and easier to understand; with large datasets, two separate queries might also be faster.
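Two of these variants can be checked against each other with a short sqlite3 sketch. SQLite stands in for Firebird here (an assumption), so the scalar-subquery form omits `FROM RDB$DATABASE`, which SQLite does not have and does not need. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE MYTABLE (ID INTEGER, SOMEVALUE TEXT);
    CREATE TABLE MYOTHERTABLE (SOMEOTHERVALUE TEXT);
    INSERT INTO MYTABLE (ID, SOMEVALUE) VALUES (1, 'a'), (2, 'a'), (3, 'b');
    INSERT INTO MYOTHERTABLE (SOMEOTHERVALUE) VALUES ('x'), ('y'), ('y'), ('z');
    """
)

# One scalar subquery per output column.
scalar = conn.execute(
    """
    SELECT (SELECT SUM(ID) FROM MYTABLE) AS theSum,
           (SELECT COUNT(DISTINCT SOMEVALUE) FROM MYTABLE) AS theCount,
           (SELECT COUNT(DISTINCT SOMEOTHERVALUE) FROM MYOTHERTABLE) AS theOtherCount
    """
).fetchone()

# One single-row CTE per table, combined with a cross join.
cte = conn.execute(
    """
    WITH query1 AS (
        SELECT SUM(ID) AS theSum, COUNT(DISTINCT SOMEVALUE) AS theCount
        FROM MYTABLE
    ),
    query2 AS (
        SELECT COUNT(DISTINCT SOMEOTHERVALUE) AS theOtherCount
        FROM MYOTHERTABLE
    )
    SELECT query1.theSum, query1.theCount, query2.theOtherCount
    FROM query1
    CROSS JOIN query2
    """
).fetchone()

print(scalar, cte)  # (6, 2, 3) (6, 2, 3)
```

Because each CTE aggregates to exactly one row, the cross join cannot multiply rows.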
In this case all the counts would return the same value.
Try to do the same using sub queries:
Select
(Select count (*) from Table1),
   (Select count (*) from table2)
from Table3

Where Statement w/ Distinct

I have a large table, but for the purposes of this question let's assume I have the following column structure:
I'd like to have a Where statement that returns only rows where the e-mail address is distinct in that particular column.
Thoughts?
SELECT BillingEMail
FROM tableName
GROUP BY BillingEMail
HAVING COUNT(BillingEMail) = 1
-- or: HAVING COUNT(*) = 1
I don't know which RDBMS you are using (which is why I can't suggest analytic functions), but if you want all columns you can join with a subquery:
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT BillingEMail
FROM tableName
GROUP BY BillingEMail
HAVING COUNT(BillingEMail) = 1
)b ON a.BillingEMail = b.BillingEMail
In most databases, you can do this
select t.AccountId, t.BillingEmail
from (select t.*, count(*) over (partition by BillingEmail) as cnt
from t
) t
where cnt = 1
The advantage of this approach is that you can get as many columns as you like from the table.
I prefer JW's approach, but here is another one using NOT EXISTS.
SELECT AccountID, [Billing Email]
FROM table t1
WHERE NOT EXISTS (
-- Make sure that no other row contains the same
-- email, but a different Account ID.
SELECT 1
FROM table t2
WHERE t1.[Billing Email] = t2.[Billing Email]
AND t1.AccountID <> t2.AccountID
)
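For comparison, the GROUP BY/HAVING, join-with-subquery, and NOT EXISTS approaches can be run side by side in SQLite. The column names follow the first answer (`AccountID`, `BillingEMail`), and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (AccountID INTEGER, BillingEMail TEXT)")
conn.executemany(
    "INSERT INTO tableName (AccountID, BillingEMail) VALUES (?, ?)",
    [
        (1, "a@example.com"),
        (2, "b@example.com"),
        (3, "a@example.com"),
        (4, "c@example.com"),
    ],
)

# E-mail addresses that occur exactly once.
unique_emails = [
    row[0]
    for row in conn.execute(
        """
        SELECT BillingEMail
        FROM tableName
        GROUP BY BillingEMail
        HAVING COUNT(*) = 1
        ORDER BY BillingEMail
        """
    )
]

# Full rows, by joining back to the grouped subquery.
via_join = conn.execute(
    """
    SELECT a.*
    FROM tableName a
    INNER JOIN (SELECT BillingEMail
                FROM tableName
                GROUP BY BillingEMail
                HAVING COUNT(BillingEMail) = 1) b
        ON a.BillingEMail = b.BillingEMail
    ORDER BY a.AccountID
    """
).fetchall()

# Full rows, by excluding any e-mail shared with a different account.
via_not_exists = conn.execute(
    """
    SELECT t1.AccountID, t1.BillingEMail
    FROM tableName t1
    WHERE NOT EXISTS (
        SELECT 1
        FROM tableName t2
        WHERE t1.BillingEMail = t2.BillingEMail
          AND t1.AccountID <> t2.AccountID)
    ORDER BY t1.AccountID
    """
).fetchall()

print(unique_emails)  # ['b@example.com', 'c@example.com']
print(via_join)       # [(2, 'b@example.com'), (4, 'c@example.com')]
```

Note that the NOT EXISTS form treats two rows with the same e-mail and the same AccountID as unique, so it only matches the others when AccountID is itself unique per row.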