Count different groups in the same query - sql

Imagine I have a table like this:
# | A | B | MoreFieldsHere
1 1 1
2 1 3
3 1 5
4 2 6
5 2 7
6 3 9
B is associated to A in an 1:n relationship. The table could've been created with a join for example.
I want to get both the total count and the count of different A.
I know I can use a query like this:
SELECT v1.cnt AS total, v2.cnt AS num_of_A
FROM
(
SELECT COUNT(*) AS cnt
FROM SomeComplicatedQuery
WHERE 1=1
-- AND SomeComplicatedCondition
) v1,
(
SELECT COUNT(A) AS cnt
FROM SomeComplicatedQuery
WHERE 1=1
-- AND SomeComplicatedCondition
GROUP BY A
) v2
However SomeComplicatedQuery would be a complicated and slow query and SomeComplicatedCondition would be the same in both cases. And I want to avoid calling it unnessesarily. Aside from that if the query changes, you need to make sure to change it in the other place too, making it prone to error and creating (probably unnessesary) work.
Is there a way to do this more efficiently?

Are you looking for this?
SELECT COUNT(*) AS total, COUNT(DISTINCT A) AS num_of_A
FROM (. . . ) q

Related

JOIN on aggregate function

I have a table showing production steps (PosID) for a production order (OrderID) and which machine (MachID) they will be run on; I’m trying to reduce the table to show one record for each order – the lowest position (field “PosID”) that is still open (field “Open” = Y); i.e. the next production step for the order.
Example data I have:
OrderID
PosID
MachID
Open
1
1
A
N
1
2
B
Y
1
3
C
Y
2
4
C
Y
2
5
D
Y
2
6
E
Y
Example result I want:
OrderID
PosID
MachID
1
2
B
2
4
C
I’ve tried two approaches, but I can’t seem to get either to work:
I don’t want to put “MachID” in the GROUP BY because that gives me all the records that are open, but I also don’t think there is an appropriate aggregate function for the “MachID” field to make this work.
SELECT “OrderID”, MIN(“PosID”), “MachID”
FROM Table T0
WHERE “Open” = ‘Y’
GROUP BY “OrderID”
With this approach, I keep getting error messages that T1.”PosID” (in the JOIN clause) is an invalid column. I’ve also tried T1.MIN(“PosID”) and MIN(T1.”PosID”).
SELECT T0.“OrderID”, T0.“PosID”, T0.“MachID”
FROM Table T0
JOIN
(SELECT “OrderID”, MIN(“PosID”)
FROM Table
WHERE “Open” = ‘Y’
GROUP BY “OrderID”) T1
ON T0.”OrderID” = T1.”OrderID”
AND T0.”PosID” = T1.”PosID”
Try this:
SELECT “OrderID”,“PosID”,“MachID” FROM (
SELECT
T0.“OrderID”,
T0.“PosID”,
T0.“MachID”,
ROW_NUMBER() OVER (PARTITION BY “OrderID” ORDER BY “PosID”) RNK
FROM Table T0
WHERE “Open” = ‘Y’
) AS A
WHERE RNK = 1
I've included the brackets when selecting columns as you've written it in the question above but in general it's not needed.
What it does is it first filters open OrderIDs and then numbers the OrderIDs from 1 to X which are ordered by PosID
OrderID
PosID
MachID
Open
RNK
1
2
B
Y
1
1
3
C
Y
2
2
4
C
Y
1
2
5
D
Y
2
2
6
E
Y
3
After it filters on the "rnk" column indicating the lowest PosID per OrderID. ROW_NUMBER() in the select clause is called a window function and there are many more which are quite useful.
P.S. Above solution should work for MSSQL

SQL COUNT with condition and without - using JOIN

My goal is something like following table:
Key | Count since date X | Count total
1 | 4 | 28
With two simple selects I could gain this values: (the key of the table consists of 3 columns [t$ncmp, t$trav, t$seqn])
1. SELECT COUNT(*) FROM db.table WHERE t$date >= sysdate-2 GROUP BY t$ncmp, t$trav, t$seqn
2. SELECT COUNT(*) FROM db.table GROUP BY t$ncmp, t$trav, t$seqn
How can I join these statements?
What I tried:
SELECT n.t$trav, COUNT(n.t$trav), m.total FROM db.table n
LEFT JOIN (SELECT t$ncmp, t$trav, t$seqn, COUNT(*) as total FROM db.table
GROUP BY t$ncmp, t$trav, t$seqn) m
ON (n.t$ncmp = m.t$ncmp AND n.t$trav = m.t$trav AND n.t$seqn = m.t$seqn)
WHERE n.t$date >= sysdate-2
GROUP BY n.t$ncmp, n.t$trav, n.t$seqn
I tried different variantes, but always got errors like 'group by is missing' or 'unknown qualifier'.
Now this at least executes, but total is always 2.
T$TRAV COUNT(N.T$TRAV) TOTAL
4 2 2
29 3 2
51 1 2
62 2 2
16 1 2
....
If it matter, I will run this as an OPENQUERY from MSSQLSERVER to Oracle-DB.
I'd try
GROUP BY n.t$trav, m.total
You typically GROUP BY the same columns as you SELECT - except those who are arguments to set functions.
My goal is something like following table:
If so, you seem to want conditional aggregation:
select key, count(*) as total,
sum(case when datecol >= date 'xxxx-xx-xx' then 1 else 0 end) as total_since_x
from t
group by key;
I'm not sure how this relates to your sample queries. I simply don't see the relationship between that code and your question.

SQL To delete number of items is less than required item number

I have two tables - StepModels (support plan) and FeedbackStepModels (feedback), StepModels keeps how many steps each support plan requires.
SELECT [SupportPlanID],COUNT(*)AS Steps
FROM [StepModels]
GROUP BY SupportPlanID
SupportPlanID (Steps)
-------------------------------
1 4
2 9
3 3
4 10
FeedbackStepModels keeps how many steps employee entered the system
SELECT [FeedbackID],SupportPlanID,Count(*)AS StepsNumber
FROM [FeedbackStepModels]
GROUP BY FeedbackID,SupportPlanID
FeedbackID SupportPlanID
---------------------------------------------
1 1 3 --> this suppose to be 4
2 2 9 --> Correct
3 3 0 --> this suppose to be 3
4 4 10 --> Correct
If submitted Feedback steps total is less then required total amount I want to delete this wrong entry from the database. Basically i need to delete FeedbackID 1 and 3.
I can load the data into List and compare and delete it, but want to know if we can we do this in SQL rather than C# code.
You can use the query below to remove your unwanted data by SQL Script
DELETE f
FROM FeedbackStepModels f
INNER JOIN (
SELECT [FeedbackID],SupportPlanID, Count(*) AS StepsNumber
FROM [FeedbackStepModels]
GROUP BY FeedbackID,SupportPlanID
) f_derived on f_derived_FeedbackID=f.FeedBackID and f_derived.SupportPlanID = f.SupportPlanID
INNER JOIN (
SELECT [SupportPlanID],COUNT(*)AS Steps
FROM [StepModels]
GROUP BY SupportPlanID
) s_derived on s_derived.SupportPlanID = f.SupportPlanID
WHERE f_derived.StepsNumber < s_derived.Steps
I think you want something like this.
DELETE FROM [FeedbackStepModels]
WHERE FeedbackID IN
(
SELECT a.FeedbackID
FROM
(
SELECT [FeedbackID],
SupportPlanID,
COUNT(*) AS StepsNumber
FROM [FeedbackStepModels]
GROUP BY FeedbackID,
SupportPlanID
) AS a
INNER JOIN
(
SELECT [SupportPlanID],
COUNT(*) AS Steps
FROM [StepModels]
GROUP BY SupportPlanID
) AS b ON a.SupportPlanID = b.[SupportPlanID]
WHERE a.StepsNumber < b.Steps
);

counting subquery in SQL

I have the following query to count how many times each process_track_id occurs in a table:
SELECT
a.process_track_id,
COUNT(1) AS 'num'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
This returns the following results:
process_track_id | num
1 14
2 44
3 16
5 8
6 18
7 17
8 14
This is great. Now is the part where I am stuck. I would like to get the following table:
num count
8 1
14 2
16 1
17 1
18 1
44 1
Where num are the distinct counts from the first table, and count is how many times that frequency occurs.
Here is what I have tried (it's a subquery, but I'm not sold on the method) and I haven't been able to get it to work just yet. I'm new to SQL and I think I'm missing out on some some key aspects of the syntax.
SELECT
X.id_count,
count(1) as 'num_count'
FROM
(SELECT
a.process_track_id,
COUNT(1) AS 'id_count'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
--COUNT(1) AS 'id_count'
) X;
Any ideas?
It's probably good to keep in mind that this may have to be run on a database with at least 1 million records, and I don't have the ability to create a new table in the process.
Thanks!
Here's the subquery method you were driving at:
SELECT id_count, COUNT(*) AS 'num_count'
FROM (SELECT a.process_track_id
,COUNT(*) AS 'id_count'
FROM transreport.process_name a
GROUP BY a.process_track_id
)sub
GROUP BY id_count
Not sure there's a better method as the aggregation needs to run once anyway.
Try this
SELECT x.num, COUNT(*) AS COUNT
FROM (
SELECT
a.process_track_id, -- <--- You may removed this column
COUNT(*) AS 'num'
FROM
transreport.process_name a
GROUP BY
a.process_track_id
) X
GROUP BY X.num

return count 0 with mysql group by

database table like this
============================
= suburb_id | value
= 1 | 2
= 1 | 3
= 2 | 4
= 3 | 5
query is
SELECT COUNT(suburb_id) AS total, suburb_id
FROM suburbs
where suburb_id IN (1,2,3,4)
GROUP BY suburb_id
however, while I run this query, it doesn't give COUNT(suburb_id) = 0 when suburb_id = 0
because in suburbs table, there is no suburb_id 4, I want this query to return 0 for suburb_id = 4, like
============================
= total | suburb_id
= 2 | 1
= 1 | 2
= 1 | 3
= 0 | 4
A GROUP BY needs rows to work with, so if you have no rows for a certain category, you are not going to get the count. Think of the where clause as limiting down the source rows before they are grouped together. The where clause is not providing a list of categories to group by.
What you could do is write a query to select the categories (suburbs) then do the count in a subquery. (I'm not sure what MySQL's support for this is like)
Something like:
SELECT
s.suburb_id,
(select count(*) from suburb_data d where d.suburb_id = s.suburb_id) as total
FROM
suburb_table s
WHERE
s.suburb_id in (1,2,3,4)
(MSSQL, apologies)
This:
SELECT id, COUNT(suburb_id)
FROM (
SELECT 1 AS id
UNION ALL
SELECT 2 AS id
UNION ALL
SELECT 3 AS id
UNION ALL
SELECT 4 AS id
) ids
LEFT JOIN
suburbs s
ON s.suburb_id = ids.id
GROUP BY
id
or this:
SELECT id,
(
SELECT COUNT(*)
FROM suburb
WHERE suburb_id = id
)
FROM (
SELECT 1 AS id
UNION ALL
SELECT 2 AS id
UNION ALL
SELECT 3 AS id
UNION ALL
SELECT 4 AS id
) ids
This article compares performance of the two approaches:
Aggregates: subqueries vs. GROUP BY
, though it does not matter much in your case, as you are querying only 4 records.
Query:
select case
when total is null then 0
else total
end as total_with_zeroes,
suburb_id
from (SELECT COUNT(suburb_id) AS total, suburb_id
FROM suburbs
where suburb_id IN (1,2,3,4)
GROUP BY suburb_id) as dt
#geofftnz's solution works great if all conditions are simple like in this case. But I just had to solve a similar problem to generate a report where each column in the report is a different query. When you need to combine results from several select statements, then something like this might work.
You may have to programmatically create this query. Using left joins allows the query to return rows even if there are no matches to suburb_id with a given id. If your db supports it (which most do), you can use IFNULL to replace null with 0:
select IFNULL(a.count,0), IFNULL(b.count,0), IFNULL(c.count,0), IFNULL(d.count,0)
from (select count(suburb_id) as count from suburbs where id=1 group by suburb_id) a,
left join (select count(suburb_id) as count from suburbs where id=2 group by suburb_id) b on a.suburb_id=b.suburb_id
left join (select count(suburb_id) as count from suburbs where id=3 group by suburb_id) c on a.suburb_id=c.suburb_id
left join (select count(suburb_id) as count from suburbs where id=4 group by suburb_id) d on a.suburb_id=d.suburb_id;
The nice thing about this is that (if needed) each "left join" can use slightly different (possibly fairly complex) query.
Disclaimer: for large data sets, this type of query might have not perform very well (I don't write enough sql to know without investigating further), but at least it should give useful results ;-)