Subquery performing a COUNT DISTINCT on the wrong grouping

I'm fairly new to SQL and have a problem with a subquery that is performing a count distinct on the wrong grouping. I'd appreciate any help at all with this.
I am querying attendees at sessions for particular groups for an MS SQL Server (SSRS 2008) report.
I am trying to join TblGroup, TblGroupSession and TblGroupSUAttendee and count the DISTINCT number of GroupSUAttendee at any GROUP. The query below counts the distinct number of GroupSUAttendee at any SESSION, so when I add the counts together for a group I get duplicates if an attendee has attended more than one session.
I need to keep one row per session in the query, as I need that for other purposes. It is fine for each session row to show the complete attendee total for its group, because I can reference that value once per group in my SSRS report.
Thoughts/advice/pointers much appreciated.
Thanks
Eils
SELECT
    TblGroup.GroupId
    ,TblGroupSession.GroupSessionId
    ,TblGroupSession.GroupSessionDate
    ,TblGroupSUAttendee.GroupSUAttendeeCount
FROM
    TblGroup
    LEFT OUTER JOIN TblGroupSession
        ON TblGroup.GroupId = TblGroupSession.GroupSessionGroupId
    LEFT OUTER JOIN (SELECT COUNT(DISTINCT GroupSUAttendeeId) AS GroupSUAttendeeCount,
                            GroupSUAttendeeGroupSessionId
                     FROM TblGroupSUAttendee
                     GROUP BY GroupSUAttendeeGroupSessionId) AS TblGroupSUAttendee
        ON GroupSUAttendeeGroupSessionId = TblGroupSession.GroupSessionId
WHERE
    GroupSessionDate >= @StartDate AND GroupSessionDate <= @EndDate

If you want to count attendees within groups, then use group by, but don't include per-session information. In other words, just combine the groups with the sessions, and the sessions with the attendees in one query. Then aggregate by GroupId and count the attendees:
SELECT g.GroupId,
       COUNT(DISTINCT ga.GroupSUAttendeeId) AS GroupSUAttendeeCount
FROM TblGroup g
LEFT OUTER JOIN TblGroupSession gs
    ON g.GroupId = gs.GroupSessionGroupId
LEFT OUTER JOIN TblGroupSUAttendee ga
    ON ga.GroupSUAttendeeGroupSessionId = gs.GroupSessionId
GROUP BY g.GroupId;
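If you still need one row per session, as the question asks, one option is to compute the per-group count in a derived table and join it back to the session rows. The following is only a sketch: it assumes GroupSUAttendeeId identifies the person across sessions, and that @StartDate and @EndDate are the report parameters. SQL Server does not accept COUNT(DISTINCT ...) as a window function, which is why a derived table is used here.

```sql
SELECT g.GroupId,
       gs.GroupSessionId,
       gs.GroupSessionDate,
       gc.GroupSUAttendeeCount          -- same value on every session row of a group
FROM TblGroup g
LEFT OUTER JOIN TblGroupSession gs
    ON g.GroupId = gs.GroupSessionGroupId
LEFT OUTER JOIN (
    -- distinct attendees per group, restricted to the same date range
    SELECT s.GroupSessionGroupId,
           COUNT(DISTINCT a.GroupSUAttendeeId) AS GroupSUAttendeeCount
    FROM TblGroupSession s
    INNER JOIN TblGroupSUAttendee a
        ON a.GroupSUAttendeeGroupSessionId = s.GroupSessionId
    WHERE s.GroupSessionDate >= @StartDate
      AND s.GroupSessionDate <= @EndDate
    GROUP BY s.GroupSessionGroupId
) gc
    ON gc.GroupSessionGroupId = g.GroupId
WHERE gs.GroupSessionDate >= @StartDate
  AND gs.GroupSessionDate <= @EndDate;
```

In the report, the count should then be read once per group (not summed over the session rows), since it repeats on each row.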

Related

Too much Data using DISTINCT MAX

I want to see the last activity of each individual handset and the user that used that handset. I have a table UserSessions that stores the last activity of a particular user as well as the handset they used in that activity. There are roughly 40 handsets, yet I always get back far too many records, around 10,000 rows, when I only want the last activity of each handset. What am I doing wrong?
SELECT DISTINCT MAX(UserSessions.LastActivity), Handsets.Name, Users.Username
FROM UserSessions
INNER JOIN Handsets ON Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users ON Users.UserId = UserSessions.UserId
WHERE
    Handsets.Name IN (1000,1001,1002,1003,1004....)
    AND Handsets.Deleted = 0
GROUP BY UserSessions.LastActivity, Handsets.Name, Users.Username
I expect to get one record per handset, showing the user's last activity with that handset. What I get is multiple records for every handset and date, over 10,000 rows in total.

You typically GROUP BY the same columns as you SELECT, except those that are arguments to set (aggregate) functions.
This GROUP BY returns no duplicates, so SELECT DISTINCT isn't needed.
SELECT MAX(UserSessions.LastActivity), Handsets.Name, Users.Username
FROM UserSessions
INNER JOIN Handsets ON Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users ON Users.UserId = UserSessions.UserId
WHERE Handsets.Name IN (1000,1001,1002,1003,1004....)
  AND Handsets.Deleted = 0
GROUP BY Handsets.Name, Users.Username

There is no such thing as DISTINCT MAX. There is SELECT DISTINCT, which ensures that the combination of all columns in the SELECT is not duplicated across rows. And there is MAX(), an aggregation function.
As a note: SELECT DISTINCT is almost never appropriate with GROUP BY.
You seem to want:
SELECT *
FROM (SELECT h.Name, u.Username, MAX(us.LastActivity) AS last_activity,
             RANK() OVER (PARTITION BY h.Name ORDER BY MAX(us.LastActivity) DESC) AS seqnum
      FROM UserSessions us
      INNER JOIN Handsets h
          ON h.HandsetId = us.HandsetId
      INNER JOIN Users u
          ON u.UserId = us.UserId
      WHERE h.Name IN (1000,1001,1002,1003,1004....) AND
            h.Deleted = 0
      GROUP BY h.Name, u.Username
     ) h
WHERE seqnum = 1
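A variation on the same idea, offered only as a sketch: rank the session rows directly with ROW_NUMBER() instead of aggregating first. Unlike RANK(), ROW_NUMBER() returns exactly one row per handset even when two sessions share the same LastActivity (which of the tied rows is kept is then arbitrary). The handset-name filter from the question is left out here because its list is elided; add it back to the inner WHERE as needed.

```sql
SELECT x.Name, x.Username, x.LastActivity
FROM (SELECT h.Name,
             u.Username,
             us.LastActivity,
             -- 1 = most recent session row for this handset
             ROW_NUMBER() OVER (PARTITION BY h.HandsetId
                                ORDER BY us.LastActivity DESC) AS seqnum
      FROM UserSessions us
      INNER JOIN Handsets h
          ON h.HandsetId = us.HandsetId
      INNER JOIN Users u
          ON u.UserId = us.UserId
      WHERE h.Deleted = 0
     ) x
WHERE x.seqnum = 1;
```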

Oracle - select statement to rollup multiple tables within a time frame

I have 3 Oracle tables for a project that link a demo Transaction table to Transaction_Customer and Transaction_Employee as shown below. Each transaction can have multiple customers involved and many employees involved.
I am trying to write a SQL query that lists each Customer_ID that has had transactions with multiple employees within a given time period. I would like the output to include a single row for each Customer_ID, with a comma-separated list of the Employee_IDs that had a transaction with that customer.
The output should look like this:
Customer_ID|Employees
601|007,008,009
The basic query to join the tables together looks like this:
select * from transactions t
left join transactions_customer tc
on t.t_id = tc.t_id
left join transactions_employee te
on t.t_id = te.t_id
How do I finish this assignment and get the query working the way intended?
Thank you!
Transactions
T_ID|Date|Amount
1|1/10/2017|100
2|1/10/2017|200
3|1/31/2017|150
4|2/16/2017|175
5|2/17/2017|175
6|2/18/2017|185
Transactions_Customer
T_ID|Customer_ID
1|600
1|601
1|602
2|605
3|606
4|601
5|607
6|607
Transactions_Employee
T_ID|Employee_ID
1|007
1|008
2|009
3|008
4|009
5|007
6|007

Is this what you want?
select tc.Customer_id,
       listagg(te.employee_id, ',') within group (order by te.employee_id) as employees
from Transactions_Customer tc
join Transactions_Employee te
    on tc.t_id = te.t_id
group by tc.Customer_id;
You only need the Transactions table for filtering on the date. Your question alludes to such filtering but does not exactly describe it, so I left it out.
Edit:
The customer data (and perhaps the employees data too) has duplicates. To avoid these in the output:
select tc.Customer_id,
       listagg(te.employee_id, ',') within group (order by te.employee_id) as employees
from (select distinct tc.t_id, tc.customer_id
      from Transactions_Customer tc
     ) tc
join (select distinct te.t_id, te.employee_id
      from Transactions_Employee te
     ) te
    on tc.t_id = te.t_id
group by tc.Customer_id;
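To cover the two parts of the question that the queries above leave out (the time frame and the "multiple employees" condition), something along these lines might work. It is a sketch under assumptions: the date column is called t_date here (a hypothetical name, since DATE is a reserved word in Oracle and the real column name is not given), and the date range is only an example. The inner query reduces the data to distinct customer/employee pairs, so an employee who appears on several transactions with the same customer is listed once, and COUNT(*) > 1 then means "more than one distinct employee".

```sql
select ce.customer_id,
       listagg(ce.employee_id, ',') within group (order by ce.employee_id) as employees
from (select distinct tc.customer_id, te.employee_id
      from transactions t
      join transactions_customer tc
          on tc.t_id = t.t_id
      join transactions_employee te
          on te.t_id = t.t_id
      -- t_date and the range below are assumptions for illustration
      where t.t_date >= date '2017-01-01'
        and t.t_date <  date '2017-03-01'
     ) ce
group by ce.customer_id
having count(*) > 1;   -- keep only customers with more than one distinct employee
```

With the sample data in the question, customer 601 appears on transactions 1 and 4, which gives the employee list 007,008,009.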

get the count of records from two tables in sql

I have three tables: ORG_DETAILS, DISTRICT_MASTER and WORKER_DETAILS.
I want the number of organizations and the number of related workers for each district_name. Here is the query I am trying:
SELECT dm.DISTRICT_NAME,
       count(od.Org_ID) as orgcount,
       count(wd.WORKER_ID) as workerscount
from ORG_DETAILS od
left join WORKER_DETAILS wd on wd.ORG_ID = od.ORG_ID
left join DISTRICT_MASTER dm on od.DISTRICT_ID = dm.DISTRICT_ID
GROUP BY dm.DISTRICT_NAME
I am getting duplicate values in the counts, for example one extra org count and worker count.
Please help me with this.
Thank you.

The simplest way to fix your problem is to use count(distinct):
SELECT dm.DISTRICT_NAME,
       count(distinct od.Org_ID) as orgcount,
       count(distinct wd.WORKER_ID) as workerscount
from ORG_DETAILS od
left join WORKER_DETAILS wd
    on wd.ORG_ID = od.ORG_ID
left join DISTRICT_MASTER dm
    on od.DISTRICT_ID = dm.DISTRICT_ID
GROUP BY dm.DISTRICT_NAME;
If the counts are high, aggregating along each dimension before the join usually performs better than count(distinct), because the join then no longer multiplies organization rows by worker rows.
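A sketch of that pre-aggregation, assuming ORG_ID is unique in ORG_DETAILS and WORKER_ID is unique in WORKER_DETAILS (neither is stated in the question): workers are counted per organization first, so each organization contributes exactly one row to the outer query.

```sql
SELECT dm.DISTRICT_NAME,
       COUNT(od.ORG_ID) AS orgcount,                     -- one row per organization
       COALESCE(SUM(w.workerscount), 0) AS workerscount  -- 0 when no organization has workers
FROM ORG_DETAILS od
LEFT JOIN (
    -- workers per organization, computed before joining
    SELECT ORG_ID, COUNT(WORKER_ID) AS workerscount
    FROM WORKER_DETAILS
    GROUP BY ORG_ID
) w
    ON w.ORG_ID = od.ORG_ID
LEFT JOIN DISTRICT_MASTER dm
    ON od.DISTRICT_ID = dm.DISTRICT_ID
GROUP BY dm.DISTRICT_NAME;
```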

Sum SQL statement outputs many rows when I expect only one

So I have 3 tables joined as shown:
What I want to do is query for the sum of all the holdings that fall into the criteria specified for the clients in my query. Here is what I have:
SELECT Sum(Holdings.HoldingValue) AS SumOfHoldingValue
FROM (Clients INNER JOIN Accounts
ON Clients.ClientID = Accounts.ClientID)
INNER JOIN Holdings
ON Accounts.AccountID = Holdings.AccNum
GROUP BY Holdings.HoldingDate, Clients.Active, Clients.RiskCode, Clients.NewClient, Clients.BaseCurrency, Clients.ClientID
HAVING (((Holdings.HoldingDate)=#3/31/2013#)
AND ((Clients.Active)=True)
AND ((Clients.RiskCode) In (1,2))
AND ((Clients.NewClient)=True)
AND ((Clients.BaseCurrency)='GBP')
AND ((Clients.ClientID) Not In (10022,10082,10083)));
Here's an example of what I get as the result:
SumOfHoldingValue
1056071.96
466595.6
1074459.38
371142.54
814874.42
458203.65
8308697.09
254733.94
583796.33
443897.76
203787.11
1057445.84
1058751.26
317507.43
So there are quite a few criteria for the client table but the result is a list of SumOfHoldingValue when what I want is just one number. I.e. the sum of all the holding values. Why is it not grouping them all together to form one total?

Since you're not computing any aggregates on the values in the HAVING clause, I think you just want this:
SELECT Sum(Holdings.HoldingValue) AS SumOfHoldingValue
FROM (Clients INNER JOIN Accounts
ON Clients.ClientID = Accounts.ClientID)
INNER JOIN Holdings
ON Accounts.AccountID = Holdings.AccNum
WHERE (((Holdings.HoldingDate)=#3/31/2013#)
AND ((Clients.Active)=True)
AND ((Clients.RiskCode) In (1,2))
AND ((Clients.NewClient)=True)
AND ((Clients.BaseCurrency)='GBP')
AND ((Clients.ClientID) Not In (10022,10082,10083)));
With no GROUP BY clause, the aggregate treats the entire filtered set as a single group and the query returns a single row.

If you just want the total, remove the GROUP BY. With the GROUP BY clause, you get a separate total for every group.
If you need to filter the data, put the conditions in the WHERE clause instead.

Your query contains a group by clause which returns each group on its own line.
You are also using a having clause. The having clause is applied after the group by. Usually, it would contain aggregation functions -- such as having count(*) > 1. In your case, it is used as a where clause.
Try rewriting the query like this:
SELECT Sum(Holdings.HoldingValue) AS SumOfHoldingValue
FROM (Clients INNER JOIN Accounts
ON Clients.ClientID = Accounts.ClientID)
INNER JOIN Holdings
ON Accounts.AccountID = Holdings.AccNum
WHERE (((Holdings.HoldingDate)=#3/31/2013#)
AND ((Clients.Active)=True)
AND ((Clients.RiskCode) In (1,2))
AND ((Clients.NewClient)=True)
AND ((Clients.BaseCurrency)='GBP')
AND ((Clients.ClientID) Not In (10022,10082,10083)));
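For contrast, here is a sketch of a query where HAVING is the right tool, because the condition is on an aggregate. It is not part of the fix: the per-client grouping and the 500000 threshold are made up for illustration. Row-level filters stay in WHERE and are applied before grouping; HAVING then filters the groups by their summed value.

```sql
SELECT Clients.ClientID,
       Sum(Holdings.HoldingValue) AS SumOfHoldingValue
FROM (Clients INNER JOIN Accounts
      ON Clients.ClientID = Accounts.ClientID)
INNER JOIN Holdings
      ON Accounts.AccountID = Holdings.AccNum
WHERE Holdings.HoldingDate = #3/31/2013#
  AND Clients.BaseCurrency = 'GBP'
GROUP BY Clients.ClientID
HAVING Sum(Holdings.HoldingValue) > 500000;
```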

SQL Count from 2 tables

Background: There are multiple DBParentProductKeys associated with a POG.ID.
I need to count the number of pog.DBKey values that occur. Right now the result counts all POG.IDs in the whole database for the given Value4, but I want the count associated with each DBParentProductKey.
select distinct
Count(pog.DBKey) as Total,
pos.DBParentProductKey
from
ix_spc_planogram as pog with (nolock), ix_spc_position as pos with (nolock),
ix_spc_product as pro with (nolock)
where
pog.dbkey = pos.dbparentplanogramkey
and pog.Value4 = 358
group by
pog.DBKey, pos.DBParentProductKey

Take pog.DBKey out of the GROUP BY.
Also, I think you are missing a join condition. You have no join condition against table pro.
Finally, DISTINCT shouldn't be needed.
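Put together, the three changes might look like the sketch below. The pro table is left out entirely, since nothing is selected from it and the question does not say which columns would join it to the other tables; if it is needed, it should come back with an explicit ON condition.

```sql
SELECT pos.DBParentProductKey,
       COUNT(pog.DBKey) AS Total
FROM ix_spc_planogram AS pog WITH (NOLOCK)
INNER JOIN ix_spc_position AS pos WITH (NOLOCK)
    ON pog.DBKey = pos.DBParentPlanogramKey
WHERE pog.Value4 = 358
GROUP BY pos.DBParentProductKey;
```

This counts position rows per product. If a product can appear several times in the same planogram and each planogram should count once, COUNT(DISTINCT pog.DBKey) would be the variant to try.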
