I have a large but somewhat straightforward SQL query. Basically, users on my site develop reputations for different types of activities, such as writing reviews, leaving comments, and adding entries to our database. For the most part, these points are stored in the reputable_actions table, and I retrieve them by LEFT JOINing the reputable_actions table repeatedly. This feels sloppy, but it mostly works.
The problem I'm experiencing is with two of the reputations, "reviewer" and "community." Unlike the others, they aren't stored in the reputable_actions table. Instead, their values are derived from the votes table, which I access by first LEFT JOINing the comments table. For some reason, joining the comments table causes all my other reputations to increase exponentially. In one trial, the "archivist" reputation was supposed to be 25, but when I joined the comments, it ballooned to 10050.
I'm a novice with SQL and I've tried what I know (namely, applying GROUP BY clauses to users.id), but I haven't had any luck yet. Some guidance would be greatly appreciated.
SELECT users.*,
SUM(COALESCE(reviewers.value, 0)) as reviewer,
SUM(COALESCE(communities.value,0)) as community,
SUM(COALESCE(developers.value,0)) as developer,
SUM(COALESCE(moderators.value,0)) as moderator,
SUM(COALESCE(marketers.value,0)) as marketer,
SUM(COALESCE(archivists.value,0)) as archivist,
SUM(COALESCE(karmas.value,0)) as karma
FROM `users`
LEFT JOIN comments AS impressions
ON impressions.user_id = users.id
AND impressions.type = 'impression'
LEFT JOIN comments AS replies
ON replies.user_id = users.id
AND replies.type = 'reply'
LEFT JOIN votes AS reviewers
ON reviewers.voteable_type = 'impression'
AND reviewers.voteable_id = impressions.id
LEFT JOIN votes AS communities
ON communities.voteable_type = 'reply'
AND communities.voteable_id = replies.id
LEFT JOIN reputable_actions AS developers
ON developers.reputation_type = 'developer'
AND developers.user_id = users.id
LEFT JOIN reputable_actions AS moderators
ON moderators.reputation_type = 'moderator'
AND moderators.user_id = users.id
LEFT JOIN reputable_actions AS marketers
ON marketers.reputation_type = 'marketer'
AND marketers.user_id = users.id
LEFT JOIN reputable_actions AS archivists
ON archivists.reputation_type = 'archivist'
AND archivists.user_id = users.id
LEFT JOIN reputable_actions AS karmas
ON karmas.reputation_type = 'karma'
AND karmas.user_id = users.id
GROUP BY users.id
Basically, you need to do two separate GROUP BYs and combine the results. There's also a trick to avoid joining the same table multiple times, though it may not be faster if you have many other voteable types or reputation types.
Select
u.*,
Coalesce(r.developer, 0) as developer,
Coalesce(r.moderator, 0) as moderator,
Coalesce(r.marketer, 0) as marketer,
Coalesce(r.archivist, 0) as archivist,
Coalesce(r.karma, 0) as karma,
Coalesce(v.impressions, 0) as impressions,
Coalesce(v.replies, 0) as replies
From
users u
Left Outer Join (
Select
user_id,
Sum(Case When reputation_type = 'developer' Then value Else 0 End) as developer,
Sum(Case When reputation_type = 'moderator' Then value Else 0 End) as moderator,
Sum(Case When reputation_type = 'marketer' Then value Else 0 End) as marketer,
Sum(Case When reputation_type = 'archivist' Then value Else 0 End) as archivist,
Sum(Case When reputation_type = 'karma' Then value Else 0 End) as karma
From
reputable_actions
Group By
user_id
) r On u.id = r.user_id
Left Outer Join (
Select
c.user_id,
Sum(Case When c.type = 'impression' Then v.value Else 0 End) as impressions,
Sum(Case When c.type = 'reply' Then v.value Else 0 End) as replies
From
comments c
inner join -- maybe left outer?
votes v
on v.voteable_type = c.type And v.voteable_id = c.id
Group By
user_id
) v On u.id = v.user_id
Example (with no data). If your tables are structured differently to this, let me know.
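If it helps to see the Sum(Case When ...) trick in isolation, here is a minimal sketch run through Python's sqlite3 module. The rows are made up for illustration, and SQLite stands in for whatever database you are actually using, so treat it as a demonstration of the idea rather than of your schema.

```python
import sqlite3

# In-memory database with invented sample rows (assumption: the columns
# mirror the reputable_actions table described in the question).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reputable_actions (user_id INTEGER, reputation_type TEXT, value INTEGER);
    INSERT INTO reputable_actions VALUES
        (1, 'archivist', 10), (1, 'archivist', 15), (1, 'karma', 3),
        (2, 'developer', 7);
""")

# One pass over the table: each CASE picks out the rows for one reputation
# type, so there is no need to join the table once per type.
pivot = conn.execute("""
    SELECT user_id,
           SUM(CASE WHEN reputation_type = 'developer' THEN value ELSE 0 END) AS developer,
           SUM(CASE WHEN reputation_type = 'archivist' THEN value ELSE 0 END) AS archivist,
           SUM(CASE WHEN reputation_type = 'karma'     THEN value ELSE 0 END) AS karma
    FROM reputable_actions
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

print(pivot)  # [(1, 0, 25, 3), (2, 7, 0, 0)]
```

Each user comes back as a single row, which is what makes it safe to join this subquery to users without multiplying anything.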
There are multiple rows in comments and/or votes that match one row in the "rest" of the join. This "multiplies" the resulting rows and multiplies the results of other aggregate functions with it (as you already noted).
The simplest solution is to get the SUM(reviewers.value) and SUM(communities.value) in a separate query.
BTW, you'll experience the same problem if there is ever more than one reputable_actions row (of the same reputation_type) matching the same users row.
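A small reproduction of the multiplication, sketched with Python's sqlite3 module. The data is invented: one user with three voted-on comments and two archivist rows worth 25 in total. The second query aggregates each table on its own before joining, which is one way to do the "separate query" suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER, type TEXT);
    CREATE TABLE votes (voteable_type TEXT, voteable_id INTEGER, value INTEGER);
    CREATE TABLE reputable_actions (user_id INTEGER, reputation_type TEXT, value INTEGER);
    INSERT INTO users VALUES (1);
    INSERT INTO comments VALUES (1, 1, 'impression'), (2, 1, 'impression'), (3, 1, 'impression');
    INSERT INTO votes VALUES ('impression', 1, 1), ('impression', 2, 1), ('impression', 3, 1);
    INSERT INTO reputable_actions VALUES (1, 'archivist', 10), (1, 'archivist', 15);
""")

# Joining everything first: 3 comment/vote rows x 2 archivist rows = 6 rows,
# so every value is summed more than once.
naive = conn.execute("""
    SELECT SUM(COALESCE(a.value, 0)) AS archivist,
           SUM(COALESCE(v.value, 0)) AS reviewer
    FROM users u
    LEFT JOIN comments c ON c.user_id = u.id AND c.type = 'impression'
    LEFT JOIN votes v ON v.voteable_type = 'impression' AND v.voteable_id = c.id
    LEFT JOIN reputable_actions a ON a.user_id = u.id AND a.reputation_type = 'archivist'
    GROUP BY u.id
""").fetchone()

# Aggregating each source to one row per user before joining avoids the fan-out.
fixed = conn.execute("""
    SELECT COALESCE(a.archivist, 0), COALESCE(v.reviewer, 0)
    FROM users u
    LEFT JOIN (SELECT user_id, SUM(value) AS archivist
               FROM reputable_actions
               WHERE reputation_type = 'archivist'
               GROUP BY user_id) a ON a.user_id = u.id
    LEFT JOIN (SELECT c.user_id, SUM(v.value) AS reviewer
               FROM comments c
               JOIN votes v ON v.voteable_type = c.type AND v.voteable_id = c.id
               WHERE c.type = 'impression'
               GROUP BY c.user_id) v ON v.user_id = u.id
""").fetchone()

print(naive)  # (75, 6)
print(fixed)  # (25, 3)
```

The archivist total is tripled (once per comment row) and the vote total is doubled (once per archivist row), which is the same effect as 25 turning into 10050 on real data.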
This is a query using a UNION, with role and permission tables and some other intermediate tables. I think it has some repetition. How can I optimize it, or should I write it another way?
SELECT DISTINCT
menu.perm_key
FROM
user_role
LEFT JOIN role ON user_role.role_id = role.id
LEFT JOIN role_menu ON user_role.role_id = role_menu.role_id
LEFT JOIN menu ON menu.id = role_menu.menu_id
WHERE
user_role.user_id = 1
AND role.`status` = 1
AND menu.`status` = 1
UNION
SELECT DISTINCT
role.role_key
FROM
user_role
LEFT JOIN role ON user_role.role_id = role.id
WHERE
user_role.user_id = 1
AND role.`status` = 1
Below is the result of using SELECT DISTINCT role_key, perm_key from ...
role_key | perm_key
user | create_order
user | update_info
I expect it to become
key_info
user
create_order
update_info
UPDATE:
Based on the information in the comments, I'm updating my response.
To return the two columns as one column, the only other way I see is to use some sort of function. I'm going with IF here, but COALESCE could work as well.
SELECT DISTINCT
IF(menu.perm_key IS NOT NULL, menu.perm_key, role.role_key) as key_info -- KEY is a reserved word in MySQL, so use another alias
FROM
user_role
LEFT JOIN role ON user_role.role_id = role.id
LEFT JOIN role_menu ON user_role.role_id = role_menu.role_id
LEFT JOIN menu ON menu.id = role_menu.menu_id
WHERE
user_role.user_id = 1
AND role.`status` = 1
AND menu.`status` = 1
Original Answer
Without more information, the only thing I see is that the UNION appears to be unnecessary.
SELECT DISTINCT
menu.perm_key, role.role_key
FROM
user_role
LEFT JOIN role ON user_role.role_id = role.id
LEFT JOIN role_menu ON user_role.role_id = role_menu.role_id
LEFT JOIN menu ON menu.id = role_menu.menu_id
WHERE
user_role.user_id = 1
AND role.`status` = 1
AND menu.`status` = 1
Unless you expected these values as separate rows, this should work fine. If you are expecting separate rows, I would question why. To me, each row should represent "the same thing" as any other row, but with different values. By using separate rows, one row represents the "perm key" while the next represents the "role key." Even if these point to the same object type or enum, they have different uses and I would therefore consider them as "different things" and not appropriate to return on separate rows. This would be especially true and awkward to work with if your query returns more than 2 rows.
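To make the difference between the two result shapes concrete, here is a rough sketch using Python's sqlite3 module with invented rows (one active role, two active menus). It uses plain JOINs, since the status filters in the WHERE clause already discard the unmatched rows a LEFT JOIN would keep. The alias key_info is taken from the expected output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_role (user_id INTEGER, role_id INTEGER);
    CREATE TABLE role (id INTEGER PRIMARY KEY, role_key TEXT, status INTEGER);
    CREATE TABLE role_menu (role_id INTEGER, menu_id INTEGER);
    CREATE TABLE menu (id INTEGER PRIMARY KEY, perm_key TEXT, status INTEGER);
    INSERT INTO user_role VALUES (1, 1);
    INSERT INTO role VALUES (1, 'user', 1);
    INSERT INTO role_menu VALUES (1, 1), (1, 2);
    INSERT INTO menu VALUES (1, 'create_order', 1), (2, 'update_info', 1);
""")

# Single query without UNION: the role key repeats next to each permission key.
two_columns = conn.execute("""
    SELECT DISTINCT role.role_key, menu.perm_key
    FROM user_role
    JOIN role ON user_role.role_id = role.id
    JOIN role_menu ON user_role.role_id = role_menu.role_id
    JOIN menu ON menu.id = role_menu.menu_id
    WHERE user_role.user_id = 1 AND role.status = 1 AND menu.status = 1
    ORDER BY menu.perm_key
""").fetchall()

# UNION stacks both kinds of key into one column and removes duplicates.
one_column = conn.execute("""
    SELECT menu.perm_key AS key_info
    FROM user_role
    JOIN role ON user_role.role_id = role.id
    JOIN role_menu ON user_role.role_id = role_menu.role_id
    JOIN menu ON menu.id = role_menu.menu_id
    WHERE user_role.user_id = 1 AND role.status = 1 AND menu.status = 1
    UNION
    SELECT role.role_key
    FROM user_role
    JOIN role ON user_role.role_id = role.id
    WHERE user_role.user_id = 1 AND role.status = 1
    ORDER BY 1
""").fetchall()

print(two_columns)  # [('user', 'create_order'), ('user', 'update_info')]
print(one_column)   # [('create_order',), ('update_info',), ('user',)]
```

UNION already removes duplicate rows, so the DISTINCT inside each branch of the original query is not needed.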
I have a query I am trying to use to give my users a list of only clients that currently have open accounts. The issue I am having is that it is bringing in ALL clients even if they do not have any current accounts. I feel like I am missing something simple here but so far have not been able to get this to work as intended.
SELECT Client.Client
FROM dbo.Client
LEFT JOIN dbo.ClientLink On Client.Client = ClientLink.Client
LEFT JOIN dbo.ClientLink.ExternalId = AccountLink.ExternalId
LEFT JOIN dbo.AccountLink.Account = Account.Account
GROUP By Client.Client
HAVING SUM(CASE WHEN Account.Closed IS NULL THEN 1 ELSE 0 END) > 0
My guess is that if clients have no accounts, then the links between the tables may be removed. If so, the solution is to simply use INNER JOIN.
Your syntax is also all messed up. But the query you want might be:
SELECT c.Client
FROM dbo.Client c JOIN
dbo.ClientLink cl
ON c.Client = cl.Client JOIN
dbo.AccountLink al
ON cl.ExternalId = al.ExternalId JOIN
dbo.Account a
ON a.Account = al.Account
GROUP By c.Client
HAVING SUM(CASE WHEN a.Closed IS NULL THEN 1 ELSE 0 END) > 0
My advice would be to rethink your query and how you are storing your data. This is fundamental: get it wrong and querying the database becomes a nightmare, and you get unexpected results. For example, does Account.Closed need to be NULL? If this were me, I would create a default constraint that sets new accounts to 0 (meaning open), then write a query that sets Closed to 1 for the accounts that need to be closed.
Again, without really knowing how your schema looks, providing an accurate answer is difficult.
I would then rewrite the query as:
SELECT Client.Client
FROM dbo.Client
LEFT JOIN dbo.ClientLink ON Client.Client = ClientLink.Client
LEFT JOIN dbo.AccountLink ON ClientLink.ExternalId = AccountLink.ExternalId
LEFT JOIN dbo.Account ON AccountLink.Account = Account.Account
WHERE Account.Closed = 0 --only return clients with open accounts
If this is not possible, then I would rewrite your query as follows:
SELECT Client.Client, COUNT(Client.Client) AS NumberOfOpenAccounts
FROM dbo.Client
LEFT JOIN dbo.ClientLink ON Client.Client = ClientLink.Client
LEFT JOIN dbo.AccountLink ON ClientLink.ExternalId = AccountLink.ExternalId
LEFT JOIN dbo.Account ON AccountLink.Account = Account.Account
WHERE ISNULL(Account.Closed, 1) = 0 --only return clients with open accounts
GROUP BY Client.Client
If all you are looking for is the client name, you can use DISTINCT instead of grouping. Something like the below statement might do the trick:
SELECT Distinct Client.Client
FROM Client
INNER JOIN ClientLink On Client.Client = ClientLink.Client
INNER JOIN AccountLink On ClientLink.ExternalId = AccountLink.ExternalId
INNER JOIN Account on AccountLink.Account = Account.Account
WHERE Account.Closed IS NULL
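For what it's worth, here is a small sketch of why the LEFT JOIN version lets clients without accounts through, run with Python's sqlite3 module on made-up rows. Client A has two open accounts, B has one closed account, and C has no links at all. The table and column names follow the question, while the data and the date value are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Client (Client TEXT PRIMARY KEY);
    CREATE TABLE ClientLink (Client TEXT, ExternalId INTEGER);
    CREATE TABLE AccountLink (ExternalId INTEGER, Account INTEGER);
    CREATE TABLE Account (Account INTEGER PRIMARY KEY, Closed TEXT);
    INSERT INTO Client VALUES ('A'), ('B'), ('C');
    INSERT INTO ClientLink VALUES ('A', 1), ('A', 2), ('B', 3);
    INSERT INTO AccountLink VALUES (1, 100), (2, 101), (3, 102);
    INSERT INTO Account VALUES (100, NULL), (101, NULL), (102, '2018-01-01');
""")

# LEFT JOIN + HAVING: client C has no account row, so a.Closed is NULL for it
# and the CASE counts that padded row as if it were an open account.
left_join = conn.execute("""
    SELECT c.Client
    FROM Client c
    LEFT JOIN ClientLink cl ON c.Client = cl.Client
    LEFT JOIN AccountLink al ON cl.ExternalId = al.ExternalId
    LEFT JOIN Account a ON al.Account = a.Account
    GROUP BY c.Client
    HAVING SUM(CASE WHEN a.Closed IS NULL THEN 1 ELSE 0 END) > 0
    ORDER BY c.Client
""").fetchall()

# INNER JOIN + DISTINCT: only clients with a real, open account survive.
inner_join = conn.execute("""
    SELECT DISTINCT c.Client
    FROM Client c
    JOIN ClientLink cl ON c.Client = cl.Client
    JOIN AccountLink al ON cl.ExternalId = al.ExternalId
    JOIN Account a ON al.Account = a.Account
    WHERE a.Closed IS NULL
    ORDER BY c.Client
""").fetchall()

print(left_join)   # [('A',), ('C',)]
print(inner_join)  # [('A',)]
```

With the LEFT JOINs, "no account at all" and "account that is still open" both show up as a NULL in Closed, which is why the inner joins make the difference.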
Hope this helps.
I have a query that consists of 1 table and 2 subqueries. The table is a listing of all customers, one subquery is a listing of all the quotes given to customers over a period of time, and the other subquery is a listing of all the orders booked for a customer over the same period. What I am trying to do is return a result set that is a customer, the number of quotes given, and the number of orders booked over a given period of time. However, what I am returning is only a listing of customers over the period of time that have an equivalent quote and order count. I feel like I am missing something obvious within the context of the query but I am unable to figure it out. Any help would be appreciated. Thank you.
Result Set should look like this
Customer | Quotes | Orders Placed
aaa | 4 | 4
bbb | 9 | 18
ccc | 18 | 9
select
[Customer2].[Name] as [Customer2_Name],
(count( Quotes.UD03_Key3 )) as [Calculated_CustomerQuotes],
(count( Customer_Bookings.OrderHed_OrderNum )) as [Calculated_CustomerBookings]
from Erp.Customer as Customer2
left join (select
[UD03].[Key3] as [UD03_Key3],
[UD03].[Key4] as [UD03_Key4],
[UD03].[Key1] as [UD03_Key1],
[UD03].[Date02] as [UD03_Date02]
from Ice.UD03 as UD03
inner join Ice.UD02 as UD02 on
UD03.Company = UD02.Company
And
CAST(CAST(UD03.Number09 AS INT) AS VARCHAR(30)) = UD02.Key1
left outer join Erp.Customer as Customer on
UD03.Company = Customer.Company
And
UD03.Key1 = Customer.Name
left outer join Erp.SalesTer as SalesTer on
Customer.Company = SalesTer.Company
And
Customer.TerritoryID = SalesTer.TerritoryID
left outer join Erp.CustGrup as CustGrup on
Customer.Company = CustGrup.Company
And
Customer.GroupCode = CustGrup.GroupCode
where (UD03.Key3 <> '0')) as Quotes on
Customer2.Name = Quotes.UD03_Key1
left join (select
[Customer1].[Name] as [Customer1_Name],
[OrderHed].[OrderNum] as [OrderHed_OrderNum],
[OrderDtl].[OrderLine] as [OrderDtl_OrderLine],
[OrderHed].[OrderDate] as [OrderHed_OrderDate]
from Erp.OrderHed as OrderHed
inner join Erp.Customer as Customer1 on
OrderHed.Company = Customer1.Company
And
OrderHed.BTCustNum = Customer1.CustNum
inner join Erp.OrderDtl as OrderDtl on
OrderHed.Company = OrderDtl.Company
And
OrderHed.OrderNum = OrderDtl.OrderNum) as Customer_Bookings on
Customer2.Name = Customer_Bookings.Customer1_Name
where Quotes.UD03_Date02 >= '5/15/2018' and Quotes.UD03_Date02 <= '5/15/2018' and Customer_Bookings.OrderHed_OrderDate >='5/15/2018' and Customer_Bookings.OrderHed_OrderDate <= '5/15/2018'
group by [Customer2].[Name]
You have several problems going on here. The first problem is that your code is so poorly formatted it is user-hostile to look at. Then you have left joins being logically treated as inner joins because of the where clause. You also have date literal strings in a language-specific format; these should always be in the ANSI format YYYYMMDD. But in your case your two predicates nearly contradict each other: you require UD03_Date02 to be both greater than and less than the same date. Thankfully you have the = in each comparison, so values exactly equal to that date still match. But if your column is a datetime, only values at exactly midnight match, so you have in effect prevented any rows from being returned a second time (the first being your where clause). You have this same incorrect date logic and join problem in the second subquery as well.
Here is what your query might look like with some formatting so you can see what is going on. Please note I fixed the logical join issue. You still have the date problems because I don't know what you are trying to accomplish there.
select
[Customer2].[Name] as [Customer2_Name],
count(Quotes.UD03_Key3) as [Calculated_CustomerQuotes],
count(Customer_Bookings.OrderHed_OrderNum) as [Calculated_CustomerBookings]
from Erp.Customer as Customer2
left join
(
select
[UD03].[Key3] as [UD03_Key3],
[UD03].[Key4] as [UD03_Key4],
[UD03].[Key1] as [UD03_Key1],
[UD03].[Date02] as [UD03_Date02]
from Ice.UD03 as UD03
inner join Ice.UD02 as UD02 on UD03.Company = UD02.Company
And CAST(CAST(UD03.Number09 AS INT) AS VARCHAR(30)) = UD02.Key1
left outer join Erp.Customer as Customer on UD03.Company = Customer.Company
And UD03.Key1 = Customer.Name
left outer join Erp.SalesTer as SalesTer on Customer.Company = SalesTer.Company
And Customer.TerritoryID = SalesTer.TerritoryID
left outer join Erp.CustGrup as CustGrup on Customer.Company = CustGrup.Company
And Customer.GroupCode = CustGrup.GroupCode
where UD03.Key3 <> '0'
) as Quotes on Customer2.Name = Quotes.UD03_Key1
and Quotes.UD03_Date02 >= '20180515'
and Quotes.UD03_Date02 <= '20180515'
left join
(
select
[Customer1].[Name] as [Customer1_Name],
[OrderHed].[OrderNum] as [OrderHed_OrderNum],
[OrderDtl].[OrderLine] as [OrderDtl_OrderLine],
[OrderHed].[OrderDate] as [OrderHed_OrderDate]
from Erp.OrderHed as OrderHed
inner join Erp.Customer as Customer1 on OrderHed.Company = Customer1.Company
And OrderHed.BTCustNum = Customer1.CustNum
inner join Erp.OrderDtl as OrderDtl on OrderHed.Company = OrderDtl.Company
And OrderHed.OrderNum = OrderDtl.OrderNum
) as Customer_Bookings on Customer2.Name = Customer_Bookings.Customer1_Name
and Customer_Bookings.OrderHed_OrderDate >= '20180515'
and Customer_Bookings.OrderHed_OrderDate <= '20180515'
group by [Customer2].[Name]
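The logical join issue is easier to see on a toy example. This is only a sketch using Python's sqlite3 module with two invented customers and one quote; the table and column names are simplified stand-ins, and the dates are stored as ISO text because SQLite has no date type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT PRIMARY KEY);
    CREATE TABLE quotes (id INTEGER PRIMARY KEY, customer TEXT, quote_date TEXT);
    INSERT INTO customers VALUES ('aaa'), ('bbb');
    INSERT INTO quotes VALUES (1, 'aaa', '2018-05-15');
""")

# Date filter in WHERE: for customer bbb the quote columns are NULL after the
# left join, the comparison fails, and the row is dropped like an inner join.
where_rows = conn.execute("""
    SELECT c.name, COUNT(q.id)
    FROM customers c
    LEFT JOIN quotes q ON q.customer = c.name
    WHERE q.quote_date >= '2018-05-15' AND q.quote_date <= '2018-05-15'
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# Date filter in ON: unmatched customers are kept and simply count zero quotes.
on_rows = conn.execute("""
    SELECT c.name, COUNT(q.id)
    FROM customers c
    LEFT JOIN quotes q ON q.customer = c.name
        AND q.quote_date >= '2018-05-15' AND q.quote_date <= '2018-05-15'
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(where_rows)  # [('aaa', 1)]
print(on_rows)     # [('aaa', 1), ('bbb', 0)]
```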
COUNT() will just give you the number of records, so you'd expect these two result columns to be equal. Try structuring it like this:
SUM(CASE WHEN Quotes.UD03_Key1 IS NOT NULL THEN 1 ELSE 0 END) AS QuoteCount,
SUM(CASE WHEN Customer_Bookings.Customer1_Name IS NOT NULL THEN 1 ELSE 0 END) AS custBookingCount
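Here is a small experiment along these lines, sketched with Python's sqlite3 module. The data is invented (one customer with 2 quotes and 3 orders) and the table names are simplified stand-ins for the subqueries in the question. On this toy data the two plain counts come out equal because every quote row is paired with every order row, and COUNT(DISTINCT ...) on a column that identifies each quote or order is one possible way to get the separate numbers back. Whether UD03_Key3 and OrderNum are unique enough for that in the real schema is an assumption you would need to check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT PRIMARY KEY);
    CREATE TABLE quotes (quote_id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE orders (order_num INTEGER PRIMARY KEY, customer TEXT);
    INSERT INTO customers VALUES ('bbb');
    INSERT INTO quotes VALUES (1, 'bbb'), (2, 'bbb');
    INSERT INTO orders VALUES (10, 'bbb'), (11, 'bbb'), (12, 'bbb');
""")

# 2 quotes x 3 orders = 6 joined rows for this customer.
row = conn.execute("""
    SELECT COUNT(q.quote_id)                                        AS quote_rows,
           COUNT(o.order_num)                                       AS order_rows,
           SUM(CASE WHEN q.quote_id IS NOT NULL THEN 1 ELSE 0 END)  AS quote_sum_case,
           COUNT(DISTINCT q.quote_id)                               AS quotes,
           COUNT(DISTINCT o.order_num)                              AS orders
    FROM customers c
    LEFT JOIN quotes q ON q.customer = c.name
    LEFT JOIN orders o ON o.customer = c.name
    GROUP BY c.name
""").fetchone()

print(row)  # (6, 6, 6, 2, 3)
```

Note that COUNT(column) already skips NULLs, so the SUM(CASE ...) form returns the same 6 here; it is the DISTINCT (or aggregating each subquery separately before joining) that separates the two counts.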
I have the following SQL query, and need to figure out where the "signatures" data is actually being read from. It's not from the 'claims' table, and doesn't seem to be from the 'questionnaire_answers' table. I believe it will be a boolean value, if that helps at all.
I'm reasonably proficient at SQL, but the joins have left me a bit confused.
(There's some PHP, but it's not relevant to the question).
$SQL="SELECT surveyor, COUNT(signed_total) AS 'total', SUM(signed_total) AS 'signed_total' FROM (
SELECT DISTINCT claims.claim_id, CONCAT(surveyors.user_first_name, CONCAT(' ', surveyors.user_surname)) AS 'surveyor', CASE WHEN signatures.claim_id IS NOT NULL THEN 1 ELSE 0 END AS 'signed_total' FROM claims
INNER JOIN users surveyors ON claims.surveyor_id = surveyors.user_id
LEFT OUTER JOIN signatures ON claims.claim_id = signatures.claim_id
INNER JOIN questionnaire_answers ON questionnaire_answers.claim_id = claims.claim_id
WHERE (claims.claim_type <> ".$conn->qstr(TYPE_DESKTOP).")
AND (claims.claim_type <> ".$conn->qstr(TYPE_AUDIT).")
AND (claims.claim_cancelled_id <= 0)
AND (claims.date_completed BETWEEN '".UK2USDate($start_date)." 00:00:00' AND '".UK2USDate($end_date)." 23:59:59')
) AS tmp
GROUP BY surveyor
ORDER BY surveyor ASC
";
Thank you!
signatures is a table (see LEFT OUTER JOIN signatures in your query).
As written in the FROM clause:
FROM claims
INNER JOIN users surveyors ON claims.surveyor_id = surveyors.user_id
LEFT OUTER JOIN signatures ON claims.claim_id = signatures.claim_id
The LEFT keyword means that the rows of the left table are preserved, so all rows from the claims table are considered, and NULL marks are added as placeholders for the attributes from the non-preserved side of the join, which is the signatures table here.
So CASE WHEN signatures.claim_id IS NOT NULL THEN 1 ELSE 0 END AS 'signed_total' checks whether a matching row exists in signatures for the claim_id: if it does, the signed_total column is 1, otherwise 0.
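A stripped-down illustration of that flag, using Python's sqlite3 module with two invented claims, only one of which has a row in signatures. The other joins and filters from the original query are left out, so this is just a sketch of the LEFT JOIN part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claims (claim_id INTEGER PRIMARY KEY);
    CREATE TABLE signatures (claim_id INTEGER);
    INSERT INTO claims VALUES (1), (2);
    INSERT INTO signatures VALUES (1);
""")

# Claim 2 has no signature row, so signatures.claim_id is NULL and the flag is 0.
flags = conn.execute("""
    SELECT claims.claim_id,
           CASE WHEN signatures.claim_id IS NOT NULL THEN 1 ELSE 0 END AS signed_total
    FROM claims
    LEFT OUTER JOIN signatures ON claims.claim_id = signatures.claim_id
    ORDER BY claims.claim_id
""").fetchall()

# The outer query then counts all claims and sums the flags.
totals = conn.execute("""
    SELECT COUNT(signed_total) AS total, SUM(signed_total) AS signed_total
    FROM (
        SELECT DISTINCT claims.claim_id,
               CASE WHEN signatures.claim_id IS NOT NULL THEN 1 ELSE 0 END AS signed_total
        FROM claims
        LEFT OUTER JOIN signatures ON claims.claim_id = signatures.claim_id
    ) AS tmp
""").fetchone()

print(flags)   # [(1, 1), (2, 0)]
print(totals)  # (2, 1)
```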
Hope that helps!!
I have two tables, one for users, and one for renewals. I want to select all users who have a row in the renewals table for a specific year, and I can do this fine. However, if there is more than one row for a specific user for the specific year in the renewals table, I get duplicates, which I don't want.
I assume it's because I still don't quite understand JOINS, so here is my query:
SELECT * FROM `users` AS US
RIGHT JOIN `usermeta` UM1
ON UM1.`user_id` = US.`ID`
RIGHT JOIN `membership_renewals` MR
ON MR.`user` = US.`ID` AND MR.year = '2011'
WHERE
UM1.meta_key = 'member'
AND UM1.meta_value = 1
AND US.`user_pass` NOT LIKE '-%'
You can do it with JOINS, but I like to do this kind of thing with EXISTS and subqueries, because it reads more like the rule I am trying to enforce.
SELECT * FROM `users` AS US
RIGHT JOIN `usermeta` UM1
ON UM1.`user_id` = US.`ID`
WHERE
Exists (Select 1 FROM `membership_renewals` MR
WHERE MR.`user` = US.`ID`
AND MR.year = '2011')
AND UM1.meta_key = 'member'
AND UM1.meta_value = 1
AND US.`user_pass` NOT LIKE '-%'
P.S. I really don't like using RIGHT JOIN unless I have to. If you can, just use INNER JOIN. If not, rearrange the FROM so you can use LEFT JOIN. Again, it is just for readability, but I don't know that I have ever used RIGHT JOIN.
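To show why EXISTS avoids the duplicates, here is a minimal sketch with Python's sqlite3 module. The rows are made up: user 1 has two renewal rows for 2011 and user 2 has one for 2010. The usermeta table and the password filter are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (ID INTEGER PRIMARY KEY);
    CREATE TABLE membership_renewals (user INTEGER, year INTEGER);
    INSERT INTO users VALUES (1), (2);
    INSERT INTO membership_renewals VALUES (1, 2011), (1, 2011), (2, 2010);
""")

# A join returns one row per matching renewal, so user 1 appears twice.
joined = conn.execute("""
    SELECT US.ID
    FROM users AS US
    JOIN membership_renewals MR ON MR.user = US.ID AND MR.year = 2011
    ORDER BY US.ID
""").fetchall()

# EXISTS only asks whether at least one renewal exists, so each user appears once.
with_exists = conn.execute("""
    SELECT US.ID
    FROM users AS US
    WHERE EXISTS (SELECT 1 FROM membership_renewals MR
                  WHERE MR.user = US.ID AND MR.year = 2011)
    ORDER BY US.ID
""").fetchall()

print(joined)       # [(1,), (1,)]
print(with_exists)  # [(1,)]
```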
SELECT *
FROM `users` AS US
RIGHT JOIN `usermeta` UM1
ON UM1.`user_id` = US.`ID`
RIGHT JOIN (select distinct mr.user from `membership_renewals` MR where MR.year = '2011') vt on vt.user = us.id
WHERE UM1.meta_key = 'member'
AND UM1.meta_value = 1
AND US.`user_pass` NOT LIKE '-%'
This will do it.