Comparing two sum functions in a where clause - sql

I want to check that the number of likes a user received on all their personal pictures is at least twice the number of likes received on the group pictures in which they are tagged.
If a user is not tagged in any group photo but is tagged in a personal picture that has received at least one like, that user should also be returned.
My question is:
How can I compare two sum functions,
where one sum is computed in a nested query and compared with a sum in the outer query?
Can I store the sum in an auxiliary variable and compare against that?
Select distinct UIP.userID
From tblUserInPersonalPic UIP
where sum(UIP.numOfLikes) over (Partition by UIP.userID)*0.5 >
(Select distinct U.userID, sum(P.numOfLikes) over (Partition by U.userID)
From tblgroupPictures P left outer join
tblUserInGroupPic U On P.picNum=U.picNum
group by U.userID,P.numOfLikes,P.picNum)

It's kinda hard to know for sure, and of course I can't test my answer,
but I think you can do it with a couple of left joins, group by and having:
SELECT Personal.UserId
FROM tblUserInPersonalPic Personal
LEFT JOIN tblUserInGroupPic UserInGroup ON Personal.userID = UserInGroup.userID
LEFT JOIN tblgroupPictures GroupPictures ON UserInGroup.picNum = GroupPictures.picNum
GROUP BY Personal.userID
HAVING SUM(GroupPictures.numOfLikes) * 2 < SUM(Personal.numOfLikes)
Please note: When posting SQL questions it's always best to provide sample data as DDL + DML (create table + insert into statements) and the desired results, so that whoever answers you can test the answer before posting it.
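One thing to watch with a join-then-aggregate query is that each personal-picture row is repeated once per matching group-picture row, which can inflate the personal sum, and that a user with no group pictures ends up with a NULL group sum. A minimal sketch of one way around both, assuming the column layout implied by the question: sum each side separately, then compare, treating a missing group total as 0. The sample rows and the use of Python's sqlite3 module are illustrative assumptions, not part of the original post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblUserInPersonalPic (userID INTEGER, picNum INTEGER, numOfLikes INTEGER);
CREATE TABLE tblUserInGroupPic   (userID INTEGER, picNum INTEGER);
CREATE TABLE tblgroupPictures    (picNum INTEGER, numOfLikes INTEGER);

-- Invented sample data:
-- user 1: 10 personal likes, 4 group likes  -> qualifies (10 >= 2 * 4)
-- user 2: 3 personal likes, 5 group likes   -> does not qualify
-- user 3: 1 personal like, no group photos  -> qualifies
INSERT INTO tblUserInPersonalPic VALUES (1, 100, 6), (1, 101, 4), (2, 102, 3), (3, 103, 1);
INSERT INTO tblgroupPictures VALUES (200, 1), (201, 3), (202, 5);
INSERT INTO tblUserInGroupPic VALUES (1, 200), (1, 201), (2, 202);
""")

query = """
WITH personal AS (          -- one row per user: total personal likes
    SELECT userID, SUM(numOfLikes) AS personalLikes
    FROM tblUserInPersonalPic
    GROUP BY userID
),
grp AS (                    -- one row per user: total group-picture likes
    SELECT U.userID, SUM(P.numOfLikes) AS groupLikes
    FROM tblUserInGroupPic U
    JOIN tblgroupPictures P ON P.picNum = U.picNum
    GROUP BY U.userID
)
SELECT personal.userID
FROM personal
LEFT JOIN grp ON grp.userID = personal.userID
WHERE personal.personalLikes > 0
  AND personal.personalLikes >= 2 * COALESCE(grp.groupLikes, 0)
ORDER BY personal.userID
"""
qualifying_users = [row[0] for row in conn.execute(query)]
print(qualifying_users)  # [1, 3]
```

Because each side is reduced to one row per user before the join, neither sum can be multiplied by rows from the other table.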

Try using two CTEs (pseudo code below). Also note that the distinct in your second query will not work as intended, since you are returning two columns, so I changed it below so that you can get that column as well.
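;with tbl1 as (
select userid, sum(col1) as sum1
from table1
group by userid
),
tbl2 as (
select userid, sum(Anothersumcol) as sum2
from table2
group by userid
)
select t1.userid, t1.sum1, t2.sum2
from tbl1 t1
join tbl2 t2
on t1.userid = t2.userid
where t1.sum1 > t2.sum2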

You can't use window functions in a where clause. Define it in a subquery:
select *
from (
select sum(...) over (...) as Sum1
, OtherColumn
from YourTable
) sub
where Sum1 < (...your subquery...)
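A minimal runnable sketch of that pattern, assuming a hypothetical table `likes(userID, numOfLikes)` and an arbitrary threshold in place of the second subquery. It is run through Python's sqlite3 purely for illustration; window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE likes (userID INTEGER, numOfLikes INTEGER);  -- hypothetical table
INSERT INTO likes VALUES (1, 6), (1, 4), (2, 3);
""")

# The window function is computed in the derived table; the outer WHERE
# can then filter on its alias like an ordinary column.
query = """
SELECT DISTINCT userID, Sum1
FROM (
    SELECT userID,
           SUM(numOfLikes) OVER (PARTITION BY userID) AS Sum1
    FROM likes
) sub
WHERE Sum1 > 5
ORDER BY userID
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 10)]
```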


SQL join count and select query

I have two tables: one is a list of 'gangs' and one is a list of 'gang_members'. The gang_members.gang_id column refers to the gang they are in. I know how to count all the members in one gang, but I need to join the following queries into one:
SELECT count(gang_id) FROM gangs_members WHERE gang_id = <GANG ID>
I think this is possible. I could do it in a loop while going through the gangs, but that would be inefficient.
SELECT A.*, B.RC
FROM gangs A
LEFT JOIN (SELECT gang_id, COUNT(*) AS RC FROM gangs_members GROUP BY gang_id) B ON A.gang_id=B.gang_id
Probably something like this
SELECT count(gang_id)
FROM gangs_members
WHERE gang_id IN (SELECT gang_id FROM gangs LIMIT 8)
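To make the LEFT JOIN against a grouped subquery concrete, here is a small sketch run through Python's sqlite3. The `name` and `member_id` columns and the sample rows are assumptions for illustration; the COALESCE is an optional addition so that gangs without members show 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gangs (gang_id INTEGER PRIMARY KEY, name TEXT);   -- name is an assumed column
CREATE TABLE gangs_members (member_id INTEGER PRIMARY KEY, gang_id INTEGER);
INSERT INTO gangs VALUES (1, 'Alpha'), (2, 'Bravo'), (3, 'Empty');
INSERT INTO gangs_members (gang_id) VALUES (1), (1), (1), (2);
""")

# One pass over gangs_members produces every per-gang count;
# the LEFT JOIN keeps gangs that have no members at all.
query = """
SELECT A.gang_id, A.name, COALESCE(B.RC, 0) AS member_count
FROM gangs A
LEFT JOIN (SELECT gang_id, COUNT(*) AS RC
           FROM gangs_members
           GROUP BY gang_id) B ON A.gang_id = B.gang_id
ORDER BY A.gang_id
"""
counts = conn.execute(query).fetchall()
print(counts)  # [(1, 'Alpha', 3), (2, 'Bravo', 1), (3, 'Empty', 0)]
```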

Aggregate query with subquery (SUM)

I have the following query:
SELECT UserId, (
0.099 *
(SELECT AcceleratedProfitPercentage FROM CustomGroups cg
INNER JOIN UserCustomGroups ucg ON ucg.CustomGroupId = cg.Id
WHERE Packs.UserCustomGroupId = ucg.Id)
((SELECT AcceleratedProfitPercentage FROM CustomGroups cg
INNER JOIN UserCustomGroups ucg ON ucg.CustomGroupId = cg.Id
WHERE Packs.UserCustomGroupId = ucg.Id)*1.0) / (100*1.0)
As amount
SELECT ap.Id FROM Packs ap JOIN Users u ON ap.UserId = u.UserId
WHERE ap.MoneyToReturn > ap.MoneyReturned AND
u.Mass LIKE '1%');
which is producing correct output. However, I have no idea how to aggregate it properly. I tried to use a standard GROUP BY but I get the error (Column 'Packs.UserCustomGroupId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause). Any ideas? Here is the output I currently get:
I want to aggregate it by UserId. Thanks in advance.
The option that involves the least query-rewriting is to drop your existing query into a CTE or temp table, like so:
; with CTE as (MyQueryHere)
Select UserID, sum(amount)
from CTE
Group by UserID
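A small sketch of the wrap-in-a-CTE idea, run through Python's sqlite3. The `Packs` table here is a simplified stand-in with an invented `Rate` column and made-up values; in practice the body of the CTE would be the existing per-row query, unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-in: Rate replaces the looked-up percentage.
CREATE TABLE Packs (Id INTEGER PRIMARY KEY, UserId INTEGER, Rate REAL);
INSERT INTO Packs VALUES (1, 10, 100), (2, 10, 200), (3, 20, 50);
""")

query = """
WITH CTE AS (
    -- the existing per-row query goes here unchanged
    SELECT UserId, 0.5 * Rate AS amount
    FROM Packs
)
SELECT UserId, SUM(amount) AS total
FROM CTE
GROUP BY UserId
ORDER BY UserId
"""
totals = conn.execute(query).fetchall()
print(totals)  # [(10, 150.0), (20, 25.0)]
```

The outer query only sees the columns the CTE exposes, so the GROUP BY error about columns used inside the inner subqueries does not arise.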
Wow that is one crazy query you've got going on there.
Try this:
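SELECT P.UserID,
0.099 * SUM(t.Amount) AS [Amount SUM]
FROM Packs P
JOIN Users U
ON P.UserID = U.UserID
LEFT JOIN UserCustomGroups UCG
ON P.UserCustomGroupID = UCG.ID
LEFT JOIN CustomGroups CG
ON UCG.CustomGroupID = CG.ID
CROSS APPLY (
SELECT CASE WHEN CG.AcceleratedProfitPercentage IS NULL THEN ...
ELSE CG.AcceleratedProfitPercentage / 100
END AS [Amount]
) t
WHERE P.MoneyToReturn > P.MoneyReturned
AND U.Mass LIKE '1%'
GROUP BY P.UserID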
First, multiplying any number by 1 is pretty pointless, yet I see it twice in your original post. I'm not sure what led to that, but it's unnecessary.
Also, using CROSS APPLY will eliminate the need for you to repeat your subquery. Granted, it's slower (since it'll run on every row returned), but I think it makes sense in this case...Using left outer joins instead of CASE - SELECT - IS NULL makes your query much more efficient and much more readable.
Next, it appears that you are attempting to SUM percentages. Not sure what kind of data you're looking to return, but perhaps AVG would be more appropriate? I can't think of any practical reason why you would be looking to do that.
Lastly, APH's answer will most certainly work (assuming your original query works), but given the obfuscation and inefficiency of your query, I would definitely rewrite it.
Please let me know if you have any questions.

Why Does This subquery only return one record?

This is a subquery that I have. I am having a hard time understanding why it keeps coming back to me with the error "at most this subquery can only return one record".
FROM SoftwareAssigned
By my understanding this is saying "get a count of all records where the SoftID (softwareID) is the same"
What is really going on and how do I keep from making this mistake in the future?
The context is this attempted query:
SELECT Software.Description, Software.QtyPurchased
, (
FROM SoftwareAssigned
) AS Assigned
,( Software.QtyPurchased -
FROM SoftwareAssigned
) AS Remaining
FROM Software
The query will "count, for each specific SoftID value, how many records have that id".
The query will return one row for each specific SoftID value that exists in the table.
If you want to count how many different SoftID values there are, you would use:
select count(distinct SoftID)
from SoftwareAssigned
To get a count from one table of the records that correspond to a record in another table, you would join the tables together and group on the values from the other table:
select
Software.Description, Software.QtyPurchased,
count(SoftwareAssigned.SoftID) as Assigned,
Software.QtyPurchased - count(SoftwareAssigned.SoftID) as Remaining
from Software
left join SoftwareAssigned on SoftwareAssigned.SoftID = Software.SoftID
group by
Software.SoftID, Software.Description, Software.QtyPurchased
I'm assuming Software has a SoftID column. It looks like you are hoping SQL will link the subquery to the main query. It will only do this if you tell it how:
select
s.Description,
s.QtyPurchased, (
select count(*)
from
SoftwareAssigned a
where
-- link to outer query
a.SoftID = s.SoftID
) as Assigned,
s.QtyPurchased - (
select count(*)
from
SoftwareAssigned a
where
-- link to outer query.
a.SoftID = s.SoftID
) as Remaining
from
Software s;
As it happens, there is a more compact way of writing this:
select
s.Description,
s.QtyPurchased,
count(a.SoftID) as assigned,
s.QtyPurchased - count(a.SoftID) as Remaining
from
software s
left outer join
SoftwareAssigned a
on s.SoftID = a.SoftID
group by
s.Description,
s.QtyPurchased
Example SQLFiddle
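For a quick local check of the two approaches discussed in this thread, the sketch below runs a correlated-subquery version and a left-join version side by side through Python's sqlite3. The `AssignedTo` column and the sample rows are assumptions made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Software (SoftID INTEGER PRIMARY KEY, Description TEXT, QtyPurchased INTEGER);
CREATE TABLE SoftwareAssigned (SoftID INTEGER, AssignedTo TEXT);  -- AssignedTo is an assumed column
INSERT INTO Software VALUES (1, 'Editor', 5), (2, 'Compiler', 2);
INSERT INTO SoftwareAssigned VALUES (1, 'ann'), (1, 'bob'), (1, 'cy');
""")

# Each scalar subquery is tied to the outer row, so it returns exactly one value.
correlated = """
SELECT s.Description, s.QtyPurchased,
       (SELECT COUNT(*) FROM SoftwareAssigned a WHERE a.SoftID = s.SoftID) AS Assigned,
       s.QtyPurchased -
       (SELECT COUNT(*) FROM SoftwareAssigned a WHERE a.SoftID = s.SoftID) AS Remaining
FROM Software s
ORDER BY s.SoftID
"""
# Same result with a single join; COUNT(a.SoftID) ignores the NULLs
# produced by the left join for software with no assignments.
joined = """
SELECT s.Description, s.QtyPurchased,
       COUNT(a.SoftID) AS Assigned,
       s.QtyPurchased - COUNT(a.SoftID) AS Remaining
FROM Software s
LEFT OUTER JOIN SoftwareAssigned a ON s.SoftID = a.SoftID
GROUP BY s.SoftID, s.Description, s.QtyPurchased
ORDER BY s.SoftID
"""
rows_correlated = conn.execute(correlated).fetchall()
rows_joined = conn.execute(joined).fetchall()
print(rows_correlated)  # [('Editor', 5, 3, 2), ('Compiler', 2, 0, 2)]
```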

Update using Distinct SUM

I have found a few good resources that show I should be able to merge a select query with an update, but I just can't get my head around the correct formatting.
I have a select statement that is getting info for me, and I want to pretty much use those results to Update an account table that matches the accountID in the select query.
Here is the select statement:
SELECT DISTINCT SUM(b.workers)*tt.mealTax as MealCost,b.townID,b.accountID
FROM buildings AS b
INNER JOIN town_tax AS tt ON tt.townID = b.townID
GROUP BY b.townID,b.accountID
So in short I want the above query to be merged with:
UPDATE accounts AS a
SET a.wealth = a.wealth - MealCost
Where MealCost is the result from the select query. I am sure there is a way to put this into one, I just haven't quite been able to connect the dots to get it to run consistently without separating into two queries.
First, you don't need the distinct when you have a group by.
Second, how do you intend to link the two results? The SELECT query returns multiple rows per account (one for each town), while the accounts table presumably has only one row per account. Let's say that you wanted the average MealCost for the update.
The select query to get this is:
SELECT accountID, avg(MealCost) as avg_Mealcost
FROM (SELECT SUM(b.workers)*tt.mealTax as MealCost, b.townID, b.accountID
FROM buildings AS b INNER JOIN
town_tax AS tt
ON tt.townID = b.townID
GROUP BY b.townID,b.accountID
) a
GROUP BY accountID
Now, to put this into an update, you can use syntax like the following:
UPDATE accounts
set accounts.wealth = accounts.wealth - asum.avg_mealcost
from (SELECT accountID, avg(MealCost) as avg_Mealcost
FROM (SELECT SUM(b.workers)*tt.mealTax as MealCost, b.townID, b.accountID
FROM buildings AS b INNER JOIN
town_tax AS tt
ON tt.townID = b.townID
GROUP BY b.townID,b.accountID
) a
GROUP BY accountID
) asum
where accounts.accountid = asum.accountid
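This uses SQL Server's UPDATE ... FROM syntax. PostgreSQL accepts a similar form, but it is not universal: Oracle has traditionally required MERGE or a correlated subquery instead. MySQL puts the joined tables before the "set" (UPDATE accounts a JOIN (...) asum ON ... SET ...) and allows an alias on "update accounts".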
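Since the UPDATE syntax differs between databases, a correlated subquery is one fairly portable fallback. The sketch below runs that form through Python's sqlite3 with invented sample rows. It keeps the average-per-account assumption from this answer and subtracts the cost from wealth, as the question asked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts  (accountID INTEGER PRIMARY KEY, wealth REAL);
CREATE TABLE buildings (townID INTEGER, accountID INTEGER, workers INTEGER);
CREATE TABLE town_tax  (townID INTEGER PRIMARY KEY, mealTax REAL);
INSERT INTO accounts VALUES (1, 100), (2, 50);
INSERT INTO town_tax VALUES (10, 2), (20, 4);
-- account 1: town 10 -> (3+1)*2 = 8, town 20 -> 2*4 = 8; account 2 has no buildings
INSERT INTO buildings VALUES (10, 1, 3), (10, 1, 1), (20, 1, 2);
""")

# One row per account: the average of its per-town meal costs.
per_account = """
SELECT accountID, AVG(MealCost) AS avg_mealcost
FROM (SELECT SUM(b.workers) * tt.mealTax AS MealCost, b.townID, b.accountID
      FROM buildings AS b
      INNER JOIN town_tax AS tt ON tt.townID = b.townID
      GROUP BY b.townID, b.accountID) a
GROUP BY accountID
"""

# Correlated-subquery form of the update; the WHERE clause leaves
# accounts without buildings untouched instead of setting them to NULL.
conn.execute(f"""
UPDATE accounts
SET wealth = wealth - (SELECT avg_mealcost FROM ({per_account}) asum
                       WHERE asum.accountID = accounts.accountID)
WHERE accountID IN (SELECT accountID FROM ({per_account}))
""")
wealth = conn.execute("SELECT accountID, wealth FROM accounts ORDER BY accountID").fetchall()
print(wealth)  # [(1, 92.0), (2, 50.0)]
```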

Equality of "select ... where in" and joins

Suppose I have a table1 like this:
id | itemcode
1 | c1
2 | c2
And a table2 like this:
item | name
c1 | acme
c2 | foo
Would the following two queries return the same result set under every condition?
SELECT id, itemcode
FROM table1
WHERE itemcode IN
(SELECT DISTINCT item
FROM table2
WHERE name [some arbitrary test])
SELECT id, itemcode
FROM table1
JOIN
(SELECT DISTINCT item
FROM table2
WHERE name [some arbitrary test]) items
ON table1.itemcode = items.item
Unless I'm really missing something stupid, I'd say yes. But I've done two queries which boil down to this form and I am getting different results. There are some nested queries using WHERE IN, but for the last step I've noticed a JOIN is much faster. The nested queries are all entirely isolated so I don't believe they are the problem, so I just want to eliminate the possibility that I've got a misconception regarding the above.
Thanks for any insights.
The two original queries:
SELECT imitm, imlitm, imglpt
FROM jdedata.F4101
WHERE imitm IN
(SELECT DISTINCT ivitm AS itemno
FROM jdedata.F4104
WHERE ivcitm IN
(SELECT DISTINCT ivcitm AS legacycode
FROM jdedata.F4104
WHERE ivitm IN
FROM trigdata.F4101_TRIG)
SELECT orig.imitm, orig.imlitm, orig.imglpt
FROM jdedata.F4101 orig
(SELECT DISTINCT ivitm AS itemno
FROM jdedata.F4104
WHERE ivcitm IN
(SELECT DISTINCT ivcitm AS legacycode
FROM jdedata.F4104
WHERE ivitm IN
FROM trigdata.F4101_TRIG))) itemns
ON orig.imitm = itemns.itemno
Although I still don't understand why the queries returned different results, it would seem our logic was flawed from the beginning since we were using the wrong columns in some parts. Mind that I'm not saying I made a mistake interpreting the queries as written above or had some typo, we just needed to select on some different stuff.
Normally I don't rest until I get to the bottom of things like these, but I'm very tired and am entering my first vacation since January that spans more than one day, so I can't really be bothered searching further right now. I'm sure the tips given here will come in handy later. Upvotes have been distributed for all the help and I've accepted Ypercube's answer, mostly because his comments have led me the furthest. But thanks all round! If I do find out more later, I'll try to remember pinging back in.
Since table2.item is not nullable, the 2 versions are equivalent. You can remove the distinct from the IN version, it's not needed. You can check these 3 versions and their execution plans:
SELECT id, itemcode FROM table1 WHERE itemcode IN
( SELECT item FROM table2 WHERE name [some arbitrary test] )
SELECT id, itemcode FROM table1 JOIN
( SELECT DISTINCT item FROM table2 WHERE name [some arbitrary test] )
items ON table1.itemcode = items.item
SELECT id, itemcode FROM table1 WHERE EXISTS
( SELECT * FROM table2 WHERE table1.itemcode = table2.item
AND (name [some arbitrary test]) )
Ideally I would want to see the differences between the result sets.
- Are you getting duplication of records
- Is one set always a sub-set of the other
- Does one set have both 'additional' and 'missing' records in comparison to the other?
That said, the logic should be equivalent. My best guess would be that you have some empty string entries in there, because Oracle treats an empty CHAR/VARCHAR string as NULL. This can give very funky results if you're not prepared for it.
Both queries perform a semijoin i.e. no attributes from table2 appear in the topmost SELECT (the resultset).
To my eye, your first query is easiest to identify as a semijoin, EXISTS even more so. On the other hand, an optimizer would no doubt see it differently ;)
You can also try to do a direct join to the second table
SELECT DISTINCT id, itemcode
FROM table1
INNER JOIN table2 ON table1.itemcode = table2.item
WHERE name [some arbitrary test]
You don't need distinct if item is primary key or unique
EXISTS and INNER JOIN should have about the same execution speed, while IN can be more expensive on some optimizers; the execution plan will tell you for your database.
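The sketch below compares the variants from this thread on a tiny data set, using Python's sqlite3 and a concrete test (`name = 'acme'`) in place of the arbitrary one. The duplicated row in `table2` is an assumption added on purpose, to show why the plain join needs DISTINCT when `item` is not unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, itemcode TEXT);
CREATE TABLE table2 (item TEXT, name TEXT);      -- item deliberately NOT unique here
INSERT INTO table1 VALUES (1, 'c1'), (2, 'c2');
INSERT INTO table2 VALUES ('c1', 'acme'), ('c1', 'acme'), ('c2', 'foo');
""")

def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

in_rows = run("""SELECT id, itemcode FROM table1 WHERE itemcode IN
                 (SELECT item FROM table2 WHERE name = 'acme') ORDER BY id""")
join_rows = run("""SELECT id, itemcode FROM table1 JOIN
                   (SELECT DISTINCT item FROM table2 WHERE name = 'acme') items
                   ON table1.itemcode = items.item ORDER BY id""")
exists_rows = run("""SELECT id, itemcode FROM table1 WHERE EXISTS
                     (SELECT * FROM table2 WHERE table1.itemcode = table2.item
                      AND name = 'acme') ORDER BY id""")
# Joining table2 directly without DISTINCT repeats the row once per match.
plain_join_rows = run("""SELECT id, itemcode FROM table1
                         INNER JOIN table2 ON table1.itemcode = table2.item
                         WHERE name = 'acme' ORDER BY id""")
print(in_rows, join_rows, exists_rows, plain_join_rows)
```

Here the three semijoin forms each return the single row (1, 'c1'), while the plain join returns it twice.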
I'd look for some data type conversion in there.
create table t_vc (val varchar2(6));
create table t_c (val char(6));
insert into t_vc values ('12345');
insert into t_vc values ('12345 ');
insert into t_c values ('12345');
insert into t_c values ('12345');
select t_c.val||':'
from t_c
where val in (select distinct val from t_vc);
select c.val||':'
from t_vc v join (select distinct val from t_c) c on v.val=c.val;