I have a database that holds info for accounts, posts, and which posts a user likes.
AccountData
id || username
PostData
id || text || accountid
LikesDislikesData
liked(bool) || accountid || postid
I have a view set up because I need specific data from the DB to bind inside of my app. Here is the code I am using:
SELECT trippin.PostData.id, trippin.AccountData.username, trippin.PostData.posttext,
trippin.CategoryData.categoryname, trippin.PostData.__createdat as CreatedAt,
SUM(CASE WHEN likes.liked = 1 THEN 1 ELSE 0 END) as Likes,
SUM(CASE WHEN likes.liked = 0 THEN 1 ELSE 0 END) as DisLikes
FROM trippin.PostData
INNER JOIN trippin.AccountData ON trippin.PostData.accountid = trippin.AccountData.id
INNER JOIN trippin.CategoryData ON trippin.CategoryData.id = trippin.PostData.categoryid
LEFT OUTER JOIN trippin.LikesDislikesData likes ON likes.postid = trippin.PostData.id
GROUP BY (trippin.AccountData.username), (trippin.PostData.posttext), (trippin.PostData.id), (trippin.categorydata.categoryname), (trippin.PostData.__createdat)
The problem is that, any time I add a join to (likes.accountid = trippin.AccountData.id), rows are either duplicated or the output is just incorrect.
I think it might be a design issue, but I am not sure and cannot find anything that helps my exact problem.
So basically, each post is made by a user. Then, each post is liked or disliked (or nothing) by other users. I need all of this data inside of a view so I can pass it to my app.
select postID
, sum(case when liked = 1 then 1 else 0 end) as Likes
, sum(case when liked = 0 then 1 else 0 end) as DisLikes
from trippin.LikesDislikesData
group by postID
The above statement should give you the like and dislike counts by post ID. Use it as a subquery and join it to your main query:
SELECT trippin.PostData.id, trippin.AccountData.username, trippin.PostData.posttext,
trippin.CategoryData.categoryname, trippin.PostData.__createdat as CreatedAt,
COALESCE(likes.Likes, 0) as Likes,
COALESCE(likes.DisLikes, 0) as DisLikes
FROM trippin.PostData
INNER JOIN trippin.AccountData ON trippin.PostData.accountid = trippin.AccountData.id
INNER JOIN trippin.CategoryData ON trippin.CategoryData.id = trippin.PostData.categoryid
LEFT OUTER JOIN (
select postID
, sum(case when liked = 1 then 1 else 0 end) as Likes
, sum(case when liked = 0 then 1 else 0 end) as DisLikes
from trippin.LikesDislikesData
group by postID)
likes ON likes.postid = trippin.PostData.id
This should work for what you are after. I removed your GROUP BY clause because the outer query no longer contains any sums. Quite often, a subquery that does the counts and sums by ID, joined to the main query on that ID, is the easiest way to obtain counts, at least without having to group by the entire select list.
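To see the pre-aggregated subquery produce exactly one row per post, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module purely so it can be run anywhere; the table contents are made up, the category and account joins are left out, and liked is stored as 1/0. The COALESCE calls turn the NULLs from the LEFT JOIN into 0 for posts nobody has rated.

```python
import sqlite3

# Hypothetical miniature of the schema from the question (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PostData (id INTEGER PRIMARY KEY, posttext TEXT, accountid INTEGER);
    CREATE TABLE LikesDislikesData (liked INTEGER, accountid INTEGER, postid INTEGER);
    INSERT INTO PostData VALUES (1, 'first', 10), (2, 'second', 11), (3, 'third', 10);
    INSERT INTO LikesDislikesData VALUES (1, 11, 1), (1, 12, 1), (0, 13, 1), (0, 10, 2);
""")

# Aggregate the likes table down to one row per post BEFORE joining,
# so the join cannot multiply rows of the outer query.
post_rows = conn.execute("""
    SELECT p.id,
           COALESCE(l.Likes, 0)    AS Likes,
           COALESCE(l.DisLikes, 0) AS DisLikes
    FROM PostData p
    LEFT JOIN (
        SELECT postid,
               SUM(CASE WHEN liked = 1 THEN 1 ELSE 0 END) AS Likes,
               SUM(CASE WHEN liked = 0 THEN 1 ELSE 0 END) AS DisLikes
        FROM LikesDislikesData
        GROUP BY postid
    ) l ON l.postid = p.id
    ORDER BY p.id
""").fetchall()

for row in post_rows:
    print(row)  # (1, 2, 1) then (2, 0, 1) then (3, 0, 0)
```

Post 3 has no rows in the likes table and still appears once, with both counts at 0.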
I have a query that uses multiple left joins, and I am trying to get a SUM of values from one of the joined columns.
SELECT
SUM( case when session.usersessionrun =1 then 1 else 0 end) new_unique_session_user_count
FROM session
LEFT JOIN appuser ON appuser.appid = '6279df3bd2d3352aed591583'
AND appuser.userid = session.userid
LEFT JOIN userdevice ON userdevice.appid = '6279df3bd2d3352aed591583'
AND userdevice.userid = appuser.userid
WHERE session.appid = '6279df3bd2d3352aed591583'
AND (session.uploadedon BETWEEN '2022-04-18 08:31:26' AND '2022-05-18 08:31:26')
But this obviously inflates the session.usersessionrun = 1 count, since the sum is computed over the joined result set.
Here the logic was to mark the user as new if the sessionrun for that record is 1.
I grouped by userid and usersessionrun and it shows that the records are repeated.
userid | sessionrun | count
628212 | 1 | 2
627a01 | 1 | 4
So what I was trying to do was something like
SUM(CASE distinct(session.userid) AND WHEN session.usersessionrun = 1 THEN 1 ELSE 0 END) new_unique_session_user_count
i.e. for every unique user count, session.usersessionrun = 1 should only be done once.
As you have discovered, JOIN operations can generate combinatorial explosions of data.
You need a subquery to count your sessions by userid. Then you can treat the subquery as a virtual table and JOIN it to the other tables to get the information you need in your result set.
The subquery (nothing in my answer is debugged):
SELECT COUNT(*) new_unique_session_user_count,
session.userid
FROM session
WHERE session.appid = '6279df3bd2d3352aed591583'
AND session.uploadedon BETWEEN '2022-04-18 08:31:26'
AND '2022-05-18 08:31:26'
AND session.usersessionrun = 1
GROUP BY userid
This subquery summarizes your session table and has one row per userid. The trick to avoiding JOIN-created combinatorial explosions is using subqueries that generate results with only one row per data item mentioned in a JOIN's ON-clause.
Then, you join it with the other tables like this
SELECT summary.new_unique_session_user_count
FROM (
SELECT COUNT(*) new_unique_session_user_count,
session.userid
FROM session
WHERE session.appid = '6279df3bd2d3352aed591583'
AND session.uploadedon BETWEEN '2022-04-18 08:31:26'
AND '2022-05-18 08:31:26'
AND session.usersessionrun = 1
GROUP BY userid
) summary
JOIN appuser ON appuser.appid = '6279df3bd2d3352aed591583'
AND appuser.userid = summary.userid
JOIN userdevice ON userdevice.appid = '6279df3bd2d3352aed591583'
AND userdevice.userid = appuser.userid
There may be better ways to structure this query, but it's hard to guess at them without more information about your table definitions and business rules.
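If all you need is the number of distinct new users, one alternative (not part of the answer above, so treat it as a sketch) is to count distinct user IDs inside the CASE, which is insensitive to rows duplicated by the joins. The example assumes SQLite via Python's sqlite3, a reduced schema, and invented data chosen so that the two users from the question are duplicated 2 and 4 times.

```python
import sqlite3

# Hypothetical, reduced versions of the session and userdevice tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE session (userid TEXT, usersessionrun INTEGER);
    CREATE TABLE userdevice (userid TEXT, device TEXT);
    INSERT INTO session VALUES ('628212', 1), ('628212', 2), ('627a01', 1), ('63aaaa', 3);
    INSERT INTO userdevice VALUES ('628212', 'phone'), ('628212', 'tablet'),
                                  ('627a01', 'phone'), ('627a01', 'tablet'),
                                  ('627a01', 'tv'),    ('627a01', 'watch');
""")

joined = "FROM session s LEFT JOIN userdevice d ON d.userid = s.userid"

# Each session row is repeated once per matching device, so the plain SUM is inflated.
inflated = conn.execute(
    "SELECT SUM(CASE WHEN s.usersessionrun = 1 THEN 1 ELSE 0 END) " + joined
).fetchone()[0]

# The CASE yields the userid only for first-run sessions (NULL otherwise);
# COUNT(DISTINCT ...) ignores NULLs and counts each user once.
unique_new_users = conn.execute(
    "SELECT COUNT(DISTINCT CASE WHEN s.usersessionrun = 1 THEN s.userid END) " + joined
).fetchone()[0]

print(inflated, unique_new_users)  # 6 2
```

This keeps the original joins in place, which can be convenient when other columns of the same query still need them.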
Original Query:
select StudyID, count(CompletedDate), count(Removed), count(RemovalReason)
from Study a
full outer join Households b
on a.HouseholdID = b.HouseholdID
where StudyID = '123456'
and Removed = 1
and RemovalReason = 5
group by StudyID
How do I write this query so that the counts for CompletedDate, Removed, and RemovalReason are not all restricted by the same conditions (i.e. Removed = 1, RemovalReason = 5), and each condition applies only to its own column? If I execute this query, it does not show the total count for CompletedDate, because the WHERE clause restricts every column to these conditions. Is there a way to write the condition directly next to the count?
Table/Columns - Study:
HouseholdID (primary key),
StudyID,
CompletedDate
Table/Columns - Households:
HouseholdID (primary key),
Removed,
RemovalReason
I think you are looking for something like this, but your question is a little loose with details:
select StudyID
, count(CompletedDate)
, sum(case when Removed = 1 then 1 else 0 end)
, sum(case when RemovalReason = 5 then 1 else 0 end)
from Study a
join Households b
on a.HouseholdID = b.HouseholdID
where StudyID = '123456'
group by StudyID
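A small runnable check of this conditional aggregation, assuming SQLite via Python's sqlite3 and invented rows for the two tables described in the question. Only the StudyID filter stays in the WHERE clause; the other two conditions live inside their own SUM.

```python
import sqlite3

# Sample data is hypothetical; the column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Study (HouseholdID INTEGER PRIMARY KEY, StudyID TEXT, CompletedDate TEXT);
    CREATE TABLE Households (HouseholdID INTEGER PRIMARY KEY, Removed INTEGER, RemovalReason INTEGER);
    INSERT INTO Study VALUES (1, '123456', '2020-01-01'), (2, '123456', '2020-01-02'),
                             (3, '123456', NULL),         (4, '999999', '2020-01-03');
    INSERT INTO Households VALUES (1, 1, 5), (2, 0, NULL), (3, 1, 2), (4, 1, 5);
""")

study_row = conn.execute("""
    SELECT a.StudyID,
           COUNT(a.CompletedDate)                               AS completed,  -- non-NULL dates only
           SUM(CASE WHEN b.Removed = 1 THEN 1 ELSE 0 END)       AS removed,
           SUM(CASE WHEN b.RemovalReason = 5 THEN 1 ELSE 0 END) AS reason_5
    FROM Study a
    JOIN Households b ON a.HouseholdID = b.HouseholdID
    WHERE a.StudyID = '123456'
    GROUP BY a.StudyID
""").fetchone()

print(study_row)  # ('123456', 2, 2, 1)
```

The three counts differ because each one applies its own condition to the same three households.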
I am trying to get the number of customers by their types and groups all in line as such:
GroupName | GroupNotes | Count(Type1) | Count(Type2) | Count(Type3)
but instead I can only get the group id, the type id, and the number of customers of each type in the group, using the following query:
SELECT
CustomersGroups.idCustomerGroup , Customers.type , COUNT(*)
FROM
CustomersGroups
inner Join CustomersInGroup on CustomersGroups.idCustomerGroup = CustomersInGroup.idCustomerGroup
inner Join Customers on Customers.idCustomer = CustomersInGroup.idCustomer
Group by
CustomersGroups.idCustomerGroup, Customers.type
Is there a way to show them on a single line (and show the name of the group)?
This is a "pivot" query. Some databases directly support pivot syntax. In all of them, you can use conditional aggregation.
Perhaps more importantly, you should learn to use table aliases. These make queries easier to write and to read:
select cg.idCustomerGroup,
sum(case when c.type = 'Type1' then 1 else 0 end) as num_type1,
sum(case when c.type = 'Type2' then 1 else 0 end) as num_type2,
sum(case when c.type = 'Type3' then 1 else 0 end) as num_type3
from CustomersGroups cg inner Join
CustomersInGroup cig
on cg.idCustomerGroup = cig.idCustomerGroup inner Join
Customers c
on c.idCustomer = cig.idCustomer
Group by cg.idCustomerGroup;
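To also show the group name and notes on the same line, those columns can be added to both the select list and the GROUP BY. The sketch below assumes SQLite via Python's sqlite3, assumes that CustomersGroups has GroupName and GroupNotes columns (taken from the desired output header, not from a posted table definition), and uses invented data with the literal type values 'Type1' to 'Type3'.

```python
import sqlite3

# Hypothetical schema and data; GroupName/GroupNotes are assumed column names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CustomersGroups (idCustomerGroup INTEGER PRIMARY KEY, GroupName TEXT, GroupNotes TEXT);
    CREATE TABLE Customers (idCustomer INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE CustomersInGroup (idCustomerGroup INTEGER, idCustomer INTEGER);
    INSERT INTO CustomersGroups VALUES (1, 'North', 'n/a'), (2, 'South', 'n/a');
    INSERT INTO Customers VALUES (1, 'Type1'), (2, 'Type1'), (3, 'Type2'), (4, 'Type3');
    INSERT INTO CustomersInGroup VALUES (1, 1), (1, 2), (1, 3), (2, 3), (2, 4);
""")

pivot_rows = conn.execute("""
    SELECT cg.GroupName, cg.GroupNotes,
           SUM(CASE WHEN c.type = 'Type1' THEN 1 ELSE 0 END) AS num_type1,
           SUM(CASE WHEN c.type = 'Type2' THEN 1 ELSE 0 END) AS num_type2,
           SUM(CASE WHEN c.type = 'Type3' THEN 1 ELSE 0 END) AS num_type3
    FROM CustomersGroups cg
    INNER JOIN CustomersInGroup cig ON cg.idCustomerGroup = cig.idCustomerGroup
    INNER JOIN Customers c ON c.idCustomer = cig.idCustomer
    GROUP BY cg.idCustomerGroup, cg.GroupName, cg.GroupNotes
    ORDER BY cg.idCustomerGroup
""").fetchall()

for row in pivot_rows:
    print(row)  # ('North', 'n/a', 2, 1, 0) then ('South', 'n/a', 0, 1, 1)
```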
I have 4 tables:
Competencies: a static library listing the competencies
Competency Levels: refers to an associated group of competencies and has a number of competencies I am testing for
call_competency: a list of all 'calls' that have recorded the specified competency
competency_review_status: proving whether each call_competency was reviewed
Now I am trying to write a query that computes a total and returns the competency, its id, and whether a user has reached the limit. Everything works until I add the user. Once I limit call_competency by user in the WHERE clause, I get only the small subset of competencies that exist in call_competency, when I want the entire list of competencies.
Competencies that have not reached the limit should be false, and those recorded the appropriate number of times should be true. I want a FULL list from the competencies table.
I added the derived table, but I am not sure it is right. It does not run properly and I cannot see what I am doing wrong. Any help is much appreciated.
SELECT comp.id, comp.shortname, comp.description,
CASE WHEN sum(CASE WHEN crs.grade = 'Pass' THEN 1 ELSE CASE WHEN crs.grade = 'Fail' THEN -1 ELSE 0 END END) >= comp_l.competency_break_level
THEN TRUE ELSE FALSE END
FROM competencies comp
INNER JOIN competency_levels comp_l ON comp_l.competency_group = comp.competency_group
LEFT OUTER JOIN (
SELECT competency_id
FROM call_competency
WHERE call_competency.user_id IN (
SELECT users.id FROM users WHERE email= _studentemail
)
) call_c ON call_c.competency_id = comp.id
LEFT OUTER JOIN competency_review_status crs ON crs.id = call_competency.review_status_id
GROUP BY comp.id, comp.shortname, comp.description, comp_l.competency_break_level
ORDER BY comp.id;
(Shooting from the hip, no installation to test)
It looks like the below should do the trick. You apparently had some of the joins mixed up, with a column from a relation that was not referenced. Also, the CASE statement in the main query could be much cleaner.
SELECT comp.id, comp.shortname, comp.description,
(sum(CASE WHEN crs.grade = 'Pass' THEN 1 WHEN crs.grade = 'Fail' THEN -1 ELSE 0 END) >= comp_l.competency_break_level) AS reached_limit
FROM competencies comp
JOIN competency_levels comp_l USING (competency_group)
LEFT JOIN (
SELECT competency_id, review_status_id
FROM call_competency
JOIN users ON users.id = call_competency.user_id
WHERE users.email = _studentemail
) call_c ON call_c.competency_id = comp.id
LEFT JOIN competency_review_status crs ON crs.id = call_c.review_status_id
GROUP BY comp.id, comp.shortname, comp.description, comp_l.competency_break_level
ORDER BY comp.id;
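The reason the user filter belongs inside the derived table can be shown with a reduced example. This is only a sketch: it assumes SQLite via Python's sqlite3, folds the break level into competencies and the grade into call_competency (both simplifications of the real schema), and uses invented rows.

```python
import sqlite3

# Hypothetical, simplified schema: break_level and grade are folded into these two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE competencies (id INTEGER PRIMARY KEY, shortname TEXT, break_level INTEGER);
    CREATE TABLE call_competency (competency_id INTEGER, user_id INTEGER, grade TEXT);
    INSERT INTO competencies VALUES (1, 'greeting', 2), (2, 'closing', 1), (3, 'empathy', 1);
    INSERT INTO call_competency VALUES (1, 7, 'Pass'), (1, 7, 'Pass'), (2, 7, 'Fail'), (3, 8, 'Pass');
""")

# Filtering on the outer-joined table in WHERE discards the NULL-extended rows,
# so competencies the user never touched vanish from the result.
filtered_ids = [r[0] for r in conn.execute("""
    SELECT c.id
    FROM competencies c
    LEFT JOIN call_competency cc ON cc.competency_id = c.id
    WHERE cc.user_id = ?
    GROUP BY c.id
    ORDER BY c.id
""", (7,))]

# Filtering inside the derived table keeps every competency in the output.
full_rows = conn.execute("""
    SELECT c.id, c.shortname,
           (SUM(CASE WHEN cc.grade = 'Pass' THEN 1
                     WHEN cc.grade = 'Fail' THEN -1
                     ELSE 0 END) >= c.break_level) AS reached_limit
    FROM competencies c
    LEFT JOIN (
        SELECT competency_id, grade FROM call_competency WHERE user_id = ?
    ) cc ON cc.competency_id = c.id
    GROUP BY c.id, c.shortname, c.break_level
    ORDER BY c.id
""", (7,)).fetchall()

print(filtered_ids)  # [1, 2]
print(full_rows)     # [(1, 'greeting', 1), (2, 'closing', 0), (3, 'empathy', 0)]
```

SQLite reports the boolean as 1 or 0; a database with a real boolean type would return true or false instead.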
I saw this error-message lots of times here, but I'm not getting the solution for my specific problem out of it (probably because I'm not an sql-expert), so please forgive me for posting a question to the same error.
This is the query I'm trying to execute:
SELECT DISTINCT U.FB_UserId
, U.Id AS GameUserID
, U.FbLocale
, U.FbGender
, U.FbBirthday
, U.RegistredAt
, U.LoginCount
, U.PlayCount
, U.MarketGroupId
, (SELECT COUNT(C.PriceFbCredits)
WHERE UserID = U.Id) AS Payments
, (SELECT SUM(CASE WHEN C.PriceFbCredits = 13 THEN 1 END)
WHERE UserID = U.Id) AS P13
, (SELECT SUM(CASE WHEN C.PriceFbCredits = 52 THEN 1 END)
WHERE UserID = U.Id) AS P52
, (SELECT SUM(CASE WHEN C.PriceFbCredits = 130 THEN 1 END)
WHERE UserID = U.Id) AS P130
FROM [dbo].[User] AS U WITH (NOLOCK) INNER JOIN [dbo].[FbCreditsCallback] AS C WITH (NOLOCK) ON C.UserId = U.Id
If I do, I get the error-message. OK, I know what it means and I kind of understand what to do, but I think if I do it doesn't give me the result I want...
I want some of the data for a specific user id. Some of the data needs to be summed, some of it needs to be counted, and in the result list each id should appear only once.
Now here's the funny thing (for me). If I write the inner SELECT-Queries like this, I'm not getting the error. But I don't know if the returned data is correct:
, (SELECT COUNT(PriceFbCredits)
FROM [dbo].[FbCreditsCallback]
WHERE UserID = U.Id) AS Payments
... To be honest, I kind of lost track and I'm hoping for some help.
Below you'll see the correct SQL Server syntax for using GROUP BY with aggregate functions.
Your WHERE UserID = U.Id is not needed at all, since that condition is already part of the INNER JOIN clause.
So try this:
SELECT U.FB_UserId
, U.Id AS GameUserID
, U.FbLocale
, U.FbGender
, U.FbBirthday
, U.RegistredAt
, U.LoginCount
, U.PlayCount
, U.MarketGroupId
, COUNT(*) AS Payments
, SUM(CASE WHEN C.PriceFbCredits = 13 THEN 1 ELSE 0 END) AS P13
, SUM(CASE WHEN C.PriceFbCredits = 52 THEN 1 ELSE 0 END) AS P52
, SUM(CASE WHEN C.PriceFbCredits = 130 THEN 1 ELSE 0 END) AS P130
FROM [dbo].[User] AS U WITH (NOLOCK)
INNER JOIN [dbo].[FbCreditsCallback] AS C WITH (NOLOCK) ON C.UserId = U.Id
GROUP BY U.FB_UserId
, U.Id
, U.FbLocale
, U.FbGender
, U.FbBirthday
, U.RegistredAt
, U.LoginCount
, U.PlayCount
, U.MarketGroupId
Since you wrote that you are not an SQL expert: from now on, avoid using WITH (NOLOCK). It is the same as asking SELECT [data] FROM [TABLE] WITH (I really, really don't care if it is accurate or not).
There are reasons to use it in some cases, but if you are a beginner with SQL, I doubt that you are in such a situation.
If you don't use GROUP BY, the select list is inconsistent, because it mixes per-row columns with aggregates.
For example, a SUM or COUNT has to be computed per person. If you don't group by the person, what would you expect the sum to be a sum for?
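To compare the two shapes discussed in this thread, the sketch below runs the GROUP BY form and the correlated-subquery form (each subquery with its own FROM clause) against the same data. It assumes SQLite via Python's sqlite3, a reduced version of the two tables, and invented rows; the table hints from the original query are omitted because they are specific to SQL Server.

```python
import sqlite3

# Hypothetical, reduced versions of the User and FbCreditsCallback tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "User" (Id INTEGER PRIMARY KEY, FbLocale TEXT);
    CREATE TABLE FbCreditsCallback (UserId INTEGER, PriceFbCredits INTEGER);
    INSERT INTO "User" VALUES (1, 'en_US'), (2, 'de_DE');
    INSERT INTO FbCreditsCallback VALUES (1, 13), (1, 13), (1, 52), (2, 130);
""")

# One pass over the join, one output row per user.
grouped = conn.execute("""
    SELECT U.Id,
           COUNT(*) AS Payments,
           SUM(CASE WHEN C.PriceFbCredits = 13  THEN 1 ELSE 0 END) AS P13,
           SUM(CASE WHEN C.PriceFbCredits = 52  THEN 1 ELSE 0 END) AS P52,
           SUM(CASE WHEN C.PriceFbCredits = 130 THEN 1 ELSE 0 END) AS P130
    FROM "User" U
    INNER JOIN FbCreditsCallback C ON C.UserId = U.Id
    GROUP BY U.Id
    ORDER BY U.Id
""").fetchall()

# Same numbers via correlated subqueries; no join and no GROUP BY in the outer query.
correlated = conn.execute("""
    SELECT U.Id,
           (SELECT COUNT(*) FROM FbCreditsCallback C WHERE C.UserId = U.Id) AS Payments,
           (SELECT COUNT(*) FROM FbCreditsCallback C WHERE C.UserId = U.Id AND C.PriceFbCredits = 13)  AS P13,
           (SELECT COUNT(*) FROM FbCreditsCallback C WHERE C.UserId = U.Id AND C.PriceFbCredits = 52)  AS P52,
           (SELECT COUNT(*) FROM FbCreditsCallback C WHERE C.UserId = U.Id AND C.PriceFbCredits = 130) AS P130
    FROM "User" U
    ORDER BY U.Id
""").fetchall()

print(grouped)  # [(1, 3, 2, 1, 0), (2, 1, 0, 0, 1)]
```

The two forms agree here because every user has at least one payment. With the INNER JOIN, a user without payments would be missing from the grouped result, while the correlated form would list that user with zeros.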