MySQL to PostgreSQL: GROUP BY issues - sql

So I decided to try out PostgreSQL instead of MySQL, but I am having some slight conversion problems. This is a query of mine that pulls data from four tables and returns it all in one result.
I am at a loss as to how to express this in PostgreSQL, and specifically in Django, but I am leaving that for another question. Bonus points if you can Django-fy it, but no worries if you just answer in pure SQL.
SELECT links.id, links.created, links.url, links.title, user.username, category.title, SUM(votes.karma_delta) AS karma, SUM(IF(votes.user_id = 1, votes.karma_delta, 0)) AS user_vote
FROM links
LEFT OUTER JOIN `users` `user` ON (`links`.`user_id`=`user`.`id`)
LEFT OUTER JOIN `categories` `category` ON (`links`.`category_id`=`category`.`id`)
LEFT OUTER JOIN `votes` `votes` ON (`votes`.`link_id`=`links`.`id`)
WHERE (links.id = votes.link_id)
GROUP BY votes.link_id
ORDER BY (SUM(votes.karma_delta) - 1) / POW((TIMESTAMPDIFF(HOUR, links.created, NOW()) + 2), 1.5) DESC
LIMIT 20
The IF in the SELECT was where my first troubles began. It seems it's an IF true/false THEN stuff ELSE other stuff END IF, yet I can't get the syntax right. I tried to use Navicat's SQL builder, but it constantly wanted me to place everything I had selected into the GROUP BY, and that I think is all kinds of wrong.
In summary, what I am looking for is to make this MySQL query work in PostgreSQL. Thank you.
Current Progress
Just want to thank everybody for their help. This is what I have so far:
SELECT links_link.id, links_link.created, links_link.url, links_link.title, links_category.title, SUM(links_vote.karma_delta) AS karma, SUM(CASE WHEN links_vote.user_id = 1 THEN links_vote.karma_delta ELSE 0 END) AS user_vote
FROM links_link
LEFT OUTER JOIN auth_user ON (links_link.user_id = auth_user.id)
LEFT OUTER JOIN links_category ON (links_link.category_id = links_category.id)
LEFT OUTER JOIN links_vote ON (links_vote.link_id = links_link.id)
WHERE (links_link.id = links_vote.link_id)
GROUP BY links_link.id, links_link.created, links_link.url, links_link.title, links_category.title
ORDER BY links_link.created DESC
LIMIT 20
I had to make some table name changes and I am still working on my ORDER BY so till then we're just gonna cop out. Thanks again!

Have a look at this link: GROUP BY
"When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions, since there would be more than one possible value to return for an ungrouped column."
You need to include all the select columns in the group by that are not part of the aggregate functions.

A few things:
Drop the backticks
Use a CASE expression instead of IF(): CASE WHEN votes.user_id = 1 THEN votes.karma_delta ELSE 0 END
Change your timestampdiff to DATE_TRUNC('hour', now()) - DATE_TRUNC('hour', links.created) (you will need to then count the number of hours in the resulting interval. It would be much easier to compare timestamps)
Fix your GROUP BY and ORDER BY
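To make the "count the number of hours" step concrete, here is a minimal Python sketch of the arithmetic behind the ORDER BY expression. It follows the DATE_TRUNC suggestion above (truncate both timestamps to the hour, then count the hours between them). The function names are made up for illustration, and the result may differ by one from MySQL's TIMESTAMPDIFF, which counts completed hours.

```python
from datetime import datetime


def whole_hours_between(created: datetime, now: datetime) -> int:
    """Hours between two timestamps after truncating both to the hour."""
    created_h = created.replace(minute=0, second=0, microsecond=0)
    now_h = now.replace(minute=0, second=0, microsecond=0)
    return int((now_h - created_h).total_seconds() // 3600)


def rank_score(karma: int, created: datetime, now: datetime) -> float:
    """(karma - 1) / (hours + 2) ^ 1.5, as in the original ORDER BY."""
    hours = whole_hours_between(created, now)
    return (karma - 1) / (hours + 2) ** 1.5


# Hypothetical sample values, not taken from the question.
created = datetime(2024, 1, 1, 10, 45)
now = datetime(2024, 1, 1, 12, 15)
hours = whole_hours_between(created, now)
score = rank_score(9, created, now)
print(hours, score)
```

In the query itself the same calculation would live in the ORDER BY clause; the sketch only shows what number that clause is expected to produce.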

Try replacing the IF with a CASE:
SUM(CASE WHEN votes.user_id = 1 THEN votes.karma_delta ELSE 0 END)
You also have to explicitly name every column or calculated column you use in the GROUP BY clause.
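As a rough illustration of both points (CASE inside SUM, and every non-aggregated column listed in GROUP BY), here is a small sketch that runs against an in-memory SQLite database from Python. The table contents are invented, and SQLite is only a stand-in: it is more lenient than PostgreSQL about ungrouped columns, so it cannot reproduce the PostgreSQL error, only the corrected query shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE links (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE votes (link_id INTEGER, user_id INTEGER, karma_delta INTEGER);
-- Invented sample data
INSERT INTO links VALUES (1, 'first'), (2, 'second');
INSERT INTO votes VALUES (1, 1, 1), (1, 2, 1), (1, 3, -1), (2, 2, 1);
""")

rows = conn.execute("""
SELECT links.id, links.title,
       SUM(votes.karma_delta) AS karma,
       SUM(CASE WHEN votes.user_id = 1 THEN votes.karma_delta ELSE 0 END) AS user_vote
FROM links
LEFT OUTER JOIN votes ON votes.link_id = links.id
GROUP BY links.id, links.title      -- every non-aggregated SELECT column
ORDER BY links.id
""").fetchall()

print(rows)
```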

Related

conditional IIF in a JOIN

I have the next data base:
Table Bill:
Table Bill_Details:
And Table Type:
I want a query to show this result:
The query as far goes like this:
SELECT
Bill.Id_Bill,
Type.Id_Type,
Type.Info,
Bill_Details.Deb,
Bill_Details.Cre,
Bill.NIT,
Bill.Date2,
Bill.Comt
FROM Type
RIGHT JOIN (Bill INNER JOIN Bill_Details
ON Bill.Id_Bill = Bill_Details.Id_Bill)
ON Type.Id_Type = Bill_Details.Id_Type
ORDER BY Bill.Id_Bill, Type.Id_Type;
With this result:
I'm not sure how to deal or how to include this:
Type.600,
Type."TOTAL",
IIF(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre) >= 0, ABS(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre)), "" ),
IIF(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre) <= 0, ABS(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre)), "" )
The previous code is responsible for putting new data in some fields, while all of the other fields carry the same data as the record above. I'd appreciate some suggestions on how to accomplish this.
Here is a revised version of the UNION which you removed from the question. The original query was a good start, but you did not provide sufficient details about the error or problem you were experiencing. My comments were not meant to have you remove the problem query, only to ask for more details about the error or problem. In the future, if you have a UNION, make sure each query of the UNION works separately. Then you can debug problems more easily, one step at a time.
Problems which I corrected in the second query of the UNION:
Removed reference to table [Type] in the query, since it was not part of the FROM clause. Instead, I replaced it with a literal value.
Fixed FROM clause to join both [Bill] and [Bill_Details] tables. You had fields from both tables, so why would you not join on them just like in the first query of the UNION?
Grouped on all fields from table [Bill] referenced in the SELECT clause. You must either group on all fields, or include them in aggregate expressions like Sum() or First(), etc.
Replaced empty strings with Nulls for the False cases on Iif() statements.
SELECT
Bill.Id_Bill, Type.Id_Type, Type.Info,
Bill_Details.Deb,
Bill_Details.Cre,
Bill.NIT, Bill.Date2, Bill.Comt
FROM
Type RIGHT JOIN (Bill INNER JOIN Bill_Details
ON Bill.Id_Bill = Bill_Details.Id_Bill)
ON Type.Id_Type = Bill_Details.Id_Type
UNION
SELECT
Bill.Id_Bill, 600 As Id_Type, "TOTAL" As Info,
IIF(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre) >= 0, ABS(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre)), Null ) As Deb,
IIF(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre) <= 0, ABS(SUM(Bill_Details.Deb) - Sum(Bill_Details.Cre)), Null ) As Cre,
Bill.NIT, Bill.Date2, Bill.Comt
FROM Bill INNER JOIN Bill_Details
ON Bill.Id_Bill = Bill_Details.Id_Bill
GROUP BY Bill.Id_Bill, Bill.NIT, Bill.Date2, Bill.Comt;
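For readers who want to try the pattern, here is a minimal sketch of "detail rows plus one TOTAL row per bill", run against in-memory SQLite from Python. The sample data is invented and the tables are cut down to a few columns. CASE is used in place of Access's IIF, and UNION ALL in place of UNION so that identical detail rows are not collapsed; both are substitutions of mine, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bill (id_bill INTEGER PRIMARY KEY);
CREATE TABLE bill_details (id_bill INTEGER, id_type INTEGER, deb INTEGER, cre INTEGER);
-- Invented sample data
INSERT INTO bill VALUES (1), (2);
INSERT INTO bill_details VALUES
  (1, 100, 50, 0), (1, 200, 0, 20),
  (2, 100, 0, 30), (2, 200, 10, 0);
""")

rows = conn.execute("""
SELECT b.id_bill AS id_bill, d.id_type AS id_type, d.deb AS deb, d.cre AS cre
FROM bill b JOIN bill_details d ON d.id_bill = b.id_bill
UNION ALL
-- One synthetic TOTAL row (type 600) per bill; NULL where the balance is on the other side
SELECT b.id_bill, 600,
       CASE WHEN SUM(d.deb) - SUM(d.cre) >= 0 THEN ABS(SUM(d.deb) - SUM(d.cre)) END,
       CASE WHEN SUM(d.deb) - SUM(d.cre) <= 0 THEN ABS(SUM(d.deb) - SUM(d.cre)) END
FROM bill b JOIN bill_details d ON d.id_bill = b.id_bill
GROUP BY b.id_bill
ORDER BY id_bill, id_type
""").fetchall()

for row in rows:
    print(row)
```

The ORDER BY applies to the combined result, which is what puts each TOTAL row directly after the detail rows of its bill.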

The "where" condition worked not as expected ("or" issue)

I have a problem joining these 4 tables.
Model of my database
I want to count the number of reservations by different criteria (user [mrbs_users.id], room [mrbs_room.room_id], area [mrbs_area.area_id]).
However, when I execute this query (for the user with id = 1):
SELECT count(*)
FROM mrbs_users JOIN mrbs_entry ON mrbs_users.name=mrbs_entry.create_by
JOIN mrbs_room ON mrbs_entry.room_id = mrbs_room.id
JOIN mrbs_area ON mrbs_room.area_id = mrbs_area.id
WHERE mrbs_entry.start_time BETWEEN "145811700" and "1463985000"
or
mrbs_entry.end_time BETWEEN "1458120600" and "1463992200" and mrbs_users.id = 1
The result is the total number of reservations of every user, not just the user who has the id = 1.
So if anyone could help me.. Thanks in advance.
Use parentheses in the where clause whenever you have more than one condition. Your where is parsed as:
WHERE (mrbs_entry.start_time BETWEEN "145811700" and "1463985000" ) or
(mrbs_entry.end_time BETWEEN "1458120600" and "1463992200" and
mrbs_users.id = 1
)
Presumably, you intend:
WHERE (mrbs_entry.start_time BETWEEN 145811700 and 1463985000 or
mrbs_entry.end_time BETWEEN 1458120600 and 1463992200
) and
mrbs_users.id = 1
Also, I removed the quotes around the string constants. It is bad practice to mix data types, and in some databases, the conversion between types can make the query less efficient.
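The precedence issue is easy to reproduce on a toy table. The sketch below uses in-memory SQLite from Python with invented data and a simplified single table called entry (not the real MRBS schema): AND binds tighter than OR, so without parentheses the user filter only applies to the second BETWEEN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entry (user_id INTEGER, start_time INTEGER, end_time INTEGER);
-- Invented sample data: user 1 has one matching reservation, user 2 has two
INSERT INTO entry VALUES (1, 100, 200), (2, 110, 210), (2, 120, 220), (1, 900, 950);
""")

# Parsed as: start_time BETWEEN ... OR (end_time BETWEEN ... AND user_id = 1)
unparenthesized = conn.execute("""
SELECT COUNT(*) FROM entry
WHERE start_time BETWEEN 100 AND 150
   OR end_time BETWEEN 200 AND 250 AND user_id = 1
""").fetchone()[0]

# The intended meaning: (either time range matches) AND user_id = 1
parenthesized = conn.execute("""
SELECT COUNT(*) FROM entry
WHERE (start_time BETWEEN 100 AND 150 OR end_time BETWEEN 200 AND 250)
  AND user_id = 1
""").fetchone()[0]

print(unparenthesized, parenthesized)
```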
The problem you've faced is caused by an incorrectly grouped WHERE condition. AND binds more tightly than OR, so the user filter only applied to the second BETWEEN.
So, it should be:
WHERE (mrbs_entry.start_time BETWEEN 145811700 AND 1463985000
OR
mrbs_entry.end_time BETWEEN 1458120600 AND 1463992200)
AND mrbs_users.id = 1
Moreover, when you use only INNER JOIN (JOIN), you can put the criteria in the ON clause instead of a WHERE clause. For inner joins both forms return the same rows, and query optimizers generally treat them alike, so this is mostly a matter of style rather than speed.
Your query in this case would look like this:
SELECT COUNT(*)
FROM mrbs_users
JOIN mrbs_entry ON mrbs_users.name=mrbs_entry.create_by
JOIN mrbs_room ON mrbs_entry.room_id = mrbs_room.id
AND
(mrbs_entry.start_time BETWEEN 145811700 AND 1463985000
OR mrbs_entry.end_time BETWEEN 1458120600 AND 1463992200
)
AND mrbs_users.id = 1
JOIN mrbs_area ON mrbs_room.area_id = mrbs_area.id

SQL SUM function doubling the amount it should using multiple tables

My query below is doubling the amount on the last record it returns. I have 3 tables - activities, bookings and tempbookings. The query needs to list the activities and attached information and pull the total number (using the SUM) of places booked (as BookingTotal) from the booking table by each activity and then it needs to calculate the same for tempbookings (as tempPlacesReserved) providing the reservedate field inside that table is in the future.
However, the first issue is that if there are no records for an activity in the tempbookings table, the query does not return any records for that activity at all. To get around this I created dummy records in the past so that it still returns the record, but if I can make it so I don't have to do this I would prefer it!
The main issue I have is that on the final record of the returned results it doubles the booking total and the places reserved which of course makes the whole query useless.
I know that I am doing something wrong I just haven't been able to sort it, I have searched similar issues online but am unable to apply them to my situation correctly.
Any help would be appreciated.
P.S. I'm aware that normally you wouldn't need to fully label all the paths to the databases, tables and fields as I have but for the program I am planning to use it in I have to do it this way.
Code:
SELECT [LeisureActivities].[dbo].[activities].[activityID],
[LeisureActivities].[dbo].[activities].[activityName],
[LeisureActivities].[dbo].[activities].[activityDate],
[LeisureActivities].[dbo].[activities].[activityPlaces],
[LeisureActivities].[dbo].[activities].[activityPrice],
SUM([LeisureActivities].[dbo].[bookings].[bookingPlaces]) AS 'bookingTotal',
SUM(CASE WHEN [LeisureActivities].[dbo].[tempbookings].[tempReserveDate] > GetDate() THEN [LeisureActivities].[dbo].[tempbookings].[tempPlaces] ELSE 0 END) AS 'tempPlacesReserved'
FROM [LeisureActivities].[dbo].[activities],
[LeisureActivities].[dbo].[bookings],
[LeisureActivities].[dbo].[tempbookings]
WHERE ([LeisureActivities].[dbo].[activities].[activityID]=[LeisureActivities].[dbo].[bookings].[activityID]
AND [LeisureActivities].[dbo].[activities].[activityID]=[LeisureActivities].[dbo].[tempbookings].[tempActivityID])
AND [LeisureActivities].[dbo].[activities].[activityDate] > GetDate ()
GROUP BY [LeisureActivities].[dbo].[activities].[activityID],
[LeisureActivities].[dbo].[activities].[activityName],
[LeisureActivities].[dbo].[activities].[activityDate],
[LeisureActivities].[dbo].[activities].[activityPlaces],
[LeisureActivities].[dbo].[activities].[activityPrice];
Your current query is using an INNER JOIN between each of the tables so if the tempBookings table has no records, you will not return anything.
I would advise that you start to use JOIN syntax. You might also need to use subqueries to get the totals.
SELECT a.[activityID],
a.[activityName],
a.[activityDate],
a.[activityPlaces],
a.[activityPrice],
coalesce(b.bookingTotal, 0) bookingTotal,
coalesce(t.tempPlacesReserved, 0) tempPlacesReserved
FROM [LeisureActivities].[dbo].[activities] a
LEFT JOIN
(
select activityID,
SUM([bookingPlaces]) AS bookingTotal
from [LeisureActivities].[dbo].[bookings]
group by activityID
) b
ON a.[activityID]=b.[activityID]
LEFT JOIN
(
select tempActivityID,
SUM(CASE WHEN [tempReserveDate] > GetDate() THEN [tempPlaces] ELSE 0 end) AS tempPlacesReserved
from [LeisureActivities].[dbo].[tempbookings]
group by tempActivityID
) t
ON a.[activityID]=t.[tempActivityID]
WHERE a.[activityDate] > GetDate();
Note: I am using aliases because it is easier to read
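One likely reason the sums come out too high is join fan-out: when an activity has several rows in bookings and several in tempbookings, joining both directly pairs every booking row with every tempbooking row before SUM runs. The sketch below reproduces that on invented data in in-memory SQLite (driven from Python), and shows that pre-aggregating in subqueries avoids it. The date filters are left out to keep it short, so this is an illustration of the mechanism rather than a drop-in replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activities (activityID INTEGER PRIMARY KEY);
CREATE TABLE bookings (activityID INTEGER, bookingPlaces INTEGER);
CREATE TABLE tempbookings (tempActivityID INTEGER, tempPlaces INTEGER);
-- Invented sample data: activity 2 has no tempbookings at all
INSERT INTO activities VALUES (1), (2);
INSERT INTO bookings VALUES (1, 2), (1, 3), (2, 4);
INSERT INTO tempbookings VALUES (1, 1), (1, 1);
""")

# Direct three-way join: 2 bookings x 2 tempbookings = 4 rows for activity 1
fanned_out = conn.execute("""
SELECT a.activityID, SUM(b.bookingPlaces), SUM(t.tempPlaces)
FROM activities a
JOIN bookings b ON b.activityID = a.activityID
JOIN tempbookings t ON t.tempActivityID = a.activityID
GROUP BY a.activityID
ORDER BY a.activityID
""").fetchall()

# Aggregate each child table first, then LEFT JOIN one row per activity
pre_aggregated = conn.execute("""
SELECT a.activityID, COALESCE(b.total, 0), COALESCE(t.total, 0)
FROM activities a
LEFT JOIN (SELECT activityID, SUM(bookingPlaces) AS total
           FROM bookings GROUP BY activityID) b
       ON b.activityID = a.activityID
LEFT JOIN (SELECT tempActivityID, SUM(tempPlaces) AS total
           FROM tempbookings GROUP BY tempActivityID) t
       ON t.tempActivityID = a.activityID
ORDER BY a.activityID
""").fetchall()

print(fanned_out)
print(pre_aggregated)
```

In the direct join, activity 1 reports inflated totals and activity 2 disappears; with the subqueries each activity appears once with its own totals.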
Use the SQL-92 JOIN syntax, and make the join to tempbookings an outer join. Also clean up your SQL with table aliases; it makes the query easier to read.
As to why the last row has doubled values, I don't know, but on the off chance that it is caused by the extra dummy records you entered, get rid of them. The outer join to tempbookings makes them unnecessary.
The other possibility is that the join condition you had on the tempbookings table (t.tempActivityID = a.activityID) is insufficient to guarantee that each row matches only one record in the activities table. If, for example, it matches two records in activities, then the rows from tempbookings would be repeated twice in the output, causing the sum to be doubled.
SELECT a.activityID, a.activityName, a.activityDate,
a.activityPlaces, a.activityPrice,
SUM(b.bookingPlaces) bookingTotal,
SUM (CASE WHEN t.tempReserveDate > GetDate()
THEN t.tempPlaces ELSE 0 end) tempPlacesReserved
FROM LeisureActivities.dbo.activities a
Join LeisureActivities.dbo.bookings b
On b.activityID = a.activityID
Left Join LeisureActivities.dbo.tempbookings t
On t.tempActivityID = a.activityID
WHERE a.activityDate > GetDate ()
GROUP BY a.activityID, a.activityName,
a.activityDate, a.activityPlaces,
a.activityPrice;

How do you explicitly show rows which have count(*) equal to 0

The query I'm running in DB2
select yrb_customer.name,
yrb_customer.city,
CASE count(*) WHEN 0 THEN 0 ELSE count(*) END as #UniClubs
from yrb_member, yrb_customer
where yrb_member.cid = yrb_customer.cid and yrb_member.club like '%Club%'
group by yrb_customer.name, yrb_customer.city order by count(*)
This shows me people who are members of clubs with the word 'Club' in the name, along with their name and city and how many such clubs they belong to (#UniClubs). However, students who are not part of any such club are hidden, and I would like them to show up with a count of 0 instead. I cannot get this behaviour with count(*). Can somebody shed some light? I can explain further if the above is not clear enough.
I'm not familiar with DB2 so I'm taking a stab in the dark, but try this:
select yrb_customer.name,
yrb_customer.city,
SUM(CASE WHEN yrb_member.club like '%Club%' THEN 1 ELSE 0 END) as #UniClubs
from yrb_member, yrb_customer
where yrb_member.cid = yrb_customer.cid
group by yrb_customer.name, yrb_customer.city order by count(*)
Basically you don't want to filter for %Club% in your WHERE clause because you want ALL rows to come back.
You're going to want a LEFT JOIN:
SELECT yrb_customer.name, yrb_customer.city,
COUNT(yrb_member.club) as clubCount
FROM yrb_customer
LEFT JOIN yrb_member
ON yrb_member.cid = yrb_customer.cid
AND yrb_member.club LIKE '%Club%'
GROUP BY yrb_customer.name, yrb_customer.city
ORDER BY clubCount
Also, if the tuple (yrb_customer.name, yrb_customer.city) is unique (or is supposed to be - are you counting all students with the same name as the same person?), you might get better performance out of the following:
SELECT yrb_customer.name, yrb_customer.city,
COALESCE(club.count, 0)
FROM yrb_customer
LEFT JOIN (SELECT cid, COUNT(*) as count
FROM yrb_member
WHERE club LIKE '%Club%'
GROUP BY cid) club
ON club.cid = yrb_customer.cid
ORDER BY club.count
The reason that your original results were being hidden was because in your original query, you have an implicit inner join, which of course requires matching rows. The implicit-join syntax (comma-separated FROM clause) is great for inner (regular) joins, but is terrible for left-joins, which is what you really needed. The use of the implicit-join syntax (and certain types of related filtering in the WHERE clause) is considered deprecated.
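Here is a small runnable sketch of the difference, using in-memory SQLite from Python with invented customers and clubs (the table and column names are simplified stand-ins, not the real yrb_ schema). With the LIKE filter in the ON clause, customers without a matching club survive the LEFT JOIN and COUNT(m.club) gives 0; moving the same filter to WHERE removes them again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE member (cid INTEGER, club TEXT);
-- Invented sample data: Bob has no memberships, Cid has only a non-matching one
INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
INSERT INTO member VALUES (1, 'Chess Club'), (1, 'Film Club'), (1, 'Choir'), (3, 'Choir');
""")

filter_in_on = conn.execute("""
SELECT c.name, COUNT(m.club) AS clubCount   -- COUNT(column) skips NULLs
FROM customer c
LEFT JOIN member m
       ON m.cid = c.cid
      AND m.club LIKE '%Club%'
GROUP BY c.name
ORDER BY clubCount, c.name
""").fetchall()

filter_in_where = conn.execute("""
SELECT c.name, COUNT(m.club) AS clubCount
FROM customer c
LEFT JOIN member m ON m.cid = c.cid
WHERE m.club LIKE '%Club%'                  -- discards the NULL-extended rows
GROUP BY c.name
ORDER BY clubCount, c.name
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```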

SQL nested Select....I think?

I have 2 tables UnitProd and Unit.
Unit = unitproductivityid,unitid, unitnumber, fleet
UnitProd = unitproductivityid, day, shipweight, stops
I have multiple units in each table and I am trying to do group by functions to get counts of different things.(The tables have more fields than specified this is just example purposes.)
So basically I have the following:
SELECT
u.[Fleet]
,u.[Unit]
,up.[Day]
,((SUM(up.[Shipment_Weight]))/2000) AS [ShipmentWeight]
,((SUM(up.[Shipment_Weight]))/COUNT(up.[Stops])) AS [ShpmntAvg]
FROM
[dbo].[UnitProductivity] u
INNER JOIN [dbo].[UnitProductivityDetails] up
ON u.UnitProductivityId = up.UnitProductivityId
GROUP BY u.fleet, u.unit
The issue I am having is that some up.[Stops] fields have a 0 in them, and I want to exclude these. A unit has 1-30 days no matter what, and some of those days have a 0 as [Stops], so I want to count only the days with a stop. Would I use a nested select here, and how?
Thanks
Unless I am misunderstanding the question, you don't need a nested SELECT.
Just add the following before your GROUP BY:
WHERE up.[Stops] > 0
It doesn't have to be nested, but here's a simple way:
SELECT * FROM Unit WHERE unitproductivityid IN (SELECT unitproductivityid FROM UnitProd WHERE stops > 0)
Good luck!
After your GROUP BY line, add HAVING COUNT(up.[Stops]) > 0
With existing code you can do HAVING clause:
SELECT
u.[Fleet]
,u.[Unit]
,up.[Day]
,((SUM(up.[Shipment_Weight]))/2000) AS [ShipmentWeight]
,((SUM(up.[Shipment_Weight]))/COUNT(up.[Stops])) AS [ShpmntAvg]
FROM
[dbo].[UnitProductivity] u
INNER JOIN [dbo].[UnitProductivityDetails] up
ON u.UnitProductivityId = up.UnitProductivityId
GROUP BY u.fleet, u.unit
HAVING COUNT(up.[Stops]) > 0
More about HAVING clause on: http://www.w3schools.com/sql/sql_having.asp
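The suggestions above do not all behave the same way, so here is a small comparison on invented data, run against in-memory SQLite from Python. It uses a single simplified table called unit_days rather than the two real tables. COUNT(column) counts every non-NULL value, including 0, so a HAVING on that count does not drop zero-stop days; a WHERE filter or a CASE inside COUNT does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE unit_days (unit TEXT, day INTEGER, weight INTEGER, stops INTEGER);
-- Invented sample data: day 2 has zero stops
INSERT INTO unit_days VALUES ('A', 1, 100, 2), ('A', 2, 0, 0), ('A', 3, 200, 1);
""")

# COUNT(stops) counts the 0 as well, because 0 is not NULL
all_days = conn.execute(
    "SELECT COUNT(stops) FROM unit_days GROUP BY unit"
).fetchone()[0]

# WHERE removes the zero-stop rows before grouping
where_days = conn.execute(
    "SELECT COUNT(stops) FROM unit_days WHERE stops > 0 GROUP BY unit"
).fetchone()[0]

# CASE without ELSE yields NULL for zero-stop rows, which COUNT ignores,
# while SUM(weight) still sees every row
case_row = conn.execute("""
SELECT SUM(weight), COUNT(CASE WHEN stops > 0 THEN 1 END)
FROM unit_days GROUP BY unit
""").fetchone()

print(all_days, where_days, case_row)
```

Which form fits depends on whether the zero-stop days should also be excluded from the weight total (WHERE) or only from the day count (CASE).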