PostgreSQL: how do you SELECT DISTINCT relations and order by different fields depending on WHERE clause? - sql

Each account is associated with one person and one type of account. I want to SELECT a distinct subset of accounts. In order to be selected the accounts have meet at least one of two criteria. If an account occurs twice
I want to order this result set based on two different fields. This was my attempt:
Select DISTINCT a.*
FROM people AS p
JOIN accounts AS a
ON a.people_id = p.id
JOIN type_account AS t
ON t.type_id = a.id
WHERE t.id IN(1,3,5)
OR p.id IN(2,4,6)
ORDER BY(CASE
WHEN p.id IN(2,4,6) THEN p.updated_at
WHEN t.id IN(1,3,5) THEN p.created_at) AS position
And I got this error: SELECT DISTINCT, ORDER BY expressions must appear in select list
If I move the case statement to the select it possible for one account (associated with different people) to appear in the results twice, i.e. once when the first where clause is met and twice when the second where clause is met. In this case the accounts will be appearing twice in the result set.
I am having trouble wrapping my head around this one. Any help would be appreciated :)

Move your CASE statement(s) into the SELECT clause, then order on their position:
SELECT
CASE
WHEN p.id IN(2,4,6) THEN p.updated_at
WHEN t.id IN(1,3,5) THEN p.created_at
END AS position,
DISTINCT a.*
FROM people AS p
JOIN accounts AS a
ON a.people_id = p.id
JOIN type_account AS t
ON t.type_id = a.id
WHERE t.id IN(1,3,5)
OR p.id IN(2,4,6)
ORDER BY 1 DESC
LIMIT 1;

Related

Too much Data using DISTINCT MAX

I want to see the last activity each individual handset and the user that used that handset. I have a table UserSessions that stores the last activity of a particular user as well as what handset they used in that activity. There are roughly 40 handsets, yet I always get back way too many records, like 10,000 rows when I only want the last activity of each handset. What am I doing wrong?
SELECT DISTINCT MAX(UserSessions.LastActivity), Handsets.Name,Users.Username
FROM UserSessions
INNER JOIN Handsets on Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users on Users.UserId = UserSessions.UserId
WHERE
Handsets.Name in (1000,1001.1002,1003,1004....)
AND Handsets.Deleted = 0
GROUP BY UserSessions.LastActivity, Handsets.Name,Users.Username
I expect to get one record per handset of the users last activity with that handset. What I get is multiple records on all handsets and dates over 10000 rows
You typically GROUP BY the same columns as you SELECT, except those who are arguments to set functions.
This GROUP BY returns no duplicates, so SELECT DISTINCT isn't needed.
SELECT MAX(UserSessions.LastActivity), Handsets.Name, Users.Username
FROM UserSessions
INNER JOIN Handsets on Handsets.HandsetId = UserSessions.HandsetId
INNER JOIN Users on Users.UserId = UserSessions.UserId
WHERE Handsets.Name in (1000,1001.1002,1003,1004....)
AND Handsets.Deleted = 0
GROUP BY Handsets.Name, Users.Username
There is no such thing as DISTINCT MAX. You have SELECT DISTINCT which ensures that all columns referenced in the SELECT are not duplicated (as a group) across multiple rows. And there is MAX() an aggregation function.
As a note: SELECT DISTINCT is almost never appropriate with GROUP BY.
You seem to want:
SELECT *
FROM (SELECT h.Name, u.Username, MAX(us.LastActivity) as last_activity,
RANK() OVER (PARTITION BY h.Name ORDER BY MAX(us.LastActivity) desc) as seqnum
FROM UserSessions us JOIN
Handsets h
ON h.HandsetId = us.HandsetId INNER JOIN
Users u
ON u.UserId = us.UserId
WHERE h.Name in (1000,1001.1002,1003,1004....) AND
h.Deleted = 0
GROUP BY h.Name, u.Username
) h
WHERE seqnum = 1

Firebird can't recognize calculated column in group by clause

I have the following SQL:
select
inv.salesman_id,
(select salesman_goals.goal from salesman_goals
where salesman_goals.salesman_id = inv.salesman_id
and salesman_goals.group_id = g.group_id
and salesman_goals.subgroup_id = sg.subgroup_id
and salesman_goals.variation_id = v.variation_id)
as goal,
sum(i.quantity) as qnt
from invoiceitem i
inner join invoice inv on inv.invoice_id = i.invoice_id
inner join product p on p.product_id = i.product_id
left join groups g on g.group_id = p.group_id
left join subgroup sg on sg.group_id = g.group_id and sg.subgroup_id = p.subgroup_id
left join variation v on v.group_id = sg.group_id and v.subgroup_id = sg.subgroup_id and v.variation_id = p.variation_id
group by
1,2
which returns three columns, the first one is the salesman id, the second is a sub select to get the sales quantity goal, and the third is the actual sales quantity.
Even grouping by the first and second columns, firebird throws an error when executing the query:
Invalid expression in the select list (not contained in either an aggregate function or the GROUP BY clause).
What's the reason for this?
There is a column "in the select list (not contained in either an aggregate function or the GROUP BY clause)". Namely each column you mention in your subselect other than inv.salesman_id. Such a column has many values per group. When there is a GROUP BY (or just a HAVING, implicitly grouping by all columns) a SELECT clause returns one row per group. There is no single value to return. So you want (as you put in an answer yourself):
group by
inv.salesman_id,
g.group_id,
sg.subgroup_id,
v.variation_id
OK guys i found the solution for this problem.
The thing is, if you have a sub query in a column which will be in the group by clause, the parameters inside this sub query must also appear in the group by. So in this case, all i had to do was:
group by
inv.salesman_id,
g.group_id,
sg.subgroup_id,
v.variation_id
And that's it. Hope it helps if someone has the same issue in the future.

How to list unused items from database

MDW_CUSTOMER_ACCOUNTS has the following fields: ACCOUNT_ID, MEAL_ID.
MDW_MEALS_MENU has the following fields: MEAL_ID, MEAL_NAME.
I am trying to generate a report on the number of times a particular meal has been subscribed to by a customer using the query,
SELECT count(a.account_id), b.meal_id, b.meal_name
FROM mdw_meals_menu b LEFT JOIN mdw_customer_accounts a
on b.meal_id=a.meal_id
WHERE
a.start_date BETWEEN to_date('01-APR-2013','DD-MON-YYYY')
AND to_date('30-JUN-2013','DD-MON-YYYY')
GROUP BY b.meal_id, b.meal_name
ORDER BY count(a.account_id) desc, b.meal_id;
This only lists the MEAL_IDs that has been subscribed to at least once. But it is not displaying the Ids that have not been subscribed to.
How do I get these MEAL_IDs to print with the count being 0?
i have modified the code, but still i get the same result.
Your where clause is effectively turning your outer join back into an inner join - conditions on an outer-joined table should generally be in the join clause, like so:
SELECT count(a.account_id), b.meal_id, b.meal_name
FROM mdw_meals_menu b
LEFT JOIN mdw_customer_accounts a
on b.meal_id=a.meal_id and
a.start_date BETWEEN to_date('01-APR-2013','DD-MON-YYYY')
AND to_date('30-JUN-2013','DD-MON-YYYY')
GROUP BY b.meal_id, b.meal_name
ORDER BY count(a.account_id) desc, b.meal_id;
You should use a left outer join .

Distinct on multi-columns in sql

I have this query in sql
select cartlines.id,cartlines.pageId,cartlines.quantity,cartlines.price
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
I want to get rows distinct by pageid ,so in the end I will not have rows with same pageid more then once(duplicate)
any Ideas
Thanks
Baaroz
Going by what you're expecting in the output and your comment that says "...if there rows in output that contain same pageid only one will be shown...," it sounds like you're trying to get the top record for each page ID. This can be achieved with ROW_NUMBER() and PARTITION BY:
SELECT *
FROM (
SELECT
ROW_NUMBER() OVER(PARTITION BY c.pageId ORDER BY c.pageID) rowNumber,
c.id,
c.pageId,
c.quantity,
c.price
FROM orders o
INNER JOIN cartlines c ON c.orderId = o.id
WHERE userId = 5
) a
WHERE a.rowNumber = 1
You can also use ROW_NUMBER() OVER(PARTITION BY ... along with TOP 1 WITH TIES, but it runs a little slower (despite being WAY cleaner):
SELECT TOP 1 WITH TIES c.id, c.pageId, c.quantity, c.price
FROM orders o
INNER JOIN cartlines c ON c.orderId = o.id
WHERE userId = 5
ORDER BY ROW_NUMBER() OVER(PARTITION BY c.pageId ORDER BY c.pageID)
If you wish to remove rows with all columns duplicated this is solved by simply adding a distinct in your query.
select distinct cartlines.id,cartlines.pageId,cartlines.quantity,cartlines.price
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
If however, this makes no difference, it means the other columns have different values, so the combinations of column values creates distinct (unique) rows.
As Michael Berkowski stated in comments:
DISTINCT - does operate over all columns in the SELECT list, so we
need to understand your special case better.
In the case that simply adding distinct does not cover you, you need to also remove the columns that are different from row to row, or use aggregate functions to get aggregate values per cartlines.
Example - total quantity per distinct pageId:
select distinct cartlines.id,cartlines.pageId, sum(cartlines.quantity)
from orders
INNER JOIN
cartlines on(cartlines.orderId=orders.id)where userId=5
If this is still not what you wish, you need to give us data and specify better what it is you want.

i want to modify this SQL statement to return only distinct rows of a column

select
picks.`fbid`,
picks.`time`,
categories.`name` as cname,
options.`name` as oname,
users.`name`
from
picks
left join categories
on (categories.`id` = picks.`cid`)
left join options
on (options.`id` = picks.oid)
left join users
on (users.fbid = picks.`fbid`)
order by
time desc
that query returns a result that like:
my question is.... I would like to modify the query to select only DISTINCT fbid's. (perhaps the first row only sorted by time)
can someone help with this?
select
p2.fbid,
p2.time,
c.`name` as cname,
o.`name` as oname,
u.`name`
from
( select p1.fbid,
min( p1.time ) FirstTimePerID
from picks p1
group by p1.fbid ) as FirstPerID
JOIN Picks p2
on FirstPerID.fbid = p2.fbid
AND FirstPerID.FirstTimePerID = p2.time
LEFT JOIN Categories c
on p2.cid = c.id
LEFT JOIN Options o
on p2.oid = o.id
LEFT JOIN Users u
on p2.fbid = u.fbid
order by
time desc
I don't know why you originally had LEFT JOINs, as it appears that all picks must be associated with a valid category, option and user... I would then remove the left, and change them to INNER joins instead.
The first inner query grabs for each fbid, the FIRST entry time which will result in a single entity for the FBID. From that, it re-joins to the picks table for the same ID and timeslot... then continues for the rest of the category, options, users join criteria of that single entry.
2 options, you could write a group by clause.
Or you could write a nested query joined back to itself to get pertinent info.
Nested aliased table:
SELECT
n.fBids
FROM
MyTable t
INNER JOIN
(SELECT DISTINCT fBids
FROM MyTable) n
ON n.ID = t.ID
Or group by option
SELECT fBId from MyTable
GROUP BY fBID
select picks.`fbid`, picks.`time`, categories.`name` as cname,
options.`name` as oname, users.`name` from picks left join categories
on (categories.`id` = picks.`cid`) left join options on (options.`id` = picks.oid)
left join users on (users.fbid = picks.`fbid`)
order by time desc GROUP BY picks.`fbid`
select
picks.fbid,
MIN(picks.time) as first_time,
MAX(picks.time) as last_time
from
picks
group by
picks.fbid
order by
MIN(picks.time) desc
However, if you want only distinct fbid's you cannot display cname and other columns at the same time.