PostgreSQL using sum in where clause - sql

I have a table with a numeric column named 'capacity'. I want to select the first rows whose total capacity is no greater than X, something like this query:
select * from table where sum(capacity )<X
But I know I cannot use aggregate functions in the WHERE clause. So what other ways exist to solve this problem?
Here is some sample data
id| capacity
1 | 12
2 | 13.5
3 | 15
I want to list the rows, in order of id, whose running sum is less than 26, so a query like this:
select * from table where sum(capacity )<26 order by id
and it must give me
id| capacity
1 | 12
2 | 13.5
because 12+13.5<26

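A bit late to the party, but for future reference, the following should work for a problem similar to the OP's. Note that HAVING filters each id group on its own sum; it does not compute a running total across rows: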
SELECT id, sum(capacity)
FROM table
GROUP BY id
HAVING sum(capacity) < 26
ORDER BY id ASC;
Use the PostgreSQL docs for reference to aggregate functions: https://www.postgresql.org/docs/9.1/tutorial-agg.html

Use a HAVING clause. It goes after GROUP BY and before ORDER BY:
select id, sum(capacity) from table group by id having sum(capacity)<X order by id

You can use the window variant of sum to produce a cumulative sum, and then use it in the where clause. Note that window functions can't be placed directly in the where clause, so you'd need a subquery:
SELECT id, capacity
FROM (SELECT id, capacity, SUM(capacity) OVER (ORDER BY id ASC) AS cum_sum
FROM mytable) t
WHERE cum_sum < 26
ORDER BY id ASC;
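To try the cumulative-sum query without a PostgreSQL server, here is a minimal sketch that runs the same SQL through Python's built-in sqlite3 module. It assumes a SQLite build with window function support (3.25 or later); the function name rows_within_running_total is made up for illustration.

```python
import sqlite3

def rows_within_running_total(rows, limit):
    """Return (id, capacity) rows, in id order, whose running total is below limit."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, capacity REAL)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    # Same shape as the answer's query: the window function is computed in the
    # subquery, and the outer query filters on its result.
    query = """
        SELECT id, capacity
        FROM (SELECT id, capacity,
                     SUM(capacity) OVER (ORDER BY id ASC) AS cum_sum
              FROM mytable) t
        WHERE cum_sum < ?
        ORDER BY id ASC
    """
    result = conn.execute(query, (limit,)).fetchall()
    conn.close()
    return result

# Sample data from the question.
sample = [(1, 12), (2, 13.5), (3, 15)]
print(rows_within_running_total(sample, 26))  # [(1, 12.0), (2, 13.5)]
```

The running totals are 12, 25.5 and 40.5, so only the first two rows pass the filter.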

Related

Selecting pair(including reverse order) with highest date value

I have a messages table like this
Messages Table
I want to select each unique pair (treating reversed order as the same pair) with the highest date. The result of the select statement would look like this:
from_id | to_id | date | message
1 | 2 | 13:06 | I'm Alp
2 | 3 | 13:06 | I'm Oliver
3 | 1 | 11:38 | From third to one
I tried to use distinct with max function but it didn't help.
You can use window functions:
select *
from (
select m.*,
row_number() over(partition by min(from_id, to_id), max(from_id, to_id) order by date desc) rn
from messages m
) m
where rn = 1
Note: counter-intuitively enough, SQLite's min() and max() functions, when given several arguments, are the equivalent of least() and greatest() in other databases.
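The query above can be exercised with Python's sqlite3 module. This is a sketch under assumptions: the original messages table was not preserved, so the older message in each pair below is hypothetical sample data, and the helper name latest_per_pair is invented. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

def latest_per_pair(messages):
    """Return the newest message for each unordered (from_id, to_id) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (from_id INTEGER, to_id INTEGER, date TEXT, message TEXT)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", messages)
    # Two-argument min()/max() are scalar functions in SQLite, so (1, 2) and
    # (2, 1) land in the same partition.
    query = """
        SELECT from_id, to_id, date, message
        FROM (
            SELECT m.*,
                   row_number() OVER (
                       PARTITION BY min(from_id, to_id), max(from_id, to_id)
                       ORDER BY date DESC
                   ) AS rn
            FROM messages m
        ) AS ranked
        WHERE rn = 1
        ORDER BY from_id, to_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical rows: the expected winners from the question plus one older
# message per pair in the reverse direction.
sample = [
    (1, 2, "13:06", "I'm Alp"),
    (2, 1, "12:00", "Hello"),
    (2, 3, "13:06", "I'm Oliver"),
    (3, 2, "10:15", "Hi"),
    (3, 1, "11:38", "From third to one"),
    (1, 3, "09:00", "Ping"),
]
for row in latest_per_pair(sample):
    print(row)
```

The zero-padded HH:MM strings sort correctly as text here; a real table would more likely store a full timestamp.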

Redshift percentile_disc query and group by

I have a table that looks like this (containing the number of times a particular user has visited a particular page)
n | context_page_path | user_id
--------------------------------
10 | /some/path/ | 1
23 | /some/path/ | 2
30 | /some/other/p/ | 1
...
I'm trying to get the 75th percentile of visits to each page like so:
select
context_page_path,
percentile_disc(0.75) within group (order by n) over (partition by context_page_path) as percentile_75
from my_table
group by context_page_path
However, when I run this query, Redshift wants me to include n in the group by clause.
I'm not sure why it's asking for this?
If I want the average, I can do that easily like so with no complaints.
select
context_page_path,
avg(n)
from my_table
group by context_page_path
When written with an OVER clause, percentile_disc() is a window function rather than an aggregate function, so it does not combine with GROUP BY the way avg() does. You can use:
select distinct context_page_path,
percentile_disc(0.75) within group (order by n) over (partition by context_page_path) as percentile_75
from my_table;
You are trying to use the Window Function syntax instead of the Aggregate Function syntax. Try this:
select
context_page_path,
approximate percentile_disc(0.75) within group (order by n) as percentile_75
from my_table
group by context_page_path
https://docs.aws.amazon.com/redshift/latest/dg/r_APPROXIMATE_PERCENTILE_DISC.html
It's a bit confusing but the two functions are in two different sections of the docs.
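To make clear what the per-page result should be, here is a plain Python sketch of the usual definition of a discrete percentile: the first value, in sorted order, whose cumulative distribution reaches the requested fraction. This illustrates the definition only and is not Redshift's implementation; the approximate variant may return a different value on large inputs. The function names are invented for the example.

```python
from collections import defaultdict

def percentile_disc(values, fraction):
    """First sorted value whose cumulative distribution is >= fraction."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    for k, value in enumerate(ordered, start=1):
        # k / len(ordered) is the cumulative distribution of the k-th value.
        if k / len(ordered) >= fraction:
            return value

def percentile_by_page(rows, fraction):
    """Group (n, context_page_path, user_id) rows by page and take the percentile of n."""
    visits = defaultdict(list)
    for n, path, _user_id in rows:
        visits[path].append(n)
    return {path: percentile_disc(ns, fraction) for path, ns in visits.items()}

# The three rows shown in the question.
sample = [(10, "/some/path/", 1), (23, "/some/path/", 2), (30, "/some/other/p/", 1)]
print(percentile_by_page(sample, 0.75))
```

One value per page comes back, which is what the GROUP BY form of the query returns and what SELECT DISTINCT recovers from the window form.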

SQL select top rows based on limit

Please help me write the select query below.
Source table
name Amount
-----------
A 2
B 3
C 2
D 7
if limit is 5 then result table should be
name Amount
-----------
A 2
B 3
if limit is 8 then result table
name Amount
-----------
A 2
B 3
C 2
You can use a window function to achieve this:
select name,
amount
from (
select t.*,
sum(amount) over (
order by name
) s
from your_table t
) t
where s <= 8;
The analytic function sum is accumulated row by row, following the order given by order by name.
Once you have the running sum up to each row, a simple where clause keeps the rows for which that sum is less than or equal to the given limit.
More on this topic:
The SQL OVER() clause - when and why is it useful?
https://explainextended.com/2009/03/08/analytic-functions-sum-avg-row_number/
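The row-by-row accumulation described in this answer can be mirrored in a few lines of Python, which may help when checking the expected output by hand. This is only an illustration of the logic, not a replacement for the SQL; the function name rows_up_to_limit is made up.

```python
from itertools import accumulate

def rows_up_to_limit(rows, limit):
    """Keep (name, amount) rows, in name order, whose running sum is <= limit."""
    ordered = sorted(rows, key=lambda row: row[0])
    # accumulate() yields the running sum after each row, like
    # SUM(amount) OVER (ORDER BY name) when the names are unique.
    running = accumulate(amount for _name, amount in ordered)
    return [row for row, total in zip(ordered, running) if total <= limit]

source = [("A", 2), ("B", 3), ("C", 2), ("D", 7)]
print(rows_up_to_limit(source, 5))  # [('A', 2), ('B', 3)]
print(rows_up_to_limit(source, 8))  # [('A', 2), ('B', 3), ('C', 2)]
```

One difference to keep in mind: if the ORDER BY column contains duplicates, the SQL default window frame adds tied rows together in a single step, so an explicit ROWS frame or a unique sort key is needed to get strictly row-by-row sums.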

Summing and ordering at once

I have a table of orders. There I need to find out which 3 partner_id's have made the largest sum of amount_totals, and sort those 3 from biggest to smallest.
testdb=# SELECT amount_total, partner_id FROM sale_order;
amount_total | partner_id
--------------+------------
1244.00 | 9
3065.90 | 12
3600.00 | 3
2263.00 | 25
3000.00 | 10
3263.00 | 3
123.00 | 25
5400.00 | 12
(8 rows)
Just starting SQL, I find it confusing ...
Aggregated amounts
If you want to list aggregated amounts, it can be as simple as:
SELECT partner_id, sum(amount_total) AS amount_supertotal
FROM sale_order
GROUP BY 1
ORDER BY 2 DESC
LIMIT 3;
The 1 in GROUP BY 1 is a positional reference to the first item in the SELECT list. It is just a notational shortcut for GROUP BY partner_id in this case. Likewise, ORDER BY 2 sorts by the second item, the sum.
This ignores the special case where more than three partners would qualify and picks 3 arbitrarily (for lack of definition).
Individual amounts
SELECT partner_id, amount_total
FROM sale_order
JOIN (
SELECT partner_id, rank() OVER (ORDER BY sum(amount_total) DESC) AS rnk
FROM sale_order
GROUP BY 1
ORDER BY 2
LIMIT 3
) top3 USING (partner_id)
ORDER BY top3.rnk;
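This one, on the other hand, is meant to include all peers if more than 3 partners qualify for the top 3, which is what the window function rank() allows. Note that the LIMIT 3 in the subquery still caps the result at three partners; filtering on rnk <= 3 in an outer query would keep the ties.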
The technique here is to group by partner_id in the subquery top3 and have the window function rank() attach ranks after the aggregation (window functions execute after aggregate functions). ORDER BY is applied after window functions and LIMIT is applied last. All in one subquery.
Then I join the base table to this subquery, so that only the top dogs remain in the result and order by rnk.
Window functions require PostgreSQL 8.4 or later.
This is rather advanced stuff. You should start learning SQL with something simpler probably.
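As a way to experiment with the ranking approach, the sketch below runs a variation of it through Python's sqlite3 module (SQLite 3.25 or later assumed). It differs from the query above in two labeled ways: the aggregation and the ranking are split into two CTEs, and the cut-off is a filter on the rank instead of LIMIT, so partners tied on the last rank are all kept. The names top_partner_orders, totals and ranked are invented for the example.

```python
import sqlite3

# Rows from the question: (amount_total, partner_id).
ORDERS = [
    (1244.00, 9), (3065.90, 12), (3600.00, 3), (2263.00, 25),
    (3000.00, 10), (3263.00, 3), (123.00, 25), (5400.00, 12),
]

def top_partner_orders(orders, top_n=3):
    """Return (partner_id, amount_total, rnk) for partners ranked top_n or better."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sale_order (amount_total REAL, partner_id INTEGER)")
    conn.executemany("INSERT INTO sale_order VALUES (?, ?)", orders)
    query = """
        WITH totals AS (
            SELECT partner_id, sum(amount_total) AS total
            FROM sale_order
            GROUP BY partner_id
        ),
        ranked AS (
            SELECT partner_id, rank() OVER (ORDER BY total DESC) AS rnk
            FROM totals
        )
        SELECT o.partner_id, o.amount_total, r.rnk
        FROM sale_order o
        JOIN ranked r ON r.partner_id = o.partner_id
        WHERE r.rnk <= ?
        ORDER BY r.rnk, o.amount_total DESC
    """
    result = conn.execute(query, (top_n,)).fetchall()
    conn.close()
    return result

for row in top_partner_orders(ORDERS):
    print(row)
```

With the sample data the partners come back in the order 12, 3, 10, each with its individual orders.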
select amount_total, partner_id
from (
select
sum(amount_total) amount_total,
partner_id
from sale_order
group by partner_id
) s
order by amount_total desc
limit 3

Simple SQL select sum and values of same column

I have a co-worker who is working on a table with an 'amount' column.
They would like to get the top 5 amounts and the sum of the amounts in the same query.
I know you could do this:
SELECT TOP 5 amount FROM table
UNION SELECT SUM(amount) FROM table
ORDER BY amount DESC
But this produces results like this:
1000 (sum)
100
70
50
30
20
When what they really need is this:
100 | 1000
70 | 1000
50 | 1000
30 | 1000
20 | 1000
My intuitive attempts to achieve this tend to run into grouping problems, which isn't such an issue when you are selecting a different column, but is when you want to use an aggregate function based on the column you are selecting.
You can use a CROSS JOIN for this:
SELECT TOP 5 a.amount, b.sum
FROM table a
CROSS JOIN (SELECT SUM(amount) sum FROM table) b
ORDER BY amount DESC
This might work:
SELECT TOP 5 amount, (SELECT SUM(amount) FROM table)
FROM table
ORDER BY amount DESC
Not really pretty, but this should do it:
SELECT TOP 5 amount, SAmount
FROM table
CROSS JOIN (SELECT SUM(amount) AS SAmount FROM table) AS s
ORDER BY amount DESC
As said by others, I'd probably use two queries.
Another approach using analytic functions (SQL Server 2005+):
SELECT TOP 5 amount, SUM(amount) OVER()
FROM table
ORDER BY
amount DESC
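The analytic version can be tried locally with Python's sqlite3 module. This is a sketch with assumptions: SQLite has no TOP, so LIMIT stands in for it; the table is called payments because table is a reserved word; and the amounts are made-up sample values, not the asker's data. SQLite 3.25 or later is needed for SUM(...) OVER ().

```python
import sqlite3

def top_amounts_with_total(amounts, top_n=5):
    """Return the top_n amounts, each paired with the sum over all rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE payments (amount INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?)", [(a,) for a in amounts])
    # The empty OVER () makes the window the whole result set, and it is
    # evaluated before LIMIT, so the total covers every row, not just the top_n.
    query = """
        SELECT amount, SUM(amount) OVER () AS total
        FROM payments
        ORDER BY amount DESC
        LIMIT ?
    """
    result = conn.execute(query, (top_n,)).fetchall()
    conn.close()
    return result

# Hypothetical amounts; their total is 285.
for row in top_amounts_with_total([100, 70, 50, 30, 20, 10, 5]):
    print(row)
```

Each of the five rows carries the same total of 285, which is the two-column shape asked for in the question.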