Is it possible to create and use a window function in the same query?

I'm using PostgreSQL and I have the following situation:
table of Sales (short version):
itemid quantity
5 10
5 12
6 1
table of stock (short version):
itemid stock
5 30
6 1
I have a complex query that also needs to present, in one of its columns, the SUM of quantity for each itemid.
So it's going to be:
Select other things,itemid,stock, SUM (quantity) OVER (PARTITION BY itemid) AS total_sales
from .....
sales
stock
This query is OK. However, it will present:
itemid stock total_sales
5 30 22
6 1 1
But I don't need to see itemid=6 because the whole stock was sold, meaning that I need a WHERE condition like:
WHERE total_sales<stock
but I can't do that as the total_sales is created after the WHERE is done.
Is there a way to solve this without surrounding the whole query with another one? I'm trying to avoid it if I can.

You can use a subquery or CTE:
select s.*
from (Select other things,itemid,stock,
SUM(quantity) OVER (PARTITION BY itemid) AS total_sales
from .....
) s
where total_sales < stock;
You cannot use column aliases defined in a SELECT in the SELECT, WHERE, or FROM clauses of that same SELECT. However, a subquery or CTE gets around this restriction.
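As a minimal sketch of the subquery approach, the snippet below runs the pattern against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption; window functions need SQLite 3.25 or later), and the join between sales and stock is hypothetical, since the original FROM clause is elided.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL (assumption, see lead-in).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (itemid INTEGER, quantity INTEGER);
    CREATE TABLE stock (itemid INTEGER, stock INTEGER);
    INSERT INTO sales VALUES (5, 10), (5, 12), (6, 1);
    INSERT INTO stock VALUES (5, 30), (6, 1);
""")

# The window function is evaluated inside the derived table, so the outer
# WHERE can filter on its alias. The join condition is a hypothetical
# stand-in for the elided FROM clause of the question.
query = """
    SELECT DISTINCT itemid, stock, total_sales
    FROM (
        SELECT sa.itemid,
               st.stock,
               SUM(sa.quantity) OVER (PARTITION BY sa.itemid) AS total_sales
        FROM sales sa
        JOIN stock st ON st.itemid = sa.itemid
    ) s
    WHERE total_sales < stock
"""
remaining = conn.execute(query).fetchall()
print(remaining)
```

DISTINCT is only there because the window sum repeats on every sales row of an item; drop it if the other selected columns already make each row unique.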

You can also use an inner select in your WHERE clause like this:
SELECT *, SUM (quantity) OVER (PARTITION BY itemid) AS total_sales
FROM t
WHERE quantity <> (SELECT SUM(quantity) FROM t ti WHERE t.itemid = ti.itemid);

Related

SQL query can't use variable in FROM statement

I'm new to SQL, so sorry if this is a stupid question.
Table will be from this SQL sandbox:
https://www.w3schools.com/sql/trysql.asp?filename=trysql_asc
There is a table with this format:
OrderDetailID OrderID ProductID Quantity
1 10248 11 12
2 10248 42 10
3 10248 72 5
4 10249 14 9
5 10249 51 40
I want to get products with maximum average quantity.
I can get this using the following query:
SELECT avg.ProductID, avg.Quantity
FROM (
SELECT ProductID, AVG(Quantity) Quantity
FROM OrderDetails
GROUP BY ProductID
) avg
WHERE avg.Quantity = (
SELECT MAX(Quantity) FROM (
SELECT ProductID, AVG(Quantity) Quantity
FROM OrderDetails
GROUP BY ProductID
)
)
ProductID Quantity
8 70
48 70
Here I use this block twice:
SELECT ProductID, AVG(Quantity) Quantity
FROM OrderDetails
GROUP BY ProductID
because if I write the query with avg in place of the second block:
SELECT avg.ProductID, avg.Quantity
FROM (
SELECT ProductID, AVG(Quantity) Quantity
FROM OrderDetails
GROUP BY ProductID
) avg
WHERE avg.Quantity = (SELECT MAX(Quantity) FROM avg)
I get the error: could not prepare statement (1 no such table: avg)
So my questions are:
Is this a syntax mistake that can be simply corrected, or is there some reason I can't use variables like that?
Is there a simpler way to write the query I need?
Consider a Common Table Expression (CTE) using the WITH clause, which lets you avoid repeating the aggregate subquery. Most RDBMSs support CTEs (they are fully valid on the SQL TryIt page you linked).
WITH avg AS (
SELECT ProductID, AVG(Quantity) Quantity
FROM OrderDetails
GROUP BY ProductID
)
SELECT avg.ProductID, avg.Quantity
FROM avg
WHERE avg.Quantity = (
SELECT MAX(Quantity) FROM avg
)
This is not really a syntax issue but a scoping one: you are trying to reference an alias from a place where it is not in scope. Queries can only see each other's aliases through a parent-child relationship, and even then the alias names the columns of the current row (as in a correlated subquery), not a table you can select FROM. (The identifier there is an alias, not a variable; that's a different thing.)
A simpler way is to create a temporary set before you run the filter condition: with a CTE, as in a previous answer, or with a temp table. These can be referenced anywhere in the statement because their scope is not limited to a subquery.
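The scoping difference can be reproduced with a short sketch. It assumes SQLite (via Python's sqlite3 module), which matches the error message quoted in the question, and loads only the five sample rows shown there; the alias avg_qty is a hypothetical name chosen to avoid clashing with the AVG function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OrderDetails (
        OrderDetailID INTEGER, OrderID INTEGER, ProductID INTEGER, Quantity INTEGER);
    INSERT INTO OrderDetails VALUES
        (1, 10248, 11, 12), (2, 10248, 42, 10), (3, 10248, 72, 5),
        (4, 10249, 14, 9), (5, 10249, 51, 40);
""")

# A derived-table alias cannot be used as a table inside a sibling subquery.
derived = """
    SELECT avg_qty.ProductID, avg_qty.Quantity
    FROM (SELECT ProductID, AVG(Quantity) AS Quantity
          FROM OrderDetails GROUP BY ProductID) avg_qty
    WHERE avg_qty.Quantity = (SELECT MAX(Quantity) FROM avg_qty)
"""
try:
    conn.execute(derived)
    alias_error = None
except sqlite3.OperationalError as exc:
    alias_error = str(exc)

# A CTE name is visible throughout the statement, so it can be reused.
with_cte = """
    WITH avg_qty AS (SELECT ProductID, AVG(Quantity) AS Quantity
                     FROM OrderDetails GROUP BY ProductID)
    SELECT avg_qty.ProductID, avg_qty.Quantity
    FROM avg_qty
    WHERE avg_qty.Quantity = (SELECT MAX(Quantity) FROM avg_qty)
"""
top = conn.execute(with_cte).fetchall()
print(alias_error, top)
```

With only these five rows the winner is product 51; the full sandbox table gives the result shown in the question.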

Is it possible to calculate the sum of each group in a table without using group by clause

I am trying to find out if there is any way to aggregate the sales for each product. I realise I can achieve it either by using a GROUP BY clause or by writing a procedure.
example:
Table name: Details
Sales Product
10 a
20 a
4 b
12 b
3 b
5 c
Is there a way to perform the following query without using GROUP BY?
select
product,
sum(sales)
from
Details
group by
product
having
sum(sales) > 20
I realize it is possible using a procedure; could it be done in any other way?
You could do
SELECT product,
(SELECT SUM(sales) FROM details x where x.product = a.product) sales
from Details a;
(and wrap it into another select to simulate the HAVING).
It's possible to use analytic functions to do the sum calculation, and then wrap that with another query to do your filtering.
See and play with the example here.
select
running_sum,
OwnerUserId
from (
select
id,
score,
OwnerUserId,
sum(score) over (partition by OwnerUserId order by Id) running_sum,
last_value(id) over (partition by OwnerUserId order by Id rows between unbounded preceding and unbounded following) last_id
from
Posts
where
OwnerUserId in (2934433, 10583)
) inner_q
where inner_q.id = inner_q.last_id
--and running_sum > 20;
We keep a running sum going on the partition of the owner (product), and we tally up the last id for the same window, which is the ID we'll use to get the total sum. Wrap it all up with another query to make sure you get the "last id", take the sum, and then do any filtering you want on the result.
This is an extremely round-about way to avoid using GROUP BY though.
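A more direct variant of the same idea, applied to the Details table from the question, is sketched below. It swaps the running sum and last_value for a plain partitioned SUM plus DISTINCT, and it assumes SQLite 3.25 or later through Python's sqlite3 module rather than the asker's actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Details (Sales INTEGER, Product TEXT);
    INSERT INTO Details VALUES
        (10, 'a'), (20, 'a'), (4, 'b'), (12, 'b'), (3, 'b'), (5, 'c');
""")

# SUM(...) OVER (PARTITION BY ...) repeats the group total on every row.
# DISTINCT collapses the repeats, and the outer WHERE plays the role of HAVING.
query = """
    SELECT DISTINCT product, total_sales
    FROM (
        SELECT product,
               SUM(sales) OVER (PARTITION BY product) AS total_sales
        FROM Details
    ) t
    WHERE total_sales > 20
"""
big_sellers = conn.execute(query).fetchall()
print(big_sellers)
```

The totals are 30 for a, 19 for b and 5 for c, so only product a passes the filter.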
If you don't want nested select statements (which can run slower), you can use CASE inside the aggregate:
select
sum(case
when c.qty > 20
then c.qty
else 0
end) as mySum
from Sales.CustOrders c

Join to replace sub-query

I am almost a novice in database queries.
However, I do understand why and how correlated subqueries are expensive and best avoided.
Given the following simple example, could someone help me replace the subquery with a join, to help me understand how a join scores better?
select
book_key,
store_key,
quantity
from
sales s
where
quantity < (select max(quantity)
from sales
where book_key = s.book_key);
Apart from a join, what other options do we have to avoid the subquery?
In this case, it ought to be better to use a window function with a single access to the table, like so:
with s as
(select book_key,
store_key,
quantity,
max(quantity) over (partition by book_key) mq
from sales)
select book_key, store_key, quantity
from s
where quantity < s.mq
Using a Common Table Expression (CTE) lets you write the primary SELECT once and refer to its result by name, as if it were a temporary result set, so the query needs neither a correlated subquery nor a self-join. This solution uses ROW_NUMBER() and the OVER clause to number the rows of each BOOK_KEY in descending order of quantity, then keeps only the rows numbered after the first, i.e. those below the top row for each BOOK_KEY. Note that ROW_NUMBER() gives tied rows different numbers, so if several rows share the maximum quantity only one of them is excluded; RANK() would exclude all of them.
with CTE as
(
select
book_key,
store_key,
quantity,
row_number() over(partition by book_key order by quantity desc) rn
from sales
)
select
book_key,
store_key,
quantity
from CTE where rn > 1;
Working Demo: http://sqlfiddle.com/#!3/f0051/1
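The answers above agree when each book has a single top row, but they can differ when the maximum is tied. The sketch below compares the correlated subquery with ROW_NUMBER() and RANK() on a small, made-up sales table (the rows are hypothetical), using Python's sqlite3 module with SQLite 3.25 or later as an assumed stand-in for the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (book_key INTEGER, store_key TEXT, quantity INTEGER);
    -- Hypothetical rows: book 1 has two stores tied at the maximum.
    INSERT INTO sales VALUES
        (1, 'A', 10), (1, 'B', 10), (1, 'C', 5),
        (2, 'A', 7), (2, 'B', 3);
""")

def below_top(ranking_function):
    """Rows whose rank within their book is greater than 1."""
    sql = f"""
        WITH ranked AS (
            SELECT book_key, store_key, quantity,
                   {ranking_function}() OVER (
                       PARTITION BY book_key ORDER BY quantity DESC) AS rn
            FROM sales
        )
        SELECT book_key, store_key, quantity
        FROM ranked
        WHERE rn > 1
        ORDER BY book_key, store_key
    """
    return conn.execute(sql).fetchall()

# The original correlated subquery, used as the reference result.
correlated = conn.execute("""
    SELECT book_key, store_key, quantity
    FROM sales s
    WHERE quantity < (SELECT MAX(quantity) FROM sales
                      WHERE book_key = s.book_key)
    ORDER BY book_key, store_key
""").fetchall()

with_rank = below_top("rank")
with_row_number = below_top("row_number")
print(correlated, with_rank, with_row_number)
```

RANK() reproduces the correlated subquery here, while ROW_NUMBER() also returns one of the two tied rows for book 1 (which one is not defined).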
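Apart from a join, what other options do we have to avoid the subquery?
You could compute the maximum per book once and store it, for example in a SQL Server temp table, then compare against it:
select book_key, max(quantity) as max_quantity
into #maxqty
from sales
group by book_key;
select s.book_key, s.store_key, s.quantity
from sales s
join #maxqty m on m.book_key = s.book_key
where s.quantity < m.max_quantity;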

Write an Oracle query to get top 10 products for top 5000 stores [duplicate]

I am looking for an Oracle query to get the top 5000 stores, the top 10 products for each store, and the top 5 sub-products for each of those products. So in total I should get 5000*10*5 rows.
Can someone help me get this using Oracle's analytic functions?
My current query looks like
SELECT
store_id,
product,
sub_product,
count(*) as sales
FROM stores_data
GROUP BY store_id, product, sub_product;
Please assume the table is named stores_data, with columns store_id, product and sub_product.
You should use dense_rank to get the top N rows.
Something like
SELECT
storeid,
store,
productid,
product,
subproductid,
subproduct
FROM
(
SELECT
s.storeid,
s.store,
p.productid,
p.product,
sp.subproductid,
sp.subproduct,
dense_rank() over ( order by s.storeid) as storerank,
dense_rank() over ( partition by s.storeid
order by p.productid) as productrank,
dense_rank() over ( partition by s.storeid, p.productid
order by sp.subproductid) as subproductrank
FROM
stores s
INNER JOIN products p on p.storeid = s.storeid
INNER JOIN subproduct sp on sp.productid = p.productid
) t
WHERE
t.storerank <= 5000 and
t.productrank <= 10 and
t.subproductrank <= 5
Of course, I don't know your tables or the relations between them, nor the actual fields and conditions you want to check, so this is just a simple query that gets the top N records based on their id. Also, this query expects a product to belong to only one store, which might not be the case. At least it shows how to use dense_rank to get three-layered sorting and filtering.
I'll leave the other answer because that looks more like how such a table structure should be, I think.
But in your other thread you described a table that looks like this:
create table store_data (
store varchar2(40),
product varchar2(40),
subproduct varchar2(40),
sales int);
That looks like data that is already aggregated and that you now want to analyze again. Your query could look like this. It first sums the sales per store, per product and per subproduct, so you can order stores and products by sales too (the sales in the table seem to be per subproduct). After that, you can rank the stores and products by sales. I added a rank for the subproducts too. I used rank here, so there is a gap in the numbering when several records have the same sales. This way, when 8 records share rank 1 because they all have the same sales, the 9th record gets rank 9 instead of 2, so you select only the 8 top stores (you wanted 5, but why skip the other 3 if they sold exactly the same?) and not 4 others as well.
select
ts.*
from
(
select
ss.*,
rank() over (order by storesales desc) as storerank,
rank() over (partition by store order by productsales desc) as productrank,
rank() over (partition by store, product order by subproductsales desc) as subproductrank
from
(
select
sd.*,
sum(sales) over (partition by store) as STORESALES,
sum(sales) over (partition by store, product) as PRODUCTSALES,
sum(sales) over (partition by store, product, subproduct) as SUBPRODUCTSALES
from
store_data sd
) ss
) ts
where
ts.storerank <= 2 and
ts.productrank <= 3 and
ts.subproductrank <= 4
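A small runnable sketch of the layered top-N pattern follows. It is a variation, not the query above: it uses dense_rank in descending order so that every row of a store shares that store's rank, and it ranks only stores and products. The sample rows are made up, and SQLite 3.25 or later (through Python's sqlite3 module) is assumed in place of Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE store_data (store TEXT, product TEXT, subproduct TEXT, sales INTEGER);
    -- Hypothetical rows: store totals are S1 = 18, S2 = 21, S3 = 2.
    INSERT INTO store_data VALUES
        ('S1', 'P1', 'a', 10), ('S1', 'P1', 'b', 5), ('S1', 'P2', 'a', 3),
        ('S2', 'P1', 'a', 20), ('S2', 'P2', 'a', 1),
        ('S3', 'P1', 'a', 2);
""")

# Inner query: totals per store and per product, repeated on every row.
# Middle query: dense_rank the totals, highest first, so each store or
# product counts once no matter how many rows it spans.
# Outer query: keep the top 2 stores and the top product of each.
query = """
    SELECT store, product, subproduct, sales
    FROM (
        SELECT ss.*,
               dense_rank() OVER (ORDER BY storesales DESC) AS storerank,
               dense_rank() OVER (PARTITION BY store
                                  ORDER BY productsales DESC) AS productrank
        FROM (
            SELECT sd.*,
                   SUM(sales) OVER (PARTITION BY store) AS storesales,
                   SUM(sales) OVER (PARTITION BY store, product) AS productsales
            FROM store_data sd
        ) ss
    ) ts
    WHERE storerank <= 2 AND productrank <= 1
    ORDER BY store, product, subproduct
"""
top_rows = conn.execute(query).fetchall()
print(top_rows)
```

Whether ties should share a rank and how gaps are handled is a design choice; swapping dense_rank for rank changes which rows survive once a store spans several rows.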

Find the highest number of occurrences in a column in SQL

Given this table:
Order
custName description to_char(price)
A desa $14
B desb $14
C desc $21
D desd $65
E dese $21
F desf $78
G desg $14
H desh $21
I am trying to display the whole row for the prices that occur most often, in this case $14 and $21.
I believe there needs to be a subquery, so I started out with this:
select max(count(price))
from orders
group by price
which gives me 3.
After some time I decided that wasn't helpful. I believe I need the values 14 and 21 rather than the count, so I can put them in the WHERE clause, but I'm stuck on how to get them. Any help?
UPDATE: So I got it to query the 14 and 21 from this
select price
from orders
group by price
having (count(price)) in
(select max(count(price))
from orders
group by price)
but I need it to display the custname and description columns as well, and that gives me an error:
select custname, description, price
from orders
group by price
having (count(price)) in
(select max(count(price))
from orders
group by price)
SQL Error: ORA-00979: not a GROUP BY expression
any help on this?
I guess you are pretty close. Since HAVING operates on the GROUPed result set, try
HAVING COUNT(price) IN
or
HAVING COUNT(price) =
replacing your current line.
Since you tagged the question as oracle, you can use windowing functions to get aggregate and detail data within the same query.
SELECT COUNT (price) OVER (PARTITION BY price) count_at_this_price,
o.*
from orders o
order by 1 desc
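Building on the window-function idea, one possible way to keep only the rows at the most frequent prices is sketched here. It assumes SQLite 3.25 or later through Python's sqlite3 module rather than Oracle, stores the prices from the question as plain integers, and uses a CTE name (counted) that is purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (custName TEXT, description TEXT, price INTEGER);
    INSERT INTO orders VALUES
        ('A', 'desa', 14), ('B', 'desb', 14), ('C', 'desc', 21), ('D', 'desd', 65),
        ('E', 'dese', 21), ('F', 'desf', 78), ('G', 'desg', 14), ('H', 'desh', 21);
""")

# Attach the per-price count to every row, then keep the rows whose count
# equals the highest count found anywhere in the table.
query = """
    WITH counted AS (
        SELECT custName, description, price,
               COUNT(*) OVER (PARTITION BY price) AS count_at_this_price
        FROM orders
    )
    SELECT custName, description, price
    FROM counted
    WHERE count_at_this_price = (SELECT MAX(count_at_this_price) FROM counted)
    ORDER BY custName
"""
most_common = conn.execute(query).fetchall()
print(most_common)
```

Because the detail columns travel with the count, no GROUP BY is needed and the ORA-00979 problem from the question does not arise.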
select employee, count(employee)
from work
group by employee
having count(employee) =
( select max(cnt) from
( select employee, count(employee) cnt
from work
group by employee
)
);
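You could try something like
select * from orders where price in (select top 2 price from orders group by price order by count(*) desc)
TOP is SQL Server syntax. Oracle has neither TOP nor LIMIT; there you would restrict the rows with ROWNUM or, from 12c on, FETCH FIRST 2 ROWS ONLY. Note that hard-coding 2 only works when you already know how many prices tie for the highest count.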