SQL window function is grouped, but still get "must be an aggregate expression or appear in GROUP BY clause"

I have a SQL (Presto) query; let's say it's this:
select
id
, product_name
, product_type
, sum(sales) as total_sales
, sum(sales) over (partition by product_type) as sales_by_type
from some_table
group by 1,2,3
When I run this, I get an error telling me that the window function needs to appear in the GROUP BY clause. Is the best solution to break this out with a subquery, or are there syntax changes I need to make for this to work?

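If you want the total sales for the type, then you need to nest the sum()s. The window function is evaluated after the GROUP BY, so it can only see grouped columns and aggregates: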
select id, product_name, product_type,
sum(sales) as total_sales,
sum(sum(sales)) over (partition by product_type) as sales_by_type
from some_table
group by 1,2,3;
If you also want the total of all sales, then:
select id, product_name, product_type,
sum(sales) as total_sales,
sum(sum(sales)) over (partition by product_type) as sales_by_type,
sum(sum(sales)) over () as total_total_sales
from some_table
group by 1,2,3;
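To try the nested-aggregate pattern without a Presto cluster, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite is only a stand-in (window functions need SQLite 3.25 or newer), the sample rows are made up, and I am assuming the partition column is product_type.

```python
import sqlite3

# In-memory SQLite database standing in for Presto (assumption: SQLite >= 3.25).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table some_table (id int, product_name text, product_type text, sales int)"
)
# Made-up sample rows; id 1 has two sales rows so the GROUP BY has work to do.
conn.executemany(
    "insert into some_table values (?, ?, ?, ?)",
    [
        (1, "apple", "fruit", 10),
        (1, "apple", "fruit", 5),
        (2, "pear", "fruit", 7),
        (3, "kale", "veg", 4),
    ],
)

# The inner sum() is the GROUP BY aggregate; the outer sum() is the window
# function that adds those per-group totals up across each partition.
rows = conn.execute(
    """
    select id, product_name, product_type,
           sum(sales) as total_sales,
           sum(sum(sales)) over (partition by product_type) as sales_by_type,
           sum(sum(sales)) over () as total_total_sales
    from some_table
    group by 1, 2, 3
    order by id
    """
).fetchall()

for row in rows:
    print(row)
```

Each output row carries its own group total, the total for its product type, and the grand total.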

What you need is something like the query below:
select
id
, product_name
, product_type
, sum(sales) over () as total_sales
, sum(sales) over (partition by product_type) as sales_by_type
from some_table
or
select
id
, product_name
, product_type
, sum(sales) over (partition by (select 1)) as total_sales
, sum(sales) over (partition by product_type) as sales_by_type
from some_table
Both of these work in SQL Server. I'm not sure whether they will work in Presto, though.
I have also seen the variation below:
over (partition by null)

Related

Averaging and grouping in Google BigQuery

I have the table as shown in Google BigQuery:
I just want to do the following:
Calculate category-wise total units sold
Calculate category-wise average selling price
Consider the approach below:
select 'category' type, category name, count(1) units_sold, sum(sale_price) total_sale, round(avg(sale_price), 2) average_selling_price
from your_table group by category
union all
select * from (
select 'product' type, product name, count(1) units_sold, sum(sale_price) total_sale, round(avg(sale_price), 2) average_selling_price
from your_table group by product
order by total_sale desc limit 10
)
union all
select * from (
select 'order_date' type, '' || order_date name, count(1) units_sold, sum(sale_price) total_sale, round(avg(sale_price), 2) average_selling_price
from your_table group by order_date
order by total_sale desc limit 5
)
order by type
If applied to sample/dummy data, the output would be like below

Can't use order when grouped by in BigQuery

I want to group by FECHA_COMPRA and then order by the same field. But when I do this, I get an error message:
SELECT list expression references column FECHA_COMPRA which is neither grouped nor aggregated at [28:13]
These are the queries I'm using:
Select DATE(FECHA_COMPRA) as Date,TYPE,SUM(AMOUNT) AS Total, SUM(Quantity) as Qty FROM Test
GROUP BY DATE(FECHA_COMPRA)
Order by date(FECHA_COMPRA)
This is also not working:
Select DATE(FECHA_COMPRA) as Date,TYPE,SUM(AMOUNT) AS Total, SUM(Quantity) as Qty FROM Test
GROUP BY DATE(FECHA_COMPRA)
Order by FECHA_COMPRA
What is wrong?
Thanks!
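Use the query below instead. Every non-aggregated column in the SELECT list (here, type) must also appear in the GROUP BY, and the ORDER BY can then reference the grouped alias: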
select
date(fecha_compra) as date,
type,
sum(amount) as total,
sum(quantity) as qty
from test
group by date, type
order by date
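As a rough way to check the shape of the corrected query, here is a sketch that runs an equivalent statement on an in-memory SQLite database from Python. SQLite is only a stand-in for BigQuery (it does not raise BigQuery's "neither grouped nor aggregated" error), the sample rows are made up, and the alias purchase_date is my own choice to avoid clashing with the date() function name.

```python
import sqlite3

# In-memory SQLite database as a stand-in for BigQuery (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test (fecha_compra text, type text, amount int, quantity int)"
)
# Made-up sample rows with timestamps on two different days.
conn.executemany(
    "insert into test values (?, ?, ?, ?)",
    [
        ("2023-01-02 09:00:00", "A", 5, 1),
        ("2023-01-01 10:00:00", "A", 10, 1),
        ("2023-01-01 15:00:00", "A", 20, 2),
        ("2023-01-01 16:00:00", "B", 1, 1),
    ],
)

# Group by both non-aggregated select expressions, then order by the alias.
rows = conn.execute(
    """
    select date(fecha_compra) as purchase_date,
           type,
           sum(amount) as total,
           sum(quantity) as qty
    from test
    group by date(fecha_compra), type
    order by purchase_date, type
    """
).fetchall()

for row in rows:
    print(row)
```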

Calculate top two performing product categories from Sales data

I am trying to build a KPI of the top 2 performing product categories for each customer.
I have sales data with the following relevant columns:
customerid, product, product_category, order_qty, product_amt, order_date
I am using legacy SQL syntax in BQ.
This is a possible solution...
SELECT
customerid,
product_category,
order_qty
FROM (
SELECT
customerid,
product_category,
SUM(order_qty) AS order_qty,
ROW_NUMBER() OVER(PARTITION BY customerid ORDER BY order_qty DESC) AS rn
FROM
[project:dataset.table]
GROUP BY
1, 2
)
WHERE
rn <= 2
ORDER BY
1, 3 DESC
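The same top-N-per-group idea can be sketched in standard SQL, here run on an in-memory SQLite database from Python rather than in BigQuery legacy SQL. This is an illustration under assumptions: the table name sales and the sample rows are made up, and the aggregation is done in a CTE first so the ranking step only sees one row per customer and category.

```python
import sqlite3

# In-memory SQLite database as a stand-in for BigQuery (assumption: SQLite >= 3.25).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table sales (customerid text, product_category text, order_qty int)"
)
# Made-up sample rows; customer c1 buys from three categories.
conn.executemany(
    "insert into sales values (?, ?, ?)",
    [
        ("c1", "A", 5),
        ("c1", "A", 3),
        ("c1", "B", 2),
        ("c1", "C", 10),
        ("c2", "A", 1),
        ("c2", "B", 4),
    ],
)

rows = conn.execute(
    """
    with totals as (
        -- one row per customer and category
        select customerid, product_category, sum(order_qty) as order_qty
        from sales
        group by customerid, product_category
    ),
    ranked as (
        -- rank categories within each customer by total quantity
        select totals.*,
               row_number() over (
                   partition by customerid order by order_qty desc
               ) as rn
        from totals
    )
    select customerid, product_category, order_qty
    from ranked
    where rn <= 2
    order by customerid, order_qty desc
    """
).fetchall()

for row in rows:
    print(row)
```

Note that row_number() breaks ties arbitrarily; rank() or dense_rank() would be the alternative if tied categories should all be kept.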

Is it possible to filter within a windowing function's partition

Here is a table I created to explain what I want to do:
create table #test (
PlaceID int,
ItemID int,
ItemCount int,
Amount dec(11,2)
)
I would like to get 3 things:
sum by Place
sum by Place and Item
sum by Place and Non-item
The first two are simple:
sum(Amount) over (partition by PlaceID) as PlaceAmount
sum(Amount) over (partition by PlaceID, ItemID) as PlaceItemAmount
But how do I get the sum for all items in the place that are NOT the current item?
Here is a SQL Fiddle with the data and query set up:
select t1.PlaceID, t1.ItemID, t1.ItemCount
, t1.Amount as 'AmtMe'
, SumPlace.sum as 'AmtPlace'
, SumPlace.sum - t1.Amount as 'AmtPlaceNoMe'
from #test as t1
join (select PlaceID, sum(Amount) as 'sum'
from #test
group by PlaceID) as SumPlace
on t1.PlaceID = SumPlace.PlaceID
Does this do what you were expecting? Basically, take your whole (partition by place) and subtract your current (partition by place, item) to get the remainder. I would, however, recommend keeping this in a subquery so that each windowed aggregate is computed only once per function and partition set.
The same logic works for counts as well.
select
PlaceID,
ItemID,
ItemCount,
Amount,
PlaceItemCount,
PlaceAmount,
ItemAndPlaceAmount,
PlaceAmount-ItemAndPlaceAmount as RemainderAmount
from (
select
PlaceID,
ItemID,
ItemCount,
Amount,
sum(ItemCount) over (partition by PlaceID) as PlaceItemCount,
sum(Amount) over (partition by PlaceID) as PlaceAmount,
sum(Amount) over (partition by PlaceID, ItemID) as ItemAndPlaceAmount
from tblTest
) z
SELECT
PlaceID,
ItemID,
ItemCount,
Amount,
sum(ItemCount) over (partition BY PlaceID) AS PlaceItemCount,
sum(Amount) over (partition BY PlaceID) AS PlaceAmount
, sum(Amount) over (partition BY PlaceID, ItemID) AS PlaceItemAmount
, sum(Amount) over (partition BY PlaceID)
- sum(Amount) over (partition BY PlaceID, ItemID) AS PlaceItemAmountMinusGroup
, sum(Amount) over (partition BY PlaceID) - Amount PlaceItemAmountMinusThis
FROM tblTest
PlaceItemAmountMinusGroup is the total amount by place without the total amount of ItemID
PlaceItemAmountMinusThis is the total amount by place without the amount of the row.
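To see the difference between the two "minus" columns on concrete numbers, here is a small sketch on an in-memory SQLite database driven from Python. SQLite stands in for SQL Server here (window functions need SQLite 3.25 or newer), and the table name test and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test (PlaceID int, ItemID int, ItemCount int, Amount int)"
)
# Made-up sample rows: place 1 has two rows for item 1 and one row for item 2.
conn.executemany(
    "insert into test values (?, ?, ?, ?)",
    [(1, 1, 1, 10), (1, 1, 2, 20), (1, 2, 1, 5), (2, 1, 1, 7)],
)

rows = conn.execute(
    """
    select PlaceID, ItemID, Amount,
           -- place total without every row of the current item
           sum(Amount) over (partition by PlaceID)
             - sum(Amount) over (partition by PlaceID, ItemID) as MinusGroup,
           -- place total without the current row only
           sum(Amount) over (partition by PlaceID) - Amount as MinusThis
    from test
    order by PlaceID, ItemID, Amount
    """
).fetchall()

for row in rows:
    print(row)
```

The two columns differ only where an item has more than one row in the same place, as with item 1 in place 1.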

SQL - Remove duplicates to show the latest date record

I have a view which ultimately I want to return 1 row per customer.
Currently it's a SELECT as follows:
SELECT
Customerid,
MAX(purchasedate) AS purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM
Customer
GROUP BY
Customerid,
paymenttype,
delivery,
amount,
discountrate
I was hoping the MAX(purchasedate) would work, but my grouping breaks it: sometimes there is a discountrate and sometimes it's NULL, and paymenttype can differ for each customer as well. Is there any way to show just the last purchase each customer made?
Since SQL Server 2008 R2 supports window functions, you can use ROW_NUMBER():
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM
(
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate,
ROW_NUMBER() OVER (Partition By CustomerID
ORDER BY purchasedate DESC) rn
FROM Customer
) derivedTable
WHERE derivedTable.rn = 1
Or by using a common table expression:
WITH derivedTable
AS
(
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate,
ROW_NUMBER() OVER (Partition By CustomerID
ORDER BY purchasedate DESC) rn
FROM Customer
)
SELECT Customerid,
purchasedate,
paymenttype,
delivery,
amount,
discountrate
FROM derivedTable
WHERE derivedTable.rn = 1
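Or by using a join with a subquery, which also works in other DBMSs (note that it returns more than one row per customer if several rows share the latest purchasedate):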
SELECT a.*
FROM Customer a
INNER JOIN
(
SELECT CustomerID, MAX(purchasedate) maxDate
FROM Customer
GROUP BY CustomerID
) b ON a.CustomerID = b.CustomerID AND
a.purchasedate = b.maxDate
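For a quick comparison of the ROW_NUMBER() and join approaches, here is a sketch that runs both on an in-memory SQLite database from Python. SQLite is a stand-in for SQL Server (window functions need SQLite 3.25 or newer), the Customer table is reduced to four columns, and the sample rows are made up with no ties on purchasedate.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Customer "
    "(Customerid int, purchasedate text, paymenttype text, amount int)"
)
# Made-up sample rows; customer 1 has two purchases with different payment types.
conn.executemany(
    "insert into Customer values (?, ?, ?, ?)",
    [
        (1, "2023-01-01", "card", 10),
        (1, "2023-03-01", "cash", 20),
        (2, "2023-02-01", "card", 5),
    ],
)

# Approach 1: number each customer's rows from newest to oldest, keep the first.
rows = conn.execute(
    """
    select Customerid, purchasedate, paymenttype, amount
    from (
        select Customer.*,
               row_number() over (
                   partition by Customerid order by purchasedate desc
               ) as rn
        from Customer
    ) derivedTable
    where rn = 1
    order by Customerid
    """
).fetchall()

# Approach 2: join back to the per-customer maximum purchase date.
join_rows = conn.execute(
    """
    select a.Customerid, a.purchasedate, a.paymenttype, a.amount
    from Customer a
    inner join (
        select Customerid, max(purchasedate) as maxDate
        from Customer
        group by Customerid
    ) b on a.Customerid = b.Customerid and a.purchasedate = b.maxDate
    order by a.Customerid
    """
).fetchall()

for row in rows:
    print(row)
```

With this data both approaches return the same two rows; they would diverge only if a customer had several rows on the same latest purchasedate.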