Can't use order when grouped by in BigQuery - google-bigquery

I want to group by FECHA_COMPRA and then order by the same field. But when I do this, I get an error message:
SELECT list expression references column FECHA_COMPRA which is neither grouped nor aggregated at [28:13]
These are the queries I'm using:
Select DATE(FECHA_COMPRA) as Date,TYPE,SUM(AMOUNT) AS Total, SUM(Quantity) as Qty FROM Test
GROUP BY DATE(FECHA_COMPRA)
Order by date(FECHA_COMPRA)
This is also not working:
Select DATE(FECHA_COMPRA) as Date,TYPE,SUM(AMOUNT) AS Total, SUM(Quantity) as Qty FROM Test
GROUP BY DATE(FECHA_COMPRA)
Order by FECHA_COMPRA
What is wrong?
Thanks!

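Use the query below instead. Every non-aggregated column in the SELECT list (here the date expression and type) has to appear in GROUP BY, and ORDER BY can then refer to the grouped alias: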
select
date(fecha_compra) as date,
type,
sum(amount) as total,
sum(quantity) as qty
from test
group by date, type
order by date
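To try the grouping out without a BigQuery project, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The sample rows and the purchase_date alias are made up for illustration, and SQLite is more lenient than BigQuery about ungrouped columns, so this only shows the shape of the corrected query, not the original error.

```python
import sqlite3

# In-memory SQLite database standing in for the BigQuery table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (fecha_compra TEXT, type TEXT, amount REAL, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        ("2021-03-02 09:15:00", "A", 10.0, 1),
        ("2021-03-01 18:30:00", "B", 5.0, 2),
        ("2021-03-01 08:00:00", "A", 7.5, 3),
        ("2021-03-01 12:00:00", "A", 2.5, 1),
    ],
)

# Group by every non-aggregated output column, then order by the grouped date.
rows = conn.execute(
    """
    SELECT date(fecha_compra) AS purchase_date,
           type,
           SUM(amount)   AS total,
           SUM(quantity) AS qty
    FROM test
    GROUP BY purchase_date, type
    ORDER BY purchase_date, type
    """
).fetchall()

for row in rows:
    print(row)
```

Each (date, type) pair comes back as one row, sorted by date.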

Related

How to add a new column that summarize rows

I have two issues:
I used the ROLLUP function to add totals per month and year, and I would like to change 'NULL' into grand_total, as in the attached screenshot.
I don't know how to add a new column that will summarize values starting from the second row.
Please see the attached screenshots of the results I need and of the source output, followed by example code from my side:
[1]: https://i.stack.imgur.com/6B70o.png
[2]: https://i.stack.imgur.com/E2x8K.png
Select Year(Modifieddate) AS Year,
MONTH(modifieddate) as Month,
Sum(linetotal) as Sum_price
from Sales.SalesOrderDetail
Group by rollup( Year(Modifieddate),MONTH(modifieddate))
Thanks in advance,
I think this will work:
Select Year(Modifieddate) AS Year,
coalesce(convert(varchar(255), month(modifieddate)), 'Grand Total') as Month,
Sum(linetotal) as Sum_price,
sum(sum(linetotal)) over (partition by Year(Modifieddate)
order by coalesce(month(modifieddate), 100)
) as ytd_sum_price
from Sales.SalesOrderDetail
Group by rollup( Year(Modifieddate), month(modifieddate))
The coalesce() in the order by is to put the summary row last for the cumulative sum.
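The cumulative-sum part of this query, a window sum over an aggregate, can be tried in isolation. The sketch below uses Python's sqlite3 module with a simplified, made-up table that stores the year and month as plain integer columns (yr and mon are assumptions, not AdventureWorks columns). SQLite has no ROLLUP, so the summary rows are left out and only the year-to-date total is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified table: year and month are stored directly.
conn.execute("CREATE TABLE sales_order_detail (yr INTEGER, mon INTEGER, linetotal REAL)")
conn.executemany(
    "INSERT INTO sales_order_detail VALUES (?, ?, ?)",
    [
        (2013, 1, 100.0),
        (2013, 1, 50.0),
        (2013, 2, 25.0),
        (2014, 1, 10.0),
        (2014, 3, 30.0),
    ],
)

# The inner SUM is the per-month aggregate; the outer SUM ... OVER adds those
# monthly sums up in month order, restarting for every year.
ytd_rows = conn.execute(
    """
    SELECT yr,
           mon,
           SUM(linetotal) AS sum_price,
           SUM(SUM(linetotal)) OVER (PARTITION BY yr ORDER BY mon) AS ytd_sum_price
    FROM sales_order_detail
    GROUP BY yr, mon
    ORDER BY yr, mon
    """
).fetchall()

for row in ytd_rows:
    print(row)
```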
Like this:
Select Year(Modifieddate) AS Year, MONTH(modifieddate) as Month, Sum(linetotal) as Sum_price
from Sales.SalesOrderDetail
Group by rollup( Year(Modifieddate),MONTH(modifieddate))
UNION
Select Year(Modifieddate) AS Year, 'grand_total' as Month, Sum(linetotal) as Sum_price
from Sales.SalesOrderDetail
Group by Year(Modifieddate)
-- SQL SERVER
SELECT t.OrderYear
, CASE WHEN t.OrderMonth IS NULL THEN 'Grand Total' ELSE CAST(t.OrderMonth AS VARCHAR(20)) END b
, t.MonthlySales
, MAX(t.cum_total) cum_total
FROM (SELECT
YEAR(OrderDate) AS OrderYear,
MONTH(OrderDate) AS OrderMonth,
SUM(SubTotal) AS MonthlySales,
SUM(SUM(SubTotal)) OVER (ORDER BY YEAR(OrderDate), MONTH(OrderDate) ROWS UNBOUNDED PRECEDING) cum_total
FROM Sales.SalesOrderHeader
GROUP BY GROUPING SETS ((YEAR(OrderDate), MONTH(OrderDate)))) t
GROUP BY GROUPING SETS ((t.OrderYear
, t.OrderMonth
, t.MonthlySales), t.OrderYear);
Please check this url https://dbfiddle.uk/?rdbms=sqlserver_2019&sample=adventureworks&fiddle=e6cd2ba8114bd1d86b8c61b1453cafcf
To build on @GordonLinoff's answer, you are really supposed to use the GROUPING() function to check whether a row is a rollup summary row for a given grouping column. This behaves better in the face of nullable columns.
Select case when grouping(Year(Modifieddate)) = 0
then convert(varchar(255), Year(Modifieddate))
else 'Grand Total' end AS Year,
case when grouping(month(modifieddate)) = 0
then convert(varchar(255), month(modifieddate))
else 'Grand Total' end as Month,
Sum(linetotal) as Sum_price,
sum(sum(linetotal)) over (
partition by
grouping(Year(Modifieddate)),
grouping(month(modifieddate)),
Year(Modifieddate)
order by month(modifieddate)
) as ytd_sum_price
from Sales.SalesOrderDetail
Group by rollup( Year(Modifieddate), month(modifieddate));

SQL window function is grouped, but still get "must be an aggregate expression or appear in GROUP BY clause"

I have a SQL (presto) query, let's say it's this:
select
id
, product_name
, product_type
, sum(sales) as total_sales
, sum(sales) over (partition by type) as sales_by_type
from some_table
group by 1,2,3
When I run this, I get an error telling me that the window function needs to appear in the GROUP BY clause. Is the best solution to break this out with a subquery? Or are there some syntax changes I need to make for this to work?
If you want the total sales for the type, then you need to nest the sum()s:
select id, product_name, product_type,
sum(sales) as total_sales,
sum(sum(sales)) over (partition by product_type) as sales_by_type
from some_table
group by 1,2,3;
If you also want the total of all sales, then:
select id, product_name, product_type,
sum(sales) as total_sales,
sum(sum(sales)) over (partition by product_type) as sales_by_type,
sum(sum(sales)) over () as total_total_sales
from some_table
group by 1,2,3;
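For anyone who wants to see the nested sums run, here is a small sketch using Python's sqlite3 module rather than Presto. The sample rows are invented, and it assumes the partition column is product_type, the column that is actually in the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_table (id INTEGER, product_name TEXT, product_type TEXT, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?, ?, ?)",
    [
        (1, "apple", "fruit", 10),
        (1, "apple", "fruit", 5),
        (2, "pear", "fruit", 20),
        (3, "kale", "veg", 7),
    ],
)

# SUM(sales) is the per-group aggregate. SUM(SUM(sales)) OVER (...) is a window
# function applied to those per-group sums after the GROUP BY has run.
nested_rows = conn.execute(
    """
    SELECT id, product_name, product_type,
           SUM(sales) AS total_sales,
           SUM(SUM(sales)) OVER (PARTITION BY product_type) AS sales_by_type,
           SUM(SUM(sales)) OVER () AS total_total_sales
    FROM some_table
    GROUP BY id, product_name, product_type
    ORDER BY id
    """
).fetchall()

for row in nested_rows:
    print(row)
```

The window function's argument has to be something that exists after grouping, which is why a bare sum(sales) over (...) is rejected while sum(sum(sales)) over (...) is accepted.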
What you need is something like the following:
select
id
, product_name
, product_type
, sum(sales) over () as total_sales
, sum(sales) over (partition by type) as sales_by_type
from some_table
or
select
id
, product_name
, product_type
, sum(sales) over (partition by (select 1)) as total_sales
, sum(sales) over (partition by type) as sales_by_type
from some_table
Both of these work in SQL Server. I'm not sure whether they will work in Presto, though.
I have also seen the variation below:
over (partition by null)

Tweaking a Query - looking for duplicates within a certain day range

I posted a question similar to this and got an answer, but the answer isn't configurable. That's my fault, as I should have been clearer, so I'll try again.
I have a table, TABLENAME, with the columns OrderDate, OrderNumber, CustomerID, ProductSKU and ProductName. The table holds invoice lines, so an order has one row for every item in the order.
I want to know which customers have ordered the same item more than once, where the order is within 90 days of any other order of that same product by that customer, after a specific date. The same product within the same order number does not count. The catch is that I want "more than once" to be configurable, so that I can adjust it to see 3 or more, or 4 or more, AND I want to see the counts. Here's the query I have so far, which I think gives me the items and the counts, but not the 90-day condition:
EDITED: I don't think the former version gave me the right counts
SELECT customerid, productsku, productname, count(distinct ordernumber) FROM tablename
WHERE orderdate >'2017-11-01'
GROUP BY customerid, productsku, productname
HAVING COUNT(distinct ordernumber) > 2
Try this. It goes back 90 days from the given date:
declare @date date = '2017-11-01'
SELECT customerid, productsku, productname, count(distinct ordernumber) FROM tablename
WHERE orderdate >= dateadd(DD,-90,@date) and orderdate <= @date
GROUP BY customerid, productsku, productname
HAVING COUNT(distinct ordernumber) > 1
Yes, that is what I was doing in the first query. This may be a clumsy way of doing it, but it was hard to do better without seeing any data. This query also gives you the order dates. Hope it helps.
WITH DupsWithin90Days (customerid,productsku,productname,orderdate,num)
as
(
select customerid,productsku,productname,orderdate ,count(*) num from (
SELECT X.customerid, X.productsku, X.productname,X.ORDERDATE,ROW_NUMBER() OVER (partition by x.customerid,x.orderdate order by x.orderdate) rownum
FROM
(
SELECT T1.customerid, T1.productsku, T1.productname,T1.ORDERDATE
FROM TABLENAME1 T1
) X
JOIN
(
SELECT T2.customerid, T2.productsku, T2.productname,T2.ORDERDATE
FROM
TABLENAME1 T2
) Y
ON X.customerid = Y.customerid AND X.orderdate >= dateadd(DD,-90,Y.orderdate)
) dup
where rownum > 1
group by customerid,productsku,productname,orderdate
)
select customerid,productsku,productname,orderdate
from DupsWithin90Days
order by customerid ,orderdate desc
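For comparison, here is one possible reading of the requirement as a self-join with a configurable threshold, sketched with Python's sqlite3 module instead of SQL Server. The function name repeat_buyers, the parameter names and the sample rows are all hypothetical. It counts, for each order, the distinct orders of the same product by the same customer placed in the following N days, and keeps the pairs whose largest count reaches the threshold. In SQL Server the date arithmetic would use DATEDIFF rather than julianday.

```python
import sqlite3

QUERY = """
WITH orders AS (
    -- One row per order and product, so repeated lines in one order count once.
    SELECT DISTINCT customerid, productsku, productname, ordernumber,
                    date(orderdate) AS orderdate
    FROM tablename
    WHERE orderdate > :since
),
counted AS (
    -- For every anchor order, count orders of the same product by the same
    -- customer placed from the anchor date up to :days days later.
    SELECT a.customerid, a.productsku, a.productname, a.ordernumber,
           COUNT(DISTINCT b.ordernumber) AS orders_in_window
    FROM orders a
    JOIN orders b
      ON b.customerid = a.customerid
     AND b.productsku = a.productsku
     AND julianday(b.orderdate) - julianday(a.orderdate) BETWEEN 0 AND :days
    GROUP BY a.customerid, a.productsku, a.productname, a.ordernumber
)
SELECT customerid, productsku, productname, MAX(orders_in_window) AS max_orders
FROM counted
GROUP BY customerid, productsku, productname
HAVING MAX(orders_in_window) >= :min_orders
ORDER BY customerid, productsku
"""


def repeat_buyers(conn, since, days=90, min_orders=2):
    """Return (customerid, productsku, productname, max_orders) rows."""
    params = {"since": since, "days": days, "min_orders": min_orders}
    return conn.execute(QUERY, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (orderdate TEXT, ordernumber INTEGER, "
    "customerid INTEGER, productsku TEXT, productname TEXT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?, ?)",
    [
        ("2018-01-01", 100, 1, "A", "Widget"),
        ("2018-02-15", 101, 1, "A", "Widget"),
        ("2018-03-20", 102, 1, "A", "Widget"),
        ("2018-01-01", 200, 2, "A", "Widget"),
        ("2018-06-01", 201, 2, "A", "Widget"),  # more than 90 days later
        ("2018-01-05", 300, 3, "B", "Gadget"),
        ("2018-01-05", 300, 3, "B", "Gadget"),  # same order, second line
        ("2018-01-01", 400, 4, "B", "Gadget"),
        ("2018-03-01", 401, 4, "B", "Gadget"),
    ],
)

print(repeat_buyers(conn, "2017-11-01", min_orders=2))
print(repeat_buyers(conn, "2017-11-01", min_orders=3))
```

Changing min_orders is the "3 or more, 4 or more" knob, and max_orders is the count to display.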

Summing a column over a date range in a CTE?

I'm trying to sum a certain column over a certain date range. The kicker is that I want this to be a CTE, because I'll have to use it multiple times as part of a larger query. Since it's a CTE, it has to have the date column as well as the sum and ID columns, meaning I have to group by date AND ID. That will cause my results to be grouped by ID and date, giving me not a single sum over the date range, but a bunch of sums, one for each day.
To make it simple, say we have:
create table orders (
id int primary key,
itemID int foreign key references items.id,
datePlaced datetime,
salesRep int foreign key references salesReps.id,
price int,
amountShipped int);
Now, we want to get the total money a given sales rep made during a fiscal year, broken down by item. That is, ignoring the fiscal year bit:
select itemName, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
group by itemName
Simple enough. But when you add anything else, even the price, the query spits out way more rows than you wanted.
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
group by itemName, price
Now, each group is (name, price) instead of just (name). This is kind of pseudocode, but in my database, just this change causes my result set to jump from 13 to 32 rows. Add to that the date range, and you really have a problem:
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
and orderDate between 150101 and 151231
group by itemName, price
This is identical to the last example. The trouble is making it a CTE:
with totals as (
select itemName, price, sum(price) as totalSales, sum(totalShipped) as totalShipped, orderDate as startDate, orderDate as endDate
from orders
join items on items.id = orders.itemID
where orders.salesRep = '1234'
and orderDate between startDate and endDate
group by itemName, price, startDate, endDate
)
select totals_2015.itemName as itemName_2015, totals_2015.price as price_2015, ...
totals_2016.itemName as itemName_2016, ...
from (
select * from totals
where startDate = 150101 and endDate = 151231
) totals_2015
join (
select *
from totals
where startDate = 160101 and endDate = 160412
) totals_2016
on totals_2015.itemName = totals_2016.itemName
Now the grouping in the CTE is way off, more than adding the price made it. I've thought about breaking the price query into its own subquery inside the CTE, but I can't escape needing to group by the dates in order to get the date range. Can anyone see a way around this? I hope I've made things clear enough. This is running against an IBM iSeries machine. Thank you!
Depending on what you are looking for, this might be a better approach:
select 'by sales rep' breakdown
, salesRep
, '' year
, sum(price * amountShipped) amount
from etc
group by salesRep
union
select 'by sales rep and year' breakdown
, salesRep
, convert(char(4),orderDate, 120) year
, sum(price * amountShipped) amount
from etc
group by salesRep, convert(char(4),orderDate, 120)
etc
When possible, group by the ID columns or foreign keys. Those columns are usually indexed already, so you will often get faster results. This tends to hold in most databases.
with cte as (
select id,rep, sum(sales) sls, count(distinct itemid) did, count(*) cnt from somewhere
where date between x and y
group by id,rep
) select * from cte order by rep
or, a bit fancier:
with cte as (
select id,rep, sum(sales) sls, count(distinct itemid) did, count(*) cnt from somewhere
where date between x and y
group by id,rep
) select * from cte join reps on cte.rep = reps.rep order by sls desc
I eventually found a solution, and it doesn't need a CTE at all. I wanted the CTE to avoid code duplication, but this works almost as well. Here's a thread explaining summing conditionally that does exactly what I was looking for.
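The linked thread is not reproduced here, so the following is only a sketch of what summing conditionally usually looks like: one pass over the table, with a CASE expression inside each SUM deciding which date range a row counts toward. It uses Python's sqlite3 module, a simplified orders table that already carries itemName (an assumption to avoid the join), and the numeric yymmdd dates from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical table: itemName is stored directly on the order.
conn.execute(
    "CREATE TABLE orders (itemName TEXT, price INTEGER, orderDate INTEGER, salesRep INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("bolt", 10, 150305, 1234),
        ("bolt", 20, 151110, 1234),
        ("bolt", 5, 160201, 1234),
        ("nut", 7, 160115, 1234),
        ("nut", 3, 150601, 9999),  # different sales rep, filtered out
        ("nut", 4, 160501, 1234),  # outside both date ranges
    ],
)

# Grouping stays on itemName only; the date ranges live inside the SUMs,
# so there is no need to group by date to get one total per range.
totals = conn.execute(
    """
    SELECT itemName,
           SUM(CASE WHEN orderDate BETWEEN 150101 AND 151231 THEN price ELSE 0 END) AS sales_2015,
           SUM(CASE WHEN orderDate BETWEEN 160101 AND 160412 THEN price ELSE 0 END) AS sales_2016
    FROM orders
    WHERE salesRep = 1234
    GROUP BY itemName
    ORDER BY itemName
    """
).fetchall()

for row in totals:
    print(row)
```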

Showing all results even using GROUP BY CLAUSE

How can I sort by month?
Query:
select format(datee,'mmm-yyyy') as [Months],sum(amount) as Amount
from ledger_broker
where ref_from like 'Purchase'
group by format(datee,'mmm-yyyy')
order by format(datee,'mmm-yyyy') desc
Output :
Try grouping by the same exact column which you select:
SELECT t.[Months], t.Amount
FROM
(
SELECT MONTH(datee) AS theMonth, YEAR(datee) AS theYear,
FORMAT(datee,'mmm-yyyy') AS [Months], SUM(amount) AS Amount
FROM ledger_transporter
WHERE ref_from LIKE 'Purchase'
GROUP BY MONTH(datee), YEAR(datee), FORMAT(datee, 'mmm-yyyy')
) t
ORDER BY t.theYear DESC, t.theMonth DESC
One way to order by date is to select the numeric month and year in your query.
Change group by datee to group by format(datee,'mmm-yyyy'):
select distinct format(datee,'mmm-yyyy') as [Months], sum(amount) as Amount
from ledger_transporter
where ref_from like 'Purchase'
group by format(datee,'mmm-yyyy')
order by Month(datee)
The reason is that your dates, say 01-FEB-2016 and 02-FEB-2016, are different values, so grouping by datee produces two separate rows for them.
Under format(datee,'mmm-yyyy'), however, both of these dates become FEB-2016 and fall into the same group. Hence the mismatch.
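To see why ordering by the numeric year and month matters, here is a small sketch using Python's sqlite3 module as a stand-in. SQLite has no format() function, so the mmm-yyyy label is built in Python, and the ledger table name and sample rows are made up.

```python
import sqlite3

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (datee TEXT, amount REAL, ref_from TEXT)")
conn.executemany(
    "INSERT INTO ledger VALUES (?, ?, ?)",
    [
        ("2016-02-01", 100.0, "Purchase"),
        ("2016-02-02", 50.0, "Purchase"),
        ("2015-12-15", 30.0, "Purchase"),
        ("2016-01-20", 20.0, "Purchase"),
        ("2016-01-25", 99.0, "Sale"),
    ],
)

# Group and sort on the year and month parts, not on the text label.
grouped = conn.execute(
    """
    SELECT strftime('%Y', datee) AS yr,
           strftime('%m', datee) AS mon,
           SUM(amount) AS amount
    FROM ledger
    WHERE ref_from = 'Purchase'
    GROUP BY yr, mon
    ORDER BY yr DESC, mon DESC
    """
).fetchall()

# Build the display label only after the rows are in chronological order.
report = [(f"{MONTHS[int(mon) - 1]}-{yr}", amount) for yr, mon, amount in grouped]
print(report)

# Sorting the labels as text instead puts Jan-2016 ahead of Feb-2016.
print(sorted((label for label, _ in report), reverse=True))
```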