Querying table with group by and sum - sql

I have the following table called Orders
Order | Date | Total
------------------------------------
34564 | 03/05/2015| 15.00
77456 | 01/01/2001| 3.00
25252 | 02/02/2008| 4.00
34564 | 03/04/2015| 7.00
I am trying to select each distinct order, sum the total, and group by order number. The problem is that it shows two records for 34564 because they have different dates. How can I sum the totals of repeated orders and pick only the MAX(Date), while still summing the total of both instances?
I.e., the desired result:
Order | Date | Total
------------------------------------
34564 | 03/05/2015| 22.00
77456 | 01/01/2001| 3.00
25252 | 02/02/2008| 4.00
Tried:
SELECT DISTINCT Order, Date, SUM(Total)
FROM Orders
GROUP BY Order, Date
Of course the above won't work, as you can see, but I am not sure how to achieve what I intend.

SELECT [order], MAX(date) AS date, SUM(total) AS total
FROM Orders o
GROUP BY [order]

You can use the MAX aggregate function to pick the latest Date from each Order group (Order is a reserved word, so it is bracketed here):
SELECT [Order], MAX([Date]) AS [Date], SUM(Total) AS Total
FROM Orders
GROUP BY [Order]
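The MAX/SUM grouping above can be tried out with a minimal sketch using Python's built-in sqlite3 module. The in-memory database, the ISO date strings, and the double-quoted identifiers (SQLite's quoting style, in place of SQL Server's brackets) are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
# Dates are stored as ISO strings (an assumption) so MAX() compares them correctly.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Orders ("Order" INTEGER, "Date" TEXT, Total REAL)')
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (34564, "2015-03-05", 15.00),
        (77456, "2001-01-01", 3.00),
        (25252, "2008-02-02", 4.00),
        (34564, "2015-03-04", 7.00),
    ],
)

# One row per order: latest date and summed total.
rows = conn.execute(
    'SELECT "Order", MAX("Date") AS "Date", SUM(Total) AS Total '
    'FROM Orders GROUP BY "Order" ORDER BY "Order"'
).fetchall()

for row in rows:
    print(row)
```

Grouping by the order number alone is what collapses the two 34564 rows into one; the date only appears inside an aggregate.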

The simplest query should be:
SELECT [Order], MAX([Date]) AS [Date], SUM(Total) AS Total
FROM Orders
GROUP BY [Order]

You can use SUM and MAX together:
SELECT
[Order],
[Date] = MAX([Date]),
Total = SUM(Total)
FROM tbl
GROUP BY [Order]
A word of advice: please refrain from using reserved words like Order and Date for your column and table names.

Just add MAX(Date) to your SELECT clause and remove Date from the GROUP BY, otherwise each date still forms its own group.
Try this:
SELECT [Order], MAX([Date]) AS [Date], SUM(Total) AS Total
FROM Orders
GROUP BY [Order]

Related

Combining COUNT and RANK - PostgreSQL

What I need to select is the total number of trips made by every 'id_customer' from the table user, plus the id, dispatch_seconds, and distance of their first order. id_customer, customer_id, and order_id are strings.
It should look like this:
+------+--------+------------+--------------------------+------------------+
| id | count | #1order id | #1order dispatch seconds | #1order distance |
+------+--------+------------+--------------------------+------------------+
| 1ar5 | 3 | 4r56 | 1 | 500 |
| 2et7 | 2 | dc1f | 5 | 100 |
+------+--------+------------+--------------------------+------------------+
Cheers!
The original post was edited because, during the discussion, S-man helped me find the exact solution to the problem. Solution by S-man: https://dbfiddle.uk/?rdbms=postgres_10&fiddle=e16aa6008990107e55a26d05b10b02b5
db<>fiddle
SELECT
    customer_id,
    count,
    order_id,
    order_timestamp,
    dispatch_seconds,
    distance
FROM (
    SELECT
        *,
        count(*) over (partition by customer_id), -- A
        first_value(order_id) over (partition by customer_id order by order_timestamp) -- B
    FROM orders
) s
WHERE order_id = first_value -- C
https://www.postgresql.org/docs/current/static/tutorial-window.html
A: a window function that gets the total record count per user.
B: a window function that orders all records per user by timestamp and returns the first order_id of that user. Using first_value instead of min has one benefit: your order IDs may not actually increase with the timestamp (two orders may come in simultaneously, or the order IDs may be some sort of hash rather than sequentially increasing).
--> Both are new columns.
C: now keep only the rows where "first_value" (i.e. the first order_id by timestamp) equals the order_id of the current row. This gives the row of the first order for each user.
Result:
customer_id count order_id order_timestamp dispatch_seconds distance
----------- ----- -------- ------------------- ---------------- --------
1ar5 3 4r56 2018-08-16 17:24:00 1 500
2et7 2 dc1f 2018-08-15 01:24:00 5 100
Note that in this test data the order "dc1f" of user "2et7" has the smaller timestamp but comes later in the rows. It is not the first occurrence of the user in the table, but it is nevertheless the earliest order. This demonstrates the first_value vs. min case described above.
You are on the right track. Just use conditional aggregation:
SELECT o.customer_id,
       COUNT(*) AS order_count,
       MAX(CASE WHEN seqnum = 1 THEN o.order_id END) AS first_order_id
FROM (SELECT o.*,
             ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_timestamp ASC) AS seqnum
      FROM orders o
     ) o
GROUP BY o.customer_id;
Your JOIN is not necessary for this query.
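As a rough illustration, the conditional-aggregation approach can be run with Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The rows below are hypothetical, chosen only to reproduce the expected result from the question, and the two extra MAX(CASE ...) columns for dispatch_seconds and distance are an assumed extension of the same pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id TEXT, order_id TEXT, "
    "order_timestamp TEXT, dispatch_seconds INTEGER, distance INTEGER)"
)
# Hypothetical rows; only the first order of each customer matches the question.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        ("1ar5", "4r56", "2018-08-16 17:24:00", 1, 500),
        ("1ar5", "a1b2", "2018-08-17 09:00:00", 3, 250),
        ("1ar5", "c3d4", "2018-08-18 12:30:00", 2, 900),
        ("2et7", "e5f6", "2018-08-15 02:00:00", 4, 300),
        ("2et7", "dc1f", "2018-08-15 01:24:00", 5, 100),
    ],
)

# seqnum = 1 marks each customer's earliest order; MAX(CASE ...) picks its values.
query = """
SELECT o.customer_id,
       COUNT(*) AS order_count,
       MAX(CASE WHEN seqnum = 1 THEN o.order_id END) AS first_order_id,
       MAX(CASE WHEN seqnum = 1 THEN o.dispatch_seconds END) AS first_dispatch_seconds,
       MAX(CASE WHEN seqnum = 1 THEN o.distance END) AS first_distance
FROM (SELECT orders.*,
             ROW_NUMBER() OVER (PARTITION BY customer_id
                                ORDER BY order_timestamp ASC) AS seqnum
      FROM orders) o
GROUP BY o.customer_id
ORDER BY o.customer_id
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```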
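You can use window functions (first_value rather than min, so that every row of a customer gets the same value and DISTINCT collapses them to one row):
select distinct customer_id,
       count(*) over (partition by customer_id) as no_of_order,
       first_value(order_id) over (partition by customer_id order by order_timestamp) as first_order_id
from orders o;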
I think there are many mistakes in your original query: your rank isn't partitioned, the ORDER BY clause seems incorrect, and you filter out all but one "random" order before applying the count. The list goes on.
Something like this seems closer to what you want:
SELECT
customer_id,
order_count,
order_id
FROM (
SELECT
a.customer_id,
a.order_count,
a.order_id,
RANK() OVER (PARTITION BY a.order_id, a.customer_id ORDER BY a.order_count DESC) AS rank_id
FROM (
SELECT
customer_id,
order_id,
COUNT(*) AS order_count
FROM
orders
GROUP BY
customer_id,
order_id) a) b
WHERE
b.rank_id = 1;

How to get multiple rows based on max date

I have a table SalePrices in SQL Server with data like the following:
SPID ProductID Price Date
001 Pro01 10 2016-03-10
002 Pro01 20 2016-03-11
003 Pro02 10 2016-03-13
004 Pro02 20 2016-03-15
What I want is to create a view that shows each ProductID only once, with the Price I modified most recently. The result should look like this:
ProductID Price Date
Pro01 20 2016-03-11
Pro02 20 2016-03-15
There are a few different approaches to this, for example using row_number():
;with cte as (
select
ProductID, Price, Date,
row_number() over(partition by ProductID order by Date desc) as rn
from <Table>
)
select
ProductID, Price, Date
from cte
where
rn = 1
Another version with windowing functions, this one with FIRST_VALUE();
SELECT ProductID, price, date
FROM products
WHERE spid IN (
SELECT FIRST_VALUE(spid) OVER (PARTITION BY ProductID ORDER BY date DESC) spid
FROM products
)
An SQLfiddle to test with.
Note that Roman's version with ROW_NUMBER should work from SQL Server 2005 and newer, while this will only work for SQL Server 2012 and newer.
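Try this (the columns are qualified with A because both sides of the join expose ProductID and Date):
SELECT
    A.ProductID
    , A.Price
    , A.Date
FROM tablename AS A
JOIN (SELECT ProductID, MAX(Date) AS Date
      FROM tablename
      GROUP BY ProductID
     ) AS B ON A.Date = B.Date AND A.ProductID = B.ProductID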
One more approach, using a correlated subquery:
select t1.productid, t1.price, t1.date
from SalePrices t1
where t1.date = (select max(t2.date) from SalePrices t2 where t1.productid = t2.productid)
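A minimal sketch of the correlated-subquery approach, run against the sample data with Python's built-in sqlite3 module. The in-memory database and the quoted "Date" column are assumptions for illustration; the query itself carries over to SQL Server unchanged apart from the quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE SalePrices (SPID TEXT, ProductID TEXT, Price INTEGER, "Date" TEXT)'
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO SalePrices VALUES (?, ?, ?, ?)",
    [
        ("001", "Pro01", 10, "2016-03-10"),
        ("002", "Pro01", 20, "2016-03-11"),
        ("003", "Pro02", 10, "2016-03-13"),
        ("004", "Pro02", 20, "2016-03-15"),
    ],
)

# Keep a row only if its date is the latest date for the same product.
query = """
SELECT t1.ProductID, t1.Price, t1."Date"
FROM SalePrices t1
WHERE t1."Date" = (SELECT MAX(t2."Date")
                   FROM SalePrices t2
                   WHERE t2.ProductID = t1.ProductID)
ORDER BY t1.ProductID
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

If two rows of one product share the same latest date, this approach returns both of them, whereas a row_number() filter would return exactly one.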
Your last record will have the highest SPID:
select
ProductId, Price, Date
from
SalePrices sap
where
sap.spid =(
select
max(sap2.spid)
from
SalePrices sap2
where
sap2.productId = sap.productId)
This query will give you the desired result:
ProductID Price Date
Pro01 20 2016-03-11
Pro02 20 2016-03-15

How can I generate the latest for an aggregate?

Hey stackoverflow community,
I have a table of Sales; a hypothetical example is shown below.
Customer Revenue State Date
David $100 NY 2016-01-01
David $500 NJ 2016-01-03
Fred $200 CA 2016-01-01
Fred $200 CA 2016-01-02
I'm writing a simple query of revenue generated by customer. The output returns as such:
David $600
Fred $400
What I want to do now is add the row for the latest purchase date.
Desired result:
David $600 2016-01-03
Fred $400 2016-01-02
I would like to keep the SQL code as clean as possible. I also want to avoid doing a JOIN to a new query as this query can start to get complex. Any ideas as to how to do so?
You should sum revenues in your group and get the maximum of dates.
Something like this:
SELECT
Customer, SUM(Revenue) as RevenueSum, MAX([Date]) as [Date]
FROM Sales
GROUP BY Customer
I think this is what you need:
select Customer,sum(Revenue), max(Date) from Sales group by Customer
One way to get the SUM of Revenue and also get the information from the record with the MAX Date is to use the ROW_NUMBER() and SUM() windowed functions.
The SUM() OVER() will apply the sum for the Customer to each row and the ROW_NUMBER() OVER() will give each row an order number by Customer and Date DESC.
Put this in a subquery and select only the records with Row_Number of 1 (max date)
SELECT [Customer],
[Revenue],
[State],
[Date]
FROM (SELECT [Customer],
SUM([Revenue]) OVER (PARTITION BY [Customer]) [Revenue],
[State],
[Date],
ROW_NUMBER() OVER (PARTITION BY [Customer] ORDER BY [Date] DESC) Rn
FROM Sales
) t
WHERE t.Rn = 1
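The windowed SUM plus ROW_NUMBER approach can be sketched with Python's built-in sqlite3 module (SQLite 3.25 or newer is needed for window functions). Storing Revenue as plain integers without the dollar sign and quoting "Date" are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE Sales (Customer TEXT, Revenue INTEGER, State TEXT, "Date" TEXT)'
)
# Sample rows from the question, with revenue stored as integers.
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [
        ("David", 100, "NY", "2016-01-01"),
        ("David", 500, "NJ", "2016-01-03"),
        ("Fred", 200, "CA", "2016-01-01"),
        ("Fred", 200, "CA", "2016-01-02"),
    ],
)

# Every row carries its customer's total; Rn = 1 keeps the most recent row.
query = """
SELECT Customer, Revenue, State, "Date"
FROM (SELECT Customer,
             SUM(Revenue) OVER (PARTITION BY Customer) AS Revenue,
             State,
             "Date",
             ROW_NUMBER() OVER (PARTITION BY Customer ORDER BY "Date" DESC) AS Rn
      FROM Sales) t
WHERE t.Rn = 1
ORDER BY Customer
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

Compared with the plain GROUP BY answers, this version can also return non-aggregated columns such as State from the latest row.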

Hourly sum of values

I have a table with the following structure and sample data:
STORE_ID | INS_TIME | TOTAL_AMOUNT
2 07:46:01 20
3 19:20:05 100
4 12:40:21 87
5 09:05:08 5
6 11:30:00 12
6 14:22:07 100
I need to get the hourly sum of TOTAL_AMOUNT for each STORE_ID.
I tried the following query, but I don't know if it's correct.
SELECT STORE_ID, SUM(TOTAL_AMOUNT) , HOUR(INS_TIME) as HOUR FROM VENDAS201302
WHERE MINUTE(INS_TIME) <=59
GROUP BY HOUR,STORE_ID
ORDER BY INS_TIME;
Not sure why you are not considering different days here. In SQL Server you could get the hourly sum using the DATEPART() function, as below:
SELECT STORE_ID,
       DATEPART(hour, CONVERT(datetime, INS_TIME)) AS HOUR_OF_DAY,
       SUM(TOTAL_AMOUNT) AS HOURLY_SUM
FROM t1
GROUP BY STORE_ID, DATEPART(hour, CONVERT(datetime, INS_TIME))
ORDER BY STORE_ID, HOUR_OF_DAY
In MySQL, you can group by HOUR(INS_TIME) directly:
SELECT STORE_ID,
       HOUR(INS_TIME) AS HOUR_OF_TIME,
       SUM(TOTAL_AMOUNT) AS AMOUNT_SUM
FROM VENDAS201302
GROUP BY STORE_ID, HOUR_OF_TIME
ORDER BY STORE_ID, HOUR_OF_TIME;
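The hourly grouping can be tried out with a small sketch using Python's built-in sqlite3 module. SQLite has neither HOUR() nor DATEPART(), so the hour is taken from the first two characters of the time string. That substitution, and the extra 14:45:00 row for store 6 (added so that one group actually sums two rows), are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VENDAS201302 (STORE_ID INTEGER, INS_TIME TEXT, TOTAL_AMOUNT INTEGER)"
)
conn.executemany(
    "INSERT INTO VENDAS201302 VALUES (?, ?, ?)",
    [
        (2, "07:46:01", 20),
        (3, "19:20:05", 100),
        (4, "12:40:21", 87),
        (5, "09:05:08", 5),
        (6, "11:30:00", 12),
        (6, "14:22:07", 100),
        (6, "14:45:00", 30),  # hypothetical extra row, not in the question
    ],
)

# Extract the hour from 'HH:MM:SS' and sum per store and hour.
query = """
SELECT STORE_ID,
       CAST(substr(INS_TIME, 1, 2) AS INTEGER) AS HOUR_OF_TIME,
       SUM(TOTAL_AMOUNT) AS AMOUNT_SUM
FROM VENDAS201302
GROUP BY STORE_ID, HOUR_OF_TIME
ORDER BY STORE_ID, HOUR_OF_TIME
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```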

Get max of column using sum

I have one table with following data..
saleId amount date
-------------------------
1 2000 10/10/2012
2 3000 12/10/2012
3 2000 11/12/2012
2 3000 12/10/2012
1 4000 11/10/2012
4 6000 10/10/2012
From my table I want result with max of sum amount between dates 10/10/2012 and 12/10/2012 which for the data above will be:
saleId amount
---------------
1 6000
2 6000
4 6000
Here 6000 is the max of the sums (by saleId) so I want ids 1, 2 and 4.
You can use subqueries like this (note that the date filter has to be applied in the outer query as well as the inner one):
SELECT saleId, SUM(amount) AS Amount
FROM Table1
WHERE date BETWEEN '10/10/2012' AND '12/10/2012'
GROUP BY saleId
HAVING SUM(amount) =
(
    SELECT MAX(AMOUNT) FROM
    (
        SELECT SUM(amount) AS AMOUNT FROM Table1
        WHERE date BETWEEN '10/10/2012' AND '12/10/2012'
        GROUP BY saleId
    ) AS A
)
See this SQLFiddle
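A minimal sketch of the HAVING approach, run with Python's built-in sqlite3 module. The dates are stored as ISO strings and read as month/day/year (10 October to 10 December 2012); that reading, the alias total_amount, and the named parameters are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 (saleId INTEGER, amount INTEGER, "date" TEXT)')
# Sample rows from the question, with dates converted to ISO format.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (1, 2000, "2012-10-10"),
        (2, 3000, "2012-12-10"),
        (3, 2000, "2012-11-12"),
        (2, 3000, "2012-12-10"),
        (1, 4000, "2012-11-10"),
        (4, 6000, "2012-10-10"),
    ],
)

# Per-sale sums within the range, keeping only those equal to the largest sum.
query = """
SELECT saleId, SUM(amount) AS total_amount
FROM Table1
WHERE "date" BETWEEN :start AND :end
GROUP BY saleId
HAVING SUM(amount) = (
    SELECT MAX(total)
    FROM (SELECT SUM(amount) AS total
          FROM Table1
          WHERE "date" BETWEEN :start AND :end
          GROUP BY saleId)
)
ORDER BY saleId
"""
rows = conn.execute(query, {"start": "2012-10-10", "end": "2012-12-10"}).fetchall()

for row in rows:
    print(row)
```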
This query goes through the table only once and is fairly optimised.
select top(1) with ties saleid, amount
from (
select saleid, sum(amount) amount
from tbl
where date between '20121010' and '20121210'
group by saleid
) x
order by amount desc;
You can produce the SUM with the WHERE clause as a derived table, then SELECT TOP(1) in the query using WITH TIES to show all the ones with the same (MAX) amount.
When presenting dates to SQL Server, try to always use the format YYYYMMDD for robustness.