PostgreSQL SELECT the last order per customer per date range

PostgreSQL SELECT the last order per customer per date range - sql

In PostgreSQL:
I have a Table that has 3 columns:
CustomerNum, OrderNum, OrderDate.
There may(or may not) be many orders for each customer per date range. What I am needing is the last OrderNum for each Customer that lies in the date range that is supplied.
What I have been doing is getting a ResultSet of the customers and querying each one separately, but this is taking too much time.
Is there any way of using a sub-select to select out the customers, then get the last OrderNum for each Customer?

On postgres you can also use the non-standard DISTINCT ON clause:
SELECT DISTINCT ON (CustomerNum) CustomerNum, OrderNum, OrderDate
FROM Orders
WHERE OrderDate BETWEEN 'yesterday' AND 'today'
ORDER BY CustomerNum, OrderDate DESC;
See http://www.postgresql.org/docs/current/static/sql-select.html#SQL-DISTINCT

select customernum, max(ordernum)
from table
where orderdate between '...' and '...'
group by customernum
that's all.

SELECT t1.CustomerNum, t1.OrderNum As LastOrderNum, t1.LastOrderDate
FROM table1 As t1
WHERE t1.OrderDate = (SELECT MAX(t2.OrderDate)
FROM table1 t2
WHERE t1.CustomerNum = t2.CustomerNum
AND t2.OrderDate BETWEEN date1 AND date2)
AND t1.OrderDate BETWEEN date1 AND date2

Not sure about your Customer table's structure or relationships, but this should work:
SELECT Customer.Num, (
SELECT OrderNum FROM Orders WHERE CustomerNum = Customer.Num AND OrderDate BETWEEN :start AND :end ORDER BY OrderNum DESC LIMIT 1
) AS LastOrderNum
FROM Customer

If by last order number you mean the largest order number then you can just use your select as the predicate for customer num, group the results and select the maximum:
SELECT CustomerNum, MAX(OrderNum) AS LastOrderNum
FROM Orders
WHERE
CustomerNum IN (SELECT CustomerNum FROM ...)
AND
OrderDate BETWEEN :first_date AND :last_date
GROUP BY CustomerNum
If the last order number isn't necessarily the largest order number then you'll need to either find the largest order date for each customer and join it together with the rest of the orders to find the corresponding number(s):
SELECT O.CustomerNum, O.OrderNum AS LastOrderNum
FROM
(SELECT CustomerNum, MAX(OrderDate) AS OrderDate
FROM Orders
WHERE
OrderDate BETWEEN :first_date AND :last_date
AND
CustomerNum IN (SELECT CustomerNum FROM ...)
GROUP BY CustomerNum
) AS CustLatest
INNER JOIN
Orders AS O USING (CustomerNum, OrderDate);

-- generate some data
DROP TABLE tmp.orders;
CREATE TABLE tmp.orders
( id INTEGER NOT NULL
, odate DATE NOT NULL
, payload VARCHAR
)
;
ALTER TABLE tmp.orders ADD PRIMARY KEY (id,odate);
INSERT INTO tmp.orders(id,odate,payload) VALUES
(1, '2011-10-04' , 'one' )
, (1, '2011-10-24' , 'two' )
, (1, '2011-10-25' , 'three' )
, (1, '2011-10-26' , 'four' )
, (2, '2011-10-23' , 'five' )
, (2, '2011-10-24' , 'six' )
;
-- CTE to the rescue ...
WITH sel AS (
SELECT * FROM tmp.orders
WHERE odate BETWEEN '2011-10-23' AND '2011-10-24'
)
SELECT * FROM sel s0
WHERE NOT EXISTS (
SELECT * FROM sel sx
WHERE sx.id = s0.id
AND sx.odate > s0.odate
)
;
result:
DROP TABLE
CREATE TABLE
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "orders_pkey" for table "orders"
ALTER TABLE
INSERT 0 6
id | odate | payload
----+------------+---------
1 | 2011-10-24 | two
2 | 2011-10-24 | six
(2 rows)

Related

How to select columns that aren't part of an aggregate query using HAVING SUM() in the WHERE and selecting only certain rows on db2

Using AS400 db2 for this.
I have a table of orders. From that table I have to:
Get all orders from a specified list of order IDs and type
Group by the user_id on those orders
Check to make sure the total order amount on the group is greater than $100
Return all orders that matched the group but the results won't be grouped, which includes order_id which is not part of the group
I got a bit stuck because the AS400 did not like that I was asking to select a field that wasn't part of the group, which I need.
I came up with this query, but it's slow.
-- Create a common temp table we can use in both places
WITH wantedOrders AS (
SELECT order_id FROM orders
WHERE
-- Only orders from the web
order_type = 'web'
-- And only orders that we want to get at this time
AND order_id IN
(
50,
20,
30
)
)
-- Our main select that gets all order information, even the non-grouped stuff
SELECT
t1.order_id,
t1.user_id,
t1.amount,
t2.total_amount,
t2.count
FROM orders AS t1
-- Join in the group data where we can do our query
JOIN (
SELECT
user_id,
SUM(amount) as total_amount,
COUNT(*) AS count
FROM
orders
-- Re use the temp table to get the order numbers
WHERE order_id IN (SELECT order_id FROM wantedOrders)
GROUP BY
user_id
HAVING SUM(amount)>100
) AS t2 ON t2.user_id=t1.user_id
-- Make sure we only use the order numbers
WHERE order_id IN (SELECT order_id FROM wantedOrders)
ORDER BY t1.user_id ASC;
What's the better way to write this query?

Try this:
WITH
wantedOrders (order_id) AS
(
VALUES 1, 2
)
, orders (order_id, user_id, amount) AS
(
VALUES
(1, 1, 50)
, (2, 1, 50)
, (1, 2, 60)
, (2, 2, 60)
, (3, 3, 200)
, (4, 3, 200)
)
-- Our main select that gets all order information, even the non-grouped stuff
SELECT *
FROM
(
SELECT
order_id,
user_id,
amount,
SUM (amount) OVER (PARTITION BY user_id) AS total_amount,
COUNT (*) OVER (PARTITION BY user_id) AS count
FROM orders t
WHERE EXISTS
(
SELECT 1
FROM wantedOrders w
WHERE w.order_id = t.order_id
)
) A
WHERE total_amount > 100
ORDER BY user_id ASC
ORDER_ID
USER_ID
AMOUNT
TOTAL_AMOUNT
COUNT
1
2
60
120
2
2
2
60
120
2

If order_id is the PK of the table. Then just add the columns you need to the wantedOrders query and use it as your "base" (instead of using orders and refiltering it. You should end up joining wantedOrders with itself.

You can do:
select t.*
from orders t
join (
select user_id
from orders t
where order_id in (50, 20, 30)
group by user_id
having sum(total_amount) > 100
) s on s.user_id = t.user_id
The first table orders as t will produce the data you want. It will be filtered by the second "table expression" s that preselects the groups according to your logic.

sql - select all rows that have all multiple same cols

I have a table with 4 columns.
date
store_id
product_id
label_id
and I need to find all store_ids that have all products_id with same label_id (for example 4)in one day.
for example:
store_id | label_id | product_id | data|
4 4 5 9/2
5 4 7 9/2
4 3 12 9/2
4 4 7 9/2
so it should return 4 because it's the only store that contains all possible products with label 4 at one day.
I have tried something like this:
(select store_id, date
from table
where label_id = 4
group by store_id, date
order by date)
I dont know how to write the outer query, I tried:
select * from table
where product_id = all(Inner query)
but it didnt work.
Thanks

It is unclear from your question whether the labels are specific to a given day or through the entire period. But a variation of Tim's answer seems appropriate. For any label:
SELECT t.date, t.label, t.store_id
FROM t
GROUP BY t.date, t.label, t.store_id
HAVING COUNT(DISTINCT t.product_id) = (SELECT COUNT(DISTINCT t2product_id)
FROM t t2
WHERE t2.label = t.label
);
For a particular label:
SELECT t.date, t.store_id
FROM t
WHERE t.label = 4
GROUP BY t.date,t.store_id
HAVING COUNT(DISTINCT t.product_id) = (SELECT COUNT(DISTINCT t2product_id)
FROM t t2
WHERE t2.label = t.label
);
If the labels are specific to the date, then you need that comparison in the outer queries as well.

Here is one way:
SELECT date, store_id
FROM yourTable
GROUP BY date, store_id
HAVING COUNT(DISTINCT product_id) = (SELECT COUNT(DISTINCT product_id)
FROM yourTable t2
WHERE t2.date = t1.date)
ORDER BY date, product_id;
This query reads in a pretty straightforward way, and it says to find every product, on some date, whose distinct product count is the same as the distinct product count on the same day, across all stores.

I'd probably aggregate to lists of products in a string or array:
with products_per_day_and_store as
(
select
store_id,
date,
string_agg(distinct product_id order by product_id) as products
from mytable
where label_id = 4
group by store_id, date
)
, products_per_day
(
select
date,
string_agg(distinct product_id order by product_id) as products
from mytable
where label_id = 4
group by date
)
select distinct ppdas.store_id
from products_per_day_and_store ppdas
join products_per_day ppd using (date, products);

Return only unique rows from a Table

I have a table with 4 columns and 7 rows.
This table contains 1 customer with the same ID same LNAME and FNAME.
Also the table has 2 customers with the same ID, but different LNAME or FNAME.
That is the sales reps input error. Ideally my table should have only 2 rows (Row with ID_pk 3 and 7)
I need to have the following result-sets from the above table:
All unique rows by all the four columns (Row with ID_pk 3 and 7). (excluding case # 3 listed below)
All duplicates by all the four columns (Row with ID_pk 3 and 8).
All duplicates by Customer_ID but with not matching LNAME and/or FNAME (Row with ID_pk 1, 2, 4 and 5) (these rows have to be sent back to sales reps for validation.)

Doing stuff this like relies heavily on nested queries, the GROUP BY clause, and the COUNT function.
Part 1 - Unique rows
This query will show you all the rows where the customer ID has matching data.
SELECT Customer_ID, Customer_FNAME, Customer_LNAME FROM dbo.customers WHERE Customer_ID IN (
SELECT Customer_ID FROM (
SELECT DISTINCT Customer_ID, Customer_FNAME, Customer_LNAME FROM dbo.customers
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
) Customers
GROUP BY Customer_ID
HAVING COUNT(Customer_ID) = 1
)
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
Part 2 - Duplicates
This query will show you all the rows that have the same data entered more than once.
SELECT Customer_ID, Customer_FNAME, Customer_LNAME
FROM dbo.customers
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
HAVING COUNT(Customer_ID) > 1
Part 3 - Mismatched Data
This query is basically the same as the first, just looking for a different COUNT value.
SELECT Customer_ID, Customer_FNAME, Customer_LNAME FROM dbo.customers WHERE Customer_ID IN (
SELECT Customer_ID FROM (
SELECT DISTINCT Customer_ID, Customer_FNAME, Customer_LNAME FROM dbo.customers
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME
) Customers
GROUP BY Customer_ID
HAVING COUNT(Customer_ID) > 1
)
GROUP BY Customer_ID, Customer_FNAME, Customer_LNAME

You may use a CTE (Common Table expression): https://msdn.microsoft.com/en-us/library/ms175972.aspx
;WITH checkDup AS (
SELECT Customer_ID, ROW_NUMBER() OVER (PARTITION BY Customer_ID ORDER BY Customer ID) AS 'RN'
FROM Table)
SELECT Customer_ID FROM checkDup
WHERE RN = 1;
Will give you your example output.
You may manipulate the CTE to get the other results you seek.

display max on one columns with multiple columns in output

How can I display maximum OrderId for a CustomerId with many columns?
I have a table with following columns:
CustomerId, OrderId, Status, OrderType, CustomerType
A customer with Same customer id could have many order ids(1,2,3..) I want to be able to display the max Order id with the rest of the customers in a sql view. how can I achieve this?
Sample Data:
CustomerId OrderId OrderType
145042 1 A
110204 1 C
145042 2 D
162438 1 B
110204 2 B
103603 1 C
115559 1 D
115559 2 A
110204 3 A

I'd use a common table expression and ROW_NUMBER:
;With Ordered as (
select *,
ROW_NUMBER() OVER (PARTITION BY CustomerID
ORDER BY OrderId desc) as rn
from [Unnamed table from the question]
)
select * from Ordered where rn = 1

select * from table_name
where orderid in
(select max(orderid) from table_name group by customerid)

One way to do this is with not exists:
select t.*
from table t
where not exists (select 1
from table t2
where t2.CustomerId = t.CustomerId and
t2.OrderId > t.OrderId
);
This is saying: "get me all rows from t where there is no higher order id for the customer."

SQL Group BY SUM one column and select of first row of grouped items

I have a part table where I have 5 fields. I want to sum the QTY of the mfgpn while showing the first returned row for the other 3 fields (Manfucturer, DateCode, Description). I initially thought of using the MIN function as follows, but that doesn't really help me insofar as that the data is not a int data type. How would I go about doing this? Right now I'm stuck at the following query below:
SELECT SUM([QTY]) AS QTY
,[MFGPN]
,MIN([MANUFACTURER]) AS MANUFACTURER
,MIN([DATECODE]) AS DateCode
,MIN([DESCRIPTION]) AS DESCRIPTION
INTO part
GROUP BY MFGPN, MANUFACTURER, DATECODE, description
ORDER BY mfgpn ASC

Would CROSS APPLY work for you?
SELECT
SUM(a.[QTY]) AS QTY
,a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
FROM part a
CROSS APPLY (SELECT TOP 1 * FROM part b WHERE a.[MFGPN] = b.[MFGPN]) c
GROUP BY
a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
Tested with the following:
DECLARE #T1 AS TABLE (
[QTY] int
,[MFGPN] NVARCHAR(50)
,[MANUFACTURER] NVARCHAR(50)
,[DATECODE] DATE
,[DESCRIPTION] NVARCHAR(50));
INSERT #T1 VALUES
(2, 'MFGPN-1', 'MANUFACTURER-A', '20120101', 'A-1'),
(4, 'MFGPN-1', 'MANUFACTURER-B', '20120102', 'B-1'),
(3, 'MFGPN-1', 'MANUFACTURER-C', '20120103', 'C-1'),
(1, 'MFGPN-2', 'MANUFACTURER-A', '20120101', 'A-2'),
(5, 'MFGPN-2', 'MANUFACTURER-B', '20120101', 'B-2')
SELECT
SUM(a.[QTY]) AS QTY
,a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
FROM #T1 a
CROSS APPLY (SELECT TOP 1 * FROM #T1 b WHERE a.[MFGPN] = b.[MFGPN]) c
GROUP BY
a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
Produces
QTY MFGPN MANUFACTURER DATECODE DESCRIPTION
9 MFGPN-1 MANUFACTURER-A 2012-01-01 A-1
6 MFGPN-2 MANUFACTURER-A 2012-01-01 A-2

This can be easily managed with a windowed SUM():
WITH summed_and_ranked AS (
SELECT
MFGPN,
MANUFACTURER,
DATECODE,
DESCRIPTION,
QTY = SUM(QTY) OVER (PARTITION BY MFGPN),
RNK = ROW_NUMBER() OVER (
PARTITION BY MFGPN
ORDER BY DATECODE -- or which column should define the order?
)
FROM atable
)
SELECT
MFGPN,
MANUFACTURER,
DATECODE,
DESCRIPTION,
QTY,
INTO parts
FROM summed_and_ranked
WHERE RNK = 1
;
For every row, the total group quantity and the ranking within the group is calculated. When actually getting rows for inserting into the new table (the main SELECT), only rows with RNK values of 1 are pulled. Thus you get a result set containing group totals as well as details of certain rows.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

PostgreSQL SELECT the last order per customer per date range - sql

select customernum, max(ordernum) from table where orderdate between '...' and '...' group by customernum that's all.

SELECT t1.CustomerNum, t1.OrderNum As LastOrderNum, t1.LastOrderDate FROM table1 As t1 WHERE t1.OrderDate = (SELECT MAX(t2.OrderDate) FROM table1 t2 WHERE t1.CustomerNum = t2.CustomerNum AND t2.OrderDate BETWEEN date1 AND date2) AND t1.OrderDate BETWEEN date1 AND date2

Not sure about your Customer table's structure or relationships, but this should work: SELECT Customer.Num, ( SELECT OrderNum FROM Orders WHERE CustomerNum = Customer.Num AND OrderDate BETWEEN :start AND :end ORDER BY OrderNum DESC LIMIT 1 ) AS LastOrderNum FROM Customer

Related

How to select columns that aren't part of an aggregate query using HAVING SUM() in the WHERE and selecting only certain rows on db2

sql - select all rows that have all multiple same cols

Return only unique rows from a Table

display max on one columns with multiple columns in output

SQL Group BY SUM one column and select of first row of grouped items

Categories

Resources