Finding max date for a concatenated field - sql

I am trying to take a very large product table that has one row per product status and date, and reduce it to a table that shows the latest status for each product an account owns.
I think I could concatenate the account and product columns and then use that to find the max date, but I'm stumbling with my code. Would appreciate any insight!
Example table
Account  Product    EffectiveDate  Status
10000    Product A  5/1/2021       Live
10000    Product A  9/1/2020       Decomissioned
10000    Product B  12/1/2021      Implementing
My goal output would be:
Account  Product    EffectiveDate  Status
10000    Product A  5/1/2021       Live
10000    Product B  12/1/2021      Implementing

SELECT X.Account,X.Product,X.EffectiveDate,X.Status FROM
(
SELECT E.Account,E.Product,E.EffectiveDate,E.Status,
ROW_NUMBER()OVER(PARTITION BY E.Account,E.Product ORDER BY E.EffectiveDate DESC)AS XCOL
FROM Example_table AS E
)X WHERE X.XCOL=1
Maybe something like this would work.

On some DBMSs you can use a window function with QUALIFY. On others (such as SQL Server) you can use TOP 1 WITH TIES ordered by a window function.
create table T1 (Account int, Product varchar(255), EffectiveDate date, status varchar(255));
insert into T1 (Account, Product , EffectiveDate , "STATUS" ) values
(10000, 'Product A', '2021-05-01', 'Live'),
(10000, 'Product A', '2020-09-01', 'Decomissioned'),
(10000, 'Product B', '2021-12-01', 'Implementing');
-- Snowflake, Teradata, Oracle, others...
select Account, Product, EffectiveDate, Status
from T1
qualify row_number() over (partition by Account, Product order by EffectiveDate desc) = 1
;
-- SQL Server
select top 1 with ties Account, Product, EffectiveDate, Status
from T1
order by row_number() over (partition by Account, Product order by EffectiveDate desc);
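If your engine has neither QUALIFY nor TOP ... WITH TIES, the same idea can be written with a derived table. Below is a minimal sketch that runs the query against an in-memory SQLite database from Python, only so it can be tried without a server. It reuses the T1 sample rows from above; the ISO-formatted text dates and the `ranked` / `rn` names are assumptions made for the illustration.

```python
import sqlite3

# In-memory SQLite database (window functions need SQLite 3.25 or newer).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table T1 (Account int, Product text, EffectiveDate text, Status text)"
)
conn.executemany(
    "insert into T1 values (?, ?, ?, ?)",
    [
        (10000, "Product A", "2021-05-01", "Live"),
        (10000, "Product A", "2020-09-01", "Decomissioned"),
        (10000, "Product B", "2021-12-01", "Implementing"),
    ],
)

# Number the rows per (Account, Product), newest first, then keep row 1.
latest = conn.execute(
    """
    select Account, Product, EffectiveDate, Status
    from (
        select t.*,
               row_number() over (
                   partition by Account, Product
                   order by EffectiveDate desc
               ) as rn
        from T1 as t
    ) as ranked
    where rn = 1
    order by Account, Product
    """
).fetchall()

for row in latest:
    print(row)
```

Partitioning by both Account and Product is what makes concatenating the two columns unnecessary.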

Related

SQL Query - second ID of a list ordered by date and ID

I have a SQL database with a list of customer IDs (CustomerID) and invoices, the specific product purchased in each invoice (ProductID), the Date, and the Income of each invoice. I need to write a query that will retrieve, for each product, the second customer who made a purchase.
How do I do that?
EDIT:
I have come up with the following query:
SELECT *,
LEAD(CustomerID) OVER (ORDER BY ProductID, Date) AS 'Second Customer Who Made A Purchase'
FROM a
ORDER BY ProductID, Date ASC
However, this query presents multiple results for products that have more than two purchases. Can you advise?
SELECT a2.ProductID,
(
SELECT a1.CustomerID
FROM a a1
WHERE a1.ProductID = a2.ProductID
ORDER BY Date asc
LIMIT 1,1
) as SecondCustomer
FROM a a2
GROUP BY a2.ProductID
I need to write a query that will retrieve for each product, which was the second customer who made a purchase
This sounds like a window function:
select a.*
from (select a.*,
row_number() over (partition by productid order by date asc) as seqnum
from a
) a
where seqnum = 2;
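To see what the seqnum = 2 filter returns, here is a small sketch run against SQLite from Python. The rows in table `a` are invented for the illustration (the question gives no sample data), and it assumes dates are stored as ISO text so they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # needs SQLite 3.25+ for window functions
conn.execute(
    'create table a (CustomerID int, ProductID text, "Date" text, Income int)'
)
# Hypothetical invoices: P1 has three purchases, P2 has two, P3 has one.
conn.executemany(
    "insert into a values (?, ?, ?, ?)",
    [
        (1, "P1", "2021-01-01", 10),
        (2, "P1", "2021-01-05", 20),
        (3, "P1", "2021-02-01", 5),
        (2, "P2", "2021-03-01", 7),
        (1, "P2", "2021-03-02", 9),
        (3, "P3", "2021-04-01", 1),
    ],
)

# Number purchases per product by date and keep the second one.
second = conn.execute(
    """
    select ProductID, CustomerID
    from (
        select a.*,
               row_number() over (partition by ProductID order by "Date") as seqnum
        from a
    ) as numbered
    where seqnum = 2
    order by ProductID
    """
).fetchall()

print(second)
```

Products with a single purchase (P3 here) simply drop out. Note that this counts purchases, not distinct customers: if the first customer buys the same product twice before anyone else does, row 2 is that same customer again.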

SQL filter to replace duplicate value records with one single custom value record

I am trying to create a report that shows a count of items (store_Product) purchased by store location(store_ID).
My issue is that when a distinct store location purchases both product_a and product_b, then I need the report to show one record of that store_ID with store_Product as "product_A" instead of having two records with same store_ID and both product_A and product_B.
However, if a distinct store location only purchases product_A OR product_B (but not both) then it would show one record of that store_ID along with what product it purchased as it normally does now.
On the left is what I am getting right now and on the right is what I want the result to look like:
How can I achieve this result?
Thanks!
In Microsoft SQL Server, you can achieve this by using a CTE:
CREATE TABLE #temp (
store_id int,
store_product varchar(25)
)
INSERT INTO #temp
VALUES (100, 'product_A')
, (100, 'product_B')
, (200, 'product_B')
, (300, 'product_A')
, (400, 'product_B')
, (400, 'product_A')
;WITH cte
AS (SELECT
*,
ROW_NUMBER() OVER (PARTITION BY store_id ORDER BY store_id, store_product) AS rn
FROM #temp)
SELECT
store_id , store_product
FROM cte
WHERE rn = 1
DROP TABLE #temp
select store_id, min(store_product) as store_product
from table_name
group by store_id;
... it's another dirty trick that will work with the sample data ;)
In a comment to an answer you are correcting your request. You want to suppress product_B when the same store also has product_A. All other rows shall remain in the result. At least this is how I understand this now.
One way to achieve this is with a NOT IN (or NOT EXISTS) clause:
select
store_id,
store_product
from mytable
where store_product <> 'product_B'
or store_id not in (select store_id from mytable where store_product = 'product_A');
or if you find that more readable:
select
store_id,
store_product
from mytable
where not
(
store_product = 'product_B' and
store_id in (select store_id from mytable where store_product = 'product_A')
);
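The NOT EXISTS form mentioned above is not shown, so here is a rough sketch of it, run against SQLite from Python with the sample rows from the earlier answer. The table name `mytable` follows the queries above; the aliases `t` and `a` are just for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (store_id int, store_product text)")
conn.executemany(
    "insert into mytable values (?, ?)",
    [
        (100, "product_A"),
        (100, "product_B"),
        (200, "product_B"),
        (300, "product_A"),
        (400, "product_B"),
        (400, "product_A"),
    ],
)

# Drop a product_B row only when the same store also has a product_A row.
rows = conn.execute(
    """
    select t.store_id, t.store_product
    from mytable as t
    where not (
        t.store_product = 'product_B'
        and exists (
            select 1
            from mytable as a
            where a.store_id = t.store_id
              and a.store_product = 'product_A'
        )
    )
    order by t.store_id
    """
).fetchall()

print(rows)
```

Unlike NOT IN, the correlated EXISTS is not affected by NULL store_id values in the subquery, which is one reason some people prefer it.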

Unable to retrieve a row with highest value of a column using row number and group by

I am working on a query which returns one row which has highest price in it for each product.
For Example I have
Table T1
Product Price Tax Location
Pen 10 2.25 A
Pen 5 1.25 B
Pen 15 1.5 A
Board 25 5.26 A
Board 2 NULL B
Water 5 10 A
The result should be like
Product Price Tax Location
Pen 15 1.5 A
Board 25 5.26 A
Water 5 10 A
I am using row number() and group by to achieve this using the following
ALTER VIEW [dbo].[InferredBestBids]
AS
SELECT ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL
) ) AS id ,
product ,
MAX(price) AS Price ,
MIN(tax) AS Tax ,
location
FROM [dbo].InferredBids_A
WHERE NOT ( proce IS NULL
AND tax IS NULL
)
GROUP BY market ,
term
GO
When I ran the above query, it threw me the error
Column 'dbo.InferredBids_A.Location' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
When I tried to group the query results by location, it gave me incorrect results by returning multiple rows for a product depending on the location
No GROUP BY is needed if you get your row_number clauses properly engaged and then just select based on the rownumber. Feel free to add an extra row_number call to the front of the outer query if you require it for some other reason. See the example here.
SELECT Product, Price, Tax, Location
FROM (
SELECT Product, Price, Tax, Location, ROW_NUMBER()OVER(PARTITION BY Product ORDER BY Price DESC) as RowID
FROM InferredBids_A
) T
WHERE RowID = 1
If you select something that's aggregated you must GROUP BY anything else in the select list that is not also aggregated:
SELECT Product, Price ,Tax, Location
FROM (SELECT Product, Price ,Tax, Location,
RANK() OVER (PARTITION BY Product ORDER BY Price DESC) N
FROM InferredBids_A
WHERE NOT (Price IS NULL AND Tax IS NULL)
) T WHERE N = 1
(RANK will give rows for ties, use ROW_NUMBER if you don't care about these)
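To make the difference on ties concrete, here is a small sketch run against SQLite from Python. It uses the sample rows from the question plus one extra row ('Pen', 15, 1.75, 'C') that is made up to force a tie; the table name `bids` and the helper `top_rows` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # needs SQLite 3.25+ for window functions
conn.execute("create table bids (Product text, Price int, Tax real, Location text)")
conn.executemany(
    "insert into bids values (?, ?, ?, ?)",
    [
        ("Pen", 10, 2.25, "A"),
        ("Pen", 5, 1.25, "B"),
        ("Pen", 15, 1.5, "A"),
        ("Pen", 15, 1.75, "C"),  # invented row: ties with the Pen at 15
        ("Board", 25, 5.26, "A"),
        ("Board", 2, None, "B"),
        ("Water", 5, 10, "A"),
    ],
)


def top_rows(fn):
    """Return the rows numbered 1 per product, using rank or row_number."""
    sql = f"""
        select Product, Price, Location
        from (
            select b.*,
                   {fn}() over (partition by Product order by Price desc) as n
            from bids as b
        ) as t
        where n = 1
        order by Product, Location
    """
    return conn.execute(sql).fetchall()


with_rank = top_rows("rank")              # keeps both tied Pen rows
with_row_number = top_rows("row_number")  # keeps one Pen row, chosen arbitrarily

print(len(with_rank), len(with_row_number))
```

If you need exactly one row per product and a predictable one, add a tiebreaker column to the ORDER BY of ROW_NUMBER.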
Making some test data:
CREATE TABLE #BestBids
(
Product VARCHAR(20),
Price INT,
Tax DECIMAL(10,2),
Location VARCHAR(10)
)
INSERT INTO #BestBids
VALUES
('Pen', 10, 2.25, 'A'),
('Pen', 5, 1.25, 'B'),
('Pen', 15, 1.5, 'A'),
('Board', 25, 5.26, 'A'),
('Board', 2, NULL, 'B'),
('Water', 5, 10, 'A');
We number the rows so that row 1 is the highest price for each product.
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Product ORDER BY Price DESC) RN
FROM #BestBids
) a
WHERE RN=1
We wrap that query and keep only the rows numbered 1. Here is the output:
Product Price Tax Location RN
Board 25 5.26 A 1
Pen 15 1.50 A 1
Water 5 10.00 A 1
You could use a common table expression:
WITH cte
AS ( SELECT product ,
MAX(price) AS price
FROM dbo.InferredBids_A
GROUP BY product
)
SELECT tbl.product ,
tbl.price ,
tbl.tax ,
tbl.location
FROM dbo.InferredBids_A tbl
INNER JOIN cte ON cte.product = tbl.product
AND cte.price = tbl.price

SQL Standard Value and Variations

Below is a sample of data
UnitID ITEM_Num Price
13446 71079 45.57
13447 71079 45.57
13448 71079 52.50
13449 71079 45.57
13450 71079 36.22
The actual dataset has roughly 100 unique UnitIDs and 700 unique Item_Num values. I am trying to determine the most common price for each Item_Num and then select any records that vary from that standard by more than a specified percent.
Ideally we would have a standard Price value for each item, but we don't. What is the best way to find the most common value? Also, is there a function that could quickly rank the items with the most variation in Price?
This is SQL Server 2012.
You can use a GROUP BY to count how often each price occurs per item:
SELECT ITEM_Num, Price, count(*) AS cnt FROM my_table GROUP BY ITEM_Num, Price ORDER BY ITEM_Num, cnt DESC
Hope this helps!
The following query should work in SQL Server. It should return each unit whose price is at least 10% lower or higher than the most common price for its ITEM_Num.
;WITH cte AS (
SELECT
RANK() OVER (PARTITION BY ITEM_Num ORDER BY COUNT(1) DESC) AS 'Rank'
, ITEM_Num
, Price
FROM Units
GROUP BY ITEM_Num, Price
)
SELECT u1.UnitID
, u1.ITEM_Num
, u1.Price
, u2.Price AS 'most common price'
FROM Units u1
INNER JOIN cte AS u2
ON u2.ITEM_Num = u1.ITEM_Num
AND u2.Rank = 1
WHERE ABS(u1.Price - u2.Price) >= (u2.Price * 0.1);
EDIT: I wrote the query not knowing your DBMS, could probably be more efficient using the ranking functions of SQL Server.
EDIT 2: http://sqlfiddle.com/#!6/74940/33
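For anyone who wants to check the idea without a SQL Server instance, here is a rough equivalent run against SQLite from Python with the five sample rows from the question. The counting is split into two CTEs (`counts` and `ranked`, both names made up here) to keep each step simple, and the 10% threshold is passed as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # needs SQLite 3.25+ for window functions
conn.execute("create table Units (UnitID int, ITEM_Num int, Price real)")
conn.executemany(
    "insert into Units values (?, ?, ?)",
    [
        (13446, 71079, 45.57),
        (13447, 71079, 45.57),
        (13448, 71079, 52.50),
        (13449, 71079, 45.57),
        (13450, 71079, 36.22),
    ],
)

sql = """
with counts as (
    -- how often each price occurs per item
    select ITEM_Num, Price, count(*) as cnt
    from Units
    group by ITEM_Num, Price
),
ranked as (
    -- rank 1 = most common price for the item
    select ITEM_Num, Price,
           rank() over (partition by ITEM_Num order by cnt desc) as rnk
    from counts
)
select u.UnitID, u.ITEM_Num, u.Price, r.Price as common_price
from Units as u
join ranked as r
  on r.ITEM_Num = u.ITEM_Num and r.rnk = 1
where abs(u.Price - r.Price) >= r.Price * ?
order by u.UnitID
"""

outliers = conn.execute(sql, (0.10,)).fetchall()
for row in outliers:
    print(row)
```

With the sample data the most common price is 45.57, so units 13448 and 13450 are returned. If two prices are equally common for an item, RANK gives both rank 1 and units are compared against each of them, so you may want a tiebreaker.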
Create table #t(
UnitID int,
Item_Num int,
Price money
)
Insert into #t(Unitid, Item_Num, Price)
values(13446, 71079, 45.57 ),
(13447, 71079, 45.57),
(13448, 71079, 52.50),
(13449, 71079, 45.57),
(13450, 71079, 36.22)
;with cte as (
Select
Unitid, Item_Num, Price,
Row_Number() over ( partition by item_num order by price) rownum
from #t
)
Select
u.UnitID,
u.Item_Num,
u.Price,
U1.price as CommonPrice,
u.RowNum,
U.Price*0.1,
(u.price +(u.price*0.1)) as NewPrice
from cte as U
inner join #t u1 on u.item_num =u1.item_num
where u.rownum =1

SQL Group BY SUM one column and select of first row of grouped items

I have a part table with 5 fields. I want to sum the QTY per MFGPN while showing the first returned row for the other 3 fields (Manufacturer, DateCode, Description). I initially thought of using the MIN function as follows, but that doesn't really help me because the data is not an int data type. How would I go about doing this? Right now I'm stuck at the following query:
SELECT SUM([QTY]) AS QTY
,[MFGPN]
,MIN([MANUFACTURER]) AS MANUFACTURER
,MIN([DATECODE]) AS DateCode
,MIN([DESCRIPTION]) AS DESCRIPTION
INTO part
GROUP BY MFGPN, MANUFACTURER, DATECODE, description
ORDER BY mfgpn ASC
Would CROSS APPLY work for you?
SELECT
SUM(a.[QTY]) AS QTY
,a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
FROM part a
CROSS APPLY (SELECT TOP 1 * FROM part b WHERE a.[MFGPN] = b.[MFGPN]) c
GROUP BY
a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
Tested with the following:
CREATE TABLE #T1 (
[QTY] int
,[MFGPN] NVARCHAR(50)
,[MANUFACTURER] NVARCHAR(50)
,[DATECODE] DATE
,[DESCRIPTION] NVARCHAR(50));
INSERT #T1 VALUES
(2, 'MFGPN-1', 'MANUFACTURER-A', '20120101', 'A-1'),
(4, 'MFGPN-1', 'MANUFACTURER-B', '20120102', 'B-1'),
(3, 'MFGPN-1', 'MANUFACTURER-C', '20120103', 'C-1'),
(1, 'MFGPN-2', 'MANUFACTURER-A', '20120101', 'A-2'),
(5, 'MFGPN-2', 'MANUFACTURER-B', '20120101', 'B-2')
SELECT
SUM(a.[QTY]) AS QTY
,a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
FROM #T1 a
CROSS APPLY (SELECT TOP 1 * FROM #T1 b WHERE a.[MFGPN] = b.[MFGPN]) c
GROUP BY
a.[MFGPN]
,c.[MANUFACTURER]
,c.[DATECODE]
,c.[DESCRIPTION]
Produces
QTY MFGPN MANUFACTURER DATECODE DESCRIPTION
9 MFGPN-1 MANUFACTURER-A 2012-01-01 A-1
6 MFGPN-2 MANUFACTURER-A 2012-01-01 A-2
This can be easily managed with a windowed SUM():
WITH summed_and_ranked AS (
SELECT
MFGPN,
MANUFACTURER,
DATECODE,
DESCRIPTION,
QTY = SUM(QTY) OVER (PARTITION BY MFGPN),
RNK = ROW_NUMBER() OVER (
PARTITION BY MFGPN
ORDER BY DATECODE -- or which column should define the order?
)
FROM atable
)
SELECT
MFGPN,
MANUFACTURER,
DATECODE,
DESCRIPTION,
QTY
INTO parts
FROM summed_and_ranked
WHERE RNK = 1
;
For every row, the total group quantity and the ranking within the group is calculated. When actually getting rows for inserting into the new table (the main SELECT), only rows with RNK values of 1 are pulled. Thus you get a result set containing group totals as well as details of certain rows.
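Here is a small sketch of the windowed SUM() plus ROW_NUMBER() approach, run against SQLite from Python with the test rows from the CROSS APPLY answer. Two things are assumptions on my part: DATECODE is stored as text, and MANUFACTURER is added as a tiebreaker because both MFGPN-2 rows share the same DATECODE. The column alias TOTAL_QTY is also just for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # needs SQLite 3.25+ for window functions
conn.execute(
    "create table part (QTY int, MFGPN text, MANUFACTURER text, "
    "DATECODE text, DESCRIPTION text)"
)
conn.executemany(
    "insert into part values (?, ?, ?, ?, ?)",
    [
        (2, "MFGPN-1", "MANUFACTURER-A", "20120101", "A-1"),
        (4, "MFGPN-1", "MANUFACTURER-B", "20120102", "B-1"),
        (3, "MFGPN-1", "MANUFACTURER-C", "20120103", "C-1"),
        (1, "MFGPN-2", "MANUFACTURER-A", "20120101", "A-2"),
        (5, "MFGPN-2", "MANUFACTURER-B", "20120101", "B-2"),
    ],
)

sql = """
with summed_and_ranked as (
    select MFGPN, MANUFACTURER, DATECODE, DESCRIPTION,
           -- group total repeated on every row of the group
           sum(QTY) over (partition by MFGPN) as TOTAL_QTY,
           -- position within the group; MANUFACTURER breaks DATECODE ties
           row_number() over (
               partition by MFGPN
               order by DATECODE, MANUFACTURER
           ) as RNK
    from part
)
select MFGPN, MANUFACTURER, DATECODE, DESCRIPTION, TOTAL_QTY
from summed_and_ranked
where RNK = 1
order by MFGPN
"""

result = conn.execute(sql).fetchall()
for row in result:
    print(row)
```

Without a tiebreaker, which of the tied rows gets RNK = 1 is up to the engine, so "first row" is only well defined once the ORDER BY is unique within each group.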