SQL query to get status as of a given date - sql

I'm sure this has been answered before but couldn't find it.
I have a table of items which change status every few weeks. I want to look at an arbitrary day and figure out how many items were in each status.
For example:
tbl_ItemHistory (ItemID, StatusChangeDate, StatusID)
Sample data:
1001, 1/1/2010, 1
1001, 4/5/2010, 2
1001, 6/15/2010, 4
1002, 4/1/2010, 1
1002, 6/1/2010, 3
...
So I need to figure out how many items were in each status for a given day. So on 5/1/2010, there was one item (1001) in status 2 and one item in status 1 (1002).
Since these items don't change status very often, maybe I could create a cached table every night that has a row for every item and every day of the year? I'm not sure if that's the best approach, or how to do it, though.
I'm using SQL Server 2008R2

For an arbitrary day, you can do something like this:
select ih.*
from (select ih.*,
             row_number() over (partition by itemId order by StatusChangeDate desc) as seqnum
      from tbl_ItemHistory ih
      where StatusChangeDate <= @YOURDATEGOESHERE
     ) ih
where seqnum = 1
The idea is to enumerate all the history records for each item on or before the date, using row_number(). The ordering is in reverse chronological order, so the most recent record on or before the date has a value of 1.
The query then just chooses the records whose value is 1.
To aggregate the results to get each status for the date, use:
select statusId, count(*)
from (select ih.*,
             row_number() over (partition by itemId order by StatusChangeDate desc) as seqnum
      from tbl_ItemHistory ih
      where StatusChangeDate <= @YOURDATEGOESHERE
     ) ih
where seqnum = 1
group by StatusId
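As a rough check of the aggregate query above, here is a minimal sketch that runs it against the sample rows from the question. It assumes an in-memory SQLite database driven from Python rather than SQL Server 2008 R2, ISO-formatted date strings, and a `?` parameter in place of the date variable; the helper name `status_counts` is made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; dates are ISO strings
# so that text comparison matches chronological order.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table tbl_ItemHistory (ItemID integer, StatusChangeDate text, StatusID integer)"
)
conn.executemany(
    "insert into tbl_ItemHistory values (?, ?, ?)",
    [
        (1001, "2010-01-01", 1),
        (1001, "2010-04-05", 2),
        (1001, "2010-06-15", 4),
        (1002, "2010-04-01", 1),
        (1002, "2010-06-01", 3),
    ],
)

STATUS_COUNTS_SQL = """
select StatusID, count(*) as ItemCount
from (select ih.*,
             row_number() over (partition by ItemID
                                order by StatusChangeDate desc) as seqnum
      from tbl_ItemHistory ih
      where StatusChangeDate <= ?
     ) ih
where seqnum = 1
group by StatusID
order by StatusID
"""


def status_counts(as_of):
    """Hypothetical helper: map each StatusID to its item count as of a date."""
    return dict(conn.execute(STATUS_COUNTS_SQL, (as_of,)).fetchall())


print(status_counts("2010-05-01"))  # {1: 1, 2: 1}
```

On 5/1/2010 this gives one item in status 1 (1002) and one in status 2 (1001), matching the expected result in the question.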

Related

How to show, for every user, the last two payment dates and the sum of amounts?

I have two tables. From table 1, I'm trying to extract the last two tax dates for each user who was last taxed on 19/06/2022 and who has product id 12 in table 2, along with the sum of the tax amounts and the time range between those two tax dates, as shown in the image below.
The first step is to add a RANK() or ROW_NUMBER() that orders each id's payments from most recent to oldest, so that you keep only the last two payments.
The next step is to aggregate those rows to get the min and max dates and the sum of the amounts.
Lastly, you calculate the difference between the min and max dates.
WITH LAST_TWO AS (
    -- number each id's payments from most recent to oldest and keep the last two
    SELECT *, ROW_NUMBER() OVER(PARTITION BY id ORDER BY tax_date DESC) AS time_ago
    FROM table1
    QUALIFY time_ago <= 2
),
AGG AS (
    SELECT
        id,
        MIN(tax_date) AS tax_date_MIN,
        MAX(tax_date) AS tax_date_MAX,
        SUM(amount) AS amount_SUM
    FROM LAST_TWO
    GROUP BY id
)
-- id exists in both AGG and table2, so it must be qualified here
SELECT AGG.id, amount_SUM, DATEDIFF(day, tax_date_MIN, tax_date_MAX) AS DATE_RANGE
FROM AGG
INNER JOIN table2 ON AGG.id = table2.id
WHERE table2.product_id = 12;
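The three steps can be tried out with a small sketch. It is an adaptation, not the query above: it assumes an in-memory SQLite database, which has neither QUALIFY nor DATEDIFF, so the row-number filter moves into a WHERE clause and the day difference is computed with julianday(). The rows in table1 and table2 are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for the original database; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table table1 (id integer, tax_date text, amount integer);
create table table2 (id integer, product_id integer);
insert into table1 values
    (1, '2022-04-01', 10), (1, '2022-05-20', 20), (1, '2022-06-19', 30),
    (2, '2022-06-01', 5),  (2, '2022-06-19', 7),
    (3, '2022-06-10', 100), (3, '2022-06-19', 50);
insert into table2 values (1, 12), (2, 12), (3, 7);
""")

LAST_TWO_SQL = """
with ranked as (
    select id, tax_date, amount,
           row_number() over (partition by id order by tax_date desc) as time_ago
    from table1
),
agg as (
    select id,
           min(tax_date) as tax_date_min,
           max(tax_date) as tax_date_max,
           sum(amount)   as amount_sum
    from ranked
    where time_ago <= 2          -- replaces QUALIFY time_ago <= 2
    group by id
)
select agg.id,
       amount_sum,
       -- replaces DATEDIFF(day, ...): difference in whole days
       cast(julianday(tax_date_max) - julianday(tax_date_min) as integer) as date_range
from agg
inner join table2 on agg.id = table2.id
where table2.product_id = 12
order by agg.id
"""

last_two = conn.execute(LAST_TWO_SQL).fetchall()
print(last_two)  # [(1, 50, 30), (2, 12, 18)]
```

Like the query above, the sketch does not filter on the user's last tax date being 19/06/2022; that condition would still need to be added.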

SQL Distinct / GroupBy

Ok, I’m stuck on an SQL query and tried long enough that it’s time to ask for help :) I'm using Objection.js – but that's not super relevant as I really just can't figure out how to structure the SQL.
I have the following example data set:
Items
id  name
1   Test 1
2   Test 2
3   Test 3

Listings
id  item_id  price  created_at
1   1        100    1654640000
2   1        60     1654640001
3   1        80     1654640002
4   2        90     1654640003
5   2        90     1654640004
6   3        50     1654640005
What I’m trying to do:
Return the lowest priced listing for each item
If all listings for an item have the same price, I want to return the newest of the two items
Overall, I want to return the resulting items by price
I’m trying to write a query that returns the data:
id  item_id  name    price  created_at
6   3        Test 3  50     1654640005
2   1        Test 1  60     1654640001
5   2        Test 2  90     1654640004
Any help would be greatly appreciated! I'm also starting fresh, so I can add new columns to the data if that would help at all :)
An example of where my query is right now:
select *
from "listings"
inner join (
    select "item_id", MIN(price) as "min_price"
    from "listings"
    group by "item_id"
) as "grouped_listings"
    on "listings"."item_id" = "grouped_listings"."item_id"
    and "listings"."price" = "grouped_listings"."min_price"
where "listings"."sold_at" is null
    and "listings"."expires_at" > ?
order by CAST(price AS DECIMAL) ASC
limit ?;
This gets me listings – but if two listings have the same price, it returns multiple listings with the same item_id – not ideal.
Given the postgresql tag, this should work:
with listings_numbered as (
    select *, row_number() over (
        partition by item_id
        order by price asc, created_at desc
    ) as rownum
    from listings
)
select l.id, l.item_id, i.name, l.price, l.created_at
from listings_numbered l
join items i on l.item_id=i.id
where l.rownum=1
order by price asc;
This is a bit of an advanced query, using window functions and a common table expression, but we can break it down.
with listings_numbered as (...) select simply means to run the query inside of the ..., and then we can refer to the results of that query as listings_numbered inside of the select, as though it was a table.
We're selecting all of the columns in listings, plus one more:
row_number() over (partition by item_id order by price asc, created_at desc). partition by item_id means that we would like the row number to reset for each new item_id, and the order by specifies the ordering that the rows should get within each partition before we number them: first increasing by price, then decreasing by creation time to break ties.
The result of the CTE listings_numbered looks like:
id  item_id  price  created_at  rownum
2   1        60     1654640001  1
3   1        80     1654640002  2
1   1        100    1654640000  3
5   2        90     1654640004  1
4   2        90     1654640003  2
6   3        50     1654640005  1
If you look at only the rows where rownum (the last column) is 1, then you can see that it's exactly the set of listings that you're interested in.
The outer query then selects from this dataset, joins on items to get the name, filters to only the listings where rownum is 1, and sorts by price, to get the final result:
id  item_id  name    price  created_at
6   3        Test 3  50     1654640005
2   1        Test 1  60     1654640001
5   2        Test 2  90     1654640004
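To reproduce that result end to end, here is a small sketch that loads the sample Items and Listings rows from the question and runs the row_number() query. It assumes an in-memory SQLite database instead of PostgreSQL, which accepts this particular query unchanged; the Python variable names are made up.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; data is the question's sample set.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table items (id integer primary key, name text);
create table listings (
    id integer primary key, item_id integer, price integer, created_at integer
);
insert into items values (1, 'Test 1'), (2, 'Test 2'), (3, 'Test 3');
insert into listings values
    (1, 1, 100, 1654640000),
    (2, 1, 60,  1654640001),
    (3, 1, 80,  1654640002),
    (4, 2, 90,  1654640003),
    (5, 2, 90,  1654640004),
    (6, 3, 50,  1654640005);
""")

LOWEST_LISTING_SQL = """
with listings_numbered as (
    select *, row_number() over (
        partition by item_id
        order by price asc, created_at desc   -- cheapest first, newest breaks ties
    ) as rownum
    from listings
)
select l.id, l.item_id, i.name, l.price, l.created_at
from listings_numbered l
join items i on l.item_id = i.id
where l.rownum = 1
order by l.price asc
"""

cheapest = conn.execute(LOWEST_LISTING_SQL).fetchall()
for row in cheapest:
    print(row)
# (6, 3, 'Test 3', 50, 1654640005)
# (2, 1, 'Test 1', 60, 1654640001)
# (5, 2, 'Test 2', 90, 1654640004)
```

Item 2 has two listings at price 90, and the `created_at desc` tie-breaker picks listing 5, the newer of the two.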
Aggregate functions, such as the MIN function you used in your query, are a viable option, but if you want an efficient query for this problem, window functions can be your best friends. This class of functions lets you compute values over "windows" (partitions) of your table, defined by the columns you specify.
For the solution to this problem I'm going to compute two values using window functions:
the minimum value of "listings.price", partitioning on "listings.item_id",
the maximum value of "listings.created_at", partitioning on "listings.item_id" and "listings.price"
SELECT *,
MIN(price) OVER(PARTITION BY item_id) AS min_price,
MAX(created_at) OVER(PARTITION BY item_id, price) AS max_created_at
FROM listings
Once you have all records of listings associated to the corresponding minimum price and latest date, it's necessary for you to select the records whose
price equals the minimum price
created_at equals the most recent created_at
WITH cte AS (
SELECT *,
MIN(price) OVER(PARTITION BY item_id) AS min_price,
MAX(created_at) OVER(PARTITION BY item_id, price) AS max_created_at
FROM listings
)
SELECT id,
item_id,
price,
created_at
FROM cte
WHERE price = min_price
AND created_at = max_created_at
If you need to order by price, it's sufficient to add an ORDER BY price clause.

Selecting 5 Most Recent Records Of Each Group

The below statement retrieves the top 2 records within each group in SQL Server. It works correctly, however as you can see it doesn't scale at all. I mean that if I wanted to retrieve the top 5 or 10 records instead of just 2, you can see how this query statement would grow very quickly.
How can I convert this query into something that returns the same records, but that I can quickly change it to return the top 5 or 10 records within each group instead, rather than just 2? (i.e. I want to just tell it to return the top 5 within each group, rather than having 5 unions as the below format would require)
Thanks!
WITH tSub
as (SELECT CustomerID,
TransactionTypeID,
Max(EventDate) as EventDate,
Max(TransactionID) as TransactionID
FROM Transactions
WHERE ParentTransactionID is NULL
Group By CustomerID,
TransactionTypeID)
SELECT *
from tSub
UNION
SELECT t.CustomerID,
t.TransactionTypeID,
Max(t.EventDate) as EventDate,
Max(t.TransactionID) as TransactionID
FROM Transactions t
WHERE t.TransactionID NOT IN (SELECT tSub.TransactionID
FROM tSub)
and ParentTransactionID is NULL
Group By CustomerID,
TransactionTypeID
Use PARTITION BY to solve this type of problem:
select <columns>
from (select <columns>,
             row_number() over (partition by <GroupColumn> order by <OrderColumn> desc) as rownum
      from YourTable) ut
where ut.rownum <= 5
This partitions the result on the grouping column, orders each partition by the EventDate column with the most recent first, and then selects the rows with rownum <= 5. You can change the value 5 to get the top n most recent entries of each group.
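A minimal sketch of that template filled in for the Transactions table from the question, with n passed as a parameter. It assumes an in-memory SQLite database rather than SQL Server, a handful of invented rows, and a hypothetical helper name `top_n_per_group`; the TransactionID tie-breaker in the ORDER BY is also an addition.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("""
create table Transactions (
    TransactionID integer primary key,
    CustomerID integer,
    TransactionTypeID integer,
    EventDate text,
    ParentTransactionID integer
)
""")
conn.executemany(
    "insert into Transactions values (?, ?, ?, ?, ?)",
    [
        (1, 1, 1, "2023-01-01", None),
        (2, 1, 1, "2023-01-05", None),
        (3, 1, 1, "2023-01-09", None),
        (4, 1, 1, "2023-01-12", 3),     # child transaction, filtered out
        (5, 2, 1, "2023-01-02", None),
        (6, 2, 1, "2023-01-03", None),
    ],
)

TOP_N_SQL = """
select CustomerID, TransactionTypeID, EventDate, TransactionID
from (select t.*,
             row_number() over (partition by CustomerID, TransactionTypeID
                                order by EventDate desc, TransactionID desc) as rownum
      from Transactions t
      where ParentTransactionID is null) ut
where ut.rownum <= ?
order by CustomerID, TransactionTypeID, EventDate desc
"""


def top_n_per_group(n):
    """Hypothetical helper: the n most recent top-level transactions per group."""
    return conn.execute(TOP_N_SQL, (n,)).fetchall()


print([row[3] for row in top_n_per_group(2)])  # [3, 2, 6, 5]
```

Changing the argument from 2 to 5 or 10 is the only edit needed to widen the result; no extra UNION branches are required.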

Update a field for a specific # of records in SQL Server 2005

Say I want 3 records flagged for each product in my table. But if some products only get 1 or 2 records flagged, or even no records flagged, how can I make it randomly flag the remaining records up to the total of 3 per product?
Ex:
1 record gets flagged for Product_A, 2 records get flagged for Product_B and 3 records get flagged for Product_C.
Once script is complete, I need 2 more records flagged for Product_A and 1 more for Product_B.
This can be a loop or a cte or whatever is the most efficient way to do this in sql. Thanks!
Here's one way to do it:
;with SelectedIds as(
select
Id,
row_number() over (
partition by ProductCode -- distinct numbering for each Product Code
order by newid() -- random
) as rowno
from ProductLines
)
update p
set IsFlagged = 1
from ProductLines p
join SelectedIds s
on p.id = s.id and
s.rowno <= 3 -- limit to 3 records / product code
;
Here's a full sample, including some test data: http://www.sqlfiddle.com/#!3/3bee1/6
Use row_number() in a derived table, partitioned by Product and ordered so that rows that are already flagged come first and the rest follow in random order. If randomness is not a requirement, you can just remove newid() from the query.
Then set the flag for rows numbered 1 to 3 that are not already flagged.
update T
set Flag = 1
from (
select Flag,
row_number() over(partition by Product
order by Flag desc, newid()) as rn
from YourTable
) as T
where T.rn <= 3 and
T.Flag = 0
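The top-up behaviour can be checked with a small sketch. It assumes an in-memory SQLite database, where random() takes the place of newid() and where updating through a derived table is not available, so the sketch first selects the ids to flag and then updates them in a second step. The Id column and all rows are invented for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server 2005; all rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table YourTable (Id integer primary key, Product text, Flag integer)"
)
rows = (
    [("Product_A", f) for f in (1, 0, 0, 0, 0)]   # 1 already flagged
    + [("Product_B", f) for f in (1, 1, 0, 0)]    # 2 already flagged
    + [("Product_C", f) for f in (1, 1, 1, 0)]    # 3 already flagged
    + [("Product_D", f) for f in (0, 0)]          # only 2 rows in total
)
conn.executemany("insert into YourTable (Product, Flag) values (?, ?)", rows)

already_flagged = {r[0] for r in conn.execute("select Id from YourTable where Flag = 1")}

# Flagged rows sort first, the rest in random order; rows numbered 1-3 that
# are still unflagged are the ones that need topping up.
PICK_SQL = """
select Id
from (select Id, Flag,
             row_number() over (partition by Product
                                order by Flag desc, random()) as rn
      from YourTable) t
where t.rn <= 3 and t.Flag = 0
"""
ids_to_flag = [r[0] for r in conn.execute(PICK_SQL)]
conn.executemany("update YourTable set Flag = 1 where Id = ?", [(i,) for i in ids_to_flag])

flag_counts = dict(
    conn.execute("select Product, sum(Flag) from YourTable group by Product order by Product")
)
print(flag_counts)  # {'Product_A': 3, 'Product_B': 3, 'Product_C': 3, 'Product_D': 2}
```

Which unflagged rows get picked varies between runs, but the per-product totals do not: each product ends up with three flagged rows, or all of its rows if it has fewer than three.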

Multiple filters on SQL query

I have been reading many topics about filtering SQL queries, but none seems to apply to my case, so I'm in need of a bit of help. I have the following data on a SQL table.
Date                 item  quantity moved  quantity in stock  sequence
13-03-2012 16:51:00  xpto   2               2                 1
13-03-2012 16:51:00  xpto  -2               0                 2
21-03-2012 15:31:21  zyx    4               6                 1
21-03-2012 16:20:11  zyx    6              12                 2
22-03-2012 12:51:12  zyx   -3               9                 1
So these are quantities moved in the warehouse. The problem is the first two rows, which are a reception and a return at the same time. I'm trying to write a query that gives me the stock of all items at a given time. I use max(date), but I don't get the right quantity in the result.
SELECT item, qty_in_stock
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY item ORDER BY item_date DESC, sequence DESC) rn
    FROM mytable
    WHERE item_date <= @date_of_stock
) q
WHERE rn = 1
If you are on SQL Server 2012, there are several nice features added.
You can use the LAST_VALUE() function, or the FIRST_VALUE() function, in combination with a ROWS or RANGE window frame (see the OVER clause):
SELECT DISTINCT
    item,
    LAST_VALUE(quantity_in_stock) OVER (PARTITION BY item
                                        ORDER BY date, sequence
                                        ROWS BETWEEN UNBOUNDED PRECEDING
                                                 AND UNBOUNDED FOLLOWING)
        AS quantity_in_stock
FROM tableX
WHERE date <= @date_of_stock
Add a where clause and do the summation:
select item, sum([quantity moved])
from t
where t.date <= @DESIREDDATETIME
group by item
If you put a date in for the desired datetime, remember that a date without a time is treated as midnight at the start of that day, so movements later that day are not included.
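The tie on the first two rows and the midnight cutoff can both be seen in a small sketch built on the ROW_NUMBER() query from the first answer. It assumes an in-memory SQLite database rather than SQL Server, timestamps stored as ISO strings, and column names (item_date, qty_moved, qty_in_stock) chosen to match that query; the helper name `stock_as_of` is made up.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; timestamps are ISO strings so
# that text comparison matches chronological order.
conn = sqlite3.connect(":memory:")
conn.execute("""
create table mytable (
    item_date text, item text, qty_moved integer, qty_in_stock integer, sequence integer
)
""")
conn.executemany(
    "insert into mytable values (?, ?, ?, ?, ?)",
    [
        ("2012-03-13 16:51:00", "xpto", 2, 2, 1),
        ("2012-03-13 16:51:00", "xpto", -2, 0, 2),
        ("2012-03-21 15:31:21", "zyx", 4, 6, 1),
        ("2012-03-21 16:20:11", "zyx", 6, 12, 2),
        ("2012-03-22 12:51:12", "zyx", -3, 9, 1),
    ],
)

STOCK_SQL = """
select item, qty_in_stock
from (select *,
             row_number() over (partition by item
                                order by item_date desc, sequence desc) as rn
      from mytable
      where item_date <= ?) q
where rn = 1
order by item
"""


def stock_as_of(ts):
    """Hypothetical helper: map each item to its stock as of a timestamp."""
    return dict(conn.execute(STOCK_SQL, (ts,)).fetchall())


print(stock_as_of("2012-03-21 23:59:59"))  # {'xpto': 0, 'zyx': 12}
print(stock_as_of("2012-03-22"))           # {'xpto': 0, 'zyx': 12}
print(stock_as_of("2012-03-22 23:59:59"))  # {'xpto': 0, 'zyx': 9}
```

Ordering by sequence as well as date is what makes xpto resolve to 0 instead of 2. The second call shows the cutoff effect: a bare date sorts before any time on that day, so the 12:51 movement on 22-03-2012 is left out until a later time is supplied.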