SQL counting and grouping with condition - sql

In my PostgreSQL database I have a score table with the columns user_id, item_id, succeeded (bool) and created_at.
I want to count the number of succeeded rows for each user and each item (where succeeded = 't'), but only those that occurred after the last failure.
I tried the following query without success.
SELECT COUNT(*), item_id
FROM score
WHERE score.succeeded = 't' AND user_id = XX
GROUP BY score.item_id
HAVING MAX(score.created_at) > score.created_at AND score.succeeded = 'f'
Example data:
user_id | item_id | succeeded | created_at
------------------------------------------
12 | 1 | true | 2016-04-01
12 | 1 | false | 2016-04-02
12 | 1 | true | 2016-04-03
12 | 2 | true | 2016-04-01
12 | 2 | true | 2016-04-02
12 | 2 | true | 2016-04-03
12 | 3 | true | 2016-04-01
12 | 3 | true | 2016-04-02
12 | 3 | false | 2016-04-03
Expected result (for user 12):
item_id | succeeded_count
-------------------------
1 | 1
2 | 3
3 | 0

Add a NOT EXISTS condition. It will make sure only true rows having no later false row are counted.
SELECT COUNT(*), s1.item_id
FROM score s1
WHERE s1.succeeded = 't'
AND s1.user_id = XX
AND NOT EXISTS (select 1 from score s2
where s2.user_id = s1.user_id
and s2.item_id = s1.item_id
and s2.succeeded = 'f'
and s2.created_at > s1.created_at)
GROUP BY s1.item_id

You should be filtering the rows in the where clause. Assuming that succeeded only takes the values of 'f' or 't':
select s.item_id, count(*)
from score s
where s.created_at > (select max(s2.created_at)
from score s2
where s2.item_id = s.item_id and s2.succeeded = 'f'
)
group by s.item_id;
Note: This omits items whose count is 0, and it also omits items that have never failed, because the subquery returns NULL for them. You can get all of them by using a left join instead:
select s.item_id, count(case when ss.maxca is null or s.created_at > ss.maxca then 1 end)
from score s left join
(select item_id, max(created_at) maxca
from score
where succeeded = 'f'
group by item_id
) ss
on s.item_id = ss.item_id
group by s.item_id;
If succeeded can take on values other than 'f' and 't', then you can add a where clause to the outer query or use conditional aggregation.
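For anyone who wants to try the conditional-aggregation idea against the sample data from the question, here is a minimal sketch. It uses Python's sqlite3 module rather than PostgreSQL, stores succeeded as 0/1, and adds the per-user filter; the alias last_failure is just a name chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE score (user_id INTEGER, item_id INTEGER, "
    "succeeded INTEGER, created_at TEXT)"
)
rows = [
    (12, 1, 1, "2016-04-01"), (12, 1, 0, "2016-04-02"), (12, 1, 1, "2016-04-03"),
    (12, 2, 1, "2016-04-01"), (12, 2, 1, "2016-04-02"), (12, 2, 1, "2016-04-03"),
    (12, 3, 1, "2016-04-01"), (12, 3, 1, "2016-04-02"), (12, 3, 0, "2016-04-03"),
]
conn.executemany("INSERT INTO score VALUES (?, ?, ?, ?)", rows)

# Join every row to the date of the last failure for its user/item (if any),
# then count only successes that come after that date. Items with no
# qualifying row still appear, with a count of 0.
QUERY = """
SELECT s.item_id,
       SUM(CASE WHEN s.succeeded = 1
                 AND (f.last_failure IS NULL OR s.created_at > f.last_failure)
                THEN 1 ELSE 0 END) AS succeeded_count
FROM score s
LEFT JOIN (SELECT user_id, item_id, MAX(created_at) AS last_failure
           FROM score
           WHERE succeeded = 0
           GROUP BY user_id, item_id) f
       ON f.user_id = s.user_id AND f.item_id = s.item_id
WHERE s.user_id = ?
GROUP BY s.item_id
ORDER BY s.item_id
"""
result = conn.execute(QUERY, (12,)).fetchall()
print(result)
```

With the sample rows this gives 1, 3 and 0 for items 1, 2 and 3, which matches the expected result in the question.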

Related

SQL logic to determine unsold inventory and corresponding available dates (Available to sell)

I am looking for advice on how to generate SQL to be used in SQL Server that will show available inventory to sell and the corresponding date that said inventory will be available. I am easily able to determine if we have inventory that is available immediately but can't wrap my head around what logic would be needed to determine future available quantities.
In the table below, the +/- column represents the weekly inbound vs. outbound quantity, and Qty_Available is a rolling SUM OVER (PARTITION BY ...) of the +/- column. I was able to get the immediately available quantity with this simple logic:
Case when Min(X.Qty_Available) > 0 Then Min(X.Qty_Available) else 0 END
AS Immediate_available_Qty
Table:
+-------------+---------------+---------------+------+---------------+
| Item Number | Item Name | week_end_date | +/- | Qty_Available |
+-------------+---------------+---------------+------+---------------+
| 123456 | Fidget Widget | 7/13/2019 | 117 | 117 |
| 123456 | Fidget Widget | 7/20/2019 | 49 | 166 |
| 123456 | Fidget Widget | 7/27/2019 | -7 | 159 |
| 123456 | Fidget Widget | 8/3/2019 | -12 | 147 |
| 123456 | Fidget Widget | 8/10/2019 | -1 | 146 |
| 123456 | Fidget Widget | 8/17/2019 | 45 | 191 |
| 123456 | Fidget Widget | 8/24/2019 | -1 | 190 |
| 123456 | Fidget Widget | 8/31/2019 | -1 | 189 |
| 123456 | Fidget Widget | 9/7/2019 | 50 | 239 |
+-------------+---------------+---------------+------+---------------+
My desired results of this query would be as follows:
+-----------+-----+
| Output | Qty |
+-----------+-----+
| 7/13/2019 | 117 |
| 7/20/2019 | 29 |
| 8/17/2019 | 43 |
+-----------+-----+
The second availability is determined by subtracting the first available quantity (117) from every value in the Qty_Available column and finding the new minimum. If the new minimum is zero, find the next continuously positive run of data (one that runs all the way to the end of the data). Repeat for the third available quantity and then stop.
I was thinking of pursuing recursive CTE (RCTE) logic, but I don't want to go down that rabbit hole if there is a better way to tackle this, and I'm not even sure an RCTE would work for this problem.
This should return your expected result:
SELECT Item_Number, Min(week_end_date), Sum("+/-")
FROM
(
SELECT *
-- put a positive value plus all following negative values in the same group
-- using a Cumulative Sum over 0/1
,Sum(CASE WHEN "+/-" > 0 THEN 1 ELSE 0 end)
Over (PARTITION BY Item_Number
ORDER BY week_end_date
ROWS UNBOUNDED PRECEDING) AS grp
FROM my_table
) AS dt
WHERE grp <= 3 -- only the 1st 3 groups
GROUP BY Item_Number, grp
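To see what the cumulative-sum grouping does, here is a small sketch of the same query run on the sample rows with Python's sqlite3 module (it needs an SQLite build with window functions, 3.25 or later). The column is named delta instead of "+/-", and the dates are stored in ISO format so they sort correctly; both are assumptions made for this illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (item_number INTEGER, week_end_date TEXT, delta INTEGER)"
)
weekly = [
    ("2019-07-13", 117), ("2019-07-20", 49), ("2019-07-27", -7),
    ("2019-08-03", -12), ("2019-08-10", -1), ("2019-08-17", 45),
    ("2019-08-24", -1), ("2019-08-31", -1), ("2019-09-07", 50),
]
conn.executemany("INSERT INTO my_table VALUES (123456, ?, ?)", weekly)

# Each positive row starts a new group; the negative rows that follow it
# inherit the same running count, so they land in the same group.
QUERY = """
SELECT item_number, MIN(week_end_date) AS available_from, SUM(delta) AS qty
FROM (
    SELECT *,
           SUM(CASE WHEN delta > 0 THEN 1 ELSE 0 END)
               OVER (PARTITION BY item_number
                     ORDER BY week_end_date
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
    FROM my_table
) AS dt
WHERE grp <= 3
GROUP BY item_number, grp
ORDER BY available_from
"""
result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

On this data the three groups sum to 117, 29 and 43, the same numbers as the desired output. This only shows that the grouping reproduces the sample; it does not prove it matches the "new minimum" rule described in the question for every possible data set.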
So here's what I came up with. I know this is poor, but I didn't want to leave this thread high and dry, and maybe I can get more insight on a better path. Please know that I've never had any real training, so I don't know what I don't know.
I ended up running this into a temp table, altering the commented-out section in table "A", and then re-running that into a second temp table.
Select
F.Upc,
F.name,
F.Week_end_date as First_Available_Date,
E.Qty_Available_1
From
(
Select Distinct
D.Upc,
D.name,
Case When Min(D.Rolling_Qty_Available) Over ( PARTITION BY D.upc) < 1 then 0 else
Min(D.Rolling_Qty_Available) Over ( PARTITION BY D.upc) END as Qty_Available_1,
Case When Max(D.Look_up_Ref) Over ( PARTITION BY D.upc) = 0 then '-1000' else
Max(D.Look_up_Ref) Over ( PARTITION BY D.upc) END as Look_up_Ref_1
From
(
Select
A.Upc,
A.name,
A.Week_end_Date,
A.Rolling_Qty_Available,
CASE WHEN
C.Max_Row = A.Row_num and A.[Rolling_Qty_Available] >1 THEN 1
ELSE
CASE WHEN
Sum(A.Calc_Row_Thing) OVER (Partition by A.UPC Order by A.Row_Num DESC
ROWS BETWEEN UNBOUNDED PRECEDING
AND Current ROW
) = (C.Max_Row - A.Row_num + 1)
THEN
C.Max_Row - A.Row_num + 1
ELSE 0 END
END as Look_up_Ref
FROM (
Select
G.Upc,
G.Name,
G.Week_End_Date,
G.Row_num,
G.Calc_Row_Thing,
G.Rolling_Qty_Available
--CASE When (G.Rolling_Qty_Available -
--isnull(H.Qty_Available_1,0)) > 0 then 1 else - 0 END as
--Calc_Row_Thing,
From [dbo].[ATS_item_detail_USA_vw] as G
--Left Join [dbo].[tmp_ats_usa_qty_1] as H on G.upc = H.upc
) AS A --Need to subtract QTY 1 out of here and below
join (
SELECT
B.upc,
Max(Row_num) AS Max_Row
FROM [dbo].[ATS_item_detail_USA_vw] AS B
GROUP BY B.upc
) as C on A.upc = C.upc
) as D
GROUP BY
D.Upc,
D.name,
D.Rolling_Qty_Available,
D.Look_up_Ref
HAVING Max(D.Look_up_Ref) > 1
) as E
Left join
(
SELECT
A.Upc,
A.name,
A.Week_end_Date,
A.Rolling_Qty_Available,
CASE WHEN
C.Max_Row = A.Row_num and A.[Rolling_Qty_Available] >1 THEN 1
ELSE
CASE WHEN
Sum(A.Calc_Row_Thing) OVER (Partition by A.UPC Order by A.Row_Num DESC
ROWS BETWEEN UNBOUNDED PRECEDING
AND Current ROW
) = (C.Max_Row - A.Row_num + 1)
THEN
C.Max_Row - A.Row_num + 1
ELSE 0 END
END as Look_up_Ref
From (
Select
G.Upc,
G.Name,
G.Week_End_Date,
G.Row_num,
G.Calc_Row_Thing,
G.Rolling_Qty_Available
--CASE When (G.Rolling_Qty_Available -
--isnull(H.Qty_Available_1,0)) > 0 then 1 else - 0 END as
--Calc_Row_Thing,
From [dbo].[ATS_item_detail_USA_vw] as G
--Left Join [dbo].[tmp_ats_usa_qty_1] as H on G.upc = H.upc
) as A --subtract qty_1 out the start qty 2 calc
join (
SELECT
B.upc,
Max(Row_num) as Max_Row
FROM [dbo].[ATS_item_detail_USA_vw] as B
GROUP BY B.upc
) AS C ON A.upc = C.upc
) AS F ON E.upc = F.upc and E.Look_up_Ref_1 = F.Look_up_Ref

SQL Server : show a new column that does calculation base on other rows

I'm using SQL Server. I have a table, which I have simplified as follows:
item | order | Active
----------------------
a | 1 | true
b | 2 | false
c | 3 | true
d | 4 | true
e | 5 | false
f | 6 | true
I want a query that returns the three columns plus an extra column called NewOrder, which for each item subtracts one from its order number for every inactive item (Active = false) that has a lower order number. So the above table becomes:
item | ordering | Active | NewOrder
-----------------------------------
a | 1 | true | 1
b | 2 | false | 2
c | 3 | true | 2
d | 4 | true | 3
e | 5 | false | 4
f | 6 | true | 4
My attempt:
Select
g.item, g.ordering, g.active,
g.ordering - (Select Count(x.ordering)
from grocery As x
where x.ordering < g.ordering
and x.active = false
group by x.ordering) As NewOrder
From
grocery as g
This doesn't work because the subquery returns more than one row. I honestly don't have any idea how to approach this. Is it even possible using just a subquery?
I'd appreciate any advice or help.
Edit:
The database has more than 6 rows, and they are not stored in the correct order.
Basically, an item's new order is its own ordering number minus the number of inactive items whose ordering number is lower than the item's.
Again, inactive items are items whose active = false.
I think the following query will give you the result you want. You basically want NEWORDER when active is true, so the NULL values returned for inactive rows can be ignored.
select
a.*,
b.NEWORDER
from
(
select
*
from
test1
)
as a
left join
(
select [order] as NO,ROW_NUMBER() over(order by [order]) as NEWORDER from test1 where active='true'
)
as b
on a.[order]=b.no
OUTPUT
item order active NEWORDER
-------------------------------------
a 1 true 1
b 2 false NULL
c 3 true 2
d 4 true 3
e 5 false NULL
f 6 true 4
Here is one way
SELECT item,
[order],
active,
NewOrder = Sum(t)OVER(ORDER BY [order])
FROM (SELECT *,
CASE WHEN Lag(Active)OVER(ORDER BY [order]) = 'false' THEN 0 ELSE 1 END AS t
FROM (VALUES ('a',1,'true' ),
('b',2,'false' ),
('c',3,'true' ),
('d',4,'true' ),
('e',5,'false' ),
('f',6,'true' )) tc (item, [order], Active)) a
Result :
+------+-------+--------+----------+
| item | order | active | NewOrder |
+------+-------+--------+----------+
| a | 1 | true | 1 |
| b | 2 | false | 2 |
| c | 3 | true | 2 |
| d | 4 | true | 3 |
| e | 5 | false | 4 |
| f | 6 | true | 4 |
+------+-------+--------+----------+
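On the "is this possible using just a subquery" part: a scalar correlated subquery should do it once the GROUP BY is dropped, since COUNT without GROUP BY always returns exactly one row. Below is a minimal sketch using Python's sqlite3 module instead of SQL Server, with active stored as 0/1 and the rows inserted out of order on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grocery (item TEXT, ordering INTEGER, active INTEGER)")
conn.executemany(
    "INSERT INTO grocery VALUES (?, ?, ?)",
    [("d", 4, 1), ("a", 1, 1), ("f", 6, 1), ("b", 2, 0), ("e", 5, 0), ("c", 3, 1)],
)

# For each row, subtract the number of inactive rows with a lower ordering.
# No GROUP BY in the subquery, so it yields a single value per outer row.
QUERY = """
SELECT g.item, g.ordering, g.active,
       g.ordering - (SELECT COUNT(*)
                     FROM grocery AS x
                     WHERE x.ordering < g.ordering
                       AND x.active = 0) AS new_order
FROM grocery AS g
ORDER BY g.ordering
"""
result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Unlike the LAG-based version, this does not depend on the ordering values being consecutive, because it counts inactive rows directly.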

row counter with condition in two different columns

I have the following tables with sport results (e.g. football):
tblGoals (RowId, GameRowId, PlayerRowId, TeamRowId, GoalMinute)
RowId | GameRowId | PlayerRowId | TeamRowId | GoalMinute
--------------------------------------------------------
1 | 1 | 1 | 1 | 25
2 | 1 | 2 | 2 | 45
3 | 1 | 3 | 1 | 66
tblPlayers (RowId, PlayerName)
RowId | PlayerName
------------------
1 | John Snow
2 | Frank Underwood
3 | Jack Bauer
tblGames (RowId, TeamHomeRowId, TeamGuestRowId, GameDate)
RowId | TeamHomeRowId | TeamGuestRowId | GameDate
---------------------------------------------------
1 | 1 | 2 | 2015-01-01
Now I want to get a list of all goals. The list should look like this:
GoalMinute | PlayerName | GoalsHome | GoalsGuest
-----------------------------------------------------
25 | John Snow | 1 | 0
45 | Frank Underwood | 1 | 1
66 | Jack Bauer | 2 | 1
GoalsHome and GoalsGuest should be running counters of the goals scored by each team. For example, in the last row the score is 2:1 for the home team.
To get this list of goals, I used this statement:
SELECT t_gol.GoalMinute,
t_ply.PlayerName,
CASE WHEN
t_gol.TeamRowId = t_gam.TeamHomeRowId
THEN ROW_NUMBER() OVER (PARTITION BY t_gam.TeamHomeRowId ORDER BY t_gam.TeamHomeRowId)
END AS GoalsHome,
CASE WHEN
t_gol.TeamRowId = t_gam.TeamGuestRowId
THEN ROW_NUMBER() OVER (PARTITION BY t_gam.TeamGuestRowId ORDER BY t_gam.TeamGuestRowId)
END AS GoalsGuest
FROM dbo.tblGoalsFussball AS t_gol
LEFT JOIN dbo.tblPlayersFussball AS t_ply ON (t_ply.RowId = t_gol.PlayerRowId)
LEFT JOIN dbo.tblGames AS t_gam ON (t_gam.RowId = t_gol.GameRowId)
WHERE t_gol.GameRowId = @match_row
But what I get is this:
GoalMinute | PlayerName | GoalsHome | GoalsGuest
-----------------------------------------------------
25 | John Snow | 1 | NULL
45 | Frank Underwood | NULL | 2
66 | Jack Bauer | 3 | NULL
Maybe ROW_NUMBER() is the wrong approach?
I would do the running total using sum() as a windowed aggregate function with the over ... clause, which works in SQL Server 2012+.
select
g.RowId, g.GameDate, t.GoalMinute, p.PlayerName,
GoalsHome = COALESCE(SUM(case when TeamRowId = g.TeamHomeRowId then 1 end) OVER (PARTITION BY gamerowid ORDER BY goalminute),0),
GoalsGuest = COALESCE(SUM(case when TeamRowId = g.TeamGuestRowId then 1 end) OVER (PARTITION BY gamerowid ORDER BY goalminute),0)
from tblGoals t
join tblPlayers p on t.PlayerRowId = p.RowId
join tblGames g on t.GameRowId = g.RowId
order by t.GameRowId, t.GoalMinute
Another approach (that also works in older versions) is to use a self-join and sum up the rows with lower goalminutes. For ease of reading I've used a common table expression to split the goals into two columns for home and guest team:
;with t as (
select
g.GoalMinute, g.PlayerRowId, g.GameRowId,
case when TeamRowId = ga.TeamHomeRowId then 1 end HomeGoals,
case when TeamRowId = ga.TeamGuestRowId then 1 end GuestGoals
from tblGoals g
join tblGames ga on g.GameRowId = ga.RowId
)
select
g.RowId, g.GameDate, t.GoalMinute, p.PlayerName,
GoalsHome = (select sum(coalesce(HomeGoals,0)) from t t2 where t2.GoalMinute <= t.GoalMinute and t2.GameRowId = t.GameRowId),
GoalsGuest = (select sum(coalesce(GuestGoals,0)) from t t2 where t2.GoalMinute <= t.GoalMinute and t2.GameRowId = t.GameRowId)
from t
join tblPlayers p on t.PlayerRowId = p.RowId
join tblGames g on t.GameRowId = g.RowId
order by t.GameRowId, t.GoalMinute
The CTE isn't necessary, though; you could just as well use a derived table.
I think the easiest way is with subqueries:
SELECT
tgs.GoalMinute,
tpl.PlayerName,
( SELECT
COUNT(t.RowId)
FROM
tblgoals AS t
WHERE t.GoalMinute <= tgs.GoalMinute
AND t.GameRowId = tgm.RowId
AND t.TeamRowId = tgm.TeamHomeRowId
) AS HomeGoals,
( SELECT
COUNT(t.RowId)
FROM
tblgoals AS t
WHERE t.GoalMinute <= tgs.GoalMinute
AND t.GameRowId = tgm.RowId
AND t.TeamRowId = tgm.TeamGuestRowId
) AS GuestGoals
FROM
tblgoals AS tgs
JOIN tblplayers AS tpl ON tgs.PlayerRowId = tpl.RowId
JOIN tblGames AS tgm ON tgm.RowId = tgs.GameRowId
ORDER BY tgs.GoalMinute
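The subquery approach can be checked against the sample tables from the question. Here is a rough sketch using Python's sqlite3 module rather than SQL Server; the table definitions are reconstructed from the question, and the players table is joined on PlayerRowId.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblPlayers (RowId INTEGER PRIMARY KEY, PlayerName TEXT);
CREATE TABLE tblGames (RowId INTEGER PRIMARY KEY, TeamHomeRowId INTEGER,
                       TeamGuestRowId INTEGER, GameDate TEXT);
CREATE TABLE tblGoals (RowId INTEGER PRIMARY KEY, GameRowId INTEGER,
                       PlayerRowId INTEGER, TeamRowId INTEGER, GoalMinute INTEGER);
INSERT INTO tblPlayers VALUES (1, 'John Snow'), (2, 'Frank Underwood'), (3, 'Jack Bauer');
INSERT INTO tblGames VALUES (1, 1, 2, '2015-01-01');
INSERT INTO tblGoals VALUES (1, 1, 1, 1, 25), (2, 1, 2, 2, 45), (3, 1, 3, 1, 66);
""")

# For every goal, count the goals each side has scored in the same game
# up to and including that minute.
QUERY = """
SELECT g.GoalMinute, p.PlayerName,
       (SELECT COUNT(*) FROM tblGoals t
         WHERE t.GameRowId = g.GameRowId
           AND t.TeamRowId = gm.TeamHomeRowId
           AND t.GoalMinute <= g.GoalMinute) AS GoalsHome,
       (SELECT COUNT(*) FROM tblGoals t
         WHERE t.GameRowId = g.GameRowId
           AND t.TeamRowId = gm.TeamGuestRowId
           AND t.GoalMinute <= g.GoalMinute) AS GoalsGuest
FROM tblGoals g
JOIN tblPlayers p ON p.RowId = g.PlayerRowId
JOIN tblGames gm ON gm.RowId = g.GameRowId
WHERE g.GameRowId = ?
ORDER BY g.GoalMinute
"""
result = conn.execute(QUERY, (1,)).fetchall()
for row in result:
    print(row)
```

Two goals in the same minute of the same game would both be counted in each other's totals, so a tie-breaker (for example the goal's RowId) would be needed if that can happen.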

Select last changed row in sub-query

I have a table product:
id | owner_id | last_activity | box_id
------------------------------------
1 | 2 | 12/19/2014 | null
2 | 2 | 12/13/2014 | null
3 | 2 | 08/11/2014 | null
4 | 2 | 12/11/2014 | 99
5 | 2 | null | 99
6 | 2 | 12/15/2014 | 99
7 | 2 | null | 105
8 | 2 | null | 105
9 | 2 | null | 105
The only variable that I have is owner_id.
I need to select all products of a user, but for products that are in a box, only the latest one per box should be selected.
Sample output for owner_id = 2 is as follows:
id | owner_id | last_activity | box_id
------------------------------------
1 | 2 | 12/19/2014 | null
2 | 2 | 12/13/2014 | null
3 | 2 | 08/11/2014 | null
6 | 2 | 12/15/2014 | 99
7 | 2 | null | 105
I'm not able to find a way to select the latest product from a box.
My current query, which executes but does not return the correct rows:
SELECT p.* FROM product p
WHERE p.owner_id = 2
AND (
p.box_id IS NULL
OR (
p.box_id IS NOT NULL
AND
p.id = ( SELECT MAX(pp.id) FROM product pp
WHERE pp.box_id = p.box_id )
)
)
I tried with dates:
SELECT p.* FROM product p
WHERE p.owner_id = 2
AND (
p.box_id IS NULL
OR (
p.box_id IS NOT NULL
AND
p.id = ( SELECT * FROM (
SELECT pp.id FROM product pp
WHERE pp.box_id = p.box_id
ORDER BY last_activity desc
) WHERE rownum = 1
)
)
)
This gives an error: p.box_id is undefined, because it is referenced inside the second-level subquery.
Do you have any ideas on how I can solve this?
The ROW_NUMBER analytical function might help with such queries:
SELECT "owner_id", "id", "box_id", "last_activity" FROM
(
SELECT "owner_id", "id", "box_id", "last_activity",
ROW_NUMBER()
OVER (PARTITION BY "box_id" ORDER BY "last_activity" DESC NULLS LAST) rn
-- ^^^^^^^^^^^^^^^
-- descending order, with nulls placed after non-null values
-- (Oracle's default for DESC is NULLS FIRST, so NULLS LAST
-- must be stated explicitly here)
FROM T
WHERE "owner_id" = 2
) V
WHERE rn = 1 or "box_id" IS NULL
ORDER BY "id" -- <-- probably not necessary, but matches your example
See http://sqlfiddle.com/#!4/db775/8
There can be nulls as a value. If all products inside a box have a null last_activity, then MIN(id) should be returned.
It is probably not a good idea to rely on id to order things, but if you think you need that, you will have to change the ORDER BY clause to:
... ORDER BY "last_activity" DESC NULLS LAST, "id" ASC
-- ^^^^^^^^
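Here is a small sketch of the ROW_NUMBER approach run against the sample rows, using Python's sqlite3 module instead of Oracle (window functions need SQLite 3.25 or later). Two things are assumptions for this illustration: dates are stored as ISO strings so they sort correctly, and "nulls last" is written as a sort on last_activity IS NULL so that it does not depend on NULLS LAST support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, owner_id INTEGER, "
    "last_activity TEXT, box_id INTEGER)"
)
rows = [
    (1, 2, "2014-12-19", None), (2, 2, "2014-12-13", None), (3, 2, "2014-08-11", None),
    (4, 2, "2014-12-11", 99), (5, 2, None, 99), (6, 2, "2014-12-15", 99),
    (7, 2, None, 105), (8, 2, None, 105), (9, 2, None, 105),
]
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", rows)

# Rank the rows inside each box: dated rows first (newest on top), then
# rows without a date, with the lowest id winning any tie.
QUERY = """
SELECT id, owner_id, last_activity, box_id
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (
               PARTITION BY box_id
               ORDER BY last_activity IS NULL,  -- 0 for dated rows, 1 for NULLs
                        last_activity DESC,
                        id
           ) AS rn
    FROM product p
    WHERE owner_id = ?
) v
WHERE rn = 1 OR box_id IS NULL
ORDER BY id
"""
result = conn.execute(QUERY, (2,)).fetchall()
for row in result:
    print(row)
```

This returns ids 1, 2, 3, 6 and 7, the same rows as the sample output in the question, including id 7 for the box where every last_activity is null.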
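Use NOT EXISTS. Note that a row whose last_activity is NULL always passes this test, so NULLs need separate handling: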
SELECT
p.*
FROM
product p
WHERE
p.owner_id = 2 AND
( p.box_id IS NULL OR
(
p.box_id IS NOT NULL AND
NOT EXISTS
(
SELECT
pp.id
FROM
product pp
WHERE
pp.box_id = p.box_id AND
pp.last_activity > p.last_activity
)
)
)
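You can use UNION to first get all rows where box_id is null and then fetch the max id and max date per box where box_id is not null. Note that MAX(id) and MAX(last_activity) are computed independently, so they may come from different rows: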
SELECT * FROM
(
SELECT id,owner_id,last_activity,box_id FROM product WHERE owner_id = 2 AND box_id IS NULL
UNION
SELECT MAX(id),owner_id,MAX(last_activity),box_id FROM product WHERE owner_id = 2 AND box_id IS NOT NULL GROUP BY owner_id, box_id
) T1
ORDER BY
id

Multiple sql query or Cursor?

I need help with something that seems complex to me.
I made a query to create tbl1, which is the Cartesian product of the tables Item and Warehouse. It gives me back all items in all warehouses:
SELECT i.ItemID, w.WarehouseID
FROM Item i, Warehouse w
I made a second query (tbl2) that finds, per item and warehouse, the date of the last document on or before a variable date (@datevar) whose quantity rule is 1 (PhysicalQtyRule = 1), taken from the StockHistory table:
SELECT MAX(CreateDate) AS [DATE1], ItemID, Quantity, WarehouseID
FROM StockHistory
WHERE PhysicalQtyRule = 1 AND CreateDate <= @datevar
GROUP BY ItemID, Quantity, WarehouseID
Now I need three more steps:
Build a third table containing, per item and warehouse, the sum of Quantity where the quantity rule is 2 (PhysicalQtyRule = 2) and the date is between tbl2.DATE1 (if it exists) and the variable @datevar, taken from the StockHistory table. Something like this:
SELECT ItemID, WarehouseID, SUM(Quantity)
FROM StockHistory
WHERE PhysicalQtyRule = 2 AND CreateDate > tbl2.DATE1 --If exists
AND CreateDate <= @datevar
GROUP BY ItemID, WarehouseID
Build a fourth table containing, per item and warehouse, the sum of Quantity where the quantity rule is 3 (PhysicalQtyRule = 3) and the date is between tbl2.DATE1 (if it exists) and the variable @datevar, taken from the StockHistory table. Something like this:
SELECT ItemID, WarehouseID, SUM(Quantity)
FROM StockHistory
WHERE PhysicalQtyRule = 3 AND CreateDate > tbl2.DATE1 --If exists
AND CreateDate <= @datevar
GROUP BY ItemID, WarehouseID
Create a final table based on the first one, with a summed quantity column, something like this:
SELECT i.ItemID, w.WarehouseID, tbl2.Quantity + tbl3.Quantity - tbl4.Quantity AS [Qty]
FROM Item i, Warehouse w
I don't know whether I need cursors (something new for me) or multiple queries, but performance is important because my StockHistory table has millions of records.
Can anyone help me, please? Thank you!
Some sample data, only for one Item and one warehouse:
+--------+-------------+------------+-----------------+----------+---------+----------------+
| ItemID | WarehouseID | CreateDate | PhysicalQtyRule | Quantity | Balance | Comments       |
+--------+-------------+------------+-----------------+----------+---------+----------------+
| 1234   | 11          | 2013-03-25 | 2               | 35       | 35      | Rule 2 = In    |
| 1234   | 11          | 2013-03-28 | 3               | 30       | 5       | Rule 3 = Out   |
| 1234   | 11          | 2013-04-01 | 1               | 3        | 3       | Rule 1 = Reset |
| 1234   | 11          | 2013-07-12 | 2               | 40       | 43      | Rule 2 = In    |
| 1234   | 11          | 2013-09-05 | 3               | 20       | 23      | Rule 3 = Out   |
| 1234   | 11          | 2013-12-31 | 1               | 25       | 25      | Rule 1 = Reset |
| 1234   | 11          | 2014-01-09 | 3               | 11       | 14      | Rule 3 = Out   |
| 1234   | 11          | 2014-01-16 | 3               | 6        | 8       | Rule 3 = Out   |
+--------+-------------+------------+-----------------+----------+---------+----------------+
I want to know the balance on any variable date.
Without your data, I can't test this but I believe this should be your solution.
SELECT i.ItemID
,w.WarehouseID
,[Qty] = ISNULL(tbl2.Quantity, 0) + ISNULL(tbl3.Quantity, 0) - ISNULL(tbl4.Quantity, 0)
FROM Item i
CROSS JOIN Warehouse w
OUTER APPLY (
-- latest rule 1 (reset) row on or before @datevar;
-- TOP (1) keeps the Quantity that belongs to that date
SELECT TOP (1) [DATE1] = sh.CreateDate
,sh.Quantity
FROM StockHistory sh
WHERE sh.PhysicalQtyRule = 1 AND sh.CreateDate <= @datevar
AND i.ItemID = sh.ItemID
AND w.WarehouseID = sh.WarehouseID
ORDER BY sh.CreateDate DESC ) tbl2
OUTER APPLY (
-- everything that came in after the reset (or ever, if there was no reset)
SELECT [Quantity] = SUM(sh.Quantity)
FROM StockHistory sh
WHERE sh.PhysicalQtyRule = 2
AND (tbl2.DATE1 IS NULL OR sh.CreateDate > tbl2.DATE1)
AND sh.CreateDate <= @datevar AND i.ItemID = sh.ItemID
AND w.WarehouseID = sh.WarehouseID ) tbl3
OUTER APPLY (
-- everything that went out after the reset (or ever, if there was no reset)
SELECT [Quantity] = SUM(sh.Quantity)
FROM StockHistory sh
WHERE sh.PhysicalQtyRule = 3
AND (tbl2.DATE1 IS NULL OR sh.CreateDate > tbl2.DATE1)
AND sh.CreateDate <= @datevar AND i.ItemID = sh.ItemID
AND w.WarehouseID = sh.WarehouseID ) tbl4
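To sanity-check the balance rule (last reset quantity, plus later "in" rows, minus later "out" rows) against the sample data, here is a rough sketch using Python's sqlite3 module. SQLite has no OUTER APPLY, so this swaps in a different technique: a CTE for the last reset date plus one pass of conditional aggregation. The names last_reset, ResetDate and balance_as_of are made up for the example, it assumes at most one reset per item/warehouse per date, and it only returns item/warehouse pairs that have history (there is no Item x Warehouse cross join).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE StockHistory (ItemID INTEGER, WarehouseID INTEGER, "
    "CreateDate TEXT, PhysicalQtyRule INTEGER, Quantity INTEGER)"
)
history = [
    (1234, 11, "2013-03-25", 2, 35), (1234, 11, "2013-03-28", 3, 30),
    (1234, 11, "2013-04-01", 1, 3), (1234, 11, "2013-07-12", 2, 40),
    (1234, 11, "2013-09-05", 3, 20), (1234, 11, "2013-12-31", 1, 25),
    (1234, 11, "2014-01-09", 3, 11), (1234, 11, "2014-01-16", 3, 6),
]
conn.executemany("INSERT INTO StockHistory VALUES (?, ?, ?, ?, ?)", history)

QUERY = """
WITH last_reset AS (
    SELECT ItemID, WarehouseID, MAX(CreateDate) AS ResetDate
    FROM StockHistory
    WHERE PhysicalQtyRule = 1 AND CreateDate <= :asof
    GROUP BY ItemID, WarehouseID
)
SELECT sh.ItemID, sh.WarehouseID,
       SUM(CASE
             -- the reset row itself supplies the starting quantity
             WHEN sh.PhysicalQtyRule = 1 AND sh.CreateDate = r.ResetDate
               THEN sh.Quantity
             -- rule 2 = in, counted only after the reset (if there is one)
             WHEN sh.PhysicalQtyRule = 2
                  AND (r.ResetDate IS NULL OR sh.CreateDate > r.ResetDate)
               THEN sh.Quantity
             -- rule 3 = out, counted only after the reset (if there is one)
             WHEN sh.PhysicalQtyRule = 3
                  AND (r.ResetDate IS NULL OR sh.CreateDate > r.ResetDate)
               THEN -sh.Quantity
             ELSE 0
           END) AS Balance
FROM StockHistory sh
LEFT JOIN last_reset r
       ON r.ItemID = sh.ItemID AND r.WarehouseID = sh.WarehouseID
WHERE sh.CreateDate <= :asof
GROUP BY sh.ItemID, sh.WarehouseID
"""

def balance_as_of(asof):
    """Return the balance of item 1234 in warehouse 11 on the given date, or None."""
    row = conn.execute(QUERY, {"asof": asof}).fetchone()
    return row[2] if row else None

for day in ("2013-03-28", "2013-09-05", "2014-01-16"):
    print(day, balance_as_of(day))
```

The three dates printed correspond to Balance values 5, 23 and 8 in the sample table. Whether a single pass like this performs better than three OUTER APPLY lookups on millions of rows depends on indexing, so it would need to be measured on the real table.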