SQLite query: smarter rows with mathematics of cloned tables?

Okay this is pretty complicated to even explain but I will try.
Buying table
----------------------------------------------------------------------------------------------------------
id itemId amount price bought collected slot aborted playerHash
1 2607 4111 200600 0 0 0 0 1020628
2 11335 1 0 0 0 3 1 1020628
3 2495 6546 5306 0 0 1 0 1020628
4 1127 101 58300 101 0 5 0 37763265
5 14479 1 107500 0 0 2 0 37763265
6 1 100 1 0 0 0 0 3 *simulate a problem Buy
Selling table
----------------------------------------------------------------------------------------------------------
id itemId amount price sold collected slot aborted playerHash
1 8 8234 132950 7244 0 4 0 1020628
2 9 1980 132950 0 0 5 0 1020628
3 9 100 126300 0 0 2 0 1020628
4 3024 8888 10900 8888 0 0 0 37763265
5 1 100 1 1 0 0 0 1 *simulate a problem Sell
6 1 100 1 1 0 0 0 2 *simulate a problem Sell
Result of match ups
----------------------------------------------------------------------------------------------------------
S.itemId S.amount S.price S.sold S.collected S.slot S.playerHash B.itemId B.amount B.price B.bought B.collected B.slot B.playerHash
123 2 444 1 0 0 15431 123 34535 448 3 0 1 3455
123 2 444 1 0 0 15431 123 7567 444 333 0 3 7651
*simulated result rows of wrong data
1 100 1 1 0 0 1 1 100 1 0 0 0 3
1 100 1 1 0 0 2 1 100 1 0 0 0 3
The query I'm trying to write has to match buyers with sellers of the same itemId.
It has to make sure neither the buyer nor the seller has the aborted flag set on their item.
It should process buyers who are paying more than what the seller wants first.
It should check that the amount the seller is selling hasn't been fully sold yet.
It should check that the amount the buyer is buying hasn't been fully bought yet.
It should also process 100 sales per batch.
Query below works great for the most part.
Problem I'm trying to fix in the query is to make it more intelligent with mathematics.
Lets say Selling table had playerHash 1 and playerHash 2
itemId 8 with amount 100 and sold 1 listed by playerHash lets say 1.
itemId 8 with amount 100 and sold 1 listed by playerHash lets say 2.
Now there is 1 Buyer called playerHash 3 who is buying same itemId 8 and amount 100.
Problem now is
playerHash1 can't sell all his 100 amount of itemId 8 because he only has 99 left (sold 1)
playerHash2 can't sell all his 100 amount of itemId 8 because he only has 99 left (sold 1) same story
Now the query should return
playerHash 1 selling to buyer playerHash 3, with a new column at the end of the row, called something like willBuyAmount, set to 99.
The next row should know that playerHash 1 now has sold = 100 (without updating the database and without changing the sold column in the previous row).
It should display playerHash 2 selling to buyer playerHash 3, with the willBuyAmount column set to 1.
playerHash 2 should then know that it has 98 left (sold 1, plus 1 tracked temporarily somewhere; I don't know, maybe by cloning the tables?),
and future buyers of the same itemId at >= price should be able to buy only the 98 left from playerHash 2.
Here is the math I want to put into the query
VAR amountBuyerNeeds = (B.amount - B.bought)
VAR amountSellerStock = (S.amount - S.sold)
//update these values in the simulated cloned table data.
B.bought = IF(amountBuyerNeeds > amountSellerStock, (B.bought + amountSellerStock), B.amount)
S.sold = IF(amountSellerStock > amountBuyerNeeds, (S.sold + amountBuyerNeeds), S.amount)
//to real row print out
willBuyAmount = IF(amountBuyerNeeds > amountSellerStock, amountSellerStock, amountBuyerNeeds)
The query currently looks like this.
SELECT S.itemId AS sell_itemId,
S.amount AS sell_amount,
S.price AS sell_price,
S.sold AS sell_sold,
S.collected AS sell_collected,
S.slot AS sell_slot,
S.playerHash AS sell_playerHash,
B.itemId AS buy_itemId,
B.amount AS buy_amount,
B.price AS buy_price,
B.bought AS buy_bought,
B.collected AS buy_collected,
B.slot AS buy_slot,
B.playerHash AS buy_playerHash
FROM Buying AS B,
Selling AS S
ON B.itemId = S.itemId
AND
B.aborted = 0
AND
S.aborted = 0
AND
B.price >= S.price
AND
S.sold < S.amount
AND
B.bought < B.amount
ORDER BY B.price DESC
LIMIT 100;
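The allocation math described above can also be done outside SQL. The following is a minimal Python sketch, not a pure-SQL answer: it runs the same join through the standard `sqlite3` module and then walks the candidate rows greedily, keeping the "cloned" remaining quantities in two dictionaries instead of temporary tables. The function name `match_orders`, the `id` tie-breakers in the ORDER BY, and applying the batch limit to produced trades rather than to joined rows are all assumptions made for illustration.

```python
import sqlite3

def match_orders(conn, batch_size=100):
    """Greedily pair buyers and sellers, tracking remaining quantities in memory."""
    candidates = conn.execute(
        """
        SELECT S.id, B.id, S.playerHash, B.playerHash, S.itemId,
               S.amount - S.sold   AS sellerStock,
               B.amount - B.bought AS buyerNeeds
        FROM Buying AS B
        JOIN Selling AS S ON B.itemId = S.itemId
        WHERE B.aborted = 0 AND S.aborted = 0
          AND B.price >= S.price
          AND S.sold < S.amount
          AND B.bought < B.amount
        ORDER BY B.price DESC, S.id, B.id  -- id tie-breakers are an assumption
        """
    ).fetchall()

    seller_left = {}  # Selling.id -> stock not yet allocated in this batch
    buyer_left = {}   # Buying.id  -> amount still wanted in this batch
    trades = []
    for s_id, b_id, s_hash, b_hash, item_id, stock, needs in candidates:
        # Use the in-memory remainder if an earlier row already touched this order.
        stock = seller_left.setdefault(s_id, stock)
        needs = buyer_left.setdefault(b_id, needs)
        will_buy = min(stock, needs)  # same as the willBuyAmount IF(...) above
        if will_buy == 0:
            continue  # one side was used up by an earlier row
        seller_left[s_id] = stock - will_buy
        buyer_left[b_id] = needs - will_buy
        trades.append((s_hash, b_hash, item_id, will_buy))
        if len(trades) == batch_size:
            break
    return trades, seller_left

# The scenario from the question: two sellers with 99 left each, one buyer wanting 100.
conn = sqlite3.connect(":memory:")
for table, done in (("Buying", "bought"), ("Selling", "sold")):
    conn.execute(
        f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, itemId, amount, price, "
        f"{done}, collected, slot, aborted, playerHash)"
    )
conn.executemany(
    "INSERT INTO Selling VALUES (?,?,?,?,?,?,?,?,?)",
    [(1, 8, 100, 1, 1, 0, 0, 0, 1), (2, 8, 100, 1, 1, 0, 0, 0, 2)],
)
conn.execute("INSERT INTO Buying VALUES (1, 8, 100, 1, 0, 0, 0, 0, 3)")

trades, seller_left = match_orders(conn)
print(trades)       # [(1, 3, 8, 99), (2, 3, 8, 1)]
print(seller_left)  # {1: 0, 2: 98}
```

Each tuple in `trades` is (seller playerHash, buyer playerHash, itemId, willBuyAmount). Nothing is written back to the database, so the real sold and bought columns stay untouched until the caller decides to apply the batch.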

Related

how to access repeat purchase records for the next three months without self join?

I have a table with customer transaction information, for example records for one customer (identified by customer_id) look like this:
order_id bk_date booking_has_insurance_indicator
1 7/20 0
2 8/2 0
3 8/3 1
4 8/9 1
5 11/6 0
6 12/2 0
7 12/6 0
8 12/7 0
I'd like to find out, for each customer and each order_id, whether there are repeat purchases within 90 days, how many there are, and whether any of them has insurance attached. For example, for order_id = 1, there are three repeat purchases (order_id = 2, 3, 4) within 90 days, and some of those orders have insurance (order_id = 3, 4). The ideal output would look like this:
order_id bk_date repeat_count repeat_has_insurance_indicator
1 7/20 3 1
2 8/2 2 1
3 8/3 2 1
4 8/9 1 0
5 11/6 3 0
6 12/2 2 0
7 12/6 1 0
8 12/7 0 0
I'm aware that if I only want to access the next order record I can use LEAD window function without joining, but with question above, I could only think of self join to join each order_id to the ones with bk_date within 90 days. However, given the volume of the data with millions of customers, self join is also not an option due to memory limit. Could someone help me if there's a more efficient solution?
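One way to avoid pairing every order with every other order is to sort each customer's orders by date once and then look ahead with a binary search, using a prefix sum for the insurance flag. The sketch below shows that idea in plain Python rather than SQL; the function name `repeat_stats` and the year 2020 attached to the sample dates are assumptions, since the question gives only month and day. Some SQL engines can express the same look-ahead with a RANGE window frame, but whether a given engine supports that is something to check.

```python
from bisect import bisect_right
from datetime import date, timedelta
from itertools import accumulate

def repeat_stats(orders, window_days=90):
    """orders: (order_id, bk_date, has_insurance) tuples for ONE customer.

    Returns (order_id, repeat_count, repeat_has_insurance) per order.
    """
    orders = sorted(orders, key=lambda o: o[1])
    dates = [o[1] for o in orders]
    # prefix[i] = number of insured orders among the first i orders
    prefix = [0] + list(accumulate(o[2] for o in orders))
    result = []
    for i, (order_id, bk_date, _) in enumerate(orders):
        # Index one past the last order dated within the window.
        end = bisect_right(dates, bk_date + timedelta(days=window_days))
        repeat_count = end - (i + 1)
        insured = prefix[end] - prefix[i + 1]
        result.append((order_id, repeat_count, int(insured > 0)))
    return result

# Sample rows from the question; the year is an assumption.
sample = [
    (1, date(2020, 7, 20), 0), (2, date(2020, 8, 2), 0),
    (3, date(2020, 8, 3), 1), (4, date(2020, 8, 9), 1),
    (5, date(2020, 11, 6), 0), (6, date(2020, 12, 2), 0),
    (7, date(2020, 12, 6), 0), (8, date(2020, 12, 7), 0),
]
stats = repeat_stats(sample)
for row in stats:
    print(row)
```

This costs one sort plus one binary search per order for each customer, so memory stays proportional to a single customer's orders. Note that with a strict 90-day window this sketch gives a count of 1 for order_id 3 (the 11/6 order falls 95 days later), so the value of 2 in the sample output may rely on a different window definition.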

How to show the closest date to the selected one

I'm trying to extract the stock on a specific date. To do so, I'm computing a cumulative sum of stock movements by date, product and warehouse.
select m.codart AS REF,
m.descart AS 'DESCRIPTION',
m.codalm AS WAREHOUSE,
m.descalm AS WAREHOUSEDESCRIP,
m.unidades AS UNITS,
m.entran AS 'IN',
m.salen AS 'OUT',
m.entran*1 + m.salen*-1 as MOVEMENT,
(select sum(m1.entran*1 + m1.salen*-1)
from MOVSTOCKS m1
where m1.codart = m.codart and m1.codalm = m.codalm and m.fecdoc >= m1.fecdoc) as 'CUMULATIVE',
m.PRCMEDIO as 'VALUE',
m.FECDOC as 'DATE',
m.REFERENCIA as 'REF',
m.tipdoc as 'DOCUMENT'
from MOVSTOCKS m
where (m.entran <> 0 or m.salen <> 0)
and (select max(m2.fecdoc) from MOVSTOCKS m2) < '2020-11-30T00:00:00.000'
order by m.fecdoc
Without the and (select max(m2.fecdoc) from MOVSTOCKS m2) < '2020-11-30T00:00:00.000' it shows data like this, which is ok.
REF WAREHOUSE UNITS IN OUT MOVEMENT CUMULATIVE DATE
1 0 2 0 2 -2 -7 2020-11-25
1 1 3 0 3 -3 -3 2020-11-25
1 0 5 0 5 -5 -7 2020-11-25
1 0 9 9 0 9 2 2020-11-26
2 0 2 2 0 2 2 2020-11-26
1 0 1 1 0 1 3 2020-12-01
The problem is, with the subselect in the where clause it returns no results (I think it is because it just looks for the max date and says it is bigger than 2020-11-30). I would like it to show the closest dates (all of them, for each product and warehouse) to the selected one, in this case 2020-11-30.
It should look like this:
REF WAREHOUSE UNITS IN OUT MOVEMENT CUMULATIVE DATE
1 1 3 0 3 -3 -3 2020-11-25
1 0 9 9 0 9 2 2020-11-26
2 0 2 2 0 2 2 2020-11-26
Sorry if I'm not clear. Ask me if I have to clarify anything
Thank you
I am guessing that you want something like this:
select t.*
from (select m.*,
             sum(m.entran - m.salen) over (partition by m.codart, m.codalm order by m.fecdoc) as cumulative,
             max(m.fecdoc) over (partition by m.codart, m.codalm) as max_fecdoc
      from MOVSTOCKS m
      where m.fecdoc < '2020-11-30'
     ) t
where t.fecdoc = t.max_fecdoc;
The subquery calculates the cumulative amount of stock using window functions and keeps only records before the cutoff date. The outer query then selects the most recent record for each combination of codart/codalm, which seems to be how you are identifying a product in a warehouse.
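If only the balance and the date of the last movement before the cutoff are needed, a plain GROUP BY gives the same figures without window functions. The sketch below swaps in that simpler aggregate (it is not the window-function query itself) and runs it through Python's `sqlite3` module on the sample rows from the question; the function name `stock_before` and the reduced column list are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MOVSTOCKS (codart, codalm, entran, salen, fecdoc)")
conn.executemany(
    "INSERT INTO MOVSTOCKS VALUES (?,?,?,?,?)",
    [
        (1, 0, 0, 2, "2020-11-25"),
        (1, 1, 0, 3, "2020-11-25"),
        (1, 0, 0, 5, "2020-11-25"),
        (1, 0, 9, 0, "2020-11-26"),
        (2, 0, 2, 0, "2020-11-26"),
        (1, 0, 1, 0, "2020-12-01"),
    ],
)

def stock_before(conn, cutoff):
    """Balance and last movement date per product/warehouse before the cutoff."""
    return conn.execute(
        """
        SELECT codart, codalm,
               SUM(entran - salen) AS cumulative,
               MAX(fecdoc)         AS last_date
        FROM MOVSTOCKS
        WHERE fecdoc < ?
        GROUP BY codart, codalm
        ORDER BY codart, codalm
        """,
        (cutoff,),
    ).fetchall()

stock = stock_before(conn, "2020-11-30")
print(stock)
# [(1, 0, 2, '2020-11-26'), (1, 1, -3, '2020-11-25'), (2, 0, 2, '2020-11-26')]
```

The trade-off is that this returns one summary row per product and warehouse rather than the full movement row, so columns such as the document type would need a further join if they are required.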

Creating 2 "cartridges" of cumulative sum with conditions using SQL

I need to create 2 cumulative sums based on the value type, for example:
I have values of incoming stock units from 2 types: A and B. and I also have records of outgoing stock units.
If we have enough stock of type "A", it should be taken out of type A; if not, it should be taken out of type B. So basically I need to create the columns "A stock" and "B stock" below, representing the current balance of each type.
I tried using cumulative sum but I'm having trouble with the condition... is there a way to write this query without using a loop ? ( Vertica DB)
In table below A_stock and B_stock are the final result I need to create
ID Type In OUT A stock B stock Order_id
1 A 100 0 100 0 1
1 B 50 0 100 50 2
1 A 100 0 200 50 3
1 - 0 -200 0 50 4
1 - 0 -10 0 40 5
1 B 50 0 0 90 6
1 A 40 0 40 90 7
1 - 0 -20 20 90 8
2 A 30 0 30 0 1
2 B 20 0 30 20 2
2 A 10 0 40 20 3
2 - 0 -20 20 20 4
You can use window functions - but you need a column that defines the ordering of the rows, I assumed ordering_id:
select t.*,
sum(case when type = 'A' then in + out else 0 end) over(partition by id order by ordering_id) a_stock,
sum(case when type = 'B' then in + out else 0 end) over(partition by id order by ordering_id) b_stock
from mytable t
This assumes that you want the stock on a per-id basis; if that's not the case, just remove the partition clause from the over() clause.
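Because each outgoing row depends on the balances left by the previous rows, the rule from the question can also be written as a single ordered pass. The sketch below does that in plain Python, under the assumption that an outgoing quantity is taken entirely from A when A can cover it and entirely from B otherwise (the question does not say what happens when neither can). The function name `stock_balances` is hypothetical.

```python
def stock_balances(rows):
    """rows: (id, type, qty_in, qty_out) tuples, already sorted by id and Order_id.

    Returns (id, a_stock, b_stock) after each row.
    """
    balances = {}  # id -> {"A": ..., "B": ...}
    result = []
    for id_, type_, qty_in, qty_out in rows:
        bal = balances.setdefault(id_, {"A": 0, "B": 0})
        if type_ in bal:
            # Incoming stock goes to its own type.
            bal[type_] += qty_in
        else:
            # Outgoing stock: use A if it has enough, otherwise B (assumed rule).
            needed = -qty_out
            source = "A" if bal["A"] >= needed else "B"
            bal[source] -= needed
        result.append((id_, bal["A"], bal["B"]))
    return result

# Sample rows from the question's table.
sample = [
    (1, "A", 100, 0), (1, "B", 50, 0), (1, "A", 100, 0), (1, "-", 0, -200),
    (1, "-", 0, -10), (1, "B", 50, 0), (1, "A", 40, 0), (1, "-", 0, -20),
    (2, "A", 30, 0), (2, "B", 20, 0), (2, "A", 10, 0), (2, "-", 0, -20),
]
balances = stock_balances(sample)
for row in balances:
    print(row)
```

On the sample rows this reproduces the "A stock" and "B stock" columns shown in the question.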

T-SQL return table ordered by largest pairs first

I currently have a table that looks like this:
id carrots potatoes
1 10 0
2 0 5
3 0 0
4 15 3
5 13 2
I want to look at customers who ordered both carrots and potatoes. Like this:
id carrots potatoes
4 15 3
5 13 2
1 10 0
2 0 5
3 0 0
I am currently using an ORDER BY where both fields are DESC: ORDER BY carrots DESC, potatoes DESC
The problem is that this isn't always reliable. Right now it works, but in the case of a customer who ordered a lot of potatoes and no carrots, if I arbitrarily switch the order to ORDER BY potatoes DESC, carrots DESC it gives back
id carrots potatoes
2 0 5
4 15 3
5 13 2
1 10 0
3 0 0
What would your approach be?
Code at sqlfiddle here: http://sqlfiddle.com/#!18/60763/2. T-SQL/Microsoft SQL Server Management Studio 2016.
You can use:
order by (case when carrots > 0 then 1 else 0 end) + (case when potatoes > 0 then 1 else 0 end) desc
Or, if that is too much typing:
order by sign(carrots) + sign(potatoes) desc
You can simply sum the two columns:
order by carrots + potatoes desc
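The difference between ordering by how many products were bought and ordering by the plain sum shows up once a customer buys a lot of a single product. The sketch below runs both orderings through Python's `sqlite3` module; the table name `orders`, the `id` tie-breaker, and the extra row with id 6 (50 potatoes, no carrots) are hypothetical additions for illustration, not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, carrots INTEGER, potatoes INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?,?,?)",
    [(1, 10, 0), (2, 0, 5), (3, 0, 0), (4, 15, 3), (5, 13, 2),
     (6, 0, 50)],  # id 6 is a hypothetical row, not in the question
)

# Rank by how many of the two products were ordered (2, then 1, then 0).
by_count = [row[0] for row in conn.execute(
    """
    SELECT id FROM orders
    ORDER BY (CASE WHEN carrots > 0 THEN 1 ELSE 0 END)
           + (CASE WHEN potatoes > 0 THEN 1 ELSE 0 END) DESC,
             id
    """
)]

# Rank by the plain sum of both quantities.
by_sum = [row[0] for row in conn.execute(
    "SELECT id FROM orders ORDER BY carrots + potatoes DESC, id"
)]

print(by_count)  # [4, 5, 1, 2, 6, 3]
print(by_sum)    # [6, 4, 5, 1, 2, 3]
```

With the CASE expression, customers 4 and 5 stay on top because they ordered both products; with the plain sum, the hypothetical potato-only customer jumps ahead of them.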

Calculating Run Cost for lengths of Pipe & Pile

I work for a small company and we're trying to get away from Excel workbooks for inventory control. I thought I had it figured out with help from (Nasser), but it's beyond me. This is what I can get into a table; from there I need to get it to look like the table below.
My data
ID|GrpID|InOut| LoadFt | LoadCostft| LoadCost | RunFt | RunCost| AvgRunCostFt
1 1 1 4549.00 0.99 4503.51 4549.00 0 0
2 1 1 1523.22 1.29 1964.9538 6072.22 0 0
3 1 2 -2491.73 0 0 3580.49 0 0
4 1 2 -96.00 0 0 3484.49 0 0
5 1 1 8471.68 1.41 11945.0688 11956.17 0 0
6 1 2 -369.00 0 0 11468.0568 0 0
7 2 1 1030.89 5.07 5223.56 1030.89 0 0
8 2 1 314.17 5.75 1806.4775 1345.06 0 0
9 2 1 239.56 6.3 1508.24 1509.228 0 0
10 2 2 -554.46 0 0 954.768 0 0
11 2 1 826.24 5.884 4861.5961 1781.008 0 0
Expected output
ID|GrpID|InOut| LoadFt | LoadCostft| LoadCost | RunFt | RunCost| AvgRunCostFt
1 1 1 4549.00 0.99 4503.51 4549.00 4503.51 0.99
2 1 1 1523.22 1.29 1964.9538 6072.22 6468.4638 1.0653
3 1 2 -2491.73 1.0653 -2490.6647 3580.49 3977.7991 1.111
4 1 2 -96.00 1.111 -106.656 3484.49 3871.1431 1.111
5 1 1 8471.68 1.41 11945.0688 11956.17 15816.2119 1.3228
6 1 2 -369.00 1.3228 -488.1132 11468.0568 15328.0987 1.3366
7 2 1 1030.89 5.07 5223.56 1030.89 5223.56 5.067
8 2 1 314.17 5.75 1806.4775 1345.06 7030.0375 5.2266
9 2 1 239.56 6.3 1508.24 1509.228 8539.2655 5.658
10 2 2 -554.46 5.658 -3137.1346 954.768 5402.1309 5.658
11 2 1 826.24 5.884 4861.5961 1781.008 10263.727 5.7629
The first record of a group would be considered the opening balance. Inventory going into the yard has an InOut of 1 and inventory going out of the yard has an InOut of 2. Load footage going into the yard always has a load cost per foot, and I can calculate the running total of footage. For the first record of a group, the run cost and run cost per foot are easy to calculate. The next record is more difficult. When something goes out of the yard, I need to carry the previous average run cost per foot forward as that row's load cost per foot, and then calculate the run cost and average run cost per foot again. Hopefully this makes sense to somebody and we can automate some of these calculations. Thanks for any help.
Here's an Oracle example I found:
select order_id
     , volume
     , price
     , total_vol
     , total_costs
     , unit_costs
  from ( select order_id
              , volume
              , price
              , volume total_vol
              , 0.0 total_costs
              , 0.0 unit_costs
              , row_number() over (order by order_id) rn
           from costs
          order by order_id
       )
 model
       dimension by (order_id)
       measures (volume, price, total_vol, total_costs, unit_costs)
       rules iterate (4)
       ( total_vol[any] = volume[cv()] + nvl(total_vol[cv()-1],0.0)
       , total_costs[any]
         = case SIGN(volume[cv()])
           when -1 then total_vol[cv()] * nvl(unit_costs[cv()-1],0.0)
           else volume[cv()] * price[cv()] + nvl(total_costs[cv()-1],0.0)
           end
       , unit_costs[any] = total_costs[cv()] / total_vol[cv()]
       )
 order by order_id
/
ORDER_ID VOLUME PRICE TOTAL_VOL TOTAL_COSTS UNIT_COSTS
---------- ---------- ---------- ---------- ----------- ----------
1 1000 100 1000 100000 100
2 -500 110 500 50000 100
3 1500 80 2000 170000 85
4 -100 150 1900 161500 85
5 -600 110 1300 110500 85
6 700 105 2000 184000 92
6 rows selected.
Let me say first off three things:
This is certainly not the best way to do it. There is a rule saying that if you need a while-loop, then you are most probably doing something wrong.
I suspect there are some calculation errors in your original "Expected output". Please check the calculations, since my calculated values differ from yours when I follow your formulas.
This question could also be seen as a gimme teh codez type of question, but since you asked a decently formed question with some follow-up research, my answer is below. (So no upvoting since this is help for a specific case)
Now onto the solution:
I attempted to use my initial hint of the LAG statement in a nicely formed single update statement, but since you can only use a windowed function (aka LAG) inside a select or order by clause, that will not work.
What the code below does in short:
It calculates the various calculated fields for each record when they can be calculated and with the appropriate functions, updates the table and then moves onto the next record.
Please see comments in the code for additional information.
TempTable is a demo table (visible in the linked SQLFiddle).
Please read this answer for information about decimal(19, 4)
-- Our state and running variables
DECLARE @curId INT = 0,
        @curGrpId INT,
        @prevId INT = 0,
        @prevGrpId INT = 0,
        @LoadCostFt DECIMAL(19, 4),
        @RunFt DECIMAL(19, 4),
        @RunCost DECIMAL(19, 4)

WHILE EXISTS (SELECT 1
              FROM TempTable
              WHERE DoneFlag = 0) -- DoneFlag is a bit column I added to the table for calculation purposes, could also be called "IsCalced"
BEGIN
    SELECT top 1 -- top 1 here to get the next row based on the ID column
           @prevId = @curId,
           @curId = tmp.ID,
           @curGrpId = Grpid
    FROM TempTable tmp
    WHERE tmp.DoneFlag = 0
    ORDER BY tmp.GrpID, tmp.ID -- order by to ensure that we get everything from one GrpID first

    -- Calculate the LoadCostFt.
    -- It is either predetermined (if InOut = 1) or derived from the previous record's AvgRunCostFt (if InOut = 2)
    SELECT @LoadCostFt = CASE
                             WHEN tmp.INOUT = 2
                                 THEN (lag(tmp.AvgRunCostFt, 1, 0.0) OVER (partition BY GrpId ORDER BY ID))
                             ELSE tmp.LoadCostFt
                         END
    FROM TempTable tmp
    WHERE tmp.ID IN (@curId, @prevId)
      AND tmp.GrpID = @curGrpId

    -- Calculate the LoadCost
    UPDATE TempTable
    SET LoadCost = LoadFt * @LoadCostFt
    WHERE Id = @curId

    -- Calculate the current RunFt and RunCost based on the current LoadFt and LoadCost plus the previous row's RunFt and RunCost
    SELECT @RunFt = (LoadFt + (lag(RunFt, 1, 0) OVER (partition BY GrpId ORDER BY ID))),
           @RunCost = (LoadCost + (lag(RunCost, 1, 0) OVER (partition BY GrpId ORDER BY ID)))
    FROM TempTable tmp
    WHERE tmp.ID IN (@curId, @prevId)
      AND tmp.GrpID = @curGrpId

    -- Set all our values, including the AvgRunCostFt calc
    UPDATE TempTable
    SET RunFt = @RunFt,
        RunCost = @RunCost,
        LoadCostFt = @LoadCostFt,
        AvgRunCostFt = @RunCost / @RunFt,
        doneflag = 1
    WHERE ID = @curId
END

SELECT ID, GrpID, InOut, LoadFt, RunFt, LoadCost,
       RunCost, LoadCostFt, AvgRunCostFt
FROM TempTable
ORDER BY GrpID, Id
The output with your sample data and a SQLFiddle demonstrating how it all works:
ID GrpID InOut LoadFt RunFt LoadCost RunCost LoadCostFt AvgRunCostFt
1 1 1 4549 4549 4503.51 4503.51 0.99 0.99
2 1 1 1523.22 6072.22 1964.9538 6468.4638 1.29 1.0653
3 1 2 -2491.73 3580.49 -2654.44 3814.0238 1.0653 1.0652
4 1 2 -96 3484.49 -102.2592 3711.7646 1.0652 1.0652
5 1 1 8471.68 11956.17 11945.0688 15656.8334 1.41 1.3095
6 1 2 -369 11587.17 -483.2055 15173.6279 1.3095 1.3095
7 2 1 1030.89 1030.89 5226.6123 5226.6123 5.07 5.07
8 2 1 314.17 1345.06 1806.4775 7033.0898 5.75 5.2288
9 2 1 239.56 1584.62 1509.228 8542.3178 6.3 5.3908
10 2 2 -554.46 1030.16 -2988.983 5553.3348 5.3908 5.3907
11 2 1 826.24 1856.4 4861.5962 10414.931 5.884 5.6103
If you are unclear about parts of the code, I can update with additional explanations.
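For readers who want to check the arithmetic outside the database, the same row-by-row moving-average calculation can be sketched in a few lines of Python. This is an illustration under assumptions, not a replacement for the T-SQL: the function name `running_average_cost` is hypothetical, and it uses floats without the intermediate DECIMAL(19, 4) rounding, so results can differ from the table above in the last digits.

```python
def running_average_cost(rows):
    """rows: (grp_id, in_out, load_ft, load_cost_ft) tuples in GrpID, ID order.

    Returns (grp_id, run_ft, run_cost, avg_run_cost_ft) after each row.
    """
    state = {}  # grp_id -> (run_ft, run_cost, avg_run_cost_ft)
    out = []
    for grp_id, in_out, load_ft, load_cost_ft in rows:
        run_ft, run_cost, avg = state.get(grp_id, (0.0, 0.0, 0.0))
        if in_out == 2:
            # Outgoing loads are costed at the previous average cost per foot.
            load_cost_ft = avg
        load_cost = load_ft * load_cost_ft
        run_ft += load_ft
        run_cost += load_cost
        # Guard against an empty yard (assumed behaviour: average resets to 0).
        avg = run_cost / run_ft if run_ft else 0.0
        state[grp_id] = (run_ft, run_cost, avg)
        out.append((grp_id, round(run_ft, 2), round(run_cost, 4), round(avg, 4)))
    return out

# GrpID 1 from the question's data: (GrpID, InOut, LoadFt, LoadCostFt)
sample = [
    (1, 1, 4549.00, 0.99),
    (1, 1, 1523.22, 1.29),
    (1, 2, -2491.73, 0),
    (1, 2, -96.00, 0),
    (1, 1, 8471.68, 1.41),
    (1, 2, -369.00, 0),
]
result = running_average_cost(sample)
for row in result:
    print(row)
```

Removing stock at the current average leaves the average unchanged, which is why the AvgRunCostFt column only moves on incoming rows (apart from rounding).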