Get nearest date column value from another table in SQL Server

I have two tables A and B,
Table A
PstngDate WorkingDayOutput
12/1/2020 221
12/3/2020 327
12/4/2020 509
12/5/2020 418
12/7/2020 390
12/8/2020 431
12/9/2020 244
12/10/2020 246
12/11/2020 314
12/12/2020 301
12/14/2020 411
12/15/2020 530
12/16/2020 554
12/17/2020 300
12/18/2020 375
12/23/2020 402
12/24/2020 302
12/25/2020 269
12/26/2020 382
12/28/2020 608
Table B
PstngDate HolidayOutput isWorkingDay
12/2/2020 20 0
12/6/2020 24 0
12/13/2020 31 0
12/19/2020 82 0
12/22/2020 507 0
12/27/2020 537 0
Expected output:
PstngDate WorkingDayOutput HolidayOutput
12/1/2020 221 20
12/3/2020 327
12/4/2020 509
12/5/2020 418 24
12/7/2020 390
12/8/2020 431
12/9/2020 244
12/10/2020 246
12/11/2020 314
12/12/2020 301 31
12/14/2020 411
12/15/2020 530
12/16/2020 554
12/17/2020 300
12/18/2020 375 589
12/23/2020 402
12/24/2020 302
12/25/2020 269
12/26/2020 382 537
12/28/2020 608
I want to join TableB to TableA on the nearest lesser date. As the expected output shows, the HolidayOutput value for the 12/18 row is the sum of the 12/19 and 12/22 rows of Table B (82 + 507 = 589).

I want to join TableB to TableA with nearest lesser date column
This sounds like a lateral join, which SQL Server spells OUTER APPLY:
select a.*, coalesce(b.holidayquantity, 0) as holidayquantity
from a
outer apply (
select top (1) b.*
from b
where b.pstng_date >= a.pstng_date
order by b.pstng_date
) b

You can use a join with ROW_NUMBER() as follows:
Select pstng_date, workingDayQuantity,
HolidayQuantity,
workingDayQuantity + HolidayQuantity as total
From
(Select a.*, b.HolidayQuantity,
Row_number() over (partition by a.pstng_date order by b.pstng_date) as rn
From tablea a join tableb b On b.pstng_date > a.pstng_date) t
Where rn=1
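As written, both queries above pick a single TableB row per TableA row, so neither sums 12/19 and 12/22 into the 12/18 row as the expected output does. A minimal sketch of one way to get that result follows: map each TableB row to the latest TableA date before it, sum per mapped date, then left join. It runs on SQLite through Python only so that it is self-contained; storing dates as ISO strings and the names PrevDate, Qty and mapped are assumptions made for this illustration, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (PstngDate TEXT, WorkingDayOutput INTEGER);
CREATE TABLE TableB (PstngDate TEXT, HolidayOutput INTEGER, isWorkingDay INTEGER);
""")

# (day of December 2020, output) pairs copied from the question.
a_rows = [(1, 221), (3, 327), (4, 509), (5, 418), (7, 390), (8, 431), (9, 244),
          (10, 246), (11, 314), (12, 301), (14, 411), (15, 530), (16, 554),
          (17, 300), (18, 375), (23, 402), (24, 302), (25, 269), (26, 382),
          (28, 608)]
b_rows = [(2, 20), (6, 24), (13, 31), (19, 82), (22, 507), (27, 537)]

# Assumption: ISO date strings, so text comparison matches date order.
conn.executemany("INSERT INTO TableA VALUES (?, ?)",
                 [(f"2020-12-{d:02d}", out) for d, out in a_rows])
conn.executemany("INSERT INTO TableB VALUES (?, ?, 0)",
                 [(f"2020-12-{d:02d}", out) for d, out in b_rows])

query = """
SELECT a.PstngDate, a.WorkingDayOutput, h.HolidayOutput
FROM TableA a
LEFT JOIN (
    -- Sum the holiday rows that share the same preceding working day.
    SELECT mapped.PrevDate, SUM(mapped.Qty) AS HolidayOutput
    FROM (
        -- For each holiday row, find the latest working day before it.
        SELECT (SELECT MAX(a2.PstngDate) FROM TableA a2
                WHERE a2.PstngDate < b.PstngDate) AS PrevDate,
               b.HolidayOutput AS Qty
        FROM TableB b
    ) AS mapped
    GROUP BY mapped.PrevDate
) h ON h.PrevDate = a.PstngDate
ORDER BY a.PstngDate
"""
result = {d: (w, h) for d, w, h in conn.execute(query)}
print(result["2020-12-18"])
```

Working days with no holiday directly after them come back with NULL (None in Python) in the HolidayOutput column, which corresponds to the blank cells in the expected output.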

Related

how to select a value based on multiple criteria

I'm trying to select some values based on some proprietary data, and I just changed the variables to reference house prices.
I am trying to get the total offers for houses where they were sold at the bid or at the ask price, with offers under 15 and offers * sale price less than 5,000,000.
I then want to get the total number of offers for each neighborhood on each day, but instead I'm getting the total offers across each neighborhood (n1 + n2 + n3 + n4 + n5) across all dates and the total offers in the dataset across all dates.
My current query is this:
SELECT DISTINCT(neighborhood),
DATE(date_of_sale),
(SELECT SUM(offers)
FROM `big_query.a_table_name.houseprices`
WHERE ((offers * accepted_sale_price < 5000000)
AND (offers < 15)
AND (house_bid = sale_price OR
house_ask = sale_price))) as bid_ask_off,
(SELECT SUM(offers)
FROM `big_query.a_table_name.houseprices`) as
total_offers,
FROM `big_query.a_table_name.houseprices`
GROUP BY neighborhood, DATE(date_of_sale) LIMIT 100
Which I am expecting a result like, with date being repeated throughout as d1, d2, d3, etc.:
but am instead receiving
I'm aware that there are some inherent problems with what I'm trying to select / group, but I'm not sure what to google or what tutorials to look at in order to perform this operation.
It's querying quite a bit of data, and I want to keep costs down, as I've already racked up a smallish bill on queries.
Any help or advice would be greatly appreciated, and I hope I've provided enough information.
Here is a sample dataframe.
neighborhood date_of_sale offers accepted_sale_price house_bid house_ask
bronx 4/1/2022 3 323 320 323
manhattan 4/1/2022 4 244 230 244
manhattan 4/1/2022 8 856 856 900
queens 4/1/2022 15 110 110 135
brooklyn 4/2/2022 12 115 100 115
manhattan 4/2/2022 9 255 255 275
bronx 4/2/2022 6 330 300 330
queens 4/2/2022 10 405 395 405
brooklyn 4/2/2022 4 254 254 265
staten_island 4/3/2022 2 442 430 442
staten_island 4/3/2022 13 195 195 225
bronx 4/3/2022 4 650 650 690
manhattan 4/3/2022 2 286 266 286
manhattan 4/3/2022 6 356 356 400
staten_island 4/4/2022 4 361 361 401
staten_island 4/4/2022 5 348 348 399
bronx 4/4/2022 8 397 340 397
manhattan 4/4/2022 9 333 333 394
manhattan 4/4/2022 11 392 325 392
I think that this is what you need.
As we group by neighbourhood we do not need DISTINCT.
We take sum(offers) for total_offers directly from the table and bids from a sub-query which we join to so that it is grouped by neighbourhood.
SELECT
h.neighborhood,
DATE(h.date_of_sale) AS date_,
s.bids AS bid_ask_off,
SUM(h.offers) AS total_offers,
FROM
`big_query.a_table_name.houseprices` h
LEFT JOIN
(SELECT
neighborhood,
SUM(offers) AS bids
FROM
`big_query.a_table_name.houseprices`
WHERE offers * accepted_sale_price < 5000000
AND offers < 15
AND (house_bid = sale_price OR
house_ask = sale_price)
GROUP BY neighborhood) s
ON h.neighborhood = s.neighborhood
GROUP BY
h.neighborhood,
DATE(date_of_sale),
s.bids
LIMIT 100;
Or the following, which changes the original query more but may be closer to what you need: here the sub-query is grouped by date as well as by neighborhood.
SELECT
h.neighborhood,
DATE(h.date_of_sale) AS date_,
s.bids AS bid_ask_off,
SUM(h.offers) AS total_offers,
FROM
`big_query.a_table_name.houseprices` h
LEFT JOIN
(SELECT
date_of_sale dos,
neighborhood,
SUM(offers) AS bids
FROM
`big_query.a_table_name.houseprices`
WHERE offers * accepted_sale_price < 5000000
AND offers < 15
AND (house_bid = sale_price OR
house_ask = sale_price)
GROUP BY
neighborhood,
date_of_sale) s
ON h.neighborhood = s.neighborhood
AND h.date_of_sale = s.dos
GROUP BY
h.neighborhood,
DATE(date_of_sale),
s.bids
LIMIT 100;
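Another option, not taken from the answer above, is conditional aggregation: a single pass over the table with SUM(CASE ...) yields both totals per neighborhood and day without a sub-query. The sketch below runs on SQLite through Python with the first six sample rows. It compares against accepted_sale_price because the sample data has no sale_price column, and it stores dates as ISO strings; both are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE houseprices (
    neighborhood TEXT, date_of_sale TEXT, offers INTEGER,
    accepted_sale_price INTEGER, house_bid INTEGER, house_ask INTEGER)""")

# First six rows of the sample data, dates rewritten as ISO strings.
rows = [
    ("bronx", "2022-04-01", 3, 323, 320, 323),
    ("manhattan", "2022-04-01", 4, 244, 230, 244),
    ("manhattan", "2022-04-01", 8, 856, 856, 900),
    ("queens", "2022-04-01", 15, 110, 110, 135),
    ("brooklyn", "2022-04-02", 12, 115, 100, 115),
    ("manhattan", "2022-04-02", 9, 255, 255, 275),
]
conn.executemany("INSERT INTO houseprices VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT neighborhood, date_of_sale,
       -- Only rows meeting all three conditions contribute their offers.
       SUM(CASE WHEN offers * accepted_sale_price < 5000000
                 AND offers < 15
                 AND (house_bid = accepted_sale_price
                      OR house_ask = accepted_sale_price)
                THEN offers ELSE 0 END) AS bid_ask_off,
       SUM(offers) AS total_offers
FROM houseprices
GROUP BY neighborhood, date_of_sale
ORDER BY date_of_sale, neighborhood
"""
result = {(n, d): (b, t) for n, d, b, t in conn.execute(query)}
for key, value in result.items():
    print(key, value)
```

The queens row on 2022-04-01 has 15 offers, which fails the offers < 15 condition, so it counts toward total_offers but not toward bid_ask_off.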

SQL iterative // recursive cte with conditions (subtract from previous rows)

I have this query calculating how many products I have to produce to serve my pending orders and the components I need to produce them.
select
l.codart as SKU, --final product
e.codartc as Component, --piece of final product
e.unicompo, --Components needed for each SKU
l1.SKU_pending - s.SKU_STOCK as "SKU to produce",
s2.C_STOCK as "Component stock",
s2.C_STOCK - sum((l1.SKU_pending - s.SKU_STOCK) * e.unicompo)
over (partition by e.codartc order by l.codart) as "Component stock after producing"
from linepedi l --table with sales orders
left join escandallo e on e.codartp = l.codart --table with SKU components
inner join (select l1.codart, sum(l1.unidades - l1.uniservida - l1.unianulada) as "SKU_pending" --pending sales. I called it from a subquery so I don't have to repeat the calculation each time I need it
from linepedi l1
where (l1.unidades - l1.uniservida - l1.unianulada) > 0
group by l1.codart) l1 on l1.CODART = l.codart
left join (select s.codart, sum(s.unidades) as "SKU_STOCK"
from __STOCKALMART s
group by s.codart) s on s.codart = l.codart
left join (select s.codart, sum(s.unidades) as "C_STOCK"
from __STOCKALMART s
group by s.codart) s2 on s2.codart = e.codartc
where l1.SKU_pending - s.SKU_STOCK > 0
group by l.codart, e.codartc, e.unicompo, l1.SKU_pending, s.SKU_STOCK, s2.C_STOCK
order by l.codart
The query returns the following table:
SKU     Component  unicompo  SKU to produce  Component stock  Component stock after producing
20611   286        1         50              2021             1971
20611   329        1         50              2759             2709
20611   ARTZD031   1         50              643              593
220178  ARTZD027   1         384             477              93
220178  SICBB005   1         384             845              461
220178  265        1         384             894              510
220185  265        1         200             894              310
220185  SICBB005   1         200             845              261
220185  ARTZD028   1         200             71               -129
220192  ARTZD029   1         200             364              164
220192  SICBB005   1         200             845              61
220192  265        1         200             894              110
When "Component stock after producing" falls below 0, I don't want it to subtract the "SKU to produce", but the minimum component stock for that SKU, while "saving" the resulting value for the next time I need the same component. I think I would need an iteration with conditionals.
This is what I'd like to accomplish:
SKU     Component  unicompo  SKU to produce  Component stock  Component stock after producing
20611   286        1         50              2021             1971
20611   329        1         50              2759             2709
20611   ARTZD031   1         50              643              593
220178  ARTZD027   1         384             477              93
220178  SICBB005   1         384             845              461
220178  265        1         384             894              510
220185  265        1         200             894              439
220185  SICBB005   1         200             845              390
220185  ARTZD028   1         200             71               0
220192  ARTZD029   1         200             364              164
220192  SICBB005   1         200             845              190
220192  265        1         200             894              239
I've been reading some articles and I feel like it might be done with a recursive CTE, but I don't really know how since I didn't find any example similar to mine.
How can I achieve this? Any help will be appreciated. Thank you very much.
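To pin down the rule being asked for, here is a minimal sketch of the iteration in plain Python rather than SQL. It is one reading of the desired table, not a confirmed answer: each SKU is produced only up to what its scarcest component allows, and the reduced stock is carried forward to later SKUs. The function name allocate and the data layout are hypothetical; the numbers are copied from the tables above.

```python
# (SKU, units to produce, {component: units needed per SKU}), in processing order.
orders = [
    ("20611", 50, {"286": 1, "329": 1, "ARTZD031": 1}),
    ("220178", 384, {"ARTZD027": 1, "SICBB005": 1, "265": 1}),
    ("220185", 200, {"265": 1, "SICBB005": 1, "ARTZD028": 1}),
    ("220192", 200, {"ARTZD029": 1, "SICBB005": 1, "265": 1}),
]

# Starting component stock, taken from the "Component stock" column.
stock = {"286": 2021, "329": 2759, "ARTZD031": 643, "ARTZD027": 477,
         "SICBB005": 845, "265": 894, "ARTZD028": 71, "ARTZD029": 364}


def allocate(orders, stock):
    """Return (sku, component, units produced, stock after) rows."""
    remaining = dict(stock)  # running stock, carried from one SKU to the next
    rows = []
    for sku, to_produce, components in orders:
        # Cap production by the component that runs out first.
        limits = [remaining[c] // per_unit for c, per_unit in components.items()]
        producible = min([to_produce] + limits)
        for c, per_unit in components.items():
            remaining[c] -= producible * per_unit
            rows.append((sku, c, producible, remaining[c]))
    return rows


rows = allocate(orders, stock)
for row in rows:
    print(row)
```

On this data SKU 220185 is limited to 71 units by ARTZD028, which is why components 265 and SICBB005 end at 439 and 390 instead of 310 and 261. Because each row depends on the running stock left by earlier rows, a plain window SUM cannot express it, which is consistent with the suspicion that a recursive CTE or a loop is needed.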

SQL JOIN with 2 aggregates returning incorrect results

I am trying to join 3 different tables to get how many Home Runs a player has in his career along with how many Awards he has received. However, I'm getting incorrect results:
Peoples
PlayerId
Battings
PlayerId, HomeRuns
AwardsPlayers
PlayerId, AwardName
Current Attempt
SELECT TOP 25 Peoples.PlayerId, SUM(Battings.HomeRuns) as HomeRuns, COUNT(AwardsPlayers.PlayerId)
FROM Peoples
JOIN Battings ON Battings.PlayerId = Peoples.PlayerId
JOIN AwardsPlayers ON AwardsPlayers.PlayerId = Battings.PlayerId
GROUP BY Peoples.PlayerId
ORDER BY SUM(HomeRuns) desc
Result
PlayerID HomeRuns AwardCount
bondsba01 35814 1034
ruthba01 23562 726
rodrial01 21576 682
mayswi01 21120 736
willite01 20319 741
griffke02 18270 667
schmimi01 18084 594
musiast01 16150 748
pujolal01 14559 414
dimagjo01 12996 468
ripkeca01 12499 609
gehrilo01 12325 425
aaronha01 12080 368
foxxji01 11748 462
ramirma02 10545 399
benchjo01 10114 442
sosasa01 9744 304
ortizda01 9738 360
piazzmi01 9394 396
winfida01 9300 460
rodriiv01 9019 667
robinfr02 8790 330
dawsoan01 8760 420
robinbr01 8576 736
hornsro01 8127 648
I am pretty confident it's my second join. Do I need to do some sort of subquery, or should this work? Barry Bonds definitely does not have 35,814 Home Runs, nor does he have 1,034 Awards.
If I just do a single join, I get the correct output:
SELECT TOP 25 Peoples.PlayerId, SUM(Battings.HomeRuns) as HomeRuns
FROM Peoples
JOIN Battings ON Battings.PlayerId = Peoples.PlayerId
GROUP BY Peoples.PlayerId
ORDER BY SUM(HomeRuns) desc
bondsba01 762
aaronha01 755
ruthba01 714
rodrial01 696
mayswi01 660
pujolal01 633
griffke02 630
thomeji01 612
sosasa01 609
robinfr02 586
mcgwima01 583
killeha01 573
palmera01 569
jacksre01 563
ramirma02 555
schmimi01 548
ortizda01 541
mantlmi01 536
foxxji01 534
mccovwi01 521
thomafr04 521
willite01 521
bankser01 512
matheed01 512
ottme01 511
What am I doing wrong? I'm sure it's how I'm joining my second table (AwardsPlayers)
I think you have two independent dimensions. The best approach is to aggregate before joining:
SELECT TOP 25 p.PlayerId, b.HomeRuns, ap.cnt
FROM Peoples p LEFT JOIN
(SELECT b.PlayerId, SUM(b.HomeRuns) as HomeRuns
FROM Battings b
GROUP BY b.PlayerId
) b
ON b.PlayerId = p.PlayerId LEFT JOIN
(SELECT ap.PlayerId, COUNT(*) as cnt
FROM AwardsPlayers ap
GROUP BY ap.PlayerId
) ap
ON ap.PlayerId = p.PlayerId
ORDER BY b.HomeRuns desc;
Result
bondsba01 762 47
aaronha01 755 16
ruthba01 714 33
rodrial01 696 31
mayswi01 660 32
pujolal01 633 23
griffke02 630 29
thomeji01 612 6
sosasa01 609 16
robinfr02 586 15
mcgwima01 583 9
killeha01 573 8
palmera01 569 8
jacksre01 563 13
ramirma02 555 19
schmimi01 548 33
ortizda01 541 18
mantlmi01 536 15
foxxji01 534 22
mccovwi01 521 10
thomafr04 521 10
willite01 521 39
bankser01 512 10
matheed01 512 4
ottme01 511 11
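The inflated numbers come from row multiplication: joining both child tables pairs every batting row with every award row for the same player. Bonds's 35,814 is exactly 762 x 47, his real home run total times his award count. A small sketch on SQLite through Python shows the effect; the player p1 and its rows are made-up illustration data, not taken from the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Peoples (PlayerId TEXT);
CREATE TABLE Battings (PlayerId TEXT, HomeRuns INTEGER);
CREATE TABLE AwardsPlayers (PlayerId TEXT, AwardName TEXT);
-- Hypothetical player: 3 batting rows (60 home runs in total) and 2 awards.
INSERT INTO Peoples VALUES ('p1');
INSERT INTO Battings VALUES ('p1', 10), ('p1', 20), ('p1', 30);
INSERT INTO AwardsPlayers VALUES ('p1', 'MVP'), ('p1', 'All-Star');
""")

# Joining both tables directly yields 3 x 2 = 6 rows before aggregation.
naive = conn.execute("""
SELECT p.PlayerId, SUM(b.HomeRuns), COUNT(ap.PlayerId)
FROM Peoples p
JOIN Battings b ON b.PlayerId = p.PlayerId
JOIN AwardsPlayers ap ON ap.PlayerId = b.PlayerId
GROUP BY p.PlayerId
""").fetchone()

# Aggregating each table first leaves one row per player on each side.
fixed = conn.execute("""
SELECT p.PlayerId, b.HomeRuns, ap.cnt
FROM Peoples p
LEFT JOIN (SELECT PlayerId, SUM(HomeRuns) AS HomeRuns
           FROM Battings GROUP BY PlayerId) b ON b.PlayerId = p.PlayerId
LEFT JOIN (SELECT PlayerId, COUNT(*) AS cnt
           FROM AwardsPlayers GROUP BY PlayerId) ap ON ap.PlayerId = p.PlayerId
""").fetchone()

print(naive)  # home runs doubled (once per award), awards tripled (once per batting row)
print(fixed)
```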

HSQLDB query to replace a null value with a value derived from another record

This is a small excerpt from a much larger table, call it LOG:
RN EID FID FRID TID TFAID
1 364 509 7045 null 7452
2 364 509 7045 7452 null
3 364 509 7045 7457 null
4 375 512 4525 5442 5241
5 375 513 4525 5863 5241
6 375 515 4525 2542 5241
7 576 621 5632 null 5452
8 576 621 5632 2595 null
9 672 622 5632 null 5966
10 672 622 5632 2635 null
I would like a query that replaces each null in the 'TFAID' column with the 'TFAID' value from the row that has the same 'FID'.
Desired output would therefore be:
RN EID FID FRID TID TFAID
1 364 509 7045 null 7452
2 364 509 7045 7452 7452
3 364 509 7045 7457 7452
4 375 512 4525 5442 5241
5 375 513 4525 5863 5241
6 375 515 4525 2542 5241
7 576 621 5632 null 5452
8 576 621 5632 2595 5452
9 672 622 5632 null 5966
10 672 622 5632 2635 5966
I know that something like
SELECT RN,
EID,
FID,
FRID,
TID,
COALESCE(TFAID, {insert clever code here}) AS TFAID
FROM LOG
is what I need, but I can't for the life of me come up with the clever bit of SQL that will fill in the proper TFAID.
HSQLDB supports SQL features that can be used as alternatives. These features are not supported by some other databases.
CREATE TABLE LOG (RN INT, EID INT, FID INT, FRID INT, TID INT, TFAID INT);
-- using LATERAL
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, f.TFAID) AS TFAID
FROM LOG l , LATERAL (SELECT MAX(TFAID) AS TFAID FROM LOG f WHERE f.FID = l.FID) f;
-- using scalar subquery
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, (SELECT MAX(TFAID) AS TFAID FROM LOG f WHERE f.FID = l.FID)) AS TFAID
FROM LOG l;
Here is one approach. This aggregates the log to get the value and then joins the result in:
SELECT l.RN, l.EID, l.FID, l.FRID, l.TID,
COALESCE(l.TFAID, f.TFAID) AS TFAID
FROM LOG l join
(select fid, max(tfaid) as tfaid
from log
group by fid
) f
on l.fid = f.fid;
There may be other approaches that are more efficient. However, HSQL doesn't implement all SQL features.
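As a quick check of the aggregate-and-join approach against the sample rows, the sketch below runs the same query shape on SQLite through Python. SQLite stands in for HSQLDB here only so the example is self-contained, which is an assumption about portability rather than a statement about HSQLDB itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE LOG (RN INT, EID INT, FID INT, FRID INT, TID INT, TFAID INT)")

# Sample rows from the question; None is stored as SQL NULL.
rows = [
    (1, 364, 509, 7045, None, 7452),
    (2, 364, 509, 7045, 7452, None),
    (3, 364, 509, 7045, 7457, None),
    (4, 375, 512, 4525, 5442, 5241),
    (5, 375, 513, 4525, 5863, 5241),
    (6, 375, 515, 4525, 2542, 5241),
    (7, 576, 621, 5632, None, 5452),
    (8, 576, 621, 5632, 2595, None),
    (9, 672, 622, 5632, None, 5966),
    (10, 672, 622, 5632, 2635, None),
]
conn.executemany("INSERT INTO LOG VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT l.RN, COALESCE(l.TFAID, f.TFAID) AS TFAID
FROM LOG l
JOIN (SELECT FID, MAX(TFAID) AS TFAID   -- MAX ignores NULLs within each FID
      FROM LOG
      GROUP BY FID) f ON l.FID = f.FID
ORDER BY l.RN
"""
filled = dict(conn.execute(query))  # maps RN to the filled-in TFAID
print(filled)
```

MAX works here because each FID in the sample carries a single non-null TFAID; if one FID could carry several different values, the choice of aggregate would need more thought.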

Stock on the fly missing values with no sales

I have the following queries,
QryStockOnHand
SELECT QrySaleTot.Item, QrySaleTot.ProductID, [QryStockLevel].[Stock]-[QrySaleTot].[Quantity] AS StockOnHand
FROM QryStockLevel INNER JOIN QrySaleTot ON QryStockLevel.ProductID = QrySaleTot.ProductID;
QrySaleTot
SELECT TblProduct.Item, Sum(TblTotalSale.Size) AS Quantity, TblProduct.ProductID
FROM TblProduct INNER JOIN TblTotalSale ON TblProduct.[ProductID] = TblTotalSale.[ProductID]
GROUP BY TblProduct.Item, TblProduct.ProductID;
QryStockLevel
SELECT TblStock.ProductID, Sum(TblStock.StockLevel) AS Stock, TblProduct.Item
FROM TblStock INNER JOIN TblProduct ON TblStock.ProductID = TblProduct.ProductID
GROUP BY TblStock.ProductID, TblProduct.Item;
When I run QryStockOnHand and no sales of a product have been made, the product does not appear in the result of the query.
Sample Data
TblStock
StockID ProductID StockLevel
138 1 528
139 3 528
140 5 528
141 9 528
142 7 528
143 18 80
144 30 72
145 34 72
146 33 72
147 32 200
148 22 80
149 19 80
150 23 80
151 20 80
TblProduct
ProductID Item Price StockDelivery PriceSmall Large Small
1 Carling £2.50 528 £1.40 2 1
3 Carlsburg £2.70 528 £1.60 2 1
5 IPA £2.30 528 £1.20 2 1
7 StrongBow £2.80 528 £1.65 2 1
9 RevJames £2.45 528 £1.30 2 1
11 Becks £2.90 72 1
12 WKDBlue £2.80 72 1
13 WKDRed £2.80 72 1
14 SmirnoffIce £2.80 72 1
TblTotalSale
TotalSalesID ProductID SalePrice Day Time Size
576 1 £1.40 19/02/2012 15:34:24 1
528 1 £2.50 09/02/2012 14:44:44 2
530 1 £1.40 09/02/2012 14:44:44 1
565 1 £2.50 19/02/2012 15:34:24 2
567 1 £1.40 19/02/2012 15:34:24 1
570 3 £2.70 19/02/2012 15:34:24 2
571 3 £1.60 19/02/2012 15:34:24 1
577 3 £2.70 19/02/2012 15:34:24 2
578 3 £1.60 19/02/2012 15:34:24 1
533 3 £2.70 09/02/2012 14:44:44 2
534 3 £1.60 09/02/2012 14:44:44 1
Any idea why? I guess it is a null thing, where no sales are seen as a nonexistent row instead of zero sales. Any idea how I could fix it?
Thanks
Sam
Instead of an inner join, use a left outer join. It returns all of the rows from the left-hand table of the join, whereas an inner join returns only the rows that have matching values in both tables.
I don't know the QryStockLevel fields, but your query should look something like this:
SELECT QryStockLevel.Item, QryStockLevel.ProductID, [QryStockLevel].[Stock]-NZ([QrySaleTot].[Quantity],0) AS StockOnHand
FROM QryStockLevel LEFT OUTER JOIN QrySaleTot ON QryStockLevel.ProductID = QrySaleTot.ProductID;
Note the NZ function, which turns the null Quantity into 0 when QrySaleTot has no row for a product.
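The effect can be reproduced outside Access. The sketch below uses SQLite through Python, with COALESCE standing in for Access's NZ and the two saved queries modelled as plain tables; both substitutions are assumptions made so the example is self-contained. Stock levels come from TblStock, and the quantities 7 and 9 are the sums of the Size column in TblTotalSale for products 1 and 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-ins for the saved queries QryStockLevel and QrySaleTot.
CREATE TABLE QryStockLevel (ProductID INTEGER, Item TEXT, Stock INTEGER);
CREATE TABLE QrySaleTot (ProductID INTEGER, Item TEXT, Quantity INTEGER);
INSERT INTO QryStockLevel VALUES
    (1, 'Carling', 528), (3, 'Carlsburg', 528), (5, 'IPA', 528);
-- IPA (ProductID 5) has no sales, so it has no row here.
INSERT INTO QrySaleTot VALUES (1, 'Carling', 7), (3, 'Carlsburg', 9);
""")

inner_rows = conn.execute("""
SELECT s.ProductID, s.Stock - t.Quantity
FROM QryStockLevel s
INNER JOIN QrySaleTot t ON s.ProductID = t.ProductID
ORDER BY s.ProductID
""").fetchall()

left_rows = conn.execute("""
SELECT s.ProductID, s.Stock - COALESCE(t.Quantity, 0) AS StockOnHand
FROM QryStockLevel s
LEFT OUTER JOIN QrySaleTot t ON s.ProductID = t.ProductID
ORDER BY s.ProductID
""").fetchall()

print(inner_rows)  # product 5 is missing
print(left_rows)   # product 5 appears with its full stock
```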