How to insert into SQL table with previous data check - sql

I'm creating a table in which I will store bookmakers' odds changes for sport events over time (it will have hundreds of thousands of rows).
I want to create an update function in PHP which inserts data into the table only if current_odd_value is different from the most recent odd_value stored in the table.
Using a simple INSERT I created this table for 1 match (8483075) from two companies (66 and 22) for the same market (1), which has 3 selections (1001, 1002, 1003), with data I got today at 17:00:
internal_id  match_id  company_id  market_id  selection_id  odd_value  update_date
-----------  --------  ----------  ---------  ------------  ---------  -------------------
1            8483075   66          1          1001          9,60       2021-01-04 17:00:00
2            8483075   66          1          1002          18,00      2021-01-04 17:00:00
3            8483075   66          1          1003          1,09       2021-01-04 17:00:00
4            8483075   22          1          1001          8,40       2021-01-04 17:00:00
5            8483075   22          1          1002          16,00      2021-01-04 17:00:00
6            8483075   22          1          1003          1,08       2021-01-04 17:00:00
At 17:05 I checked the odds once again and noticed 2 changes (for the selections stored as internal_id 2 and 6):
2 / 8483075 / 66 / 1 / 1002 / 18,00 ==> 15,00
6 / 8483075 / 22 / 1 / 1003 / 1,08 ==> 1,18
These should be added to the table as new rows, which should look like this:
internal_id  match_id  company_id  market_id  selection_id  odd_value  update_date
-----------  --------  ----------  ---------  ------------  ---------  -------------------
7            8483075   66          1          1002          15,00      2021-01-04 17:05:00
8            8483075   22          1          1003          1,18       2021-01-04 17:05:00
My idea was to:
1. get a table of the most recent odd values for each match_id + company_id + market_id + selection_id
2. compare each of them with the current odd value and, only if it differs from the value from point 1, insert a new record with the proper data
MY QUESTIONS:
1. What is the SELECT query that gets what I need for point 1? I think I can use internal_id (higher means more recent) or update_date, but I don't know how. I know how to do it for a specific match_id + company_id + market_id + selection_id, but I need the whole table in one SELECT, not one combination at a time.
2. Is my approach correct, or should I try a different one? (I think that retrieving the whole table of most recent odds at the beginning of the update should be faster than comparing each value one by one.)
Additional info:
All data that I have are coming from XML/JSON files that I'm receiving from different sources (so different formats etc. that I'm unifying under my db).
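A minimal sketch of the two steps described in the question, assuming SQLite through Python's sqlite3 module rather than PHP (the SELECT itself uses only standard SQL). The table name odds and the helper insert_changed are hypothetical, and "most recent" is taken to mean the highest internal_id per match_id + company_id + market_id + selection_id, as the question suggests:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE odds (
        internal_id  INTEGER PRIMARY KEY AUTOINCREMENT,
        match_id     INTEGER NOT NULL,
        company_id   INTEGER NOT NULL,
        market_id    INTEGER NOT NULL,
        selection_id INTEGER NOT NULL,
        odd_value    REAL    NOT NULL,  -- assumption: a DECIMAL type may be safer in a real schema
        update_date  TEXT    NOT NULL
    )
""")

# Point 1: one row per match/company/market/selection, the one with the highest internal_id.
LATEST_SQL = """
    SELECT o.match_id, o.company_id, o.market_id, o.selection_id, o.odd_value
    FROM odds AS o
    JOIN (
        SELECT MAX(internal_id) AS internal_id
        FROM odds
        GROUP BY match_id, company_id, market_id, selection_id
    ) AS last ON last.internal_id = o.internal_id
"""

def insert_changed(conn, rows, update_date):
    """Point 2: insert only rows whose odd_value differs from the latest stored value."""
    latest = {tuple(r[:4]): r[4] for r in conn.execute(LATEST_SQL)}
    changed = [r for r in rows if latest.get(r[:4]) != r[4]]
    conn.executemany(
        "INSERT INTO odds (match_id, company_id, market_id, selection_id,"
        " odd_value, update_date) VALUES (?, ?, ?, ?, ?, ?)",
        [r + (update_date,) for r in changed],
    )
    return len(changed)

at_1700 = [
    (8483075, 66, 1, 1001, 9.60), (8483075, 66, 1, 1002, 18.00), (8483075, 66, 1, 1003, 1.09),
    (8483075, 22, 1, 1001, 8.40), (8483075, 22, 1, 1002, 16.00), (8483075, 22, 1, 1003, 1.08),
]
at_1705 = [
    (8483075, 66, 1, 1001, 9.60), (8483075, 66, 1, 1002, 15.00), (8483075, 66, 1, 1003, 1.09),
    (8483075, 22, 1, 1001, 8.40), (8483075, 22, 1, 1002, 16.00), (8483075, 22, 1, 1003, 1.18),
]

first_inserted = insert_changed(conn, at_1700, "2021-01-04 17:00:00")
second_inserted = insert_changed(conn, at_1705, "2021-01-04 17:05:00")
new_rows = conn.execute(
    "SELECT internal_id, company_id, selection_id, odd_value FROM odds"
    " WHERE update_date = '2021-01-04 17:05:00' ORDER BY internal_id"
).fetchall()
print(first_inserted, second_inserted, new_rows)
```

With the sample data, the first call inserts all 6 rows and the second inserts only the two changed odds. On a large table, an index covering the four key columns plus internal_id would probably help the grouped subquery, though that depends on the database engine.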

Related

count number of records by month over the last five years where record date > select month

I need to show the number of valid inspectors we have by month over the last five years. Inspectors are considered valid when the expiration date on their certification has not yet passed, recorded as the month end date. The below SQL code is text of the query to count valid inspectors for January 2017:
SELECT Count(*) AS RecordCount
FROM dbo_Insp_Type
WHERE dbo_Insp_Type.CERT_EXP_DTE >= #2/1/2017#;
Rather than designing 60 queries, one for each month, and compiling the results in a final table (or, err, query) are there other methods I can use that call for less manual input?
From this sample:
Id   CERT_EXP_DTE
---  ------------
1    2022-01-15
2    2022-01-23
3    2022-02-01
4    2022-02-03
5    2022-05-01
6    2022-06-06
7    2022-06-07
8    2022-07-21
9    2022-02-20
10   2021-11-05
11   2021-12-01
12   2021-12-24
this single query:
SELECT
Format([CERT_EXP_DTE],"yyyy/mm") AS YearMonth,
Count(*) AS AllInspectors,
Sum(Abs([CERT_EXP_DTE] >= DateSerial(Year([CERT_EXP_DTE]), Month([CERT_EXP_DTE]), 2))) AS ValidInspectors
FROM
dbo_Insp_Type
GROUP BY
Format([CERT_EXP_DTE],"yyyy/mm");
will return:
YearMonth  AllInspectors  ValidInspectors
---------  -------------  ---------------
2021-11    1              1
2021-12    2              1
2022-01    2              2
2022-02    3              2
2022-05    1              0
2022-06    2              2
2022-07    1              1
ID   Cert_Iss_Dte  Cert_Exp_Dte
---  ------------  ------------
1    1/15/2020     1/15/2022
2    1/23/2020     1/23/2022
3    2/1/2020      2/1/2022
4    2/3/2020      2/3/2022
5    5/1/2020      5/1/2022
6    6/6/2020      6/6/2022
7    6/7/2020      6/7/2022
8    7/21/2020     7/21/2022
9    2/20/2020     2/20/2022
10   11/5/2021     11/5/2023
11   12/1/2021     12/1/2023
12   12/24/2021    12/24/2023
A UNION query could calculate a record for each of 50 months but since you want 60, UNION is out.
Or a query with 60 calculated fields using IIf() and Count() referencing a textbox on form for start date:
SELECT Count(IIf(CERT_EXP_DTE>=Forms!formname!tbxDate,1,Null)) AS Dt1,
Count(IIf(CERT_EXP_DTE>=DateAdd("m",1,Forms!formname!tbxDate),1,Null)) AS Dt2,
...
FROM dbo_Insp_Type
Using the above data, following is output for Feb and Mar 2022. I did a test with Cert_Iss_Dte included in criteria and it did not make a difference for this sample data.
Dt1  Dt2
---  ---
10   8
Or a report with 60 textboxes and each calls a DCount() expression with criteria same as used in query.
Or a VBA procedure that writes data to a 'temp' table.
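Another option, sketched here as an assumption rather than taken from the answers above, is to join the table against a list of month start dates and count, for each month, the rows whose CERT_EXP_DTE is on or after that date (the same test as the WHERE clause in the question). The sketch runs on SQLite through Python's sqlite3 module and generates the months with a recursive CTE; in an engine without recursive CTEs, a stored table of month start dates could play the same role. It uses the first sample above and only four months to keep the output short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dbo_Insp_Type (Id INTEGER PRIMARY KEY, CERT_EXP_DTE TEXT)")
dates = [
    "2022-01-15", "2022-01-23", "2022-02-01", "2022-02-03", "2022-05-01", "2022-06-06",
    "2022-06-07", "2022-07-21", "2022-02-20", "2021-11-05", "2021-12-01", "2021-12-24",
]
conn.executemany("INSERT INTO dbo_Insp_Type (CERT_EXP_DTE) VALUES (?)", [(d,) for d in dates])

# months: one row per month start between :first_month and :last_month (inclusive).
# The LEFT JOIN keeps months in which no inspector is valid (count 0).
MONTHLY_SQL = """
    WITH RECURSIVE months(month_start) AS (
        SELECT :first_month
        UNION ALL
        SELECT date(month_start, '+1 month')
        FROM months
        WHERE month_start < :last_month
    )
    SELECT m.month_start, COUNT(i.Id) AS valid_inspectors
    FROM months AS m
    LEFT JOIN dbo_Insp_Type AS i ON i.CERT_EXP_DTE >= m.month_start
    GROUP BY m.month_start
    ORDER BY m.month_start
"""

counts = conn.execute(
    MONTHLY_SQL, {"first_month": "2021-11-01", "last_month": "2022-02-01"}
).fetchall()
for month_start, valid in counts:
    print(month_start, valid)
```

Widening the two parameters to a five-year span gives the 60 rows in a single query, with no per-month query to maintain.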

SQL query that I have set up the algorithm but cannot write the code

I could not find the right keywords to describe this in the title.
I can only explain my problem with an example. I have a table like this:
user_id | transaction_id | bonus_id | created_at
1       | 1              | 4        | 2021-05-01
1       | 3              | 65       | 2021-05-01
1       | 4              | 4        | 2021-05-02
1       | 1              | 5        | 2021-05-02
1       | 3              | 76       | 2021-05-03
1       | 2              | 5        | 2021-05-03
Due to a mistake I made in PHP, the row with transaction id 3 has bonus id 65, but it should have bonus id 4.
I need to replace the bonus id of every transaction from one transaction of type 1 up to the next transaction of type 1 with the bonus id of that first type 1 transaction.
Of course, I have to do this for every user. How can I do that?
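One possible reading of this, sketched under loud assumptions: "transaction type 1" means transaction_id = 1, and the table has some column that gives the row order. The sample has none (created_at repeats within a day), so the sketch adds a hypothetical id column that follows insertion order. A running count of type 1 rows per user splits the rows into groups, and each group takes the bonus_id of its type 1 row. It runs on SQLite through Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transactions (
        id             INTEGER PRIMARY KEY,  -- hypothetical: reflects insertion order
        user_id        INTEGER,
        transaction_id INTEGER,
        bonus_id       INTEGER,
        created_at     TEXT
    )
""")
conn.executemany(
    "INSERT INTO transactions (user_id, transaction_id, bonus_id, created_at)"
    " VALUES (?, ?, ?, ?)",
    [
        (1, 1, 4, "2021-05-01"), (1, 3, 65, "2021-05-01"), (1, 4, 4, "2021-05-02"),
        (1, 1, 5, "2021-05-02"), (1, 3, 76, "2021-05-03"), (1, 2, 5, "2021-05-03"),
    ],
)

# grp: number of type 1 rows seen so far for the user, so every type 1 row starts a new group.
# Rows before a user's first type 1 row (grp = 0) keep their own bonus_id via COALESCE.
FIX_SQL = """
    WITH grouped AS (
        SELECT id, user_id, transaction_id, bonus_id,
               SUM(CASE WHEN transaction_id = 1 THEN 1 ELSE 0 END)
                   OVER (PARTITION BY user_id ORDER BY id) AS grp
        FROM transactions
    )
    SELECT COALESCE(
               MAX(CASE WHEN transaction_id = 1 THEN bonus_id END)
                   OVER (PARTITION BY user_id, grp),
               bonus_id) AS new_bonus_id,
           id
    FROM grouped
"""

fixes = conn.execute(FIX_SQL).fetchall()
conn.executemany("UPDATE transactions SET bonus_id = ? WHERE id = ?", fixes)
bonus_ids = [row[0] for row in conn.execute("SELECT bonus_id FROM transactions ORDER BY id")]
print(bonus_ids)
```

On the sample rows this changes 65 to 4 and 76 to 5 and leaves the other rows as they were. Without a reliable ordering column the result is not well defined, so that assumption is the part to check first.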

How to write the query to make report by month in sql

I have the receiving and sending data for the whole year, and I want to build a monthly report based on that data using the first in, first out rule: the first quantity received is the first to be sent out ...
DECLARE @ReceivingTbl AS TABLE(Id INT,ProId int, RecQty INT,ReceivingDate DateTime)
INSERT INTO @ReceivingTbl
VALUES (1,1001,210,'2019-03-12'),
(2,1001,315,'2019-06-15'),
(3,2001,500,'2019-04-01'),
(4,2001,10,'2019-06-15'),
(5,1001,105,'2019-07-10')
DECLARE @SendTbl AS TABLE(Id INT,ProId int, SentQty INT,SendMonth int)
INSERT INTO @SendTbl
VALUES (1,1001,50,3),
(2,1001,100,4),
(3,1001,80,5),
(4,1001,80,6),
(5,2001,200,6)
SELECT * FROM @ReceivingTbl ORDER BY ProId,ReceivingDate
SELECT * FROM @SendTbl ORDER BY ProId,SendMonth
Id ProId RecQty ReceivingDate
1 1001 210 2019-03-12
2 1001 315 2019-06-15
5 1001 105 2019-07-10
3 2001 500 2019-04-01
4 2001 10 2019-06-15
Id ProId SentQty SendMonth
1 1001 50 3
2 1001 100 4
3 1001 80 5
4 1001 80 6
5 2001 200 6
--- And below is what I want:
Id ProId RecQty ReceivingDate ... Mar Apr May Jun
1 1001 210 2019-03-12 ... 50 100 60 0
2 1001 315 2019-06-15 ... 0 0 20 80
5 1001 105 2019-07-10 ... 0 0 0 0
3 2001 500 2019-04-01 ... 0 0 0 200
4 2001 10 2019-06-15 ... 0 0 0 0
Thanks!
Your question is not clear to me.
If you want to apply FIFO purely by insertion order, ignoring what the data in the table means, you necessarily need to order by Id, which your example provides and which looks like it follows the order of insert.
The first line inserted should also be the first line appearing in the select (FIFO). To do so, use:
ORDER BY Id ASC
This places the lowest Id values first (1, 2, 3, ...).
To me, though, this doesn't make much sense, so pay attention to the meaning of the data you actually have: leverage dates like ReceivingDate and order by that, maybe even filtering by the month of the date. Below is an example for January data:
WHERE MONTH(ReceivingDate) = 1
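One reading of the expected output in the question is a FIFO allocation: each month's sent quantity is taken from the oldest receipt of that product that still has stock, spilling into the next receipt when one runs out. The sketch below is plain Python rather than T-SQL, and fifo_by_month is a hypothetical helper. Like the expected output, it does not check that a receipt arrived before the month it is consumed in (20 units of the May shipment come from the receipt dated 2019-06-15):

```python
from collections import defaultdict

receipts = [  # (Id, ProId, RecQty, ReceivingDate)
    (1, 1001, 210, "2019-03-12"),
    (2, 1001, 315, "2019-06-15"),
    (3, 2001, 500, "2019-04-01"),
    (4, 2001, 10, "2019-06-15"),
    (5, 1001, 105, "2019-07-10"),
]
sends = [  # (Id, ProId, SentQty, SendMonth)
    (1, 1001, 50, 3),
    (2, 1001, 100, 4),
    (3, 1001, 80, 5),
    (4, 1001, 80, 6),
    (5, 2001, 200, 6),
]

def fifo_by_month(receipts, sends):
    """Return {receipt Id: {month: quantity}}, consuming the oldest receipt first."""
    # Remaining stock per receipt, oldest first within each product.
    lots = defaultdict(list)
    for rec_id, pro_id, qty, _rec_date in sorted(receipts, key=lambda r: (r[1], r[3])):
        lots[pro_id].append([rec_id, qty])

    allocation = {rec[0]: defaultdict(int) for rec in receipts}
    for _send_id, pro_id, sent_qty, month in sorted(sends, key=lambda s: (s[1], s[3])):
        for lot in lots[pro_id]:
            if sent_qty == 0:
                break
            take = min(lot[1], sent_qty)
            if take:
                allocation[lot[0]][month] += take
                lot[1] -= take
                sent_qty -= take
        # Any quantity still unallocated here exceeds the stock and is ignored.
    return allocation

allocation = fifo_by_month(receipts, sends)
months = (3, 4, 5, 6)  # Mar, Apr, May, Jun
report = {rec_id: [per_month.get(m, 0) for m in months]
          for rec_id, per_month in allocation.items()}
for rec_id, row in report.items():
    print(rec_id, row)
```

The per-receipt rows match the Mar to Jun columns of the table the question asks for. The same idea can be expressed in SQL with running totals of received and sent quantities, at the cost of a longer query.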

How to calculate a running total that is a distinct sum of values

Consider this dataset:
id site_id type_id value date
------- ------- ------- ------- -------------------
1 1 1 50 2017-08-09 06:49:47
2 1 2 48 2017-08-10 08:19:49
3 1 1 52 2017-08-11 06:15:00
4 1 1 45 2017-08-12 10:39:47
5 1 2 40 2017-08-14 10:33:00
6 2 1 30 2017-08-09 07:25:32
7 2 2 32 2017-08-12 04:11:05
8 3 1 80 2017-08-09 19:55:12
9 3 2 75 2017-08-13 02:54:47
10 2 1 25 2017-08-15 10:00:05
I would like to construct a query that returns a running total for each date by type. I can get close with a window function, but I only want the latest value for each site to be summed for the running total (a simple window function will not work because it sums all values up to a date--not just the last values for each site). So I guess it could be better described as a running distinct total?
The result I'm looking for would be like this:
type_id date sum
------- ------------------- -------
1 2017-08-09 06:49:47 50
1 2017-08-09 07:25:32 80
1 2017-08-09 19:55:12 160
1 2017-08-11 06:15:00 162
1 2017-08-12 10:39:47 155
1 2017-08-15 10:00:05 150
2 2017-08-10 08:19:49 48
2 2017-08-12 04:11:05 80
2 2017-08-13 02:54:47 155
2 2017-08-14 10:33:00 147
The key here is that the sum is not a running sum. It should only be the sum of the most recent values for each site, by type, at each date. I think I can help explain it by walking through the result set I've provided above. For my explanation, I'll walk through the original data chronologically and try to explain the expected result.
The first row of the result starts us off, at 2017-08-09 06:49:47, where chronologically, there is only one record of type 1 and it is 50, so that is our sum for 2017-08-09 06:49:47.
The second row of the result is at 2017-08-09 07:25:32, at this point in time we have 2 unique sites with values for type_id = 1. They have values of 50 and 30, so the sum is 80.
The third row of the result occurs at 2017-08-09 19:55:12, where now we have 3 sites with values for type_id = 1. 50 + 30 + 80 = 160.
The fourth row is where it gets interesting. At 2017-08-11 06:15:00 there are 4 records with a type_id = 1, but 2 of them are for the same site. I'm only interested in the most recent value for each site so the values I'd like to sum are: 30 + 80 + 52 resulting in 162.
The 5th row is similar to the 4th since the value for site_id:1, type_id:1 has changed again and is now 45. This results in the latest values for type_id:1 at 2017-08-12 10:39:47 are now: 30 + 80 + 45 = 155.
Reviewing the 6th row is also interesting when we consider that at 2017-08-15 10:00:05, site 2 has a new value for type_id 1, which gives us: 80 + 45 + 25 = 150 for 2017-08-15 10:00:05.
You can get a cumulative total (running total) by including an ORDER BY clause in your window frame.
select
type_id,
date,
sum(value) over (partition by type_id order by date) as sum
from your_table;
The ORDER BY works because the default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
SELECT type_id,
date,
SUM(value) OVER (PARTITION BY type_id ORDER BY type_id, date) - (SUM(value) OVER (PARTITION BY type_id, site_id ORDER BY type_id, date) - value) AS sum
FROM your_table
ORDER BY type_id,
date
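Worked by hand against the sample data, the query above gives 252 rather than 150 for the last type 1 row, because it only removes the current site's earlier values and not the superseded values of other sites. One way to get the "latest value per site" total is to turn each row into a change relative to the same site's previous value (using LAG) and then take an ordinary running sum of those changes. A minimal sketch, assuming SQLite 3.25 or later through Python's sqlite3 module, with the hypothetical table name readings and column alias latest_total:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE readings (
        id INTEGER PRIMARY KEY, site_id INTEGER, type_id INTEGER, value INTEGER, date TEXT
    )
""")
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 50, "2017-08-09 06:49:47"),
    (2, 1, 2, 48, "2017-08-10 08:19:49"),
    (3, 1, 1, 52, "2017-08-11 06:15:00"),
    (4, 1, 1, 45, "2017-08-12 10:39:47"),
    (5, 1, 2, 40, "2017-08-14 10:33:00"),
    (6, 2, 1, 30, "2017-08-09 07:25:32"),
    (7, 2, 2, 32, "2017-08-12 04:11:05"),
    (8, 3, 1, 80, "2017-08-09 19:55:12"),
    (9, 3, 2, 75, "2017-08-13 02:54:47"),
    (10, 2, 1, 25, "2017-08-15 10:00:05"),
])

# delta: how much this row changes the site's contribution (its full value on first sight).
# The outer running sum of deltas equals the sum of each site's latest value at that date.
LATEST_TOTAL_SQL = """
    WITH deltas AS (
        SELECT type_id, date,
               value - COALESCE(
                   LAG(value) OVER (PARTITION BY type_id, site_id ORDER BY date), 0
               ) AS delta
        FROM readings
    )
    SELECT type_id, date,
           SUM(delta) OVER (PARTITION BY type_id ORDER BY date) AS latest_total
    FROM deltas
    ORDER BY type_id, date
"""

totals = conn.execute(LATEST_TOTAL_SQL).fetchall()
for row in totals:
    print(row)
```

On the sample data this reproduces the result set shown in the question for both type_id values.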

TSQL query to return most recent record based on another columns value

I have a table that contains a list of expiration dates for various companies. The table looks like the following:
ID CompanyID Expiration
--- ---------- ----------
1 1 2016-01-01
2 1 2015-01-01
3 2 2016-04-02
4 2 2015-04-02
5 3 2014-01-03
6 4 2015-04-09
7 5 2015-07-20
8 5 2016-05-01
I am trying to build a TSQL query that will return just the most recent record for every company (i.e. CompanyID). Such as:
ID CompanyID Expiration
--- ---------- ----------
1 1 2016-01-01
3 2 2016-04-02
5 3 2014-01-03
6 4 2015-04-09
8 5 2016-05-01
It looks like there is an exact correlation between ID and Expiration. If that is true, i.e. the later the Expiration the higher the ID, then you could simply pull Max(ID) and Max(Expiration), which are 1:1, and group by CompanyID:
Select max(ID), CompanyID, max(Expiration) from Table group by CompanyID
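In the sample, ID and Expiration do not always rise together (for CompanyID 1 the later expiration belongs to ID 1, not ID 2), so taking the two maxima separately can pair an ID with a date from a different row. A variant that does not rely on that correlation ranks the rows per company with ROW_NUMBER() and keeps the first. The sketch below runs it on SQLite through Python's sqlite3 module, with the hypothetical table name expirations; T-SQL has ROW_NUMBER() with the same OVER clause syntax:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expirations (ID INTEGER PRIMARY KEY, CompanyID INTEGER, Expiration TEXT)"
)
conn.executemany("INSERT INTO expirations VALUES (?, ?, ?)", [
    (1, 1, "2016-01-01"), (2, 1, "2015-01-01"),
    (3, 2, "2016-04-02"), (4, 2, "2015-04-02"),
    (5, 3, "2014-01-03"), (6, 4, "2015-04-09"),
    (7, 5, "2015-07-20"), (8, 5, "2016-05-01"),
])

# rn = 1 marks the latest Expiration per company; ID DESC only breaks ties on equal dates.
MOST_RECENT_SQL = """
    SELECT ID, CompanyID, Expiration
    FROM (
        SELECT ID, CompanyID, Expiration,
               ROW_NUMBER() OVER (
                   PARTITION BY CompanyID ORDER BY Expiration DESC, ID DESC
               ) AS rn
        FROM expirations
    ) AS ranked
    WHERE rn = 1
    ORDER BY CompanyID
"""

most_recent = conn.execute(MOST_RECENT_SQL).fetchall()
for row in most_recent:
    print(row)
```

This returns the five rows listed as the desired result in the question, each with the ID that actually carries the latest date.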