Sum of two table values per time and find minimum of sum result in SQL-Server - sql

I have two databases.
In database one I have a table (507,000 records for one day of data):
-- insert data from database_1
DROP TABLE IF EXISTS #AccountBalance;
CREATE TABLE #AccountBalance
( AccountNumber VARCHAR(20),AccountBalance MONEY,TranTime DATETIME);
Sample data for AccountBalance:
BankAccountNumber AccountBalance transactiontime
01003930510 42006.00 2021-03-20
45033323462 4682.00 2021-03-20
23035469562 3388.00 2021-03-20
23005168662 617.00 2021-03-20
01004829050 44640.00 2021-03-20
Sample data for TransactionCards
BankAccountNumber Balance TransactionTime
45033323462 245428.00 2021-03-21 00:06:47.000
23038201062 140983.00 2021-03-21 00:06:49.000
45019249962 60416.00 2021-03-21 00:07:46.000
45004876662 588154.00 2021-03-21 00:10:46.000
45004876662 627867.00 2021-03-22 00:17:44.000
In database two I have a table with 18 million records.
Aim: find a single value, the minimum of SUM(balance) computed after each transaction record.
I tried:
1. Insert the data into one database, using a temporary table and a linked server.
2. Write a cursor like the one below (fetch BankAccountNumber from CardTransaction; if the same BankAccountNumber does not exist, insert it into #AccountBalance; if it exists, update its balance).
3. Calculate SUM(AccountBalance) over all accounts per fetch (per record, or per time) and insert the result into #Result (the business logic is clear from the cursor).
4. SELECT MIN(AccountBalance) FROM #Result
-- Create tables for calculate
DROP TABLE IF EXISTS #AccountBalance;
CREATE TABLE #AccountBalance
( BankAccountNumber VARCHAR(20),AccountBalance MONEY,TranTime DATETIME); -- I inserted 507.000 row record data in this table
DROP TABLE IF EXISTS #Result
CREATE TABLE #Result (SumOfBalance MONEY, BankAccountNumber VARCHAR(20), TranTime DATETIME)
-- variables for cursor processing
DECLARE @BankAccountNumber VARCHAR(20);
DECLARE @TransactionBalance MONEY;
DECLARE @TranTime DATETIME;
DECLARE @OldBankAccountNumber VARCHAR(20);
DECLARE @OldAccountBalance MONEY;
DECLARE @OldTranTime DATETIME = '2021-03-20';
-- start cursor
DECLARE CR CURSOR FOR
SELECT rt.BankAccountNumber,rt.Balance,rt.TransactionTime
FROM RawData.dbo.CardTransaction rt;
PRINT '-------Sum of all AccountBalance Report per time------';
OPEN CR;
FETCH NEXT FROM CR
INTO @BankAccountNumber,
     @TransactionBalance,
     @TranTime;
-- insert sum of account balance into result table
INSERT INTO #Result (SumOfBalance,BankAccountNumber,TranTime)
SELECT SUM(AccountBalance),@BankAccountNumber,@TranTime FROM #AccountBalance
WHILE @@FETCH_STATUS = 0 AND dbo.DoContinue() = 1
BEGIN
    -- look up the account; reset first so a miss does not reuse the previous row's value
    SET @OldBankAccountNumber = NULL;
    SELECT @OldBankAccountNumber = BankAccountNumber, @OldAccountBalance = AccountBalance FROM dbo.AccountBalance WHERE BankAccountNumber = @BankAccountNumber
    IF @OldBankAccountNumber=@BankAccountNumber -- if exists record in account balance
    BEGIN
        -- update account balance with new balance
        UPDATE #AccountBalance
        SET AccountBalance = @TransactionBalance
        WHERE BankAccountNumber = @BankAccountNumber
        -- insert new sum of account balance into result table
        INSERT INTO #Result (SumOfBalance,BankAccountNumber,TranTime)
        SELECT SUM(AccountBalance),@BankAccountNumber,@TranTime FROM #AccountBalance
    END;
    ELSE
    BEGIN
        -- account not seen yet: add it with the transaction balance
        INSERT INTO #AccountBalance (BankAccountNumber,AccountBalance,TranTime)
        VALUES (@BankAccountNumber, @TransactionBalance, @TranTime);
        -- insert new sum of account balance into result table
        INSERT INTO #Result (SumOfBalance,BankAccountNumber,TranTime)
        SELECT SUM(AccountBalance),@BankAccountNumber,@TranTime FROM #AccountBalance
    END;
    PRINT @BankAccountNumber
    FETCH NEXT FROM CR
    INTO @BankAccountNumber,@TransactionBalance,@TranTime;
END;
CLOSE CR;
DEALLOCATE CR;
Problem: the cursor runs very slowly, and I can't wait a whole day for it to finish. I can't see the full result, but I suspect the values are not reliable (I checked 2,000 records).
What I need: a fast and reliable solution.
Expected table, like below:
SumOfAccountBalance transactiontime
98,721 2021-03-21 10:01:00
339,464 2021-04-22 01:01:00
480,447 2021-04-23 01:01:00
540,863 2021-04-23 02:01:00
1,129,017 2021-04-23 03:01:00
1,168,730 2021-04-23 15:01:00
Final Expected :
MinCriticalPointAccountBalance transactiontime
98,721 2021-03-21 10:01:00
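
One possible set-based way to avoid the cursor is to treat each transaction as a change to the grand total and accumulate those changes. The sketch below is not T-SQL: it is Python driving an in-memory SQLite database with made-up rows and simplified table layouts, and it assumes that each account appears at most once in AccountBalance and that transaction times are distinct. The window functions it relies on (LAG and SUM ... OVER with ORDER BY) are also available in SQL Server 2012 and later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AccountBalance (BankAccountNumber TEXT PRIMARY KEY, AccountBalance INTEGER);
    CREATE TABLE CardTransaction (BankAccountNumber TEXT, Balance INTEGER, TransactionTime TEXT);
    -- made-up sample rows, not the data from the question
    INSERT INTO AccountBalance VALUES ('A', 100), ('B', 50);
    INSERT INTO CardTransaction VALUES
        ('A', 80, '2021-03-21 00:01:00'),
        ('C', 30, '2021-03-21 00:02:00'),
        ('A', 10, '2021-03-21 00:03:00');
""")

query = """
WITH deltas AS (
    -- change of the grand total caused by each transaction:
    -- new balance minus the account's previous balance
    -- (previous transaction, else the starting balance, else 0 for a new account)
    SELECT t.TransactionTime,
           t.Balance - COALESCE(
               LAG(t.Balance) OVER (PARTITION BY t.BankAccountNumber
                                    ORDER BY t.TransactionTime),
               a.AccountBalance,
               0) AS delta
    FROM CardTransaction AS t
    LEFT JOIN AccountBalance AS a
           ON a.BankAccountNumber = t.BankAccountNumber
)
SELECT TransactionTime,
       (SELECT SUM(AccountBalance) FROM AccountBalance)
       + SUM(delta) OVER (ORDER BY TransactionTime) AS SumOfBalance
FROM deltas
ORDER BY TransactionTime
"""
rows = conn.execute(query).fetchall()

# the critical point is the row with the smallest running total
critical = min(rows, key=lambda row: row[1])
print(rows)
print(critical)
```

With these sample rows the starting total is 150, and the running totals after the three transactions are 130, 160 and 90, so the minimum is 90 at the last transaction. Every transaction is read once instead of re-summing the whole balance table per row.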

Related

combine employee work days and leave days entry in single report

I have one table that holds each employee's daily login entries. When an employee is on leave, they enter the leave dates in a separate Leave table. I want to create a monthly report that contains a record for every day for each employee, showing whether they worked or were on leave that day.
Please help me work out how to achieve this.
DECLARE @user_id int
Declare @myresultset Cursor
Set @myresultset = CURSOR for Select USER_ID from users where user_id=679
OPEN @myresultset
Fetch NEXT from @myresultset
into @user_id
While @@Fetch_Status=0
Begin
    While @user_id is not NULL
    Begin
        --SELECT timesheet_id
        --INTO @variable
        Declare @ts TABLE (Value INT)
        INSERT INTO @ts (Value)
        SELECT timesheet_id
        FROM timesheet where user_id=@user_id and MONTH(start_time_server)=month(getdate()) and year(start_time_server)=YEAR(getdate()) and user_id=679
    End
    declare @le_ts table (Value INT)
    Insert into @le_ts (Value)
    select timesheet_id from leaves where start_time_server not in (select start_time_server from timesheet where timesheet_id in (select * from @ts)) and user_id=@user_id
    select *,timesheet.login_by from @ts,@le_ts,timesheet where timesheet.user_id=@user_id
    Fetch Next from @myresultset
    into @user_id
End
close @myresultset

Running total by date/ID based on latest change to value SQL

I have a unique case where I want to calculate the running total of quantities day over day. I have been searching a lot but couldn't find the right answer. Code-wise, there is nothing much I can share as it refers to a lot of sensitive data
Below is the table of dummy data:
As you can see, there are multiple duplicate IDs by date. I want to be able to calculate the running total of a date as follows:
For 2022/03/24, the running total would be 9+33 = 42, on 2022/03/26 the running total should be 9+31 = 40. Essentially, the running total for any given day should pick the last value by ID if it changed or the value that exists. In this case on 2022/03/26 for that date, for ID 2072, we pick 31 and not 33 because that's the latest value available.
Expected Output:
There may be many days in the range, and the running total needs to be computed day over day.
Possible related question: SQL Server running total based on change of state of a column
PS: For context, ID is just a unique identifier for an item in an inventory. Each item's quantity changes day by day. In this example, ID 1's inventory last changed on 2022/03/24, whereas ID 2072's changed multiple times. The running total for 2022/03/24 is the sum of the quantities of the inventory items on that day. On the 26th there is no change for ID 1, but ID 2072 changed, so the inventory pool should reflect the current size of ID 2072 plus the current size of ID 1, in this case 40. Essentially, it is just the current size of the inventory, tracked day over day.
Any help would be really appreciated! Thanks.
I added a few more rows just in case if this is what you really wanted.
I used T-SQL.
declare @orig table(
    id int,
    quantity int,
    rundate date
)
insert into @orig
values (1,9,'20220324'),(2072,33,'20220324'),(2072,31,'20220326'),(2072,31,'20220327'),
(2,10,'20220301'),(2,20,'20220325'),(2,30,'20220327')
declare @dates table (
    runningdate date
)
insert into @dates
select distinct rundate from @orig
order by rundate
declare @result table (
    dates date,
    running_quality int
)
DECLARE @mydate date
DECLARE @sum int
-- CURSOR definition
DECLARE my_cursor CURSOR FOR
SELECT * FROM @dates
OPEN my_cursor
-- Perform the first fetch
FETCH NEXT FROM my_cursor into @mydate
-- Check @@FETCH_STATUS to see if there are any more rows to fetch
WHILE @@FETCH_STATUS = 0
BEGIN
    -- cte: rows up to the current date; cte2: latest date per id; cte3: latest row per id
    ;with cte as (
        select * from @orig
        where rundate <= @mydate
    ), cte2 as (
        select id, max(rundate) as maxrundate
        from cte
        group by id
    ), cte3 as (
        select a.*
        from cte as a join cte2 as b
        on a.id = b.id and a.rundate = b.maxrundate
    )
    select @sum = sum(quantity)
    from cte3
    insert into @result
    select @mydate, @sum
    -- This is executed as long as the previous fetch succeeds
    FETCH NEXT FROM my_cursor into @mydate
END -- cursor
CLOSE my_cursor
DEALLOCATE my_cursor
select * from @result
Result:
dates running_quality
2022-03-01 10
2022-03-24 52
2022-03-25 62
2022-03-26 60
2022-03-27 70
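
The same numbers can probably be produced without a cursor by summing day-over-day changes per ID. The following is a rough sketch under assumptions, not a drop-in T-SQL answer: it uses Python with an in-memory SQLite database, a plain table named orig holding the rows from the answer above, and it assumes at most one row per ID per date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orig (id INTEGER, quantity INTEGER, rundate TEXT);
    INSERT INTO orig VALUES
        (1, 9, '2022-03-24'), (2072, 33, '2022-03-24'), (2072, 31, '2022-03-26'),
        (2072, 31, '2022-03-27'), (2, 10, '2022-03-01'), (2, 20, '2022-03-25'),
        (2, 30, '2022-03-27');
""")

query = """
WITH deltas AS (
    -- how much each row changes its id's quantity compared with that id's previous row
    SELECT rundate,
           quantity - COALESCE(
               LAG(quantity) OVER (PARTITION BY id ORDER BY rundate), 0) AS delta
    FROM orig
),
per_day AS (
    -- net change of the whole inventory on each date
    SELECT rundate, SUM(delta) AS day_delta
    FROM deltas
    GROUP BY rundate
)
SELECT rundate,
       SUM(day_delta) OVER (ORDER BY rundate) AS running_quantity
FROM per_day
ORDER BY rundate
"""
running = conn.execute(query).fetchall()
for rundate, total in running:
    print(rundate, total)
```

Each ID contributes its latest known quantity because only the differences between consecutive values are added to the running sum.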

Generating dummy data from existing data set is slow using cursor

I'm trying to generate dummy data from the existing data I have in the tables. All I want is to increase the number of records in Table1 to N specified amount. The other tables should increase based on the foreign key references.
The tables have a one-to-many relationship. For one record in table 1, I can have multiple entries in table 2, and table 3 can have many records for each ID in table 2.
Since the IDs are primary keys, I either capture the new ID with
SET @NEWLY_INSERTED_ID = SCOPE_IDENTITY()
after inserting into table 1 and use it in the insert for table 2, or insert the IDs into a temp table and join on it to achieve the same result for table 3.
Here's the approach I'm taking with the CURSOR.
DECLARE @MyId as INT;
DECLARE @myCursor as CURSOR;
DECLARE @DESIRED_ROW_COUNT INT = 70000
DECLARE @ROWS_INSERTED INT = 0
DECLARE @CURRENT_ROW_COUNT INT = 0
DECLARE @NEWLY_INSERTED_ID INT
DECLARE @LANGUAGE_PAIR_IDS TABLE ( LangugePairId INT, NewId INT, SourceLanguage varchar(100), TargetLangauge varchar(100) )
WHILE (@ROWS_INSERTED < @DESIRED_ROW_COUNT)
BEGIN
    SET @myCursor = CURSOR FOR
    SELECT Id FROM MyTable
    SET @CURRENT_ROW_COUNT = (SELECT COUNT(ID) FROM MyTable)
    OPEN @myCursor;
    FETCH NEXT FROM @myCursor INTO @MyId;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF ((@CURRENT_ROW_COUNT < @DESIRED_ROW_COUNT) AND (@ROWS_INSERTED < @DESIRED_ROW_COUNT))
        BEGIN
            -- copy the current MyTable row with a new random Column1
            INSERT INTO [dbo].[MyTable]
                ([Column1]
                ,[Column2]
                ,[Column3]
                )
            SELECT
                convert(numeric(9,0),rand() * 899999999) + 100000000
                ,Column2
                ,Column3
            FROM MyTable
            WHERE Id = @MyId
            SET @NEWLY_INSERTED_ID = SCOPE_IDENTITY()
            -- copy the child Language rows and remember their new ids
            INSERT INTO [dbo].[Language]
                ([MyTable1Id]
                ,[Target]
                ,[Source])
            OUTPUT inserted.Id, inserted.MyTable1Id, inserted.Source, inserted.[Target] INTO @LANGUAGE_PAIR_IDS (LangugePairId, NewId, SourceLanguage, TargetLangauge)
            SELECT
                @NEWLY_INSERTED_ID
                ,[Target]
                ,[Source]
            FROM [dbo].[Language]
            WHERE MyTable1Id = @MyId
            ORDER BY Id
            -- map old Language ids to the new ones via source/target language
            DECLARE @tbl AS TABLE (newLanguageId INT, oldLanguageId INT, sourceLanguage VARCHAR(100), targetLanguage VARCHAR(100))
            INSERT INTO @tbl (newLanguageId, oldLanguageId, sourceLanguage, targetLanguage)
            SELECT 0, id, [Source], [Target] FROM Language WHERE MyTable1Id = @MyId ORDER BY Id
            UPDATE t
            SET t.newlanguageid = lp.LangugePairId
            FROM @tbl t
            JOIN @LANGUAGE_PAIR_IDS lp
                ON t.sourceLanguage = lp.SourceLanguage
                AND t.targetLanguage = lp.TargetLangauge
            -- copy the Manager rows, pointing them at the new Language ids
            INSERT INTO [dbo].[Manager]
                ([LanguagePairId]
                ,[UserId]
                ,[MyDate])
            SELECT
                tbl.newLanguageId
                ,m.[UserId]
                ,m.[MyDate]
            FROM Manager m
            INNER JOIN @tbl tbl
                ON m.LanguagePairId = tbl.oldLanguageId
            WHERE m.LanguagePairId in (SELECT Id FROM Language WHERE MyTable1Id = @MyId) -- returns the old language pair id
            SET @ROWS_INSERTED += 1
            SET @CURRENT_ROW_COUNT +=1
        END
        ELSE
        BEGIN
            PRINT 'REACHED EXIT'
            SET @ROWS_INSERTED = @DESIRED_ROW_COUNT
            BREAK
        END
        FETCH NEXT FROM @myCursor INTO @MyId;
    END
    CLOSE @myCursor
    DEALLOCATE @myCursor
END
The above code works! It generates the data I need. However, it is very slow. For comparison, the initial data load was ~60,000 records in Table1, ~74,000 in Table2 and ~3,400 in Table3.
I tried to insert 9,000 rows into Table1. With the above code, it took 17:05:01 to complete.
Any suggestions on how I can optimize the query to run a little faster? My goal is to insert 1-2 million records into Table1 without having to wait for days. I'm not tied to a CURSOR; I'm OK with achieving the same result any other way.

SQL Query help needed - Multiple rows in 1st table should match to multiple table in 2nd table

Problem Illustration
I am trying to find an efficient query to generate summary information. I have mapped my problem onto a fictitious illustration. I have a 'WaterLeakage%' table which records the leakage that occurred in hotel rooms over several years.
I have another table which records WaterConsumption in liters for each room.
Now I have to find the actual water leakage in liters for a given room number over a given date range.
Basically, I have to match several rows in the 'WaterLeakage%' table to several rows in the 'WaterConsumption' table. I have been unable to find an efficient query for this; please help.
DECLARE @START_DATE_PARAM DATE = '01/10/2017';
DECLARE @END_DATE_PARAM DATE = '01/31/2017';
DECLARE @ROOM_NUMBER INT = 101;
IF (EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = '#WATER_CONSUMPTION'))
DROP TABLE #WATER_CONSUMPTION;
IF (EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = '#WATER_LEAKAGE_PER'))
DROP TABLE #WATER_LEAKAGE_PER;
--Table for daily daily water consumption per room
CREATE TABLE #WATER_CONSUMPTION(
ROOM_NUMBER INT,
UDAY DATE,
WATER_CONSUMPTION_LITER INT
)
--Table for water leakage percent per room for date range
CREATE TABLE #WATER_LEAKAGE_PER
(
ROOM_NUMBER INT,
START_DATE DATE,
END_DATE DATE,
WATER_LEAKAGE_PERCENT INT
)
-- Raw Data
INSERT INTO #WATER_LEAKAGE_PER(ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT)
VALUES(101,'2017/01/01','2017/01/02',5),
(102,'2017/01/01','2017/01/05',10),
(101,'2017/01/04','2017/02/06',10);
-- Raw Data
INSERT INTO #WATER_CONSUMPTION
VALUES(101,'2017/01/01',100),
(101,'2017/01/02',100),
(101,'2017/01/03',100),
(101,'2017/01/04',100),
(101,'2017/01/05',100),
(101,'2017/01/06',100),
(102,'2017/01/01',100),
(102,'2017/01/02',100),
(102,'2017/01/03',100),
(102,'2017/01/04',100),
(102,'2017/01/05',100);
DECLARE @TotalLeak REAL = 0;
SELECT * FROM #WATER_CONSUMPTION;
SELECT * FROM #WATER_LEAKAGE_PER;
SELECT * FROM #WATER_CONSUMPTION T1 JOIN (SELECT * FROM #WATER_LEAKAGE_PER WHERE ROOM_NUMBER=@ROOM_NUMBER) T2
ON (T1.ROOM_NUMBER=T2.ROOM_NUMBER AND T1.UDAY >= T2.START_DATE AND T1.UDAY <= T2.END_DATE);
DROP TABLE #WATER_CONSUMPTION;
DROP TABLE #WATER_LEAKAGE_PER;
I am very close to a solution now. Basically, I changed my thinking: I now join in the reverse direction.
BEGIN
--Input Parameters for calculating water wastage between date range
DECLARE @START_DATE_PARAM DATE = '01/10/2017';
DECLARE @END_DATE_PARAM DATE = '01/31/2017';
--Table for daily daily water consumption per room
CREATE TABLE #WATER_CONSUMPTION(
ROOM_NUMBER INT,
UDAY DATE,
WATER_CONSUMPTION_LITER INT
)
--Table for water leakage percent per room for date range
CREATE TABLE #WATER_LEAKAGE_PER
(
ROOM_NUMBER INT,
START_DATE DATE,
END_DATE DATE,
WATER_LEAKAGE_PERCENT INT,
LEAKAGE_PER_DAY_IN_LITER INT
)
-- Leakage in liter per room for each day, This will have multiple entries for room and date if room number and date is available in multiple date ranges, ex. in #WATER_CONSUMPTION table for room number 101 we have multiple entries with overlapping dates
CREATE TABLE #DAY_WISE_LEAKAGE
(
ROOM_NUMBER INT,
LDATE DATE,
LEAKAGE_IN_LITER INT
)
-- Raw Data
INSERT INTO #WATER_LEAKAGE_PER(ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT)
VALUES(101,'2017/01/15','2017/01/18',30),
(102,'2017/01/15','2017/01/18',10),
(101,'2017/01/15','2017/02/13',5);
-- Raw Data
INSERT INTO #WATER_CONSUMPTION
VALUES(101,'01/01/2017',1001),
(101,'01/02/2017',1001),
(101,'01/03/2017',1001),
(101,'01/04/2017',1001),
(101,'01/05/2017',1001),
(101,'01/06/2017',1001),
(101,'01/07/2017',1001),
(101,'01/08/2017',1001),
(101,'01/09/2017',1001),
(101,'01/10/2017',1001),
(101,'01/11/2017',1001),
(101,'01/12/2017',1001),
(101,'01/13/2017',1001),
(101,'01/14/2017',1001),
(101,'01/15/2017',1001),
(101,'01/16/2017',1001),
(101,'01/17/2017',1001),
(101,'01/18/2017',1001),
(101,'01/19/2017',1001),
(101,'01/20/2017',1001),
(101,'01/21/2017',1001),
(101,'01/22/2017',1001),
(101,'01/23/2017',1001),
(101,'01/24/2017',1001),
(101,'01/25/2017',1001),
(101,'01/26/2017',1001),
(101,'01/27/2017',1001),
(101,'01/28/2017',1001),
(101,'01/29/2017',1001),
(101,'01/30/2017',1001),
(101,'01/31/2017',1001);
DECLARE @ROOM_NUMBER INT
DECLARE @START_DATE DATE
DECLARE @END_DATE DATE
DECLARE @WATER_LEAKAGE_PERCENT INT
-- cursor for calculating water wastage per date range per day available in #WATER_LEAKAGE_PER table
DECLARE WATER_LEAKAGE_PER_CURSOR CURSOR FOR
SELECT ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT FROM #WATER_LEAKAGE_PER
OPEN WATER_LEAKAGE_PER_CURSOR
FETCH NEXT FROM WATER_LEAKAGE_PER_CURSOR
INTO @ROOM_NUMBER, @START_DATE ,@END_DATE, @WATER_LEAKAGE_PERCENT
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @TOTAL_WATER_USED_FOR_DATE_RANGE INT=0;
    DECLARE @NUMBER_OF_DAYS INT=0;
    DECLARE @LEAKAGE_PER_DAY_IN_LITER INT=0;
    -- Total liters of water used for 1 date range
    SELECT @TOTAL_WATER_USED_FOR_DATE_RANGE =SUM(WATER_CONSUMPTION_LITER),@NUMBER_OF_DAYS=COUNT(1) FROM #WATER_CONSUMPTION WHERE ROOM_NUMBER=@ROOM_NUMBER AND UDAY BETWEEN @START_DATE AND @END_DATE;
    -- Liters of water leakage per day for the selected date range in the cursor
    SELECT @LEAKAGE_PER_DAY_IN_LITER=((@TOTAL_WATER_USED_FOR_DATE_RANGE*@WATER_LEAKAGE_PERCENT)/100)/@NUMBER_OF_DAYS;
    UPDATE #WATER_LEAKAGE_PER SET LEAKAGE_PER_DAY_IN_LITER = @LEAKAGE_PER_DAY_IN_LITER WHERE ROOM_NUMBER=@ROOM_NUMBER AND START_DATE = @START_DATE AND END_DATE=@END_DATE AND WATER_LEAKAGE_PERCENT=@WATER_LEAKAGE_PERCENT;
    -- generate dates and water leakage, this will be used for actual calculation of water leakage in date range.
    ;WITH n AS
    (
        SELECT TOP (DATEDIFF(DAY, @START_DATE, @END_DATE) + 1)
        n = ROW_NUMBER() OVER (ORDER BY [object_id])
        FROM sys.all_objects
    )
    INSERT INTO #DAY_WISE_LEAKAGE SELECT @ROOM_NUMBER, DATEADD(DAY, n-1, @START_DATE),@LEAKAGE_PER_DAY_IN_LITER
    FROM n;
    FETCH NEXT FROM WATER_LEAKAGE_PER_CURSOR
    INTO @ROOM_NUMBER, @START_DATE ,@END_DATE, @WATER_LEAKAGE_PERCENT
END
CLOSE WATER_LEAKAGE_PER_CURSOR;
DEALLOCATE WATER_LEAKAGE_PER_CURSOR;
-- Total liters of water leakage per room number.
SELECT ROOM_NUMBER,SUM(LEAKAGE_IN_LITER) FROM #DAY_WISE_LEAKAGE WHERE LDATE BETWEEN @START_DATE_PARAM AND @END_DATE_PARAM GROUP BY ROOM_NUMBER;
DROP TABLE #WATER_CONSUMPTION;
DROP TABLE #WATER_LEAKAGE_PER;
DROP TABLE #DAY_WISE_LEAKAGE
END
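
A join on the date range might replace the cursor entirely. The sketch below is only an illustration under assumptions: it runs in Python against an in-memory SQLite database, the table and column names are simplified stand-ins for the temp tables above, and the consumption is a made-up 1000 liters per day. It also differs from the cursor in one respect: it applies each range's percentage to each day's own consumption rather than to a per-range daily average. Overlapping ranges still add up, as they do in #DAY_WISE_LEAKAGE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE water_consumption (room_number INTEGER, uday TEXT, liters INTEGER);
    CREATE TABLE water_leakage_per (room_number INTEGER, start_date TEXT,
                                    end_date TEXT, leak_percent INTEGER);
    INSERT INTO water_leakage_per VALUES
        (101, '2017-01-15', '2017-01-18', 30),
        (102, '2017-01-15', '2017-01-18', 10),
        (101, '2017-01-15', '2017-02-13', 5);
""")
# made-up consumption: room 101 uses 1000 liters on every day of January 2017
conn.executemany(
    "INSERT INTO water_consumption VALUES (101, ?, 1000)",
    [(f"2017-01-{day:02d}",) for day in range(1, 32)],
)

query = """
SELECT c.room_number,
       SUM(c.liters * l.leak_percent / 100.0) AS leaked_liters
FROM water_consumption AS c
JOIN water_leakage_per AS l
  ON l.room_number = c.room_number
 AND c.uday BETWEEN l.start_date AND l.end_date   -- day falls inside the leak range
WHERE c.uday BETWEEN :first_day AND :last_day      -- requested reporting window
GROUP BY c.room_number
"""
leaks = conn.execute(
    query, {"first_day": "2017-01-10", "last_day": "2017-01-31"}
).fetchall()
print(leaks)
```

For room 101 this adds 4 days at 30% (300 liters each) and 17 days at 5% (50 liters each), giving 2050 liters. Room 102 has no consumption rows in this sample, so it does not appear in the output.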

Create a random selection weighted on number of points, SQL

I have a table of winners for a prize draw, where each winner has earned a number of points over the year. There are 1300 registered users, with points varying between 50 and 43,000. I need to be able to select a random winner, which is straight forward, but the challenge I am having is building the logic where each point counts as an entry ticket into the prize draw. Would appreciate any help.
John
Your script would look something similar to this:
Script 1 :
DECLARE @Name varchar(100),
        @Points int,
        @i int
DECLARE Cursor1 CURSOR FOR SELECT Name, Points FROM Table1
OPEN Cursor1
FETCH NEXT FROM Cursor1
INTO @Name, @Points
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @i = 0
    -- insert the name once per point
    WHILE @i < @Points
    BEGIN
        INSERT INTO Table2 (Name)
        VALUES (@Name)
        SET @i = @i + 1
    END
    FETCH NEXT FROM Cursor1 INTO @Name, @Points
END
CLOSE Cursor1
DEALLOCATE Cursor1
I have created a table (Table1) with only a Name and Points column (varchar(100) and int), I have created a cursor in order to look through all the records within Table1 and then loop through the Points and then inserted each record into another table (Table2).
This then imports the Name depending on the Points column.
Script 2 :
DECLARE @Name varchar(100),
        @Points int,
        @i int,
        @Count int
CREATE TABLE #temptable(
    UserEmailID nvarchar(200),
    Points int)
DECLARE Cursor1 CURSOR FOR SELECT UserEmailID, Points FROM Table1_TEST
OPEN Cursor1
FETCH NEXT FROM Cursor1
INTO @Name, @Points
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @i = 0
    -- insert one row per point, so each point is one ticket
    WHILE @i < @Points
    BEGIN
        INSERT INTO #temptable (UserEmailID, Points)
        VALUES (@Name, @Points)
        SET @i = @i + 1
    END
    FETCH NEXT FROM Cursor1 INTO @Name, @Points
END
CLOSE Cursor1
DEALLOCATE Cursor1
SELECT * FROM #temptable
DROP TABLE #temptable
In Script2 I have imported the result into a TEMP table as requested.
The script now runs through each record in your Table1 and inserts the individual's UserEmailID and Points into the TEMP table once per point in Table1.
So if John has a total of 3 points and Sarah has 2, the script inserts John's UserEmailID 3 times into the TEMP table and Sarah's 2 times.
If you apply the random selector to the TEMP table, it will then randomly select an individual.
John would stand a better chance of winning because he has 3 records in the TEMP table whereas Sarah has only 2.
Suppose John's UserEmailID is 1 and Sarah's is 2:
The OUTPUT of TEMP table would then be:
UserEmailID | Points
1 | 3
1 | 3
1 | 3
2 | 2
2 | 2
Please let me know if you need any clarity.
Hope this helps.
You can do a weighted draw using the following method:
Calculate the cumulative sum of points.
Divide by the total number of points to get a value between 0 and 1
Each row in the original data will have a range, such as [0, 0.1), [0.1, 0.3), [0.3, 1]
Calculate a random number and choose the row where the value falls in the range
Here is standard'ish SQL for this approach:
with u as (
      select u.*,
             coalesce(lag(rangeend) over (order by points), 0) as rangestart
      from (select u.*,
                   sum(points*1.0) over (order by points) / sum(points) over () as rangeend
            from users u
           ) u
     ),
     r as (
      select random() as rand
     )
select u.*
from u cross join r
where r.rand >= u.rangestart and r.rand < u.rangeend;
In addition to using window functions (which can be replaced by correlated subqueries in many cases), the exact form depends on whether the random number function returns a single value for the whole query (as rand() does in SQL Server, no matter how often it is called in the query) or a new value on every call (as in many other databases). This method needs only one random value, so it works either way.
So you want a winner with 1000 points to have double the chances of another with only 500 points.
Sort the winners by whatever order and create a running total for the points:
id points
winner1 100
winner2 50
winner3 150
gives:
id points from to
winner1 100 1 100
winner2 50 101 150
winner3 150 151 300
Then compare with a random number from 1 to sum(points), in the example a number between 1 and 300. Find the winner with that number range and you're done.
select winpoints.id_winner
from
(
select
id as id_winner,
coalesce(sum(points) over(order by id rows between unbounded preceding and 1 preceding), 0) + 1 as from_points,
sum(points) over(order by id rows between unbounded preceding and current row) as to_points
from winners
) winpoints
where (select floor(rand() * sum(points)) + 1 from winners)
between winpoints.from_points and winpoints.to_points;
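
To make the range lookup concrete outside the database, here is a small Python sketch of the same idea: tickets are numbered from 1 to sum(points), and the winner is the first entry whose cumulative total reaches the drawn ticket. The function names winner_for_ticket and draw_winner are hypothetical helpers for illustration, and the entries are the three sample winners from the table above.

```python
import bisect
import itertools
import random


def winner_for_ticket(entries, ticket):
    """Return the name whose cumulative point range contains `ticket` (1-based)."""
    names = [name for name, _ in entries]
    # upper bound of each entry's range: 100, 150, 300 for the sample data
    upper_bounds = list(itertools.accumulate(points for _, points in entries))
    # first entry whose upper bound is >= ticket
    return names[bisect.bisect_left(upper_bounds, ticket)]


def draw_winner(entries, rng=random):
    """Draw one ticket uniformly from 1..sum(points) and return its owner."""
    total = sum(points for _, points in entries)
    return winner_for_ticket(entries, rng.randint(1, total))


entries = [("winner1", 100), ("winner2", 50), ("winner3", 150)]
print(winner_for_ticket(entries, 100))  # last ticket of winner1
print(winner_for_ticket(entries, 101))  # first ticket of winner2
print(draw_winner(entries))
```

An entry with 0 points gets an empty range and is never selected, and no per-point rows have to be materialized.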
This solution also works with fractional points/weights. It creates a helper table usersum.
create table user (id int primary key, points float);
insert into user values (1, 0.5), (2, 0), (3, 1);
create table usersum (id int primary key, pointsum float);
insert into usersum
select id, (select sum(points) from user b where b.id <= a.id)
from user a;
set @r = rand() * (select max(pointsum) from usersum);
select @r, usersum.* from usersum where pointsum >= @r order by id limit 1;
http://sqlfiddle.com/#!2/ae539e/1