Set to random for each row [duplicate] - sql

I have two tables. Table 1 has about 80 rows and Table 2 has about 10 million.
I would like to update all the rows in Table 2 with a random row from Table 1. I don't want the same row for all the rows. Is it possible to update Table 2 and have it randomly select a value for each row it is updating?
This is what I have tried, but it puts the same value in each row.
update member_info_test
set hostessid = (SELECT TOP 1 hostessId FROM hostess_test ORDER BY NEWID())
**Edited

Ok, I think that this is one of the weirdest query that I've wrote, and I think that this is gonna be terrible slow. But give it a shot:
UPDATE A
SET A.hostessid = B.hostessId
FROM member_info_test A
CROSS APPLY (SELECT TOP 1 hostessId
FROM hostess_test
WHERE A.somecolumn = A.somecolumn
ORDER BY NEWID()) B

I think this will work (at least, the with portion does):
with toupdate as (
select (select top . . . hostessId from hostess_test where mit.hostessId = mit.hostessId order by newid()) as newval,
mit.*
from member_info_test mit
)
update toupdate
set hostessid = newval;
The key to this (and to Lamak's) is the outer correlation in the subquery. This is convincing the optimizer to actually run the query for each row. I don't know why this would work and the other version would not.

Here is what i ended up using:
EnvelopeInformation would be your Table 2
PaymentAccountDropDown would be your Table 1 (in my case i had 3 items) - change 3 to 80 for your usecase.
;WITH cteTable1 AS (
SELECT
ROW_NUMBER() OVER (ORDER BY NEWID()) AS n,
PaymentAccountDropDown_Id
FROM EnvelopeInformation
),
cteTable2 AS (
SELECT
ROW_NUMBER() OVER (ORDER BY NEWID()) AS n,
t21.Id
FROM PaymentAccountDropDown t21
)
UPDATE cteTable1
SET PaymentAccountDropDown_Id = (
SELECT Id
FROM cteTable2
WHERE (cteTable1.n % 3) + 1 = cteTable2.n
)
reference:
http://social.technet.microsoft.com/Forums/sqlserver/pt-BR/f58c3bf8-e6b7-4cf5-9466-7027164afdc0/updating-multiple-rows-with-random-values-from-another-table

Update Table with Random fields
UPDATE p
SET p.City= b.City
FROM Person p
CROSS APPLY (SELECT TOP 1 City
FROM z.CityStateZip
WHERE p.SomeKey = p.SomeKey and -- ... the magic! ↓↓↓
Id = (Select ABS(Checksum(NewID()) % (Select count(*) from z.CityStateZip)))) b

Related

SQL Server : need help write what i think should be a simple code

I have a simple table:
id id_fk default
-- ----- -------
1 1 T
2 1 F
3 2 T
4 3 T
5 3 F
I would like to return one row for each id_fk. If the default is T then return that one. If their is no default T then return default F.
It seems simple enough but I have been struggling.
One option is to filter with a correlated subquery:
select t.*
from mytable t
where t.id = (
select top(1) t1.id
from mytable t1
where t1.id_fk = t.id_fk
oder by t1.default desc, t1.id
)
This produces one record for each id_fk: priority is given to the record that has 'T'as default, and then to the smalles id.
There are multiple ways to do this. One possible way is to use the row_number() function for this:
select a.*
from
(select x.*,
row_number() over(partition by x.id_fk order by x.Default desc) as rownum1
from table x) a
where a.rownum1=1
SELECT id_fk, MAX(default) As default
FROM SimpleTable
GROUP BY id_fk

Write Cross Apply to select last row with condition

I'm trying to make request for SQL Table which looks like:
CREATE TABLE StudentMark (
Id int NOT NULL IDENTITY(1,1),
StudentId int NOT NULL,
Mark int
);
Is that possible to select StudentMark rows where row should be last row for each user with mark greater than 4.
I'm trying to accomplish that by doing:
SELECT *
FROM [dbo].StudentMark outer
CROSS APPLY (
SELECT TOP(1) *
FROM [dbo].StudentMark inner
WHERE inner.StudentId= outer.StudentId AND inner.Mark>4
) cApply
But that doesn't do what's needed. Could anyone help?
I am curious if something like this will actually get what you are looking for:
SELECT Data.StudentId,
Data.Id,
Data.Mark
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY StudentMark.StudentId ORDER BY StudentMark.Id DESC) AS RowNumber,
StudentMark.StudentId,
StudentMark.Id,
StudentMark.Mark
FROM dbo.StudentMark
) AS Data
WHERE Data.RowNumber = 1
What this will do is get the row number of each and then let you filter by the row number on what is returned. I changed it to look for the first row instead of the last row, but sorted DESC, such that you will get what effectively would have been the last row entered for a student id.
Presumably, "last row" is based on id. If so:
SELECT *
FROM [dbo].StudentMark sm CROSS APPLY
(SELECT TOP (1) sm2.*
FROM [dbo].StudentMark sm2
WHERE sm2.StudentId = sm.StudentId AND sm2.Mark > 4
ORDER BY sm2.id DESC
) sm2;
EDIT:
If you only want the last row per student with Mark > 4, then use filtering:
select sm.*
from dbo.StudentMark sm
where sm.id = (select max(sm2.id)
from dbo.StudentMark sm2
where sm2.studentId = sm.studentId and sm2.Mark > 4
);
SELECT *
FROM StudentMark A
CROSS APPLY
(
SELECT TOP 1 Mark
FROM StudentMark B
WHERE A.StudentId = B.StudentId
ORDER BY StudentId
)M
WHERE M.Mark > 4

SQL query null value replaced based on another

So I have query to return data and a row number using ROW_NUMBER() OVER(PARTITION BY) and I place it into a temp table. The initial output looks the screenshot:
.
From here I need to, in the bt_newlabel column, replace the nulls respectively. So Rownumber 1-4 would be in progress, 5-9 would be underwriting, 10-13 would be implementation, and so forth.
I am hitting a wall trying to determine how to do this. Thanks for any help or input of how I would go about this.
One method is to assign groups, and then the value. Such as:
select t.*, max(bt_newlabel) over (partition by grp) as new_newlabel
from (select t.*, count(bt_newlabel) over (order by bt_stamp) as grp
from t
) t;
The group is simply the number of known values previously seen in the data.
You can update the field with:
with toupdate as (
select t.*, max(bt_newlabel) over (partition by grp) as new_newlabel
from (select t.*, count(bt_newlabel) over (order by bt_stamp) as grp
from t
) t
)
update toupdate
set bt_newlabel = new_newlabel
where bt_newlabel is null;
If I understood what you are trying to do, this is the type of update you need to do on your temp table:
--This will update rows 1-4 to 'Pre-Underwritting'
UPDATE temp_table SET bt_newlabel = 'Pre-Underwritting'
WHERE rownumber between
1 AND (SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Pre-Underwritting');
--This will update rows 5-9 to 'Underwritting'
UPDATE temp_table SET bt_newlabel = 'Underwritting'
WHERE rownumber between
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Pre-Underwritting')
AND
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Underwritting');
--This will update rows 10-13 to 'Implementation'
UPDATE temp_table SET bt_newlabel = 'Implementation'
WHERE rownumber between
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Underwritting')
AND
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Implementation');
I made a working Fiddle to check out the results: http://sqlfiddle.com/#!18/1cae2/1/3

Return two rows from SQL table with a difference in values [duplicate]

This question already has answers here:
How to request a random row in SQL?
(30 answers)
Closed 6 years ago.
iam trying to return 2 rows from table that have a difference in values, not being an SQL wise man i am stuck any help would be appreciated :-)
TABLE A:
NAME DATA
Oscar HOME1
Jens HOME2
Will HOME1
Jeremy HOME2
Al HOME1
Result, should be 2 random rows with a difference in DATA value
NAME DATA
Oscar HOME1
Jeremy HOME2
Anyone?
Easy way to have random data.
;with tblA as (
select name,data,
row_number() over(partition by data order by newid()) rn
from A
)
select name,data
from tblA
where rn = 1
Couuld be you need
select * from my_table a
inner join my_table b on a.data !=b.data
where a.data in ( SELECT data FROM my_table ORDER BY RAND() LIMIT 1);
For your code
SELECT *
FROM [dbo].[ComputerState] as a
INNER JOIN [dbo].[ComputerState] as b ON a.ServiceName != b.ServiceName
WHERE a.ServiceName IN (
SELECT top 1 [ServiceName] FROM [dbo].[ComputerState]
);
If the question is really this simple, you can use an aggregate such as MAX() or MIN() to grab one row for each different DATA:
SELECT MAX(NAME), DATA
FROM TABLE_A
GROUP BY DATA
Of course, if any other variables are introduced to the requirements, this may no longer work.
;WITH cteA AS (
SELECT
name
,data
,ROW_NUMBER() OVER (PARTITION BY data ORDER BY NEWID()) as DataRowNumber
,ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY NEWID()) as RandomRowNumber
FROM
A
)
SELECT *
FROM
cteA
WHERe
DataRowNumber = 1
AND RandomRowNumber <= 2
This Expands on #AlexKudryashev 's answer a little.
;with tblA as (
select name,data,
row_number() over(partition by data order by newid()) rn
from A
)
select name,data
from tblA
where rn = 1
The only issue with what he had Is that the number of Rows where rn = 1 will be depended on the COUNT(DISTINCT data) so it could lead to more than 2 results. To fix one could add a SELECT TOP 2 clause but it might not be fully random as results at that point as it will be dependent on the ordinal results of how SQL optimizes the query which is likely to be consistent. To get truly random add a second random row number and limit the results to the top 2 of those.

Update table with random record in update statment in SQL Server?

I have two tables. Table 1 has about 80 rows and Table 2 has about 10 million.
I would like to update all the rows in Table 2 with a random row from Table 1. I don't want the same row for all the rows. Is it possible to update Table 2 and have it randomly select a value for each row it is updating?
This is what I have tried, but it puts the same value in each row.
update member_info_test
set hostessid = (SELECT TOP 1 hostessId FROM hostess_test ORDER BY NEWID())
**Edited
Ok, I think that this is one of the weirdest query that I've wrote, and I think that this is gonna be terrible slow. But give it a shot:
UPDATE A
SET A.hostessid = B.hostessId
FROM member_info_test A
CROSS APPLY (SELECT TOP 1 hostessId
FROM hostess_test
WHERE A.somecolumn = A.somecolumn
ORDER BY NEWID()) B
I think this will work (at least, the with portion does):
with toupdate as (
select (select top . . . hostessId from hostess_test where mit.hostessId = mit.hostessId order by newid()) as newval,
mit.*
from member_info_test mit
)
update toupdate
set hostessid = newval;
The key to this (and to Lamak's) is the outer correlation in the subquery. This is convincing the optimizer to actually run the query for each row. I don't know why this would work and the other version would not.
Here is what i ended up using:
EnvelopeInformation would be your Table 2
PaymentAccountDropDown would be your Table 1 (in my case i had 3 items) - change 3 to 80 for your usecase.
;WITH cteTable1 AS (
SELECT
ROW_NUMBER() OVER (ORDER BY NEWID()) AS n,
PaymentAccountDropDown_Id
FROM EnvelopeInformation
),
cteTable2 AS (
SELECT
ROW_NUMBER() OVER (ORDER BY NEWID()) AS n,
t21.Id
FROM PaymentAccountDropDown t21
)
UPDATE cteTable1
SET PaymentAccountDropDown_Id = (
SELECT Id
FROM cteTable2
WHERE (cteTable1.n % 3) + 1 = cteTable2.n
)
reference:
http://social.technet.microsoft.com/Forums/sqlserver/pt-BR/f58c3bf8-e6b7-4cf5-9466-7027164afdc0/updating-multiple-rows-with-random-values-from-another-table
Update Table with Random fields
UPDATE p
SET p.City= b.City
FROM Person p
CROSS APPLY (SELECT TOP 1 City
FROM z.CityStateZip
WHERE p.SomeKey = p.SomeKey and -- ... the magic! ↓↓↓
Id = (Select ABS(Checksum(NewID()) % (Select count(*) from z.CityStateZip)))) b