Updating thousands of records with different values

Updating thousands of records with different values - sql

I've been given a spreadsheet in the format of :
Id | Val
1 57
2 99
There's approximately 10,000 records - Any ideas to handle the query below for 10,000 records without manually writing each case statement, tediously. Thanks.
update person
SET val = (
case
when Id = 1 then 57
when Id = 2 then 99
end),
where Id in (1, 2)

Quick and dirty? here you go
Add a new spredsheet call the old one datatable
In the first row first column you write
"Update person set val = ("
in the second column you link to the value on datatable spreadsheet
third column ") where ID = ("
fourth column you link to the ID of the datatable spreadsheet
fifth column ")"
Then you mark the whole row and pull it downwards to row 10000
Copy past into query escecute

I think this example can be help you :
CREATE TABLE #Person
(PrimaryKey int PRIMARY KEY,
ValueSome varchar(50)
);
GO
CREATE TABLE #MySpreadSheet
(PrimaryKey int PRIMARY KEY,
ValueSpread varchar(50)
);
GO
INSERT INTO #Person
SELECT 1, 'someValue'
INSERT INTO #Person
SELECT 2, 'someValueBeforeUpdate'
INSERT INTO #Person
SELECT 3, ''
INSERT INTO #MySpreadSheet
SELECT 1, '45'
INSERT INTO #MySpreadSheet
SELECT 2, '56'
INSERT INTO #MySpreadSheet
SELECT 3, '34'
SELECT * FROM #Person
SELECT * FROM #MySpreadSheet
UPDATE P SET P.ValueSome = SS.ValueSpread FROM #Person P JOIN #MySpreadSheet SS ON P.PrimaryKey = SS.PrimaryKey
SELECT * FROM #Person
DROP TABLE #Person
DROP TABLE #MySpreadSheet

If anyones interested, I went with this :
CREATE TABLE #TempTable(
Id int,
val int
)
INSERT INTO #TempTable (Id, val)
Values (1, 57),
(2, 99)
Update Person
Set Id = tp.Id,
val = tp.val
FROM Person p
INNER JOIN #TempTable as tp on tp.Id = p.Id

create table #example (id int , value int)
insert into #example (id, value) values (1, 10)
insert into #example (id, value) values (2, 20)
select * from #example
id value
1 10
2 20
update #example
set value = case when id = 1 then 100
when id = 2 then 200 end
where id in (1,2)
select * from #example
id value
1 100
2 200

Related

Find data by multiple Lookup table clauses

declare #Character table (id int, [name] varchar(12));
insert into #Character (id, [name])
values
(1, 'tom'),
(2, 'jerry'),
(3, 'dog');
declare #NameToCharacter table (id int, nameId int, characterId int);
insert into #NameToCharacter (id, nameId, characterId)
values
(1, 1, 1),
(2, 1, 3),
(3, 1, 2),
(4, 2, 1);
The Name Table has more than just 1,2,3 and the list to parse on is dynamic
NameTable
id | name
----------
1 foo
2 bar
3 steak
CharacterTable
id | name
---------
1 tom
2 jerry
3 dog
NameToCharacterTable
id | nameId | characterId
1 1 1
2 1 3
3 1 2
4 2 1
I am looking for a query that will return a character that has two names. For example
With the above data only "tom" will be returned.
SELECT *
FROM nameToCharacterTable
WHERE nameId in (1,2)
The in clause will return every row that has a 1 or a 3. I want to only return the rows that have both a 1 and a 3.
I am stumped I have tried everything I know and do not want to resort to dynamic SQL. Any help would be great
The 1,3 in this example will be a dynamic list of integers. for example it could be 1,3,4,5,.....

Filter out a count of how many times the Character appears in the CharacterToName table matching the list you are providing (which I have assumed you can convert into a table variable or temp table) e.g.
declare #Character table (id int, [name] varchar(12));
insert into #Character (id, [name])
values
(1, 'tom'),
(2, 'jerry'),
(3, 'dog');
declare #NameToCharacter table (id int, nameId int, characterId int);
insert into #NameToCharacter (id, nameId, characterId)
values
(1, 1, 1),
(2, 1, 3),
(3, 1, 2),
(4, 2, 1);
declare #RequiredNames table (nameId int);
insert into #RequiredNames (nameId)
values
(1),
(2);
select *
from #Character C
where (
select count(*)
from #NameToCharacter NC
where NC.characterId = c.id
and NC.nameId in (select nameId from #RequiredNames)
) = 2;
Returns:
id
name
1
tom
Note: Providing DDL+DML as shown here makes it much easier for people to assist you.

This is classic Relational Division With Remainder.
There are a number of different solutions. #DaleK has given you an excellent one: inner-join everything, then check that each set has the right amount. This is normally the fastest solution.
If you want to ensure it works with a dynamic amount of rows, just change the last line to
) = (SELECT COUNT(*) FROM #RequiredNames);
Two other common solutions exist.
Left-join and check that all rows were joined
SELECT *
FROM #Character c
WHERE EXISTS (SELECT 1
FROM #RequiredNames rn
LEFT JOIN #NameToCharacter nc ON nc.nameId = rn.nameId AND nc.characterId = c.id
HAVING COUNT(*) = COUNT(nc.nameId) -- all rows are joined
);
Double anti-join, in other words: there are no "required" that are "not in the set"
SELECT *
FROM #Character c
WHERE NOT EXISTS (SELECT 1
FROM #RequiredNames rn
WHERE NOT EXISTS (SELECT 1
FROM #NameToCharacter nc
WHERE nc.nameId = rn.nameId AND nc.characterId = c.id
)
);
A variation on the one from the other answer uses a windowed aggregate instead of a subquery. I don't think this is performant, but it may have uses in certain cases.
SELECT *
FROM #Character c
WHERE EXISTS (SELECT 1
FROM (
SELECT *, COUNT(*) OVER () AS cnt
FROM #RequiredNames
) rn
JOIN #NameToCharacter nc ON nc.nameId = rn.nameId AND nc.characterId = c.id
HAVING COUNT(*) = MIN(rn.cnt)
);
db<>fiddle

Nested while loop in SQL Server is not showing the expected result

I am trying to connect records from two different tables so I can display the data in a tabular format in an SSRS tablix.
The code below does not return the expected results.
As is, for each item in Temp_A the loop updates everything with the last item in Temp_C. Here is the code:
CREATE TABLE #Temp_A
(
[ID] INT,
[Name] VARCHAR(255)
)
INSERT INTO #Temp_A ([ID], [Name])
VALUES (1, 'A'), (2, 'B')
CREATE TABLE #Temp_C
(
[ID] INT,
[Name] VARCHAR(255)
)
INSERT INTO #Temp_C ([ID], [Name])
VALUES (1, 'C'), (2, 'D')
CREATE TABLE #Temp_Main
(
[Temp_A_ID] INT,
[Temp_A_Name] VARCHAR(255),
[Temp_C_ID] INT,
[Temp_C_Name] VARCHAR(255),
)
DECLARE #MIN_AID int = (SELECT MIN(ID) FROM #Temp_A)
DECLARE #MAX_AID int = (SELECT MAX(ID) FROM #Temp_A)
DECLARE #MIN_DID int = (SELECT MIN(ID) FROM #Temp_C)
DECLARE #MAX_DID int = (SELECT MAX(ID) FROM #Temp_C)
WHILE #MIN_AID <= #MAX_AID
BEGIN
WHILE #MIN_DID <= #MAX_DID
BEGIN
INSERT INTO #Temp_Main([Temp_A_ID], [Temp_A_Name])
SELECT ID, [Name]
FROM #Temp_A
WHERE ID = #MIN_AID
UPDATE #Temp_Main
SET [Temp_C_ID] = ID, [Temp_C_Name] = [Name]
FROM #Temp_C
WHERE ID = #MIN_DID
SET #MIN_DID = #MIN_DID + 1
END
SET #MIN_AID = #MIN_AID + 1
SET #MIN_DID = 1
END
SELECT * FROM #Temp_Main
DROP TABLE #Temp_A
DROP TABLE #Temp_C
DROP TABLE #Temp_Main
Incorrect result:
Temp_A_ID | Temp_A_Name | Temp_C_ID | Temp_C_Name
----------+-------------+-----------+---------------
1 A 2 D
1 A 2 D
2 B 2 D
2 B 2 D
Expected results:
Temp_A_ID | Temp_A_Name | Temp_C_ID | Temp_C_Name
----------+-------------+-----------+---------------
1 A 1 C
1 A 2 D
2 B 1 C
2 B 2 D
What am I missing?

You seem to want a cross join:
select a.*, c.*
from #Temp_A a cross join
#Temp_C c
order by a.id, c.id;
Here is a db<>fiddle.
There is no need to write a WHILE loop to do this.
You can use insert to insert this into #TempMain, but I don't se a need to have a temporary table for storing the results of this query.

comparing two colums in sqlserver and returing the remaining data

I have two tables. First one is student table where he can select two optional courses and other table is current semester's optional courses list.
When ever the student selects a course, row is inserted with basic details such as roll number, inserted time, selected course and status as "1". When ever a selected course is de-selected the status is set as "0" for that row.
Suppose the student has select course id 1 and 2.
Now using this query
select SselectedCourse AS [text()] FROM Sample.dbo.Tbl_student_details where var_rollnumber = '020803009' and status = 1 order by var_courseselectedtime desc FOR XML PATH('')
This will give me the result as "12" where 1 is physics and 2 is social.
the second table holds the value from 1-9
For e.g course id
1 = physics
2 = social
3 = chemistry
4 = geography
5 = computer
6 = Spoken Hindi
7 = Spoken English
8 = B.EEE
9 = B.ECE
now the current student has selected 1 and 2. So on first column, i get "12" and second column i need to get "3456789"(remaining courses).
How to write a query for this?

This is not in single query but is simple.
DECLARE #STUDENT AS TABLE(ID INT, COURSEID INT)
DECLARE #SEM AS TABLE (COURSEID INT, COURSE VARCHAR(100))
INSERT INTO #STUDENT VALUES(1, 1)
INSERT INTO #STUDENT VALUES(1, 2)
INSERT INTO #SEM VALUES(1, 'physics')
INSERT INTO #SEM VALUES(2, 'social')
INSERT INTO #SEM VALUES(3, 'chemistry')
INSERT INTO #SEM VALUES(4, 'geography')
INSERT INTO #SEM VALUES(5, 'computer')
INSERT INTO #SEM VALUES(6, 'Spoken Hindi')
INSERT INTO #SEM VALUES(7, 'Spoken English')
INSERT INTO #SEM VALUES(8, 'B.EEE')
INSERT INTO #SEM VALUES(9, 'B.ECE')
DECLARE #COURSEIDS_STUDENT VARCHAR(100), #COURSEIDS_SEM VARCHAR(100)
SELECT #COURSEIDS_STUDENT = COALESCE(#COURSEIDS_STUDENT, '') + CONVERT(VARCHAR(10), COURSEID) + ' ' FROM #STUDENT
SELECT #COURSEIDS_SEM = COALESCE(#COURSEIDS_SEM , '') + CONVERT(VARCHAR(10), COURSEID) + ' ' FROM #SEM WHERE COURSEID NOT IN (SELECT COURSEID FROM #STUDENT)
SELECT #COURSEIDS_STUDENT COURSEIDS_STUDENT, #COURSEIDS_SEM COURSEIDS_SEM

try this:
;WITH CTE as (select ROW_NUMBER() over (order by (select 0)) as rn,* from Sample.dbo.Tbl_student_details)
,CTE1 As(
select rn,SselectedCourse ,replace(stuff((select ''+courseid from course_details for xml path('')),1,1,''),SselectedCourse,'') as rem from CTE a
where rn = 1
union all
select c2.rn,c2.SselectedCourse,replace(rem,c2.SselectedCourse,'') as rem
from CTE1 c1 inner join CTE c2
on c2.rn=c1.rn+1
)
select STUFF((select ''+SselectedCourse from CTE1 for xml path('')),1,0,''),(select top 1 rem from CTE1 order by rn desc)

Segregate data between 2 intervals based on an identifier

One of my friends had this Q to me and I am too puzzled.
His team is loading a DW and the data keeps coming in incremental and full load fashion on an adhoc basic. Now there is identifier flag that says
as to when the full load has started or stopped. Now we need to collect and then segregate all full load.
For ex:
create table #tmp (
id int identity(1,1) not null,
name varchar(30) null,
val int null
)
insert into #tmp (name, val) select 'detroit', 3
insert into #tmp (name, val) select 'california', 9
insert into #tmp (name, val) select 'houston', 1
insert into #tmp (name, val) select 'los angeles', 4
insert into #tmp (name, val) select 'newyork', 8
insert into #tmp (name, val) select 'chicago', 1
insert into #tmp (name, val) select 'seattle', 9
insert into #tmp (name, val) select 'michigan', 6
insert into #tmp (name, val) select 'atlanta', 9
insert into #tmp (name, val) select 'philly', 6
insert into #tmp (name, val) select 'brooklyn', 8
drop table #tmp
The rule is:
whenever val is 9, the full load starts; whenever val is 8, the full
load stops; (or when whenever next val is 8, full load stops).
In this case, for full load, I should only collect these records:
id name val
3 houston 1
4 los angeles 4
10 philly 6
My Approach so far:
;with mycte as (
select id, name, val, row_number() over (order by id) as rnkst
from #tmp
where val in (8,9))
SELECT *
FROM mycte y
WHERE val = 9
AND Exists (
SELECT *
FROM mycte x
WHERE x.id =
----> this gives start 9 record but not stop record of 8
(SELECT MIN(id)
FROM mycte z
WHERE z.id > y.id)
AND val = 8)
I do not want to venture into cursor within cursor approach but with a CTE , please enlighten!
UPDATE:
As mentioned by one of the answerers I am restating the rules.
--> the full load records start coming AFTER 9. (9th records are NOT included)
--> the full load continues till it sees immediate 8.
--> So effectively all records BETWEEN 9 and 8 form small chunks of full load
--> An individual 9th record itself does not get considered as it has no 8 as partner
--> The result set shown below satisfies these conditions

I am not sure if my command of English will allow me to explain my approach fully, but I'll try, just in case it can help.
Rank all the rows and rank the bounds (val IN (8, 9)) separately.
Join the subset where val = 8 with the subset where val = 9 on the condition that the bound ranking of the former should be exactly 1 (one) greater than that of the latter.
Join the subset of non-(8, 9) rows to the result set of Step 2 on the condition that the (general) ranking should be between the ranking of the val = 9 subset and that of the val = 8 one.
Here's the query to illustrate my attempt at verbal description:
WITH ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (ORDER BY id),
bound_rnk = ROW_NUMBER() OVER (
PARTITION BY CASE WHEN val IN (8, 9) THEN 1 ELSE 2 END
ORDER BY id
)
FROM #tmp
)
SELECT
load.id,
load.name,
load.val
FROM ranked AS eight
INNER JOIN ranked AS nine ON eight.bound_rnk = nine.bound_rnk + 1
INNER JOIN ranked AS load ON load.rnk BETWEEN nine.rnk AND eight.rnk
WHERE eight.val = 8
AND nine .val = 9
AND load .val NOT IN (8, 9)
;
And you might not believe me but, when I tested it, it did return the following:
id name val
-- ----------- ---
3 houston 1
4 los angeles 4
10 philly 6

I do not believe there is a way to do this without a while loop or possibly a recursive cte that is going to be complex. So, my question would be if this is at all possible to accomplish in code? SQL is not as strong as a procedural language, so code would handle this better. If this is not an option, then I would go with a while loop (MUCH better than a cursor). I will create the SQL for this shortly.
/*
drop table #tmp
drop table #finalTmp
drop table #startStop
*/
create table #tmp (
id int identity(1,1) not null,
name varchar(30) null,
val int null
)
insert into #tmp (name, val) select 'detroit', 3
insert into #tmp (name, val) select 'california', 9
insert into #tmp (name, val) select 'houston', 1
insert into #tmp (name, val) select 'los angeles', 4
insert into #tmp (name, val) select 'newyork', 8
insert into #tmp (name, val) select 'chicago', 1
insert into #tmp (name, val) select 'seattle', 9
insert into #tmp (name, val) select 'michigan', 6
insert into #tmp (name, val) select 'atlanta', 9
insert into #tmp (name, val) select 'philly', 6
insert into #tmp (name, val) select 'brooklyn', 8
CREATE TABLE #Finaltmp
(
id INT,
name VARCHAR(30),
val INT
)
SELECT id, val, 0 AS Checked
INTO #StartStop
FROM #tmp
WHERE val IN (8,9)
DECLARE #StartId INT, #StopId INT
WHILE EXISTS (SELECT 1 FROM #StartStop WHERE Checked = 0)
BEGIN
SELECT TOP 1 #StopId = id
FROM #StartStop
WHERE EXISTS
--This makes sure we grab a stop that has a start before it
(
SELECT 1
FROM #StartStop AS TestCheck
WHERE TestCheck.id < #StartStop.id AND val = 9
)
AND Checked = 0 AND val = 8
ORDER BY id
--If no more starts, then the rest are stops
IF #StopId IS NULL
BREAK
SELECT TOP 1 #StartId = id
FROM #StartStop
WHERE Checked = 0 AND val = 9
--Make sure we only pick up the 9 that matches
AND Id < #StopId
ORDER BY Id DESC
IF #StartId IS NULL
BREAK
INSERT INTO #Finaltmp
SELECT *
FROM #tmp
WHERE id BETWEEN #StartId AND #StopId
AND val NOT IN (8,9)
--Make sure to "check" any values that fell in the middle (double 9's)
--If not, then you would start picking up overlap data
UPDATE #StartStop
SET Checked = 1
WHERE id <= #StopId
END
SELECT * FROM #Finaltmp
I noticed that the data looked a little wonky, so I tried to put some edge case checks and comments about them

Delete duplicated rows and Update references

How do I Delete duplicated rows in one Table and update References in another table to the remaining row? The duplication only occurs in the name. The Id Columns are Identity columns.
Example:
Assume we have two tables Doubles and Data.
Doubles table (
Id int,
Name varchar(50)
)
Data Table (
Id int,
DoublesId int
)
Now I Have Two entries in the Doubls table:
Id Name
1 Foo
2 Foo
And two entries in the Data Table:
ID DoublesId
1 1
2 2
At the end there should be only one entry in the Doubles Table:
Id Name
1 Foo
And two entries in the Data Table:
Id DoublesId
1 1
2 1
In the doubles Table there can be any number of duplicated rows per name (up to 30) and also regular 'single' rows.

I've not run this, but hopefully it should be correct, and close enough to the final soln to get you there. Let me know any mistakes if you like and I'll update the answer.
--updates the data table to the min ids for each name
update Data
set id = final_id
from
Data
join
Doubles
on Doubles.id = Data.id
join
(
select
name
min(id) as final_id
from Doubles
group by name
) min_ids
on min_ids.name = Doubles.name
--deletes redundant ids from the Doubles table
delete
from Doubles
where id not in
(
select
min(id) as final_id
from Doubles
group by name
)

Note: I have taken the liberty to rename your Id's to DoubleID and DataID respectively. I find that eassier to work with.
DECLARE #Doubles TABLE (DoubleID INT, Name VARCHAR(50))
DECLARE #Data TABLE (DataID INT, DoubleID INT)
INSERT INTO #Doubles VALUES (1, 'Foo')
INSERT INTO #Doubles VALUES (2, 'Foo')
INSERT INTO #Doubles VALUES (3, 'Bar')
INSERT INTO #Doubles VALUES (4, 'Bar')
INSERT INTO #Data VALUES (1, 1)
INSERT INTO #Data VALUES (1, 2)
INSERT INTO #Data VALUES (1, 3)
INSERT INTO #Data VALUES (1, 4)
SELECT * FROM #Doubles
SELECT * FROM #Data
UPDATE #Data
SET DoubleID = MinDoubleID
FROM #Data dt
INNER JOIN #Doubles db ON db.DoubleID = dt.DoubleID
INNER JOIN (
SELECT db.Name, MinDoubleID = MIN(db.DoubleID)
FROM #Doubles db
GROUP BY db.Name
) dbmin ON dbmin.Name = db.Name
/* Kudos to quassnoi */
;WITH q AS (
SELECT Name, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) AS rn
FROM #Doubles
)
DELETE
FROM q
WHERE rn > 1
SELECT * FROM #Doubles
SELECT * FROM #Data

Take a look at this one, i have tried this, working fine
--create table Doubles ( Id int, Name varchar(50))
--create table Data( Id int, DoublesId int)
--select * from doubles
--select * from data
Declare #NonDuplicateID int
Declare #NonDuplicateName varchar(max)
DECLARE #sqlQuery nvarchar(max)
DECLARE DeleteDuplicate CURSOR FOR
SELECT Max(id),name AS SingleID FROM Doubles
GROUP BY [NAME]
OPEN DeleteDuplicate
FETCH NEXT FROM DeleteDuplicate INTO #NonDuplicateID, #NonDuplicateName
--Fetch next record
WHILE ##FETCH_STATUS = 0
BEGIN
--select b.ID , b.DoublesID, a.[name],a.id asdasd
--from doubles a inner join data b
--on
--a.ID=b.DoublesID
--where b.DoublesID<>#NonDuplicateID
--and a.[name]=#NonDuplicateName
print '---------------------------------------------';
select
#sqlQuery =
'update b
set b.DoublesID=' + cast(#NonDuplicateID as varchar(50)) + '
from
doubles a
inner join
data b
on
a.ID=b.DoublesID
where b.DoublesID<>' + cast(#NonDuplicateID as varchar(50)) +
' and a.[name]=''' + cast(#NonDuplicateName as varchar(max)) +'''';
print #sqlQuery
exec sp_executeSQL #sqlQuery
print '---------------------------------------------';
-- now move the cursor
FETCH NEXT FROM DeleteDuplicate INTO #NonDuplicateID ,#NonDuplicateName
END
CLOSE DeleteDuplicate --Close cursor
DEALLOCATE DeleteDuplicate --Deallocate cursor
---- Delete duplicate rows from original table
DELETE
FROM doubles
WHERE ID NOT IN
(
SELECT MAX(ID)
FROM doubles
GROUP BY [NAME]
)
Please try and let me know if this helped you
Thanks
~ Aamod

If you are using MYSQL following worked for me. I did it for 2 steps
Step 1 -> Update all Data rows to one Double table reference (with lowest id)
Step 2 -> Delete all duplicates with keeping lowest id
Step 1 ->
update Data
join
Doubles
on Data.DoublesId = Doubles.id
join
(
select name, min(id) as final_id
from Doubles
group by name
) min_ids
on min_ids.name = Doubles.name
set DoublesId = min_ids.final_id;
Step 2 ->
DELETE c1 FROM Doubles c1
INNER JOIN Doubles c2
WHERE
c1.id > c2.id AND
c1.name = c2.name;

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Updating thousands of records with different values - sql

If anyones interested, I went with this : CREATE TABLE #TempTable( Id int, val int ) INSERT INTO #TempTable (Id, val) Values (1, 57), (2, 99) Update Person Set Id = tp.Id, val = tp.val FROM Person p INNER JOIN #TempTable as tp on tp.Id = p.Id

Related

Find data by multiple Lookup table clauses

Nested while loop in SQL Server is not showing the expected result

comparing two colums in sqlserver and returing the remaining data

Segregate data between 2 intervals based on an identifier

Delete duplicated rows and Update references

Categories

Resources