One of my friends had this Q to me and I am too puzzled.
His team is loading a DW and the data keeps coming in incremental and full load fashion on an adhoc basic. Now there is identifier flag that says
as to when the full load has started or stopped. Now we need to collect and then segregate all full load.
For ex:
create table #tmp (
id int identity(1,1) not null,
name varchar(30) null,
val int null
)
insert into #tmp (name, val) select 'detroit', 3
insert into #tmp (name, val) select 'california', 9
insert into #tmp (name, val) select 'houston', 1
insert into #tmp (name, val) select 'los angeles', 4
insert into #tmp (name, val) select 'newyork', 8
insert into #tmp (name, val) select 'chicago', 1
insert into #tmp (name, val) select 'seattle', 9
insert into #tmp (name, val) select 'michigan', 6
insert into #tmp (name, val) select 'atlanta', 9
insert into #tmp (name, val) select 'philly', 6
insert into #tmp (name, val) select 'brooklyn', 8
drop table #tmp
The rule is:
whenever val is 9, the full load starts; whenever val is 8, the full
load stops; (or when whenever next val is 8, full load stops).
In this case, for full load, I should only collect these records:
id name val
3 houston 1
4 los angeles 4
10 philly 6
My Approach so far:
;with mycte as (
select id, name, val, row_number() over (order by id) as rnkst
from #tmp
where val in (8,9))
SELECT *
FROM mycte y
WHERE val = 9
AND Exists (
SELECT *
FROM mycte x
WHERE x.id =
----> this gives start 9 record but not stop record of 8
(SELECT MIN(id)
FROM mycte z
WHERE z.id > y.id)
AND val = 8)
I do not want to venture into cursor within cursor approach but with a CTE , please enlighten!
UPDATE:
As mentioned by one of the answerers I am restating the rules.
--> the full load records start coming AFTER 9. (9th records are NOT included)
--> the full load continues till it sees immediate 8.
--> So effectively all records BETWEEN 9 and 8 form small chunks of full load
--> An individual 9th record itself does not get considered as it has no 8 as partner
--> The result set shown below satisfies these conditions
I am not sure if my command of English will allow me to explain my approach fully, but I'll try, just in case it can help.
Rank all the rows and rank the bounds (val IN (8, 9)) separately.
Join the subset where val = 8 with the subset where val = 9 on the condition that the bound ranking of the former should be exactly 1 (one) greater than that of the latter.
Join the subset of non-(8, 9) rows to the result set of Step 2 on the condition that the (general) ranking should be between the ranking of the val = 9 subset and that of the val = 8 one.
Here's the query to illustrate my attempt at verbal description:
WITH ranked AS (
SELECT
*,
rnk = ROW_NUMBER() OVER (ORDER BY id),
bound_rnk = ROW_NUMBER() OVER (
PARTITION BY CASE WHEN val IN (8, 9) THEN 1 ELSE 2 END
ORDER BY id
)
FROM #tmp
)
SELECT
load.id,
load.name,
load.val
FROM ranked AS eight
INNER JOIN ranked AS nine ON eight.bound_rnk = nine.bound_rnk + 1
INNER JOIN ranked AS load ON load.rnk BETWEEN nine.rnk AND eight.rnk
WHERE eight.val = 8
AND nine .val = 9
AND load .val NOT IN (8, 9)
;
And you might not believe me but, when I tested it, it did return the following:
id name val
-- ----------- ---
3 houston 1
4 los angeles 4
10 philly 6
I do not believe there is a way to do this without a while loop or possibly a recursive cte that is going to be complex. So, my question would be if this is at all possible to accomplish in code? SQL is not as strong as a procedural language, so code would handle this better. If this is not an option, then I would go with a while loop (MUCH better than a cursor). I will create the SQL for this shortly.
/*
drop table #tmp
drop table #finalTmp
drop table #startStop
*/
create table #tmp (
id int identity(1,1) not null,
name varchar(30) null,
val int null
)
insert into #tmp (name, val) select 'detroit', 3
insert into #tmp (name, val) select 'california', 9
insert into #tmp (name, val) select 'houston', 1
insert into #tmp (name, val) select 'los angeles', 4
insert into #tmp (name, val) select 'newyork', 8
insert into #tmp (name, val) select 'chicago', 1
insert into #tmp (name, val) select 'seattle', 9
insert into #tmp (name, val) select 'michigan', 6
insert into #tmp (name, val) select 'atlanta', 9
insert into #tmp (name, val) select 'philly', 6
insert into #tmp (name, val) select 'brooklyn', 8
CREATE TABLE #Finaltmp
(
id INT,
name VARCHAR(30),
val INT
)
SELECT id, val, 0 AS Checked
INTO #StartStop
FROM #tmp
WHERE val IN (8,9)
DECLARE #StartId INT, #StopId INT
WHILE EXISTS (SELECT 1 FROM #StartStop WHERE Checked = 0)
BEGIN
SELECT TOP 1 #StopId = id
FROM #StartStop
WHERE EXISTS
--This makes sure we grab a stop that has a start before it
(
SELECT 1
FROM #StartStop AS TestCheck
WHERE TestCheck.id < #StartStop.id AND val = 9
)
AND Checked = 0 AND val = 8
ORDER BY id
--If no more starts, then the rest are stops
IF #StopId IS NULL
BREAK
SELECT TOP 1 #StartId = id
FROM #StartStop
WHERE Checked = 0 AND val = 9
--Make sure we only pick up the 9 that matches
AND Id < #StopId
ORDER BY Id DESC
IF #StartId IS NULL
BREAK
INSERT INTO #Finaltmp
SELECT *
FROM #tmp
WHERE id BETWEEN #StartId AND #StopId
AND val NOT IN (8,9)
--Make sure to "check" any values that fell in the middle (double 9's)
--If not, then you would start picking up overlap data
UPDATE #StartStop
SET Checked = 1
WHERE id <= #StopId
END
SELECT * FROM #Finaltmp
I noticed that the data looked a little wonky, so I tried to put some edge case checks and comments about them
Related
declare #Character table (id int, [name] varchar(12));
insert into #Character (id, [name])
values
(1, 'tom'),
(2, 'jerry'),
(3, 'dog');
declare #NameToCharacter table (id int, nameId int, characterId int);
insert into #NameToCharacter (id, nameId, characterId)
values
(1, 1, 1),
(2, 1, 3),
(3, 1, 2),
(4, 2, 1);
The Name Table has more than just 1,2,3 and the list to parse on is dynamic
NameTable
id | name
----------
1 foo
2 bar
3 steak
CharacterTable
id | name
---------
1 tom
2 jerry
3 dog
NameToCharacterTable
id | nameId | characterId
1 1 1
2 1 3
3 1 2
4 2 1
I am looking for a query that will return a character that has two names. For example
With the above data only "tom" will be returned.
SELECT *
FROM nameToCharacterTable
WHERE nameId in (1,2)
The in clause will return every row that has a 1 or a 3. I want to only return the rows that have both a 1 and a 3.
I am stumped I have tried everything I know and do not want to resort to dynamic SQL. Any help would be great
The 1,3 in this example will be a dynamic list of integers. for example it could be 1,3,4,5,.....
Filter out a count of how many times the Character appears in the CharacterToName table matching the list you are providing (which I have assumed you can convert into a table variable or temp table) e.g.
declare #Character table (id int, [name] varchar(12));
insert into #Character (id, [name])
values
(1, 'tom'),
(2, 'jerry'),
(3, 'dog');
declare #NameToCharacter table (id int, nameId int, characterId int);
insert into #NameToCharacter (id, nameId, characterId)
values
(1, 1, 1),
(2, 1, 3),
(3, 1, 2),
(4, 2, 1);
declare #RequiredNames table (nameId int);
insert into #RequiredNames (nameId)
values
(1),
(2);
select *
from #Character C
where (
select count(*)
from #NameToCharacter NC
where NC.characterId = c.id
and NC.nameId in (select nameId from #RequiredNames)
) = 2;
Returns:
id
name
1
tom
Note: Providing DDL+DML as shown here makes it much easier for people to assist you.
This is classic Relational Division With Remainder.
There are a number of different solutions. #DaleK has given you an excellent one: inner-join everything, then check that each set has the right amount. This is normally the fastest solution.
If you want to ensure it works with a dynamic amount of rows, just change the last line to
) = (SELECT COUNT(*) FROM #RequiredNames);
Two other common solutions exist.
Left-join and check that all rows were joined
SELECT *
FROM #Character c
WHERE EXISTS (SELECT 1
FROM #RequiredNames rn
LEFT JOIN #NameToCharacter nc ON nc.nameId = rn.nameId AND nc.characterId = c.id
HAVING COUNT(*) = COUNT(nc.nameId) -- all rows are joined
);
Double anti-join, in other words: there are no "required" that are "not in the set"
SELECT *
FROM #Character c
WHERE NOT EXISTS (SELECT 1
FROM #RequiredNames rn
WHERE NOT EXISTS (SELECT 1
FROM #NameToCharacter nc
WHERE nc.nameId = rn.nameId AND nc.characterId = c.id
)
);
A variation on the one from the other answer uses a windowed aggregate instead of a subquery. I don't think this is performant, but it may have uses in certain cases.
SELECT *
FROM #Character c
WHERE EXISTS (SELECT 1
FROM (
SELECT *, COUNT(*) OVER () AS cnt
FROM #RequiredNames
) rn
JOIN #NameToCharacter nc ON nc.nameId = rn.nameId AND nc.characterId = c.id
HAVING COUNT(*) = MIN(rn.cnt)
);
db<>fiddle
I have a sample data like this
Declare #table Table
(
ID INT,
Value VARCHAR(10),
Is_failure int
)
insert into #table(ID, Value, Is_failure) values (1, 'Bits', 0)
insert into #table(ID, Value, Is_failure) values (2, 'Ip', 0)
insert into #table(ID, Value, Is_failure) values (3, 'DNA', 0)
insert into #table(ID, Value, Is_failure) values (6, 'DCP', 1)
insert into #table(ID, Value, Is_failure) values (8, 'Bits', 0)
insert into #table(ID, Value, Is_failure) values (11, 'calc', 0)
insert into #table(ID, Value, Is_failure) values (14, 'DISC', 0)
insert into #table(ID, Value, Is_failure) values (19, 'DHCP', 1)
Looks like this:
ID Value Is_failure
1 Bits 0
2 Ip 0
3 DNA 0
6 DCP 1
8 Bits 0
11 calc 0
14 DISC 0
19 DHCP 1
Data continuous like this ... I need to fetch top 2 records along with Is_failure whenever Is_failure = 1 comes if it is 0 no need to pick up .
Sample output:
ID Value Is_failure
2 Ip 0
3 DNA 0
6 DCP 1
11 calc 0
14 DISC 0
19 DHCP 1
Suggest on this I have tried with having count(*) and other things but not fruitful.
You can use this query
Declare #tmptable Table
(
ID INT,
Value VARCHAR(10),
Is_failure int,
rowNum int
)
Declare #continuousRows int =2
insert into #tmptable
select *,ROW_NUMBER() over (order by id) from #table
;with cte1 as
(select *
from #tmptable t
where (select sum(Is_failure) from #tmptable t1 where t1.rowNum between t.rowNum-#continuousRows and t.rowNum
having count(*)=#continuousRows+1)=1
and t.Is_failure=1
)
,cte2 as
(
select t.* from #tmptable t
join cte1 c on t.rowNum between c.rowNum-#continuousRows and c.rowNum
)
select c.ID,value,Is_failure from cte2 c
You can use window functions for this:
select id, value, is_failure
from (select t.*,
lead(Is_failure) over (order by id) as next_if,
lead(Is_failure, 2) over (order by id) as next_if2
from #table t
) t
where 1 in (Is_failure, next_if, next_if2)
order by id;
You can simplify this with a windowing clause:
select id, value, is_failure
from (select t.*,
max(is_failure) over (order by id rows between current row and 2 following) as has_failure
from #table t
) t
where has_failure > 0
order by id;
I've been given a spreadsheet in the format of :
Id | Val
1 57
2 99
There's approximately 10,000 records - Any ideas to handle the query below for 10,000 records without manually writing each case statement, tediously. Thanks.
update person
SET val = (
case
when Id = 1 then 57
when Id = 2 then 99
end),
where Id in (1, 2)
Quick and dirty? here you go
Add a new spredsheet call the old one datatable
In the first row first column you write
"Update person set val = ("
in the second column you link to the value on datatable spreadsheet
third column ") where ID = ("
fourth column you link to the ID of the datatable spreadsheet
fifth column ")"
Then you mark the whole row and pull it downwards to row 10000
Copy past into query escecute
I think this example can be help you :
CREATE TABLE #Person
(PrimaryKey int PRIMARY KEY,
ValueSome varchar(50)
);
GO
CREATE TABLE #MySpreadSheet
(PrimaryKey int PRIMARY KEY,
ValueSpread varchar(50)
);
GO
INSERT INTO #Person
SELECT 1, 'someValue'
INSERT INTO #Person
SELECT 2, 'someValueBeforeUpdate'
INSERT INTO #Person
SELECT 3, ''
INSERT INTO #MySpreadSheet
SELECT 1, '45'
INSERT INTO #MySpreadSheet
SELECT 2, '56'
INSERT INTO #MySpreadSheet
SELECT 3, '34'
SELECT * FROM #Person
SELECT * FROM #MySpreadSheet
UPDATE P SET P.ValueSome = SS.ValueSpread FROM #Person P JOIN #MySpreadSheet SS ON P.PrimaryKey = SS.PrimaryKey
SELECT * FROM #Person
DROP TABLE #Person
DROP TABLE #MySpreadSheet
If anyones interested, I went with this :
CREATE TABLE #TempTable(
Id int,
val int
)
INSERT INTO #TempTable (Id, val)
Values (1, 57),
(2, 99)
Update Person
Set Id = tp.Id,
val = tp.val
FROM Person p
INNER JOIN #TempTable as tp on tp.Id = p.Id
create table #example (id int , value int)
insert into #example (id, value) values (1, 10)
insert into #example (id, value) values (2, 20)
select * from #example
id value
1 10
2 20
update #example
set value = case when id = 1 then 100
when id = 2 then 200 end
where id in (1,2)
select * from #example
id value
1 100
2 200
We have a requirement to assign row number to all rows using following rule
Row if pinned should have same row number
Otherwise sort it by GMD
Example:
ID GMD IsPinned
1 2.5 0
2 0 1
3 2 0
4 4 1
5 3 0
Should Output
ID GMD IsPinned RowNo
5 3 0 1
2 0 1 2
1 2.5 0 3
4 4 1 4
3 2 0 5
Please Note row number for Id's 2 and 4 stayed intact as they are pinned with values of 2 and 4 respectively even though the GMD are not in any order
Rest of rows Id's 1, 3 and 5 row numbers are sorted using GMD desc
I tried using RowNumber SQL 2012 however, it is pushing pinned items from their position
Here's a set-based approach to solving this. Note that the first CTE is unnecessary if you already have a Numbers table in your database:
declare #t table (ID int,GMD decimal(5,2),IsPinned bit)
insert into #t (ID,GMD,IsPinned) values
(1,2.5,0), (2, 0 ,1), (3, 2 ,0), (4, 4 ,1), (5, 3 ,0)
;With Numbers as (
select ROW_NUMBER() OVER (ORDER BY ID) n from #t
), NumbersWithout as (
select
n,
ROW_NUMBER() OVER (ORDER BY n) as rn
from
Numbers
where n not in (select ID from #t where IsPinned=1)
), DataWithout as (
select
*,
ROW_NUMBER() OVER (ORDER BY GMD desc) as rn
from
#t
where
IsPinned = 0
)
select
t.*,
COALESCE(nw.n,t.ID) as RowNo
from
#t t
left join
DataWithout dw
inner join
NumbersWithout nw
on
dw.rn = nw.rn
on
dw.ID = t.ID
order by COALESCE(nw.n,t.ID)
Hopefully my naming makes it clear what we're doing. I'm a bit cheeky in the final SELECT by using a COALESCE to get the final RowNo when you might have expected a CASE expression. But it works because the contents of the DataWithout CTE is defined to only exist for unpinned items which makes the final LEFT JOIN fail.
Results:
ID GMD IsPinned RowNo
----------- --------------------------------------- -------- --------------------
5 3.00 0 1
2 0.00 1 2
1 2.50 0 3
4 4.00 1 4
3 2.00 0 5
Second variant that may perform better (but never assume, always test):
declare #t table (ID int,GMD decimal(5,2),IsPinned bit)
insert into #t (ID,GMD,IsPinned) values
(1,2.5,0), (2, 0 ,1), (3, 2 ,0), (4, 4 ,1), (5, 3 ,0)
;With Numbers as (
select ROW_NUMBER() OVER (ORDER BY ID) n from #t
), NumbersWithout as (
select
n,
ROW_NUMBER() OVER (ORDER BY n) as rn
from
Numbers
where n not in (select ID from #t where IsPinned=1)
), DataPartitioned as (
select
*,
ROW_NUMBER() OVER (PARTITION BY IsPinned ORDER BY GMD desc) as rn
from
#t
)
select
dp.ID,dp.GMD,dp.IsPinned,
CASE WHEN IsPinned = 1 THEN ID ELSE nw.n END as RowNo
from
DataPartitioned dp
left join
NumbersWithout nw
on
dp.rn = nw.rn
order by RowNo
In the third CTE, by introducing the PARTITION BY and removing the WHERE clause we ensure we have all rows of data so we don't need to re-join to the original table in the final result in this variant.
this will work:
CREATE TABLE Table1
("ID" int, "GMD" number, "IsPinned" int)
;
INSERT ALL
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (1, 2.5, 0)
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (2, 0, 1)
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (3, 2, 0)
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (4, 4, 1)
INTO Table1 ("ID", "GMD", "IsPinned")
VALUES (5, 3, 0)
SELECT * FROM dual
;
select * from (select "ID","GMD","IsPinned",rank from(select m.*,rank()over(order by
"ID" asc) rank from Table1 m where "IsPinned"=1)
union
(select "ID","GMD","IsPinned",rank from (select t.*,rank() over(order by "GMD"
desc)-1 rank from (SELECT * FROM Table1)t)
where "IsPinned"=0) order by "GMD" desc) order by rank ,GMD;
output:
2 0 1 1
5 3 0 1
1 2.5 0 2
4 4 1 2
3 2 0 3
Can you try this query
CREATE TABLE Table1
(ID int, GMD numeric (18,2), IsPinned int);
INSERT INTO Table1 (ID,GMD, IsPinned)
VALUES (1, 2.5, 0),
(2, 0, 1),
(3, 2, 0),
(4, 4, 1),
(5, 3, 0)
select *, row_number () over(partition by IsPinned order by (case when IsPinned =0 then GMD else id end) ) [CustOrder] from Table1
This took longer then I thought, the thing is row_number would take a part to resolve the query. We need to differentiate the row_numbers by id first and then we can apply the while loop or cursor or any iteration, in our case we will just use the while loop.
dbo.test (you can replace test with your table name)
1 2.5 False
2 0 True
3 3 False
4 4 True
6 2 False
Here is the query I wrote to achieve your result, I have added comment under each operation you should get it, if you have any difficultly let me know.
Query:
--user data table
DECLARE #userData TABLE
(
id INT NOT NULL,
gmd FLOAT NOT NULL,
ispinned BIT NOT NULL,
rownumber INT NOT NULL
);
--final result table
DECLARE #finalResult TABLE
(
id INT NOT NULL,
gmd FLOAT NOT NULL,
ispinned BIT NOT NULL,
newrownumber INT NOT NULL
);
--inserting to uer data table from the table test
INSERT INTO #userData
SELECT t.*,
Row_number()
OVER (
ORDER BY t.id ASC) AS RowNumber
FROM test t
--creating new table for ids of not pinned
CREATE TABLE #ids
(
rn INT,
id INT,
gmd FLOAT
)
-- inserting into temp table named and adding gmd by desc
INSERT INTO #ids
(rn,
id,
gmd)
SELECT DISTINCT Row_number()
OVER(
ORDER BY gmd DESC) AS rn,
id,
gmd
FROM #userData
WHERE ispinned = 0
--declaring the variable to loop through all the no pinned items
DECLARE #id INT
DECLARE #totalrows INT = (SELECT Count(*)
FROM #ids)
DECLARE #currentrow INT = 1
DECLARE #assigningNumber INT = 1
--inerting pinned items first
INSERT INTO #finalResult
SELECT ud.id,
ud.gmd,
ud.ispinned,
ud.rownumber
FROM #userData ud
WHERE ispinned = 1
--looping through all the rows till all non-pinned items finished
WHILE #currentrow <= #totalrows
BEGIN
--skipping pinned numers for the rows
WHILE EXISTS(SELECT 1
FROM #finalResult
WHERE newrownumber = #assigningNumber
AND ispinned = 1)
BEGIN
SET #assigningNumber = #assigningNumber + 1
END
--getting row by the number
SET #id = (SELECT id
FROM #ids
WHERE rn = #currentrow)
--inserting the non-pinned item with new row number into the final result
INSERT INTO #finalResult
SELECT ud.id,
ud.gmd,
ud.ispinned,
#assigningNumber
FROM #userData ud
WHERE id = #id
--going to next row
SET #currentrow = #currentrow + 1
SET #assigningNumber = #assigningNumber + 1
END
--getting final result
SELECT *
FROM #finalResult
ORDER BY newrownumber ASC
--dropping table
DROP TABLE #ids
Output:
I have two tables. First one is student table where he can select two optional courses and other table is current semester's optional courses list.
When ever the student selects a course, row is inserted with basic details such as roll number, inserted time, selected course and status as "1". When ever a selected course is de-selected the status is set as "0" for that row.
Suppose the student has select course id 1 and 2.
Now using this query
select SselectedCourse AS [text()] FROM Sample.dbo.Tbl_student_details where var_rollnumber = '020803009' and status = 1 order by var_courseselectedtime desc FOR XML PATH('')
This will give me the result as "12" where 1 is physics and 2 is social.
the second table holds the value from 1-9
For e.g course id
1 = physics
2 = social
3 = chemistry
4 = geography
5 = computer
6 = Spoken Hindi
7 = Spoken English
8 = B.EEE
9 = B.ECE
now the current student has selected 1 and 2. So on first column, i get "12" and second column i need to get "3456789"(remaining courses).
How to write a query for this?
This is not in single query but is simple.
DECLARE #STUDENT AS TABLE(ID INT, COURSEID INT)
DECLARE #SEM AS TABLE (COURSEID INT, COURSE VARCHAR(100))
INSERT INTO #STUDENT VALUES(1, 1)
INSERT INTO #STUDENT VALUES(1, 2)
INSERT INTO #SEM VALUES(1, 'physics')
INSERT INTO #SEM VALUES(2, 'social')
INSERT INTO #SEM VALUES(3, 'chemistry')
INSERT INTO #SEM VALUES(4, 'geography')
INSERT INTO #SEM VALUES(5, 'computer')
INSERT INTO #SEM VALUES(6, 'Spoken Hindi')
INSERT INTO #SEM VALUES(7, 'Spoken English')
INSERT INTO #SEM VALUES(8, 'B.EEE')
INSERT INTO #SEM VALUES(9, 'B.ECE')
DECLARE #COURSEIDS_STUDENT VARCHAR(100), #COURSEIDS_SEM VARCHAR(100)
SELECT #COURSEIDS_STUDENT = COALESCE(#COURSEIDS_STUDENT, '') + CONVERT(VARCHAR(10), COURSEID) + ' ' FROM #STUDENT
SELECT #COURSEIDS_SEM = COALESCE(#COURSEIDS_SEM , '') + CONVERT(VARCHAR(10), COURSEID) + ' ' FROM #SEM WHERE COURSEID NOT IN (SELECT COURSEID FROM #STUDENT)
SELECT #COURSEIDS_STUDENT COURSEIDS_STUDENT, #COURSEIDS_SEM COURSEIDS_SEM
try this:
;WITH CTE as (select ROW_NUMBER() over (order by (select 0)) as rn,* from Sample.dbo.Tbl_student_details)
,CTE1 As(
select rn,SselectedCourse ,replace(stuff((select ''+courseid from course_details for xml path('')),1,1,''),SselectedCourse,'') as rem from CTE a
where rn = 1
union all
select c2.rn,c2.SselectedCourse,replace(rem,c2.SselectedCourse,'') as rem
from CTE1 c1 inner join CTE c2
on c2.rn=c1.rn+1
)
select STUFF((select ''+SselectedCourse from CTE1 for xml path('')),1,0,''),(select top 1 rem from CTE1 order by rn desc)