MS SQL Set Group ID Without Looping - sql

I would like to create a query in MS-SQL that produces a column containing an incrementing group number.
This is how I want my data to return:
Column 1 | Column 2 | Column 3
------------------------------
I | 1 | 1
O | 2 | 2
O | 2 | 3
I | 3 | 4
O | 4 | 5
O | 4 | 6
O | 4 | 7
O | 4 | 8
I | 5 | 9
O | 6 | 10
Column 1 is the I and O meaning In and Out.
Column 2 is the row Group (this should increment when Column 1 changes).
Column 3 is the Row-number.
So how can I write my query so that Column 2 increments every time Column 1 changes?

Firstly, to perform this kind of operation you need some column that can identify the order of the rows. If you have a column that determines this order, an identity column for example, it can be used to do something like this:
Runnable sample:
CREATE TABLE #Groups
(
id INT IDENTITY(1, 1) , -- added identity to provide order
Column1 VARCHAR(1)
)
INSERT INTO #Groups
( Column1 )
VALUES ( 'I' ),
( 'O' ),
( 'O' ),
( 'I' ),
( 'O' ),
( 'O' ),
( 'O' ),
( 'O' ),
( 'I' ),
( 'O' );
;
WITH cte
AS ( SELECT id ,
Column1 ,
1 AS Column2
FROM #Groups
WHERE id = 1
UNION ALL
SELECT g.id ,
g.Column1 ,
CASE WHEN g.Column1 = cte.Column1 THEN cte.Column2
ELSE cte.Column2 + 1
END AS Column2
FROM #Groups g
INNER JOIN cte ON cte.id + 1 = g.id
)
SELECT *
FROM cte
OPTION (MAXRECURSION 0) -- required to allow for more than 100 recursions
DROP TABLE #Groups
This code effectively loops through the records, comparing each row to the previous one and incrementing the value of Column2 whenever the value in Column1 changes.
If you don't have an identity column, then you might consider adding one.
Credit @AeroX:
With 30K records, the last line, OPTION (MAXRECURSION 0), is required to override the default limit of 100 recursions when using a Common Table Expression (CTE). Setting it to 0 means that the recursion isn't limited.
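For readers who want to check the logic outside the database, here is a minimal Python sketch of what the recursive CTE computes row by row. The function name `assign_groups` is made up for illustration, and the list is assumed to be already in `id` order:

```python
def assign_groups(values):
    """Return a group number per row, bumped whenever the value changes."""
    groups = []
    previous = None
    group = 0
    for value in values:
        # The first row always starts group 1; later rows start a new
        # group only when they differ from the row before them.
        if not groups or value != previous:
            group += 1
        groups.append(group)
        previous = value
    return groups


column1 = ["I", "O", "O", "I", "O", "O", "O", "O", "I", "O"]
print(assign_groups(column1))  # [1, 2, 2, 3, 4, 4, 4, 4, 5, 6]
```

The printed list matches Column 2 in the question.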

This will work if you have SQL Server 2012 or later:
DECLARE @t table(col1 char(1), col3 int identity(1,1))
INSERT @t values
('I'), ('O'), ('O'), ('I'), ('O'), ('O'), ('O'), ('O'), ('I'), ('O')
;WITH CTE AS
(
SELECT
case when lag(col1) over (order by col3) = col1
then 0 else 1 end increase,
col1,
col3
FROM @t
)
)
SELECT
col1,
sum(increase) over (order by col3) col2,
col3
FROM CTE
Result:
col1 col2 col3
I 1 1
O 2 2
O 2 3
I 3 4
O 4 5
O 4 6
O 4 7
O 4 8
I 5 9
O 6 10

Related

T-SQL sequential updating with two columns

I have a table created by:
CREATE TABLE table1
(
id INT,
multiplier INT,
col1 DECIMAL(10,5)
)
INSERT INTO table1
VALUES (1, 2, 1.53), (2, 3, NULL), (3, 2, NULL),
(4, 2, NULL), (5, 3, NULL), (6, 1, NULL)
Which results in:
id multiplier col1
-----------------------
1 2 1.53000
2 3 NULL
3 2 NULL
4 2 NULL
5 3 NULL
6 1 NULL
I want to add a column col2 which is defined as multiplier * col1, however the next value of col1 then updates to take the previous calculated value of col2.
The resulting table should look like:
id multiplier col1 col2
---------------------------------------
1 2 1.53000 3.06000
2 3 3.06000 9.18000
3 2 9.18000 18.36000
4 2 18.36000 36.72000
5 3 36.72000 110.16000
6 1 110.16000 110.16000
Is this possible using T-SQL? I've tried a few different things such as joining id to id - 1 and have played around with a sequential update using UPDATE and setting variables but I can't get it to work.
A recursive CTE might be the best approach. Assuming your ids have no gaps:
with cte as (
select id, multiplier, convert(float, col1) as col1, convert(float, col1 * multiplier) as col2
from table1
where id = 1
union all
select t1.id, t1.multiplier, cte.col2 as col1, cte.col2 * t1.multiplier
from cte join
table1 t1
on t1.id = cte.id + 1
)
select *
from cte;
Here is a db<>fiddle.
Note that I converted the destination type to float, which is convenient for this sort of operation. You can convert back to decimal if you prefer that.
Basically, this would require an aggregate/window function that computes the product of column values. No such set function exists in SQL, though. We can work around this with arithmetic, since the product of positive values equals exp of the sum of their logs:
select
id,
multiplier,
coalesce(min(col1) over() * exp(sum(log(multiplier)) over(order by id rows between unbounded preceding and 1 preceding)), col1) col1,
min(col1) over() * exp(sum(log(multiplier)) over(order by id)) col2
from table1
Demo on DB Fiddle:
id | multiplier | col1 | col2
-: | ---------: | -----: | -----:
1 | 2 | 1.53 | 3.06
2 | 3 | 3.06 | 9.18
3 | 2 | 9.18 | 18.36
4 | 2 | 18.36 | 36.72
5 | 3 | 36.72 | 110.16
6 | 1 | 110.16 | 110.16
This will fail if there are zero or negative multipliers, because LOG is not defined for them.
If you wanted an update statement:
with cte as (
select col1, col2,
coalesce(min(col1) over() * exp(sum(log(multiplier)) over(order by id rows between unbounded preceding and 1 preceding)), col1) col1_new,
min(col1) over() * exp(sum(log(multiplier)) over(order by id)) col2_new
from table1
)
update cte set col1 = col1_new, col2 = col2_new
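The exp/sum/log workaround is easy to sanity-check outside SQL. Below is a small Python sketch of the same arithmetic; the function name `running_products_via_logs` is hypothetical, and the multipliers are assumed to be listed in `id` order:

```python
import math
from itertools import accumulate


def running_products_via_logs(start, multipliers):
    """Running product of multipliers, computed as exp(running sum of logs)."""
    # math.log raises ValueError for zero or negative input, which mirrors
    # the limitation of the SQL version.
    log_sums = accumulate(math.log(m) for m in multipliers)
    return [start * math.exp(s) for s in log_sums]


col2 = running_products_via_logs(1.53, [2, 3, 2, 2, 3, 1])
print([round(v, 2) for v in col2])  # [3.06, 9.18, 18.36, 36.72, 110.16, 110.16]
```

The results go through floating point, so they are rounded before display, much like converting back to decimal in the query.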

How to exclude certain rows from sql select

How do I exclude certain rows?
For example, I have the following table:
+------+------+------+
| Col1 | Col2 | Col3 |
+------+------+------+
| 1 | 1 | R |
| 1 | 2 | D |
| 2 | 3 | R |
| 2 | 4 | R |
| 3 | 5 | R |
| 4 | 6 | D |
+------+------+------+
I need to select only:
| 2 | 3 | R |
| 2 | 4 | R |
| 3 | 5 | R |
My select that does not work properly:
with t (c1,c2,c3) as(
select 1 , 1 , 'R' from dual union all
select 1 , 2 , 'D' from dual union all
select 2 , 3 , 'R' from dual union all
select 2 , 4 , 'R' from dual union all
select 3 , 5 , 'R' from dual union all
select 4 , 6 , 'D' from dual),
tt as (select t.*,count(*) over (partition by c1) cc from t ) select * from tt where cc=1 and c3='R';
Thanks in advance!
select * from table where col3 = 'R'
or, if you want to exclude rows with the D value, just
select * from table where col3 != 'D'
It depends on your requirements but you can do in this way:
SELECT * FROM `table` WHERE col1 = 2 AND col3 = "R"
if you want to exclude just do it like WHERE col1 != 1
You can also use an IN clause, e.g.
SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1, value2, ...);
This syntax is for MySQL, but you can modify it to suit your requirements or the database you are using.
this will work :
select * from (select * from table_name) where rownum<=4
minus
select * from ( select * from table_name) where rownum<=2
My guess is that you want all rows for a given col1 value where no row with that col1 has col3 = 'D' and at least one row with that col1 has col3 = 'R'. WHERE [NOT] EXISTS may do:
DROP TABLE T;
CREATE TABLE T
(Col1 NUMBER, Col2 NUMBER, Col3 VARCHAR(1));
INSERT INTO T VALUES ( 1 , 1 , 'R');
INSERT INTO T VALUES ( 1 , 2 , 'D');
INSERT INTO T VALUES ( 2 , 3 , 'R');
INSERT INTO T VALUES ( 2 , 4 , 'R');
INSERT INTO T VALUES ( 3 , 5 , 'R');
INSERT INTO T VALUES ( 3 , 6 , 'D');
INSERT INTO T VALUES ( 4 , 5 , 'X');
INSERT INTO T VALUES ( 4 , 6 , 'Y');
INSERT INTO T VALUES ( 5 , 6 , 'X');
INSERT INTO T VALUES ( 5 , 5 , 'R');
INSERT INTO T VALUES ( 5 , 6 , 'Y');
SELECT *
FROM T
WHERE NOT EXISTS(SELECT 1 FROM T T1 WHERE T1.COL1 = T.COL1 AND COL3 = 'D') AND
EXISTS(SELECT 1 FROM T T1 WHERE T1.COL1 = T.COL1 AND COL3 = 'R');
Result
COL1 COL2 COL3
---------- ---------- ----
5 6 X
5 5 R
5 6 Y
2 3 R
2 4 R
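The same rule (keep a col1 group only if it has at least one R and no D) can be sketched in Python to check it against the question's data. The function name `keep_groups` is made up for illustration, and rows are assumed to be `(col1, col2, col3)` tuples:

```python
def keep_groups(rows):
    """Keep rows whose col1 group contains an 'R' and no 'D'."""
    has_d = {c1 for c1, _, c3 in rows if c3 == "D"}  # groups to reject
    has_r = {c1 for c1, _, c3 in rows if c3 == "R"}  # groups that qualify
    return [row for row in rows if row[0] in has_r and row[0] not in has_d]


question_rows = [
    (1, 1, "R"), (1, 2, "D"), (2, 3, "R"),
    (2, 4, "R"), (3, 5, "R"), (4, 6, "D"),
]
print(keep_groups(question_rows))  # [(2, 3, 'R'), (2, 4, 'R'), (3, 5, 'R')]
```

On the question's table this returns exactly the three rows the asker wanted.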
use row_number() window function
with t (c1,c2,c3) as(
select 1 , 1 , 'R' from dual union all
select 1 , 2 , 'D' from dual union all
select 2 , 3 , 'R' from dual union all
select 2 , 4 , 'R' from dual union all
select 3 , 5 , 'R' from dual union all
select 4 , 6 , 'D' from dual
),
t1 as
(
select c1,c2,c3,row_number() over(order by c2) rn from t
) select * from t1 where t1.rn>=3 and t1.rn<=5
demo link
C1 C2 C3
2 3 R
2 4 R
3 5 R
You can try using a correlated subquery:
select * from tablename a
where exists (select 1 from tablename b where a.col1 = b.col1 having count(*) > 1)
Based on what you have provided, I can only surmise that the only requirement is for COL1 to be equal to 2 or 3. In that case, all you have to do is (assuming that you actually have a table):
SELECT * FROM <table_name>
WHERE col1 IN (2,3);
This will give you the desired output for the particular example provided in the question. If there is a selection requirement that goes beyond retrieving data where column 1 is either 2 or 3, then a more specific answer can be provided.

DENSE_RANK() without duplication

Here's what my data looks like:
| col1 | col2 | denserank | whatiwant |
|------|------|-----------|-----------|
| 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 1 |
| 3 | 2 | 2 | 2 |
| 4 | 2 | 2 | 2 |
| 5 | 1 | 1 | 3 |
| 6 | 2 | 2 | 4 |
| 7 | 2 | 2 | 4 |
| 8 | 3 | 3 | 5 |
Here's the query I have so far:
SELECT col1, col2, DENSE_RANK() OVER (ORDER BY COL2) AS [denserank]
FROM [table1]
ORDER BY [col1] asc
What I'd like to achieve is for my denserank column to increment every time there is a change in the value of col2 (even if the value itself is reused). I can't actually order by the column I have denserank on, so that won't work. See the whatiwant column for an example.
Is there any way to achieve this with DENSE_RANK()? Or is there an alternative?
I would do it with a recursive cte like this:
declare @Dept table (col1 integer, col2 integer)
insert into @Dept values(1, 1),(2, 1),(3, 2),(4, 2),(5, 1),(6, 2),(7, 2),(8, 3)
;with a as (
select col1, col2,
ROW_NUMBER() over (order by col1) as rn
from @Dept),
s as
(select col1, col2, rn, 1 as dr from a where rn=1
union all
select a.col1, a.col2, a.rn, case when a.col2=s.col2 then s.dr else s.dr+1 end as dr
from a inner join s on a.rn=s.rn+1)
select col1, col2, dr from s
result:
col1 col2 dr
----------- ----------- -----------
1 1 1
2 1 1
3 2 2
4 2 2
5 1 3
6 2 4
7 2 4
8 3 5
The ROW_NUMBER is only required in case your col1 values are not sequential. If they are you can use the recursive cte straight away
Try this using window functions:
with t(col1 ,col2) as (
select 1 , 1 union all
select 2 , 1 union all
select 3 , 2 union all
select 4 , 2 union all
select 5 , 1 union all
select 6 , 2 union all
select 7 , 2 union all
select 8 , 3
)
select t.col1,
t.col2,
sum(x) over (
order by col1
) whatyouwant
from (
select t.*,
case
when col2 = lag(col2) over (
order by col1
)
then 0
else 1
end x
from t
) t
order by col1;
It does a single table read, forms groups of consecutive equal col2 values in increasing order of col1, and then numbers those groups.
x: 0 if the previous row's col2 is the same as this row's col2 (in order of increasing col1), otherwise 1. The first row has no previous row, so it gets 1.
whatyouwant: a running sum of x in order of increasing col1, so the number goes up by one at the start of each new group, and that's your output.
Here is one way using SUM OVER(Order by) window aggregate function
SELECT col1,Col2,
Sum(CASE WHEN a.prev_val = a.col2 THEN 0 ELSE 1 END) OVER(ORDER BY col1) AS whatiwant
FROM (SELECT col1,
col2,
Lag(col2, 1)OVER(ORDER BY col1) AS prev_val
FROM Yourtable) a
ORDER BY col1;
How it works:
LAG window function is used to find the previous col2 for each row ordered by col1
SUM OVER(Order by) will increment the number only when previous col2 is not equal to current col2
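The two steps above (flag each change, then take a running sum of the flags) can be sketched in Python to see why the result matches the whatiwant column. This is only an illustration of the idea; the function name `running_group_numbers` is made up, and the input is assumed to be the col2 values already ordered by col1:

```python
from itertools import accumulate


def running_group_numbers(values):
    """Number runs of equal consecutive values: change flags, then a running sum."""
    values = list(values)
    # Flag is 1 on the first row and wherever the value differs from the
    # previous row (the LAG comparison), otherwise 0.
    flags = [
        1 if i == 0 or values[i] != values[i - 1] else 0
        for i in range(len(values))
    ]
    # The running sum of the flags plays the role of SUM(...) OVER (ORDER BY col1).
    return list(accumulate(flags))


col2 = [1, 1, 2, 2, 1, 2, 2, 3]
print(running_group_numbers(col2))  # [1, 1, 2, 2, 3, 4, 4, 5]
```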
I think this is possible in pure SQL using some gaps and islands tricks, but the path of least resistance might be to use a session variable combined with LAG() to keep track of when your computed dense rank changes value. In the query below, I use @a to keep track of the change in the dense rank, and when it changes this variable is incremented by 1.
DECLARE @a int
SET @a = 1
SELECT t.col1,
t.col2,
t.denserank,
@a = CASE WHEN LAG(t.denserank, 1, 1) OVER (ORDER BY t.col1) = t.denserank
THEN @a
ELSE @a+1 END AS [whatiwant]
FROM
(
SELECT col1, col2, DENSE_RANK() OVER (ORDER BY COL2) AS [denserank]
FROM [table1]
) t
ORDER BY t.col1

SQL Split Single Row into Fixed Number of Columns

We need to split a single row into a fixed number of columns. The following is an example data set:
1
2
3
4
5
6
7
Desired Output:
Column A Column B Column C Column D
1 2 3 4
5 6 7 NULL
Thanks for your help in advance.
SQL Server Solution:
Create Sample Table:
create table mytable (col1 int)
insert into mytable values
(1),
(2),
(3),
(4),
(5),
(6),
(7);
Using Modulo and Row_Number(), you could easily do this:
Modulo Query:
SELECT
R1.col1 as columnA,
R2.col1 as columnB,
R3.col1 as columnC,
R4.col1 as columnD
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY col1 ASC) AS RowNum, col1
FROM mytable
WHERE
col1 % 4 = 1
) AS R1
FULL OUTER JOIN (
SELECT ROW_NUMBER() OVER (ORDER BY col1 ASC) AS RowNum, col1
FROM mytable
WHERE
col1 % 4 = 2
) AS R2
ON R1.RowNum = R2.RowNum
FULL OUTER JOIN (
SELECT ROW_NUMBER() OVER (ORDER BY col1 ASC) AS RowNum, col1
FROM mytable
WHERE
col1 % 4 = 3
) AS R3
ON R2.RowNum = R3.RowNum
FULL OUTER JOIN (
SELECT ROW_NUMBER() OVER (ORDER BY col1 ASC) AS RowNum, col1
FROM mytable
WHERE
col1 % 4 = 0
) AS R4
ON R4.RowNum = R3.RowNum
Result:
+---------+---------+---------+---------+
| columnA | columnB | columnC | columnD |
+---------+---------+---------+---------+
| 1 | 2 | 3 | 4 |
| 5 | 6 | 7 | (null) |
+---------+---------+---------+---------+
SQL Fiddle Demo
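Note that the query filters on `col1 % 4`, so it relies on the values themselves being consecutive integers starting at 1. As a rough sketch of the underlying idea, here is the same reshaping in Python based on position rather than value; the function name `split_into_columns` and the `width` parameter are made up for illustration:

```python
def split_into_columns(values, width=4):
    """Lay values out in rows of `width` columns, padding the last row with None."""
    rows = []
    for start in range(0, len(values), width):
        chunk = list(values[start:start + width])
        # Pad a short final chunk so every row has the same number of columns.
        chunk += [None] * (width - len(chunk))
        rows.append(chunk)
    return rows


print(split_into_columns([1, 2, 3, 4, 5, 6, 7]))
# [[1, 2, 3, 4], [5, 6, 7, None]]
```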

SQL: Making a 'computation row'

I have a table that looks like this
TYPE | A | B | C | ... | Z
one | 4 | 4 | 4 | ... | 4
two | 3 | 2 | 2 | ... | 1
And I wanted to insert a row with a computation (row one minus row two):
TYPE | A | B | C | ... | Z
one | 4 | 4 | 4 | ... | 4
two | 3 | 2 | 2 | ... | 1
delta| 1 | 2 | 2 | ... | 3
I was thinking of a SQL command that looks like
(select A from table where type=one) - (select A from table where type=two)
The downside is that it's too long, and I would also have to do it for all the columns (A-Z), which is quite a lot.
I'm sure there's a more elegant way of doing this.
PS:
The sequence of my code looks like this btw:
// I'm inserting the data from a RawTable to a TempTable
INSERT one
INSERT two
INSERT delta
INSERT three
INSERT four
INSERT delta
...
INSERT onehundredone
INSERT onehundredtwo
INSERT delta
I have added an ID column with identity to your temp table. You can use that to figure out what rows should be grouped.
create table YourTable
(
ID int identity primary key,
[TYPE] varchar(20),
A int,
B int,
C int
)
insert into YourTable ([TYPE], A, B, C)
select 'one', 4, 4, 4 union all
select 'two', 3, 2, 2 union all
select 'three', 7, 4, 4 union all
select 'four', 3, 2, 2 union all
select 'five', 8, 4, 4 union all
select 'six', 3, 2, 2
select T.[TYPE], T.A, T.B, T.C
from
(
select
T.ID,
T.[TYPE],
T.A,
T.B,
T.C
from YourTable as T
union all
select
T2.ID,
'delta' as [TYPE],
T1.A-T2.A as A,
T1.B-T2.B as B,
T1.C-T2.C as C
from YourTable as T1
inner join YourTable as T2
on T1.ID = T2.ID-1 and
T2.ID % 2 = 0
) as T
order by T.ID, case T.[TYPE] when 'delta' then 1 else 0 end
Result:
TYPE A B C
-------------------- ----------- ----------- -----------
one 4 4 4
two 3 2 2
delta 1 2 2
three 7 4 4
four 3 2 2
delta 4 2 2
five 8 4 4
six 3 2 2
delta 5 2 2
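To make the pairing rule explicit (every second row is subtracted from the row before it), here is a small Python sketch of the same idea. The function name `with_delta_rows` is hypothetical, and rows are assumed to be `(type, a, b, c)` tuples in insertion order:

```python
def with_delta_rows(rows):
    """Insert a 'delta' row (first minus second) after each pair of rows."""
    result = []
    for i in range(0, len(rows) - 1, 2):
        first, second = rows[i], rows[i + 1]
        # Subtract column by column, skipping the leading type label.
        delta = ("delta",) + tuple(x - y for x, y in zip(first[1:], second[1:]))
        result.extend([first, second, delta])
    if len(rows) % 2:
        # An unpaired trailing row is kept as-is, with no delta.
        result.append(rows[-1])
    return result


rows = [("one", 4, 4, 4), ("two", 3, 2, 2), ("three", 7, 4, 4), ("four", 3, 2, 2)]
for row in with_delta_rows(rows):
    print(row)
```

Because the subtraction runs over all value columns with `zip`, the sketch does not need to name each column, which is the part that is awkward to express in plain SQL.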
Sorting on column C from first row in group:
select T.[TYPE], T.A, T.B, T.C
from
(
select
T1.ID,
T1.[TYPE],
case T1.ID % 2 when 1 then T1.C else T2.C end as Sortorder,
T1.A,
T1.B,
T1.C
from YourTable as T1
left outer join YourTable as T2
on T1.ID = T2.ID+1
union all
select
T2.ID,
'delta' as [TYPE],
T1.C as Sortorder,
T1.A-T2.A as A,
T1.B-T2.B as B,
T1.C-T2.C as C
from YourTable as T1
inner join YourTable as T2
on T1.ID = T2.ID-1 and
T2.ID % 2 = 0
) as T
order by T.Sortorder, T.ID, case T.[TYPE] when 'delta' then 1 else 0 end
I'm not aware of any way to do this "easily" (i.e. without having to specify every column), I can't come up with any way to do it easily, so I'll go on the record as saying that it can't be done. Easily.
The non-easy way would be to build dynamic code--something that loops through the database metadata, builds a string containing the statement(s) to execute your desired routine column by column, and then execute that string. You really want to avoid this whenever possible.
One shortcut, if you just need to build a procedure or function that does this (i.e. build once, run many): you could copy the list of columns into a spreadsheet (Excel), build out the highly repetitive statements using formulas that reference the column names, and then copy the results back. (This is much simpler to do than it is to explain.)
I have no idea why you're doing this, but the way I'd approach it is:
insert into table
select 'delta',
t1.a - t2.a,
t1.b - t2.b
.....
from table t1,
table t2
where t1.type = 'one'
and t2.type = 'two'
You would have to run this query immediately after inserting "one" and "two", then re-run it after inserting "three" and "four". Nasty nasty nasty.
If you can re-name the columns in some way, or create a numerical column, you could run it in a single query.
When you replace one for 1, two for 2, and so on, then maybe this sql could work:
INSERT INTO PodMays
SELECT
"Delta", A.A-B.A, A.B-B.B, A.C-B.C, A.D-B.D, A.E-B.E
FROM
(
SELECT TOP 1
*
FROM
(SELECT TOP 2 * FROM PodMays WHERE Type <> "Delta" ORDER BY Type DESC)
ORDER BY
Type ASC
) AS A,
(
SELECT TOP 1
*
FROM
(SELECT TOP 2 * FROM PodMays WHERE Type <> "Delta" ORDER BY Type DESC)
ORDER BY
Type DESC
) AS B