Assign row number per ID, where certain values always have fixed numbers - sql

I want to create a query that assigns row numbers per ID in a database table, and certain specific values always get fixed row numbers. For instance, if the value in col2 is A, then the row number should be consistently set to 1. Similarly, if col2 contains the value B, then the row number should always be 2. All other values in col2 should be assigned row numbers in consecutive order starting from 3.
Desired result:
myid col1 col2 row_number
----------------------------------
1 foo A 1
1 bar B 2
1 foobar C 3
1 foobar D 4
2 foobar A 1
2 foob X 3
3 hello B 2
3 hello Z 3
3 hi Y 4
Here is an example which is not working properly.

Sounds like you want to start the row_number with a specific offset, ignoring constant values and assigning them a constant row number.
You can do something a bit ugly like this:
SELECT myid, col1, col2,
       case
           when col2 = 'A' then 1
           when col2 = 'B' then 2
           else row_number() over (partition by myid
                                   order by case when col2 = 'A' then 'ZZZ'
                                                 when col2 = 'B' then 'ZZZ1'
                                                 else col2
                                            end) + 2
       end as row_number
FROM newtable
ORDER BY myid, row_number
Result:
MYID COL1 COL2 ROW_NUMBER
1 foo A 1
1 bar B 2
1 foobar C 3
1 foobar D 4
2 foobar A 1
2 foob X 3
3 hello B 2
3 hello Y 3
3 hello Z 4
This starts the numbering of the remaining rows at 3: the +2 offset equals the number of constant values (here A and B). Inside the row_number window function, each constant value is replaced by a string that sorts last, so the other rows are numbered first. This assumes no other col2 value sorts after 'ZZZ'.
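As a hedged sketch of the same idea, the snippet below runs a variant of the query in SQLite through Python's sqlite3 module (SQLite 3.25 or later is assumed, since window functions are required). Instead of the 'ZZZ' placeholder strings it sorts on an explicit 0/1 flag, which is a swapped-in technique rather than the answer's exact query; the alias rn and the in-memory database are assumptions made for illustration.

```python
import sqlite3

# Sample rows copied from the question; the table name newtable comes from the answer.
rows = [
    (1, "foo", "A"), (1, "bar", "B"), (1, "foobar", "C"), (1, "foobar", "D"),
    (2, "foobar", "A"), (2, "foob", "X"),
    (3, "hello", "B"), (3, "hello", "Z"), (3, "hi", "Y"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE newtable (myid INTEGER, col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO newtable VALUES (?, ?, ?)", rows)

# Fixed values A and B get the flag 1 so they sort after every other value;
# the remaining rows are numbered 1, 2, ... by col2 and then shifted by 2.
query = """
SELECT myid, col1, col2,
       CASE
           WHEN col2 = 'A' THEN 1
           WHEN col2 = 'B' THEN 2
           ELSE ROW_NUMBER() OVER (
                    PARTITION BY myid
                    ORDER BY CASE WHEN col2 IN ('A', 'B') THEN 1 ELSE 0 END, col2
                ) + 2
       END AS rn
FROM newtable
ORDER BY myid, rn
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The flag avoids relying on how a placeholder string compares with real col2 values. As in the answer, the non-fixed rows are numbered in col2 order.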

Related

Finding adjacent column values from the last non-null value of a certain column in Snowflake (SQL) using partition by

Say I have the following table:
ID  T  R
1   2
1   3  Y
1   4
1   5
1   6  Y
1   7
I would like to add a column which equals the value from column T based on the last non-null value from column R. This means the following:
ID  T  R  GOAL
1   2
1   3  Y
1   4  Y  3
1   5     4
1   6  Y  4
1   7     6
I have many IDs, so I need to use the OVER (PARTITION BY ...) clause. Also, if possible, I would like to use a single statement, like
SELECT *
, GOAL
FROM TABLE
So without any extra select statement.
T is in ascending order so just null it out according to R and take the maximum looking backward.
select *,
       max(case when R is not null then T end)
           over (
               partition by id
               order by T
               rows between unbounded preceding and 1 preceding
           ) as GOAL
from TBL
http://sqlfiddle.com/#!18/c927a5/5
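To make the frame clause concrete, here is a minimal sketch that runs the same window expression in SQLite via Python's sqlite3 module (SQLite 3.25 or later is assumed; the question itself targets Snowflake, so this only illustrates the logic). The rows are taken from the expected-output table in the question, with None standing in for the empty R cells.

```python
import sqlite3

# (ID, T, R) rows from the expected-output table; None represents NULL.
rows = [(1, 2, None), (1, 3, "Y"), (1, 4, "Y"), (1, 5, None), (1, 6, "Y"), (1, 7, None)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TBL (ID INTEGER, T INTEGER, R TEXT)")
conn.executemany("INSERT INTO TBL VALUES (?, ?, ?)", rows)

# T is kept only where R is non-null; the frame ends one row before the
# current row, so MAX returns the latest earlier T that had a non-null R.
goal = conn.execute("""
SELECT ID, T, R,
       MAX(CASE WHEN R IS NOT NULL THEN T END) OVER (
           PARTITION BY ID
           ORDER BY T
           ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
       ) AS GOAL
FROM TBL
ORDER BY ID, T
""").fetchall()

for row in goal:
    print(row)
```

The first two rows get NULL because no earlier row has a non-null R; after that GOAL is 3, 4, 4, 6.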

Assigning string group ID in pandas

I have a data frame (data)
Col 1 Col 2 Combination
1 2 (1,2)
3 4 (3,4)
1 2 (1,2)
2 3 (2,3)
4 6 (4,6)
3 4 (3,4)
I want to assign a group ID based on Col 1 and Col 2 as a categorical variable, not a numerical one.
My output needed
Col 1 Col 2 Combination GroupID
1 2 (1,2) A
3 4 (3,4) C
1 2 (1,2) A
2 3 (2,3) B
4 6 (4,6) D
3 4 (3,4) C
The GroupID needs to be a categorical data type; it need not be numerical, and the labels can follow any order.
I have tried this code, but the GroupID column is treated as a numerical data type:
data['GroupID'] = data.groupby(['Col 1', 'Col 2']).ngroup()
data['GroupID'] = data['GroupID'].astype('category')
Can anyone suggest a proper way to deal with this issue?

How to sum all row per item?

In relation to my previous question How to use the result of previous row in oracle?
I need to sum the value per item.
Col | Col A | Col B
Item1 1 | 1 (col A)
Item1 2 | 3 (colA + prevColB)
Item1 3 | 6 (colA + prevColB)
Item2 1 | 1 (colA)
Item2 4 | 5 (colA + prevColB)
Item2 3 | 8 (colA + prevColB)
SQL tables represent unordered sets. Your cumulative sum assumes an ordering of the rows that is not apparent in the question.
The syntax for the cumulative sum is:
select t.*,
       sum(cola) over (partition by col order by ?) as colb
from t;
The ? is for the column (or expression) that represents the ordering of the rows.
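A small runnable sketch of this cumulative sum, using SQLite through Python's sqlite3 module (SQLite 3.25 or later is assumed; the question is about Oracle, where the window syntax is the same). Because the question shows no ordering column, a hypothetical seq column is introduced here to play the role of the ?.

```python
import sqlite3

# (col, seq, cola): seq is a hypothetical ordering column, not part of the question.
rows = [
    ("Item1", 1, 1), ("Item1", 2, 2), ("Item1", 3, 3),
    ("Item2", 1, 1), ("Item2", 2, 4), ("Item2", 3, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col TEXT, seq INTEGER, cola INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# With ORDER BY in the window, SUM becomes a running total within each item.
running = conn.execute("""
SELECT col, cola,
       SUM(cola) OVER (PARTITION BY col ORDER BY seq) AS colb
FROM t
ORDER BY col, seq
""").fetchall()

for row in running:
    print(row)
```

With this ordering the running totals are 1, 3, 6 for Item1 and 1, 5, 8 for Item2, matching the Col B values in the question.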
If you mean just the previous row's value (not the overall sum), then use the lag function, which returns the column's value from the previous row, as in the following SQL:
select colA, colA+
lag(colA,1,0) over (partition by Col order by Col ) as ColB
from tab;
COLA COLB
1 1
2 3
3 5
1 1
4 5
3 7
SQL Fiddle Demo
Col is the item, I think; you can try the query below:
select col, sum(col A), sum( col B) from tb group by col
enjoy it broh

Row to Column and Column to Row conversion in query

I want to invert the data in my table, i.e. convert a:
Row to Column.
Column to Row.
The actual table data is
col1 col2
---------------------------------
row1 1 2
row2 3 4
Expected output :
col1 col2
---------------------------------
row1 1 3
row2 2 4
I have tried a normal select query but couldn't work out how to do this. Is it possible in PL/SQL?
(select * from (select a,b from t) pivot(sum(a) for b in(2,4)))
union
(select * from (select a,b from t) pivot(sum(b) for a in(1,3)));

How to ignore certain similar rows when select

I have the following table
Id col1 col2 col3
1 c 2 m
2 c 3 6
2 b d u
3 e 6 9
4 1 v 8
4 2 b t
4 4 5 g
As you can see, there are duplicate values in the id column: 2 and 4. I only want to select rows with a unique id value and ignore any following rows with a duplicate id value; that is, I just want to keep the first of the rows with duplicate values:
1 c 2 m
2 c 3 6
3 e 6 9
4 1 v 8
There is FK constraint, so I cannot delete rows with duplicate values.
I am using SQL SERVER 2008 R2
Any reply will be appreciated.
You can use row_number to number each row with the same id. Then you can select only the first row per id:
select *
from (
    select *,
           row_number() over (partition by id order by col1, col2, col3) as rn
    from YourTable
) as SubQueryAlias
where rn = 1
The subquery is required because SQL Server doesn't allow row_number directly in the where clause.
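The pattern can be tried out with the following sketch, which uses SQLite through Python's sqlite3 module rather than SQL Server (SQLite 3.25 or later is assumed). The rows come from the question and all columns are stored as text, which is an assumption about the original schema.

```python
import sqlite3

# Rows from the question; every column except id is stored as text (assumed).
rows = [
    (1, "c", "2", "m"), (2, "c", "3", "6"), (2, "b", "d", "u"),
    (3, "e", "6", "9"), (4, "1", "v", "8"), (4, "2", "b", "t"), (4, "4", "5", "g"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER, col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?, ?)", rows)

# Number the rows within each id, then keep only the row numbered 1.
first_rows = conn.execute("""
SELECT id, col1, col2, col3
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY col1, col2, col3) AS rn
    FROM YourTable
) AS SubQueryAlias
WHERE rn = 1
ORDER BY id
""").fetchall()

for row in first_rows:
    print(row)
```

Note that "first" is defined entirely by the ORDER BY inside the window. With col1, col2, col3 as the ordering, id 2 keeps the row (b, d, u) rather than (c, 3, 6), because the table has no column recording insertion order; ordering by such a column, if one exists, would be needed to reproduce the output shown in the question.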