Can I split a dynamic semicolon delimited string into columns using T-SQL?

I have a table that looks like so:
Id SubNumber Values
1 1 1;4;8;3
2 2 8;9;7;10
3 3 41;45;23;0
I will not always have exactly 4 values, and the number of "SubNumbers" can be greater than 3. Is there any way I can query this table so that the result looks like this?
Id SubNumber 1 2 3 4
1 1 1 4 8 3
2 2 8 9 7 10
3 3 41 45 23 0
Within a given table, every row has the same number of semicolon-delimited values, but that number can vary from table to table. One table may have 10 values per row, another only 1.
The second table doesn't need numbers as column names. The names can be blank, or whatever default SQL assigns when no name is provided.
This is not a duplicate of the example provided because this deals with separating a dynamic number of values into columns.
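To make the target shape concrete, here is a minimal sketch of the split outside the database. It is plain Python rather than T-SQL, the in-memory `rows` list and the `split_values` helper are hypothetical names used only for illustration, and it relies on the stated guarantee that every row in a table carries the same number of values.

```python
# Hypothetical in-memory copy of the table from the question.
rows = [
    (1, 1, "1;4;8;3"),
    (2, 2, "8;9;7;10"),
    (3, 3, "41;45;23;0"),
]


def split_values(rows, sep=";"):
    """Split the delimited third field into one column per value."""
    # The column count is not known in advance, so it is taken from the
    # first row and then checked against every other row.
    width = len(rows[0][2].split(sep)) if rows else 0
    header = ["Id", "SubNumber"] + [str(i) for i in range(1, width + 1)]
    table = []
    for row_id, sub_number, values in rows:
        parts = values.split(sep)
        if len(parts) != width:
            raise ValueError(
                f"row {row_id} has {len(parts)} values, expected {width}"
            )
        table.append([row_id, sub_number] + [int(p) for p in parts])
    return header, table


header, table = split_values(rows)
print(header)
for line in table:
    print(line)
```

The point of the sketch is that the column list has to be computed from the data before the rows can be emitted, which is why a fixed column list in a static query cannot cover the general case.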

Related

We have an age column with single values, some of them 15 or above; the target needs either the single value or 15+

If the source value is 3 or 4, the target value is 3 or 4. If the source is any negative value, the target is -1, and if the source value is 15 or more, the target is 15+.
Table 1 (Age column)    Table 2 (Age column)
3                       3
4                       4
15                      15+
-2                      -1
-3                      -1
100                     15+
You can use the CASE WHEN ... THEN ... END syntax for problems like that (https://www.w3schools.com/sql/sql_case.asp).
I assume you have 3 conditions:
negative values are always -1
values from 0 up to 14 keep the original value
15 and above is 15+
That means you have to cover these 3 cases with the CASE expression. Select the value from the first table and transform it into the wanted outcome.
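The three cases can be sketched as a single CASE expression. The example below runs it through Python's built-in `sqlite3` module so that it is self-contained. The table name `table1`, the column name `age`, and the alias `target_age` are assumptions, and the cast to text may need adjusting for your database, since the target column has to hold both numbers and the string 15+.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source table holding the sample ages from the question.
conn.execute("CREATE TABLE table1 (age INTEGER)")
conn.executemany(
    "INSERT INTO table1 (age) VALUES (?)",
    [(3,), (4,), (15,), (-2,), (-3,), (100,)],
)

query = """
SELECT age,
       CASE
           WHEN age < 0   THEN '-1'              -- any negative value
           WHEN age >= 15 THEN '15+'             -- 15 and above
           ELSE CAST(age AS TEXT)                -- 0 to 14 stay as they are
       END AS target_age
FROM table1
ORDER BY rowid
"""
result = conn.execute(query).fetchall()
for source, target in result:
    print(source, target)
```

The branches are evaluated top to bottom and the first match wins, so the ELSE branch only ever sees values from 0 to 14.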

How to compare columns with equal values?

I have a dataframe which looks as follows:
colA colB
0 2 1
1 4 2
2 3 7
3 8 5
4 7 2
I have two datasets: one with customer codes and other information, and the other with addresses plus the related customer code.
I merged the two and now I want to return the rows where the values in the two columns are the same, but I can't get it to work.
Can someone help me?
Thanks
You can try:
dfs = df.loc[df['colA'] == df['colB']]

How to sum values of two columns by an ID column, keeping some columns with repeated values and excluding others?

I need to summarize a large df by summing the values of some columns per ID column (the IDs are not sequential), keeping the columns whose values repeat within each ID and dropping the columns whose values differ within an ID. Below is a reproducible example and the output I need. I think there is a simple way to do this, but I am not very familiar with R.
df=read.table(textConnection("
ID spp effort generalist specialist
1 a 10 1 0
1 b 10 1 0
1 c 10 0 1
1 d 10 0 1
2 a 16 1 0
2 b 16 1 0
2 e 16 0 1
"), header = TRUE)
The output I need:
ID effort generalist specialist
1 10 2 2
2 16 2 1
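The requested logic can be sketched as a group-and-sum. This is plain Python rather than R, meant only to pin down the rule, and the `summarise` helper and its parameters are hypothetical. It assumes the columns to sum are named explicitly, and that any other column is kept only if it is constant within every ID.

```python
# The example data from the question, one dict per row.
records = [
    {"ID": 1, "spp": "a", "effort": 10, "generalist": 1, "specialist": 0},
    {"ID": 1, "spp": "b", "effort": 10, "generalist": 1, "specialist": 0},
    {"ID": 1, "spp": "c", "effort": 10, "generalist": 0, "specialist": 1},
    {"ID": 1, "spp": "d", "effort": 10, "generalist": 0, "specialist": 1},
    {"ID": 2, "spp": "a", "effort": 16, "generalist": 1, "specialist": 0},
    {"ID": 2, "spp": "b", "effort": 16, "generalist": 1, "specialist": 0},
    {"ID": 2, "spp": "e", "effort": 16, "generalist": 0, "specialist": 1},
]


def summarise(records, key, sum_cols):
    """Sum sum_cols per key; keep other columns only if constant per key."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key], []).append(rec)

    other_cols = [c for c in records[0] if c != key and c not in sum_cols]
    # A column survives only if it has a single distinct value in every group.
    keep_cols = [
        c for c in other_cols
        if all(len({r[c] for r in grp}) == 1 for grp in groups.values())
    ]

    out = []
    for group_key, grp in groups.items():
        row = {key: group_key}
        for c in keep_cols:
            row[c] = grp[0][c]
        for c in sum_cols:
            row[c] = sum(r[c] for r in grp)
        out.append(row)
    return out


summary = summarise(records, key="ID", sum_cols=["generalist", "specialist"])
for row in summary:
    print(row)
```

Here `spp` is dropped because it varies within an ID, while `effort` is kept because it repeats.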

Parsing Values and splitting based on conditions in DB2 tables

I have 2 tables.
The first table, DOTINCS, has a column called PARID with the following values:
PARID
1000150004
1152611254
2015620001
Now I have another table, DTINCS, with 5 columns (the three relevant ones are shown here):
BORO BLOCK LOT
------------------------
1 15 4
1 15261 1254
2 1562 1
I want to join these 2 tables. PARID in DOTINCS has 10 digits, and in the DTINCS table it is split into 3 columns with the leading zeros removed. BORO is 1 digit, BLOCK is 5 digits and LOT is 4 digits.
How do I parse PARID so that BORO takes the first digit, BLOCK takes the next 5 digits and LOT takes the last 4 digits, each converted to an integer so that leading zeros are dropped?
Thanks in advance.
I made them into strings because I could. If they are stored as numbers you can cast them to strings first.
with vals (col1) as (
values ('1000150004'),
('1152611254'),
('2015620001')
)
select int(left(col1, 1)) boro, -- first digit
int(substr(col1, 2, 5)) block, -- next 5 digits; int() drops leading zeros
int(right(col1, 4)) lot -- last 4 digits
from vals;
BORO BLOCK LOT
----------- ----------- -----------
1 15 4
1 15261 1254
2 1562 1
3 record(s) selected.
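As a cross-check on the 1/5/4 split described in the question, the same slicing can be sketched in Python. The `parse_parid` helper is a hypothetical name, and the sketch assumes PARID always arrives as exactly 10 digits.

```python
def parse_parid(parid):
    """Split a 10-digit PARID into (boro, block, lot) integers."""
    text = str(parid)
    if len(text) != 10 or not text.isdigit():
        raise ValueError(f"expected a 10-digit PARID, got {parid!r}")
    boro = int(text[0])       # 1 digit
    block = int(text[1:6])    # next 5 digits; int() drops leading zeros
    lot = int(text[6:10])     # last 4 digits
    return boro, block, lot


for parid in ("1000150004", "1152611254", "2015620001"):
    print(parid, parse_parid(parid))
```

The three sample values come out as (1, 15, 4), (1, 15261, 1254) and (2, 1562, 1), which matches the BORO, BLOCK and LOT rows shown for DTINCS in the question.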

In MSSQL filter rows based on an ID exists in a column as comma separated string

I have a Benchmarking table like this:
BMID TestID BMTitle ConnectedTestID
---------------------------------------------------
1 5 My BM1 0
2 6 My BM2 5
3 7 My BM3 5,6
4 8 My BM4 10,12,8
5 9 My BM5 0
6 10 My BM6 3,6
7 5 My BM7 8,3,12,9
8 3 My BM8 7,10
9 8 My BM9 0
10 12 My BM10 9
---------------------------------------------
To explain the table a little:
The TestID and ConnectedTestID columns are the ones that matter here. If the user wants all the benchmarks for TestID 3,
the query should return the rows where TestID = 3, plus any rows whose ConnectedTestID column contains that TestID among its comma-separated values.
That means if the user specifies 3 as the TestID, it should return:
---------------------------------------------
8 3 My BM8 7,10
7 5 My BM7 8,3,12,9
6 10 My BM6 3,6
--------------------------------------------
I hope it's clear how those 3 rows are returned. The first row is there because its TestID is 3. The other two rows are there because 3 appears in their ConnectedTestID cell.
You should fix the data structure. Storing numeric ids in a comma-delimited list is a bad, bad, bad idea:
SQL Server doesn't have the best string manipulation functions.
Storing numbers as character strings is a bad idea.
Having undeclared foreign key relationships is a bad idea.
The resulting queries cannot make use of indexes.
While you are exploring what a junction table is so you can fix the problem with the data structure, you can use a query such as this:
where testid = 3 or
',' + ConnectedTestID + ',' like '%,3,%'
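The delimiter-wrapping trick in that WHERE clause can be tried out on the sample data. The sketch below uses Python's built-in `sqlite3` module so that it is runnable on its own, which means string concatenation is written as `||` instead of SQL Server's `+`. The table name `Benchmarking` is an assumption, and the approach assumes the lists contain no spaces after the commas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Benchmarking ("
    "BMID INTEGER, TestID INTEGER, BMTitle TEXT, ConnectedTestID TEXT)"
)
conn.executemany(
    "INSERT INTO Benchmarking VALUES (?, ?, ?, ?)",
    [
        (1, 5, "My BM1", "0"),
        (2, 6, "My BM2", "5"),
        (3, 7, "My BM3", "5,6"),
        (4, 8, "My BM4", "10,12,8"),
        (5, 9, "My BM5", "0"),
        (6, 10, "My BM6", "3,6"),
        (7, 5, "My BM7", "8,3,12,9"),
        (8, 3, "My BM8", "7,10"),
        (9, 8, "My BM9", "0"),
        (10, 12, "My BM10", "9"),
    ],
)

# Wrapping both the list and the search value in commas means ',3,' can
# only match a whole element, never part of a longer id such as 13 or 30.
query = """
SELECT BMID
FROM Benchmarking
WHERE TestID = :test_id
   OR ',' || ConnectedTestID || ',' LIKE '%,' || :test_id || ',%'
ORDER BY BMID
"""
bmids = [row[0] for row in conn.execute(query, {"test_id": 3})]
print(bmids)
```

On the sample data this returns BMIDs 6, 7 and 8, the same three rows the question expects. Because the LIKE pattern starts with a wildcard, the query still has to scan every row, which is one more reason to move the ids into a junction table.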