update comma separated values in a column - sql

Before we begin: this is a bad situation and a bad design, but I am not able to fix the design, because the column is used by an application and database that my company and I don't own. I do not control the new or old values either.
I have a table similar to this:
Create Table #tblCRigrationTest
(
ReasonString VarChar(1000),
ReasonStringNew VarChar(1000)
)
With data like this:
Insert Into #tblCRigrationTest ( ReasonString, ReasonStringNew )
Values ('5016|5005|5006|5032|5020|5010|5007|5011|5012|5028|5024|5008|5029', '')
What I need to do is "loop" through each ID, replace it with its new value, concatenate the results into a new string, and store that string in the ReasonStringNew column. The new IDs appear in the second column below:
Old New
--------------
5005 1
5006 2
5020 3
5032 4
5010 5
5007 6
5011 7
5012 8
5028 9
5024 10
5008 11
5016 12
5009 13
5029 14
Any suggestions on how to do this?

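Just load the values from your column into a temp table (or table variable), then run your update against that:
-- Build one "SELECT 'value'" per delimited value, joined with UNION ALL.
-- Assumes a single source row and trusted data (the string is executed as SQL).
DECLARE @STRSQL VARCHAR(MAX);
SELECT @STRSQL = 'SELECT ''' + REPLACE(ReasonString, '|', ''' UNION ALL SELECT ''') + ''''
FROM #tblCRigrationTest;
DECLARE @tbl TABLE
(
col1 VARCHAR(100)
);
INSERT INTO @tbl
EXECUTE (@STRSQL);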
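The split, map, and re-concatenate idea can also be sketched without dynamic SQL. This is only a sketch under stated assumptions: it needs SQL Server 2017 or later (STRING_AGG, plus OPENJSON, which requires compatibility level 130+), it assumes every ID is numeric, and the mapping table #IdMap with its columns OldReasonId and NewReasonId is a hypothetical name introduced here. OPENJSON is used for the split because its [key] column carries the position of each element, so the original order can be kept.

```sql
CREATE TABLE #tblCRigrationTest (ReasonString VARCHAR(1000), ReasonStringNew VARCHAR(1000));
INSERT INTO #tblCRigrationTest (ReasonString, ReasonStringNew)
VALUES ('5016|5005|5006|5032|5020|5010|5007|5011|5012|5028|5024|5008|5029', '');

-- Hypothetical mapping table holding the Old/New pairs from the question.
CREATE TABLE #IdMap (OldReasonId INT PRIMARY KEY, NewReasonId INT NOT NULL);
INSERT INTO #IdMap (OldReasonId, NewReasonId)
VALUES (5005, 1), (5006, 2), (5020, 3), (5032, 4), (5010, 5), (5007, 6), (5011, 7),
       (5012, 8), (5028, 9), (5024, 10), (5008, 11), (5016, 12), (5009, 13), (5029, 14);

UPDATE t
SET ReasonStringNew = x.NewString
FROM #tblCRigrationTest AS t
CROSS APPLY (
    -- Turn '5016|5005|...' into the JSON array [5016,5005,...], map each element,
    -- and glue the new IDs back together in their original position order.
    SELECT STRING_AGG(CAST(m.NewReasonId AS VARCHAR(10)), '|')
               WITHIN GROUP (ORDER BY CAST(j.[key] AS INT)) AS NewString
    FROM OPENJSON('[' + REPLACE(t.ReasonString, '|', ',') + ']') AS j
    INNER JOIN #IdMap AS m
        ON m.OldReasonId = CAST(j.[value] AS INT)
) AS x;

SELECT ReasonString, ReasonStringNew FROM #tblCRigrationTest;
-- ReasonStringNew: 12|1|2|4|3|5|6|7|8|9|10|11|14

DROP TABLE #IdMap;
DROP TABLE #tblCRigrationTest;
```

Because of the inner join, any ID that has no row in #IdMap is silently dropped from the new string; switch to a LEFT JOIN with a fallback value if those IDs should be kept.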

Related

SQL: Deleting Identical Columns With Different Names

My original table ("original_table") looks like this (contains both numeric and character variables):
age height height2 gender gender2
1 18 76.1 76.1 M M
2 19 77.0 77.0 F F
3 20 78.1 78.1 M M
4 21 78.2 78.2 M M
5 22 78.8 78.8 F F
6 23 79.7 79.7 F F
I would like to remove columns from this table that have identical entries, but are named differently. In the end, this should look like this ("new_table"):
age height gender
1 18 76.1 M
2 19 77.0 F
3 20 78.1 M
4 21 78.2 M
5 22 78.8 F
6 23 79.7 F
My Question: Is there a standard way to do this in SQL? I tried to do some research and came across the following link : How do I compare two columns for equality in SQL Server?
What I Tried So Far: It seems that something like this might work:
CREATE TABLE new_table AS SELECT * FROM original_table;
ALTER TABLE new_table
ADD does_age_equal_height varchar(255);
UPDATE new_table
SET does_age_equal_height = CASE
WHEN age = height THEN '1' ELSE '0' END;
From here, if the sum of all values in the "does_age_equal_height" column equals the number of rows in "new_table" (i.e. select count(rownum) from new_table), both columns must be equal, and one of them can be dropped.
However, this is a very inefficient method, even for tables with a small number of columns. In my example I have 5 columns, which means I would have to repeat the above process "5C2" times, i.e. 5! / (2! * 3!) = 10 times. For example:
ALTER TABLE employees
ADD does_age_equal_height varchar(255),
does_age_equal_height2 varchar(255),
does_age_equal_gender varchar(255),
does_age_equal_gender2 varchar(255),
does_height_equal_height2 varchar(255),
does_height_equal_gender varchar(255),
does_height_equal_gender2 varchar(255),
does_height2_equal_gender varchar(255),
does_height2_equal_gender2 varchar(255),
does_gender_equal_gender2 varchar(255);
This would then be followed by multiple CASE statements - further complicating the process.
Can someone please show me a more efficient way of doing this?
Thanks!
I hope I have understood your problem correctly. This is my SQL Server code for it; you will need to adapt it to Netezza SQL.
My idea is:
Calculate an MD5 hash for each column and compare the hashes; if two columns have the same hash, keep only one of them.
I am going to create the table below for this problem:
CREATE TABLE Students
(
Id INT PRIMARY KEY IDENTITY,
StudentName VARCHAR (50),
Course VARCHAR (50),
Score INT,
lastName VARCHAR (50), -- another alias for StudentName
metric INT, -- another alias for score
className VARCHAR(50) -- another alias for Course
)
GO
INSERT INTO Students VALUES ('Sally', 'English', 95, 'Sally', 95, 'English');
INSERT INTO Students VALUES ('Sally', 'History', 82, 'Sally', 82, 'History');
INSERT INTO Students VALUES ('Edward', 'English', 45, 'Edward', 45, 'English');
INSERT INTO Students VALUES ('Edward', 'History', 78, 'Edward', 78, 'History');
After creating the table and inserting the sample records, the next task is to find the duplicate columns.
step 1. Declare variables.
DECLARE @cols_q VARCHAR(max),
@cols VARCHAR(max),
@table_name VARCHAR(max)= N'Students',
@res NVARCHAR(max),
@newCols VARCHAR(max),
@finalResQuery VARCHAR(max);
step 2. Generate a dynamic query that calculates a hash for every column.
SELECT @cols_q = COALESCE(@cols_q+ ', ','')+'HASHBYTES(''MD5'', CONVERT(varbinary(max), (select '+ COLUMN_NAME +' as t from Students FOR XML AUTO))) as '+ COLUMN_NAME,
@cols = coalesce(@cols + ',','')+COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @table_name;
set @cols_q = 'select '+ @cols_q +' into ##tmp_'+ @table_name+' from '+ @table_name;
step 3. Run the generated query.
exec(@cols_q)
step 4. Get the column names with the duplicated columns removed.
set @res = N'select uniq_colname into ##temp_colnames
from(
select max(colname) as uniq_colname from (
select * from ##tmp_Students
)tt
unpivot (
md5_hash for colname in ( '+ @cols +')
) as tbl
group by md5_hash
)tr';
exec ( @res);
step 5. Get the final results.
select @newCols = COALESCE(@newCols+ ', ','')+ uniq_colname from ##temp_colnames
set @finalResQuery = 'select '+ @newCols +' from '+ @table_name;
exec (@finalResQuery)
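If you want to double-check one suspected pair directly instead of trusting a hash match, a plain comparison also works. The sketch below is a different technique from the hashing above, not part of it: it uses the INTERSECT idiom, which treats two NULLs as equal, so columns that are NULL in the same rows still count as identical. The table #pair_demo and its data are made up for illustration.

```sql
-- Hypothetical demo table; names and values are illustrative only.
CREATE TABLE #pair_demo (height DECIMAL(5,1) NULL, height2 DECIMAL(5,1) NULL, age INT NULL);
INSERT INTO #pair_demo (height, height2, age)
VALUES (76.1, 76.1, 18), (77.0, 77.0, 19), (NULL, NULL, 20);

-- A row differs only when (SELECT height INTERSECT SELECT height2) is empty.
-- If no row differs, the two columns are identical.
SELECT CASE WHEN EXISTS (
           SELECT 1
           FROM #pair_demo
           WHERE NOT EXISTS (SELECT height INTERSECT SELECT height2)
       ) THEN 0 ELSE 1 END AS height_equals_height2;   -- 1 for this sample data

DROP TABLE #pair_demo;
```

This still has to be run once per pair, so it is better suited to confirming the pairs that the hash comparison flagged than to replacing it.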

Convert CSV stored in a string variable to table

I've got CSV data stored in a string variable in SQL:
@csvContent =
'date;id;name;position;street;city
19.03.2019 10:06:00;1;Max;President;Langestr. 35;Berlin
19.04.2019 12:36:00;2;Bernd;Vice President;Haupstr. 40;Münster
21.06.2019 14:30:00;3;Franziska;financial;Hofstr. 19;Frankfurt'
What I want to do is to convert it to a #table, so it would look like
SELECT * FROM #table
date                 id  name       position        street        city
---------------------------------------------------------------------------
19.03.2019 10:06:00  1   Max        President       Langestr. 35  Berlin
19.04.2019 12:36:00  2   Bernd      Vice President  Haupstr. 40   Münster
21.06.2019 14:30:00  3   Franziska  financial       Hofstr. 19    Frankfurt
The headers aren't fixed, so the CSV could have more or fewer columns with different header names.
I've tried it with STRING_SPLIT() and PIVOT but didn't find a solution.
If you are using SQL Server, this might be a solution for your request:
How to split a comma-separated value to columns
Hope it will help you
CREATE TABLE #temp(
date date,
id int,
name varchar(100),
-- ... add the remaining columns you need
)
-- @CSVFILE is assumed to be declared elsewhere and to hold the path of the CSV file
DECLARE @sql NVARCHAR(4000) = 'BULK INSERT #temp
FROM ''' + @CSVFILE + ''' WITH
(
FIELDTERMINATOR ='';'',
ROWTERMINATOR =''\n'',
FIRSTROW = 2
)';
EXEC(@sql);
SELECT * FROM #temp
DROP TABLE #temp

How can I update strings within an SQL server table based on a query?

I have two tables A and B. A has an Id and a string with some embedded information for some text and ids from a table C that is not shown
Aid| AString
1 "<thing_5"><thing_6">"
2 "<thing_5"><thing_6">"
Bid|Cid|Aid
1 5 1
2 6 1
3 5 2
4 6 2
I realise this is an insane structure, but that is life.
I need to update the strings within A so that instead of the Cid they contain the corresponding Bid (related by the Aid and Bid pairing).
Is this even something I should be thinking of doing in SQL? A has about 300 entries and B about 1200, so it is not something to do by hand.
For clarity, I want B to remain the same and A to finally look like this:
Aid| AString
1 "<thing_1"><thing_2">"
2 "<thing_3"><thing_4">"
This script generates dynamic SQL statements to update the table, then executes them.
It takes into account that the cids sit between thing_ and ":
First it marks each cid with a placeholder ($$$$$$ in this case), because cids and bids may overlap (for example, changing 3->2 and later 2->1).
Then it changes the placeholders to the proper bid.
CREATE TABLE #a(aid INT,astr VARCHAR(MAX));
INSERT INTO #a(aid,astr)VALUES(1,'<thing_5"><thing_6">'),(2,'<thing_5"><thing_6">');
CREATE TABLE #rep(aid INT,bid INT,cid INT);
INSERT INTO #rep(bid,cid,aid)VALUES(5,6,1),(6,5,1),(3,5,2),(4,6,2);
-- Pass 1: prefix every cid with the placeholder
DECLARE @cmd NVARCHAR(MAX)=(
SELECT
'UPDATE #a '+
'SET astr=REPLACE(astr,''thing_'+CAST(r.cid AS VARCHAR(16))+'"'',''thing_$$$$$$'+CAST(r.cid AS VARCHAR(16))+'"'') '+
'WHERE aid='+CAST(a.aid AS VARCHAR(16))+';'
FROM
(SELECT DISTINCT aid FROM #a AS a) AS a
INNER JOIN #rep AS r ON
r.aid=a.aid
FOR
XML PATH('')
);
EXEC sp_executesql @cmd;
-- Pass 2: swap each placeholder-prefixed cid for its bid
SET @cmd=(
SELECT
'UPDATE #a '+
'SET astr=REPLACE(astr,''thing_$$$$$$'+CAST(r.cid AS VARCHAR(16))+'"'',''thing_'+CAST(r.bid AS VARCHAR(16))+'"'') '+
'WHERE aid='+CAST(a.aid AS VARCHAR(16))+';'
FROM
(SELECT DISTINCT aid FROM #a AS a) AS a
INNER JOIN #rep AS r ON
r.aid=a.aid
FOR
XML PATH('')
);
EXEC sp_executesql @cmd;
SELECT * FROM #a;
DROP TABLE #rep;
DROP TABLE #a;
Result is:
+-----+----------------------+
| aid | astr |
+-----+----------------------+
| 1 | <thing_6"><thing_5"> |
| 2 | <thing_3"><thing_4"> |
+-----+----------------------+
You could do this with SQL with something like below. It wasn't clear to me how c was related, but you can adjust it as necessary...
create table a (
Aid int null,
AString varchar(25) null)
insert into a values(1,'"<thing_5"><thing_6">"')
insert into a values(2,'"<thing_5"><thing_6">"')
create table b (
Aid int null,
Bid int null,
Cid int null)
insert into b values(1,1,5)
insert into b values(1,2,6)
insert into b values(2,3,5)
insert into b values(2,4,6)
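-- Note: varchar(1) only fits single-digit ids, and when a row of A matches
-- several rows of B, one execution applies only one of the replacements to it.
UPDATE Ax
SET Ax.AString = REPLACE(Ax.AString, 'thing_' + cast(Cid as varchar(1)),'thing_' + cast(Bid as varchar(1)))
FROM A Ax
INNER JOIN B Bx
on Ax.Aid=Bx.Aid
and Ax.AString like '%thing_' + cast(Cid as varchar(1)) + '%'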
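Since each row of A needs several replacements, one way to apply all of them without dynamic SQL is to walk the mapping rows one at a time. The following is only a sketch: the demo tables #a_demo and #b_demo are stand-ins for A and B, and it assumes the original strings never contain 'thing_$', because '$' is used as a temporary marker so that a freshly written Bid cannot be mistaken for a Cid by a later mapping row.

```sql
-- Demo copies of the tables; names and sizes are illustrative.
CREATE TABLE #a_demo (Aid INT, AString VARCHAR(100));
INSERT INTO #a_demo VALUES (1, '<thing_5"><thing_6">'), (2, '<thing_5"><thing_6">');
CREATE TABLE #b_demo (Bid INT, Cid INT, Aid INT);
INSERT INTO #b_demo VALUES (1, 5, 1), (2, 6, 1), (3, 5, 2), (4, 6, 2);

DECLARE @Aid INT, @Bid INT, @Cid INT;
DECLARE map_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT Aid, Bid, Cid FROM #b_demo;
OPEN map_cursor;
FETCH NEXT FROM map_cursor INTO @Aid, @Bid, @Cid;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Write the new id behind a '$' marker so a later mapping cannot re-match it.
    UPDATE #a_demo
    SET AString = REPLACE(AString,
                          'thing_' + CAST(@Cid AS VARCHAR(16)) + '"',
                          'thing_$' + CAST(@Bid AS VARCHAR(16)) + '"')
    WHERE Aid = @Aid;
    FETCH NEXT FROM map_cursor INTO @Aid, @Bid, @Cid;
END
CLOSE map_cursor;
DEALLOCATE map_cursor;

-- Remove the markers once every mapping has been applied.
UPDATE #a_demo SET AString = REPLACE(AString, 'thing_$', 'thing_');

SELECT Aid, AString FROM #a_demo ORDER BY Aid;
-- 1  <thing_1"><thing_2">
-- 2  <thing_3"><thing_4">

DROP TABLE #a_demo;
DROP TABLE #b_demo;
```

With roughly 1200 mapping rows a cursor is slow but perfectly workable for a one-off fix; the trailing quote in the search string keeps thing_5" from matching inside thing_50".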

TSQL Select all columns without first n from function/procedure

I know this may sound silly, but I would like to create a function that will process data from tables of different sizes.
Let's say I have a first table like so:
ID IRR M0 M1
----------------------
1 0 -10 5
2 0 -20 10
3 0 -100 100
4 0 -10 0
And second table like so:
ID IRR M0 M1 M2
----------------------------
1 0 -10 5 60
2 0 -20 10 0
3 0 -100 100 400
4 0 -10 0 10
I would like to create a function that can process data from both tables.
I know that the first column contains the ID and the second the IRR; the remaining columns hold the cash flow for a specific month.
The function should process all columns except the first 2 and store the result in the second column.
I know that I can get all columns of a specific table with:
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.columns
WHERE TABLE_NAME = 'First_Table'
Problems begin when I try to create a function that returns those columns as rows.
I can create a function like so:
CREATE FUNCTION UnpivotRow (@TableName varchar(50), @FromWhichColumn int, @Row int)
RETURNS @values TABLE
(
id INT IDENTITY(0, 1),
value DECIMAL(30, 10)
)
...
But how should this function look?
I think the ideal table for this kind of processing would look like this:
ProjectID TimePeriod Value
--------------------------------
1 0 -10
1 1 5
2 0 -20
2 1 10
3 0 -100
3 1 100
4 0 -10
4 1 0
I need to unpivot the whole table without knowing the number of columns.
EDIT:
If this can't be done inside function, then maybe inside a procedure?
This can be done using dynamic SQL to perform the UNPIVOT:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX),
@colTP as NVARCHAR(MAX)
-- Build a comma-separated, quoted list of the M* columns
select @cols = stuff((select ','+quotename(C.name)
from sys.columns as C
where C.object_id = object_id('table1') and
C.name like 'M%'
for xml path('')), 1, 1, '')
set @query = 'SELECT id,
replace(timeperiod, ''M'', '''') timeperiod,
value
from table1
unpivot
(
value
for timeperiod in (' + @cols + ')
) u '
exec(@query)
See SQL Fiddle with Demo.
This solution would have to be placed in a stored procedure, because a user-defined function cannot execute dynamic SQL.

get specific rows of table given a rule SQL Server 2008

I have a table like:
ID NAME VAL
----------------------
1 a1*a1 90052
2 a1*a2 236
3 a1*a3 56
4 a1*a4 6072
5 a1*a5 1004
6 a2*a2 4576
7 a2*a3 724
8 a2*a4 230
9 a2*a5 679
10 a3*a3 5
11 a3*a4 644
12 a3*a5 23423
13 a4*a4 42354
14 a4*a5 10199
15 a5*a5 10279
Given a number S = 5, I want to query the rows with ids 1, 6, 10, 13 and 15,
i.e. a1*a1, a2*a2, a3*a3, a4*a4 and a5*a5.
I would like something like:
INSERT INTO #NEW_TABLE (ID, NAME, Value)
SELECT ordinal, NAME, VAL FROM myTable WHERE id IN (1,6,10,13,15)
to get
ID NAME VAL
----------------------
1 a1*a1 90052
2 a2*a2 4576
3 a3*a3 5
4 a4*a4 42354
5 a5*a5 10279
Is there a way to do this for any given S, maybe with dynamic SQL?
I worked out the formula and got this:
S=5
ID formula
1 1
6 1+S
10 1+S+ (S-1)
13 1+S+ (S-1) + (S-2)
15 1+S+ (S-1) + (S-2) + (S-3)
Is there a way to do this inside a case or a while loop?
This worked in testing.
You can just inner join on @Tab to limit your results. You probably also want to add some traps for values below 3, which I haven't done.
The basic process is:
Declare your @S value
Insert the first two rows, since they will always be the same
In a loop, insert one row at a time with an incrementing difference
The loop exits once it has run @S - 2 times
Try:
DECLARE @Tab Table (id INT)
DECLARE @S int = 5,
@ct int
DECLARE @cur int = (1 + @S)
INSERT INTO @Tab SELECT 1
INSERT INTO @Tab SELECT (1 + @S)
SET @ct = 1
WHILE @ct <= @S - 2
BEGIN
SET @cur = @cur + (@S - @ct)
INSERT INTO @Tab SELECT @cur
SET @ct = @ct + 1
END
SELECT * FROM @Tab
ORDER BY id
To use this in your query, you can do either:
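-- ROW_NUMBER() supplies the new 1..S numbering (myTable as shown has no ordinal column)
SELECT ROW_NUMBER() OVER (ORDER BY id) AS ordinal, NAME, VAL
FROM myTable
WHERE id IN (SELECT id FROM @Tab)
-- OR
SELECT ROW_NUMBER() OVER (ORDER BY t.id) AS ordinal, NAME, VAL
FROM myTable t
INNER JOIN @Tab t2
ON t2.id = t.id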
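The loop can also be replaced by a closed form. Summing the formula from the question, the k-th wanted id (k = 1..S) is 1 + (k-1)*S - (k-1)*(k-2)/2, so the ids can be generated in one set-based query. This is a sketch, not tested against your real table: it borrows sys.all_objects purely as a row source for the numbers 1..S, which assumes S is smaller than the number of rows in that view.

```sql
DECLARE @S INT = 5;

WITH numbered AS (
    -- Any sufficiently large row source works; only the row numbers are used.
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects
)
SELECT n AS ordinal,
       -- (n - 1) * (n - 2) is always even, so the integer division is exact.
       1 + (n - 1) * @S - (n - 1) * (n - 2) / 2 AS id
FROM numbered
WHERE n <= @S
ORDER BY n;
-- For @S = 5 this returns (1,1), (2,6), (3,10), (4,13), (5,15)
```

Joining this result to myTable on id gives the diagonal rows together with their new 1..S numbering, without a WHILE loop or dynamic SQL.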