SQL Server using the REPLACE function to replace values in a column spanning multiple rows

I have a table like this:
Rule   | Mask | Replacement
----------------------------
# # 12 | #    | [^0-9]
# # 12 | #    | [0-9]
which I constructed by joining these two tables:
Table 1
Mask_ID | Mask | Replacement
-----------------------------
1       | #    | [^0-9]
2       | #    | [0-9]
Table 2
Rule_ID | Rule
----------------
1       | # # 12
The result I want is this:
Rule   | Expression
--------------------------
# # 12 | [^0-9] [0-9] 12
I've been trying to use the REPLACE function to do this, but I've only been able to generate this result:
Rule   | Expression
--------------------------
# # 12 | [^0-9] # 12
# # 12 | # [0-9] 12
I'm not sure how to get the REPLACE function to apply multiple mask rows to a single rule row. If anyone has any suggestions, I would appreciate it.
This is what I have so far, but it's giving me the result I mentioned above:
SELECT
    A.PointMask_CODE
    , B.PointMasking_Rule_CODE
    , B.Mask
    , B.Escape_Character
    , B.EscapedMaskRule
    , REPLACE(A.PointMask_CODE, B.Mask, B.EscapedMaskRule)
FROM tblStatusPointMasks_CORE A
LEFT JOIN vwAORs_Status_PointMasks_EscapedRules B
    ON A.PointMask_CODE LIKE '%' + B.EscapedMask + '%' ESCAPE ISNULL(B.Escape_Character, '\')

For your given sample data, you could use a recursive common table expression (cte).
create table masks (mask_id int, mask varchar(32), replacement varchar(32));
insert into masks values
  (1, '#', '[^0-9]')
, (2, '#', '[0-9]');

create table rules (rule_id int, rule_txt varchar(32));
insert into rules values
  (1, '# # 12');

with cte as (
  select
      r.rule_id
    , r.rule_txt
    , masks = 0
  from rules r
  union all
  select
      r.rule_id
    -- stuff() swaps out only the first remaining mask occurrence,
    -- so each mask row gets applied to a different '#'
    , rule_txt = convert(varchar(32),
        stuff(r.rule_txt, charindex(m.mask, r.rule_txt), len(m.mask), m.replacement))
    , masks = r.masks + 1
  from masks m
  inner join cte r
    on r.rule_txt like '%' + m.mask + '%'
   and m.mask_id = r.masks + 1  -- apply the masks in mask_id order
)
select top 1 *
from cte
order by masks desc;
rextester demo: http://rextester.com/KAV58392
returns:
+---------+-----------------+-------+
| rule_id | rule_txt | masks |
+---------+-----------------+-------+
| 1 | [^0-9] [0-9] 12 | 2 |
+---------+-----------------+-------+
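If there were more than one rule row, select top 1 would keep only a single row overall. A sketch (my variation, not part of the rextester demo) that instead keeps the most-replaced row per rule, using the same tables as above:

with cte as (
  select r.rule_id, r.rule_txt, masks = 0
  from rules r
  union all
  select r.rule_id
    , convert(varchar(32), stuff(r.rule_txt, charindex(m.mask, r.rule_txt), len(m.mask), m.replacement))
    , r.masks + 1
  from masks m
  inner join cte r
    on r.rule_txt like '%' + m.mask + '%'
   and m.mask_id = r.masks + 1
)
, ranked as (
  -- rank each rule's rows by how many masks were applied
  select *, rn = row_number() over (partition by rule_id order by masks desc)
  from cte
)
select rule_id, rule_txt
from ranked
where rn = 1;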

One more option, using string operations...
create table masks (mask_id int, mask varchar(32), replacement varchar(32));
insert into masks values
  (1, '#', '[^0-9]')
, (2, '#', '[0-9]');

create table rules (rule_id int, rule_txt varchar(32));
insert into rules values
  (1, '# # # # 12');

declare @Table table (charval varchar(10));
declare @char varchar(10), @rule_txt varchar(50);

select @rule_txt = rule_txt from rules;

-- walk the rule left to right, splitting on spaces
while charindex(' ', @rule_txt) > 0
begin
    set @char = substring(@rule_txt, 1, charindex(' ', @rule_txt) - 1);
    insert into @Table values (@char);
    set @rule_txt = right(@rule_txt, len(@rule_txt) - charindex(' ', @rule_txt));
end
insert into @Table values (@rule_txt);  -- the last token

-- reassemble, swapping each token for its replacement where one exists
select stuff((select ' ' + isnull(M.replacement, T.charval)
              from @Table T
              left join masks M on M.mask = T.charval
              for xml path('')), 1, 1, '');

drop table rules;
drop table masks;
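One caveat worth noting (my addition, not part of the answer): a table variable has no inherent order, so the FOR XML PATH concatenation above is not guaranteed to walk the tokens in insertion order. A minimal sketch that makes the order explicit with an identity column:

declare @Table table (seq int identity(1, 1), charval varchar(10));
-- populate @Table exactly as in the loop above, then:
select stuff((select ' ' + isnull(M.replacement, T.charval)
              from @Table T
              left join masks M on M.mask = T.charval
              order by T.seq  -- deterministic token order
              for xml path('')), 1, 1, '');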


Adding hyphen in a column in sql table

I need help understanding how to add a hyphen to a column where the values are as follows:
8601881, 9700800, 2170
The hyphen is supposed to go just before the last digit. There are multiple such values in the column, and the numbers can be 5, 6, or more digits long, but the hyphen always has to be before the last digit.
Any help is greatly appreciated.
The expected output should be as follows:
860188-1, 970080-0, 217-0
select concat(substring(value, 1, len(value)-1), '-', substring(value, len(value), 1)) from data;
Here is the full example:
create table data(value varchar(100));
insert into data values('6789567');
insert into data values('98765434');
insert into data values('1234567');
insert into data values('876545');
insert into data values('342365');
select concat(substring(value, 1, len(value)-1), '-', substring(value, len(value), 1)) from data;
| (No column name) |
| :--------------- |
| 678956-7 |
| 9876543-4 |
| 123456-7 |
| 87654-5 |
| 34236-5 |
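The same insertion can also be written with STUFF, which splices the hyphen in without taking the string apart (a variation, not the answerer's code):

select stuff(value, len(value), 0, '-') from data;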
In case the OP meant there can be multiple numbers in the column value, here is a solution:
create table data1(value varchar(100));
insert into data1 values('6789567,5467474,846364');
insert into data1 values('98765434,6474644,76866,68696');
insert into data1 values('1234567,35637373');
select t.value,
       string_agg(concat(substring(token.value, 1, len(token.value) - 1), '-',
                         substring(token.value, len(token.value), 1)), ',') as result
from data1 t
cross apply string_split(t.value, ',') as token
group by t.value;
value | result
:--------------------------- | :-------------------------------
1234567,35637373 | 123456-7,3563737-3
6789567,5467474,846364 | 678956-7,546747-4,84636-4
98765434,6474644,76866,68696 | 9876543-4,647464-4,7686-6,6869-6
Using SQL Server 2017, you can leverage STRING_SPLIT, STUFF, and STRING_AGG to handle this fairly easily.
DECLARE @T TABLE (val VARCHAR(100));

INSERT INTO @T (val) VALUES ('8601881,9700800,2170');

SELECT t.val,
       STRING_AGG(STUFF(ss.value, LEN(ss.value), 0, '-'), ',') AS Parsed
FROM @T AS t
CROSS APPLY STRING_SPLIT(t.val, ',') AS ss
GROUP BY t.val;
Returns
8601881,9700800,2170 => 860188-1,970080-0,217-0
STRING_SPLIT breaks the string into individual values, STUFF inserts the hyphen into each individual value, and STRING_AGG combines them back into a single row per original value.
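One caveat (my note, not the answerer's): STRING_SPLIT does not guarantee the order of its output rows, so STRING_AGG may reassemble the pieces in a different order than the input. On SQL Server 2022 and later you should be able to pass the enable_ordinal argument and sort on it; a sketch under that assumption:

DECLARE @T TABLE (val VARCHAR(100));
INSERT INTO @T (val) VALUES ('8601881,9700800,2170');

SELECT t.val,
       STRING_AGG(STUFF(ss.value, LEN(ss.value), 0, '-'), ',')
           WITHIN GROUP (ORDER BY ss.ordinal) AS Parsed
FROM @T AS t
CROSS APPLY STRING_SPLIT(t.val, ',', 1) AS ss  -- 1 = enable_ordinal
GROUP BY t.val;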
You can use LEN with LEFT/RIGHT to get your desired output. The logic is given below.
Note: this will work for a value of any length.
DECLARE @T VARCHAR(MAX) = '8601881';
SELECT LEFT(@T, LEN(@T) - 1) + '-' + RIGHT(@T, 1);
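Applied to a whole column rather than a single variable, the same expression would be (my sketch, assuming the data table from the first answer):

SELECT LEFT(value, LEN(value) - 1) + '-' + RIGHT(value, 1) FROM data;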
If you have a dash/hyphen in your data and you have to store it as varchar or nvarchar, just prepend N to the literal.
For example:
insert into users(id, studentId) VALUES (6, N'12345-1001-67890');

Create one json per one table row

I would like to create JSONs from the data in the table.
The table looks like this:
|code |
+------+
|D5ABX0|
|MKT536|
|WAEX44|
I am using FOR JSON PATH, which is nice:
SELECT [code]
FROM feature
FOR JSON PATH
but the return value of this query is three concatenated JSONs in one row:
|JSON_F52E2B61-18A1-11d1-B105-00805F49916B |
+----------------------------------------------------------+
1 |[{"code":"D5ABX0"},{"code":"MKT536"},{"code":"WAEX44"}]|
I need each row to be a separate JSON, like this:
|JSON_return |
+---------------------+
1 |{"code":"D5ABX0"} |
2 |{"code":"MKT536"} |
3 |{"code":"WAEX44"} |
I was trying to use a splitting function (CROSS APPLY), which needs a separator as a parameter, but this is not a robust solution: the JSON could be more expanded or branched, and the split could then cut not between whole JSONs but inside a nested one:
;WITH split AS (
    SELECT [json] = (SELECT code FROM feature FOR JSON PATH)
)
SELECT T.StringElement
FROM split S
CROSS APPLY dbo.fnSplitDelimitedList([json], '},{') T
The output is:
|StringElement |
+---------------------+
1 |[{"code":"D5ABX0" |
2 |"code":"MKT536" |
3 |"code":"WAEX44"}] |
Is there a way to force SQL Server to create one JSON per row?
You'll need to use a subquery to achieve this; FOR JSON will create a single JSON string for the entire returned dataset. This should get you what you're after:
CREATE TABLE #Sample (code varchar(6));
INSERT INTO #Sample
VALUES ('D5ABX0'),
('MKT536'),
('WAEX44');
SELECT (SELECT Code
FROM #Sample sq
WHERE sq.code = S.code
FOR JSON PATH)
FROM #Sample S;
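Note that each subquery still returns its object wrapped in [ and ]. On SQL Server 2016 and later you should be able to add WITHOUT_ARRAY_WRAPPER to drop the brackets and match the expected output exactly (a variation on the query above, my addition):

SELECT (SELECT sq.code
        FROM #Sample sq
        WHERE sq.code = S.code
        FOR JSON PATH, WITHOUT_ARRAY_WRAPPER) AS JSON_return
FROM #Sample S;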
DROP TABLE #Sample;
Alternatively, you can build the JSON string by hand, though this only holds up for a flat, single-property shape:
CREATE TABLE #Temp
(
    ID INT IDENTITY(1, 1),
    StringValue NVARCHAR(100)
);

INSERT INTO #Temp (StringValue)
VALUES (N'D5ABX0'),
       (N'MKT536'),
       (N'WAEX44');

SELECT ID, '{"code":"' + StringValue + '"}' AS JSON_return
FROM #Temp;

DROP TABLE #Temp;

SQL Server : using a LEN or variable in where clause that contains a join

I have created a map table to find various unique strings within a large list of unique hostnames.
The initial code works if I hard-code the various lengths, i.e. varchar(2), varchar(11), etc. Trying to reference the variable lengths is where my issues began.
I tried several different combinations before attempting to use a variable, for example substituting m.[HostNameAlias_IDLength] for the varchar(2) in the WHERE clause.
I am also having difficulty using variables.
Any thoughts would be much appreciated.
TM
P.S. A listing of the code and sample tables are listed below.
Table1
HostNameAlias_id (pk, varchar(5), not null)
ProjectName_ID (int, not null)
HostnameAlias_IDLength (computed, int, null)
Data
HostNameAlias_ID ProjectName_ID HostNameAlias_IDLength
----------------------------------------------------------
H123456789023456 16009 16
B123456789023 16005 13
C1234567890 16009 11
d12345678 16009 9
e123456 16009 8
f12345 16003 6
g1234 16035 5
h123 16035 4
j12 16005 3
k1 16007 2
Table2
[host name] (pk, nvarchar(50), not null)
Projectname_id (int, not null)
Sample data:
Host name Title projectname_ID
--------------------------------------------------
C1234567890a1 vp 16009
C1234567890a2 avp 16009
h12335 student 16009
h12356 teacher 16009
h12357 prof 16009
Query
DECLARE @len INT;
DECLARE @slen VARCHAR(2);

SELECT DISTINCT
    @len = m.[HostNameAlias_IDLength],
    @slen = CONVERT(varchar(2), m.[HostNameAlias_ID]),
    c.[Host Name],
    m.[projectname_id]
FROM [table1] c
JOIN [table2] m ON c.[projectname_id] = m.[projectname_id]
WHERE CONVERT(varchar(2), [Host Name]) IN (SELECT [HostNameAlias_ID] FROM [table2])
The length of a result cannot be known in the WHERE clause used to discover that length, so I fail to see why you are attempting this. In addition, the column [Host Name] is a varchar(16), so you could encounter up to 16 characters; just use that maximum ... if the conversion is needed at all.
Below I have just used LIKE instead of IN, perhaps that will assist.
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Table1
([HostNameAlias_ID] varchar(16), [ProjectName_ID] int, [HostNameAlias_IDLength] int)
;
INSERT INTO Table1
([HostNameAlias_ID], [ProjectName_ID], [HostNameAlias_IDLength])
VALUES
('H123456789023456', 16009, 16),
('B123456789023', 16005, 13),
('C1234567890', 16009, 11),
('d12345678', 16009, 9),
('e123456', 16009, 8),
('f12345', 16003, 6),
('g1234', 16035, 5),
('h123', 16035, 4),
('j12', 16005, 3),
('k1', 16007, 2)
;
CREATE TABLE Table2
([HostName] varchar(13), [Title] varchar(7), [projectname_ID] int)
;
INSERT INTO Table2
([HostName], [Title], [projectname_ID])
VALUES
('C1234567890a1', 'vp', 16009),
('C1234567890a2', 'avp', 16009),
('h12335', 'student', 16009),
('h12356', 'teacher', 16009),
('h12357', 'prof', 16009)
;
Query 1:
SELECT
m.[HostName]
, c.[HostNameAlias_ID]
, m.[projectname_id]
, c.[HostNameAlias_IDLength]
FROM [table1] c
JOIN [table2] m ON c.[projectname_id] = m.[projectname_id]
WHERE [HostName] LIKE ([HostNameAlias_ID] + '%')
Results:
| HostName | HostNameAlias_ID | projectname_id | HostNameAlias_IDLength |
|---------------|------------------|----------------|------------------------|
| C1234567890a1 | C1234567890 | 16009 | 11 |
| C1234567890a2 | C1234567890 | 16009 | 11 |
Re: [Host name]: including spaces in column names is a complication that can and should be avoided, so I have used [HostName] instead.
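If you do want to involve the computed length column instead of LIKE, an equivalent predicate (my sketch, against the same fiddle schema) would be:

SELECT m.[HostName]
     , c.[HostNameAlias_ID]
     , m.[projectname_id]
FROM [table1] c
JOIN [table2] m ON c.[projectname_id] = m.[projectname_id]
WHERE LEFT(m.[HostName], c.[HostNameAlias_IDLength]) = c.[HostNameAlias_ID]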

SELECT COL HIVE SQL VALUE WHERE VALUES <5000

I'm learning about Hive and I have come across a question I cannot seem to find a workable answer for. I have to extract all of the numeric columns that ONLY contain integer values less than 5000 from a table, and create a space-separated text file from them. I am familiar with creating text files and selecting rows, but I am not familiar with selecting columns that meet a specific parameter; any help or guidance will be appreciated! Below I've listed the structure of the table. For the output, I need to go through ALL the columns and return ONLY the columns whose integer values are all LESS THAN 5000.
create table lineorder (
lo_orderkey int,
lo_linenumber int,
lo_custkey int,
lo_partkey int,
lo_suppkey int,
lo_orderdate int,
lo_orderpriority varchar(15),
lo_shippriority varchar(1),
lo_quantity int,
lo_extendedprice int,
lo_ordertotalprice int,
lo_discount int,
lo_revenue int,
lo_supplycost int,
lo_tax int,
lo_commitdate int,
lo_shipmode varchar(10)
)
Conditional column selection is a terrible, horrible, no good, very bad idea.
That being said, here is a demo.
with t as
(
select stack
(
3
,10 ,100 ,1000 ,'X' ,null
,20 ,null ,2000 ,'Y' ,200000
,30 ,300 ,3000 ,'Z' ,300000
) as (c1,c2,c3,c4,c5)
)
select regexp_replace
(
printf(concat('%s',repeat(concat(unhex(1),'%s'),field(unhex(1),t.*,unhex(1))-2)),*)
,concat('([^\\x01]*)',repeat('\\x01([^\\x01]*)',field(unhex(1),t.*,unhex(1))-2))
,c.included_columns
) as record
from t
cross join (select ltrim
(
regexp_replace
(
concat_ws(' ',sort_array(collect_set(printf('$%010d',pos+1))))
,concat
(
'( ?('
,concat_ws
(
'|'
,collect_set
(
case
when cast(pe.val as int) >= 5000
or cast(pe.val as int) is null
then printf('\\$%010d',pos+1)
end
)
)
,'))|(?<=\\$)0+'
)
,''
)
) as included_columns
from t
lateral view posexplode(split(printf(concat('%s',repeat(concat(unhex(1),'%s'),field(unhex(1),*,unhex(1))-2)),*),'\\x01')) pe
) c
+---------+
| record |
+---------+
| 10 1000 |
| 20 2000 |
| 30 3000 |
+---------+
I don't think Hive supports variable substitution inside the query itself, so you would have to write a shell script that executes a first query to return the required columns. You can then assign the result to a shell variable, build a second query that writes the file to a local directory, and run it via hive -e from bash.
create table t1 (x int, y int);  -- table used for the query below
Sample bash script:
# keep a column only if ALL of its values are below 5000
cols=$(hive -e "select concat_ws(',', case when max(x) < 5000 then 'x' end, case when max(y) < 5000 then 'y' end) from t1")
query="INSERT OVERWRITE LOCAL DIRECTORY '<directory name>' ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' select $cols from t1"
hive -e "$query"
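For what it's worth (my note, not part of the original answer), newer Hive versions do support session-level substitution via --hivevar, which avoids splicing the column list into the query string by hand:

cols=$(hive -e "select concat_ws(',', case when max(x) < 5000 then 'x' end, case when max(y) < 5000 then 'y' end) from t1")
hive --hivevar cols="$cols" -e "INSERT OVERWRITE LOCAL DIRECTORY '<directory name>' ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' select \${hivevar:cols} from t1"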

SQL Server 2008 R2 - Transforming a table with XML list columns to individual rows in a new table

I'm trying to write some SQL to help transition from one database to another. It's gone well so far, but I ran into a problem I can't wrap my brain around.
Original:
Id (bigint) | ColA (XML) | ColB (XML) | ... | RecordCreation
The XML for each column with XML looks like the following:
<ColA count="3"><int>3</int><int>9</int><int>6</int></ColA>
For any particular row, the "count" is the same for each list, ColB will also have 3, etc., but some lists are of strings.
In the new database:
Id (bigint) | Index (int) | ColA (int) | ColB (nvarchar(20)) | ... | RecordCreation
So if I start with
5 | <ColA count="3"><int>9</int><int>8</int><int>7</int></ColA> | <ColB count="3"><string>A</string><string>B</string><string>C</string></ColB> | ... | 2014-01-15 ...
I need out:
5 | 1 | 9 | A | ... | 2014-01-15 ...
5 | 2 | 8 | B | ... | 2014-01-15 ...
5 | 3 | 7 | C | ... | 2014-01-15 ...
for each of the rows in the original DB, where Index (the second column) is the position in the XML lists that the row's values come from.
Any ideas?
Thanks.
Edit:
A colleague showed me a dirty way that looks like it might get me there. This is to transfer some existing data into the new database for testing purposes, it's not production and won't be used often; we're just starving for data to test on.
declare @count int
set @count = 0

declare @T1 table (Id bigint, [Index] int, ColA int, ColB nvarchar(20), ..., MaxIndex int)

while @count < 12 begin
    insert into @T1
    select Id, @count,
        -- pick the @count-th <int> out of ColA
        CAST(CONVERT(nvarchar(max), ColA.query('/ColA/int[sql:variable("@count")]/text()')) as int),
        CONVERT(nvarchar(20), ColB.query('/ColB/string[sql:variable("@count")]/text()')),
        ...,
        -- the count attribute, i.e. how many entries this row's lists hold
        CAST(CONVERT(nvarchar(max), ColA.query('data(/ColA/@count)')) as int)
    from mytable
    set @count = @count + 1
end
Then I can insert from the temp table where Index < MaxIndex. There'll never be more than 12 indices, and I think the index is 0-based; easy fix if not. Each row may have a different count in its lists (but all lists of the same row will have the same count); that's why I went with MaxIndex and a temp table. I may switch to a real table that I drop when I'm done if the performance is too bad.
Try this query:
DECLARE @MyTable TABLE (
    ID INT PRIMARY KEY,
    ColA XML,
    ColB XML
);

INSERT @MyTable (ID, ColA, ColB)
SELECT 5, N'<ColA count="3"><int>9</int><int>8</int><int>7</int></ColA>', N'<ColB count="3"><string>A</string><string>B</string><string>C</string></ColB>';

SELECT x.ID,
       ab.*
FROM @MyTable x
CROSS APPLY (
SELECT a.IntValue, b.VarcharValue
FROM
(
SELECT ax.XmlCol.value('(text())[1]', 'INT') AS IntValue,
ROW_NUMBER() OVER(ORDER BY ax.XmlCol) AS RowNum
FROM x.ColA.nodes('/ColA/int') ax(XmlCol)
) a INNER JOIN
(
SELECT bx.XmlCol.value('(text())[1]', 'VARCHAR(50)') AS VarcharValue,
ROW_NUMBER() OVER(ORDER BY bx.XmlCol) AS RowNum
FROM x.ColB.nodes('/ColB/string') bx(XmlCol)
) b ON a.RowNum = b.RowNum
) ab;
Output:
/*
ID IntValue VarcharValue
-- -------- ------------
5 9 A
5 8 B
5 7 C
*/
Note: very likely, the performance could be horrible (even for an ad hoc task).
Assumption:
For any particular row, the "count" is the same for each list, ColB
will also have 3, etc., but some lists are of strings.
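If you also need the OP's Index column, the RowNum already computed inside the APPLY can simply be exposed (a small variation on the query above, my sketch):

SELECT x.ID,
       ab.RowNum AS [Index],
       ab.IntValue AS ColA,
       ab.VarcharValue AS ColB
FROM @MyTable x
CROSS APPLY (
    SELECT a.RowNum, a.IntValue, b.VarcharValue
    FROM (
        SELECT ax.XmlCol.value('(text())[1]', 'INT') AS IntValue,
               ROW_NUMBER() OVER(ORDER BY ax.XmlCol) AS RowNum
        FROM x.ColA.nodes('/ColA/int') ax(XmlCol)
    ) a
    INNER JOIN (
        SELECT bx.XmlCol.value('(text())[1]', 'VARCHAR(50)') AS VarcharValue,
               ROW_NUMBER() OVER(ORDER BY bx.XmlCol) AS RowNum
        FROM x.ColB.nodes('/ColB/string') bx(XmlCol)
    ) b ON a.RowNum = b.RowNum
) ab;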
Then I can insert from the temp table where Index < MaxIndex. There'll never be more than 12 indices and I think index is 0 based; easy fix if not. And each row may have a different count in its lists (but all lists of the same row will have the same count); that's why I went with MaxIndex and a temp table. And I may switch to real table that I drop when I'm done if the performance is too bad.