Creating a lookup table from a column in table - sql

I have a table in SQL Server that contains an ID and a column holding multiple comma-separated values, like the example below:
ID Category_number
-------------------------
1 3,5,6,8
2 4,8,23
3 4,7,5,3
I need to turn this into a lookup table with one category number per row, like this:
ID Category_Number
-------------------------
1 3
1 5
1 6
I have been told that XPATH might be the solution to this. Does anyone have any sample code that will do this?
Thanks

See this answer Split
Create that function in your database.
Then you can create the results you want using:
SELECT
ID,
A.S Category_Number
FROM
MyCsvTable
CROSS APPLY dbo.Split (',', MyCsvTable.Category_Number) A

I know you just accepted the solution, but assuming you're using SQL Server, here is an alternative approach without building a function:
SELECT A.[id],
Split.a.value('.', 'VARCHAR(100)') AS Cat
FROM
(SELECT [id],
CAST ('<M>' + REPLACE(Cat, ',', '</M><M>') + '</M>' AS XML) AS String
FROM YourTable
) AS A
CROSS APPLY String.nodes ('/M') AS Split(a)
And some Fiddle: http://sqlfiddle.com/#!3/cf427/3
Best of luck!
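For reference, here is the same XML technique adapted to the table and column names from the question. This is only a sketch: the table variable @MyCsvTable and its sample rows are assumptions made so the snippet can run on its own.

```sql
-- Sample data copied from the question; @MyCsvTable is a stand-in for the real table.
DECLARE @MyCsvTable TABLE (ID INT, Category_Number VARCHAR(100));
INSERT INTO @MyCsvTable (ID, Category_Number)
VALUES (1, '3,5,6,8'), (2, '4,8,23'), (3, '4,7,5,3');

-- Wrap every value in <M>...</M>, cast to XML, then shred the nodes into rows.
SELECT A.ID,
       Split.a.value('.', 'VARCHAR(100)') AS Category_Number
FROM (SELECT ID,
             CAST('<M>' + REPLACE(Category_Number, ',', '</M><M>') + '</M>' AS XML) AS String
      FROM @MyCsvTable) AS A
CROSS APPLY String.nodes('/M') AS Split(a);
```

Keep in mind that the CAST to XML fails if a value contains characters such as & or <, so this suits plain numeric lists like the one in the question.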

You may also want to check HierarchyID in SQL SERVER. Some tutorial: http://www.codeproject.com/Articles/37171/HierarchyID-Data-Type-in-SQL-Server-2008
Basically all you need to do is to loop through string and find position of commas and replace them with chr(13) or such. In Oracle it can be done easily in a single query. I think the same can be done using HierarchyID in SQL Server starting from 2008 or maybe even earlier versions.

Related

Stripping Values between two brackets {}

Good Afternoon,
I'm trying to query a column and extract the data between curly brackets. There may be multiple sets in the column, such as: {Abrasision} {None} {Bruise}
I use the query below, but it doesn't do exactly what I want, I think because it only looks for one bracket. I want to get each value in my result set and insert it into a table variable. I'm just having a little bit of trouble.
SELECT
LEFT(InjuryCategory, CHARINDEX('{', InjuryCategory)-1),
SUBSTRING(InjuryCategory, CHARINDEX('{', InjuryCategory)+1, LEN(InjuryCategory)-CHARINDEX('{', InjuryCategory)-CHARINDEX('{',REVERSE(InjuryCategory ))),
RIGHT(InjuryCategory, CHARINDEX('{', REVERSE(InjuryCategory))-1)
FROM TblVictim
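You may use STRING_SPLIT(), STUFF() and STRING_AGG() to get the expected results. Note that STRING_SPLIT() does not guarantee the order of its output rows. The enable_ordinal parameter, which returns each substring's position so the results can be ordered, is available only in Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics (serverless SQL pool only), and SQL Server 2022 and later, so on other versions STRING_AGG() may aggregate the values in a different order.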
Test data:
SELECT *
INTO tblVictim
FROM (
VALUES ('{Abrasision} {None} {Bruise}')
) t (InjuryCategory)
Statement:
SELECT STRING_AGG(STUFF(s.[value], 1, CHARINDEX('{', s.[value]), ''), ' ') AS Category
FROM tblVictim t
CROSS APPLY STRING_SPLIT(t.InjuryCategory, '}') s
WHERE s.[value] <> ''
Result:
Category
----------------------
Abrasision None Bruise
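Since the question asks for each value as its own row in a table variable, the same split can be used without the aggregation step. A minimal sketch, assuming SQL Server 2016 or later for STRING_SPLIT; the names @tblVictim and @Categories are made up for illustration.

```sql
-- Stand-in for the real table, with the sample value from the question.
DECLARE @tblVictim TABLE (InjuryCategory NVARCHAR(MAX));
INSERT INTO @tblVictim (InjuryCategory)
VALUES ('{Abrasision} {None} {Bruise}');

-- Target table variable: one row per bracketed value.
DECLARE @Categories TABLE (Category NVARCHAR(MAX));

INSERT INTO @Categories (Category)
SELECT STUFF(s.[value], 1, CHARINDEX('{', s.[value]), '')  -- drop everything up to and including '{'
FROM @tblVictim t
CROSS APPLY STRING_SPLIT(t.InjuryCategory, '}') s          -- each piece ends where a '}' was
WHERE s.[value] <> '';                                     -- skip the empty piece after the last '}'

SELECT Category FROM @Categories;
```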
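In newer versions of SQL Server (2017 and later, where TRIM is available), you can combine STRING_SPLIT and TRIM. Note that this splits on spaces, so it assumes the values themselves contain no spaces.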
SELECT TRIM('{}' FROM s.[value]) AS Category
FROM TblVictim v
CROSS APPLY STRING_SPLIT(v.InjuryCategory, ' ') s
WHERE s.[value] <> '';
db<>fiddle
Quick and dirty, since this is delimited data, pretend it's XML. Setup:
DECLARE @tblVictim TABLE(ID INT IDENTITY, InjuryCategory NVARCHAR(MAX));
INSERT @tblVictim(InjuryCategory)
VALUES
('{Abrasision} {None} {Bruise}'),
('{Abrasision} {<5} {Bruise; very severe}');
Query:
WITH data AS (
SELECT ID, xml = CAST(REPLACE(REPLACE(InjuryCategory,
'{', '<i><![CDATA['),
'}', ']]></i>') AS XML
)
FROM @tblVictim
)
SELECT ID, node.value('text()[1]', 'nvarchar(max)')
FROM data
CROSS APPLY xml.nodes('i') AS nodes(node)
Note that this completely breaks down (with no easy fixes) if there are unbalanced delimiters.

SQL Server: display whole column only if substring found

Working with SQL Server 2016. I am constrained by the fact that we cannot create functions or stored procedures. I am trying to find %word% in many columns (75) across a table. Right now, I have a very large clump of
and (fieldname1 like '%word%'
or fieldname2 like '%word%'
or fieldname3 like '%word%') etc.
While cumbersome, this does provide me the correct results. However:
I am looking to simplify this and
in the select, I want to display the whole column if and only if it finds %word% (or even just the column name would work)
Thank you in advance for any thoughts.
--...slow...
declare @searchfor varchar(100) = '23';
select @searchfor as [thevalue],
thexml.query('for $a in (/*[contains(upper-case(.), upper-case(sql:variable("@searchfor")))])
return concat(local-name($a[1]), ",")').value('.', 'nvarchar(max)') as [appears_in_columns],
*
from
(
select *, (select o.* for xml path(''), type) as thexml
from sys.all_objects as o --table goes here
) as src
where thexml.exist('/*[contains(upper-case(.), upper-case(sql:variable("@searchfor")))]') = 1;
One option uses cross apply to unpivot the table and then search:
select v.*
from mytable t
cross apply (values
('fieldname1', fieldname1),
('fieldname2', fieldname2),
('fieldname3', fieldname3)
) v(fieldname, fieldvalue)
where v.fieldvalue like '%word%'
Note that if more than one column contains the search word, you will get several rows in the resultset. I am unsure how you want to handle this use case (there are options).
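One of those options is to collapse the matches into a single row per source row that lists the matching column names. The following is a sketch only: @mytable, its columns and its sample rows are invented for illustration, and FOR XML PATH is used because STRING_AGG is not available on SQL Server 2016.

```sql
-- Hypothetical sample table standing in for the real 75-column table.
DECLARE @mytable TABLE (id INT, fieldname1 VARCHAR(50), fieldname2 VARCHAR(50), fieldname3 VARCHAR(50));
INSERT INTO @mytable (id, fieldname1, fieldname2, fieldname3)
VALUES (1, 'a word here', 'nothing', 'another word'),
       (2, 'nothing', 'nothing', 'nothing');

SELECT t.id, m.matching_columns
FROM @mytable t
CROSS APPLY (
    -- Unpivot the columns, keep the ones that match, and join their names with commas.
    SELECT STUFF((SELECT ',' + v.fieldname
                  FROM (VALUES ('fieldname1', t.fieldname1),
                               ('fieldname2', t.fieldname2),
                               ('fieldname3', t.fieldname3)) v(fieldname, fieldvalue)
                  WHERE v.fieldvalue LIKE '%word%'
                  FOR XML PATH('')), 1, 1, '') AS matching_columns
) m
WHERE m.matching_columns IS NOT NULL;  -- NULL means no column matched
```

With this sample data only the row with id 1 should come back, since no column of the second row contains the search word.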
SELECT OBJECT_NAME(id) ObjectName , [Text]
FROM syscomments
WHERE TEXT LIKE '%word%'

data type of each characters in a varchar T-sql

I'm curious about the data I get from someone. Most of the time I should get 3 digits, then a space, then eight digits.
The integration created the column as varchar(20). No doubt it works, but it gives me some matching errors.
Because of this, I'd like to know the type of each character in every row.
For example: 0 for a digit, s for a space, a for a letter, * for a special character
AWB | data type
---------------------------------
012 12345678 | 000s00000000
9/5 ab0534 | 0*0saa0000
I'd like to know if there is a function or a formula to get this kind of results.
Right after I'll be able to group by this column and finally be able to check how good is the data quality.
I don't know if there is a specific word for what I tried to explain, so excuse me if this is a duplicate of a post, I didn't find it.
Thank you for your feedback.
There's nothing built-in, but you might use an approach like this:
DECLARE @tbl TABLE(ID INT IDENTITY,AWB VARCHAR(100));
INSERT INTO @tbl VALUES
('012 12345678')
,('9/5 ab0534');
WITH cte AS
(
SELECT t.ID
,t.AWB
,A.Nmbr
,C.YourMask
FROM @tbl t
CROSS APPLY (SELECT TOP (DATALENGTH(t.AWB)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Nmbr)
CROSS APPLY (SELECT SUBSTRING(t.AWB,A.Nmbr,1)) B(SingleCharacter)
CROSS APPLY (SELECT CASE WHEN B.SingleCharacter LIKE '[0-9]' THEN '0'
WHEN B.SingleCharacter LIKE '[a-z]' THEN 'a'
WHEN B.SingleCharacter = ' ' THEN 's'
ELSE '*' END) C(YourMask)
)
SELECT ID
,AWB
,(
SELECT YourMask
FROM cte cte2
WHERE cte2.ID=cte.ID
ORDER BY cte2.Nmbr
FOR XML PATH(''),TYPE
).value('.','nvarchar(max)') YourMaskConcatenated
FROM cte
GROUP BY ID,AWB;
The idea in short:
The cte will create a derived set of your table.
The first CROSS APPLY will create a list of numbers as long as the current AWB value.
The second CROSS APPLY will read each character separately.
The third CROSS APPLY will finally use some rather simple logic to translate your values to the mask you expect.
The final SELECT will then use GROUP BY and a correlated sub-query with FOR XML to get the mask characters re-concatenated (With version v2017+ this would be easier calling STRING_AGG()).
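As a rough illustration of that last remark, the re-concatenation could look like this with STRING_AGG. This is a sketch assuming SQL Server 2017 or later; it reuses the sample table and the three CROSS APPLY steps from the answer and only swaps the FOR XML sub-query for an ordered aggregate.

```sql
DECLARE @tbl TABLE (ID INT IDENTITY, AWB VARCHAR(100));
INSERT INTO @tbl VALUES ('012 12345678'), ('9/5 ab0534');

SELECT t.ID,
       t.AWB,
       -- Concatenate the per-character masks in character-position order.
       STRING_AGG(C.YourMask, '') WITHIN GROUP (ORDER BY A.Nmbr) AS YourMaskConcatenated
FROM @tbl t
-- One number per character of the current AWB value.
CROSS APPLY (SELECT TOP (DATALENGTH(t.AWB)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
             FROM master..spt_values) A(Nmbr)
-- Read the character at that position.
CROSS APPLY (SELECT SUBSTRING(t.AWB, A.Nmbr, 1)) B(SingleCharacter)
-- Translate the character to its mask symbol.
CROSS APPLY (SELECT CASE WHEN B.SingleCharacter LIKE '[0-9]' THEN '0'
                         WHEN B.SingleCharacter LIKE '[a-z]' THEN 'a'
                         WHEN B.SingleCharacter = ' ' THEN 's'
                         ELSE '*' END) C(YourMask)
GROUP BY t.ID, t.AWB;
```

Whether '[a-z]' also matches uppercase letters depends on the collation in use, so the letter test may need adjusting for case-sensitive databases.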

how to replace a column value which is separated by comma in SQL

I have a table with a column named CDR.
That column stores comma-separated values such as 20,5,40,10,30
I need to replace the last value (here it is 30) with 0 in every row.
Can someone suggest how to do this?
Thanks
If you are able, first correct the database design as the table is not in first normal form. It is bad design to have more than one value stored in one column, as evidenced by you having to ask this question. :-) Having said that, I have to deal with vendor data that has the same issue that is beyond my control to change, so in Oracle 11g I would do this:
update table_name
set CDR = regexp_replace(CDR, '(.*,)\d+$', '\10');
The regex matches and remembers all characters up to and including the last comma before one or more digits right before the end of the string. The replace string is the remembered part (referenced by \1, which refers to the first group of characters inside parentheses) plus the 0.
If you are using SQL Server, this should do for you.
create table #A(id int , cdr varchar(100))
insert into #A values(1,'10,20,30,40'),(2, '20,30,40,50'),(3,'30,40,50,60,70')
Declare @tA as table(id int , String varchar(10))
insert into @tA
SELECT id,
Split.a.value('.', 'VARCHAR(100)') AS String
FROM (SELECT [id],
CAST ('<M>' + REPLACE([cdr], ',', '</M><M>') + '</M>' AS XML) AS String
FROM #A) AS A CROSS APPLY String.nodes ('/M') AS Split(a);
delete from @tA where [String] = '30'
SELECT distinct id,
-- remove the leading ', ' (two characters) from the rebuilt list
ISNULL(STUFF((SELECT ', ' + String
FROM @tA t
WHERE t.id = ta.id
FOR XML PATH('')
), 1, 2, ''), '') AS Str
into #tempA
FROM @tA ta
select * from #tempA
drop table #A, #tempA
UPDATE TableName
SET CDR = REPLACE(CDR, (SUBSTRING( CDR, LEN(CDR) - CHARINDEX(',',REVERSE(CDR)) + 2 , LEN(CDR))),0);
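One caveat with the REPLACE-based update above: REPLACE substitutes every occurrence of the last value, so a row such as 30,5,30 would become 0,5,0. A possible variant that rewrites only the tail is sketched below; the table variable @t and its rows are assumptions used to make the snippet runnable.

```sql
-- Hypothetical sample rows, including a repeated last value and a single-value row.
DECLARE @t TABLE (CDR VARCHAR(100));
INSERT INTO @t (CDR) VALUES ('20,5,40,10,30'), ('30,5,30'), ('30');

UPDATE @t
SET CDR = CASE
              -- No comma: the whole string is the last value.
              WHEN CHARINDEX(',', CDR) = 0 THEN '0'
              -- Otherwise keep everything up to and including the last comma, then append 0.
              ELSE LEFT(CDR, LEN(CDR) - CHARINDEX(',', REVERSE(CDR)) + 1) + '0'
          END;

SELECT CDR FROM @t;
```

The position of the last comma is found by searching the reversed string, which is the same trick the original statement uses.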
You should think about splitting up your comma separated list into a separate table. That way you can do other things in SQL. SQL is not the best with string manipulation and your queries are gonna get obscene and unruly.
table Users
user_id  user_name  job_list
-------------------------
1        Billy      "1,2,3,4"
table Jobs
job_id  job_desc
-------------------------
1       plumber
2       carpenter
3       electrician
4       programmer
If you do this, you're going to have some heartaches: when a job goes away or something similar, you'll have a lot of annoying cleanup, like @jarlh suggests.
If you make a third table to hold the relationships user_id to job_id you will have a much better time if you need to do something like delete a job_id from existence. Of course this is all made up based on your limited question, but it should help you out.
table UserJobRelationship
relationship_id  user_id  job_id
-------------------------
1                1        1
2                1        2
3                1        3
4                1        4
Gives you much more flexibility and allows you to delete the most recent entry. You can simply just do max of relationship_id where user_id equals that user or you can do it for the whole table.
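A small sketch of that last idea, deleting the most recent relationship for one user. The table variable and the choice of user_id 1 are illustrative assumptions based on the made-up tables above.

```sql
DECLARE @UserJobRelationship TABLE (
    relationship_id INT IDENTITY PRIMARY KEY,
    user_id INT NOT NULL,
    job_id  INT NOT NULL
);
INSERT INTO @UserJobRelationship (user_id, job_id)
VALUES (1, 1), (1, 2), (1, 3), (1, 4);

-- Remove the newest entry for user 1, i.e. the row with the highest relationship_id.
DELETE FROM @UserJobRelationship
WHERE relationship_id = (SELECT MAX(relationship_id)
                         FROM @UserJobRelationship
                         WHERE user_id = 1);

SELECT relationship_id, user_id, job_id FROM @UserJobRelationship;
```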

Count the occurences of all individual values in a multivalued field in SQL Server

Features Impressive
A,B,C
D,C
A,D
B,C,D
This is a column in my database that contains multiple values that come from a combo box.
I want to count the number of occurrences of each value in this column so that I can generate a bar chart out of this reflecting how many people liked the specific features.
Output I want is
A- 2
B- 2
C- 3
D- 3
Please help me with this SQL query.
You have a very poor design. You should be storing individual values in a separate row in a junction table -- one row per whatever and value.
Given the data structure, here is a method to do what you want -- assuming that you have a list of allowed values:
select av.feature, count(t.features)
from AllowedValues av left join
tables t
on ',' + t.features + ',' like '%,' + av.feature + ',%'
group by av.feature;
If you don't have an explicit list of features, you can create one using a CTE, something like:
with AllowedValues as (
select 'A' as feature union all
. . .
)
The performance of this query will be lousy. And, there is really no way to make it better without fixing the data structure.
So, I repeat. You should fix the data structure and use a junction table instead of storing a list as a string. In SQL, tables are for storing lists. Strings are for, well, storing strings.
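To make the LIKE-based join concrete, here is a self-contained sketch using the sample rows from the question. The table variable @t, the column name features and the hard-coded list of allowed values are assumptions for illustration.

```sql
-- Sample rows from the question.
DECLARE @t TABLE (features VARCHAR(100));
INSERT INTO @t (features) VALUES ('A,B,C'), ('D,C'), ('A,D'), ('B,C,D');

WITH AllowedValues AS (
    SELECT 'A' AS feature UNION ALL
    SELECT 'B' UNION ALL
    SELECT 'C' UNION ALL
    SELECT 'D'
)
SELECT av.feature, COUNT(t.features) AS occurrences
FROM AllowedValues av
LEFT JOIN @t t
    -- Surround both sides with commas so that only whole values match.
    ON ',' + t.features + ',' LIKE '%,' + av.feature + ',%'
GROUP BY av.feature
ORDER BY av.feature;
```

The list is wrapped in commas on both ends so that a feature is matched as a complete item and not as part of a longer value.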
As mentioned by others, this really is poor design; you should never store comma-separated values in a single column.
Use a Split Function to split the comma separated values into individual rows then count the individual rows. Something like this.
;With CTE as
(
SELECT Split.a.value('.', 'VARCHAR(100)') SP_COL
FROM (SELECT Cast ('<M>' + Replace(feature, ',', '</M><M>') + '</M>' AS XML) AS Data
FROM [table]) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
)
Select SP_COL,COUNT(1) as [COUNT]
FROM CTE
Group By SP_COL
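On SQL Server 2016 and later (database compatibility level 130 or higher), the built-in STRING_SPLIT function can take the place of the XML cast. A minimal sketch, assuming the sample rows from the question live in a table variable named @t with a column called features:

```sql
-- Sample rows from the question.
DECLARE @t TABLE (features VARCHAR(100));
INSERT INTO @t (features) VALUES ('A,B,C'), ('D,C'), ('A,D'), ('B,C,D');

-- Split each list into rows, then count how often each value appears.
SELECT s.[value] AS feature, COUNT(*) AS [COUNT]
FROM @t t
CROSS APPLY STRING_SPLIT(t.features, ',') s
GROUP BY s.[value]
ORDER BY s.[value];
```

Unlike the join against a list of allowed values, this only reports features that occur at least once.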