Here is the output.
ID Stack
-----------------------------------
123 307290,303665,307285
123 307290,307285,303424,303665
123 307290,307285,303800,303665
123 307061,307290
I want the output to contain only the last three rows. The reason: all three numbers in the Stack column of output line 1 are also present in the Stack columns of lines 2 and 3, so I don't need line 1.
But output lines 2, 3 and 4 all differ from each other, so I want those lines in my result.
I have tried doing it with row_number() and charindex, but I'm not getting the proper result.
Thank you.
All the comments telling you to change your database's structure are right! You really should avoid comma-separated values. This breaks 1NF (first normal form) and will be a pain in the neck forever.
The result of the second CTE might be used to shift all data into a new 1:n related structure.
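For illustration only, such a 1:n structure could be sketched like this (all table and column names here are assumptions, not taken from the question):

```sql
-- Hypothetical target schema: one header row per original line,
-- one item row per number (position preserved).
CREATE TABLE StackHeader
(
    StackID INT IDENTITY PRIMARY KEY,
    ID      INT NOT NULL
);
CREATE TABLE StackItem
(
    StackID  INT NOT NULL REFERENCES StackHeader(StackID),
    Position INT NOT NULL,
    Nr       INT NOT NULL,
    PRIMARY KEY (StackID, Position)
);
```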
Something like this?
DECLARE @tbl TABLE(ID INT,Stack VARCHAR(100));
INSERT INTO @tbl VALUES
(123,'307290,303665,307285')
,(123,'307290,307285,303424,303665')
,(123,'307290,307285,303800,303665')
,(123,'307061,307290');
WITH Splitted AS
(
SELECT ID
,Stack
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RowIndex
,CAST('<x>' + REPLACE(Stack,',','</x><x>') + '</x>' AS XML) Casted
FROM @tbl
)
,DerivedDistinctValues AS
(
SELECT DISTINCT
ID
,Stack
,RowIndex
,StackNr.value('.','int') AS Nr
FROM Splitted
CROSS APPLY Casted.nodes('/x') AS A(StackNr)
)
SELECT ddv1.ID
,ddv1.Stack
FROM DerivedDistinctValues AS ddv1
FULL OUTER JOIN DerivedDistinctValues AS ddv2 ON ddv1.RowIndex<>ddv2.RowIndex
AND ddv1.Nr=ddv2.Nr
WHERE ddv2.ID IS NULL
GROUP BY ddv1.ID,ddv1.Stack
This will be slow, especially with larger data sets.
Some explanation:
The first CTE will transform the CSV numbers to <x>307290</x><x>303665</x>... This can be cast to XML, which allows us to generate a derived table returning all the numbers as rows. This happens in the second CTE, calling the XQuery function .nodes().
The last query will do a full outer join (each row against each row). All rows where at least one number has no corresponding match in any other row are kept.
But I assume that this might not work in each and every situation (e.g. circular data).
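For comparison, on SQL Server 2016+ the subset test can also be sketched with STRING_SPLIT() and EXCEPT: keep a row only if no other row's number set fully contains its own. This is a sketch over the sample data from the question, not a tuned solution:

```sql
-- Sketch (v2016+): drop every row whose number set is fully contained
-- in another row's set.
DECLARE @t TABLE(ID INT, Stack VARCHAR(100));
INSERT INTO @t VALUES
 (123,'307290,303665,307285')
,(123,'307290,307285,303424,303665')
,(123,'307290,307285,303800,303665')
,(123,'307061,307290');

WITH Split AS
(
    SELECT t.ID, t.Stack, s.value AS Nr
    FROM @t t
    CROSS APPLY STRING_SPLIT(t.Stack, ',') s
)
SELECT a.ID, a.Stack
FROM (SELECT DISTINCT ID, Stack FROM Split) a
WHERE NOT EXISTS
(
    SELECT 1
    FROM (SELECT DISTINCT Stack FROM Split) b
    WHERE b.Stack <> a.Stack
      AND NOT EXISTS    -- nothing in a that is missing from b => a is a subset
      (
          SELECT Nr FROM Split sa WHERE sa.Stack = a.Stack
          EXCEPT
          SELECT Nr FROM Split sb WHERE sb.Stack = b.Stack
      )
);
```

Note one edge case: two rows carrying the same set of numbers in a different order would eliminate each other, so this sketch is not watertight either.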
Related
I have the following table:
Id  Category
--------------
1   some thing
2   value
This table contains a lot of rows and what I'm trying to do is to update all the Category values to change every first letter to caps. For example, some thing should be Some Thing.
At the moment this is what I have:
UPDATE MyTable
SET Category = (SELECT UPPER(LEFT(Category,1))+LOWER(SUBSTRING(Category,2,LEN(Category))) FROM MyTable WHERE Id = 1)
WHERE Id = 1;
But there are two problems. The first is changing the Category value to upper case, because it only works for single words (hello => Hello, but hello world => Hello world). The second is that I'd need to run this query X times following the Where Id = X logic. So my question is: how can I update X rows? I was thinking of a cursor, but I don't have much experience with them.
Here is a fiddle to play with.
You can split the words apart, apply the capitalization, then munge the words back together. No, you shouldn't be worrying about subqueries and Id because you should always approach updating a set of rows as a set-based operation and not one row at a time.
;WITH cte AS
(
SELECT Id, NewCat = STRING_AGG(CONCAT(
UPPER(LEFT(value,1)),
SUBSTRING(value,2,57)), ' ')
WITHIN GROUP (ORDER BY CHARINDEX(value, Category))
FROM
(
SELECT t.Id, t.Category, s.value
FROM dbo.MyTable AS t
CROSS APPLY STRING_SPLIT(Category, ' ') AS s
) AS x GROUP BY Id
)
UPDATE t
SET t.Category = cte.NewCat
FROM dbo.MyTable AS t
INNER JOIN cte ON t.Id = cte.Id;
This assumes your category doesn't have non-consecutive duplicates within it; for example, bora frickin bora would get messed up (meanwhile bora bora frickin would be fine). It also assumes a case-insensitive collation (which could be catered to if necessary).
In Azure SQL Database you can use the new enable_ordinal argument to STRING_SPLIT() but, for now, you'll have to rely on hacks like CHARINDEX().
Updated db<>fiddle (thank you for the head start!)
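For the record, once the ordinal argument is available, the position comes straight from the function and the CHARINDEX() hack disappears (a sketch; requires a version that supports enable_ordinal):

```sql
-- The third argument (1) enables the ordinal output column
SELECT value, ordinal
FROM STRING_SPLIT('some thing else', ' ', 1);
-- ordinal is 1-based and reflects the position in the input string
```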
I want to convert a comma separated list back into a table.
For eg.
I have a table which looks like this
Sid roleid
500 1,5,
501 1,5,6,
I want output like this
Sid roleid
500 1
500 5
501 1
501 5
501 6
Please help.
Create table #temp(Sid int,roleid varchar(100))
Insert into #temp values(500,'1,5,'),(501,'1,5,6,')
Using STRING_SPLIT() means that you are working on SQL Server 2016 (or higher).
However, STRING_SPLIT() has a huge drawback: it is not guaranteed to return the items in the expected order (see the docs, section "Remarks"). In my eyes this is an absolute show-stopper...
But - luckily - there is a fast and easy-to-use workaround in v2016+:
Create table #temp(Sid int,roleid varchar(100))
Insert into #temp values(500,'1,5,'),(501,'1,5,6,');
SELECT t.[Sid]
,A.[key] AS position
,A.[value] AS roleid
FROM #temp t
CROSS APPLY OPENJSON(CONCAT('["',REPLACE(t.roleid,',','","'),'"]')) A
WHERE A.[value]<>'';
A simple number array like 1,3,5 needs nothing more than brackets to be a JSON array ([1,3,5]). In your case, due to the trailing comma, I treat the items as strings: 1,3,5, will be taken as an array of strings, ["1","3","5",""]. The final empty string is filtered away by the WHERE clause. The rest is easy...
Other than with STRING_SPLIT(), the docs confirm that OPENJSON will reflect an item's position in the [key] column:
When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.
General hint: avoid STRING_SPLIT() as long as there is no additional key/position column added to its result set.
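A minimal demonstration of that documented behaviour (the array literal is made up for the example):

```sql
-- [key] carries the zero-based array index, [value] the element
SELECT A.[key] AS position, A.[value] AS roleid
FROM OPENJSON('["1","5","6"]') A;
-- position: 0, 1, 2 - roleid: 1, 5, 6
```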
Use string_split() :
select t.sid, spt.value
from table t cross apply
string_split(t.roleid, ',') as spt
order by t.sid, spt.value;
Use string_split():
select t.sid, value rid
from t
cross apply string_split(t.roleid, ',') rid
order by t.sid, rid
I'm curious about the data I get from someone. Most of the time I should get 3 digits, then a space, then eight digits.
But the integration created a varchar(20) column... It works, no doubt, but it gives me some matching errors.
Because of this, I'd like to know the data type of the characters on each row.
For example: 0 stands for a digit, s for a space, a for a letter, * for anything else
AWB | data type
---------------------------------
012 12345678 | 000s00000000
9/5 ab0534 | 0*0saa0000
I'd like to know if there is a function or a formula to get this kind of result.
Right after that, I'll be able to group by this column and finally check how good the data quality is.
I don't know if there is a specific word for what I tried to explain, so excuse me if this is a duplicate of a post; I didn't find one.
Thank you for your feedback.
There's nothing built-in, but you might use an approach like this:
DECLARE @tbl TABLE(ID INT IDENTITY,AWB VARCHAR(100));
INSERT INTO @tbl VALUES
('012 12345678')
,('9/5 ab0534');
WITH cte AS
(
SELECT t.ID
,t.AWB
,A.Nmbr
,C.YourMask
FROM @tbl t
CROSS APPLY (SELECT TOP (DATALENGTH(t.AWB)) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values) A(Nmbr)
CROSS APPLY (SELECT SUBSTRING(t.AWB,A.Nmbr,1)) B(SingleCharacter)
CROSS APPLY (SELECT CASE WHEN B.SingleCharacter LIKE '[0-9]' THEN '0'
WHEN B.SingleCharacter LIKE '[a-z]' THEN 'a'
WHEN B.SingleCharacter = ' ' THEN 's'
ELSE '*' END) C(YourMask)
)
SELECT ID
,AWB
,(
SELECT YourMask
FROM cte cte2
WHERE cte2.ID=cte.ID
ORDER BY cte2.Nmbr
FOR XML PATH(''),TYPE
).value('.','nvarchar(max)') YourMaskConcatenated
FROM cte
GROUP BY ID,AWB;
The idea in short:
The cte will create a derived set of your table.
The first CROSS APPLY will create a list of numbers as long as the current AWB value.
The second CROSS APPLY will read each character separately.
The third CROSS APPLY will finally use some rather simple logic to translate your values to the mask you expect.
The final SELECT will then use GROUP BY and a correlated sub-query with FOR XML to get the mask characters re-concatenated (with v2017+ this would be easier, calling STRING_AGG()).
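On v2017+, that last step could be sketched with STRING_AGG() instead of the FOR XML trick, reusing the same cte unchanged:

```sql
-- Replaces the final SELECT: concatenate mask characters in position order
SELECT ID
      ,AWB
      ,STRING_AGG(YourMask, '') WITHIN GROUP (ORDER BY Nmbr) AS YourMaskConcatenated
FROM cte
GROUP BY ID, AWB;
```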
I have a similar situation. I start out with a table that has data input into a column from another source. This data comes in comma-delimited. I need to manipulate the data to remove a section at the end of each value, so I split the data and remove the end with the code below. (I added the ID column later to be able to sort. I also added WITH SCHEMABINDING later to add an XML index, but nothing works. I can remove this and the ID column, but I do not see any difference either way):
ALTER VIEW [dbo].[vw_Routing]
WITH SCHEMABINDING
AS
SELECT TOP 99.9999 PERCENT
ROW_NUMBER() OVER (ORDER BY CableID) - 1 AS ID,
CableID AS [CableID],
SUBSTRING(m.n.value('.[1]', 'varchar(8000)'), 1, 13) AS Routing
FROM
(SELECT
CableID,
CAST('<XMLRoot><RowData>' + REPLACE([RouteNodeList], ',', '</RowData><RowData>') + '</RowData></XMLRoot>' AS xml) AS x
FROM
[dbo].[Cables]) t
CROSS APPLY
x.nodes('/XMLRoot/RowData') m (n)
ORDER BY
ID
Now I need to concatenate data from the Routing column's rows into one row grouped by another column into a column again. I have the code working except that it is reordering my data; I must have the data in the order it is input into the table as it is Cable Routing information. I must also remove duplicates. I use the following code. The SELECT DISTINCT removes the duplicates, but reorders the data. The SELECT (without DISTINCT) keeps the correct data order, but does NOT remove the duplicates:
Substring(
(
SELECT DISTINCT ','+ x3.Routing AS [text()] --This DISTINCT reorders the routes once concatenated.
--SELECT ','+ x3.Routing AS [text()] --This without the DISTINCT does not remove duplicates.
From vw_Routing x3
Where x3.CableID = c.CableId
For XML PATH ('')
), 2, 1000) [Routing],
I tried the code you gave above and it provided the same results with the DISTINCT reordering the data but without DISTINCT not removing the duplicates.
Perhaps GROUP BY with ORDER BY will work:
stuff((select ','+ x3.Routing AS [text()]
from vw_Routing x3
where x3.CableID = c.CableId
group by x3.Routing
order by min(x3.id)
for XML PATH ('')
), 1, 1, '') as [Routing],
I also replaced the SUBSTRING() with STUFF(). The latter is more standard for this operation.
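The difference in a nutshell: STUFF() deletes (or replaces) a stretch of characters at a given position, which is exactly what removing the leading separator needs. An illustrative snippet:

```sql
-- STUFF(input, start, length, replacement): remove 1 character at position 1
SELECT STUFF(',A,B,C', 1, 1, '');   -- 'A,B,C'
```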
To https://stackoverflow.com/users/1144035/gordon-linoff
Unfortunately, that did not work. It gave me the same result as my select statement; that is, no dups but reordering data.
HOWEVER, I found the correct answer earlier today:
I figured it out finally!! I still have to get implement it within the other code and add the new Cable Area code, but the hard part it over!!!!!
I am going to post the following to the forums so that they know not to work on it .... I was writing this to send to my friend for his help, but I figured it out myself before I sent it.
I started with raw, comma separated data in the records of a table … the data is from another source. I had to remove some of the information from each value, so I used the following code to split it up and manipulate it:
Code1
Once that was done, I had to put the manipulated data back in the same form in the same order and with no duplicates. So I needed a SELECT DISTINCT. When I used the commented out SELECT DISTINCT below, it removed duplicates but it changed the order of the data which I could not have as it is Cable Tray Routing Data. When I took out the SELECT DISTINCT, it kept correct order, but left duplicates.
Because I was using XML PATH, I had to change this code …… to this code so that I could use SELECT DISTINCT to remove the duplicates: Code2 and Code3
Features Impressive
A,B,C
D,C
A,D
B,C,D
This is a column in my database that contains multiple values coming from a combobox.
I want to count the number of occurrences of each value in this column so that I can generate a bar chart out of this reflecting how many people liked the specific features.
Output I want is
A- 2
B- 2
C- 3
D- 3
Please help me with this SQL query.
You have a very poor design. You should be storing individual values in separate rows in a junction table -- one row per entity and value.
Given the data structure, here is a method to do what you want -- assuming that you have a list of allowed values:
select av.feature, count(t.feature)
from AllowedValues av left join
tables t
on ',' + t.features + ',' like '%,' + av.feature + ',%'
group by av.feature;
If you don't have an explicit list of features, you can create one using a CTE, something like:
with AllowedValues as (
select 'A' as feature union all
. . .
)
The performance of this query will be lousy. And, there is really no way to make it better without fixing the data structure.
So, I repeat. You should fix the data structure and use a junction table instead of storing a list as a string. In SQL, tables are for storing lists. Strings are for, well, storing strings.
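For illustration, such a junction-table layout could look like this (the names are made up; PersonID stands for whatever identifies a survey response):

```sql
CREATE TABLE PersonFeature
(
    PersonID INT NOT NULL,
    Feature  CHAR(1) NOT NULL,
    PRIMARY KEY (PersonID, Feature)   -- one row per person/feature pair
);

-- Counting liked features then becomes trivial and index-friendly:
SELECT Feature, COUNT(*) AS Liked
FROM PersonFeature
GROUP BY Feature;
```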
As mentioned by others, this really is poor design; you should never store comma-separated values in a single column.
Use a Split Function to split the comma separated values into individual rows then count the individual rows. Something like this.
;With CTE as
(
SELECT Split.a.value('.', 'VARCHAR(100)') SP_COL
FROM (SELECT Cast ('<M>' + Replace(feature, ',', '</M><M>') + '</M>' AS XML) AS Data
FROM [table]) AS A
CROSS APPLY Data.nodes ('/M') AS Split(a)
)
Select SP_COL,COUNT(1) as [COUNT]
FROM CTE
Group By SP_COL
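On SQL Server 2016+ the same count can be sketched with STRING_SPLIT() instead of the XML trick; since we only count occurrences, its unordered output is harmless here (column and table names as in the question):

```sql
SELECT s.value AS SP_COL, COUNT(*) AS [COUNT]
FROM [table] t
CROSS APPLY STRING_SPLIT(t.feature, ',') s
GROUP BY s.value;
```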