I want to generate an XML file in a hierarchical form - SQL

I have a table like this (actually it contains more than 6000 records):
IdIndustry | IndustryCode | IndustryName | ParentId
---------------------------------
1 | IND | Industry | NULL
2 | PHARM | Pharmacy | 1
3 | FIN | Finance | NULL
4 | CFIN | Corporate | 3
5 | CMRKT | Capital M | 4
DDL:
CREATE TABLE [dbo].[tblIndustryCodes](
[IdIndustry] [int] IDENTITY(1,1) NOT NULL,
[IndustryCode] [nvarchar](5) NULL,
[IndustryName] [nvarchar](50) NULL,
[ParentId] [int] NULL,
CONSTRAINT [PK_tblIndustryCodes] PRIMARY KEY CLUSTERED ([IdIndustry] ASC)
)
Inserts:
INSERT INTO [tblIndustryCodes]
([IndustryCode]
,[IndustryName]
,[ParentId])
VALUES
('IND','Industry',NULL),
('PHARM','Pharmacy',1),
('FIN','Finance',NULL),
('CFIN','Corporate Finance',3),
('CMRKT','Capital Markets',4)
And I want to generate an XML file like this (a simplified tree-like structure):
<IND>
<PHARM>
</PHARM>
</IND>
<FIN>
<CFIN>
<CMRKT>
</CMRKT>
</CFIN>
</FIN>
I don't want to use recursion, as it would degrade performance dramatically: the table has more than 60000 records.
I would be glad to get the output in exactly this format, since I will be using the output XML to send a request.
More importantly, it has to be dynamic in nature.

Try this procedure. I am not sure about its efficiency, as I am creating a temporary table to get the result.
create procedure get_path as begin
    DECLARE @cnt INT
    DECLARE @n INT
    DECLARE @tmpTable TABLE(id int,
        indCode varchar(50),
        indName varchar(100),
        parentId int,
        path varchar(500))
    -- tbl is the industry table (tblIndustryCodes in the question)
    insert @tmpTable
    select [IdIndustry], [IndustryCode], [IndustryName], [ParentId],
        null from tbl
    -- number of root rows; their path stays null until the very end
    select @cnt = count(*) from @tmpTable where parentId is null
    -- first level: direct children of the roots
    update a set a.path = CONCAT(b.indName,'/',a.indName) from @tmpTable a, @tmpTable b where b.parentid is null and a.parentid = b.id
    select @n = count(*) from @tmpTable where path is null
    -- one pass per deeper level, until only the roots are left without a path
    while (@cnt < @n) begin
        -- b.path already ends with b.indName, so only a.indName is appended
        update a set a.path = concat(b.path, '/', a.indName) from @tmpTable a, @tmpTable b where a.path is null and b.path is not null and a.parentid = b.id
        select @n = count(*) from @tmpTable where path is null
    end
    update @tmpTable set path = indName where parentid is null
    select * from @tmpTable order by path
end
go
Query 1:
exec get_path
Results:
| ID | INDCODE | INDNAME | PARENTID | PATH |
-------------------------------------------------------------------------------
| 3 | FIN | Finance | (null) | Finance |
| 4 | CFIN | Corporate | 3 | Finance/Corporate |
| 5 | CMRKT | Capital M | 4 | Finance/Corporate/Capital M |
| 1 | IND | Industry | (null) | Industry |
| 2 | PHARM | Pharmacy | 1 | Industry/Pharmacy |
Hope this helps.....

Related

Destruct column into multiple columns with Regex on Azure?

I've encountered a problem which I think can only be solved with regex functions.
Sadly, support for regex-based operations seems to be very poor on the Microsoft side.
(Forgive me if I'm wrong; this is the first time I've had to use this platform.)
What I have:
~400 million records in an MS Azure SQL DB
encoded functionality that I have to decode into multiple columns (later I'll join metadata on these columns)
What I need:
a regex-based function which can parse out the data (the output will be written into the de1-6 columns)
The encoded column (the one that needs to be decoded) looks like this:
|encoded_val |
|-------------|
|PIT273OF_21 |
|PT273CT_21 |
|LT171CT2_31 |
|TV273JM_11 |
|TV273CND_13 |
|FIT865_11_CLC|
|AT865_104 |
|E865MFSP01 |
|LIT273CU_61 |
|E273_RH |
|E273CU_GTH |
|VSZ171JM_31 |
|E171CU_GTH |
|IT171RC_11 |
|WY171CU_61N |
|FV864_11 |
I need to decode this column with a regexp and create multiple columns like this:
| encoded | de1 | de2 | de3 | de4 | de5 | de6 |
|---------------|-----|-----|------|------|------|------|
| PIT273OF_21 | PIT | 273 | OF | NULL | 21 | NULL |
| PT273CT_21 | PT | 273 | CT | NULL | 21 | NULL |
| LT171CT2_31 | LT | 171 | CT | 2 | 31 | NULL |
| TV273JM_11 | TV | 273 | JM | NULL | 11 | NULL |
| TV273CND_13 | TV | 273 | CND | NULL | 13 | NULL |
| FIT865_11_CLC | FIT | 865 | NULL | NULL | 11 | CLC |
| AT865_104 | AT | 865 | NULL | NULL | 104 | NULL |
| E865MFSP01 | E | 865 | MFSP | 01 | NULL | NULL |
| LIT273CU_61 | LIT | 273 | CU | NULL | 61 | NULL |
| E273_RH | E | 273 | NULL | NULL | NULL | RH |
| E273CU_GTH | E | 273 | CU | NULL | NULL | GTH |
| VSZ171JM_31 | VSZ | 171 | JM | NULL | 31 | NULL |
| E171CU_GTH | E | 171 | CU | NULL | NULL | GTH |
| IT171RC_11 | IT | 171 | RC | NULL | 11 | NULL |
| WY171CU_61N | WY | 171 | CU | NULL | 61 | N |
| FV864_11 | FV | 864 | NULL | NULL | 11 | NULL |
Problems:
the format is not fixed
the length of the blocks can vary
there can be missing values
... but let's say I have some magic regex pattern that can parse the useful data out of any string
What I've tried:
computed columns - they seem to only construct new columns from existing ones, not to split existing columns into new ones in a clever way with pattern matching
user-defined functions - I have not yet figured out how they could help, but they seem promising
overcomplicated functions that are impossible to understand or maintain, and whose execution time seems to be unacceptable
update [bd].[table]
set unitid = cast(LEFT(SUBSTRING([colname],PATINDEX('%[0-9]%',[colname]),100),PATINDEX('%[^0-9]%',SUBSTRING([colname],PATINDEX('%[0-9]%',[colname]),100) + '*') -1) as smallint);
Question:
What is the appropriate way to do this fast (on 400M records)?
CREATE TABLE [dbo].[Sample](
[id] [int] IDENTITY(1,1) NOT NULL,
[encoded] [varchar](60) NOT NULL,
CONSTRAINT [PK_Sample] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
insert into [Sample] values ('FIT865_11_CLC')
insert into [Sample] values ('PIT273OF_21')
declare @count int
declare @id int
declare @content varchar(50)
declare @table table
(
    id int,
    String varchar(256)
)
set @count = (select count(1) from sample)
set @id = 0
-- walk the rows one by one, in id order
while(@count > 0)
begin
    if (@id > 0)
        select top(1) @id = id, @content = encoded from Sample where id > @id order by id
    else
        select top(1) @id = id, @content = encoded from Sample order by id
    declare @len int = len(@content)
    declare @position int = 1
    declare @output varchar(256) = ''
    declare @currentChar varchar(1)
    declare @lastChar varchar(1)
    while(@len > 0)
    begin
        set @currentChar = (select substring(@content, @position, 1))
        if(@position = 1)
        begin
            set @lastChar = @currentChar
        end
        if(ISNUMERIC(@currentChar)<> ISNUMERIC(@lastChar)) -- switch between digit and non-digit: start a new block
            set @output = @output + ',' + @currentChar
        else
            set @output = @output + @currentChar
        set @len = @len -1
        set @position = @position + 1
        set @lastChar = @currentChar
    end
    set @count = @count - 1
    -- underscores are separators too
    set @output = (select replace (@output,'_',','))
    insert into @table values (@id, @output)
end
SELECT DISTINCT B.*
FROM @table A
CROSS APPLY (
SELECT
JSON_VALUE(J,'$[0]') AS de0
,JSON_VALUE(J,'$[1]') AS de1
,JSON_VALUE(J,'$[2]') AS de2
,JSON_VALUE(J,'$[3]') AS de3
,JSON_VALUE(J,'$[4]') AS de4
,JSON_VALUE(J,'$[5]') AS de5
,JSON_VALUE(J,'$[6]') AS de6
,JSON_VALUE(J,'$[7]') AS de7
FROM (VALUES ('["'+replace(replace(String,'"','\"'),',','","')+'"]')) A(J)
) B
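For 400M rows, a set-based query usually scales better than a row-by-row WHILE loop. The following is only a sketch of that idea, using PATINDEX and CROSS APPLY against the Sample table above. It extracts just the first two blocks (de1 and de2); the aliases firstDigit, rest and digitCount are names made up for illustration, and the remaining blocks would need further steps of the same kind.

```sql
-- Sketch: split off the leading letters (de1) and the first run of digits (de2)
-- without a loop. NULLIF turns "no digit found" (PATINDEX = 0) into NULL, so
-- LEFT/SUBSTRING return NULL instead of failing on a negative length.
SELECT s.encoded,
       de1 = LEFT(s.encoded, p.firstDigit - 1),
       de2 = LEFT(r.rest, d.digitCount)
FROM dbo.Sample AS s
CROSS APPLY (SELECT firstDigit = NULLIF(PATINDEX('%[0-9]%', s.encoded), 0)) AS p
CROSS APPLY (SELECT rest = SUBSTRING(s.encoded, p.firstDigit, 60)) AS r
-- the appended '*' guarantees a non-digit is found even if the string ends in digits
CROSS APPLY (SELECT digitCount = PATINDEX('%[^0-9]%', r.rest + '*') - 1) AS d;
```

For PIT273OF_21 this yields de1 = PIT and de2 = 273.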

SQL Server - SQL query to convert 0 or 1 or 2 rows into a single row with 2 columns

I have a schema as below
Test
--------------------
| Id | Name |
--------------------
| 1 | A001 |
| 2 | B001 |
| 3 | C001 |
--------------------
RelatedTest
---------------------------------
| Id | Name | TestId |
---------------------------------
| 1 | Jack | NULL |
| 2 | Joe | 2 |
| 3 | Jane | 3 |
| 4 | Julia | 3 |
---------------------------------
To briefly explain this schema: RelatedTest has a nullable FK to Test, and a given FK value can appear 0, 1 or 2 times, but never more than 2.
I am after a T-SQL query that reports the data in Test in the following format:
TestReport
---------------------------------------------------------------------------
| TestId | TestName | RelatedTestName1 | RelatedTestName2 |
---------------------------------------------------------------------------
| 1 | A001 | NULL | NULL |
| 2 | B001 | Joe | NULL |
| 3 | C001 | Jane | Julia |
I can safely assume that TestReport will not need any more than two columns for RelatedTestName.
The schema is beyond my control and I am just looking to query it for some reporting.
I've been trying to use the PIVOT function, but I'm not entirely sure how to use it so that RelatedTestName1 and RelatedTestName2 can be NULL when there are no RelatedTest records. Also, since RelatedTestName is a varchar, I'm not sure how to apply an appropriate aggregate, if that's what is needed.
Preparing Data:
DROP TABLE IF EXISTS Test
GO
CREATE TABLE Test (Id INT PRIMARY KEY, Name VARCHAR(10)) ON [PRIMARY]
GO
INSERT INTO Test Values
(1, 'A001')
,(2, 'B001')
,(3, 'C001')
GO
DROP TABLE IF EXISTS RelatedTest
GO
CREATE TABLE RelatedTest (
Id INT,
Name VARCHAR(10),
TestId INT FOREIGN KEY REFERENCES Test (Id)
) ON [PRIMARY]
GO
INSERT INTO RelatedTest Values
(1, 'Jack', NULL)
,(2, 'Joe', 2)
,(3, 'Jane', 3)
,(4, 'Julia', 3)
GO
Query:
;WITH CTE AS
(
SELECT TestId = T.Id
,TestName = T.Name
,RelatedTestName = RT.Name
,RN = ROW_NUMBER() OVER(PARTITION BY T.Id ORDER BY RT.Id ASC)
FROM Test T
LEFT JOIN RelatedTest RT
ON T.Id = RT.TestId
)
SELECT DISTINCT
C.TestId
,C.TestName
,RelatedTestName1 = (SELECT RelatedTestName FROM CTE A WHERE A.TestId = C.TestId AND A.RN = 1)
,RelatedTestName2 = (SELECT RelatedTestName FROM CTE A WHERE A.TestId = C.TestId AND A.RN = 2)
FROM CTE C;
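Since MAX also works on varchar, the two correlated subqueries can be replaced by conditional aggregation. A sketch against the same Test and RelatedTest tables (the CTE name Ranked is an arbitrary choice):

```sql
-- Sketch: number the related rows per test, then fold them into two columns.
WITH Ranked AS
(
    SELECT T.Id    AS TestId,
           T.Name  AS TestName,
           RT.Name AS RelatedTestName,
           ROW_NUMBER() OVER (PARTITION BY T.Id ORDER BY RT.Id) AS RN
    FROM Test AS T
    LEFT JOIN RelatedTest AS RT
        ON RT.TestId = T.Id
)
SELECT TestId,
       TestName,
       -- CASE without ELSE yields NULL, and MAX ignores NULLs, so a missing
       -- related row simply leaves the column NULL
       MAX(CASE WHEN RN = 1 THEN RelatedTestName END) AS RelatedTestName1,
       MAX(CASE WHEN RN = 2 THEN RelatedTestName END) AS RelatedTestName2
FROM Ranked
GROUP BY TestId, TestName
ORDER BY TestId;
```

The GROUP BY makes the DISTINCT unnecessary, and the CTE is referenced only once.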

Insert on a child table and update FK on parent

I have a parent table with the following structure and data:
---------------------------------------------
| Id | TranslationId | Name |
---------------------------------------------
| 1 | NULL | Image1.jpg |
| 2 | NULL | Image7.jpg |
| 3 | NULL | Picture_Test.png |
---------------------------------------------
And the empty child table which holds the translated images:
-------------------------------------------------------------------------
| Id | De | Fr | En |
-------------------------------------------------------------------------
| | | | |
-------------------------------------------------------------------------
Now I'm looking for a single query, or at least a few queries, which I can run without any further programming. Doing this job with scripting or programming would be easy, but I often have situations where I need this kind of insert/update, and developing a small console app each time is not feasible.
At the end the two tables should look like this:
---------------------------------------------
| Id | TranslationId | Name |
---------------------------------------------
| 1 | 28 | NULL |
| 2 | 29 | NULL |
| 3 | 30 | NULL |
---------------------------------------------
-------------------------------------------------------------------------
| Id | De | Fr | En |
-------------------------------------------------------------------------
| 28 | Image1.jpg | NULL | NULL |
| 29 | Image7.jpg | NULL | NULL |
| 30 | Picture_Test.png | NULL | NULL |
-------------------------------------------------------------------------
Thank you for any advice.
You could do something like the below:
INSERT INTO Child
(
Id
,De
,Fr
,En
)
OUTPUT Inserted.Id INTO #Temp
SELECT Id
,De
,Fr
,En
FROM @Values -- if you are passing the rows in as a table type, for a set-based insert into the Child table
;WITH CTE
AS
(
SELECT ROW_NUMBER() OVER(ORDER BY Id) AS Rnk
,Id
FROM #Temp
)
,CTE1 AS
(
SELECT ROW_NUMBER() OVER(ORDER BY Id) AS Rnk
,*
FROM Parent
)
UPDATE cte1
SET TranslationId = cte.Id
FROM CTE1 cte1
JOIN CTE cte ON cte.Rnk = cte1.Rnk
Demo, assuming Name is unique in the first table
create table tab1 (
id int identity
,TranslationId int null
,Name nvarchar(max) null
);
insert tab1 (Name)
values
('Image1.jpg')
,('Image7.jpg')
,('Picture_Test.png')
,(null)
create table tab2 (
id int identity (100,1)
,De nvarchar(max) null
,Fr nvarchar(max) null
,En nvarchar(max) null
);
-- Update them
declare @map table(
name nvarchar(max)
,ref int
);
insert tab2 (de)
output inserted.De, inserted.id
into @map(Name, ref)
select Name
from tab1 src
where Name is not null and not exists (select 1 from tab2 t2 where t2.De = src.Name);
update t1 set TranslationId = ref, Name = null
from tab1 t1
join @map m on t1.Name = m.Name;
select * from tab1;
select * from tab2;
I figured out in the meantime how to do it. Applied on the database, the query looks like this:
DECLARE @Temp TABLE (ImageId INT, Id INT)
MERGE INTO Translation USING
(
SELECT Image.Name AS Name, Image.Id AS ImageId
FROM Candidate
INNER JOIN Candidacy ON Candidate.Id = Candidacy.CandidateId
INNER JOIN Election ON Candidacy.ElectionId = Election.Id
INNER JOIN SmartVoteCandidate ON Candidate.Id = SmartVoteCandidate.CandidateId
INNER JOIN Image ON SmartVoteCandidate.SpiderImageId = Image.Id
WHERE Election.Id = 1575) AS temp ON 1 = 0 -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
INSERT (De)
VALUES (temp.Name)
OUTPUT temp.ImageId, INSERTED.Id
INTO @Temp (ImageId, Id);
UPDATE Image
SET Image.TranslationId = t.Id, Name = NULL
FROM @Temp t
WHERE Image.Id = t.ImageId;
The solution is heavily inspired by
Is it possible to for SQL Output clause to return a column not being inserted?
Using a join in a merge statement

SQL - How do I add the same Scope_Identity() value in two tables in the same stored procedure, provided I'm adding data into them in the form of TVP?

I have the following tables -
CREATE TABLE Entity
(
EntityId INT IDENTITY(1,1) PRIMARY KEY,
EntityName NVARCHAR(20),
EntityShortName NVARCHAR(5),
EntityDescription NVARCHAR(100)
)
CREATE TABLE EntityAttributes
(
EntityId INT FOREIGN KEY REFERENCES Entity(EntityId),
AttributeType INT,
AttributeValue NVARCHAR(20)
)
And the following TVPs that I'm making use of to add values into these tables -
CREATE TYPE TVP_Entity AS TABLE
(
EntityName NVARCHAR(20),
EntityShortName NVARCHAR(5),
EntityDescription NVARCHAR(100)
)
CREATE TYPE TVP_EntityAttributes AS TABLE
(
AttributeType INT,
AttributeValue NVARCHAR(20)
)
An entity would be defined with Name, ShortName, Description in the first table and other attributes which will be stored in the second table (in EAV form).
I want to write a stored procedure with which I can add multiple entities in a single execution using the TVPs defined above. How can I accomplish this? I'm stuck at the point where I'm unable to use SCOPE_IDENTITY() to add values to the second table as I'm inserting multiple entities.
Try this by adding the EntityName to the TVP_EntityAttributes type. Full example:
MS SQL Server 2008 Schema Setup:
CREATE TABLE Entity
(
EntityId INT IDENTITY(1,1) PRIMARY KEY,
EntityName NVARCHAR(20) UNIQUE,
EntityShortName NVARCHAR(5),
EntityDescription NVARCHAR(100)
)
CREATE TABLE EntityAttributes
(
EntityId INT FOREIGN KEY REFERENCES Entity(EntityId),
AttributeType INT,
AttributeValue NVARCHAR(20)
)
CREATE TYPE TVP_Entity AS TABLE
(
EntityName NVARCHAR(20),
EntityShortName NVARCHAR(5),
EntityDescription NVARCHAR(100)
)
CREATE TYPE TVP_EntityAttributes AS TABLE
(
EntityName NVARCHAR(20),
AttributeType INT,
AttributeValue NVARCHAR(20)
)
GO
Now add your stored procedure:
CREATE PROCEDURE AddEntity
(
    @entities TVP_Entity READONLY,
    @attributes TVP_EntityAttributes READONLY
)
AS
-- insert the entities first, so that their identity values exist
INSERT INTO Entity(EntityName, EntityShortName, EntityDescription)
SELECT [EntityName], [EntityShortName], [EntityDescription]
FROM @entities
-- then look the new ids up by the unique EntityName
INSERT INTO EntityAttributes([EntityId],[AttributeType],[AttributeValue])
SELECT E.EntityId, A.[AttributeType], A.[AttributeValue]
FROM Entity E
INNER JOIN @attributes A
ON E.EntityName = A.EntityName
Query 1:
DECLARE @entities TVP_Entity
DECLARE @attributes TVP_EntityAttributes
INSERT INTO @entities(EntityName, EntityShortName, EntityDescription)
SELECT 'Entity1', 'E1', 'Entity One'
UNION
SELECT 'Entity2', 'E2', 'Entity Two'
UNION
SELECT 'Entity3', 'E3', 'Entity Three'
INSERT INTO @attributes(EntityName, AttributeType, AttributeValue)
SELECT 'Entity1', 1, 2
UNION
SELECT 'Entity1', 2, 3
UNION
SELECT 'Entity2', 3, 2
UNION
SELECT 'Entity2', 4, 3
UNION
SELECT 'Entity3', 5, 4
UNION
SELECT 'Entity3', 6, 4
UNION
SELECT 'Entity3', 7, 21
UNION
SELECT 'Entity3', 8, 32
EXEC AddEntity @entities, @attributes
SELECT *
FROM Entity E
INNER JOIN EntityAttributes EA
ON E.EntityId = EA.EntityId
Results:
| EntityId | EntityName | EntityShortName | EntityDescription | EntityId | AttributeType | AttributeValue |
|----------|------------|-----------------|-------------------|----------|---------------|----------------|
| 1 | Entity1 | E1 | Entity One | 1 | 1 | 2 |
| 1 | Entity1 | E1 | Entity One | 1 | 2 | 3 |
| 2 | Entity2 | E2 | Entity Two | 2 | 3 | 2 |
| 2 | Entity2 | E2 | Entity Two | 2 | 4 | 3 |
| 3 | Entity3 | E3 | Entity Three | 3 | 5 | 4 |
| 3 | Entity3 | E3 | Entity Three | 3 | 6 | 4 |
| 3 | Entity3 | E3 | Entity Three | 3 | 7 | 21 |
| 3 | Entity3 | E3 | Entity Three | 3 | 8 | 32 |
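SCOPE_IDENTITY() returns only a single value, so it cannot cover a multi-row insert. The OUTPUT clause can capture every EntityId generated by the insert instead. The following is a sketch under the same schema and TVP types as in the schema setup above; the procedure name AddEntityWithOutput and the table variable @newIds are hypothetical names chosen for illustration.

```sql
CREATE PROCEDURE AddEntityWithOutput
(
    @entities TVP_Entity READONLY,
    @attributes TVP_EntityAttributes READONLY
)
AS
BEGIN
    SET NOCOUNT ON;

    -- holds the identity values generated by this call only
    DECLARE @newIds TABLE (EntityId INT, EntityName NVARCHAR(20));

    INSERT INTO Entity (EntityName, EntityShortName, EntityDescription)
    OUTPUT inserted.EntityId, inserted.EntityName
        INTO @newIds (EntityId, EntityName)
    SELECT EntityName, EntityShortName, EntityDescription
    FROM @entities;

    -- attach the attributes to the freshly inserted entities by name
    INSERT INTO EntityAttributes (EntityId, AttributeType, AttributeValue)
    SELECT n.EntityId, a.AttributeType, a.AttributeValue
    FROM @newIds AS n
    INNER JOIN @attributes AS a
        ON a.EntityName = n.EntityName;
END
```

It is called the same way as AddEntity. The second insert joins against the small @newIds table rather than the whole Entity table.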

Inserting four columns into one

Good morning,
I have a table TestSeed that stores a multiple-choice test with the following structure:
QNo QText QA1 QA2 QA3 QA4
It already contains data.
I would like to move some of the columns to a temp table with the following structure:
QNo QA
Where QNo will store the question number from the first table and QA will store QA1, QA2, QA3 and QA4 over four rows of data.
I am trying to do it in a SQL stored procedure, and it comes down to the following:
I want to create a nested loop where I can go through the TestSeed table rows in the outer loop and then go through the four QA fields and insert them in the inner loop.
So my code will look something like this:
DECLARE @TempAnswers AS TABLE
(
    [QNo] int,
    [QAnswer] [nvarchar](50) NULL
)
DECLARE @QNO int
DECLARE QROW CURSOR LOCAL FOR select QNo from @TempSeed
OPEN QROW
FETCH NEXT FROM QROW into @QNO
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @i INT
    SET @i = 1
    WHILE (@i <=4)
    Begin
        insert into @TempAnswers
        (
            [QNo],
            [QAnswer]
        )
        select QNo, 'QA'+@i --This is the part I need
        from @TempSeed
        SET @i = @i +1
    END
    FETCH NEXT FROM QROW into @QNO
END
CLOSE QROW
DEALLOCATE QROW
So I guess my question is: can I use a concatenated string to refer to a column name in SQL, and if so, how?
I am sort of a beginner. I would appreciate any help I can get.
No need for a loop; you can simply use the UNPIVOT table operator to do this:
INSERT INTO temp
SELECT
QNO,
val
FROM Testseed AS t
UNPIVOT
(
val
FOR col IN([QA1], [QA2], [QA3], [QA4])
) AS u;
For example, if you have the following sample data:
| QNO | QTEXT | QA1 | QA2 | QA3 | QA4 |
|-----|-------|-----|-----|-----|-----|
| 1 | q1 | a | b | c | d |
| 2 | q2 | b | c | d | e |
| 3 | q3 | e | a | b | c |
| 4 | q4 | a | c | d | e |
| 5 | q5 | c | d | e | a |
The previous query will fill the temp table with:
| QNO | QA |
|-----|----|
| 1 | a |
| 1 | b |
| 1 | c |
| 1 | d |
| 2 | b |
| 2 | c |
| 2 | d |
| 2 | e |
| 3 | e |
| 3 | a |
| 3 | b |
| 3 | c |
| 4 | a |
| 4 | c |
| 4 | d |
| 4 | e |
| 5 | c |
| 5 | d |
| 5 | e |
| 5 | a |
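The UNPIVOT table operator converts the values of the four columns [QA1], [QA2], [QA3], [QA4] into rows, one row per value, all in a single column. Note that UNPIVOT skips NULL values, so a NULL answer column produces no row.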
Then you can put that query inside a stored procedure.
So, to answer your last question: you can use dynamic SQL, which involves building your query as a string and then executing it, in case you really want to stick with the method you already started.
You will have to declare a variable to store the text of your query:
DECLARE @query NVARCHAR(MAX)
-- @i is an INT, so it has to be cast before it can be concatenated
-- a table variable such as @TempSeed is not visible inside dynamic SQL, so this reads from TestSeed
SET @query = N'SELECT QNo, QA' + CAST(@i AS NVARCHAR(10)) + N' FROM TestSeed'
EXEC sp_executesql @query
This has to be done every time you build a query to execute (declaring the variable, setting the text of the query and executing it).
If you want something simpler, there are other answers here which will work.
Try this:
DECLARE @TempAnswers AS TABLE
(
    [QNo] int,
    [QAnswer] [nvarchar](50) NULL
);
INSERT INTO @TempAnswers(QNo, QAnswer)
SELECT QNo, QA
-- UNION ALL keeps every answer; a plain UNION would drop duplicate (QNo, QA) pairs
FROM (SELECT QNo, QA1 AS QA FROM TestSeed
    UNION ALL
    SELECT QNo, QA2 AS QA FROM TestSeed
    UNION ALL
    SELECT QNo, QA3 AS QA FROM TestSeed
    UNION ALL
    SELECT QNo, QA4 AS QA FROM TestSeed
    ) AS A
ORDER BY QNo;
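Another loop-free option is CROSS APPLY with a VALUES list, which keeps NULL answers and lets you record which column each value came from. A sketch, assuming the TestSeed table from the question; the AnswerNo column is an addition made purely for illustration.

```sql
DECLARE @TempAnswers TABLE
(
    QNo int,
    AnswerNo int,              -- 1..4: which of QA1..QA4 the value came from (illustrative extra column)
    QAnswer nvarchar(50) NULL
);

-- each TestSeed row is expanded into four rows, one per answer column
INSERT INTO @TempAnswers (QNo, AnswerNo, QAnswer)
SELECT t.QNo, v.AnswerNo, v.QA
FROM TestSeed AS t
CROSS APPLY (VALUES (1, t.QA1),
                    (2, t.QA2),
                    (3, t.QA3),
                    (4, t.QA4)) AS v (AnswerNo, QA);

SELECT QNo, AnswerNo, QAnswer
FROM @TempAnswers
ORDER BY QNo, AnswerNo;
```

Because the column references are written out in the VALUES list, no dynamic SQL is needed to address QA1 through QA4.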