Related
I've seen similar questions to mine, but the XML format has always been different, as the XML I have does not follow the "standard" structure. My table looks like the following (a single column whose row values are XML elements):
|VAL|
|<person name="bob" age="22" city="new york" occupation="student"></person>|
|<person name="bob" age="22" city="new york" occupation="student"></person>|
And the outcome I'm looking for is:
|Name|age|city |occupation|
|bob |22 |new york|student |
|bob |22 |new york|student |
I can create a hardcoded script with these column names, but the problem is that I have over 20 tables that would then all require a custom script. My thinking is that there should be a way to do this dynamically: given a destination table and a source (XML) table, a procedure would generate this data.
Your question is not entirely clear...
As far as I understand, you have a variety of different XMLs and you want to read them generically. If this is true, I'd suggest reflecting this in the sample data of your next question.
One general statement: there is no way around dynamically created statements in cases where you want to set the descriptive elements of a result set (in this case: the names of the columns) dynamically. T-SQL relies on some things being known in advance.
Try this:
I set up a mockup scenario to simulate your issue (please try to do this yourself in your next question):
DECLARE @tbl TABLE(ID INT IDENTITY, Descr VARCHAR(100), VAL XML);
INSERT INTO @tbl VALUES
 ('One person',N'<person name="bob" age="22" city="new york" occupation="student"></person>')
,('One more person',N'<person name="bob" age="22" city="new york" occupation="student"></person>')
,('One country',N'<country name="Germany" capital="Berlin" continent="Europe"></country>');
--this query relies on all possible attributes being known in advance.
--common attributes, like the name, are returned for a person and for a country
--differing attributes return as NULL.
--One advantage might be that you can use a specific datatype where appropriate.
SELECT t.ID
      ,t.Descr
      ,t.VAL.value('(/*[1]/@name)[1]','nvarchar(max)') AS [name]
      ,t.VAL.value('(/*[1]/@age)[1]','nvarchar(max)') AS [age]
      ,t.VAL.value('(/*[1]/@city)[1]','nvarchar(max)') AS [city]
      ,t.VAL.value('(/*[1]/@occupation)[1]','nvarchar(max)') AS [occupation]
      ,t.VAL.value('(/*[1]/@capital)[1]','nvarchar(max)') AS [capital]
      ,t.VAL.value('(/*[1]/@continent)[1]','nvarchar(max)') AS [continent]
FROM @tbl t;
--This query returns a classical EAV (entity-attribute-value) list
--In this result you get each attribute on its own row
SELECT t.ID
      ,t.Descr
      ,A.attrs.value('local-name(.)','nvarchar(max)') AS AttrName
      ,A.attrs.value('.','nvarchar(max)') AS AttrValue
FROM @tbl t
CROSS APPLY t.VAL.nodes('/*[1]/@*') A(attrs);
Both approaches might be generated as statements at string level and then executed with EXEC() or sp_executesql.
Hint: One approach might be to insert the EAV list into a tolerant staging table and proceed from there with conditional aggregation, PIVOT or hardcoded VIEWs.
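For illustration only, a minimal sketch of that staging idea using conditional aggregation; the staging table variable @EavStaging and its columns are assumptions mirroring the EAV query above:
--hypothetical staging table, filled from the EAV query above
DECLARE @EavStaging TABLE(ID INT, Descr VARCHAR(100), AttrName NVARCHAR(100), AttrValue NVARCHAR(MAX));
--conditional aggregation: one row per entity, the known attributes turned back into columns
SELECT s.ID
      ,s.Descr
      ,MAX(CASE WHEN s.AttrName = N'name'       THEN s.AttrValue END) AS [name]
      ,MAX(CASE WHEN s.AttrName = N'age'        THEN s.AttrValue END) AS [age]
      ,MAX(CASE WHEN s.AttrName = N'city'       THEN s.AttrValue END) AS [city]
      ,MAX(CASE WHEN s.AttrName = N'occupation' THEN s.AttrValue END) AS [occupation]
FROM @EavStaging s
GROUP BY s.ID, s.Descr;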
Dynamic approach
In order to read the <person> elements we would need this:
SELECT t.ID
      ,t.Descr
      ,t.VAL.value('(/*[1]/@name)[1]','nvarchar(max)') AS [name]
      ,t.VAL.value('(/*[1]/@age)[1]','nvarchar(max)') AS [age]
      ,t.VAL.value('(/*[1]/@city)[1]','nvarchar(max)') AS [city]
      ,t.VAL.value('(/*[1]/@occupation)[1]','nvarchar(max)') AS [occupation]
FROM @tbl t
WHERE t.VAL.value('local-name(/*[1])','varchar(100)')='person';
All we have to do is to generate the changing part:
Try this:
A new mockup with a real table
CREATE TABLE SimulateYourTable(ID INT IDENTITY, Descr VARCHAR(100), VAL XML);
INSERT INTO SimulateYourTable VALUES
 ('One person',N'<person name="bob" age="22" city="new york" occupation="student"></person>')
,('One more person',N'<person name="bob" age="22" city="new york" occupation="student"></person>')
,('One country',N'<country name="Germany" capital="Berlin" continent="Europe"></country>');
--Filter for <person> entities
DECLARE @entityName NVARCHAR(100)='person';
--This is a string representing the command
DECLARE @cmd NVARCHAR(MAX)=
'SELECT t.ID
,t.Descr
***columns here***
FROM SimulateYourTable t
WHERE VAL.value(''local-name(/*[1])'',''varchar(100)'')=''***name here***''';
--with this we can create all the columns
--Hint: With SQL Server 2017+ there is STRING_AGG() - much simpler!
DECLARE @columns NVARCHAR(MAX)=
(
    SELECT CONCAT(',t.VAL.value(''(/*[1]/@',Attrib.[name],')[1]'',''nvarchar(max)'') AS ',QUOTENAME(Attrib.[name]))
    FROM SimulateYourTable t
    CROSS APPLY t.VAL.nodes('//@*') AllAttrs(a)
    CROSS APPLY (SELECT a.value('local-name(.)','varchar(max)')) Attrib([name])
    WHERE VAL.value('local-name(/*[1])','varchar(100)')=@entityName
    GROUP BY Attrib.[name]
    FOR XML PATH(''),TYPE
).value('.','nvarchar(max)');
--Now we stuff this into our command
SET @cmd=REPLACE(@cmd,'***columns here***',@columns);
SET @cmd=REPLACE(@cmd,'***name here***',@entityName);
--This is the command.
--Hint: You might use this to create physical VIEWs without the need to type them in...
PRINT @cmd;
You can use EXEC(@cmd) to execute this dynamic SQL and check the result.
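As a hedged sketch for the STRING_AGG() hint above (SQL Server 2017+ only), the column list could be built without the FOR XML PATH trick; the variable and table names match the script above:
DECLARE @columns2 NVARCHAR(MAX)=
(
    SELECT STRING_AGG(CONVERT(NVARCHAR(MAX),
               CONCAT(',t.VAL.value(''(/*[1]/@',Attrib.[name],')[1]'',''nvarchar(max)'') AS ',QUOTENAME(Attrib.[name]))), N' ')
    FROM
    (
        --distinct attribute names of the chosen entity
        SELECT DISTINCT a.value('local-name(.)','nvarchar(100)') AS [name]
        FROM SimulateYourTable t
        CROSS APPLY t.VAL.nodes('//@*') AllAttrs(a)
        WHERE t.VAL.value('local-name(/*[1])','varchar(100)') = @entityName
    ) Attrib
);
--@columns2 can then be stuffed into @cmd exactly like @columns above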
Newbie here on EAV (Entity-Attribute-Value) model of DB in SQL.
Just a background: I am using SQL Server 2016. The use of EAV is kind of a requirement at work so I am learning to do it one step at a time.
I recently learned how to do a dynamic PIVOT to return 800+ rows with 200+ columns in an EAV table.
See details here:
Converting 200+ rows to column in SQL Server using PIVOT
As successful as it was at returning the data I need, the performance was too slow - it took about 30 minutes to run. By the way, I am using the code as follows:
declare @pivot_col varchar(max);
declare @sql varchar(max);

select @pivot_col = STUFF(
    ( SELECT ',' + CAST([Col_Name] AS VARCHAR(max)) AS [text()]
      FROM ( select distinct [Col_Name] from tbl_Values ) A
      ORDER BY [Col_Name] FOR XML PATH('')), 1, 1, NULL
);

set @sql = 'SELECT *
FROM ( SELECT [Row_ID], [Col_Name], [Col_Value] FROM tbl_Values ) AS a
PIVOT (
    MAX([Col_Value])
    FOR [Col_Name] in (' + @pivot_col + ' )
) AS p
ORDER BY [Row_ID]';

exec ( @sql );
I am trying to incorporate a CURSOR with this but haven't gotten very far. Before I go further with my research, can you provide input on whether it would make any difference with regard to performance / speed?
Thanks!
Found a solution to the poor performance of my PIVOT query: I was told to create a clustered index on the Row_ID column I have in my table. I ran the query below:
CREATE CLUSTERED INDEX IX_tbl_Values_Row_ID
ON dbo.tbl_Values (Row_ID);
GO
And the query from my question, which took 30 minutes before, now runs in just 6 seconds! Thanks to @MohitShrivastava for the tip! It definitely worked.
I also referred to this before creating the clustered index:
https://learn.microsoft.com/en-us/sql/relational-databases/indexes/create-clustered-indexes?view=sql-server-ver15
I have a performance problem with a stored procedure. When I checked my benchmark results, I realized that "MatchxxxReferencesByIds" has an average LastElapsedTimeInSecond of 240.25 ms. How can I improve my procedure?
ALTER PROCEDURE [Common].[MatchxxxReferencesByIds]
    (@refxxxIds VARCHAR(MAX),
     @refxxxType NVARCHAR(250))
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRAN

    DECLARE @fake_tbl TABLE (xxxid NVARCHAR(50))

    INSERT INTO @fake_tbl
        SELECT LTRIM(RTRIM(split.a.value('.', 'NVARCHAR(MAX)'))) AS fqdn
        FROM
            (SELECT
                 CAST ('<M>' + REPLACE(@refxxxIds, ',', '</M><M>') + '</M>' AS XML) AS data
            ) AS a
            CROSS APPLY
            data.nodes ('/M') AS split(a)

    SELECT [p].[ReferencedxxxId]
    FROM [Common].[xxxReference] AS [p]
    WHERE ([p].[IsDeleted] = 0)
      AND (([p].[ReferencedxxxType] COLLATE Turkish_CI_AS = @refxxxType COLLATE Turkish_CI_AS)
      AND [p].[ReferencedxxxId] COLLATE Turkish_CI_AS IN (SELECT ft.xxxid COLLATE Turkish_CI_AS FROM @fake_tbl ft))

    COMMIT;
END;
One can only make assumptions without knowing the table's schema, indexes and data sizes.
Hard-coding collations can prevent the query optimizer from using any indexes on the ReferencedEntityId column. The field name and sample data '423423,423423,423432,23423' suggest this is a numeric column anyway (int? bigint?). The collation shouldn't be needed and the variable's column type should match the table's type.
Finally, a.value can return an int or bigint directly, which means the splitting query can be rewritten as:
declare @refEntityIds nvarchar(max)='423423,423423,423432,23423';

DECLARE @fake_tbl TABLE (entityid bigint, INDEX IX_TBL(entityid))

INSERT INTO @fake_tbl
    SELECT split.a.value('.', 'bigint') AS fqdn
    FROM
        (SELECT
             CAST ('<M>' + REPLACE(@refEntityIds, ',', '</M><M>') + '</M>' AS XML) AS data
        ) AS a
        CROSS APPLY
        data.nodes ('/M') AS split(a)
The input data contains some duplicates, so entityid can't be declared as a PRIMARY KEY; the table variable above only gets a plain index.
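If the duplicates should be removed while loading instead, one hedged option is a DISTINCT on the insert (same names as above):
INSERT INTO @fake_tbl
    SELECT DISTINCT split.a.value('.', 'bigint')   --de-duplicate the ids while loading
    FROM
        (SELECT CAST ('<M>' + REPLACE(@refEntityIds, ',', '</M><M>') + '</M>' AS XML) AS data) AS a
        CROSS APPLY data.nodes ('/M') AS split(a)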
After that, the query can change to:
SELECT [p].[ReferencedEntityId]
FROM [Common].[EntityReference] AS [p]
WHERE [p].[IsDeleted] = 0
AND [p].[ReferencedEntityType] COLLATE Turkish_CI_AS = @refEntityType COLLATE Turkish_CI_AS
AND [p].[ReferencedEntityId] IN (SELECT ft.entityid FROM #fake_tbl ft)
The next problem is the hard-coded collation. Unless it matches the column's actual collation, this prevents the server from using any indexes that cover that column. How to fix this depends on the actual data statistics. Perhaps the column's collation has to change or perhaps the rows after filtering by ReferencedEntityId are so few that there's no benefit to this.
Finally, IsDeleted can't be usefully indexed on its own. It's either a bit column whose values are 1/0 or another numeric column that still only contains 0/1. An index that's so bad at selecting rows won't be used by the query optimizer, because it's actually faster to just scan the rows returned by the other conditions.
A general rule is to put the most selective index column first. The database combines all columns to create one "key" value and construct a B+-tree index from it. The more selective the key, the fewer index nodes need to be scanned.
IsDeleted can still be used in a filtered index that indexes only the non-deleted rows. This allows the query optimizer to eliminate the deleted rows from the search. The resulting index will be smaller too, which means the same number of IO operations will load more index pages into memory and allow faster seeking.
All of this means that EntityReference should have an index like this one.
CREATE NONCLUSTERED INDEX IX_EntityReference_ReferenceEntityID
    ON Common.EntityReference (ReferencedEntityId, ReferencedEntityType)
    WHERE IsDeleted = 0;
If the collations don't match, ReferencedEntityType won't be used for seeking. If this is the most common case, we can remove ReferencedEntityType from the index key and put it in an INCLUDE clause. The field won't be part of the index key, although it will still be available for filtering without having to load data from the actual table:
CREATE NONCLUSTERED INDEX IX_EntityReference_ReferenceEntityID
    ON Common.EntityReference (ReferencedEntityId)
    INCLUDE(ReferencedEntityType)
    WHERE IsDeleted = 0;
Of course, if that's the most common case, the column's collation should be changed instead.
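If changing the collation is the route taken, a sketch of that change might look like the following; the NVARCHAR(250) length is an assumption, and any indexes or constraints that depend on the column have to be dropped and recreated around the change:
--assumption: ReferencedEntityType is NVARCHAR(250); adjust length/nullability to the real definition
ALTER TABLE Common.EntityReference
    ALTER COLUMN ReferencedEntityType NVARCHAR(250) COLLATE Turkish_CI_AS;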
Based on the execution plan of the stored procedure, what makes it perform slowly is the part where you work with the XML.
Let's rethink the solution:
I have created a table like this:
CREATE TABLE [Common].[EntityReference]
(
IsDeleted BIT,
ReferencedEntityType VARCHAR(100),
ReferencedEntityId VARCHAR(10)
);
GO
and populate it like this (insert 1M records into it):
DECLARE @i INT = 1000000;
DECLARE @isDeleted BIT,
        @ReferencedEntityType VARCHAR(100),
        @ReferencedEntityId VARCHAR(10);
WHILE @i > 0
BEGIN
    SET @isDeleted = (SELECT @i % 2);
    SET @ReferencedEntityType = 'TEST' + CASE WHEN @i % 2 = 0 THEN '' ELSE CAST(@i % 2 AS VARCHAR(100)) END;
    SET @ReferencedEntityId = CAST(@i AS VARCHAR(10));
    INSERT INTO [Common].[EntityReference]
    (
        IsDeleted,
        ReferencedEntityType,
        ReferencedEntityId
    )
    VALUES (@isDeleted, @ReferencedEntityType, @ReferencedEntityId);
    SET @i = @i - 1;
END;
Let's analyse your code:
You have a comma-delimited input (@refEntityIds) which you want to split and then run a query against those values. (Your SP's subtree cost on my PC is about 376.) To do so you have different approaches:
1. Pass a table variable containing the refEntityIds to the stored procedure (a sketch follows further below).
2. Make use of the STRING_SPLIT function to split the string.
Let's see the sample query:
INSERT INTO @fake_tbl
SELECT value
FROM STRING_SPLIT(@refEntityIds, ',');
Using this, you will gain a great performance improvement in your code (subtree cost: 6.19 without the following indexes). BUT this feature is not available in SQL Server 2008 (it requires SQL Server 2016+)!
You can use a replacement for this function (read this: https://stackoverflow.com/a/54926996/1666800) and change your query to this (subtree cost is still about 6.19):
INSERT INTO @fake_tbl
SELECT value FROM dbo.[fn_split_string_to_column](@refEntityIds,',')
In this case again you will see a notable performance improvement.
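For completeness, a minimal sketch of option 1 (a table-valued parameter); the type name dbo.EntityIdList and the procedure name are illustrative assumptions, and the column's datatype should match ReferencedEntityId's real type:
--hypothetical table type; match the real datatype of ReferencedEntityId
CREATE TYPE dbo.EntityIdList AS TABLE (entityid BIGINT NOT NULL);
GO
CREATE PROCEDURE Common.MatchEntityReferencesByIds_Tvp
    @refEntityIds dbo.EntityIdList READONLY,
    @refEntityType NVARCHAR(250)
AS
BEGIN
    SET NOCOUNT ON;
    SELECT p.ReferencedEntityId
    FROM Common.EntityReference AS p
    WHERE p.IsDeleted = 0
      AND p.ReferencedEntityType = @refEntityType
      AND p.ReferencedEntityId IN (SELECT ft.entityid FROM @refEntityIds AS ft);
END;
GO
The client then fills the table type and passes it in, so no string splitting is needed at all.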
You can also create a nonclustered index on the [Common].[EntityReference] table, which gives a small performance improvement too. But please think before creating an index; it might have a negative impact on your DML operations:
CREATE NONCLUSTERED INDEX [Index Name] ON [Common].[EntityReference]
(
[IsDeleted] ASC
)
INCLUDE ([ReferencedEntityType],[ReferencedEntityId])
In case I do not have this index (suppose I have replaced your split solution with mine), the subtree cost is 6.19. When I add the aforementioned index, the subtree cost decreases to 4.70, and finally when I change the index to the following one, the subtree cost is 5.16:
CREATE NONCLUSTERED INDEX [Index Name] ON [Common].[EntityReference]
(
[ReferencedEntityType] ASC,
[ReferencedEntityId] ASC
)
INCLUDE ([IsDeleted])
Thanks to @PanagiotisKanavos, the following index performs even better than the aforementioned ones (subtree cost: 3.95):
CREATE NONCLUSTERED INDEX IX_EntityReference_ReferenceEntityID
ON Common.EntityReference (ReferencedEntityId)
INCLUDE(ReferencedEntityType)
WHERE IsDeleted =0;
Also please note that using a transaction against a local table variable has almost no effect, and you can probably simply omit it.
1. If [p].[ReferencedEntityId] is going to contain integers, then you don't need to apply a COLLATE clause. You can apply the IN condition directly.
You can convert the comma-separated values into a list of integers using a table-valued function. There are many samples. Keep the datatype of the ID as integer, to avoid collations being applied (a minimal sketch follows after the snippet below).
[p].[ReferencedEntityId] IN (SELECT ft.entityid FROM @fake_tbl ft)
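A minimal inline TVF sketch of that idea (the function name dbo.fn_CsvToInt is a hypothetical placeholder; any of the well-known splitter implementations would do as well):
CREATE FUNCTION dbo.fn_CsvToInt (@csv VARCHAR(MAX))
RETURNS TABLE
AS
RETURN
(
    --shreds a comma separated string and returns one INT per element
    SELECT split.a.value('.', 'int') AS id
    FROM (SELECT CAST('<M>' + REPLACE(@csv, ',', '</M><M>') + '</M>' AS XML) AS data) AS x
    CROSS APPLY x.data.nodes('/M') AS split(a)
);
The IN condition then becomes [p].[ReferencedEntityId] IN (SELECT f.id FROM dbo.fn_CsvToInt(@refEntityIds) f), with no COLLATE needed.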
2. I don't think you need a TRAN. You are simply "shredding" your comma-separated values into a @variable table and doing a SELECT. A TRAN is not needed here.
Try an EXISTS:
SELECT [p].[ReferencedEntityId]
FROM [Common].[EntityReference] AS [p]
WHERE ([p].[IsDeleted] = 0)
  AND (([p].[ReferencedEntityType] COLLATE Turkish_CI_AS = @refEntityType COLLATE Turkish_CI_AS)
  AND EXISTS (SELECT 1 FROM @fake_tbl ft WHERE ft.entityid COLLATE Turkish_CI_AS = [p].[ReferencedEntityId] COLLATE Turkish_CI_AS))
3. See https://www.sqlshack.com/efficient-creation-parsing-delimited-strings/ for different ways to parse your delimited string.
quote from article:
Microsoft’s built-in function provides a solution that is convenient
and appears to perform well. It isn’t faster than XML, but it clearly
was written in a way that provides an easy-to-optimize execution plan.
Logical reads are higher, as well. While we cannot look under the
covers and see exactly how Microsoft implemented this function, we at
least have the convenience of a function to split strings that are
shipped with SQL Server. Note that the separator passed into this
function must be of size 1. In other words, you cannot use
STRING_SPLIT with a multi-character delimiter, such as ‘”,”’.
Post a screenshot of your execution plan. If you don't have a proper index (or you have "hints" that prevent the use of indexes), your query will never perform well.
I tried to shred XML into a temporary table using XQuery .nodes as follows, but I ran into a performance problem: the shredding takes a long time. Please give me an idea of alternatives for this.
My requirement is to pass bulk records to a stored procedure, parse those records, and do some operations based on the record values.
CREATE TABLE #DW_TEMP_TABLE_SAVE(
[USER_ID] [NVARCHAR](30),
[USER_NAME] [NVARCHAR](255)
)
insert into #DW_TEMP_TABLE_SAVE
select
A.B.value('(USER_ID)[1]', 'nvarchar(30)' ) [USER_ID],
A.B.value('(USER_NAME)[1]', 'nvarchar(30)' ) [USER_NAME]
from
@l_n_XMLDoc.nodes('//ROW') as A(B)
Specify the text() node in your value() calls.
insert into #DW_TEMP_TABLE_SAVE
select A.B.value('(USER_ID/text())[1]', 'nvarchar(30)' ) [USER_ID],
A.B.value('(USER_NAME/text())[1]', 'nvarchar(30)' ) [USER_NAME]
from @l_n_XMLDoc.nodes('/USER_DETAILS/RECORDSET/ROW') as A(B)
Not using text() will create a query plan that tries to concatenate the values from the specified node with all its child nodes, and I guess you don't want that in this scenario. If you don't use text(), the concatenation part of the query is done by the UDX operator, and it is a good thing not to have it in your plan.
Another thing to try is OPENXML. In some scenarios (large xml documents) I have found that OPENXML performs faster.
declare @idoc int

exec sp_xml_preparedocument @idoc out, @l_n_XMLDoc

insert into #DW_TEMP_TABLE_SAVE
select USER_ID, USER_NAME
from openxml(@idoc, '/USER_DETAILS/RECORDSET/ROW', 2)
with (USER_ID nvarchar(30), USER_NAME nvarchar(30))

exec sp_xml_removedocument @idoc
sql2005
This is my simplified example:
(in reality there are 40+ tables in here, I only showed 2)
I have a table called tb_modules, with 3 columns (id, description, tablename as varchar):
1, UserType, tb_usertype
2, Religion, tb_religion
(Last column is actually the name of a different table)
I have another table that looks like this:
tb_value (columns: id, tb_modules_ID, usertype_OR_religion_ID)
values:
1111, 1, 45
1112, 1, 55
1113, 2, 123
1114, 2, 234
So 45, 55, 123, 234 are usertype OR religion IDs
(45, 55 are usertype IDs; 123, 234 are religion IDs).
Don't judge, I didn't design the database
Question
How can I make a select showing * from tb_value, plus one column?
That one column would be TITLE from the tb_usertype table or RELIGIONNAME from the tb_religion table.
I would like to make this generic.
I was initially thinking about a SQL function that returns a string, but I think I would need dynamic SQL, which is not allowed in a function.
Does anyone have a better idea?
At the beginning we have the original design -- which is quite messy.
To clean-up a bit I add two views and a synonym:
create view v_Value as
select
ID as ValueID
, tb_modules_ID as ModuleID
, usertype_OR_religion_ID as RemoteID
from tb_value ;
go
create view v_Religion as
select
ID
, ReligionName as Title
from tb_religion ;
go
create synonym v_UserType for tb_UserType ;
go
Now that the model is cleaner, it is easier to write the query:
;
with
q_mod as (
select
m.ID as ModuleID
, coalesce(x1.ID , x2.ID) as RemoteID
, coalesce(x1.Title , x2.Title) as Title
, m.Description as ModuleType
from tb_Modules as m
left join v_UserType as x1 on m.TableName = 'tb_UserType'
left join v_Religion as x2 on m.TableName = 'tb_Religion'
)
select
a.ModuleID
, v.ValueID
, a.RemoteID
, a.ModuleType
, a.Title
from q_mod as a
join v_Value as v on (v.ModuleID = a.ModuleID and v.RemoteID = a.RemoteID) ;
There is an obvious pattern in this query, so it can be created as dynamic sql if you have to add another module-type table. When adding another table, use ID and Title to avoid having to use a view.
EDIT
To build the dynamic sql (or build the query at the application level):
Modify lines 6 and 7; the x-index is tb_modules.id
coalesce(x1. , x2. , x3. ..)
Add lines to the left join (below line 11)
left join v_SomeName as x3 on m.TableName = 'tb_SomeName'
The SomeName is tb_modules.description and the x-index matches tb_modules.id (a generation sketch follows below).
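A hedged sketch of generating those changing lines from tb_modules itself, assuming (as recommended above) every module table is reachable through a view named v_ + description with columns ID and Title:
--builds one 'left join' line per row in tb_modules;
--the coalesce() lists for RemoteID and Title can be generated the same way
DECLARE @joins NVARCHAR(MAX) = N'';
SELECT @joins = @joins + CHAR(10)
    + N'left join v_' + m.description + N' as x' + CAST(m.id AS VARCHAR(10))
    + N' on m.TableName = ''' + m.tablename + N''''
FROM tb_modules AS m;
PRINT @joins;   --paste into (or concatenate with) the query above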
EDIT 2
The simplest approach would probably be to package the above query into a view and then, each time the schema changes, dynamically create and run an ALTER VIEW. This way the query would not change from the application's point of view.
Since we're all agreed the design is flaky, I'll skip any comments on that. The pattern of the query is this:
-- Query 1
select tb_value.*,tb_religion.religion_name as ANY_DESCRIPTION
from tb_value
JOIN tb_religion on tb_value.ANY_KIND_OF_ID = tb_religion.id
WHERE tb_value.module_id = 2
-- combine it with...
UNION ALL
-- ...Query 2
select tb_value.*,tb_userType.title as ANY_DESCRIPTION
from tb_value
JOIN tb_userType on tb_value.ANY_KIND_OF_ID = tb_userType.id
WHERE tb_value.module_id = 1
-- combine it with...
UNION ALL
-- ...Query 3
select lather, rinse, repeat for 40 tables!
You can actually define a view that hardcodes all 40 cases, and then put filters onto queries for the particular modules you want (a sketch follows below).
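A short sketch of that hardcoded view; the name v_value_with_title is an assumption, the column names come from the question, and the remaining ~38 branches follow the same pattern:
CREATE VIEW v_value_with_title AS
SELECT v.*, u.title AS ANY_DESCRIPTION
FROM tb_value v
JOIN tb_usertype u ON v.usertype_OR_religion_ID = u.id
WHERE v.tb_modules_ID = 1
UNION ALL
SELECT v.*, r.religionname AS ANY_DESCRIPTION
FROM tb_value v
JOIN tb_religion r ON v.usertype_OR_religion_ID = r.id
WHERE v.tb_modules_ID = 2
-- ...one UNION ALL branch per row in tb_modules
Queries then just filter the view, e.g. SELECT * FROM v_value_with_title WHERE tb_modules_ID = 2.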
To do this dynamically you need to be able to create a sql statement that looks like this
select tb_value.*, tb_usertype.title as Descr
from tb_value
inner join tb_usertype
on tb_value.extid = tb_usertype.id
where tb_value.tb_module_id = 1
union all
select tb_value.*, tb_religion.religionname as Descr
from tb_value
inner join tb_religion
on tb_value.extid = tb_religion.id
where tb_value.tb_module_id = 2
-- union 40 other tables
Currently you can not do that because you do not have any information in the db telling you which column to use from tb_religion and tb_usertype etc. You can add that as a new field in tb_module.
If you have fieldname to use in tb_module you can build a view that does what you want.
And you could add a trigger on table tb_modules that alters the view whenever tb_modules is modified. That way you do not need to use dynamic SQL from the client when doing queries. The only thing you need to worry about is that the table needs to be created in the db before you add a new row to tb_modules.
Edit 1
Of course the code in the trigger needs to dynamically build the alter view statement.
Edit 2 You also need to have a field with information about which column in tb_usertype, tb_religion etc. to join against tb_value.extid (usertype_OR_religion_ID). Or you can assume that the field will always be called id.
Edit 3 Here is how you could build the trigger on tb_module that alters the view v_values. I have added fieldname as a column in tb_modules and I assume that the id field in the related tables is called id.
create trigger tb_modules_change on tb_modules after insert, delete, update
as
declare @sql nvarchar(max)
declare @moduleid int
declare @tablename varchar(50)
declare @fieldname varchar(50)

set @sql = 'alter view v_value as '

declare mcur cursor for
    select id, tablename, fieldname
    from tb_modules

open mcur
fetch next from mcur into @moduleid, @tablename, @fieldname

while @@FETCH_STATUS = 0
begin
    set @sql = @sql + 'select tb_value.*, '+@tablename+'.'+@fieldname+' '+
               'from tb_value '+
               'inner join '+@tablename+' '+
               'on tb_value.extid = '+@tablename+'.id '+
               'where tb_value.tb_module_id = '+cast(@moduleid as varchar(10))

    fetch next from mcur into @moduleid, @tablename, @fieldname
    if @@FETCH_STATUS = 0
    begin
        set @sql = @sql + ' union all '
    end
end

close mcur
deallocate mcur

exec sp_executesql @sql
Hmm... there are probably better solutions available, but here's my five cents:
SELECT
id,tb_modules_ID,usertype_OR_religion_ID,
COALESCE(
(SELECT TITLE FROM tb_usertype WHERE Id = usertype_OR_religion_ID),
(SELECT RELIGIONNAME FROM tb_religion WHERE Id = usertype_OR_religion_ID),
'N/A'
) AS SourceTable
FROM tb_value
Note that I don't have the possibility to check the statement right now, so apologies in advance for any syntax errors...
First, using your current design the only reasonable solution is dynamic SQL. You should write a module in your middle-tier that queries for the appropriate table names and builds the queries on the fly. Trying to accomplish that in T-SQL will be a nightmare. T-SQL was not designed for string construction.
The right solution is to build a new database designed properly, migrate the data and scrap the existing design. The problems you will encounter with your current design will simply grow. It will be harder for new developers to learn the new system. It will be prone to errors. There will be no data integrity (e.g. forcing the attribute "Start Date" to be parsable as a date). Custom queries will be a chore to write and so on. Eventually, you will hit the day when the types of information desired from the system are simply too difficult to extract given the current design.
First take the undesigner out the back and put them out of their misery. They are hurting people.
Due to their incompetence, every time you add a row to Module, you have to modify every query that uses it. Good for www.dailywtf.com.
You do not have Referential Integrity either, because you cannot define an FK on the this_or_that column. Your data is exposed, probably to "code" written by the same undesigner. No doubt you are aware that this is where the deadlocks are created.
That it is a "judgement" is so that you understand the gravity of the undesign, and so that you can justify replacing it to your managers.
SQL was designed for Relational Databases, that means Normalised. It is not good for mangled files. Sure, some queries may be better than others (just look at the answers), but there is no way to get around the undesign, any SQL query will be hamstrung, and need change whenever a Module row is added.
"Dynamic" is reserved for Databases, not possible for flat flies.
Two answers. One to stop the continuing idiocy of changing the existing queries every time a Module row is added (you're welcome); the second to answer your question.
Safe Future Queries
CREATE VIEW UserType_vw AS
SELECT [XxxxId]     = id,          -- replace Xxxx
       [UserTypeId] = usertype_OR_religion_ID
FROM tb_value
WHERE tb_modules_ID = 1
CREATE VIEW UserReligion_vw AS
SELECT [XxxxId]     = id,
       [ReligionId] = usertype_OR_religion_ID
FROM tb_value
WHERE tb_modules_ID = 2
From now on, make sure that all queries currently using the undesign are modified to use the correct View instead. Do not use the Views for Update/Delete/Insert.
Answer
Ok, now for the main question. I can think of other approaches, but this one is the best. You have stated you want the third column to also be an unnormalised piece of chicken excreta and supply the Title for [EITHER_Religion_OR_UserType_OR_This_OR_That]. Right, so you are teaching the user to be confused as well; when the number of modules grows, they will have great fun figuring out what the column contains. Yes, a problem always compounds itself.
SELECT [XxxxId]   = id,
       [Whatever] = CASE tb_modules_ID
           WHEN 1 THEN ( SELECT name -- title, whatever the column is called
                           FROM tb_usertype
                          WHERE id = V.usertype_OR_religion_ID
                       )
           WHEN 2 THEN ( SELECT name -- religionname, whatever the column is called
                           FROM tb_religion
                          WHERE id = V.usertype_OR_religion_ID
                       )
           ELSE '(UnknownModule)' -- do not remove the brackets
           END
FROM tb_value V
WHERE conditions... -- you need something here
This is called a Correlated Scalar Subquery.
It works on any version of Sybase since 4.9.2 with no limitations. And SQL 2005 (last time I looked, anyway, Aug 2009). But on MS you will get a StackTrace if the volume of tb_value is large, so make sure the WHERE clause has some conditions on it.
But MS have broken the server with their "new" 2008 codeline, so it does not work in all circumstances (the worse your mangled files, the less likely it will work; the better your database design, the more likely it will work). That is why some MS people pray every day for the next Service pack, and others never attend church.
I guess you want something like this:
Adding tables and one row per table into tb_modules is straight forward.
SET NOCOUNT ON
if OBJECT_ID('tb_modules') > 0 drop table tb_modules;
if OBJECT_ID('tb_value') > 0 drop table tb_value;
if OBJECT_ID('tb_usertype') > 0 drop table tb_usertype;
if OBJECT_ID('tb_religion') > 0 drop table tb_religion;
go
create table dbo.tb_modules (
id int,
description varchar(20),
tablename varchar(255)
);
insert into tb_modules values ( 1, 'UserType', 'tb_usertype');
insert into tb_modules values ( 2, 'Religion', 'tb_religion');
create table dbo.tb_value(
id int,
tb_modules_ID int,
usertype_OR_religion_ID int
);
insert into tb_value values ( 1111, 1, 45);
insert into tb_value values ( 1112, 1, 55);
insert into tb_value values ( 1113, 2, 123);
insert into tb_value values ( 1114, 2, 234);
create table dbo.tb_usertype(
id int,
UserType varchar(30)
);
insert into tb_usertype values ( 45, 'User_type_45');
insert into tb_usertype values ( 55, 'User_type_55');
create table dbo.tb_religion(
id int,
Religion varchar(30)
);
insert into tb_religion values ( 123, 'Religion_123');
insert into tb_religion values ( 234, 'Religion_234');
-- start of query
declare @sql varchar(max) = null

Select @sql = case when @sql is null then ' ' else @sql + char(10) + 'union all ' end
            + 'Select ' + str(id) + ' type, id, ' + description + ' description from ' + tablename
from tb_modules

set @sql = 'select v.id, tb_modules_ID, usertype_OR_religion_ID, t.description
from tb_value v
join ( ' + @sql + ') as t
  on v.tb_modules_ID = t.type and v.usertype_OR_religion_ID = t.id
'
Print @sql
exec( @sql)
I think it's intended to be used with dynamic SQL.
Maybe break out each tb_value.tb_modules_ID group into its own temp table, named after tb_modules.tablename.
Then have a stored procedure iterate through the temp tables matching your naming convention (by prefix or suffix), building the SQL and doing your join.