Adding N dynamic columns in a SQL query

I have a table called datarecords which contains 7 fixed columns that are always required in the select query. A user can add as many custom columns as they want. I store the column definitions in a table called datacolumn, and the values in another table called datavalue.
Now I want to create a query that brings back the 7 fixed columns from datarecords, adds the custom columns, and brings in the data values from these tables, since each data record has a corresponding value in the datavalue table.

You can try to PIVOT the custom attributes from rows into columns, but you'll find that even with support for PIVOT in Microsoft SQL Server, you need to know the attributes in advance of writing the query, and the query code needs to specify all the attributes. There's no way in SQL to ask for all the custom attributes to magically fill as many columns as necessary.
You can retrieve an arbitrary number of custom attributes only by fetching them row by row, as they are stored in the database. Then write application code to loop over the results. If you want, you can write a class to map the multiple rows of custom attributes into fields of an object in your application.
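As a sketch of that row-by-row approach, using the table names from the question (the join columns here are hypothetical; adjust them to your actual schema):

SELECT r.*, c.column_name, v.value
FROM datarecords r
JOIN datavalue v ON v.record_id = r.id -- assumed foreign key to datarecords
JOIN datacolumn c ON c.id = v.column_id -- assumed foreign key to datacolumn
WHERE r.id = 1
ORDER BY c.column_name

Your application code then reads one row per custom attribute and maps them all onto a single record object.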
It's awkward and inelegant to query non-relational data using SQL. This is because SQL is designed to assume each logical entity of the same type has a fixed number of columns, and that you know the columns before you write the query. If your entity has variable attributes, it can't be stored as a relation, by definition.
Many people try to work around this limitation using the design you're using, but they find it's hard to manage and doesn't scale well. This design is usually called the Entity-Attribute-Value model, or key-value pairs. For more details on the pitfalls of the EAV design, see my book SQL Antipatterns.
If you need to support custom attributes, here are a few alternatives:
Store all the custom attributes together in a BLOB, with some internal structure to delimit field names and values (Serialized LOB). You can optionally create inverted indexes to help you look up rows where a given field has a given value (see How FriendFeed Uses MySQL).
Use a document-oriented database such as MongoDB or Solr for the dynamic data.
Use ALTER TABLE to add conventional columns to the table when users need custom attributes. This means you either need to enforce the same set of custom attributes for all users, store all users' custom attributes in one table and hope it doesn't get too wide (Single Table Inheritance), or create a separate table per user, either for all columns (Concrete Table Inheritance) or for just the custom columns (Class Table Inheritance; a sketch follows below).
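For instance, here is a minimal sketch of the last option, Class Table Inheritance, with one side table per user holding only that user's custom columns (all names are hypothetical, and datarecords is assumed to have an id primary key):

CREATE TABLE datarecords_user42
(
record_id INT PRIMARY KEY REFERENCES datarecords(id), -- one row per base record
warranty_months INT, -- this user's custom columns
color VARCHAR(20)
)

SELECT r.*, u.warranty_months, u.color
FROM datarecords r
JOIN datarecords_user42 u ON u.record_id = r.id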

EDIT: See note at bottom for more detail.
I am facing the same problem, and I found a solution that works but is slow. Maybe someone else has a way to speed it up. In my code, I have a table with three columns: Col1, Col2, Col3. Col1 is my record ID. Col2 is the name of my dynamic column. Col3 is the value at that column. So if I wanted to represent a record with ID 1, two columns named 2 and 3, and the values 4 and 5 at those columns, I would have the following:
Col1, Col2, Col3
1, 2, 4
1, 3, 5
Then we pivot over Col2 and select the MAX (or MIN or AVG; it doesn't matter, since each Col2/Col3 combination is unique) of Col3 in the pivot. To accomplish the pivot with a variable number of columns, we use dynamic SQL generation. This works well for small input data, which I believe is due to the derived table inside the FROM clause of the dynamic SQL. Once your dataset gets large, the aggregate function starts taking a long time to execute. A very long time. It looks like this starts at around 1000 rows, so maybe there's a hint or another method that makes this faster.
As a note, since the values for Col2 and Col3 map 1:1, I also tried dynamically generating a SELECT statement like the following:
SELECT Col1,
MAX(CASE WHEN Col2 = '4' THEN Col3 END) [4],
MAX(CASE WHEN Col2 = '5' THEN Col3 END) [5],
MAX(CASE WHEN Col2 = '6' THEN Col3 END) [6] -- ... these were dynamically generated
FROM #example
GROUP BY Col1
This was just as slow for my dataset. Your mileage may vary. Here is a full example of how this works, written for SQL Server (2005+ should run it).
--DROP TABLE #example
CREATE TABLE #example
(
Col1 INT,
Col2 INT,
Col3 INT
)
INSERT INTO #example VALUES (2,4,10)
INSERT INTO #example VALUES (2,5,20)
INSERT INTO #example VALUES (2,6,30)
INSERT INTO #example VALUES (2,7,40)
INSERT INTO #example VALUES (2,8,50)
INSERT INTO #example VALUES (3,4,11)
INSERT INTO #example VALUES (3,5,22)
INSERT INTO #example VALUES (3,6,33)
INSERT INTO #example VALUES (3,7,44)
INSERT INTO #example VALUES (3,8,55)
DECLARE @columns VARCHAR(100)
SET @columns = ''
-- Build a comma-separated, bracketed list of the distinct Col2 values.
SELECT @columns = @columns + '[' + CAST(Col2 AS VARCHAR(10)) + '],'
FROM (SELECT DISTINCT Col2 FROM #example) a
-- Trim the trailing comma.
SELECT @columns = SUBSTRING(@columns, 1, LEN(@columns) - 1)
DECLARE @dsql NVARCHAR(MAX)
SET @dsql = '
select Col1, ' + @columns + '
from
(select Col1, Col2, Col3 FROM #example e) a
PIVOT
(
MAX(Col3)
FOR Col2 IN (' + @columns + ')
) p'
print @dsql
EXEC sp_executesql @dsql
EDIT: Because of the unique situation in which I am doing this, I managed to get my speed-up using two tables (one with the entities and another with the attribute-value pairs), and creating a clustered index on the attribute-value pairs which includes all columns (ID, Attribute, Value). I recommend a different approach if you need fast inserts, large numbers of columns, many data rows, etc. I have some known certainties about the size and growth rates of my data, and my solution is suited to my scope.
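For reference, a minimal sketch of that layout (the table and column names here are hypothetical):

CREATE TABLE EntityAttribute
(
EntityID INT,
Attribute INT,
Value INT
)
-- The clustered index covers every column, so pivot-style reads can be
-- satisfied from the index alone.
CREATE CLUSTERED INDEX IX_EntityAttribute ON EntityAttribute (EntityID, Attribute, Value)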
There are many other solutions which are better suited to this problem. For example, if you need fast inserts and single-record reads (or slow reads don't matter), you should consider packing an XML string into a field and serializing/deserializing it in the database consumer. If you need ultra-fast writes and ultra-fast reads, and data columns are added very rarely, then you may consider altering your table. This is a bad solution in most cases, but may fit some problems. If your columns change frequently, you need fast reads, and write speed is not an issue, then my solution may work for you up to a certain dataset size.
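As a rough sketch of the XML-string option in SQL Server (the table, column, and attribute names are made up; the serializing/deserializing could also live entirely in the application):

CREATE TABLE #records
(
ID INT,
CustomData XML
)
INSERT INTO #records VALUES (1, '<attrs><a name="height">10</a></attrs>')
-- Inserts touch one row no matter how many attributes there are;
-- reading a single attribute back out in SQL looks like this:
SELECT ID,
CustomData.value('(/attrs/a[@name="height"])[1]', 'VARCHAR(100)') AS height
FROM #records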

Related

Oracle SQL - using column name in one table as query parameter within another

Help needed please! Here is the problem:
I have 2 tables (one transactional, and one lookup/control) as below:
transactional table (A):
TXID, NAME, DESCRIPTION, GROUP, DATE, TYPE, AMOUNT, etc.
(e.g. 12345, 'SAMPLE TRANSACTION','test','TXGROUP1','FEB.15 2019',500.00, etc.)
lookup/control table (B):
COLID, COLNAME, FLAG
(e.g. 1,'NAME', 0; 2,'DATE',1, etc.)
In this scenario, entries for COLNAME in table B refer to actual column names in table A (i.e. B.COLNAME = 'DATE' refers to A.DATE)
Problem is, I need to write a query that fetches all COLNAME values in table B, and select their corresponding grouped value from table A. For example:
since B.COLNAME contains 'DATE', select max (DATE) from table A grouping by A.NAME
What I've tried:
select NAME, (SELECT column_name FROM all_tab_columns where table_name like '%TABLE_A%' AND ROWNUM = 1 GROUP BY COLUMN_NAME) AS COL from TABLE_A;
but this only gives me the literal value of the column name (i.e. 'SAMPLE TRANSACTION', 'DATE'), NOT THE DERIVED VALUE that I actually need. If I were to run the query manually, it would be select NAME, DATE AS COL from TABLE_A;
and I might expect something like:
NAME, COL (e.g. 'SAMPLE TRANSACTION', 'FEB.15 2019')
Ideally I am trying to do this in raw SQL only, if possible (i.e. no stored procedures, PL/SQL, dynamic SQL, etc.), but I am definitely open to anything that can make it work.
Input and/or suggestions would be greatly appreciated. The environment is Oracle 11g, I believe, though I suspect this may not make a huge difference.
It is possible to run dynamic SQL in SQL, but the solutions are painful. The simplest way uses the package DBMS_XMLGEN and does not require any additional PL/SQL objects.
The below example works but is unrealistically simple. A real version would have to deal with many type conversion issues, retrieve other values, etc.
--Read a value based on the CONTROL table.
select
    to_number(extractvalue(xml, '/ROWSET/ROW/COL')) COL
from
(
    select xmltype(dbms_xmlgen.getxml(v_sql)) xml
    from
    (
        select 'select '||colname||' col from transaction' v_sql
        from control
    )
);
COL
---
2
The results are based on this sample schema:
--Sample schema:
create table control
(
COLID number,
COLNAME varchar2(4000),
FLAG number
);
insert into control values(1,'NAME',1);
create table transaction
(
TXID number,
NAME varchar2(100),
DESCRIPTION varchar2(4000),
the_GROUP varchar2(100),
the_DATE date,
TYPE varchar2(100),
AMOUNT number
);
insert into transaction values(1,2,3,4,sysdate,6,7);
commit;
If you have more complicated query needs, for example if you need to return an unknown number of columns, you'll need to install something like my open source program Method4. That program allows dynamic SQL in SQL, but it requires installing some new objects first.
In practice, this level of dynamic SQL is rarely necessary. It's usually best to find a simpler way to solve the problem.

Common methods for doing select with computation by need?

I would like to be able to add columns to a table, with cells whose values are computed as needed at 'query time' when (possibly) selecting over them.
Are there some established ways of doing this?
EDIT: Okay, I can do without the 'add columns'. What I want is to make a select query which finds the rows (if they exist) that already have all needed values computed (by some function), and also fills in some of the rows which do not have all needed values computed. So each query would do its part in extending the data a bit.
(Some columns would start out as null values or similar)
I guess I'll do the extending part first and the query after.
You can use a select expression, especially if you don't plan to store the calculation results, or if they depend on more than one table. An example, as simple as it could be:
SELECT id, (id+1) as next_id FROM table;
What type of database are you asking about? If it is SQL Server, then you can use computed columns with the AS syntax.
Eg:
create table Test
(
Id int identity(1,1),
col1 varchar(2) default 'NO',
col2 as col1 + ' - Why?'
)
go
insert into Test
default values
go
select * from Test
drop table Test
In the SQL world it's usually expensive to add a column to an existing table so I'd advise against it. Maybe you can manage with something like this:
SELECT OrderID,
ProductID,
UnitPrice*Quantity AS "Regular Price",
UnitPrice*Quantity-UnitPrice*Quantity*Discount AS "Price After Discount"
FROM order_details;
If you really insist on adding a new column, you could go for something like this (not tested):
ALTER TABLE order_details ADD column_name datatype
UPDATE order_details SET column_name = UnitPrice+1
You basically ALTER TABLE to add the new column, then perform an UPDATE operation on all the table to set the value of the newly added column.

Separating multiple values in one column in MS SQL

I have a field in an application that allows a user to select multiple values. When I query this field in the DB, if multiple values were selected the result gets displayed as one long word. There are no commas or space between the multiple selected values. Is there any way those values can be split by a comma?
Here’s my query:
SELECT HO.Value
FROM HAssessment ha
INNER JOIN HObservation HO
ON HO.AssessmentID = ha.AssessmentID
AND HO.Patient_Oid = 2255231
WHERE Ho.FindingAbbr = 'A_R_CardHx'
------------------------------------------------
Result:
AnginaArrhythmiaCADCChest Pain
-------------------------
I would like to see:
Angina, Arrhythmia, CADC, Chest Pain
------------------------------------------
Help!
There's no easy solution to this.
The most expedient would be writing a string splitting function. From your sample data, it seems the values are all concatenated together without any separators. This means you'll have to come up with a list of all possible values (hopefully this is a query from some symptoms table...) and parse each one out from the string. This will be complex and painful.
A straightforward way to do this would be to test each valid symptom value to see whether it's contained within HObservation.Value, stuff all the found values together, and return the result. Note that this will perform very poorly.
Here's an example in TSQL. You'd be much better off doing this at the application layer, though, or better yet, normalizing your database (see below for more on that).
declare @symptoms table (symptom varchar(100))
insert into @symptoms (symptom)
values ('Angina'),('Arrhythmia'),('CADC'),('Chest Pain')
declare @value varchar(100)
set @value = 'AnginaArrhythmiaCADCChest Pain'
declare @result varchar(100)
-- Concatenate every symptom found inside @value, then strip the leading ', '.
select @result = stuff((
SELECT ', ' + s.symptom
FROM @symptoms s
WHERE patindex('%' + s.symptom + '%',@value) > 0
FOR XML PATH('')),1,2,'')
select @result
The real answer is to restructure your database. Put each distinct item found in HObservation.Value (that is, Angina, Arrhythmia, etc. as separate rows) into some other table, if such a table doesn't exist already. I'll call this table Symptom. Then create a lookup table to link HObservation with Symptom, and drop the HObservation.Value column entirely. Do the splitting work at the application level, and make multiple inserts into the lookup table.
Example, based on sample data from your question:
HObservation
------------
ID Value
1 AnginaArrhythmiaCADC
Becomes:
HObservation
------------
ID
1
Symptom
-------
ID Value
1 Angina
2 Arrhythmia
3 CADC
HObservationSymptom
-------------------
ID HObservationID SymptomID
1  1              1
2  1              2
3  1              3
Note that if this is a production system (or you want to preserve the existing data for some other reason), you'll still have to write code to do the string splitting.
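For reference, a minimal DDL sketch of the normalized tables described above (names follow the example; the IDENTITY columns and foreign key are assumptions):

CREATE TABLE Symptom
(
ID INT IDENTITY(1,1) PRIMARY KEY,
Value VARCHAR(100) NOT NULL
)
CREATE TABLE HObservationSymptom
(
ID INT IDENTITY(1,1) PRIMARY KEY,
HObservationID INT NOT NULL, -- references HObservation
SymptomID INT NOT NULL REFERENCES Symptom(ID)
)
-- The application splits the original concatenated string and inserts
-- one HObservationSymptom row per symptom found.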

Moving data between 2 tables on columns with different datatypes

I have 2 tables in 2 different databases. The first table (Custumers) has a lot of data in 10-12 columns.
Then I have the second table (CustumersNew); it has new columns that should represent the same columns as Custumers, just with different names and datatypes. CustumersNew is currently empty. I want to move all of the data from table Custumers to table CustumersNew.
The thing here is that the Custumers UserID column has the datatype uniqueidentifier,
while the CustumersNew ID column has the datatype int. The rest of the columns simply do not match in datatypes either.
How do I move the data from A to B?
EDIT:
I'm using MS-SQL
I would use an INSERT INTO CustumersNew(<column list>) SELECT <column list from Custumers, CONVERTed to data types that match the corresponding columns in CustumersNew> FROM Custumers statement.
E.g.
INSERT INTO CustumersNew(UserId, Name, Age)
SELECT UserId, CONVERT(NVARCHAR(128), Name), CONVERT(INT, Age)
FROM Custumers
I am assuming that Name and Age are of different types in the two tables. You would need to write a similar CONVERT for each column whose data type doesn't match the corresponding column in the CustumersNew table.
Since a uniqueidentifier UserId/CustomerId cannot be mapped to an integer, and I doubt the relevance of the values in this column from a functional perspective, I would model the UserId/CustomerId as an AUTO/IDENTITY column in the new table.
Well, you can't store a uniqueidentifier in an int column so you'll have to come up with a new set of keys.
Most database systems provide a mechanism for sequentially numbering records with integer values. In SQL Server, they use IDENTITY columns, in Oracle they use sequences, I think that in MySql you specify the column as auto_increment.
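For example, in SQL Server the new table might be declared like this (a sketch only; the column names and types are illustrative):

CREATE TABLE CUSTOMERS_NEW
(
ID INT IDENTITY(1,1) PRIMARY KEY, -- replaces the old uniqueidentifier key
COL2 NVARCHAR(128),
COL3 NVARCHAR(128),
COL4 INT
)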
Once you have set up your new table (with its auto-numbering scheme), simply insert the data using SQL:
INSERT INTO CUSTOMERS_NEW (COL2, COL3, COL4)
SELECT COL2, COL3, COL4 FROM CUSTOMERS
Notice that the insert statement does not include the ID column - that should be populated automatically for you.
If you are not able to use INSERT and have to use UPDATE, it will look like this:
UPDATE change
SET widget_id = (SELECT insert_widget.widget_id
FROM insert_widget
WHERE change.id = insert_widget.id)
WHERE change.id = (SELECT insert_widget.id
FROM insert_widget
WHERE change.id = insert_widget.id)
In this example, I wanted to move the widget_id column from the insert_widget table to the change table, but the change table already has data, so I have to use an UPDATE statement.

How to determine if an SQL table row contains values other than NULL

This is the problem:
I have a table with an unknown number of columns. All the columns are REAL.
Assuming there is only one row in that table, I need a method to determine if there is a value other than NULL in that table/row.
I don't know the number of columns nor their names at run-time (and I don't want to use a cursor).
SQL Server 2005
I appreciate your help.
Here's one way - CHECKSUM() returns no value if all values in the row are NULL:
create table #t (col1 real, col2 real, col3 real)
select checksum(*) from #t
if @@rowcount = 0
print 'All values are NULL'
else
print 'Non-NULL value(s) found'
drop table #t
On the other hand, I don't really know if this is what you're doing: a "temporary table built in memory" sounds like something you're managing yourself. With more information about what you're trying to achieve, we might be able to suggest a better solution.
And by the way, there is nothing wrong with a single-row table for storing settings. It has the big advantage that each setting has a separate data type, can have CHECK constraints etc.
Sounds like you're building some kind of settings/properties table, based on the fact that you know you only have 1 row in it. This is the wrong way to do it if you need dynamic properties; instead, have a table with 2 columns: option and value. Then for each dynamic property, you'll store one row.
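A minimal sketch of that two-column layout (the names and the sample row are illustrative):

CREATE TABLE Settings
(
[option] VARCHAR(100) PRIMARY KEY,
[value] REAL
)
-- One row per dynamic property instead of one column per property:
INSERT INTO Settings VALUES ('max_speed', 1.5)
SELECT [value] FROM Settings WHERE [option] = 'max_speed'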