Type Conversion in Persisted Computed Column - sql

I'm working with two related tables in a Microsoft SQL Server 2008 environment which are connected via a GUID. In one table the field has the type varchar(50); in the other it is properly typed as uniqueidentifier. This is obviously bad, but I can't change it now because it is dictated by a legacy piece of software.
The conversion SQL Server has to perform at each inner join makes queries run terribly slowly, since I can't use indexes at all. I tried adding a persisted computed column to get the ID stored as a uniqueidentifier; that way I could add an index and probably get it running much faster. I failed.
Does anybody know whether I can store an explicitly converted value in a computed column? If I can, what's the formula to use here?
Cheers,
Matthias

This worked for me:
CREATE TABLE t_uuid (charid VARCHAR(50) NOT NULL, uuid AS CAST(charid AS UNIQUEIDENTIFIER))
CREATE INDEX IX_uuid_uuid ON t_uuid (uuid)
INSERT
INTO t_uuid (charid)
VALUES (NEWID())
SELECT *
FROM t_uuid

CONVERT(uniqueidentifier, your_varchar_here)
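For the original question, that conversion can go straight into a persisted computed column, which can then be indexed. A minimal sketch, assuming a legacy table named legacy_table with the varchar(50) column guid_string (both names hypothetical):
CREATE TABLE legacy_table (guid_string varchar(50) NOT NULL)
-- Persisted computed column holding the converted GUID...
ALTER TABLE legacy_table
    ADD guid_converted AS CONVERT(uniqueidentifier, guid_string) PERSISTED
-- ...which can then be indexed so joins against the uniqueidentifier table can seek
CREATE INDEX IX_legacy_table_guid_converted ON legacy_table (guid_converted)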

Depending on how often you need to make the conversion for joining, I'd use a CTE to convert the data type(s). It is constructed faster than an inline view (the next best temporary option). In either case, you expose the value as the correct data type in a result column from the CTE/inline view so you can JOIN onto it. CTE example:
WITH example AS (
  SELECT t.guid,
         CONVERT(UniqueIdentifier, t.guid) 'cguid'
  FROM TABLE t)
SELECT t.*
FROM TABLE t
JOIN example e ON e.cguid = t.guid
Inline view example:
SELECT t.*
FROM TABLE t
JOIN (SELECT t.guid,
             CONVERT(UniqueIdentifier, t.guid) 'cguid'
      FROM TABLE t) e ON e.cguid = t.guid
It's not going to get around the fact that the index on guid (assuming one exists) won't be used, but it's also not a good habit to be performing data type conversion in the WHERE clause.

Related

How can I create a calculated column when creating a table in PostgreSQL, like SQL Server's LineTotal AS Price * Quantity? [duplicate]

Does PostgreSQL support computed / calculated columns, like MS SQL Server? I can't find anything in the docs, but as this feature is included in many other DBMSs I thought I might be missing something.
Eg: http://msdn.microsoft.com/en-us/library/ms191250.aspx
Postgres 12 or newer
STORED generated columns were introduced with Postgres 12 - as defined in the SQL standard and implemented by some RDBMS including DB2, MySQL, and Oracle. They are similar to the "computed columns" of SQL Server.
Trivial example:
CREATE TABLE tbl (
int1 int
, int2 int
, product bigint GENERATED ALWAYS AS (int1 * int2) STORED
);
VIRTUAL generated columns may come with one of the next iterations. (Not in Postgres 15 yet.)
Related:
Attribute notation for function call gives error
Postgres 11 or older
Up to Postgres 11 "generated columns" are not supported.
You can emulate VIRTUAL generated columns with a function using attribute notation (tbl.col) that looks and works much like a virtual generated column. That's a bit of a syntax oddity which exists in Postgres for historic reasons and happens to fit the case. This related answer has code examples:
Store common query as column?
The expression (looking like a column) is not included in a SELECT * FROM tbl, though. You always have to list it explicitly.
This can also be supported with a matching expression index, provided the function is IMMUTABLE. Like:
CREATE FUNCTION col(tbl) ... AS ... -- your computed expression here
CREATE INDEX ON tbl(col(tbl));
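A minimal runnable sketch of that emulation (table and function names are hypothetical, mirroring the int1/int2 example above):
-- table without generated-column support (pre-Postgres-12 sketch)
CREATE TABLE tbl_legacy (int1 int, int2 int);
-- IMMUTABLE function taking the whole row; with attribute notation it
-- can be called as tbl_legacy.product and looks like a column
CREATE FUNCTION product(tbl_legacy) RETURNS bigint
  LANGUAGE sql IMMUTABLE AS
$$SELECT $1.int1::bigint * $1.int2$$;
-- works like a virtual column, but is not included in SELECT *
SELECT t.*, t.product FROM tbl_legacy t;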
Alternatives
Alternatively, you can implement similar functionality with a VIEW, optionally coupled with expression indexes. Then SELECT * can include the generated column.
"Persisted" (STORED) computed columns can be implemented with triggers in a functionally equivalent way.
Materialized views are a related concept, implemented since Postgres 9.3.
In earlier versions one can manage MVs manually.
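A minimal sketch of the materialized-view variant, reusing the example table above (names hypothetical; the stored value has to be refreshed after the base data changes):
CREATE MATERIALIZED VIEW tbl_mv AS
SELECT int1, int2, int1::bigint * int2 AS product
FROM tbl;
-- re-run whenever the base table changes
REFRESH MATERIALIZED VIEW tbl_mv;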
YES you can!! The solution should be easy, safe, and performant...
I'm new to PostgreSQL, but it seems you can create computed columns by using an expression index, paired with a view (the view is optional, but makes life a bit easier).
Suppose my computation is md5(some_string_field), then I create the index as:
CREATE INDEX some_string_field_md5_index ON some_table(MD5(some_string_field));
Now, any queries that act on MD5(some_string_field) will use the index rather than computing it from scratch. For example:
SELECT MAX(some_field) FROM some_table GROUP BY MD5(some_string_field);
You can check this with explain.
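For example, a quick check might look like this (a sketch; the exact plan is data dependent and the planner may still prefer a sequential scan):
EXPLAIN ANALYZE
SELECT MAX(some_field) FROM some_table GROUP BY MD5(some_string_field);
-- look for a scan on some_string_field_md5_index instead of MD5() being
-- recomputed for every row in a plain Seq Scan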
However at this point you are relying on users of the table knowing exactly how to construct the column. To make life easier, you can create a VIEW onto an augmented version of the original table, adding in the computed value as a new column:
CREATE VIEW some_table_augmented AS
SELECT *, MD5(some_string_field) as some_string_field_md5 from some_table;
Now any queries using some_table_augmented will be able to use some_string_field_md5 without worrying about how it works... they just get good performance. The view doesn't copy any data from the original table, so it is good memory-wise as well as performance-wise. Note, however, that you can't update/insert into a view, only into the source table, but if you really want, I believe you can redirect inserts and updates to the source table using rules (I could be wrong on that last point as I've never tried it myself).
Edit: it seems that if the query involves competing indexes, the planner may sometimes not use the expression index at all. The choice appears to be data dependent.
One way to do this is with a trigger!
CREATE TABLE computed(
    one SERIAL,
    two INT NOT NULL
);
CREATE OR REPLACE FUNCTION computed_two_trg()
    RETURNS trigger
    LANGUAGE plpgsql
    SECURITY DEFINER
AS $BODY$
BEGIN
    NEW.two := NEW.one * 2;
    RETURN NEW;
END;
$BODY$;
CREATE TRIGGER computed_500
    BEFORE INSERT OR UPDATE
    ON computed
    FOR EACH ROW
    EXECUTE PROCEDURE computed_two_trg();
The trigger fires before the row is inserted or updated. It sets the field we want to compute on the NEW record and then returns that record.
PostgreSQL 12 supports generated columns:
PostgreSQL 12 Beta 1 Released!
Generated Columns
PostgreSQL 12 allows the creation of generated columns that compute their values with an expression using the contents of other columns. This feature provides stored generated columns, which are computed on inserts and updates and are saved on disk. Virtual generated columns, which are computed only when a column is read as part of a query, are not implemented yet.
Generated Columns
A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables.
CREATE TABLE people (
...,
height_cm numeric,
height_in numeric GENERATED ALWAYS AS (height_cm * 2.54) STORED
);
Well, not sure if this is what you mean, but Postgres normally supports "dummy" ETL syntax.
I created one empty column in the table and then needed to fill it with calculated values depending on other values in the row.
UPDATE table01
SET column03 = column01*column02; /*e.g. for multiplication of 2 values*/
It is so basic that I suspect it is not what you are looking for. Obviously it is not dynamic; you run it once. But there is no obstacle to putting it into a trigger.
Example of creating an empty virtual column:
,(SELECT *
From (values (''))
A("virtual_col"))
Example of creating two virtual columns with values:
SELECT *
From (values (45,'Completed')
, (1,'In Progress')
, (1,'Waiting')
, (1,'Loading')
) A("Count","Status")
order by "Count" desc
I have code that works and uses the term calculated. I'm not on pure PostgreSQL, though; we run on PADB.
Here is how it's used:
create table some_table as
select category,
txn_type,
indiv_id,
accum_trip_flag,
max(first_true_origin) as true_origin,
max(first_true_dest ) as true_destination,
max(id) as id,
count(id) as tkts_cnt,
(case when calculated tkts_cnt=1 then 1 else 0 end) as one_way
from some_rando_table
group by 1,2,3,4 ;
A lightweight solution with a CHECK constraint:
CREATE TABLE example (
discriminator INTEGER DEFAULT 0 NOT NULL CHECK (discriminator = 0)
);

SQL Server subquery behaviour

I have a case where I want to check whether an integer value is found in a varchar column of a table whose values are a mix: some can be converted to integers, some are just strings. My first thought was to use a subquery to select only the rows with numeric-looking values. The setup looks like:
CREATE TABLE #tmp (
EmployeeID varchar(50) NOT NULL
)
INSERT INTO #tmp VALUES ('aa1234')
INSERT INTO #tmp VALUES ('1234')
INSERT INTO #tmp VALUES ('5678')
DECLARE @eid int
SET @eid = 5678
SELECT *
FROM (
SELECT EmployeeID
FROM #tmp
WHERE IsNumeric(EmployeeID) = 1) AS UED
WHERE UED.EmployeeID = @eid
DROP TABLE #tmp
However, this fails, with: "Conversion failed when converting the varchar value 'aa1234' to data type int.".
I don't understand why it is still trying to compare @eid to 'aa1234' when I've selected only the rows '1234' and '5678' in the subquery.
(I realize I can just cast @eid to varchar but I'm curious about SQL Server's behaviour in this case)
You can't easily control the order things will happen when SQL Server looks at the query you wrote and then determines the optimal execution plan. It won't always produce a plan that follows the same logic you typed, in the same order.
In this case, in order to find the rows you're looking for, SQL Server has to perform two filters:
identify only the rows that match your variable
identify only the rows that are numeric
It can do this in either order, so this is also valid:
identify only the rows that are numeric
identify only the rows that match your variable
If you look at the properties of this execution plan, you see that the predicate for the match to your variable is listed first (which still doesn't guarantee order of operation), but in any case, due to data type precedence, it has to try to convert the column data to the type of the variable.
Subqueries, CTEs, or writing the query a different way - especially in simple cases like this - are unlikely to change the order SQL Server uses to perform those operations.
You can force evaluation order in most scenarios by using a CASE expression (you also don't need the subquery):
SELECT EmployeeID
FROM #tmp
WHERE EmployeeID = CASE IsNumeric(EmployeeID) WHEN 1 THEN @eid END;
In modern versions of SQL Server (you forgot to tell us which version you use), you can also use TRY_CONVERT() instead:
SELECT EmployeeID
FROM #tmp
WHERE TRY_CONVERT(int, EmployeeID) = @eid;
This is essentially shorthand for the CASE expression, but with the added bonus that it allows you to specify an explicit type, which is one of the downsides of ISNUMERIC(). All ISNUMERIC() tells you is if the value can be converted to any numeric type. The string '1e2' passes the ISNUMERIC() check, because it can be converted to float, but try converting that to an int...
For completeness, the best solution - if there is an index on EmployeeID - is to just use a variable that matches the column data type, as you suggested.
But even better would be to use a data type that prevents junk data like 'aa1234' from getting into the table in the first place.
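For completeness, a sketch of that variable-typed approach against the same temp table; since the column is no longer converted, an index on EmployeeID could also be used:
DECLARE @eid varchar(50)   -- match the column's data type
SET @eid = '5678'
SELECT EmployeeID
FROM #tmp
WHERE EmployeeID = @eid;   -- no conversion applied to the column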

How to efficiently SELECT rows from database table based on selected set of values

I have a transaction table of 1 million rows. The table has a field named "Code" that holds the customer's ID. There are about 10,000 different customer codes.
I have a GUI that allows the user to render a report from the transaction table. The user may select an arbitrary number of customers for rendering.
I used the IN operator first, and it works for a few customers:
SELECT * FROM TRANS_TABLE WHERE CODE IN ('...', '...', '...')
I quickly run into problems if I select a few thousand customers; there is a limitation on using the IN operator.
An alternative is to create a temporary table with a single CODE field and inject the selected customer codes into it using INSERT statements. I can then use
SELECT A.* FROM TRANS_TABLE A INNER JOIN TEMP B ON (A.CODE=B.CODE)
This works nicely for huge selections. However, there is performance overhead for creating the temporary table, the INSERT injection, and dropping the temporary table.
Are you aware of a better solution to handle this situation?
If you use SQL Server 2008, the fastest way to do this is usually with a Table-Valued Parameter (TVP):
CREATE TYPE CodeTable AS TABLE
(
Code int NOT NULL PRIMARY KEY
)
DECLARE @Codes AS CodeTable
INSERT @Codes (Code) VALUES (1)
INSERT @Codes (Code) VALUES (2)
INSERT @Codes (Code) VALUES (3)
-- Snip codes
SELECT t.*
FROM @Codes c
INNER JOIN Trans_Table t
ON t.Code = c.Code
Using ADO.NET, you can populate the TVP directly from your code, so you don't need to generate all those INSERT statements - just pass in a DataTable and ADO.NET will handle the rest. So you can write a Stored Procedure like this:
CREATE PROCEDURE GetTransactions
@Codes CodeTable READONLY
AS
SELECT t.*
FROM @Codes c
INNER JOIN Trans_Table t
ON t.Code = c.Code
... and just pass in the @Codes value as a parameter.
You can generate SQL such as
SELECT * FROM TRANS_TABLE WHERE CODE IN (?,?,?,?,?,?,?,?,?,?,?)
and re-use it in a loop until you've loaded all the IDs you need. The advantage is that if you only need a few IDs, your DB doesn't need to parse all those IN clauses. If needing many IDs is a rare case, the performance hit may not matter. If you are not worried about the SQL parsing cache, you can limit the size of the IN clause to the DB's actual limit, so that sometimes you don't need a loop and other times you do.
As you have to pass the IDs somehow, IN should be the fastest way.
MSDN mentions:
Including an extremely large number of values (many thousands) in an IN clause can consume resources and return errors 8623 or 8632. To work around this problem, store the items in the IN list in a table.
If you still can use IN and the query is too slow, you could try to adjust your indexes, e.g. with a covering index for your query. Looking up random values via the clustered index can be slow because of the random disk I/O required; a covering index could reduce that problem.
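For instance, a sketch of such a covering index (the INCLUDE columns are placeholders for whatever columns the report actually reads):
CREATE NONCLUSTERED INDEX IX_TRANS_TABLE_CODE
ON TRANS_TABLE (CODE)
INCLUDE (TransDate, Amount)   -- placeholder column names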
If you really exceed the limit of IN and you create a temporary table, I don't expect the creation of the table to be a major problem, as long as you insert the values all at once (not with thousands of separate queries, of course). Choose the method with the least overhead, like one of those mentioned here:
http://blog.sqlauthority.com/2008/07/02/sql-server-2008-insert-multiple-records-using-one-insert-statement-use-of-row-constructor/
Of course, if there is some static pattern in your IDs you could select by that (like in SPs or UDFs). If you get those thousands of IDs out of your database itself, instead of passing them back and forth, you could just store them or use a subquery...
Maybe you could pass the customer codes to a stored procedure comma-separated and use the split SQL function mentioned here: http://www.devx.com/tips/Tip/20009.
Then declare a table variable, insert the split values into it, and use an IN clause (or a join).
CREATE PROCEDURE prc_dosomething (
    @CustomerCodes varchar(MAX)
)
AS
DECLARE @customercodetable table(code varchar(10)) -- or whatever length you require.
INSERT INTO @customercodetable (code)
SELECT code FROM dbo.UTILfn_Split(@CustomerCodes) -- see the article above for the split function; the column it returns depends on its definition.
-- do some magic stuff here :).

How to convert the result of a SELECT SQL query into a new table in MS Access

How do I convert the result of a SELECT SQL query into a new table in MS Access?
You can use subqueries:
SELECT a,b,c INTO NewTable
FROM (SELECT a,b,c
FROM TheTable
WHERE a Is Null)
Like so:
SELECT *
INTO NewTable
FROM OldTable
First, create a table with the required keys, constraints, domain checking, references, etc. Then use an INSERT INTO..SELECT construct to populate it.
Do not be tempted by SELECT..INTO..FROM constructs. The resulting table will have no keys, and therefore will not really behave like a proper table at all. Better to start with a proper table and then add the data, e.g. it will be easier to trap bad data.
As an example of how things can go wrong with a SELECT..INTO clause: it can result in a column that contains NULL values, and while you can change the column to NOT NULL after the event, the engine will not replace the NULLs, so you end up with a NOT NULL column containing NULLs!
Also consider creating a 'viewed' table e.g. using CREATE VIEW SQL DDL rather than a base table.
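A minimal sketch of the create-then-populate approach in Access DDL (column names and types are hypothetical):
CREATE TABLE NewTable (
    a LONG NOT NULL CONSTRAINT PK_NewTable PRIMARY KEY,
    b TEXT(50),
    c DOUBLE
);
INSERT INTO NewTable (a, b, c)
SELECT a, b, c
FROM TheTable;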
If you want to do it through the user interface, you can also:
A) Create and test the select query. Save it.
B) Create a make table query. When asked what tables to show, select the query tab and your saved query.
C) Tell it the name of the table you want to create.
D) Go make coffee (depending on taste and size of table)
Select *
Into newtable
From somequery

Forcing a datatype in MS Access make table query

I have a query in MS Access which creates a table from two subqueries. For two of the columns being created, I'm dividing one column from the first subquery into a column from the second subquery.
The datatype of the first column is a double; the datatype of the second column is decimal, with scale of 2, but I want the second column to be a double as well.
Is there a way to force the datatype when creating a table through a standard make-table Access query?
One way to do it is to explicitly create the table before putting anything into it.
Your current statement is probably like this:
SELECT Persons.LastName,Orders.OrderNo
INTO Persons_Order_Backup
FROM Persons
INNER JOIN Orders
ON Persons.P_Id=Orders.P_Id
WHERE FirstName = 'Alistair'
But you can also do this:
---- Create NewTable
CREATE TABLE NewTable(FirstName VARCHAR(100), LastName VARCHAR(100), Total DOUBLE)
---- INSERT INTO NewTable using SELECT
INSERT INTO NewTable(FirstName, LastName, Total)
SELECT p.FirstName, p.LastName, o.OrderNo
FROM Persons p
INNER JOIN Orders o
ON p.P_Id = o.P_Id
WHERE p.FirstName = 'Alistair'
This way you have total control over the column types. You can always drop the table later if you need to recreate it.
You can use the cast to FLOAT function CDBL() but, somewhat bizarrely, the Access Database Engine cannot handle the NULL value, so you must handle this yourself e.g.
SELECT first_column,
IIF(second_column IS NULL, NULL, CDBL(second_column))
AS second_column_as_float
INTO Table666
FROM MyTest;
...but you're going to need to ALTER TABLE to add your keys, constraints, etc. Better to simply CREATE TABLE first then use INSERT INTO..SELECT to populate it.
You can use CDbl around the columns.
An easy way to do this is to create an empty table with the correct field types and then run an append query into it; Access will automatically convert the data to the destination field types.
I had a similar situation, but I had a make-table query creating a field with a NUMERIC datatype that I wanted to be Short Text.
What I did (and I got the idea from Stack) was to create the table with the field in question as Short Text, and at the same time build a delete query to scrub the records. I think it's funny that a DELETE query in Access doesn't delete the table, just the records in it - I guess you have to use DROP TABLE for that, to purge a table...
Then I converted my make-table query to an APPEND query, which I'd never done before... and I just added the running of the DELETE query to my process.
Thank you, Stack Overflow !
Steve
I add & "" to the fields I want to make sure are stored as text, and * 1 (multiplying the value by 1) to the fields I want stored as numeric.
Seems to do the trick.
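A sketch of that trick in a make-table query (field names are hypothetical):
SELECT [CustomerCode] & "" AS CodeAsText,
       [Amount] * 1 AS AmountAsNumber
INTO NewTable
FROM SourceTable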
To get an Access query to create a table with three numeric output fields from input numeric fields (it kept wanting to make the output fields text fields), I had to combine several of the above suggestions: pre-establish an empty output table with the output fields pre-defined as Integer, Double and Double, and in the append query itself multiply the numeric fields by one. It worked. Finally.