How do duplicated expressions affect the performance of queries?

When I write something like
SELECT Cast(DateDiff(millisecond, col1, col2) / DateDiff(millisecond, col3, col4) as int) as [Result]
FROM [sometable]
WHERE Cast(DateDiff(millisecond, col1, col2) / DateDiff(millisecond, col3, col4) as int) > 4
is that inefficient, and how can I do it better?

Is that inefficient?
No, SQL Server will evaluate the expression once and re-use it.
How to do it better?
You can laterally join a reusable value with cross apply and a values table constructor:
Select MyValue, OtherCols
from SomeTable
cross apply(values( <my expression> ))v(MyValue)
where MyValue = ?;

Yes, that is inefficient. You can try the following:
SELECT [Result]
FROM (SELECT Cast(DateDiff(millisecond, col1, col2) / DateDiff(millisecond, col3, col4) as int) as [Result] FROM [sometable]) as temp
WHERE [Result] > 4
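
As a rough illustration of the derived-table pattern above, here is a minimal sketch using SQLite through Python's sqlite3 module rather than SQL Server. The columns a and b and the plain division are stand-ins (assumptions) for the DateDiff ratio, since SQLite has no DateDiff. It only shows that naming the expression once returns the same rows as repeating it; it says nothing about how either form performs.

```python
import sqlite3

# In-memory SQLite database; table and column names are hypothetical stand-ins.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?)",
    [(10, 2), (8, 2), (30, 3), (9, 3)],
)

# Expression repeated in both the SELECT list and the WHERE clause.
duplicated = conn.execute(
    "SELECT CAST(a / b AS INTEGER) AS result FROM sometable "
    "WHERE CAST(a / b AS INTEGER) > 4 ORDER BY result"
).fetchall()

# Expression written once in a derived table, then filtered by its alias.
derived = conn.execute(
    "SELECT result "
    "FROM (SELECT CAST(a / b AS INTEGER) AS result FROM sometable) AS temp "
    "WHERE result > 4 ORDER BY result"
).fetchall()

print(duplicated, derived)
```

Whether the duplicated form costs anything extra depends on the database engine, so comparing the actual execution plans is the safer way to settle that.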

Related

SQL - Transposing rows from some columns in a table to each record in the same table

I am using a platform that accepts only minimal SQL functions. The UNPIVOT function cannot be used on the platform, so I have to do this manually. I am thinking along the lines of UNION ALL and then CROSS JOINING (which I attempted, but ended up with the wrong record counts). Please see image attached.
Any help / pointer will be highly appreciated!

I don't know how you used UNION ALL but it can be done like this:
select col1, col2, col3 as NewCol from Table1
union all
select col1, col2, col4 from Table1
You could also use an ORDER BY clause, so that rows with the same col1 and col2 appear in consecutive rows:
select col1, col2, NewCol
from (
select col1, col2, col3 as NewCol, 1 as ord from Table1
union all
select col1, col2, col4, 2 from Table1
) t
order by col1, col2, ord

A portable approach uses union all:
select col1, col2, col3 as newcol from mytable
union all
select col1, col2, col4 from mytable
If your database supports lateral joins (also called cross apply in some databases) and values(), this can be simplified:
select t.col1, t.col2, x.newcol
from mytable t
cross join lateral (values(col3), (col4)) x(newcol)

You can use a cross join, but it requires some case logic. The exact syntax depends on the database, but something like this:
select t.col1, t.col2,
(case when n.n = 1 then t.col3 else t.col4 end) as newcol
from t cross join
(select 1 as n union all select 2) n;
To load another table, you would do one of the following:
Insert these results into a table that has already been created.
Use select into or create table as, depending on the database.
If you care about the ordering, then you can add order by t.col1, t.col2, n.n.
In most cases, a simple union all approach is fine (as GMB suggests). That approach requires scanning the table twice, which incurs some additional overhead. However, if the "table" is really a complex query or view, then only processing it once is a bigger advantage.
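
To compare the union all and cross join approaches side by side, here is a small sketch run against SQLite through Python's sqlite3 module. The sample rows are made up, and the alias nums for the two-row helper table is an assumption chosen for readability. Both queries should yield one output row per (col1, col2, source column) combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 INTEGER, col4 INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [("a", "x", 1, 2), ("b", "y", 3, 4)],
)

# Approach 1: read the table twice and stack the results.
union_rows = conn.execute("""
    SELECT col1, col2, newcol
    FROM (
        SELECT col1, col2, col3 AS newcol, 1 AS ord FROM mytable
        UNION ALL
        SELECT col1, col2, col4, 2 FROM mytable
    ) AS u
    ORDER BY col1, col2, ord
""").fetchall()

# Approach 2: read the table once and duplicate each row with a two-row helper.
cross_rows = conn.execute("""
    SELECT t.col1, t.col2,
           CASE WHEN nums.n = 1 THEN t.col3 ELSE t.col4 END AS newcol
    FROM mytable AS t
    CROSS JOIN (SELECT 1 AS n UNION ALL SELECT 2) AS nums
    ORDER BY t.col1, t.col2, nums.n
""").fetchall()

print(union_rows)
print(cross_rows)
```

With two source rows and two unpivoted columns, each query returns four rows, which is a quick sanity check on the record counts.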

NULL safe way of averaging values in two columns in a Snowflake table

I am looking for a NULL safe way of averaging two columns in a Snowflake Table. I can use
select col1, col2, (col1+col2)/2 AS averaged_columns FROM tablename
but this doesn't work when there are NULL values in col1, col2, or both.

You can use a case expression:
select col1, col2,
(case when col1 is null then col2
when col2 is null then col1
else (col1 + col2) / 2
end)
from t;
You can simplify this to the more inscrutable:
select coalesce((col1 + col2) / 2, col1, col2)
Note: This assumes you want to ignore NULL values, not treat them as 0. If you want to treat them as 0, use coalesce():
select (coalesce(col1, 0) + coalesce(col2, 0)) / 2
A lateral join version should also work:
select t.*, a.average
from t cross join lateral
(select avg(col) as average
from (values (t.col1), (t.col2)) v(col)
) a
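
The difference between ignoring NULL values and treating them as 0 is easiest to see on a few rows. The sketch below uses SQLite through Python's sqlite3 module, not Snowflake, so it only illustrates the coalesce() logic; the id column and the sample values are assumptions added for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col1 REAL, col2 REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 2.0, 4.0), (2, None, 4.0), (3, 2.0, None), (4, None, None)],
)

# Ignore NULLs: fall back to whichever column is present.
ignore_nulls = conn.execute(
    "SELECT COALESCE((col1 + col2) / 2, col1, col2) FROM t ORDER BY id"
).fetchall()

# Treat NULLs as 0: a missing value pulls the average down.
nulls_as_zero = conn.execute(
    "SELECT (COALESCE(col1, 0) + COALESCE(col2, 0)) / 2 FROM t ORDER BY id"
).fetchall()

print(ignore_nulls)
print(nulls_as_zero)
```

For the row where only col2 = 4.0 is present, the first query gives 4.0 and the second gives 2.0; when both columns are NULL, the first query stays NULL while the second gives 0.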

To handle null values in the columns, convert them to 0 first, as in the query below:
select col1, col2,
(ISNULL(col1,0) + ISNULL(col2,0))/2 AS averaged_columns FROM tablename

select col1, col2, (nvl(col1,0)+nvl(col2,0))/2

How Can I Use the Max Function to Filter a List and Insert Into

Is there a way to do something like:
Insert Into (col1, col2, col3)
Select col1, col2, col3, max(col4)
From mytable
Group By col1, col2, col3
That gives me: The select list for the INSERT statement contains more items than the insert list.
I want to use the max function to filter out dupes but when I select this extra field, the order of fields and number of fields doesn’t match up. How can I filter a list from a table, use the max function, and insert all records except the ones in the max field?

I want to use the max function to filter out dupes
Well, I suspect that you actually want distinct:
insert into my_target_table(col1, col2, col3)
select distinct col1, col2, col3 from my_source_table
This will insert one record in the target table for each distinct (col1, col2, col3) tuple in the source table.

You are describing something like this:
Insert Into (col1, col2, col3)
select col1, col2, col3
from mytable t
where t.col4 = (select max(t2.col4)
from mytable t2
where t2.col1 = t.col1 and t2.col2 = t.col2 and t2.col3 = t.col3
);
However, this is pretty much equivalent to select distinct (NULL values might be treated differently). You probably want dupes defined on only one column, so I'm thinking:
insert into (col1, col2, col3)
select col1, col2, col3
from mytable t
where t.col4 = (select max(t2.col4)
from mytable t2
where t2.col1 = t.col1
);
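
Here is a minimal sketch of the correlated-subquery filter, run against SQLite through Python's sqlite3 module. The table name target and the sample rows are hypothetical; duplicates are defined on col1 only, and the row with the largest col4 in each col1 group is the one that gets inserted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT, col3 TEXT, col4 INTEGER)")
conn.execute("CREATE TABLE target (col1 TEXT, col2 TEXT, col3 TEXT)")  # hypothetical target
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [("a", "p", "q", 1), ("a", "r", "s", 5), ("b", "u", "v", 2)],
)

# col4 is used only for filtering, so it never appears in the select list
# and the insert list and select list stay the same length.
conn.execute("""
    INSERT INTO target (col1, col2, col3)
    SELECT t.col1, t.col2, t.col3
    FROM mytable AS t
    WHERE t.col4 = (SELECT MAX(t2.col4)
                    FROM mytable AS t2
                    WHERE t2.col1 = t.col1)
""")

inserted = conn.execute(
    "SELECT col1, col2, col3 FROM target ORDER BY col1"
).fetchall()
print(inserted)
```

Note that if two rows in the same group tie on the maximum col4, both are inserted, so this filter alone does not guarantee one row per group.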

INSERT INTO using a query, and add a default value

I want to run an INSERT INTO table SELECT... FROM...
The problem is that the table that I am inserting into has 5 columns, whereas the table I am selecting from has only 4. The 5th column needs to be set to a default value that I specify. How can I accomplish this? The query would be something like this (note: this is Oracle):
INSERT INTO five_column_table
SELECT * FROM four_column_table
--and a 5th column with a default value--;

Just add the default value to your select list.
INSERT INTO five_column_table
SELECT column_a, column_b, column_c, column_d, 'Default Value'
FROM four_column_table;

Just select the default value in your SELECT list. It's always a good idea to explicitly list out columns, so I do that here even though it's not strictly necessary.
INSERT INTO five_column_table( col1, col2, col3, col4, col5 )
SELECT col1, col2, col3, col4, 'Some Default'
FROM four_column_table
If you really don't want to list out the columns:
INSERT INTO five_column_table
SELECT fct.*, 'Some Default'
FROM four_column_table fct
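
The explicit-column version can be tried out quickly as follows. This sketch uses SQLite through Python's sqlite3 module instead of Oracle, and the sample rows are invented, but the statement itself is the one shown above: a literal in the SELECT list fills the fifth column for every inserted row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE four_column_table (col1, col2, col3, col4)")
conn.execute("CREATE TABLE five_column_table (col1, col2, col3, col4, col5)")
conn.executemany(
    "INSERT INTO four_column_table VALUES (?, ?, ?, ?)",
    [(1, 2, 3, 4), (5, 6, 7, 8)],
)

# The constant 'Some Default' is selected alongside the four real columns.
conn.execute("""
    INSERT INTO five_column_table (col1, col2, col3, col4, col5)
    SELECT col1, col2, col3, col4, 'Some Default'
    FROM four_column_table
""")

rows = conn.execute("SELECT * FROM five_column_table ORDER BY col1").fetchall()
print(rows)
```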

Oracle supports a keyword DEFAULT for this purpose:
insert all
into five_column_table( col1, col2, col3, col4, col5 )
VALUES( col1, col2, col3, col4, DEFAULT)
SELECT col1, col2, col3, col4
FROM four_column_table;
In your case, though, I had to use a multi-table insert, because the DEFAULT keyword can be used only in a values clause.

Oracle: Insert into select... in the

What is the advantage of inserting into a select of a table over simply inserting into the table?
eg
insert into
( select COL1
, COL2
from Table1
where 1=2 -- this line and those above are the focus of the question
) select COL3, COL4 from Table2 ;
It seems to do the same thing as:
insert into Table1
( COL1, COL2 )
select COL3, COL4 from Table2 ;
This is the first time I've seen this; our Sr Dev says there is some advantage but he can't remember what it is.
It may make sense in a way if one was inserting a "select *..." from a table with lots of columns, and we want to be lazy, but... we're not. We're enumerating each column in the table.
Database is Oracle 11gR2, but this query was written probably in 10g or before.

we want to be lazy
No, we use insert into table(col1, col2) select col1, col2 from ... when there are a lot of records (for example, 1M) and we don't want to write a values section for each one. Imagine how much time it would take to write
insert into table (col1, col2)
values (select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 1);
insert into table (col1, col2)
values (select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 2);
...
insert into table (col1, col2)
values (select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 1000000);
insert select is a faster way to copy data from one table (or several tables) to another table.

In a nutshell: it's a lot easier, especially when you have a massive query that you don't want to rebuild, or if you have a ton of objects or values to insert.

Without WITH CHECK OPTION specified, I don't know of any purpose for this syntax. If you specify WITH CHECK OPTION, you can effectively implement an ad-hoc check constraint within your insert statement.
insert into
( select COL1
, COL2
from Table1
where 1=2 WITH CHECK OPTION
) select COL3, COL4 from Table2 ;
The above will never insert a record, because 1 will never equal 2.
The statement below will insert a record as long as COL3 is less than 100, otherwise an exception is raised.
insert into
( select COL1
, COL2
from Table1
where COL1 < 100 WITH CHECK OPTION
) select COL3, COL4 from Table2 ;
