Best Practice: Select * on CTE - sql

The following post had compelling reasons for generally avoiding the use of select * in SQL.
Why is SELECT * considered harmful?
The discussion included examples of when it was or wasn't acceptable to use select *. However, I did not see any discussion of common table expressions (CTEs). Are there any drawbacks to using select * in CTEs?
Example:
WITH CTE1 AS
(
SELECT Doc, TotalDue
FROM ARInvoices
WHERE CustomerName = 'ABC'
UNION
SELECT Doc, - TotalDue
FROM ARInvoiceMemos
WHERE CustomerName = 'ABC'
)
select * from CTE1
UNION
Select 'Total' as Doc, sum(TotalDue)
FROM CTE1

Since you already properly listed the column names in the cte, I don't see any harm in using select * from the cte.
In fact, it might be just the right place to use select *, since there is no point in listing the columns twice.
The exception is when you don't need all the columns returned by the cte (e.g. a column from the cte is used in the query, but not in the select clause). In that case, I would suggest listing only the columns you need, even if the from clause points to a cte.
Note that if the cte itself uses select *, then all of the drawbacks listed in the post you linked to apply to it.
My main objection to select * is that it's usually used by lazy developers who don't consider the consequences of the *.
Note: Everything I've written here applies to derived tables as well.
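To illustrate the point about select * inside the cte, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) instead of the engine in the question, and the InternalNote column is a made-up schema change. It compares the output columns of the two styles before and after a column is added to the base table.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ARInvoices (Doc TEXT, TotalDue REAL, CustomerName TEXT)")
conn.execute("INSERT INTO ARInvoices VALUES ('INV-1', 100.0, 'ABC')")

# select * inside the cte: the outer select * inherits whatever the table has.
STAR_INSIDE = """
WITH CTE1 AS (SELECT * FROM ARInvoices WHERE CustomerName = 'ABC')
SELECT * FROM CTE1
"""

# Columns listed inside the cte: the outer select * is pinned to that list.
LISTED_INSIDE = """
WITH CTE1 AS (SELECT Doc, TotalDue FROM ARInvoices WHERE CustomerName = 'ABC')
SELECT * FROM CTE1
"""

def column_count(sql):
    """Return how many columns the query produces."""
    return len(conn.execute(sql).description)

before_star = column_count(STAR_INSIDE)
before_listed = column_count(LISTED_INSIDE)

# Hypothetical schema change made later by someone else.
conn.execute("ALTER TABLE ARInvoices ADD COLUMN InternalNote TEXT")

after_star = column_count(STAR_INSIDE)
after_listed = column_count(LISTED_INSIDE)

print("star inside cte:  ", before_star, "->", after_star)
print("listed inside cte:", before_listed, "->", after_listed)
```

Only the query with select * inside the cte changes shape when the table changes; the outer select * is harmless as long as the cte lists its columns.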

In theory, the rule of thumb that select * is ill-advised always applies.
In practice though, if you are a developer who considers things like design and general good programming practice as important as functionality, your CTE will most likely be coded to only return the columns which are actually needed, so select * from CTE1 might not be so bad.

Related

In-line CTE in SQL

Until today I thought that the CTE had to go before the main SELECT clause, but it appears that a CTE can be dropped into any subselect. As an example:
-- tested on BigQuery, Redshift, MySQL, Postgres
SELECT CONCAT(Odd.num, Even.num) FROM
(WITH Numbers AS (SELECT 1 AS num UNION ALL SELECT 3 UNION ALL SELECT 5) SELECT * FROM Numbers) Odd,
(WITH Numbers AS (SELECT 2 AS num UNION ALL SELECT 4 UNION ALL SELECT 6) SELECT * FROM Numbers) Even
Outside of just playing around with things though, is there ever a case for this sort of "embedded" CTE, or is there no reason to ever have it outside of the top, as if it's a variable declaration?
Not only that, you can have CTEs inside of CTEs.
WITH CTE1 AS (
WITH CTE2 AS (
…
)
)
SELECT …
I had to work on someone’s query that did this 13 levels deep. What a pain. The main reason this isn’t good is that it makes a mess of readability. I think what you are describing isn’t quite as bad as it keeps subsections of the query together. Doing oddball things like this generally leads to confusion when others try to understand your query.
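For comparison, here is a small sketch of the embedded form next to a flat list of CTEs at the top. It assumes SQLite via Python's sqlite3 module (not one of the engines listed in the question) and renames the two Numbers CTEs to Odd and Even so they can sit side by side. Both forms should return the same rows.

```python
import sqlite3

# Assumption: SQLite is used only so the comparison can be run locally.
conn = sqlite3.connect(":memory:")

# CTEs embedded in each subselect, as in the question.
INLINE = """
SELECT Odd.num AS odd_num, Even.num AS even_num FROM
  (WITH Numbers AS (SELECT 1 AS num UNION ALL SELECT 3 UNION ALL SELECT 5)
   SELECT num FROM Numbers) AS Odd,
  (WITH Numbers AS (SELECT 2 AS num UNION ALL SELECT 4 UNION ALL SELECT 6)
   SELECT num FROM Numbers) AS Even
ORDER BY odd_num, even_num
"""

# The same query with both CTEs declared once, before the main SELECT.
FLAT = """
WITH Odd AS (SELECT 1 AS num UNION ALL SELECT 3 UNION ALL SELECT 5),
     Even AS (SELECT 2 AS num UNION ALL SELECT 4 UNION ALL SELECT 6)
SELECT Odd.num AS odd_num, Even.num AS even_num
FROM Odd, Even
ORDER BY odd_num, even_num
"""

inline_rows = conn.execute(INLINE).fetchall()
flat_rows = conn.execute(FLAT).fetchall()

print(inline_rows == flat_rows, len(flat_rows))
```

The flat form needs distinct CTE names, which is arguably a readability benefit in itself.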

Sql server sorting view results slow

I have a SQL Server database view which has lots of inner join operations. This view works fine when performing select operations. It is not very fast, but within reason.
SELECT * FROM ViewName WHERE ItemId=1234
However, when sorting the results of this view, performance drops to an unacceptable level.
SELECT * FROM ViewName WHERE ItemId=1234 ORDER BY CompanyName
This seems a bit strange, because when I run the same query via a temporary table:
SELECT * INTO #temp FROM ViewName WHERE ItemId=1234
SELECT * FROM #temp ORDER BY CompanyName
it is very fast.
Is there a way to make the sorting of my view data faster, without using the temporary table solution? That is, can I force the query to do the selection first and then the sorting?
There are a few variants you can try that sometimes offer better performance. The key really is to look at the execution plan when you run the query with and without the ORDER BY, and see what is different.
One option is to use a subquery as a derived table:
SELECT *
FROM (
SELECT *
FROM ViewName
WHERE ItemId=1234
) AS dt
ORDER BY CompanyName
Another option is to use a common table expression, which I always prefer over a derived table if possible, because it is more readable:
WITH cte AS (
SELECT *
FROM ViewName
WHERE ItemId=1234
)
SELECT *
FROM cte
ORDER BY CompanyName
A third option is to use index hints, to force it to use the correct index. I always try to avoid this option though, because it can cause issues in the future if the data or structure change. You can read some more opinions about index hints here:
https://www.brentozar.com/archive/2013/10/index-hints-helpful-or-harmful/
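The advice to compare the plans with and without the ORDER BY can be sketched as follows. This is only an illustration of the workflow: it assumes SQLite and its EXPLAIN QUERY PLAN statement (SQL Server exposes execution plans through its own tooling), and the Items and Companies tables behind ViewName are invented for the example.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (ItemId INTEGER, CompanyId INTEGER);
CREATE TABLE Companies (CompanyId INTEGER PRIMARY KEY, CompanyName TEXT);
CREATE VIEW ViewName AS
  SELECT i.ItemId, c.CompanyName
  FROM Items i INNER JOIN Companies c ON c.CompanyId = i.CompanyId;
""")

def plan(sql):
    """Return the human-readable step descriptions of the query plan."""
    # The description text is the last column of each EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

unsorted_plan = plan("SELECT * FROM ViewName WHERE ItemId = 1234")
sorted_plan = plan(
    "SELECT * FROM ViewName WHERE ItemId = 1234 ORDER BY CompanyName"
)

# Steps that only show up once the ORDER BY is added.
extra_steps = [step for step in sorted_plan if step not in unsorted_plan]

for step in extra_steps:
    print(step)
```

Whatever differs between the two plans (an extra sort step, a different join order, a different index) is the place to start looking.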

Not Able to Query Multiple Times from Multiple Common Table Expressions (WITH)?

I was doing some querying today in T-SQL on SQL Server 2008 and stumbled upon something weird that I didn't understand. Using the query window, I am trying to query from two common table expressions like so (I stripped out a lot of code to make it more obvious what I was doing):
;WITH temp1 AS (SELECT * FROM dbo.Log)
, temp2 AS (SELECT * FROM dbo.SignalCodeItems300_tbl)
SELECT * FROM temp1
SELECT * FROM temp2
However, only one of the select statements will run, the FIRST one. Regardless of which is which, only the first runs. I assume this is some sort of syntax thing that I'm missing maybe? I get the error "Invalid object name 'temp2'".
Could someone shed some light on this problem? Are there any workarounds for this?
No, this works as it should. A CTE (Common Table Expression) is only available for the first statement after the definition. So in other words, after select * from temp1, they both become unavailable.
The fix would be this:
;WITH temp1 AS (SELECT * FROM dbo.Log)
SELECT * FROM temp1
;WITH temp2 AS (SELECT * FROM dbo.SignalCodeItems300_tbl)
SELECT * FROM temp2
You might want to take a look at the MSDN documentation.
Especially:
Multiple CTE query definitions can be defined in a nonrecursive CTE.
The definitions must be combined by one of these set operators:
UNION ALL, UNION, INTERSECT, or EXCEPT.
You cannot mix and match two different schemas though, as this essentially runs as one query.
Use a view or a user-defined, table-valued function to house your query if you don't want to repeat it explicitly.
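The scoping rule can be reproduced in a few lines. The sketch below assumes SQLite through Python's sqlite3 module instead of SQL Server, and uses literal one-row CTEs in place of dbo.Log and dbo.SignalCodeItems300_tbl. The error text differs between engines, but the behaviour is the same: a CTE is visible only to the statement it is attached to.

```python
import sqlite3

# Assumption: SQLite is used for illustration; table contents are made up.
conn = sqlite3.connect(":memory:")

# Statement 1: the CTE is defined and consumed in the same statement.
first = conn.execute(
    "WITH temp1 AS (SELECT 1 AS x) SELECT x FROM temp1"
).fetchall()

# Statement 2: a new statement, so temp1 is no longer defined.
try:
    conn.execute("SELECT x FROM temp1")
    second_error = None
except sqlite3.OperationalError as exc:
    second_error = str(exc)

# Both CTEs can be used if everything stays inside one statement,
# here by combining two compatible result sets with UNION ALL.
both = conn.execute("""
WITH temp1 AS (SELECT 1 AS x),
     temp2 AS (SELECT 2 AS x)
SELECT x FROM temp1
UNION ALL
SELECT x FROM temp2
""").fetchall()

print(first)
print(second_error)
print(sorted(both))
```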

What do you put in a subquery's Select part when it's preceded by Exists?

Select *
From some_table
Where Exists (Select 1
From some_other_table
Where some_condition )
I usually use 1. I used to put *, but realized it could add some useless overhead.
What do you put? Is there a more efficient way than putting 1 or any other dummy value?
I think the efficiency depends on your platform.
In Oracle, SELECT * and SELECT 1 within an EXISTS clause generate identical explain plans, with identical memory costs. There is no difference. However, other platforms may vary.
As a matter of personal preference, I use
SELECT *
because SELECTing a specific field could mislead a reader into thinking that I care about that specific field, and it also lets me copy / paste that subquery out and run it unmodified, to look at the output.
However, an EXISTS clause in a SQL statement is a bit of a code smell, IMO. There are times when it is the best and clearest way to get what you want, but it can often be expressed as a join, which some database engines may find easier to optimize.
SELECT *
FROM SOME_TABLE ST
WHERE EXISTS(
SELECT 1
FROM SOME_OTHER_TABLE SOT
WHERE SOT.KEY_VALUE1 = ST.KEY_VALUE1
AND SOT.KEY_VALUE2 = ST.KEY_VALUE2
)
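This returns the same rows as the following join only when each row of SOME_TABLE matches at most one row of SOME_OTHER_TABLE; otherwise the join returns one copy per match (note ST.* rather than *, so that only SOME_TABLE's columns come back):
SELECT ST.*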
FROM
SOME_TABLE ST
INNER JOIN
SOME_OTHER_TABLE SOT
ON ST.KEY_VALUE1 = SOT.KEY_VALUE1
AND ST.KEY_VALUE2 = SOT.KEY_VALUE2
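One caveat when swapping EXISTS for a join is duplicate matches. The sketch below assumes SQLite via Python's sqlite3 module, a single key column for brevity, and made-up rows in which key 1 appears twice in some_other_table. It shows the two forms returning different row counts.

```python
import sqlite3

# Assumption: SQLite and this sample data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE some_table (key_value1 INTEGER, label TEXT);
CREATE TABLE some_other_table (key_value1 INTEGER);
INSERT INTO some_table VALUES (1, 'a'), (2, 'b');
INSERT INTO some_other_table VALUES (1), (1), (3);
""")

# EXISTS is a semi-join: each outer row appears at most once.
exists_rows = conn.execute("""
SELECT st.key_value1, st.label
FROM some_table st
WHERE EXISTS (SELECT 1
              FROM some_other_table sot
              WHERE sot.key_value1 = st.key_value1)
""").fetchall()

# INNER JOIN repeats the outer row once per matching inner row.
join_rows = conn.execute("""
SELECT st.key_value1, st.label
FROM some_table st
INNER JOIN some_other_table sot
        ON sot.key_value1 = st.key_value1
""").fetchall()

print(len(exists_rows), len(join_rows))
```

If the join key is unique in the second table, the two queries agree; otherwise the join needs a DISTINCT or similar to match the EXISTS result.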
I also use 1. I've seen some devs who use null. I think 1 is efficient compared to selecting a field, as the query won't have to get the actual value from its physical location when it executes the select clause of the subquery.
Use:
WHERE EXISTS (SELECT NULL
FROM some_other_table
WHERE ... )
EXISTS returns true if the subquery returns one or more rows; it doesn't matter which columns, if any, are actually returned in the SELECT clause. NULL just makes it explicit that there isn't a comparison, while 1/etc. could be mistaken for a valid value, such as one previously used in an IN clause.
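As a quick check that the select list is irrelevant to the result, the sketch below runs the same EXISTS query with 1, NULL and * in the subquery. It assumes SQLite via Python's sqlite3 module and invented table contents, and it compares results only, not performance on any particular platform.

```python
import sqlite3

# Assumption: SQLite and this sample data are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE some_table (id INTEGER);
CREATE TABLE some_other_table (id INTEGER);
INSERT INTO some_table VALUES (1), (2);
INSERT INTO some_other_table VALUES (1), (1);
""")

def matching_ids(select_list):
    """Run the EXISTS query with the given subquery select list."""
    sql = f"""
    SELECT t.id
    FROM some_table t
    WHERE EXISTS (SELECT {select_list}
                  FROM some_other_table o
                  WHERE o.id = t.id)
    ORDER BY t.id
    """
    return conn.execute(sql).fetchall()

# Same rows regardless of what the subquery nominally selects.
results = {s: matching_ids(s) for s in ("1", "NULL", "*")}

for select_list, rows in results.items():
    print(select_list, rows)
```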