I'm working on speeding up a query, and I've noticed that the more empty columns I add to it, the slower it gets.
With only the Id column the query returns 100k records in approx. 1 second.
If I add about 20 empty columns it goes to 4 seconds.
Questions
- What is the default data type of an empty string literal ('') in SQL?
- Any way to speed this up?
SELECT Id,
'' as col1,
'' as col2,
'' as col3
FROM myTable
It depends on how many rows are in myTable. For example, if myTable has 905k rows, SQL has to produce 20 columns of '' for each of those 905k rows.
I just tried it on my own table, which has 805k rows. For every column I add, SQL creates a '' value for each row.
I hope this makes it clearer.
The default data type seems to be varchar(1); you can insert the result into a temp table and check the temp table's structure to confirm. One option you can try is to declare a variable and use it instead of the empty string literals:
declare @space varchar(40) = ''
SELECT
id,
@space as col1,
@space as col2,
@space as col3
FROM dbo.[table]
I have a list of values such as
1,2,3,4...
that will be passed into my SQL query.
I need to have these values stored in a table variable. So essentially I need something like this:
declare @t table (num int)
insert into @t values (1),(2),(3),(4)...
Is it possible to do that formatting in SQL Server (turning 1,2,3,4... into (1),(2),(3),(4)...)?
Note: I cannot change what those values look like before they get to my SQL script; I'm stuck with that list. Also, it may not always be 4 values; it could be 1 or more.
Edit to show what values look like: under normal circumstances, this is how it would work:
select t.pk
from a_table t
where t.pk in (#placeholder#)
#placeholder# is just a literal placeholder. When someone runs the report, #placeholder# is replaced with the literal values from that report's filter:
select t.pk
from a_table t
where t.pk in (1,2,3,4) -- or whatever the user selects
t.pk is an int.
Note: doing
declare @t as table (
num int
)
insert into @t values (#placeholder#)
does not work.
Your description is a bit ridiculous, but you might give this a try:
Whatever you mean with this:
I see what you're trying to say; but if I type out '#placeholder#' in the script, I'll end up with '1','2','3','4' and not '1,2,3,4'
I assume this is a string of numbers, each number between single quotes, separated by commas:
DECLARE @passedIn VARCHAR(100)='''1'',''2'',''3'',''4'',''5'',''6'',''7''';
SELECT @passedIn; -->: '1','2','3','4','5','6','7'
Now the variable @passedIn holds exactly what you are talking about.
I'll use a dynamic SQL statement to insert this into a temp table (a declared table variable would not work here...)
CREATE TABLE #tmpTable(ID INT);
DECLARE @cmd VARCHAR(MAX)=
'INSERT INTO #tmpTable(ID) VALUES (' + REPLACE(SUBSTRING(@passedIn,2,LEN(@passedIn)-2),''',''','),(') + ');';
EXEC (@cmd);
SELECT * FROM #tmpTable;
GO
DROP TABLE #tmpTable;
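To see what string the REPLACE/SUBSTRING expression above actually produces, here is a minimal sketch that mirrors the same two steps in Python (not T-SQL). The function name build_insert is made up for illustration, and the input is assumed to be a quoted, comma-separated list exactly as described above.

```python
def build_insert(passed_in: str, table: str = "#tmpTable") -> str:
    """Mirror the T-SQL string manipulation step by step."""
    # Same effect as SUBSTRING(s, 2, LEN(s) - 2): drop the outermost quotes.
    inner = passed_in[1:-1]
    # Same effect as REPLACE(..., ''',''', '),('): every quote-comma-quote
    # separator becomes the boundary between two VALUES rows.
    rows = inner.replace("','", "),(")
    return f"INSERT INTO {table}(ID) VALUES ({rows});"


cmd = build_insert("'1','2','3','4'")
print(cmd)  # INSERT INTO #tmpTable(ID) VALUES (1),(2),(3),(4);
```

Note that this only shows the text transformation; it does nothing to validate the input, so the usual caution about dynamic SQL and untrusted values still applies.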
UPDATE 1: no dynamic SQL necessary, all ad-hoc...
You can get the list of numbers as derived table in a CTE easily.
This can be used in a following statement like WHERE SomeID IN(SELECT ID FROM MyIDs) (similar to this: dynamic IN section )
WITH MyIDs(ID) AS
(
SELECT A.B.value('.','int') AS ID
FROM
(
SELECT CAST('<x>' + REPLACE(SUBSTRING(@passedIn,2,LEN(@passedIn)-2),''',''','</x><x>') + '</x>' AS XML) AS AsXml
) as tbl
CROSS APPLY tbl.AsXml.nodes('/x') AS A(B)
)
SELECT * FROM MyIDs
UPDATE 2:
And to answer your question exactly:
If this follows the CTE:
insert into @t(num)
SELECT ID FROM MyIDs
... your declared table variable gets filled, in case you need it later.
A dynamic query gives me a table with only one row. It has 1 + d + n columns, and the problem is that d and n are variable, so I could have a row with 5, 6, 7, ... d values and 10, 11, ... n values or more,
like:
Input:
x f1 f2 f3 ... fd Other1 Other2 Other3 ... Othern
10 1.0000 139.0000 60.0000 ... 59.0000 846.0000 30.1000 0.3980 ... 0.398
I need to do some calculations with, let's say, x, f1, Other1 for the first column; x, f1, f2, Other1, Other2 for the second column; x, f1, f3, Other1, Other3 for the third column; and so on, to fill another table like:
Column_1 Column_2 Column_3 ..Column_d
x*(f1*f1)/(Other1*Other1) x*(f1*f2)/(Other1*Other2) x*(f1*f3)/(Other1*Other3)..x*(f1*fn)/(Other1*Othern)
x*(f2*f1)/(Other2*Other1) x*(f2*f2)/(Other2*Other2) x*(f2*f3)/(Other2*Other3)..x*(f2*fn)/(Other2*Othern) ...
...
x*(fd*f1)/(Otherd*Other1) x*(fd*f2)/(Otherd*Other2) x*(fd*f3)/(Otherd*Other3)..x*(fd*fn)/(Otherd*Othern)
I was thinking of first saving the columns that I need in a nested loop and updating until I reach the end of the table, but as I need to do that d times I'm getting a little confused. So my questions are:
Could I use a cursor to get the output?
Could I select all the variables first to do that?
Using a pivot should do the trick, but how?
I don't know, and the main issue is that the input table has d dynamic columns.
I am trying to write a stored procedure that dynamically constructs the SQL query in code before executing it, but I've had no luck.
Thanks in advance.
I hope the question is more readable now. Thank you.
PS.
The x is another input, so it has nothing to do with the n elements in columns Other1...Othern.
--------------EDIT-----------------
To generate the input table with one row:
I use a dynamic query to select various fields. As they are dynamic, I use a template string whose placeholders are replaced later, so the general code is:
SET @template = 'SELECT SUM(1) AS x,{f}, {other} FROM '+ @table_name
--then in some loops I calculate sums, powers, etc...
--so after I build the strings with the chosen queries I substitute them like
SET @template = REPLACE(@template, '{f}' , @dynamicStringForf )
SET @template = REPLACE(@template, '{other}', @dynamicStringForOther )
--Finally I get a large query with the objective I need
--something like:
'SELECT SUM(1) AS x, sum(a+b) as f1,pow(b,c) as f2....,sum(x+y) as Other1 ,pow(y+z) as Other2... FROM '+ @table_name
The result is one row with data like:
x f1 f2 f3 ... fd Other1 Other2 Other3 ... Othern
10 1.0000 139.0000 60.0000 ... 59.0000 846.0000 30.1000 0.3980 ... 0.398
Now I have created a new temp table dynamically
--@d could be any number, but at this stage I know it
Set @TempColumn = ''
Set @TempCol = ''
Set @Comma = ''
Set @ColumnNo = 1
Set @SQL = 'Create Table temp ('
WHILE @ColumnNo <= @d Begin
Set @TempColumn =@TempColumn + @Comma + ' Column_' + Cast(@ColumnNo as nvarchar)
Set @SQL =@SQL + @Comma + ' Column_' + Cast(@ColumnNo as nvarchar) + ' FLOAT'
Set @Comma = ','
Set @ColumnNo = @ColumnNo + 1
END
Set @SQL = @SQL + ' )'
EXEC (@SQL) --create temp table
--the result is a new table like:
Column_1 Column_2 Column_3 ... Column_d
Now I want to populate it, something like:
Column_1 Column_2 Column_3 ..Column_d
x*(f1*f1)/(Other1*Other1) x*(f1*f2)/(Other1*Other2) x*(f1*f3)/(Other1*Other3)..x*(f1*fn)/(Other1*Othern)
x*(f2*f1)/(Other2*Other1) x*(f2*f2)/(Other2*Other2) x*(f2*f3)/(Other2*Other3)..x*(f2*fn)/(Other2*Othern) ...
...
x*(fd*f1)/(Otherd*Other1) x*(fd*f2)/(Otherd*Other2) x*(fd*f3)/(Otherd*Other3)..x*(fd*fn)/(Otherd*Othern)
Any idea how to accomplish this? Union, cursor, pivot: which would be best?
SQL is really good at working with sets of data -- as you want to do, if that data is stored as rows.
I would think the best way to solve this problem is to transform the data into a table with two columns (one column storing the f values and the second column storing the other values) with D rows.
Then the solution is fairly simple (a join and a pivot statement).
Even better -- re-write the prior query to give you the data in this format. (Do you have the prior query? I could show you how to do that.)
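To make the two-column idea concrete, here is a minimal sketch in Python with an in-memory SQLite database rather than SQL Server. The table name pairs, its columns and the sample values are made up for illustration, and x is passed in as a query parameter. A self cross join yields every (i, j) combination, and the pivot into a d-by-d grid is done in Python at the end instead of with a PIVOT statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per index i, holding f_i and Other_i side by side (hypothetical layout).
conn.execute("CREATE TABLE pairs (idx INTEGER PRIMARY KEY, f REAL, other REAL)")
conn.executemany("INSERT INTO pairs VALUES (?, ?, ?)", [(1, 1.0, 1.0), (2, 2.0, 4.0)])

x = 2.0  # the separate input x from the question

# Every (i, j) cell is x * (f_i * f_j) / (Other_i * Other_j).
cells = conn.execute(
    """
    SELECT a.idx, b.idx, ? * (a.f * b.f) / (a.other * b.other)
    FROM pairs AS a CROSS JOIN pairs AS b
    ORDER BY a.idx, b.idx
    """,
    (x,),
).fetchall()

# Pivot the long (row, column, value) list into a grid of rows.
d = max(row for row, _, _ in cells)
matrix = [[0.0] * d for _ in range(d)]
for row, col, value in cells:
    matrix[row - 1][col - 1] = value

print(matrix)  # [[2.0, 1.0], [1.0, 0.5]]
```

Because the data is stored as rows, d can change without touching the query; only the final pivot step depends on how many columns the output needs.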
Well to generate the row I described I have something like:
SET @template = 'SELECT SUM(1) AS N,{f}, {other} FROM '+ @table_name
then I replace in a loop the fields I need so,
SET @template = REPLACE(@template, '{f}' , @f)
SET @template = REPLACE(@template, '{other}', @other)
I don't understand how this works... it looks like you are just selecting variables -- are those variables column names? Please clarify -- I'm sure there is a better way to build this query.
Could you explain me how to do that join and that pivot you describe?
If I use a pivot, how do I change the selected columns, to compute the terms as I described?
I will when we know what your data structure looks like for sure, I need something to test against.
f has d values and Other has n. If I put the values in a 2-column table, how do I handle n > d and a lot of nulls in the f column?
Often, when you have a lot of nulls, you use GROUP BY to "squish" the rows down.
I am trying to get a column from the DB whose name is variable and depends on the row data. I know I can get a variable column name using dynamic SQL, but what if the name actually depends on the row's information?
SELECT name,age FROM dbo.Names
--Returns 'name' as column name
SELECT name as [xyz],age FROM dbo.Names
--Returns 'xyz' as column name
EXEC ('SELECT name as [' + @var + '], age FROM dbo.Names')
--Returns @var value as column name
SELECT name AS ['Hi: ' + age ] FROM dbo.Names ?????
--So I am trying to get 'Hi: 25' or 'Hi: 40' as column name
How would I do that? Any help please?
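You can combine the approaches for a single row, but not when selecting all rows:
DECLARE @age INT, @sql NVARCHAR(200)
SET @age = (SELECT TOP 1 age FROM dbo.Names)
-- EXEC() accepts only variables and literals, so build the statement first
SET @sql = 'SELECT name AS [Hi ' + CAST(@age AS VARCHAR(10)) + '], age FROM dbo.Names'
EXEC (@sql)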
Why do you need to do this in SQL and not in application logic?
I'm going to have to change it on the application side. I don't think it's possible on the SQL side. That logic only works for one row; I need multiple rows.
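A result set has one set of column names for all rows, so a per-row label has to be built outside the query. Here is a minimal sketch of doing that on the application side, in Python with an in-memory SQLite table standing in for dbo.Names. The sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Names (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO Names VALUES (?, ?)", [("Ann", 25), ("Bob", 40)])

# Build the 'Hi: <age>' label per row after fetching, not inside the SELECT.
labelled = [
    {f"Hi: {age}": name}
    for name, age in conn.execute("SELECT name, age FROM Names ORDER BY age")
]

print(labelled)  # [{'Hi: 25': 'Ann'}, {'Hi: 40': 'Bob'}]
```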
Is there any RDBMS that implements something like SELECT * EXCEPT? What I'm after is getting all of the fields except a specific TEXT/BLOB field, and I'd like to just select everything else.
Almost daily I complain to my coworkers that someone should implement this... It's terribly annoying that it doesn't exist.
Edit: I understand everyone's concern for SELECT *. I know the risks associated with SELECT *. However, this, at least in my situation, would not be used for any Production level code, or even Development level code; strictly for debugging, when I need to see all of the values easily.
As I've stated in some of the comments, where I work is strictly a commandline shop, doing everything over ssh. This makes it difficult to use any gui tools (external connections to the database aren't allowed), etc etc.
Thanks for the suggestions though.
As others have said, it is not a good idea to do this in a query because it is prone to issues when someone changes the table structure in the future. However, there is a way to do this... and I can't believe I'm actually suggesting this, but in the spirit of answering the ACTUAL question...
Do it with dynamic SQL... this does all the columns except the "description" column. You could easily turn this into a function or stored proc.
declare @sql varchar(8000),
@table_id int,
@col_id int
set @sql = 'select '
select @table_id = id from sysobjects where name = 'MY_Table'
select @col_id = min(colid) from syscolumns where id = @table_id and name <> 'description'
while (@col_id is not null) begin
select @sql = @sql + name from syscolumns where id = @table_id and colid = @col_id
select @col_id = min(colid) from syscolumns where id = @table_id and colid > @col_id and name <> 'description'
if (@col_id is not null) set @sql = @sql + ','
print @sql
end
set @sql = @sql + ' from MY_table'
exec (@sql)
Create a view on the table which doesn't include the blob columns
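A minimal sketch of the view approach, using Python with an in-memory SQLite database so it can be run anywhere. The table docs, the view docs_slim and the column names are hypothetical. The view lists every column except the blob once, and after that SELECT * against the view gives the short form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, title TEXT, body BLOB)")
conn.execute("INSERT INTO docs VALUES (1, 'readme', x'00ff')")

# The view names the wanted columns once; the blob column is left out.
conn.execute("CREATE VIEW docs_slim AS SELECT id, title FROM docs")

cur = conn.execute("SELECT * FROM docs_slim")
columns = [col[0] for col in cur.description]
rows = cur.fetchall()

print(columns)  # ['id', 'title']
print(rows)     # [(1, 'readme')]
```

The trade-off is that the view has to be updated by hand when columns are added to the base table.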
Is there any RDBMS that implements something like SELECT * EXCEPT?
Yes, Google BigQuery implements SELECT * EXCEPT:
A SELECT * EXCEPT statement specifies the names of one or more columns to exclude from the result. All matching column names are omitted from the output.
WITH orders AS(
SELECT 5 as order_id,
"sprocket" as item_name,
200 as quantity
)
SELECT * EXCEPT (order_id)
FROM orders;
Output:
+-----------+----------+
| item_name | quantity |
+-----------+----------+
| sprocket | 200 |
+-----------+----------+
EDIT:
H2 database also supports SELECT * EXCEPT (col1, col2, ...) syntax.
Wildcard expression
A wildcard expression in a SELECT statement. A wildcard expression represents all visible columns. Some columns can be excluded with optional EXCEPT clause.
EDIT 2:
Hive supports: REGEX Column Specification
A SELECT statement can take regex-based column specification in Hive releases prior to 0.13.0, or in 0.13.0 and later releases if the configuration property hive.support.quoted.identifiers is set to none.
The following query selects all columns except ds and hr.
SELECT `(ds|hr)?+.+` FROM sales
EDIT 3:
Snowflake also now supports this, spelled SELECT * EXCLUDE (plus a RENAME option for aliasing the remaining columns):
EXCLUDE col_name
EXCLUDE (col_name, col_name, ...)
When you select all columns (SELECT *), specifies the columns that should be excluded from the results.
RENAME col_name AS col_alias
RENAME (col_name AS col_alias, col_name AS col_alias, ...)
When you select all columns (SELECT *), specifies the column aliases that should be used in the results.
and so does Databricks SQL (since Runtime 11.0)
star_clause
[ { table_name | view_name } . ] * [ except_clause ]
except_clause
EXCEPT ( { column_name | field_name } [, ...] )
and also DuckDB
-- select all columns except the city column from the addresses table
SELECT * EXCLUDE (city) FROM addresses;
-- select all columns from the addresses table, but replace city with LOWER(city)
SELECT * REPLACE (LOWER(city) AS city) FROM addresses;
-- select all columns matching the given regex from the table
SELECT COLUMNS('number\d+') FROM addresses;
DB2 allows for this. Columns have an attribute/specifier of Hidden.
From the syscolumns documentation
HIDDEN
CHAR(1) NOT NULL WITH DEFAULT 'N'
Indicates whether the column is implicitly hidden:
P Partially hidden. The column is implicitly hidden from SELECT *.
N Not hidden. The column is visible to all SQL statements.
From the CREATE TABLE documentation: as part of creating your column, you specify the IMPLICITLY HIDDEN modifier.
An example DDL from the Implicitly Hidden Columns documentation follows:
CREATE TABLE T1
(C1 SMALLINT NOT NULL,
C2 CHAR(10) IMPLICITLY HIDDEN,
C3 TIMESTAMP)
IN DB.TS;
Whether this capability is such a deal maker to drive the adoption of DB2 is left as an exercise to future readers.
Is there any RDBMS that implements something like SELECT * EXCEPT
Yes! The truly relational language Tutorial D allows projection to be expressed in terms of the attributes to be removed instead of the ones to be kept e.g.
my_relvar { ALL BUT description }
In fact, its equivalent to SQL's SELECT * is { ALL BUT }.
Your proposal for SQL is a worthy one but I heard it has already been put to the SQL standard's committee by the users' group and rejected by the vendor's group :(
It has also been explicitly requested for SQL Server but the request was closed as 'won't fix'.
Yes, finally there is :) SQL Standard 2016 defines Polymorphic Table Functions
SQL:2016 introduces polymorphic table functions (PTF) that don't need to specify the result type upfront. Instead, they can provide a describe component procedure that determines the return type at run time. Neither the author of the PTF nor the user of the PTF need to declare the returned columns in advance.
PTFs as described by SQL:2016 are not yet available in any tested database. Interested readers may refer to the free technical report “Polymorphic table functions in SQL” released by ISO. The following are some of the examples discussed in the report:
CSVreader, which reads the header line of a CSV file to determine the number and names of the return columns
Pivot (actually unpivot), which turns column groups into rows (example: phonetype, phonenumber) -- me: no more hardcoded strings :)
TopNplus, which passes through N rows per partition and one extra row with the totals of the remaining rows
Oracle 18c implements this mechanism. See "18c Skip_col Polymorphic Table Function Example" on Oracle Live SQL and "Skip_col Polymorphic Table Function Example".
This example shows how to skip data based on name/specific datatype:
CREATE PACKAGE skip_col_pkg AS
-- OVERLOAD 1: Skip by name
FUNCTION skip_col(tab TABLE, col columns)
RETURN TABLE PIPELINED ROW POLYMORPHIC USING skip_col_pkg;
FUNCTION describe(tab IN OUT dbms_tf.table_t,
col dbms_tf.columns_t)
RETURN dbms_tf.describe_t;
-- OVERLOAD 2: Skip by type --
FUNCTION skip_col(tab TABLE,
type_name VARCHAR2,
flip VARCHAR2 DEFAULT 'False')
RETURN TABLE PIPELINED ROW POLYMORPHIC USING skip_col_pkg;
FUNCTION describe(tab IN OUT dbms_tf.table_t,
type_name VARCHAR2,
flip VARCHAR2 DEFAULT 'False')
RETURN dbms_tf.describe_t;
END skip_col_pkg;
and body:
CREATE PACKAGE BODY skip_col_pkg AS
/* OVERLOAD 1: Skip by name
* NAME: skip_col_pkg.skip_col
* ALIAS: skip_col_by_name
*
* PARAMETERS:
* tab - The input table
* col - The name of the columns to drop from the output
*
* DESCRIPTION:
* This PTF removes all the input columns listed in col from the output
* of the PTF.
*/
FUNCTION describe(tab IN OUT dbms_tf.table_t,
col dbms_tf.columns_t)
RETURN dbms_tf.describe_t
AS
new_cols dbms_tf.columns_new_t;
col_id PLS_INTEGER := 1;
BEGIN
FOR i IN 1 .. tab.column.count() LOOP
FOR j IN 1 .. col.count() LOOP
tab.column(i).pass_through := tab.column(i).description.name != col(j);
EXIT WHEN NOT tab.column(i).pass_through;
END LOOP;
END LOOP;
RETURN NULL;
END;
/* OVERLOAD 2: Skip by type
* NAME: skip_col_pkg.skip_col
* ALIAS: skip_col_by_type
*
* PARAMETERS:
* tab - Input table
* type_name - A string representing the type of columns to skip
* flip - 'False' [default] => Match columns with given type_name
* otherwise => Ignore columns with given type_name
*
* DESCRIPTION:
* This PTF removes the given type of columns from the given table.
*/
FUNCTION describe(tab IN OUT dbms_tf.table_t,
type_name VARCHAR2,
flip VARCHAR2 DEFAULT 'False')
RETURN dbms_tf.describe_t
AS
typ CONSTANT VARCHAR2(1024) := upper(trim(type_name));
BEGIN
FOR i IN 1 .. tab.column.count() LOOP
tab.column(i).pass_through :=
CASE upper(substr(flip,1,1))
WHEN 'F' THEN dbms_tf.column_type_name(tab.column(i).description)
!=typ
ELSE dbms_tf.column_type_name(tab.column(i).description)
=typ
END /* case */;
END LOOP;
RETURN NULL;
END;
END skip_col_pkg;
And sample usage:
-- skip number cols
SELECT * FROM skip_col_pkg.skip_col(scott.dept, 'number');
-- only number cols
SELECT * FROM skip_col_pkg.skip_col(scott.dept, 'number', flip => 'True');
-- skip defined columns
SELECT *
FROM skip_col_pkg.skip_col(scott.emp, columns(comm, hiredate, mgr))
WHERE deptno = 20;
I highly recommend reading the entire example (which creates standalone functions instead of package calls).
You could easily overload the skip method, for example to skip columns that do not start or end with a specific prefix or suffix.
db<>fiddle demo
Related: How to Dynamically Change the Columns in a SQL Query By Chris Saxon
Stay away from SELECT *; you are setting yourself up for trouble. Always specify exactly which columns you want. It is in fact quite refreshing that the "feature" you are asking for doesn't exist.
I believe the rationale for it not existing is that the author of a query should (for performance sake) only request what they're going to look at/need (and therefore know what columns to specify) -- if someone adds a couple more blobs in the future, you'd be pulling back potentially large fields you're not going to need.
Here is a temp table option: just drop the columns that are not required and select * from the altered temp table.
/* Get the data into a temp table */
SELECT * INTO #TempTable
FROM
table
/* Drop the columns that are not needed */
ALTER TABLE #TempTable
DROP COLUMN [columnname]
SELECT * from #TempTable
declare @sql nvarchar(max),
@table sysname -- sysname avoids the padding/truncation a fixed char(10) would cause
set @sql = 'select '
set @table = 'table_name'
SELECT @sql = @sql + '[' + COLUMN_NAME + '],'
FROM INFORMATION_SCHEMA.Columns
WHERE TABLE_NAME = @table
and COLUMN_NAME <> 'omitted_column_name'
SET @sql = substring(@sql,1,len(@sql)-1) + ' from ' + @table
EXEC (@sql);
I needed something like what @Glen asks for, to make my life easier with HASHBYTES().
My inspiration was the answers from @Jasmine and @Zerubbabel. In my case I have different schemas, so the same table name appears more than once in sys.objects. As this may help someone in the same scenario, here it goes:
ALTER PROCEDURE [dbo].[_getLineExceptCol]
@table SYSNAME,
@schema SYSNAME,
@LineId int,
@exception VARCHAR(500)
AS
DECLARE @SQL NVARCHAR(MAX)
BEGIN
SET NOCOUNT ON;
SELECT @SQL = COALESCE(@SQL + ', ', ' ' ) + name
FROM sys.columns
WHERE name <> @exception
AND object_id = (SELECT object_id FROM sys.objects
WHERE name LIKE @table
AND schema_id = (SELECT schema_id FROM sys.schemas WHERE name LIKE @schema))
SELECT @SQL = 'SELECT ' + @SQL + ' FROM ' + @schema + '.' + @table + ' WHERE Id = ' + CAST(@LineId AS nvarchar(50))
EXEC(@SQL)
END
GO
It's an old question, but I hope this answer can still be helpful to others. It can also be modified to exclude more than one field. This can be very handy if you want to unpivot a table with many columns.
DECLARE @SQL NVARCHAR(MAX)
SELECT @SQL = COALESCE(@SQL + ', ', ' ' ) + name FROM sys.columns WHERE name <> 'colName' AND object_id = (SELECT id FROM sysobjects WHERE name = 'tblName')
SELECT @SQL = 'SELECT ' + @SQL + ' FROM ' + 'tblName'
EXEC sp_executesql @SQL
Stored Procedure:
usp_SelectAllExcept 'tblname', 'colname'
ALTER PROCEDURE [dbo].[usp_SelectAllExcept]
(
@tblName SYSNAME
,@exception VARCHAR(500)
)
AS
DECLARE @SQL NVARCHAR(MAX)
SELECT @SQL = COALESCE(@SQL + ', ', ' ' ) + name from sys.columns where name <> @exception and object_id = (Select id from sysobjects where name = @tblName)
SELECT @SQL = 'SELECT ' + @SQL + ' FROM ' + @tblName
EXEC sp_executesql @SQL
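The same catalog-driven idea (read the column list from metadata, drop the unwanted names, build the SELECT) can be sketched outside SQL Server too. Below is a minimal version in Python against SQLite. The helper names quote_ident and select_all_except and the sample table are hypothetical; PRAGMA table_info is SQLite's way of listing a table's columns. Identifiers are quoted here, which the procedures above do not do.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Quote an identifier for SQLite, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def select_all_except(conn: sqlite3.Connection, table: str, excluded: set) -> str:
    """Build a SELECT that lists every column of `table` except those in `excluded`."""
    info = conn.execute(f"PRAGMA table_info({quote_ident(table)})").fetchall()
    keep = [row[1] for row in info if row[1] not in excluded]  # row[1] is the column name
    if not keep:
        raise ValueError("no columns left to select")
    return f"SELECT {', '.join(quote_ident(c) for c in keep)} FROM {quote_ident(table)}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblName (Id INTEGER, colName BLOB, title TEXT)")
conn.execute("INSERT INTO tblName VALUES (1, x'00', 'hello')")

sql = select_all_except(conn, "tblName", {"colName"})
print(sql)  # SELECT "Id", "title" FROM "tblName"
result = conn.execute(sql).fetchall()
print(result)  # [(1, 'hello')]
```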
For the sake of completeness, this is possible in DremelSQL dialect, doing something like:
WITH orders AS
(SELECT 5 as order_id,
"foobar12" as item_name,
800 as quantity)
SELECT * EXCEPT (order_id)
FROM orders;
Result:
+-----------+----------+
| item_name | quantity |
+-----------+----------+
| foobar12 | 800 |
+-----------+----------+
There also seems to be another way to do it here without Dremel.
Your question was about what RDBMS supports the * EXCEPT (...) syntax, so perhaps, looking at the jOOQ manual page for * EXCEPT can be useful in the future, as that page will keep track of new dialects supporting the syntax.
Currently (mid 2022), among the jOOQ supported RDBMS, at least BigQuery, H2, and Snowflake support the syntax natively. The others need to emulate it by listing the columns explicitly:
-- ACCESS, ASE, AURORA_MYSQL, AURORA_POSTGRES, COCKROACHDB, DB2, DERBY, EXASOL,
-- FIREBIRD, HANA, HSQLDB, INFORMIX, MARIADB, MEMSQL, MYSQL, ORACLE, POSTGRES,
-- REDSHIFT, SQLDATAWAREHOUSE, SQLITE, SQLSERVER, SYBASE, TERADATA, VERTICA,
-- YUGABYTEDB
SELECT LANGUAGE.CD, LANGUAGE.DESCRIPTION
FROM LANGUAGE
-- BIGQUERY, H2
SELECT * EXCEPT (ID)
FROM LANGUAGE
-- SNOWFLAKE
SELECT * EXCLUDE (ID)
FROM LANGUAGE
Disclaimer: I work for the company behind jOOQ
As others are saying: SELECT * is a bad idea.
Some reasons:
Get only what you need (anything more is a waste)
Indexing (index what you need and you can get it more quickly. If you ask for a bunch of non-indexed columns, too, your query plans will suffer.)