How concatenating the rows works [duplicate] - sql

Can someone please clarify how variables are executing here?
In query 1 how are all the rows concatenated with a comma? There is no while loop there.
In query 2 I assign empty strings directly to the query but it shows different results. Can someone explain this?
Table data:
select name from names
Output:
s
r
i
n
u
Query 1:
declare @var varchar(20)
set @var=''
select @var=@var+name+',' from names
select @var
Output:
s,r,i,n,u,
Query 2:
declare @var varchar(20)
select @var=''+name+',' from names
select @var
Output:
u,

As I mentioned in the comment, the logic in your query is a documented antipattern. Effectively, you are relying on the query being processed on a row-by-row basis, with the variable updated once for each row.
So you are hoping that the variable is first set to '' + 's,' (which is 's,'); then, for the second row, the prior row's value of the variable ('s,') is used and concatenated with the next ('r,'), resulting in 's,r,'. For the third row, the prior row's value ('s,r,') is again used and concatenated with the next ('i,'), resulting in 's,r,i,'. Repeat until you get to the end of the dataset.
Per the documentation, however, there is no guarantee that this will actually happen:
In this case, it is not guaranteed that @Var would be updated on a row by row basis. For example, @Var may be set to initial value of @Var for all rows. This is because the order and frequency in which the assignments are processed is nondeterminant. This applies to expressions containing variables string concatenation, as demonstrated below, but also to expressions with non-string variables or += style operators. Use aggregation functions instead for a set-based operation instead of a row-by-row operation.
So this means you could simply end up with a single delimited value like 'u,', or perhaps some missing values (maybe 'n,u,') due to when the rows and variable assignments were processed.
Instead, as the documentation also states, use string aggregation. On all (fully) supported versions of SQL Server, that would be STRING_AGG:
SELECT @var = STRING_AGG(name,',')
FROM dbo.Names;
If you are on an older version of SQL Server, then you would need to use the "old" FOR XML PATH (with STUFF) method, like shown in this question.
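As a rough sketch of that older approach (assuming the same dbo.Names table and name column as above; the variable is declared as varchar(max) here, which is my assumption, to avoid truncating longer lists), the FOR XML PATH subquery builds a string with a leading comma and STUFF removes that comma:

```sql
-- Sketch for versions without STRING_AGG (before SQL Server 2017).
-- dbo.Names and its name column are taken from the question.
DECLARE @var varchar(max);

SELECT @var = STUFF(
    (SELECT ',' + name
     FROM dbo.Names
     FOR XML PATH(''), TYPE          -- concatenates every row into one XML value
    ).value('.', 'varchar(max)'),    -- converts back to text, un-escaping &, < and >
    1, 1, '');                       -- removes the leading comma

SELECT @var;
```

Without an ORDER BY inside the subquery, the order of the items in the result is not guaranteed.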

Related

How to run SQL like EVAL in BigQuery

In BigQuery, I have a query and its result is like:
myQueryValue
select * from 'some path'
I'd like to use it directly in new query.
SELECT someValue
FROM
(
select * from 'some path' <- How can I replace this to myQueryValue?
)
How can I use the result value of some queries like EVAL?
----------------EDITED AT 14th Oct.----------------
Thanks for all the answers, but I need to explain more about what I want.
If I have a 'queryTable' like
col
'select * from tableA'
The result of 'select * from tableA' is
foo
bar
When I only know about 'queryTable', how can I get this result?
foo
bar
I'd like to refer to 'queryTable' and get the final result of the query stored in it.
You can use a subquery, which is a query inside the FROM clause.
Here is an example code:
SELECT * FROM (SELECT ID FROM CUSTOMERS WHERE SALARY > 4500)
A Subquery or Inner query or a Nested query is a query within another
SQL query and embedded within the WHERE clause.
A subquery is used to return data that will be used in the main query
as a condition to further restrict the data to be retrieved.
Subqueries can be used with the SELECT, INSERT, UPDATE, and DELETE
statements along with the operators like =, <, >, >=, <=, IN, BETWEEN,
etc.
There are a few rules that subqueries must follow −
Subqueries must be enclosed within parentheses.
A subquery can have only one column in the SELECT clause, unless
multiple columns are in the main query for the subquery to compare
its selected columns.
An ORDER BY command cannot be used in a subquery, although the main
query can use an ORDER BY. The GROUP BY command can be used to
perform the same function as the ORDER BY in a subquery.
Subqueries that return more than one row can only be used with
multiple value operators such as the IN operator.
The SELECT list cannot include any references to values that evaluate
to a BLOB, ARRAY, CLOB, or NCLOB.
A subquery cannot be immediately enclosed in a set function.
The BETWEEN operator cannot be used with a subquery. However, the
BETWEEN operator can be used within the subquery.
click here for more information about sub queries.
Below is an example of how easily this can be achieved:
DECLARE myQueryValue STRING;
SET myQueryValue = "select * from your_table";
EXECUTE IMMEDIATE '''
SELECT someValue
FROM ( ''' || myQueryValue || ''' )''';
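For the edited version of the question, where the query text is stored in a table, a minimal sketch could read the text into a variable first and then execute it. The dataset name mydataset is hypothetical, and this assumes queryTable holds a single row whose col value is a complete, trusted query:

```sql
-- Hypothetical names: mydataset.queryTable with a single STRING column col.
DECLARE myQueryValue STRING;

-- Read the stored query text; the scalar subquery must return at most one row.
SET myQueryValue = (SELECT col FROM mydataset.queryTable LIMIT 1);

-- Run the stored text as a statement; its result set is returned by the script.
EXECUTE IMMEDIATE myQueryValue;
```

Because the text is executed as-is, this is only reasonable when the contents of the table are trusted.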
As you didn't provide additional information, I'm going to elaborate on my comment.
In the comment I proposed that you can use DECLARE with SET. The differences between the two are explained well in this Stack Overflow thread.
DECLARE does not initialize the variable. When you declare it you declare the variable name, the type, and a default value, which could be an expression.
SET is for initializing the variable you declared previously, and you cannot SET the variable until you DECLARE it.
One example has been provided in @Mikhail Berlyant's answer in this thread.
However, more detailed information and more examples can be found in the GCP SET reference.
Sets a variable to have the value of the provided expression, or sets multiple variables at the same time based on the result of multiple expressions.
The SET statement may appear anywhere within the body of a script.
This is the easiest way to achieve this.
Another common way to do this is to use a subquery (nested query), which is also well described in the GCP BigQuery reference.
In the GCP docs you can also find an example that uses SET, DECLARE and a subquery:
DECLARE target_word STRING DEFAULT 'methinks';
DECLARE corpus_count, word_count INT64;
SET (corpus_count, word_count) = (
SELECT AS STRUCT COUNT(DISTINCT corpus), SUM(word_count)
FROM `bigquery-public-data`.samples.shakespeare
WHERE LOWER(word) = target_word
);
SELECT
FORMAT('Found %d occurrences of "%s" across %d Shakespeare works',
word_count, target_word, corpus_count) AS result;
Output:
Found 151 occurrences of "methinks" across 38 Shakespeare works

SQL: How to apply a function (stored procedure) within an UPDATE-clause to change values?

The following function deletes all blanks in a text or varchar column and returns the modified text/varchar as an int:
select condense_and_change_to_int(number_as_text_column) from mytable;
This exact query does work.
My goal, though, is to apply this function to all rows of a column in order to consistently change its values. How would I do this? Is it possible with the UPDATE clause, or do I need to do this within a function itself? I tried the following:
UPDATE mytable
SET column_to_be_modiefied = condense_and_change_to_int(column_to_be_modiefied);
Basically I wanted to input the value of the current row, modify it and save it to the column permanently.
I'd welcome all ideas regarding how to solve scenarios like these. I'm working with postgresql (but welcome also more general solutions).
Is it possible with an update? Well, yes and sort-of.
From your description, the input to the function is a string of some sort. The output is a number. In general, numbers should be assigned to columns with a number type. The assumption is that the column in question is a number.
However, your update should work. The result will be a string representation of the number.
After running the update, you can change the column type, with something like:
alter table mytable alter column column_to_be_modiefied type int using column_to_be_modiefied::int;
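In PostgreSQL specifically, the update and the type change can probably be combined, since ALTER COLUMN ... TYPE accepts a USING expression. A minimal sketch, assuming the column is currently text or varchar and that condense_and_change_to_int succeeds for every existing row:

```sql
-- Converts the column to integer in one statement, computing each new value
-- with the question's function instead of running a separate UPDATE first.
ALTER TABLE mytable
    ALTER COLUMN column_to_be_modiefied TYPE integer
    USING condense_and_change_to_int(column_to_be_modiefied);
```

If any row cannot be converted, the statement fails and the table is left unchanged.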

Sql Server 2008 r2 Using a WHILE loop inside a function

I read an answer that said you don't want to use WHILE loops in SQL Server. I don't understand that generalization. I'm fairly new to SQL so I might not understand the explanation yet. I also read that you don't really want to use cursors unless you must. The search results I've found are too specific to the problem presented and I couldn't glean useful technique from them, so I present this to you.
What I'm trying to do is take the values in a client file and shorten them where necessary. There are a couple of things that need to be achieved here. I can't simply hack the field values provided. My company has standard abbreviations that are to be used. I have put these in a table, Abbreviations. The table has the LongName and the ShortName. I don't want to simply abbreviate every LongName in the row. I only want to apply the update as long as the field length is too long. This is why I need the WHILE loop.
My thought process was thus:
CREATE FUNCTION [dbo].[ScrubAbbrev]
(@Field nvarchar(25),@Abbrev nvarchar(255))
RETURNS varchar(255)
AS
BEGIN
DECLARE @max int = (select MAX(stepid) from Abbreviations)
DECLARE @StepID int = (select min(stepid) from Abbreviations)
DECLARE @find varchar(150)=(select Longname from Abbreviations where Stepid=@stepid)
DECLARE @replace varchar(150)=(select ShortName from Abbreviations where Stepid=@stepid)
DECLARE @size int = (select max_input_length from FieldDefinitions where FieldName = 'title')
DECLARE @isDone int = (select COUNT(*) from SizeTest where LEN(Title)>(@size))
WHILE @StepID<=@max or @isDone = 0 and LEN(@Abbrev)>(@size) and @Abbrev is not null
BEGIN
RETURN
REPLACE(@Abbrev,@find,@replace)
SET @StepID=@StepID+1
SET @find =(select Longname from Abbreviations where Stepid=@stepid)
SET @replace =(select ShortName from Abbreviations where Stepid=@stepid)
SET @isDone = (select COUNT(*) from SizeTest where LEN(Title)>(@size))
END
END
Obviously the RETURN should go at the end, but I need to reset my variables to the next @StepID, @find, and @replace.
Is this one of those times where I'd have to use a cursor (which I've never yet written)?
Generally, you don't want to use cursors or while loops in SQL because they only process a single row at a time, and thus perform very poorly. SQL is designed and optimized to process (potentially very large) sets of data, not individual values.
You could factor out the while loop by doing something like this:
UPDATE t
SET t.targetColumn = a.ShortName
FROM targetTable t
INNER JOIN Abbreviations a
ON t.targetColumn = a.LongName
WHERE LEN(t.targetColumn) > @maxLength
This is generalized and you will need to tweak it to fit your specific data model, but here's what's going on:
For every row in "targetTable", set the value of "targetColumn" (what you want to abbreviate) to the relevant abbreviation (found in Abbreviations.ShortName) iff: the current value has a standardized abbreviation (the inner join) and the current value is longer than desired (the where condition).
You'll need to add an integer parameter or local variable, @maxLength, to indicate what constitutes "too long". This query processes the target table all at once, updating the value in the target column for every eligible row, while a function will only find the abbreviation for a single item (the intersection of one row and one column) at a time.
Note that this won't do anything if the value is too long but doesn't have a standard abbreviation. Your existing code has this same limitation, so I assume this is desired behavior.
I also recommend making this a stored procedure rather than a function. Functions on SQL Server are treated as black boxes and can seriously harm performance, because the optimizer generally doesn't have a good idea of what they're doing.
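As a sketch of what such a stored procedure might look like, wrapping the UPDATE above: the procedure name is hypothetical, and targetTable and targetColumn are still placeholders for your own table and column.

```sql
-- Hypothetical procedure name; targetTable and targetColumn are the
-- placeholders used in the UPDATE above.
CREATE PROCEDURE dbo.AbbreviateLongValues
    @maxLength int   -- values longer than this are replaced
AS
BEGIN
    SET NOCOUNT ON;

    -- Set-based: every eligible row is updated in a single statement.
    UPDATE t
    SET t.targetColumn = a.ShortName
    FROM targetTable AS t
    INNER JOIN Abbreviations AS a
        ON t.targetColumn = a.LongName
    WHERE LEN(t.targetColumn) > @maxLength;
END;
```

It could then be called with, for example, EXEC dbo.AbbreviateLongValues @maxLength = 25; where 25 is only an illustrative limit.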

Using Coalesce in Where Statement

I have a query here and I'm curious if there is a shorter way of writing this query, meaning to reduce the query from using an IF argument to determine if it should use the param or not in the statement. Please see below:
@Param varchar(10) = NULL
IF @Param IS NOT NULL
BEGIN
SELECT * FROM TABLE WHERE Column = @Param
END
ELSE
BEGIN
SELECT * FROM TABLE
END
Could this be reduced to one simple query instead like this?
@Param varchar(10) = NULL
SELECT * FROM TABLE WHERE Column = COALESCE( Any, @Param )
I looked at Coalesce, but didn't see if I could use an Any sort of feature. I hope this makes sense.
The question is how to achieve this. The second question is which would perform better.
There is no ANY-style command, but if you use @Param first and then the value in the column itself, that should work...
SELECT * FROM TABLE WHERE Column = COALESCE(@Param, Column)
If @Param is NULL then it is ignored and the next item in the list is used.
You can also use ISNULL to do the same thing...
SELECT * FROM TABLE WHERE Column = ISNULL(@Param, Column)
To answer the 2nd part of the question, my feeling is that using a separate statement will always be more efficient. The condensed version might be less code, but isn't necessarily easier to understand or quicker to run.
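One caveat with the COALESCE and ISNULL forms: when @Param is NULL the predicate becomes Column = Column, which is not true for rows where Column itself is NULL, so those rows are dropped. A sketch of an alternative that keeps them, using the hypothetical names dbo.MyTable and MyColumn in place of TABLE and Column:

```sql
DECLARE @Param varchar(10) = NULL;

-- When @Param is NULL the first condition is true for every row,
-- including rows where MyColumn is NULL.
SELECT *
FROM dbo.MyTable
WHERE (@Param IS NULL OR MyColumn = @Param)
OPTION (RECOMPILE);  -- asks for a plan compiled for the current @Param value
```

OPTION (RECOMPILE) is optional here; it trades compile time on each run for a plan suited to the actual parameter value, so whether it helps is something to measure.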

Dynamic Query in SQL Server

I have a table with 10 columns named col_1, col_2, ..., col_10. I want to write a select statement that selects the value of one of these 10 columns from one of the rows. I have a variable that decides which column to select from. Can such a query be written, where the column name is decided dynamically from a variable?
Yes, using a CASE statement:
SELECT CASE @MyVariable
WHEN 1 THEN [Col_1]
WHEN 2 THEN [Col_2]
...
WHEN 10 THEN [Col_10]
END
Whether this is a good idea is another question entirely. You should use better names than Col_1, Col_2, etc.
You could also use a string substitution method, as suggested by others. However, that is an option of last resort because it can open up your code to sql injection attacks.
Sounds like a bad, denormalized design to me.
I think a better one would have the table as parent, with rows that contain a foreign key to a separate child table that contains ten rows, one for each of those columns you have now. Let the parent table set the foreign key according to that magic value when the row is inserted or updated in the parent table.
If the child table is fairly static, this will work.
Since I don't have enough details, I can't give code. Instead, I'll explain.
Declare a string variable, something like:
declare @sql varchar(5000)
Set that variable to be the completed SQL string you want (as a string, and not actually querying... so you embed the column name you want using string concatenation).
Then call: exec(@sql)
All set.
I assume you are running purely within Transact-SQL. What you'll need to do is dynamically create the SQL statement with your variable as the column name and use the EXECUTE command to run it. For example:
EXECUTE('select ' + @myColumn + ' from MyTable')
You can do it with a T-SQL CASE statement:
SELECT 'The result' =
CASE
WHEN choice = 1 THEN col1
WHEN choice = 2 THEN col2
...
END
FROM sometable
IMHO, Joel Coehoorn's case statement is probably the best idea
... but if you really have to use dynamic SQL, you can do it with sp_executeSQL()
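A minimal sketch of the sp_executesql route, assuming a table dbo.MyTable with an Id key column (both names are hypothetical). The column name is checked against sys.columns and wrapped with QUOTENAME, since identifiers cannot be passed as parameters, while the row id is passed as a real parameter:

```sql
-- Hypothetical names: dbo.MyTable with an Id key column.
DECLARE @myColumn sysname = N'col_3';
DECLARE @rowId int = 1;
DECLARE @sql nvarchar(max);

-- Only build the statement if the requested column really exists on the table.
IF EXISTS (SELECT 1
           FROM sys.columns
           WHERE object_id = OBJECT_ID(N'dbo.MyTable')
             AND name = @myColumn)
BEGIN
    -- QUOTENAME brackets the identifier; the row id stays a real parameter.
    SET @sql = N'SELECT ' + QUOTENAME(@myColumn)
             + N' FROM dbo.MyTable WHERE Id = @id;';

    EXEC sys.sp_executesql @sql, N'@id int', @id = @rowId;
END;
```

If the column name does not match an existing column, nothing is executed, which limits the injection risk mentioned above.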
I have no idea what platform you are using but you can use Dynamic LINQ pretty easily to do this.
var query = context.Table
.Where( t => t.Id == row_id )
.Select( "Col_" + column_id );
IEnumerator enumerator = query.GetEnumerator();
enumerator.MoveNext();
object columnValue = enumerator.Current;
Presumably, you'll know which actual type to cast this to depending on the column. The nice thing about this is you get the parameterized query for free, protecting you against SQL injection attacks.
This isn't something you should ever need to do if your database is correctly designed. I'd revisit the design of that element of the schema to remove the need to do this.