I have an INSERT statement that fails with the error "The multi-part identifier "t2.Col1" could not be bound." I have oversimplified the statement; it looks like this:
INSERT INTO dbo.T1
(
Col1,
Col2,
Col3
)
SELECT
t2.Col1,
SUBSTRING(aCase.CaseColumn, 0, CHARINDEX('%', aCase.CaseColumn)), --I expect this line gets the value "2"
SUBSTRING(aCase.CaseColumn, CHARINDEX('%', aCase.CaseColumn) + 1, LEN(aCase.CaseColumn) - CHARINDEX('%', aCase.CaseColumn)) --I expect this line gets the value "3"
FROM
dbo.T2 t2
LEFT JOIN
(
SELECT
CASE --I have hundreds of WHEN conditions below and need to access the parent T2 tables' properties
WHEN t2.Col1 = 1 THEN '2%3' --This line has a syntax error of "the multi-part identifier "t2.Col1" could not be bound."
END AS CaseColumn
)
AS aCase ON 1 = 1
The reason I use a LEFT JOIN with a CASE is that I have hundreds of conditions for which I need to select different values for different columns. I don't want to repeat the same CASE expression over and over again for every column. Therefore, I use a single CASE that concatenates the values with a delimiter, and then I parse that concatenated string and put each value in its place.
What you could do is use OUTER APPLY, as it allows your dbo.T2 rows and the aCase result set to be related, like this:
INSERT INTO dbo.T1
(
Col1,
Col2,
Col3
)
SELECT
t2.Col1,
SUBSTRING(aCase.CaseColumn, 0, CHARINDEX('%', aCase.CaseColumn)), --I expect this line gets the value "2"
SUBSTRING(aCase.CaseColumn, CHARINDEX('%', aCase.CaseColumn) + 1, LEN(aCase.CaseColumn) - CHARINDEX('%', aCase.CaseColumn)) --I expect this line gets the value "3"
FROM
dbo.T2 t2
OUTER APPLY
(
SELECT
CASE --I have hundreds of WHEN conditions below and need to access the parent T2 tables' properties
WHEN t2.Col1 = 1 THEN '2%3'
END AS CaseColumn
)
AS aCase --note: APPLY takes no ON clause
That is because the result of the subquery is not independent: it has to be computed from the values of the dbo.T2 row it is applied to.
Read more about OUTER APPLY and CROSS APPLY on this thread.
Number 3, "Reusing a table alias", is similar to your case, and the article linked there explains perfectly how to use CROSS APPLY/OUTER APPLY in these situations.
When you join to a subquery, the subquery doesn't know what t2 is, unless you select from a table aliased as t2 inside that subquery.
And you could change that LEFT JOIN to an OUTER APPLY.
But you don't really need to JOIN or OUTER APPLY in this case.
Just select from T2 with the CASE in the subquery.
INSERT INTO dbo.T1
(
Col1,
Col2,
Col3
)
SELECT
Col1,
SUBSTRING(CaseColumn, 1, CHARINDEX('%', CaseColumn)-1),
SUBSTRING(CaseColumn, CHARINDEX('%',CaseColumn)+1, LEN(CaseColumn))
FROM
(
SELECT
Col1,
CASE Col1
WHEN 1 THEN '2%3'
-- more when's
END AS CaseColumn
FROM dbo.T2 t2
) q
Note how the CASE and the SUBSTRINGs were changed slightly.
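The concatenate-then-parse pattern above can be sketched outside SQL Server too. Below is an illustration using Python's sqlite3 (SQLite's instr/substr standing in for CHARINDEX/SUBSTRING; table names and sample data are made up to match the simplified example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T2 (Col1 INTEGER)")
con.execute("INSERT INTO T2 VALUES (1)")

# Same idea as the answer: a single CASE builds the delimited string
# '2%3' once, and the outer query parses it apart by the '%' delimiter.
rows = con.execute("""
    SELECT Col1,
           substr(CaseColumn, 1, instr(CaseColumn, '%') - 1) AS Col2,
           substr(CaseColumn, instr(CaseColumn, '%') + 1)    AS Col3
    FROM (
        SELECT Col1,
               CASE Col1 WHEN 1 THEN '2%3' END AS CaseColumn
        FROM T2
    )
""").fetchall()
print(rows)  # [(1, '2', '3')]
```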
Btw, personally I would just insert the distinct Col1 values into T1, and update Col2 and Col3 in that reference table manually. That could prove faster than writing those hundreds of conditions. But then again, you did say this was simplified a lot.
I have a query like:
SELECT col1,col2 from table1 where col1 in (:var);
and :var has a value like "'1234', '5678'", which is a string containing single quotes and commas. I want to convert this string into something that can be passed to SQL's IN operator, like this:
SELECT col1, col2 from table1 where col1 in (STRING_SPLIT(:var));
This is the code I used as a solution to achieve the desired result in a SQL Server query.
DECLARE @var AS NVARCHAR(100) = '''1234'', ''5678''';
SELECT col1, col2 FROM table1 WHERE col1 IN (SELECT LTRIM(RTRIM(value)) FROM STRING_SPLIT(@var, ','))
You can't use STRING_SPLIT to expand a delimited literal string into multiple delimited literal strings. STRING_SPLIT('abc,def',',') doesn't result in 'abc','def', it results in a data set of 2 rows, containing the values 'abc' and 'def'.
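The point about splitting producing a set of values, not a quoted literal list, can be illustrated outside SQL Server. The sketch below uses Python and SQLite (which has no STRING_SPLIT, so the string is split client-side; the table and data are invented for illustration), but the principle is the same: the IN list must be built from the split values.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT)")
con.executemany("INSERT INTO table1 VALUES (?, ?)",
                [("1234", "a"), ("5678", "b"), ("9999", "c")])

# The delimited string is split into individual values; embedded quotes
# are stripped, because '1234' as data never matches the literal "'1234'".
var = "'1234', '5678'"
values = [v.strip().strip("'") for v in var.split(",")]
placeholders = ",".join("?" for _ in values)
rows = con.execute(
    f"SELECT col1, col2 FROM table1 WHERE col1 IN ({placeholders})",
    values).fetchall()
print(rows)  # [('1234', 'a'), ('5678', 'b')]
```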
If you want to pass a delimited string, you need to either JOIN/CROSS APPLY to STRING_SPLIT or use a subquery:
SELECT T1.Col1,
T1.Col2
FROM dbo.table1 T1
JOIN STRING_SPLIT(#YourVariable,',') SS ON T1.Col1 = SS.Value;
SELECT T1.Col1,
T1.Col2
FROM dbo.table1 T1
WHERE T1.Col1 IN (SELECT SS.Value
FROM STRING_SPLIT(#YourVariable,',') SS);
You may, however, find even better performance with an indexed temporary table, if you are dealing with large data sets:
CREATE TABLE #temp (Value varchar(30) PRIMARY KEY); --Use an appropriate data type. I assume unique values in the delimited string
INSERT INTO #temp (Value)
SELECT SS.Value
FROM STRING_SPLIT(#YourVariable,',') SS;
SELECT T1.Col1,
T1.Col2
FROM dbo.table1 T1
JOIN #temp T ON T1.Col1 = T.Value;
Finally, and possibly better again, you could use a table-valued parameter. Then, like the above, you would just JOIN to it or use an EXISTS.
Imagine two tables with the same structure that might have some rows differing for the same primary key.
What really happens when you use a WHERE clause like "where table1.* <> table2.*"?
I used it in PostgreSQL, but I'm also interested in how other databases behave with this odd construct.
This statement compares every column of the first table, taken together, to every column of the second table. It is the same as writing the composite type explicitly, which would be required if the columns are not in the same order in both tables.
(t1.id, t1.col1, t1.col2) <> (t2.id, t2.col1, t2.col2)
or even more verbose
t1.id <> t2.id
OR t1.col1 <> t2.col1
OR t1.col2 <> t2.col2
But you may want to use IS DISTINCT FROM instead of <> to consider null vs not null as being different/not equal.
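The NULL-handling difference mentioned above is easy to demonstrate. The sketch below uses SQLite via Python rather than PostgreSQL; SQLite's IS NOT operator plays the role of IS DISTINCT FROM here:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# <> yields NULL (unknown) when either side is NULL, so such rows get
# filtered out; IS NOT treats NULL vs non-NULL as plainly "different".
neq, = con.execute("SELECT NULL <> 1").fetchone()
isnot, = con.execute("SELECT NULL IS NOT 1").fetchone()
print(neq, isnot)  # None 1
```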
In postgres t1.* <> t2.* in this context is expanded to be:
(t1.c1, t1.c2, ..., t1.cn) <> (t2.c1, t2.c2, ..., t2.cn)
which is the same as:
(t1.c1 <> t2.c1) OR (t1.c2 <> t2.c2) OR ...
I think the expansion is a Postgres extension to the standard; tuple comparison itself exists in several other DBMSs. You can read about it at https://www.postgresql.org/docs/current/sql-expressions.html#SQL-SYNTAX-ROW-CONSTRUCTORS
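Row-value (tuple) comparison indeed works elsewhere; here is a sketch using SQLite via Python (SQLite supports row values since 3.15; tables and data are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t1 (id INTEGER, col1 TEXT, col2 TEXT)")
con.execute("CREATE TABLE t2 (id INTEGER, col1 TEXT, col2 TEXT)")
con.executemany("INSERT INTO t1 VALUES (?,?,?)", [(1, "a", "b"), (2, "c", "d")])
con.executemany("INSERT INTO t2 VALUES (?,?,?)", [(1, "a", "b"), (2, "c", "X")])

# Tuple comparison: only the row whose full tuple differs (id = 2) matches.
rows = con.execute("""
    SELECT t1.id
    FROM t1 JOIN t2 ON t1.id = t2.id
    WHERE (t1.id, t1.col1, t1.col2) <> (t2.id, t2.col1, t2.col2)
""").fetchall()
print(rows)  # [(2,)]
```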
The number of columns is required to be the same when comparing tuples, but I discovered something peculiar when trying your example:
create table ta (a1 int);
create table tb (b1 int, y int);
select * from ta cross join tb where ta.* <> tb.*
The last select succeeds, despite the tuples having different numbers of columns. Adding some rows changes that:
insert into ta values (1),(2);
insert into tb values (2,1),(3,4);
select * from ta cross join tb where ta.* <> tb.*
ERROR: cannot compare record types with different numbers of columns
so it appears this is not checked when the statement is prepared. Expanding the tuple manually yields an error even with empty tables:
select * from ta cross join tb where (ta.a1, ta.a1) <> (tb.b1, y, y);
ERROR: unequal number of entries in row expressions
I have the sample code below. Assuming I do not know whether a particular column exists in a table, how could I write the query so that the column value defaults to 'Not Available' if the column doesn't exist in the table?
Example:
select COL1, COL2,
CASE
WHEN OBJECT_ID('COL3') IS NULL THEN 'Not Available'
ELSE COL3
END AS COL3
from TABLE1
Thanks in advance.
This is quite tricky to do (without dynamic SQL), but there is a way, by playing with SQL's scoping rules. You can do this assuming the table has a unique or primary key:
select t1.col1, t1.col2,
(select col3 -- no alias!
from table1 tt1
where tt1.id = t1.id -- the primary/unique key
) col3
from table1 t1 cross join
(values ('Not Available')) v(col3) -- same name
The subquery will fetch col3 from the table1 inside the subquery if that column exists. Otherwise, name resolution reaches out to the outer scope and finds col3 from the VALUES() clause.
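The same scoping trick can be demonstrated in SQLite via Python. This is a sketch under the assumption that table1 has an id primary key and no col3 (a plain subquery replaces the VALUES() row constructor here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
con.execute("INSERT INTO table1 VALUES (1, 'x', 'y')")

# table1 has no col3, so the unqualified col3 inside the subquery falls
# through to the outer scope and binds to v.col3 instead.
rows = con.execute("""
    SELECT t1.col1, t1.col2,
           (SELECT col3
            FROM table1 tt1
            WHERE tt1.id = t1.id) AS col3
    FROM table1 t1
    CROSS JOIN (SELECT 'Not Available' AS col3) v
""").fetchall()
print(rows)  # [('x', 'y', 'Not Available')]
```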
How does SQLite treat an alias internally?
Does creating a table alias internally create a copy of the table, or does it just refer to the same table without making a copy?
When I create multiple aliases of the same table in my code, query performance suffers severely!
In my case, I have one table, call it MainTable, with two columns, name and value.
I want to select multiple values in one row as different columns. for example
Name: a,b,c,d,e,f
Value: p,q,r,s,t,u
such that a corresponds to p, and so on.
I want to select the values for names a, b, c and d in one row => p, q, r, s
So I write a query
SELECT t1.name, t2.name, t3.name, t4.name
FROM MainTable t1, MainTable t2, MainTable t3, MainTable t4
WHERE t1.name = 'a' and t2.name = 'b' and t3.name = 'c' and t4.name = 'd';
This way of writing the query kills performance as the table grows, as rightly pointed out by Larry above.
Is there any efficient way to retrieve this result? I am bad at SQL queries :(
If you list the same table more than once in your SQL statement and do not supply conditions on which to join the tables, you create a Cartesian join in your result set, and it will be enormous:
SELECT * FROM MyTable A, MyTable B;
If MyTable has 1,000 records, this will create a result set with one million records. Any other selection criteria you include will then have to be evaluated across all one million records.
I'm not sure that's what you're doing (your question is very unclear), but it may be a start on solving your problem.
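The blow-up is easy to see with a quick experiment. Here is a sketch using SQLite via Python, with a made-up single-column table of 1,000 rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MyTable (x INTEGER)")
con.executemany("INSERT INTO MyTable VALUES (?)", [(i,) for i in range(1000)])

# Listing the table twice with no join condition yields the Cartesian
# product: 1000 x 1000 = 1,000,000 rows to evaluate.
n, = con.execute("SELECT COUNT(*) FROM MyTable A, MyTable B").fetchone()
print(n)  # 1000000
```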
Updated answer, now that the poster has added the query being executed.
You're going to have to get a little tricky to get the results you want. You need to use CASE and MAX, and unfortunately the syntax for CASE is a little verbose:
SELECT MAX(CASE WHEN name='a' THEN value ELSE NULL END),
MAX(CASE WHEN name='b' THEN value ELSE NULL END),
MAX(CASE WHEN name='c' THEN value ELSE NULL END),
MAX(CASE WHEN name='d' THEN value ELSE NULL END)
FROM MainTable WHERE name IN ('a','b','c','d');
Please give that a try against your actual database and see what you get (of course, you want to make sure the name column is indexed).
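The MAX(CASE ...) pivot can be verified against SQLite directly; here is a sketch via Python with sample data matching the question's name/value pairs:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MainTable (name TEXT, value TEXT)")
con.executemany("INSERT INTO MainTable VALUES (?,?)",
                zip("abcdef", "pqrstu"))

# One pass over the table; each MAX(CASE ...) picks out one name's value.
row = con.execute("""
    SELECT MAX(CASE WHEN name='a' THEN value END),
           MAX(CASE WHEN name='b' THEN value END),
           MAX(CASE WHEN name='c' THEN value END),
           MAX(CASE WHEN name='d' THEN value END)
    FROM MainTable WHERE name IN ('a','b','c','d')
""").fetchone()
print(row)  # ('p', 'q', 'r', 's')
```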
Assuming you have a table dbo.Customers with a million rows,
SELECT * from dbo.Customers A
does not result in a copy of the table being created.
As Larry pointed out, the query as it stands does a Cartesian product across your table four times, which, as you have observed, kills your performance.
The updated question states that the desire is to have 4 values from different queries in a single row. That's fairly simple, assuming this syntax is valid for SQLite.
You can see that the following four queries, when run one after another, each produce one of the desired values, but in 4 separate result rows.
SELECT t1.value
FROM MainTable t1
WHERE t1.name='a';
SELECT t2.value
FROM MainTable t2
WHERE t2.name='b';
SELECT t3.value
FROM MainTable t3
WHERE t3.name='c';
SELECT t4.value
FROM MainTable t4
WHERE t4.name='d';
The trick is simply to run them as subqueries, so there are 5 queries in total: 1 driver query, with 4 subqueries doing all the work. This pattern only works if each subquery returns at most one row.
SELECT
(
SELECT t1.value
FROM MainTable t1
WHERE t1.name='a'
) AS t1_value
,
(
SELECT t2.value
FROM MainTable t2
WHERE t2.name='b'
) AS t2_value
,
(
SELECT t3.value
FROM MainTable t3
WHERE t3.name='c'
) AS t3_value
,
(
SELECT t4.value
FROM MainTable t4
WHERE t4.name='d'
) AS t4_value
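The scalar-subquery pattern is straightforward to check against SQLite itself; here is a sketch via Python, selecting the value column (which is what the question ultimately wants) with made-up sample rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE MainTable (name TEXT, value TEXT)")
con.executemany("INSERT INTO MainTable VALUES (?,?)",
                [("a", "p"), ("b", "q"), ("c", "r"), ("d", "s")])

# One driver SELECT with four scalar subqueries; each subquery must
# return at most one row for this pattern to work.
row = con.execute("""
    SELECT (SELECT value FROM MainTable WHERE name='a') AS a_value,
           (SELECT value FROM MainTable WHERE name='b') AS b_value,
           (SELECT value FROM MainTable WHERE name='c') AS c_value,
           (SELECT value FROM MainTable WHERE name='d') AS d_value
""").fetchone()
print(row)  # ('p', 'q', 'r', 's')
```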
Aliasing a table results in a reference to the original table that exists for the duration of the SQL statement; no copy is made.
I have a large table with a sample query like below to retrieve matched results.
SELECT col1, col2, col3
FROM
Table1 T1
OUTER APPLY (SELECT col2 FROM Table2 WHERE t2id = T1.id) A2
OUTER APPLY (SELECT col3 FROM Table3 WHERE t3id = T1.id) A3
WHERE col3 > 0
The problem is that it runs extremely slowly when the WHERE clause checks that column value.
I have tried different approaches, including CROSS APPLY, without any improvement in performance.
Any ideas?
Try moving the filter inside the applied subquery. This should leave fewer rows to compute, and therefore give quicker results. (Note that with OUTER APPLY, T1 rows without a match will still appear with a NULL col3; if you want those filtered out, as your original outer WHERE did, use CROSS APPLY instead.)
SELECT col1, col2, col3
FROM
Table1 T1
OUTER APPLY (SELECT col2 FROM Table2 WHERE t2id = T1.id) A2
OUTER APPLY (SELECT col3 FROM Table3 WHERE t3id = T1.id AND col3 > 0) A3