I did some searching on this site and couldn't find exactly what I'm looking for, so I hope this isn't a duplicate. I have an issue where a query in a view is taking about 39 seconds to run, which is dragging down a report query that joins to this view multiple times.
To keep this simple, I've simplified the code but kept the structure exactly as it is in the view. Here is the SELECT statement:
SELECT ....
FROM A a
JOIN B b on a.x = b.x
JOIN C c ON c.s = 'P' AND c.y = b.y
JOIN B AS b2 ON b2.y = c.y AND b2.x <> a.x
JOIN B b3 ON b3.x = b2.x
The x's and y's are the same column names in all join predicates.
The issue I am having comes with the line AND b2.x <> a.x. Without it, the query runs in about 1 second, but with it, it always takes over 30 seconds. I've tried rewriting this predicate multiple times:
b2.x IN (select b2.x FROM B b2 join A a on b2.x <> a.x)
b2.x NOT IN (select b2.x FROM b b2 JOIN A a on b2.x <> a.x)
NOT b2.x = a.x
Or even removing it and putting it in a WHERE clause after the joins, with all of the above variations and also:
WHERE b2.x NOT IN (SELECT x FROM a)
WHERE b2.x NOT IN (SELECT DISTINCT x FROM a)
I'm running out of ideas and need to figure out a way to optimize this. Any suggestions or hints at what else I can look at? Just running
SELECT b2.x from B b2 JOIN A a ON b2.x <> a.x
runs very quickly, so I don't think the underlying tables are the issue.
If the query runs really fast without the condition but poorly with it, then I might suggest a materialized CTE, so that the equality joins are evaluated first and the inequality is applied to their result:
WITH abc as (
SELECT /*+ materialize */...., b2.x as b2x, a.x as ax
FROM A a JOIN
B b
ON a.x = b.x JOIN
C c
ON c.s = 'P' AND c.y = b.y JOIN
B b2
ON b2.y = c.y JOIN
B b3
ON b3.x = b2.x
)
SELECT abc.*
FROM abc
WHERE b2x <> ax;
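As a sanity check on the rewrite, here is a minimal sketch (Python with the standard sqlite3 module, on made-up sample rows) showing that, for inner joins, moving the inequality out of the ON clause and into a filter over the CTE returns the same rows. The table contents are assumptions for illustration only, and SQLite ignores the materialize hint, so this checks equivalence of results, not performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the join structure follows the question.
conn.executescript("""
    CREATE TABLE A (x INTEGER);
    CREATE TABLE B (x INTEGER, y INTEGER);
    CREATE TABLE C (y INTEGER, s TEXT);
    INSERT INTO A VALUES (1), (2);
    INSERT INTO B VALUES (1, 10), (2, 10), (3, 10), (4, 20);
    INSERT INTO C VALUES (10, 'P'), (20, 'P');
""")

# Original shape: the inequality sits in the ON clause of the b2 join.
on_clause = conn.execute("""
    SELECT a.x, b2.x
    FROM A a
    JOIN B b  ON a.x = b.x
    JOIN C c  ON c.s = 'P' AND c.y = b.y
    JOIN B b2 ON b2.y = c.y AND b2.x <> a.x
    JOIN B b3 ON b3.x = b2.x
    ORDER BY a.x, b2.x
""").fetchall()

# Rewritten shape: equality joins inside the CTE, inequality applied afterwards.
filtered_later = conn.execute("""
    WITH abc AS (
        SELECT a.x AS ax, b2.x AS b2x
        FROM A a
        JOIN B b  ON a.x = b.x
        JOIN C c  ON c.s = 'P' AND c.y = b.y
        JOIN B b2 ON b2.y = c.y
        JOIN B b3 ON b3.x = b2.x
    )
    SELECT ax, b2x FROM abc WHERE b2x <> ax
    ORDER BY ax, b2x
""").fetchall()

print(on_clause)
print(on_clause == filtered_later)
```

Whether the two-step form is actually faster depends on the optimizer and the data, so compare execution plans on the real tables.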
I have two tables, A and B. A has columns X, Y and Z, and B has columns P, Q and R. In my case, a few rows have blank values in all of the columns.
I need to join these two tables such that if A.X<>'' and B.X<>'', the join uses that column. If A.X='' and B.X='', it should check the next columns, A.Y<>'' and B.Y<>''. If those are also blank, it should join on the next condition, A.Z<>'' and B.Z<>''. If all three conditions have blanks, that row should not be joined.
How can we achieve this using a SQL join?
Thanks in advance
You can go for conditional JOINS as given below:
SELECT *
FROM A
LEFT OUTER JOIN B as b1
ON A.X = b1.X AND B1.X <> '' -- JOIN only rows WHERE x is not blank
LEFT OUTER JOIN B as b2
ON A.Y = b2.Y AND b2.Y <> '' -- JOIN only rows WHERE y is not blank
LEFT OUTER JOIN B AS b3
ON A.Z = b3.Z AND b3.Z <> '' -- JOIN only rows WHERE z is not blank
WHERE
b1.X IS NOT NULL OR
b2.Y IS NOT NULL OR
b3.Z IS NOT NULL
Ramu's answer is close (I upvoted it) but it needs to be refined. The important part that it gets right is that equality conditions in the JOINs make the query easier to optimize.
However, it is better written as:
SELECT a.*,
COALESCE(b1.P, b2.P, b3.P) as P,
COALESCE(b1.Q, b2.Q, b3.Q) as Q,
COALESCE(b1.R, b2.R, b3.R) as R
FROM A LEFT JOIN
B b1
ON A.X = b1.X LEFT JOIN
B b2
ON A.Y = b2.Y AND
b1.X IS NULL LEFT JOIN -- no previous match
B b3
ON A.Z = b3.Z AND
b2.Y IS NULL AND
b1.X IS NULL -- no previous match
WHERE b1.X IS NOT NULL OR
b2.Y IS NOT NULL OR
b3.Z IS NOT NULL ;
The two key changes are:
The LEFT JOIN conditions check that previous columns did not match.
The SELECT uses COALESCE() to fetch columns.
Also, I don't think the condition on empty strings is needed. There will be no match if there are no empty string values in B for that column. If both tables have empty strings, then you apparently do want a match -- and an empty string matches an empty string in SQL Server.
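To see the fallback behaviour concretely, here is a small sketch using Python's sqlite3 module. The id column, the single payload column P on B, and all sample values are assumptions added for illustration; B deliberately contains no empty strings, which is the situation described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: B carries X, Y, Z plus one payload column P.
conn.executescript("""
    CREATE TABLE A (id INTEGER, X TEXT, Y TEXT, Z TEXT);
    CREATE TABLE B (X TEXT, Y TEXT, Z TEXT, P TEXT);
    INSERT INTO A VALUES
        (1, 'x1', 'y2', 'z3'),  -- matches on X; Y and Z must be ignored
        (2, '',   'y2', 'z3'),  -- blank X, falls back to Y
        (3, '',   '',   'z3'),  -- blank X and Y, falls back to Z
        (4, '',   '',   '');    -- all blank, must not be joined
    INSERT INTO B VALUES
        ('x1', 'y1', 'z1', 'p1'),
        ('x2', 'y2', 'z2', 'p2'),
        ('x3', 'y3', 'z3', 'p3');
""")

matched = conn.execute("""
    SELECT a.id, COALESCE(b1.P, b2.P, b3.P) AS P
    FROM A a
    LEFT JOIN B b1 ON a.X = b1.X
    LEFT JOIN B b2 ON a.Y = b2.Y AND b1.X IS NULL   -- only if X did not match
    LEFT JOIN B b3 ON a.Z = b3.Z AND b2.Y IS NULL
                                 AND b1.X IS NULL   -- only if X and Y did not match
    WHERE b1.X IS NOT NULL OR b2.Y IS NOT NULL OR b3.Z IS NOT NULL
    ORDER BY a.id
""").fetchall()

print(matched)  # [(1, 'p1'), (2, 'p2'), (3, 'p3')]
```

Row 1 is joined once, through X only, even though its Y and Z values also occur in B; that is what the "no previous match" conditions buy.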
You can also express this using APPLY:
SELECT a.*, b.*
FROM A CROSS APPLY
(SELECT TOP (1) WITH TIES B.*
FROM B
WHERE A.X = b.X OR
A.Y = b.Y OR
A.Z = b.Z
ORDER BY (CASE WHEN A.X = B.X THEN 1
WHEN A.Y = B.Y THEN 2
WHEN A.Z = B.Z THEN 3
END)
) B;
However, I would expect the previous version to have much better performance.
Use a CASE expression in the ON clause:
SELECT *
FROM A INNER JOIN B
ON 1 = CASE
WHEN A.X <> '' AND B.P <> '' AND A.X = B.P THEN 1
WHEN A.Y <> '' AND B.Q <> '' AND A.Y = B.Q THEN 1
WHEN A.Z <> '' AND B.R <> '' AND A.Z = B.R THEN 1
END
or:
SELECT *
FROM A INNER JOIN B
ON 1 = CASE
WHEN A.X <> '' AND A.X = B.P THEN 1
WHEN A.Y <> '' AND A.Y = B.Q THEN 1
WHEN A.Z <> '' AND A.Z = B.R THEN 1
END
There is no need for further joins.
The CASE expression makes sure that each condition will be checked in the order that you want it to be checked.
So if the 1st condition is satisfied then the rows of the 2 tables will be joined, if not then the 2nd condition will be checked and so on.
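A quick way to check this behaviour is to run the second form against a few rows that contain blanks. The sketch below uses Python's sqlite3 module; the id columns and sample values are assumptions for illustration, and the "naive" query is included only to show what happens when the blank checks are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data with blanks on both sides.
conn.executescript("""
    CREATE TABLE A (id INTEGER, X TEXT, Y TEXT, Z TEXT);
    CREATE TABLE B (id INTEGER, P TEXT, Q TEXT, R TEXT);
    INSERT INTO A VALUES (1, 'k1', '', ''), (2, '', 'k2', ''), (3, '', '', '');
    INSERT INTO B VALUES (1, 'k1', '', ''), (2, '', 'k2', ''), (3, '', '', '');
""")

# CASE in the ON clause: the first non-blank matching column decides the join.
case_join = conn.execute("""
    SELECT A.id, B.id
    FROM A INNER JOIN B
      ON 1 = CASE
               WHEN A.X <> '' AND A.X = B.P THEN 1
               WHEN A.Y <> '' AND A.Y = B.Q THEN 1
               WHEN A.Z <> '' AND A.Z = B.R THEN 1
             END
    ORDER BY A.id, B.id
""").fetchall()

# Without the blank checks, '' = '' counts as a match and rows pair up freely.
naive_join = conn.execute("""
    SELECT A.id, B.id
    FROM A INNER JOIN B
      ON A.X = B.P OR A.Y = B.Q OR A.Z = B.R
""").fetchall()

print(case_join)
print(len(naive_join))
```

When no branch matches, the CASE yields NULL, so `1 = NULL` is not true and the pair is dropped; the all-blank row in A therefore joins to nothing.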
I apologize in advance if this question is ambiguous. My SQL skills are very weak and I'm not sure if this question is too general to have a correct answer.
I'm working on a project, converting reports from Hyperion Interactive Reporting (IR) to OBIEE. I'm given a visual of the data model in IR, and I'm trying to write the equivalent SQL query.
The data model looks like this:
A --- = --- B --- = --- C
\-- +=+ --/ \-- +=+ --/
The = represents an inner join; +=+ represents a full outer join. Table B inner joins and full outer joins to tables A and C. So I have four joins that I'm trying to piece together:
A join B on A.x = B.x
A full outer join B on A.y = B.y
B join C on B.x = C.x
B full outer join C on B.y = C.y
Without specifying details of my data, is it possible to write a query that matches the behavior of the data model above? And if so, what is the correct/preferred way to do so?
Use UNION or UNION ALL, as your requirement dictates:
SELECT ...
FROM A
JOIN B ON A.x = B.x
JOIN C ON B.x = C.x
UNION
SELECT ...
FROM A
FULL OUTER JOIN B ON A.y = B.y
FULL OUTER JOIN C ON B.y = C.y
Given that in SQLite, Views:
are read-only
cannot be UPDATEd,
the following is the situation:
There are 4 tables A, B, C and D and a View has to be created with data from all the four tables conditionally. Here's the pseudo-construct:
CREATE VIEW E AS SELECT A.A1, A.A2, A.A3, A.A4, B.B1, C.C1, C.C2, D.D1, D.D2 FROM A, B, C, D
WHERE A.X = 'SOME STRING' AND
A.FK = C.PK AND
A.Y = B.Z AND
D.G = A.PK AND
D.H = 'SOME STRING'
The requirement is that, irrespective of no matches in D, the remaining matches should get populated, (with 0 values in the view E for the columns from D). Needless to say, the above construct works if there are matching D rows, but obviously returns an empty view if there are no D matches.
How can the CASE statement or SELECT sub-queries (or an altogether different construct, like an INSTEAD OF trigger) deliver this requirement?
Greatly appreciate if the database experts could publish the exact construct(s). Many, many thanks in advance!
First, use explicit joins:
SELECT A.A1, A.A2, A.A3, A.A4, B.B1, C.C1, C.C2, D.D1, D.D2
FROM A
JOIN B ON A.Y = B.Z
JOIN C ON A.FK = C.PK
JOIN D ON D.G = A.PK
WHERE A.X = 'SOME STRING'
AND D.H = 'SOME STRING';
Then you can use an outer join when you want to keep rows without a match:
FROM A
JOIN B ON A.Y = B.Z
JOIN C ON A.FK = C.PK
LEFT JOIN D ON D.G = A.PK AND D.H = 'SOME STRING'
WHERE A.X = 'SOME STRING';
(The D.H comparison must be part of the join condition because D.H is NULL for missing rows, and the comparison would fail in the WHERE clause.)
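For the requested 0 values in the D columns, one option is to wrap them in COALESCE inside the view. The following is a minimal sketch using Python's sqlite3 module; the reduced column list, the literals 'S' and 'T', and the sample rows are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down versions of the four tables.
conn.executescript("""
    CREATE TABLE A (PK INTEGER, A1 TEXT, X TEXT, Y TEXT, FK INTEGER);
    CREATE TABLE B (Z TEXT, B1 TEXT);
    CREATE TABLE C (PK INTEGER, C1 TEXT);
    CREATE TABLE D (G INTEGER, H TEXT, D1 INTEGER);
    INSERT INTO A VALUES (1, 'a-one', 'S', 'y1', 100), (2, 'a-two', 'S', 'y2', 200);
    INSERT INTO B VALUES ('y1', 'b1'), ('y2', 'b2');
    INSERT INTO C VALUES (100, 'c1'), (200, 'c2');
    -- A row 1 has a matching D row; A row 2 only has a D row with the wrong H.
    INSERT INTO D VALUES (1, 'T', 7), (2, 'other', 9);

    CREATE VIEW E AS
    SELECT A.A1, B.B1, C.C1,
           COALESCE(D.D1, 0) AS D1      -- 0 when no D row was joined
    FROM A
    JOIN B ON A.Y = B.Z
    JOIN C ON A.FK = C.PK
    LEFT JOIN D ON D.G = A.PK AND D.H = 'T'
    WHERE A.X = 'S';
""")

view_rows = conn.execute("SELECT * FROM E ORDER BY A1").fetchall()
print(view_rows)  # [('a-one', 'b1', 'c1', 7), ('a-two', 'b2', 'c2', 0)]
```

The second row survives with D1 = 0 because the D.H test is in the join condition; moved to the WHERE clause, it would remove that row.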
INSERT INTO TableA
SELECT
x,
y,
z
FROM TableB
WHERE x IN
(select DISTINCT x
FROM TableC
WHERE x NOT IN
(SELECT DISTINCT x from TableD)
)
This query takes forever and it doesn't complete.
When I run each SELECT query on its own it works fine, but when I run it all together it takes forever. Can you see the reason?
Try this query :
insert into TableA
select b.*
from TableB b --with(nolock)
left outer join TableC c --with(nolock)
on b.x = c.x
left outer join TableD d --with(nolock)
on c.x = d.x
where c.x is not null and d.x is null
If it also runs indefinitely, uncomment with(nolock) and try again. If that does not work, check the estimated execution plan.
First, you need to look at the execution plan for the query. It might tell you where the bottlenecks are, or whether there are missing indexes that would significantly speed up your query. I think this is likely, as your query is simple, so I don't see why it would take so long.
I think you could also restructure your query so that it uses joins instead of the NOT IN. It would help if I knew the data to see whether this produces the same results, but I think it should:
SELECT B.x,
B.y,
B.Z
FROM TableB B
INNER JOIN --where in
(
SELECT DISTINCT x
FROM TableC c
LEFT JOIN TableD d
ON c.x = d.x
WHERE d.x IS NULL -- c x not in d x
) sub
on B.x = sub.x
Unnecessary subqueries and DISTINCT often hurt performance. You can accomplish what you need using JOINs.
SELECT b.x, b.y, b.z
FROM TableB b
INNER JOIN TableC c ON c.x=b.x
LEFT JOIN TableD d ON d.x=b.x
WHERE d.x IS NULL
GROUP BY b.x, b.y, b.z -- only if you have duplicates and need unique records
The INNER JOIN on TableC fixes your 1st "IN", then the LEFT JOIN and d.x IS NULL fixes your "NOT IN" clause.
Lastly, make sure that you have indexes on the "x" column in each table.
CREATE INDEX IX_TableB_X ON TableB (X);
CREATE INDEX IX_TableC_X ON TableC (X);
CREATE INDEX IX_TableD_X ON TableD (X);
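Since the rewrites are only useful if they return the same rows, here is a small sketch (Python's sqlite3 module, with made-up rows) comparing the nested IN / NOT IN query with the join form. It also shows one case where they differ: if TableD.x can contain NULL, NOT IN matches nothing, while the LEFT JOIN ... IS NULL form is unaffected. The sample data is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; TableC has a duplicate x to exercise the GROUP BY.
conn.executescript("""
    CREATE TABLE TableB (x INTEGER, y TEXT, z TEXT);
    CREATE TABLE TableC (x INTEGER);
    CREATE TABLE TableD (x INTEGER);
    INSERT INTO TableB VALUES (1, 'a', 'b'), (2, 'c', 'd'), (3, 'e', 'f');
    INSERT INTO TableC VALUES (1), (1), (2), (4);
    INSERT INTO TableD VALUES (2);
""")

NESTED = """
    SELECT x, y, z FROM TableB
    WHERE x IN (SELECT DISTINCT x FROM TableC
                WHERE x NOT IN (SELECT DISTINCT x FROM TableD))
"""
JOINED = """
    SELECT b.x, b.y, b.z
    FROM TableB b
    INNER JOIN TableC c ON c.x = b.x
    LEFT JOIN TableD d ON d.x = b.x
    WHERE d.x IS NULL
    GROUP BY b.x, b.y, b.z
"""

nested_rows = conn.execute(NESTED).fetchall()
joined_rows = conn.execute(JOINED).fetchall()
print(nested_rows, joined_rows)

# With a NULL in TableD.x, "x NOT IN (...)" is never true, so the nested
# query returns nothing; the join form still returns the unmatched row.
conn.execute("INSERT INTO TableD VALUES (NULL)")
nested_with_null = conn.execute(NESTED).fetchall()
joined_with_null = conn.execute(JOINED).fetchall()
print(nested_with_null, joined_with_null)
```

If x is declared NOT NULL in TableD, the two forms agree and the choice comes down to the execution plan.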
I have a problem similar to this StackOverflow question, except that I need to exclude certain fields from the comparison but still include them in the result set.
I'm calling the problem a locally symmetric difference.
For example, tables A and B have columns X, Y and Z. I want to compare only Y and Z for differences, but I still want the result set to include X.
Sounds like this is basically what you want. Match rows between two tables on columns Y and Z, find the unmatched rows, and output the values of columns X, Y, and Z.
SELECT a.x, a.y, a.z, b.x, b.y, b.z
FROM a FULL OUTER JOIN b ON a.y = b.y AND a.z = b.z
WHERE a.y IS NULL OR b.y IS NULL
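On an engine without FULL OUTER JOIN, the same unmatched rows can be collected with two NOT EXISTS checks. This is a sketch using Python's sqlite3 module, not the query above; the side column and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: (1,'p','q') and (9,'p','q') agree on y and z, differ on x.
conn.executescript("""
    CREATE TABLE a (x INTEGER, y TEXT, z TEXT);
    CREATE TABLE b (x INTEGER, y TEXT, z TEXT);
    INSERT INTO a VALUES (1, 'p', 'q'), (2, 'r', 's');
    INSERT INTO b VALUES (9, 'p', 'q'), (8, 't', 'u');
""")

unmatched = conn.execute("""
    -- rows of a with no partner in b on (y, z); x is reported but not compared
    SELECT 'a' AS side, x, y, z FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.y = a.y AND b.z = a.z)
    UNION ALL
    -- rows of b with no partner in a on (y, z)
    SELECT 'b', x, y, z FROM b
    WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.y = b.y AND a.z = b.z)
    ORDER BY side, x
""").fetchall()

print(unmatched)  # [('a', 2, 'r', 's'), ('b', 8, 't', 'u')]
```

The pair that differs only in x is treated as matched and left out, which is the "compare only Y and Z" behaviour asked for.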
Old style SQL for a full join - A concatenated with B, excluding rows in B also in A (the middle):
-- all rows in A with or without matching B
select a.x, a.y, a.z
from a
left join b
on a.x = b.x
and a.y = b.y
union all
-- all rows in B with no match in A to "exclude the middle"
select b.x, b.y, null as z
from b
where not exists (select null
from a
where b.x = a.x
and b.y = a.y)
ANSI Style:
select coalesce(a.x, b.x) as x,
coalesce(a.y, b.y) as y,
a.z
from a
full outer join b
on a.x = b.x
and a.y = b.y
The COALESCE calls are there for safety; I've never actually had cause to write a full outer join in the real world.
If what you really want is to find out whether two tables are identical, here's how:
SELECT COUNT(*)
FROM ((SELECT list_of_columns
FROM one_of_the_tables
MINUS
SELECT list_of_columns
FROM the_other_table)
UNION ALL
(SELECT list_of_columns
FROM the_other_table
MINUS
SELECT list_of_columns
FROM one_of_the_tables))
If that returns a non-zero result, then there is a difference. It doesn't tell you which table it's in, but it's a start.
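Here is a hedged sketch of the same check in SQLite via Python's sqlite3 module, where the operator is spelled EXCEPT instead of MINUS. Set operators without explicit grouping are evaluated left to right, so each difference is wrapped in its own subquery; the ungrouped variant is included to show how a difference can be missed otherwise. Table names t1 and t2 and their rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: t1 has one row that t2 lacks.
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b TEXT);
    CREATE TABLE t2 (a INTEGER, b TEXT);
    INSERT INTO t1 VALUES (1, 'x'), (2, 'y');
    INSERT INTO t2 VALUES (1, 'x');
""")

# Each difference is computed separately, then the two are concatenated.
grouped = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT * FROM (SELECT a, b FROM t1 EXCEPT SELECT a, b FROM t2)
        UNION ALL
        SELECT * FROM (SELECT a, b FROM t2 EXCEPT SELECT a, b FROM t1)
    )
""").fetchone()[0]

# Without grouping this is ((t1 EXCEPT t2) UNION ALL t2) EXCEPT t1,
# which throws away every row that exists in t1.
ungrouped = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT a, b FROM t1 EXCEPT SELECT a, b FROM t2
        UNION ALL SELECT a, b FROM t2
        EXCEPT SELECT a, b FROM t1
    )
""").fetchone()[0]

print(grouped, ungrouped)  # 1 0
```

Note that EXCEPT and MINUS compare distinct rows, so two tables that differ only in how many times a row is duplicated still count as identical under this test.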