SQL with subselect referring to the same table as the initial load is slow

I load multiple tables together with DISTINCT:
SELECT DISTINCT Table1.Val1, Table2.Val2, Table1.Val3, ...
FROM Table1, Table2, Table3
WHERE Table2.KEYVAL = Table1.KEYVAL
AND Table3.KEYVAL = Table2.KEYVAL
plus some further WHERE conditions on Table1, Table2, and Table3.
I also have some subselects against a 4th table in my main SELECT statement, which work fast and fine.
Now I want to calculate the SUM of a value in Table3 (let's call it weight).
The Table3 result has multiple rows (sub-IDs) that share the same key.
How can I accomplish that?
If I put a subselect in my main SELECT:
(SELECT Sum(weight)
FROM Table3
WHERE Table3.KEYVAL = Table1.KEYVAL
) as Wheight
it gets painfully slow.
How can I (or how does SQL) distinguish between the main query's Table3.weight and the subselect's Table3.weight if I want to use a LEFT JOIN for that weight?
SUM(weight) in the main SELECT does not work directly, I think because of it not being DISTINCT?

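I think you have to use GROUP BY in a subquery and join to its result. Giving each table reference its own alias (a and b here) keeps the two apart, and the aggregated column needs a name of its own.
For instance:
Select a.*, b.*
from mytable as a
join (select keyval, sum(weight) as total_weight from my_second_table group by keyval) as b
on a.keyval = b.keyval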
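A minimal sketch of that idea, run against an in-memory SQLite database through Python's sqlite3 module. The sample rows, the SubID column, and the aliases t1, t3, w, and total_weight are assumptions made for illustration; only the table names, KEYVAL, Val1, and weight come from the question. It shows Table3 being used twice in one statement, once in the main join and once pre-aggregated, with aliases keeping the two references apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (KEYVAL INTEGER, Val1 TEXT);
    -- SubID is a hypothetical column standing in for the "sub-IDs"
    CREATE TABLE Table3 (KEYVAL INTEGER, SubID INTEGER, weight REAL);
    INSERT INTO Table1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO Table3 VALUES (1, 1, 2.5), (1, 2, 1.5), (2, 1, 4.0);
""")

# Table3 appears twice: as t3 in the main join, and aggregated once
# per key inside the derived table w. The aliases keep them distinct.
rows = conn.execute("""
    SELECT DISTINCT t1.KEYVAL, t1.Val1, w.total_weight
    FROM Table1 AS t1
    JOIN Table3 AS t3 ON t3.KEYVAL = t1.KEYVAL
    LEFT JOIN (
        SELECT KEYVAL, SUM(weight) AS total_weight
        FROM Table3
        GROUP BY KEYVAL
    ) AS w ON w.KEYVAL = t1.KEYVAL
    ORDER BY t1.KEYVAL
""").fetchall()

print(rows)
```

Because the sum is computed once per key in the derived table rather than once per output row, this form is often cheaper than a correlated subselect, though the actual plan depends on the database and its indexes.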

Related

Use a subselect within a "locate" function to test for multiple values? Alternatives?

I am looking for a way to run a "locate" function on multiple values based on a subselect; this is the pseudocode I'm envisioning (which does not run, because the subselect returns more than one value; which is what I want).
select * from table
where locate((select distinct field1 from subquery), field2) > 0
This is an unknown number of values, so I cannot use "or" statements for multiple values.
The only way I can think to do it is to do a join on the table to the subselect, but I am worried about efficiency with this method.
with cte_subselect as (select distinct field1 from subquery)
select * from table inner join cte_subselect on 1=1
where locate(field1, field2) > 0
Is the inner join method my only option?
You want to combine all results from the subquery with the rows of the main table and search for matches. In short, you want to filter over the cross product of a table with a subquery.
The typical solution is to do what you are doing already. Namely:
select t.*
from t
join (select distinct field1 from subquery) x
on locate(x.field1, t.field2) > 0
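Now, if performance is important and the match only needs to occur at the start of field2, you can speed it up by adding an index:
create index ix1 on t (field2);
Then the query can be rephrased as a prefix search, which may be able to use the index (note that locate matches anywhere in the string, so this is not equivalent in general):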
select t.*
from (select distinct field1 from subquery) s
join t on t.field2 like s.field1 || '%'
Try this instead of inner join.
with cte_subselect as (select distinct field1 from subquery)
select * from table
where (select max(locate(field1, field2) ) from cte_subselect) > 0
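One difference between the variants above is worth seeing with data: a join returns a main-table row once per matching value, while a filter in the WHERE clause returns it at most once. The sketch below is an illustration only. It uses SQLite through Python's sqlite3 module, swaps in SQLite's instr(string, substring) for locate(substring, string) (note the reversed argument order), and uses EXISTS as an alternative to the max(...) form. The table t, its id column, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, field2 TEXT);
    CREATE TABLE subquery (field1 TEXT);
    INSERT INTO t VALUES (1, 'red apple'), (2, 'green pear'), (3, 'plum');
    INSERT INTO subquery VALUES ('apple'), ('red'), ('pear');
""")

# Join form: row 1 matches both 'apple' and 'red', so it comes back twice.
joined = conn.execute("""
    SELECT t.id
    FROM t
    JOIN (SELECT DISTINCT field1 FROM subquery) AS x
      ON instr(t.field2, x.field1) > 0
    ORDER BY t.id
""").fetchall()

# EXISTS form: each row of t is returned at most once.
filtered = conn.execute("""
    SELECT t.id
    FROM t
    WHERE EXISTS (
        SELECT 1 FROM subquery AS s
        WHERE instr(t.field2, s.field1) > 0
    )
    ORDER BY t.id
""").fetchall()

print(joined)
print(filtered)
```

If the join form is kept, a DISTINCT on the outer query is one way to remove the repeated rows.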

Adding sum values from two different tables

How can I achieve this? T2 is linked with another table which contains order details like customer name, country and classification. They have an inner join.
T1 is linked to T2 only via order code and order item.
Assuming that both tables report the same set of order numbers, we can try joining two subqueries each of which finds the sums in the respective tables:
SELECT
t1.ORDER_NUM,
t1.ORDER_ITEM,
t1.PRODUCED + t2.PRODUCED AS PRODUCED
FROM
(
SELECT ORDER_NUM, ORDER_ITEM, SUM(PRODUCED) AS PRODUCED
FROM table1
GROUP BY ORDER_NUM, ORDER_ITEM
) t1
INNER JOIN
(
SELECT ORDER_NUM, ORDER_ITEM, SUM(NET_IN - NET_OUT) AS PRODUCED
FROM table2
GROUP BY ORDER_NUM, ORDER_ITEM
) t2
ON t1.ORDER_NUM = t2.ORDER_NUM AND
t1.ORDER_ITEM = t2.ORDER_ITEM
ORDER BY
t1.ORDER_NUM,
t1.ORDER_ITEM;
Note that the above is not necessarily an ideal approach, because a given order/item combination in one table might not appear in the other table. A better approach would be to start the query with a reference table containing all orders and items. That failing, we could convert the above to a full outer join.
I think a simple approach is union all:
select ordernum, orderitem, sum(produced) as produced
from ((select ordernum, orderitem, produced
from table1
) union all
(select ordernum, orderitem, netout
from table2
)
) t12
group by ordernum, orderitem;
This has two advantages over pre-aggregating and using joins:
It keeps all order/item pairs, even those that appear in only one table.
If you add a where clause to the outer query, SQL Server is likely to "project" that into the subqueries.
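The first advantage can be checked with a small experiment. This is a sketch under assumptions: it runs on SQLite via Python's sqlite3 module rather than SQL Server, the sample rows are invented, and the column names follow the union all answer (ordernum, orderitem, produced, netout). SQLite does not accept parentheses around the members of a compound select, so they are omitted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ordernum INTEGER, orderitem INTEGER, produced INTEGER);
    CREATE TABLE table2 (ordernum INTEGER, orderitem INTEGER, netout INTEGER);
    INSERT INTO table1 VALUES (100, 1, 10), (100, 1, 5), (200, 1, 7);
    INSERT INTO table2 VALUES (100, 1, 3), (300, 1, 4);
""")

# Stack both tables, then aggregate once: every order/item pair survives.
union_rows = conn.execute("""
    SELECT ordernum, orderitem, SUM(produced) AS produced
    FROM (
        SELECT ordernum, orderitem, produced FROM table1
        UNION ALL
        SELECT ordernum, orderitem, netout FROM table2
    ) AS t12
    GROUP BY ordernum, orderitem
    ORDER BY ordernum, orderitem
""").fetchall()

# Pre-aggregate and inner join: pairs present in only one table are lost.
join_rows = conn.execute("""
    SELECT a.ordernum, a.orderitem, a.produced + b.produced AS produced
    FROM (SELECT ordernum, orderitem, SUM(produced) AS produced
          FROM table1 GROUP BY ordernum, orderitem) AS a
    JOIN (SELECT ordernum, orderitem, SUM(netout) AS produced
          FROM table2 GROUP BY ordernum, orderitem) AS b
      ON a.ordernum = b.ordernum AND a.orderitem = b.orderitem
    ORDER BY a.ordernum, a.orderitem
""").fetchall()

print(union_rows)
print(join_rows)
```

Orders 200 and 300 each appear in only one table, so they show up in the union all result but not in the inner join result.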
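You can also try the query below. The subquery on t2 has to be correlated with the order number and item; otherwise it sums the whole table for every group:
select t1.order_num, t1.order_item,
sum(t1.produced) + coalesce((select sum(t2.net_in) - sum(t2.net_out) from t2 where t2.order_num = t1.order_num and t2.order_item = t1.order_item), 0) as PRODUCED
from t1
group by t1.order_num, t1.order_item
If you only need a sum from the other table, you can use a scalar subquery in the select list that sums the particular column.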

Redshift Query returning too many rows in aggregate join

I am sure I must be missing something obvious. I am trying to line up two tables with different measurement data for analysis, and my counts are coming back enormously high when I join the two tables together.
Here are the correct counts from my table1
select line_item_id,sum(is_imp) as imps
from table1
where line_item_id=5993252
group by 1;
Here are the correct counts from table2
select cs_line_item_id,sum(grossImpressions) as cs_imps
from table2
where cs_line_item_id=5993252
group by 1;
When I join the tables together, my counts become inaccurate:
select a.line_item_id,sum(a.is_imp) as imps,sum(c.grossImpressions) as cs_imps
from table1 a join table2 c
ON a.line_item_id=c.cs_line_item_id
where a.line_item_id=5993252
group by 1;
I'm using aggregates, group by, filtering, so I'm not sure where I'm going wrong. Here is the schema for these tables:
Aggregate each table in its own subquery first, then join the two results:
select a.*, b.imps as table2_imps from
(select line_item_id, sum(is_imp) as imps
from table1
group by 1) a
join
(select cs_line_item_id, sum(grossImpressions) as imps
from table2
group by 1) b
on a.line_item_id = b.cs_line_item_id
You are generating a Cartesian product for each line_item_id. There are two relatively simple ways to solve this, one with a full join, the other with union all:
select line_item_id, sum(imps) as imps, sum(grossImpressions) as cs_imps
from ((select a.line_item_id, sum(is_imp) as imps, 0 as grossImpressions
from table1 a
where a.line_item_id = 5993252
group by a.line_item_id
) union all
(select c.cs_line_item_id as line_item_id, 0 as imps, sum(grossImpressions) as grossImpressions
from table2 c
where c.cs_line_item_id = 5993252
group by c.cs_line_item_id
)
) ac
group by line_item_id;
You can remove the where clause from the subqueries to get the totals for all line_item_ids. Note that this works even when one or the other table has no matching rows for a given line_item_id.
For performance, you really want to do the filtering before the group by.
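To see why the joined counts blow up, here is a small reproduction, offered as a sketch: it runs on SQLite through Python's sqlite3 module rather than Redshift, and the row counts and values are invented. With 3 rows in table1 and 2 rows in table2 for the same id, the join produces 3 x 2 = 6 rows, so each sum is multiplied by the other table's row count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (line_item_id INTEGER, is_imp INTEGER);
    CREATE TABLE table2 (cs_line_item_id INTEGER, grossImpressions INTEGER);
    INSERT INTO table1 VALUES (5993252, 1), (5993252, 1), (5993252, 1);
    INSERT INTO table2 VALUES (5993252, 10), (5993252, 20);
""")

# Join first, aggregate second: every table1 row pairs with every table2 row.
inflated = conn.execute("""
    SELECT a.line_item_id, SUM(a.is_imp), SUM(c.grossImpressions)
    FROM table1 a
    JOIN table2 c ON a.line_item_id = c.cs_line_item_id
    WHERE a.line_item_id = 5993252
    GROUP BY a.line_item_id
""").fetchall()

# Aggregate each table to one row per id first, then join.
correct = conn.execute("""
    SELECT a.line_item_id, a.imps, c.cs_imps
    FROM (SELECT line_item_id, SUM(is_imp) AS imps
          FROM table1
          WHERE line_item_id = 5993252
          GROUP BY line_item_id) a
    JOIN (SELECT cs_line_item_id, SUM(grossImpressions) AS cs_imps
          FROM table2
          WHERE cs_line_item_id = 5993252
          GROUP BY cs_line_item_id) c
      ON a.line_item_id = c.cs_line_item_id
""").fetchall()

print(inflated)
print(correct)
```

The true totals are 3 and 30; the join-then-aggregate query reports 6 and 90.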

Use the result of a SQL query as an input for a WHERE clause

I wonder if (and how?) I can use the result set of a query as an input for a WHERE clause in another query.
For instance, I have a query that fetches a list of "code" and I would like to SELECT all tuples from another table that have - as "code" value - either one of the previously fetched set's elements.
I'm using JDBC so I wonder if I need to use some Java programming or If I can directly use SQL.
Select * from table2 -- this is another table from your question
where code in (select code from table1); -- this is where clause that gets code from first table.
In fact, provided that code is unique in table1, the query is equivalent to:
Select t2.* from table2 t2
inner join table1 t1 on (t1.code = t2.code);
which, in my opinion, is the better syntax.
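One caveat when swapping IN for a join: the two forms return the same rows only when the joined column has no duplicates on the table1 side. The sketch below shows the difference on SQLite through Python's sqlite3 module. The name column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (code TEXT);
    CREATE TABLE table2 (code TEXT, name TEXT);
    INSERT INTO table1 VALUES ('A'), ('A'), ('B');   -- 'A' appears twice
    INSERT INTO table2 VALUES ('A', 'x'), ('C', 'y');
""")

# IN is a membership test: each table2 row is returned at most once.
in_rows = conn.execute("""
    SELECT * FROM table2
    WHERE code IN (SELECT code FROM table1)
""").fetchall()

# A join returns one row per matching pair, so duplicates in table1
# duplicate the table2 row.
join_rows = conn.execute("""
    SELECT t2.* FROM table2 t2
    INNER JOIN table1 t1 ON t1.code = t2.code
""").fetchall()

print(in_rows)
print(join_rows)
```

From JDBC, either statement can be sent as a single query, so no extra Java-side looping is needed.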
Yes, of course you can.
Try this, for example:
SELECT f1 , f2 , f3 , etc
FROM table1
WHERE f1 IN (SELECT field1 FROM table2)
The IN keyword is fine for matching a single code value against another table, but to match on multiple columns, you should use an inner join:
SELECT employees.name
FROM employees
INNER JOIN departments
ON employees.department = departments.department
AND employees.position = departments.position
WHERE departments.jobtitle = 'Manager'
Another example using an IN clause, as per @Kacper's comment; note that not all databases support this syntax:
SELECT employees.name
FROM employees
WHERE (department,position) IN
(
SELECT department,position
FROM departments
WHERE jobtitle = 'Manager'
)

SQL Subquery w/LEFT JOIN causing Invalid Object Error

I have a query with the following structure:
EDIT Original structure of the query wasn't quite representative.
SELECT A
,B
,C
,D
FROM ( SELECT id,A
FROM myTable
WHERE conditions
GROUP BY id,A) MainQuery
LEFT JOIN (SELECT id, B, C
FROM myView
WHERE id IN
(
SELECT DISTINCT id
FROM MainQuery
)
) sub1
ON sub1.B = MainQuery.A
LEFT JOIN (SELECT MainQuery.id, D
FROM myOtherView
WHERE sub1.id IN
(
SELECT DISTINCT id
FROM MainQuery
)
) sub2
ON sub2.D = sub1.C
When I run the query, I get the error message Invalid object name 'MainQuery'. When I comment out the LEFT JOINs and the fields they feed in the SELECT statement, the query runs just fine. I've also tried AS MainQuery, but I get the same result.
I suspect it has something to do with scope. Where I'm trying to SELECT DISTINCT id FROM MainQuery, is MainQuery out of scope for the WHERE subquery within sub1?
For context, I've been tasked with rewriting a query that used temp tables into a query that can be used in a report deployed on SSRS 2000. My MainQuery, sub1, and sub2 were temp tables in the original query. Those temp tables used subqueries within them, which I've preserved in my translation. But the original query had the advantage of creating each temp table separately, and then joining the results. Temp tables and subqueries are new to me, so I'm not sure how to adapt between the two, or if that's even the right approach.
The SQL for your MainQuery is invalid. Run it by itself and see:
SELECT A, id
FROM myTable
WHERE conditions
GROUP BY A
You can't select A and id, but only group by A. Either you need to also group by id, or wrap id in an aggregate function like min, or max.
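With that addressed, it looks like your other issue is that you say "LEFT JOIN" but then place the column of your LEFT JOINed table on the left-hand side of the ON clause. See below, where I flip sub1.B and MainQuery.A in the JOIN. (The operand order of an equality does not change the result, so the flip is for readability; the change that matters is that sub1 no longer contains a subquery that refers to MainQuery.)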
SELECT A
,B
,C
,D
FROM ( SELECT A, id
FROM myTable
WHERE conditions
GROUP BY A,id) MainQuery
LEFT JOIN nutherTable sub1
on MainQuery.A = sub1.B
and MainQuery.id = sub1.id
LEFT JOIN (SELECT D ...) sub2
ON sub1.C = sub2.D
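On the scope question itself: an alias given to a derived table in the FROM clause is not a table that sibling subqueries can select from, which is why the name is reported as invalid. The sketch below reproduces that on SQLite through Python's sqlite3 module (not SQL Server, so the error text differs) and shows one alternative, a common table expression, which names the query once so it can be referenced more than once. The sample rows are assumptions, and myView is created as a plain table for the demo. Common table expressions are not available in every engine or version, so this may not apply to an older server; repeating the subquery text or keeping temp tables are the fallbacks there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (id INTEGER, A TEXT);
    CREATE TABLE myView (id INTEGER, B TEXT, C TEXT);  -- stand-in for the view
    INSERT INTO myTable VALUES (1, 'x'), (2, 'y');
    INSERT INTO myView VALUES (1, 'x', 'c1'), (3, 'y', 'c3');
""")

# The derived-table alias MainQuery is referenced from inside a sibling
# subquery, where it is not visible as a table.
broken = """
    SELECT MainQuery.A, sub1.C
    FROM (SELECT id, A FROM myTable GROUP BY id, A) MainQuery
    LEFT JOIN (SELECT id, B, C FROM myView
               WHERE id IN (SELECT DISTINCT id FROM MainQuery)) sub1
      ON sub1.B = MainQuery.A
"""
try:
    conn.execute(broken).fetchall()
    derived_error = None
except sqlite3.OperationalError as exc:
    derived_error = str(exc)

# With a CTE, MainQuery is a named query visible throughout the statement.
cte_rows = conn.execute("""
    WITH MainQuery AS (SELECT id, A FROM myTable GROUP BY id, A)
    SELECT MainQuery.A, sub1.C
    FROM MainQuery
    LEFT JOIN (SELECT id, B, C FROM myView
               WHERE id IN (SELECT id FROM MainQuery)) sub1
      ON sub1.B = MainQuery.A
    ORDER BY MainQuery.A
""").fetchall()

print(derived_error)
print(cte_rows)
```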