An issue possibly related to Cursor/Join - sql

Here is my situation:
Table 1 contains a set of data that uses an id as a unique identifier. This table has a one-to-many relationship with about 6 other tables, such that:
Given Table 1 with Id of 001:
Table 2 might have 3 rows with foreign key: 001
Table 3 might have 12 rows with foreign key: 001
Table 4 might have 0 rows with foreign key: 001
Table 5 might have 28 rows with foreign key: 001
I need to write a report that lists all of the rows from Table 1 for a specified time frame followed by all of the data contained in the handful of tables that reference it.
My current approach in pseudo code would look like this:
select * from table 1
foreach(result) {
print result;
select * from table 2 where id = result.id;
foreach(result2) {
print result2;
}
select * from table 3 where id = result.id
foreach(result3) {
print result3;
}
//continued for each table
}
This means that a single report can run in the neighborhood of 1000 queries. I know this is excessive; however, my SQL-fu is a little weak and I could use some help.

LEFT OUTER JOIN Tables2-N on Table1
SELECT Table1.*, Table2.*, Table3.*, Table4.*, Table5.*
FROM Table1
LEFT OUTER JOIN Table2 ON Table1.ID = Table2.ID
LEFT OUTER JOIN Table3 ON Table1.ID = Table3.ID
LEFT OUTER JOIN Table4 ON Table1.ID = Table4.ID
LEFT OUTER JOIN Table5 ON Table1.ID = Table5.ID
WHERE (CRITERIA)
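One thing to watch for with this approach: because each child table is joined independently to Table1, every Table1 row is repeated once for each combination of its child rows. A minimal sketch using Python's sqlite3 module with made-up data (the `note` column and the in-memory database are assumptions, not part of the original schema) shows the effect for the 3-row and 12-row example from the question:

```python
import sqlite3

# In-memory database with hypothetical data: one parent row,
# 3 matching rows in Table2 and 12 matching rows in Table3.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID TEXT PRIMARY KEY);
    CREATE TABLE Table2 (ID TEXT, note TEXT);
    CREATE TABLE Table3 (ID TEXT, note TEXT);
    INSERT INTO Table1 VALUES ('001');
""")
conn.executemany("INSERT INTO Table2 VALUES ('001', ?)",
                 [(f"t2-{i}",) for i in range(3)])
conn.executemany("INSERT INTO Table3 VALUES ('001', ?)",
                 [(f"t3-{i}",) for i in range(12)])

joined_rows = conn.execute("""
    SELECT Table1.ID, Table2.note, Table3.note
    FROM Table1
    LEFT OUTER JOIN Table2 ON Table1.ID = Table2.ID
    LEFT OUTER JOIN Table3 ON Table1.ID = Table3.ID
""").fetchall()

# Every Table2 row is paired with every Table3 row for the same ID.
print(len(joined_rows))  # 3 * 12 = 36
```

The single Table1 row comes back 36 times, so the client has to de-duplicate each child table's columns before printing the report.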

Join doesn't do it for me. I hate having to de-tangle the data on the client side. All those nulls from left-joining.
Here's a set-based solution that doesn't use Joins.
INSERT INTO #LocalCollection (theKey)
SELECT id
FROM Table1
WHERE ...
SELECT * FROM Table1 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table2 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table3 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table4 WHERE id in (SELECT theKey FROM #LocalCollection)
SELECT * FROM Table5 WHERE id in (SELECT theKey FROM #LocalCollection)
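As a rough illustration of how a client might drive this, here is a sketch in Python with sqlite3. It is only an approximation of the idea above: SQLite has no `#` temp-table syntax, so a `CREATE TEMP TABLE` stands in for `#LocalCollection`, and the `created` and `detail` columns, the date range, and the helper name `fetch_report` are all assumptions made for the example.

```python
import sqlite3

def fetch_report(conn, start, end, child_tables):
    """Run one query per table instead of one query per Table1 row."""
    conn.execute("CREATE TEMP TABLE LocalCollection (theKey TEXT PRIMARY KEY)")
    conn.execute(
        "INSERT INTO LocalCollection (theKey) "
        "SELECT id FROM Table1 WHERE created BETWEEN ? AND ?",
        (start, end),
    )
    report = {}
    for table in ["Table1", *child_tables]:
        # Table names cannot be bound as parameters, so they must come
        # from a trusted, hard-coded list, never from user input.
        report[table] = conn.execute(
            f"SELECT * FROM {table} "
            "WHERE id IN (SELECT theKey FROM LocalCollection) ORDER BY id"
        ).fetchall()
    conn.execute("DROP TABLE LocalCollection")
    return report

# Hypothetical sample data: only '001' falls inside the requested range.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id TEXT PRIMARY KEY, created TEXT);
    CREATE TABLE Table2 (id TEXT, detail TEXT);
    CREATE TABLE Table3 (id TEXT, detail TEXT);
    INSERT INTO Table1 VALUES ('001', '2023-12-30'), ('002', '2024-02-01');
    INSERT INTO Table2 VALUES ('001', 'a'), ('001', 'b'), ('002', 'c');
    INSERT INTO Table3 VALUES ('002', 'x');
""")
report = fetch_report(conn, "2023-12-30", "2023-12-31", ["Table2", "Table3"])
for table, rows in report.items():
    print(table, rows)
```

The number of queries is now one per table plus the key collection, regardless of how many Table1 rows match, and each result set keeps its own column layout with no padding nulls.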

Ah! Procedural! My SQL would look like this, if you needed to order the results from the other tables after the results from the first table.
Insert Into #rows Select id from Table1 where date between '12/30' and '12/31'
Select * from Table1 t join #rows r on t.id = r.id
Select * from Table2 t join #rows r on t.id = r.id
--etc
If you wanted to group the results by the initial ID, use a Left Outer Join, as mentioned previously.

You may be best off using a reporting tool like Crystal or Jasper, or even XSL-FO if you are feeling bold. They have features built in to handle specifically this. This is not something that would work well in raw SQL.
If the format of all of the rows (the headers as well as all of the details) is the same, it would also be pretty easy to do it as a stored procedure.
What I would do: Do it as a join, so you will have the header data on every row, then use a reporting tool to do the grouping.

SELECT * FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.resultid -- this could be a left join if the table is not guaranteed to have entries for t1.id
INNER JOIN table3 t3 ON t1.id = t3.resultid -- etc
Or, if the data is all in the same format, you could do:
SELECT cola,colb FROM table1 WHERE id = @id
UNION ALL
SELECT cola,colb FROM table2 WHERE resultid = @id
UNION ALL
SELECT cola,colb FROM table3 WHERE resultid = @id
It really depends on the format you require the data in for output to the report.
If you can give a sample of how you would like the output I could probably help more.

Join all of the tables together.
select * from table_1 left join table_2 using(id) left join table_3 using(id);
Then, you'll want to roll up the columns in code to format your report as you see fit.

What I would do is open up cursors on the following queries:
SELECT * from table1 order by id
SELECT * from table1 r, table2 t where t.table1_id = r.id order by r.id
SELECT * from table1 r, table3 t where t.table1_id = r.id order by r.id
And then I would walk those cursors in parallel, printing your results. You can do this because all appear in the same order. (Note that I would suggest that while the primary ID for table1 might be named id, it won't have that name in the other tables.)
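The parallel walk can be sketched as follows in Python with sqlite3. This is one possible reading of the idea, not a definitive implementation: the `name` and `detail` columns, the sample rows, and the helper `walk_in_parallel` are invented for the example, and explicit JOIN syntax is used in place of the comma joins above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (table1_id INTEGER, detail TEXT);
    CREATE TABLE table3 (table1_id INTEGER, detail TEXT);
    INSERT INTO table1 VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
    INSERT INTO table2 VALUES (1, 't2-a'), (1, 't2-b'), (3, 't2-c');
    INSERT INTO table3 VALUES (2, 't3-a');
""")

def walk_in_parallel(conn, child_tables):
    """Merge one parent cursor with one cursor per child table."""
    parents = conn.execute("SELECT id, name FROM table1 ORDER BY id")
    # child_tables must be a trusted list: names are interpolated into SQL.
    cursors = [
        conn.execute(
            f"SELECT r.id, t.detail FROM table1 r JOIN {name} t "
            "ON t.table1_id = r.id ORDER BY r.id, t.detail"
        )
        for name in child_tables
    ]
    # Look-ahead row for each child cursor (None once it is exhausted).
    pending = [cur.fetchone() for cur in cursors]
    lines = []
    for parent_id, parent_name in parents:
        lines.append(f"{parent_id} {parent_name}")
        for i, table in enumerate(child_tables):
            # Consume every child row that belongs to the current parent.
            while pending[i] is not None and pending[i][0] == parent_id:
                lines.append(f"  {table}: {pending[i][1]}")
                pending[i] = cursors[i].fetchone()
    return lines

report_lines = walk_in_parallel(conn, ["table2", "table3"])
print("\n".join(report_lines))
```

Because every cursor is ordered by the parent id, each one is read exactly once from start to finish, and the total is one query per table.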

Do all the tables have the same format? If not, you need a report that can display the n different types of rows. If you are only interested in the same columns from each table, it is easier.
Most databases have some form of dynamic SQL. In that case you can do the following:
create temporary table from
select * from table1 where rows within time frame
x integer
sql varchar(something)
x = 1
while x <= numresults {
sql = 'SELECT * from table' + CAST(x as varchar) + ' where id in (select id from temporary table)'
execute sql
x = x + 1
}
Basically, you are running one query on your main table to get the rows that you need, then running one query for each sub-table to get the rows that match your main table.
If the report requires the same 2 or 3 columns for each table you could change the select * from tablex to be an insert into and get a single result set at the end...

Related

Pass values as parameter from select query

I want to pass values from output of select query to another query. Basically both queries will be part of a stored procedure. e.g.
select Id, RelId
from tables
There will be multiple rows returned by above query and I want to pass them to the following query
select name
from table2
where Id = @Id and MgId = @RelId
Please suggest
You cannot pass multiple values in SQL.
But maybe you can just join your 2 tables, that would be far more efficient.
Not knowing your table schemas, I suggest something like this. You might have to adapt it to your actual schemas, of course.
select name
from table2 t2
inner join tables t on t2.Id = t.Id
and t2.MgId = t.RelId
EDIT
As Gordon mentioned in his answer, this approach can show double rows in your result.
If you don't want that, then here are 2 ways of getting rid of the doubles:
select distinct name
from ...
or by grouping: add this at the end of the statement
group by name
Though this will work, avoiding the doubles like in Gordon's answer is better
I would suggest using exists:
select t2.name
from table2 t2
where exists (select 1
from tables t
where t2.Id = t.Id and t2.MgId = t.RelId
);
The difference between exists and join is that this will not generate duplicates, if there are multiple matches between the tables.
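To make the duplicate behaviour concrete, here is a small sketch in Python with sqlite3. The sample rows are invented for the example; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tables (Id INTEGER, RelId INTEGER);
    CREATE TABLE table2 (Id INTEGER, MgId INTEGER, name TEXT);
    -- (1, 10) appears twice in tables, so it matches 'alice' twice.
    INSERT INTO tables VALUES (1, 10), (1, 10), (2, 20);
    INSERT INTO table2 VALUES (1, 10, 'alice'), (2, 20, 'bob'), (3, 30, 'carol');
""")

join_names = [row[0] for row in conn.execute("""
    SELECT t2.name
    FROM table2 t2
    INNER JOIN tables t ON t2.Id = t.Id AND t2.MgId = t.RelId
    ORDER BY t2.name
""")]

exists_names = [row[0] for row in conn.execute("""
    SELECT t2.name
    FROM table2 t2
    WHERE EXISTS (SELECT 1
                  FROM tables t
                  WHERE t2.Id = t.Id AND t2.MgId = t.RelId)
    ORDER BY t2.name
""")]

print(join_names)    # ['alice', 'alice', 'bob']
print(exists_names)  # ['alice', 'bob']
```

The join returns one row per matching pair, while EXISTS returns each table2 row at most once, which is why no DISTINCT or GROUP BY is needed in the second form.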
Or...
SELECT *
INTO #Table1
FROM ...
SELECT *
INTO #Table2
FROM ...
SELECT *
FROM #Table1 T1
JOIN #Table2 T2 ON ...
DROP TABLE #Table1, #Table2

Joining tables using concat column

So I have two tables I want to join using SQL.
Since they did not have a common column I used
SELECT NEW_ID = CONCAT ('0',table1.ID)
Now that I have the new column with matching data in both tables, how do I join both tables? Is there any way to use the NEW_ID column as a temporary column so that I do not have to alter table 1?
This is suitable in your case, although in terms of performance it is not the best solution, since it compares fields of different types.
Select *
From Table1 As t1 inner join Table2 as t2
ON t1.ID = CAST(t2.ID AS INT)
Different ways, here's one:
;WITH CTE AS
(
SELECT NEW_ID = CONCAT('0', TableA.ID) FROM TableA
)
SELECT * FROM CTE AS C INNER JOIN TableB AS B ON C.New_ID = B.ID
You can just give the concrete expression in the join; you don't have to add a new column for that. Like this:
SELECT
*
FROM Table1 A
INNER JOIN Table2 B
ON RIGHT('00'+A.Id,2) = RIGHT('00'+B.Id,2)
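Joining on a computed expression, without altering either table, can be tried out with a sketch like the one below in Python with sqlite3. Note the assumptions: the sample data and the `label` and `info` columns are made up, and SQLite's `||` concatenation operator stands in for `CONCAT`, which is the SQL Server form used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, label TEXT);
    CREATE TABLE Table2 (ID TEXT, info TEXT);
    INSERT INTO Table1 VALUES (123, 'x'), (456, 'y');
    INSERT INTO Table2 VALUES ('0123', 'info-x'), ('0999', 'info-z');
""")

# The prefixed key exists only inside the ON clause; Table1 is unchanged.
matched = conn.execute("""
    SELECT a.ID, a.label, b.info
    FROM Table1 a
    INNER JOIN Table2 b ON '0' || a.ID = b.ID
""").fetchall()

print(matched)  # [(123, 'x', 'info-x')]
```

As the first answer points out, an expression in the join condition generally prevents the database from using an index on that column, so this is convenient rather than fast.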

Select records from one table based on records from another table

This is a simplified version of a problem I'm having.
I have two tables:
Table1 has two columns (Stuff, YesNo) and
Table2 has one column (Stuff)
The records in the YesNo Column will either be 1 or 0
How could I select the records in Table2 whose matching records in Table1 have YesNo = 1?
Many Thanks
SELECT Table2.*
FROM Table2
INNER JOIN Table1 ON Table1.Stuff = Table2.Stuff
WHERE Table1.YesNo = 1
If I understand you correctly, this would be your solution:
Select Stuff From Table2
Where Exists (
Select 'Y'
From Table1
Where Table1.Stuff = Table2.Stuff
And YesNo = 1
)
I believe you'll need data from both tables, and you may want to show fields unique to each, so this seems like a likely answer. However, as I don't believe Stuff accurately represents the relationship, you'll need to adjust the ON A.Stuff = B.Stuff condition so that the join includes all the necessary fields.
SELECT A.Stuff, B.Stuff, B.YesNo
FROM table1 B
INNER JOIN table2 A
on A.Stuff = B.Stuff
WHERE B.YesNo = 1
SELECT T2.*
FROM TABLE1 T1
JOIN TABLE2 T2
ON T1.Stuff = T2.Stuff
WHERE T1.YesNo = 1

SQL query to find record with ID not in another table

I have two tables in a database that share a primary key, and I want to find the disjoint set between them. For example,
Table1 has columns (ID, Name) and sample data: (1 ,John), (2, Peter), (3, Mary)
Table2 has columns (ID, Address) and sample data: (1, address2), (2, address2)
So how do I create a SQL query that fetches the rows from table1 whose ID is not in table2? In this case, (3, Mary) should be returned.
PS: The ID is the primary key for those two tables.
Try this
SELECT ID, Name
FROM Table1
WHERE ID NOT IN (SELECT ID FROM Table2)
Use LEFT JOIN
SELECT a.*
FROM table1 a
LEFT JOIN table2 b
on a.ID = b.ID
WHERE b.id IS NULL
There are basically 3 approaches to that: not exists, not in and left join / is null.
LEFT JOIN with IS NULL
SELECT l.*
FROM t_left l
LEFT JOIN
t_right r
ON r.value = l.value
WHERE r.value IS NULL
NOT IN
SELECT l.*
FROM t_left l
WHERE l.value NOT IN
(
SELECT value
FROM t_right r
)
NOT EXISTS
SELECT l.*
FROM t_left l
WHERE NOT EXISTS
(
SELECT NULL
FROM t_right r
WHERE r.value = l.value
)
Which one is better? The answer is best broken down by major RDBMS vendor. Generally speaking, one should avoid using select ... where ... in (select ...) when the number of records returned by the sub-query is unknown. Some vendors might limit the size. Oracle, for example, limits a literal IN list to 1,000 expressions (ORA-01795), although that limit applies to expression lists rather than to subqueries. The best thing to do is to try all three and compare the execution plans.
Specifically for PostgreSQL, the execution plans of NOT EXISTS and LEFT JOIN / IS NULL are the same. I personally prefer the NOT EXISTS option because it shows the intent better. After all, the semantics are that you want to find records in A whose pk does not exist in B.
Old but still gold, specific to PostgreSQL though: https://explainextended.com/2009/09/16/not-in-vs-not-exists-vs-left-join-is-null-postgresql/
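For a quick check that the three forms agree on the question's sample data, here is a sketch in Python with sqlite3 (SQLite is an assumption chosen so the example is self-contained; execution plans and performance will differ on other databases):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Table2 (ID INTEGER PRIMARY KEY, Address TEXT);
    INSERT INTO Table1 VALUES (1, 'John'), (2, 'Peter'), (3, 'Mary');
    INSERT INTO Table2 VALUES (1, 'address2'), (2, 'address2');
""")

queries = {
    "not_in": """
        SELECT ID, Name FROM Table1
        WHERE ID NOT IN (SELECT ID FROM Table2)
    """,
    "not_exists": """
        SELECT t1.ID, t1.Name FROM Table1 t1
        WHERE NOT EXISTS (SELECT NULL FROM Table2 t2 WHERE t2.ID = t1.ID)
    """,
    "left_join_is_null": """
        SELECT t1.ID, t1.Name FROM Table1 t1
        LEFT JOIN Table2 t2 ON t2.ID = t1.ID
        WHERE t2.ID IS NULL
    """,
}

results = {label: conn.execute(sql).fetchall() for label, sql in queries.items()}
for label, rows in results.items():
    print(label, rows)  # each variant returns [(3, 'Mary')]
```

The variants agree here because ID is a primary key and therefore never NULL. If the sub-query column could contain a NULL, NOT IN would return no rows at all, which is another reason to compare the variants on your own data.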
Fast Alternative
I ran some tests (on Postgres 9.5) using two tables with ~2M rows each. The query below performed at least 5 times better than the other queries proposed:
-- Count
SELECT count(*) FROM (
(SELECT id FROM table1) EXCEPT (SELECT id FROM table2)
) t1_not_in_t2;
-- Get full row
SELECT table1.* FROM (
(SELECT id FROM table1) EXCEPT (SELECT id FROM table2)
) t1_not_in_t2 JOIN table1 ON t1_not_in_t2.id=table1.id;
Keeping in mind the points made in @John Woo's comment/link above, this is how I typically would handle it:
SELECT t1.ID, t1.Name
FROM Table1 t1
WHERE NOT EXISTS (
SELECT TOP 1 NULL
FROM Table2 t2
WHERE t1.ID = t2.ID
)
SELECT COUNT(ID) FROM tblA a
WHERE a.ID NOT IN (SELECT b.ID FROM tblB b) --For count
SELECT ID FROM tblA a
WHERE a.ID NOT IN (SELECT b.ID FROM tblB b) --For results

Is it possible to restrict the results of an outer join?

I've got a scenario where I need to do a join across three tables.
table #1 is a list of users
table #2 contains users who have trait A
table #3 contains users who have trait B
If I want to find all the users who have trait A or trait B (in one simple sql) I think I'm stuck.
If I do a regular join, the people who don't have trait A won't show up in the result set to see if they have trait B (and vice versa).
But if I do an outer join from table 1 to tables 2 and 3, I get all the rows in table 1 regardless of the rest of my where clause specifying a requirement against tables 2 or 3.
Before you come up with multiple sqls and temp tables and whatnot, this program is far more complex, this is just the simple case. It dynamically creates the sql based on lots of external factors, so I'm trying to make it work in one sql.
I expect there are combinations of in or exists that will work, but I was hoping for some thing simple.
But basically the outer join will always yield all results from table 1, yes?
SELECT *
FROM table1
LEFT OUTER
JOIN table2
ON ...
LEFT OUTER
JOIN table3
ON ...
WHERE NOT (table2.pk IS NULL AND table3.pk IS NULL)
or if you want to be sneaky:
WHERE COALESCE(table2.pk, table3.pk) IS NOT NULL
but for your case, I simply suggest:
SELECT *
FROM table1
WHERE table1.pk IN (SELECT fk FROM table2)
OR table1.pk IN (SELECT fk FROM table3)
or the possibly more efficient:
SELECT *
FROM table1
WHERE table1.pk IN (SELECT fk FROM table2 UNION SELECT fk FROM table3)
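A small sketch in Python with sqlite3 shows that the filtered outer join and the IN/UNION form select the same users. The sample rows and the `name` column are invented, and the join filter tests the `fk` columns because the example trait tables have no separate `pk`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pk INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (fk INTEGER);  -- users with trait A
    CREATE TABLE table3 (fk INTEGER);  -- users with trait B
    INSERT INTO table1 VALUES (1, 'ann'), (2, 'bob'), (3, 'cy'), (4, 'dee');
    INSERT INTO table2 VALUES (1), (2);
    INSERT INTO table3 VALUES (2), (3);
""")

# Outer joins keep every user; the WHERE clause then drops users
# that matched neither trait table.
outer_join_ids = [row[0] for row in conn.execute("""
    SELECT table1.pk
    FROM table1
    LEFT OUTER JOIN table2 ON table2.fk = table1.pk
    LEFT OUTER JOIN table3 ON table3.fk = table1.pk
    WHERE COALESCE(table2.fk, table3.fk) IS NOT NULL
    ORDER BY table1.pk
""")]

in_union_ids = [row[0] for row in conn.execute("""
    SELECT pk
    FROM table1
    WHERE pk IN (SELECT fk FROM table2 UNION SELECT fk FROM table3)
    ORDER BY pk
""")]

print(outer_join_ids)  # [1, 2, 3]
print(in_union_ids)    # [1, 2, 3]
```

User 4 has neither trait and is excluded by both queries. If a trait table could hold several rows per user, the outer-join form would repeat that user, while the IN form would not.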
If you really just want the list of users that have one trait or the other, then:
SELECT userid FROM users
WHERE userid IN (SELECT userid FROM trait_a UNION SELECT userid FROM trait_b)
Regarding outerjoin specifically, longneck's answer looks like what I was in the midst of writing.
I think you could do a UNION here.
May I suggest:
SELECT columnList FROM Table1 WHERE UserID IN (SELECT UserID FROM Table2)
UNION
SELECT columnList FROM Table1 WHERE UserID IN (SELECT UserID FROM Table3)
Would something like this work? Keep in mind depending on the size of the tables left outer joins can be very expensive with regards to performance.
Select *
from table1
where userid in (Select t.userid
From table1 t
left outer join table2 t2 on t.userid=t2.userid and t2.AttributeA is not null
left outer join table3 t3 on t.userid=t3.userid and t3.AttributeB is not null
where t2.userid is not null or t3.userid is not null -- keep only users matched in table2 or table3
group by t.userid)
If all you want is the ids of the users then
SELECT UserId From Table2
UNION
SELECT UserId From Table3
is totally sufficient.
If you want some more information from Table1 on these users, you can join the query above to Table1:
SELECT <list of columns from Table1>
FROM Table1 Join (
SELECT UserId From Table2
UNION
SELECT UserId From Table3) Users on Table1.UserID = Users.UserID