How to join and sum tables without losing data? - sql

Here's an example of what I'm trying to do.
Select (t1.count+t2.count) as countTotal from t1 LEFT JOIN t2 ON (t1.ID = t2.ID);
I'm doing this on a much larger scale with many variables added together. The problem I'm getting is that if one of the IDs is not in one of the tables I'm combining, the whole row for that ID comes back blank. My goal is to sum the two tables together for the most part but if one of the rows is only in one table, how can I keep that data in the resulting query?

Use NZ():
Select nz(t1.count) + nz(t2.count) as countTotal
from . . .
This replaces the NULL values with 0 so the + works. Otherwise it returns NULL if any value is NULL.
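Nz() is specific to Access. As a rough sketch of the same idea in portable SQL, the snippet below uses COALESCE, run through Python's sqlite3 module as a stand-in for Access. The sample rows and the column name cnt are assumptions made for illustration, not taken from the question.

```python
import sqlite3

# Hypothetical sample data: ID 2 exists only in t1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ID INTEGER PRIMARY KEY, cnt INTEGER);
    CREATE TABLE t2 (ID INTEGER PRIMARY KEY, cnt INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20);
    INSERT INTO t2 VALUES (1, 5);
""")

# Plain addition: the unmatched row yields NULL (None in Python).
plain = conn.execute("""
    SELECT t1.ID, t1.cnt + t2.cnt
    FROM t1 LEFT JOIN t2 ON t1.ID = t2.ID
    ORDER BY t1.ID
""").fetchall()

# COALESCE substitutes 0 for NULL before the addition.
fixed = conn.execute("""
    SELECT t1.ID, COALESCE(t1.cnt, 0) + COALESCE(t2.cnt, 0)
    FROM t1 LEFT JOIN t2 ON t1.ID = t2.ID
    ORDER BY t1.ID
""").fetchall()

print(plain)
print(fixed)
```

With a LEFT JOIN this only protects rows that are missing from t2; rows that exist only in t2 are still dropped.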

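Assuming the ID can be missing from either table, you'll need a FULL OUTER JOIN. (Access SQL does not support FULL OUTER JOIN directly; there it is usually emulated by combining a LEFT JOIN and a RIGHT JOIN with UNION.)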
If the row is missing from one table, t1.count + t2.count will return Null.
According to the documentation, the default return value from Nz for when the value is Null is 0 or a zero-length string. I prefer to write clearer code, so I specify the value.
SELECT Nz(t1.count, 0) + Nz(t2.count, 0) as countTotal
FROM t1
FULL OUTER JOIN t2 on t1.ID = t2.ID
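Where FULL OUTER JOIN is not available, one possible workaround is to stack both tables with UNION ALL and sum per ID. This is a minimal sketch of that alternative, not the answer's method; it uses Python's sqlite3 module, and the sample rows and the column name cnt are assumed for illustration.

```python
import sqlite3

# Hypothetical sample data: ID 2 is only in t1, ID 3 is only in t2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (ID INTEGER PRIMARY KEY, cnt INTEGER);
    CREATE TABLE t2 (ID INTEGER PRIMARY KEY, cnt INTEGER);
    INSERT INTO t1 VALUES (1, 10), (2, 20);
    INSERT INTO t2 VALUES (1, 5), (3, 7);
""")

# Stack the rows of both tables, then add them up per ID.
# SUM skips NULLs, so an ID present in only one table keeps its value.
totals = conn.execute("""
    SELECT ID, SUM(cnt) AS countTotal
    FROM (
        SELECT ID, cnt FROM t1
        UNION ALL
        SELECT ID, cnt FROM t2
    ) AS combined
    GROUP BY ID
    ORDER BY ID
""").fetchall()

print(totals)
```

With many columns to add together, each one gets its own SUM() in the outer select.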

I had to do something similar and ended up exporting it all to Excel and carefully combining the data manually. It took me about 4 hours. That was only for 1200 records, though, and I was working with 5 tables that all had different names for the matching columns. Huge mess.
Maybe you could try using a make-table query, but if you have a large number of fields this isn’t feasible, and I’m not 100% sure it would even work.
The next best option would be to export the tables and import them into a SQL IDE that supports FULL OUTER JOIN. Then make a table with that full join and export it back to Access. This would definitely work, but it can be tricky to import tables into a SQL IDE; I’ve had trouble with it in the past. Once I got them imported, though, it was smooth sailing.

Related

Bigquery JOIN optimization

We are running a query with a JOIN every 5 minutes. One side of the JOIN is table1#time1-time2 (as we only look at the incremental part); the other side is table2, which keeps changing as we are streaming data into it. The JOIN currently looks like this:
[table1#time1-time2] AS T1 INNER JOIN EACH table2 AS T2 ON T1.id = T2.id
Since this query scans the whole of T2 every time, is there any optimization I can do, such as using a cache or something else, to minimize the cost?
EDIT
The query:
Copy pasting text would be better, hard to read the query on that screenshot.
That said, I see a SELECT * for the second table. Selecting only the needed columns would only query a fraction of the table, instead of all of it.
Also, why are you generating a row_in and joining on a different one?

How do I put multiple criteria for a column in a where clause?

I have five results to retrieve from a table and I want to write a stored procedure that will return all the desired rows.
I can write the query like that temporarily:
Select * from Table where Id = 1 OR Id = 2 or Id = 3
I suppose I need to receive a list of Ids to split, but how do I write the WHERE clause?
So, if you're just trying to learn SQL, this is a short, good example for getting to know the IN operator. The following query returns the same rows as your attempt, provided TABLE2 holds the Ids you want; with literal values the condition is simply WHERE Id IN (1, 2, 3).
SELECT *
FROM TABLE
WHERE ID IN (SELECT ID FROM TABLE2)
This is the direct translation of your attempt and, judging by that attempt, probably the simplest version for you to understand. In the future, though, I would recommend using a JOIN.
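For the case where the Ids arrive as a list from the caller, a common pattern is to generate one placeholder per value. This is a hedged sketch using Python's sqlite3 module; the items table, its columns, and the fetch_by_ids helper are hypothetical names, and a SQL Server stored procedure would need a different mechanism to receive the list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (Id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO items VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
""")


def fetch_by_ids(conn, ids):
    """Return rows whose Id is in ids, using one '?' placeholder per value."""
    ids = list(ids)
    if not ids:
        return []  # an empty "IN ()" is not portable SQL, so short-circuit
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT Id, name FROM items WHERE Id IN ({placeholders}) ORDER BY Id"
    return conn.execute(sql, ids).fetchall()


rows = fetch_by_ids(conn, [1, 2, 3])
print(rows)
```

Only the placeholders are formatted into the SQL text; the values themselves are passed as parameters.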
A JOIN has the same functionality as the previous code, but will be a better alternative. If you are curious to read more about JOINs, here are a few links from the most important sources
Joins - wikipedia
and also a visual representation of how different types of JOIN work
Another way to do it. The inner join will only include rows from T1 that match up with a row from T2 via the Id field.
select T1.* from T1 inner join T2 on T1.Id = T2.Id
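In practice, inner joins are often preferred to subqueries for performance reasons, although many optimizers produce the same plan for both. Note that the join returns a T1 row once per matching T2 row, so its result differs from IN when T2.Id is not unique.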
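One thing worth checking before swapping IN for a join is whether the joined column is unique. The sketch below, using Python's sqlite3 module with made-up rows, shows the two forms diverging when T2 contains a repeated Id.

```python
import sqlite3

# Hypothetical data: Id 1 appears twice in T2.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (Id INTEGER, name TEXT);
    CREATE TABLE T2 (Id INTEGER);
    INSERT INTO T1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO T2 VALUES (1), (1), (2);
""")

# IN only tests membership: each T1 row appears at most once.
via_in = conn.execute(
    "SELECT Id FROM T1 WHERE Id IN (SELECT Id FROM T2) ORDER BY Id"
).fetchall()

# INNER JOIN emits one output row per matching pair of rows.
via_join = conn.execute(
    "SELECT T1.Id FROM T1 INNER JOIN T2 ON T1.Id = T2.Id ORDER BY T1.Id"
).fetchall()

print(via_in)
print(via_join)
```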

SQL Server : Join two rows. All columns from the 1st and 1 column from the second using an alias

This marks the first time I ask a question on Stack Overflow, and probably the 5000th time I've visited the site. So first off, thanks for all your hard work!
So I have a basic select query on a single table that returns two rows of similar data that are linked via a shared PK.
I want to retrieve all fields from the first row, and only one of the columns from the second under an alias.
Basically flattening the two records into one but only using one of the columns from the second row.
OK Here is a screenshot.
http://www.flickr.com/photos/imagevault/8581053528/
Looking at the first results window, I want the second "Comp" value to show up as an additional column on the first row, as "RentalComp". If only one row is returned for a given PropertyID, then it can just be null.
Thanks!
I'm at a loss as to what to google for, so here I am.
SELECT a.*, b.Comp AS RentalComp
FROM dbo.vwComps AS a LEFT OUTER JOIN dbo.vwComps AS b ON a.PropertyID = b.PropertyID
AND b.ConfigurationUsed = 2
WHERE (a.ConfigurationUsed = 1)
The key was specifying multiple conditions in the ON clause, then doing a basic filter in the WHERE clause.
I kept trying to do everything in the WHERE clause, and it was filtering everything out.
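The difference between filtering in ON and filtering in WHERE can be seen on a small example. This is a sketch using Python's sqlite3 module; the comps table stands in for dbo.vwComps, and its rows are invented for illustration.

```python
import sqlite3

# Hypothetical stand-in for dbo.vwComps: property 200 has no configuration 2 row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comps (PropertyID INTEGER, ConfigurationUsed INTEGER, Comp TEXT);
    INSERT INTO comps VALUES (100, 1, 'A'), (100, 2, 'B'), (200, 1, 'C');
""")

# Condition on b inside ON: unmatched rows of a survive with NULL.
filter_in_on = conn.execute("""
    SELECT a.PropertyID, a.Comp, b.Comp AS RentalComp
    FROM comps AS a
    LEFT OUTER JOIN comps AS b
        ON a.PropertyID = b.PropertyID AND b.ConfigurationUsed = 2
    WHERE a.ConfigurationUsed = 1
    ORDER BY a.PropertyID
""").fetchall()

# Same condition moved to WHERE: rows where b does not qualify are
# discarded after the join, so the LEFT JOIN behaves like an INNER JOIN.
filter_in_where = conn.execute("""
    SELECT a.PropertyID, a.Comp, b.Comp AS RentalComp
    FROM comps AS a
    LEFT OUTER JOIN comps AS b
        ON a.PropertyID = b.PropertyID
    WHERE a.ConfigurationUsed = 1 AND b.ConfigurationUsed = 2
    ORDER BY a.PropertyID
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```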
Is this what you're looking for?
SELECT t1.*,
t2.col
FROM table1 t1
JOIN table2 t2
ON t1.key = t2.key

Is there some equivalent to subquery correlation when making a derived table?

I need to flatten out 2 rows in a vertical table (and then join to a third table). I generally do this by making a derived table for each field I need. There are only two fields, so I figure this isn't that unreasonable.
But I know that the rows I want back in the derived table, are the subset that's in my join with my third table.
So I'm trying to figure out the best derived tables to make so that the query runs most efficiently.
I figure the more restrictive I make the derived table's where clause, the smaller the derived table will be, the better response I'll get.
Really what I want is to correlate the where clause of the derived table with the join with the 3rd table, but you can't do that in sql, which is too bad. But I'm no sql master, maybe there's some trick I don't know about.
The other option is just to make the derived table(s) with no where clause and it just ends up joining the entire table twice (once for each field), and when I do my join against them the join filters every thing out.
So really what I'm asking I guess is what's the best way to make a derived table where I know pretty much specifically what rows I want, but sql won't let me get at them.
An example:
table1
------
id tag value
-- ----- -----
1 first john
1 last smith
2 first sally
2 last smithers
table2
------
id occupation
-- ----------
1 carpenter
2 homemaker
select table2.occupation, firsttable.first, lasttable.last from
table2, (select id, value as first from table1 where tag = 'first') firsttable,
(select id, value as last from table1 where tag = 'last') lasttable
where table2.id = firsttable.id and table2.id = lasttable.id
What I want to do is make the firsttable where clause where tag='first' and id = table2.id
Derived tables do not store intermediate results as you expect. They are just a way to make the code simpler. Using a derived table doesn't mean that the derived table expression will be executed first and its output then joined with the remaining tables. The optimizer will automatically flatten derived tables in most cases.
However, there are cases where the optimizer might want to store the results of the subquery and thus materialize it instead of flattening. This usually happens when you have some kind of aggregate function or the like. But in your case the query is simple, so the optimizer will flatten it.
Also, storing the derived table expression won't make your query faster; it could in fact make it worse. Your real problem is too much normalization. Fix that and the query will be just a join of two tables.
Why do you have this kind of normalization? Why are you storing column values as rows? Try to denormalize table1 so that it has two columns, first and last. That will be the best solution here.
Also, do you have proper indexes on the id and tag columns? If so, a merge join works quite well for your query.
Please provide index details on these tables and the plan generated by your query.
Your query will be executed like this inner join:
select table2.occupation, first.value as first, last.value as last
from
table2
inner join table1 first
on first.tag = 'first'
and first.id = table2.id
inner join table1 last
on last.tag = 'last'
and table2.id = last.id
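To see the self-join above run against the question's sample data, here is a small sketch using Python's sqlite3 module. The aliases f and l and the output names first_name and last_name are my own choices, picked to stay clear of words some engines reserve. The second query, which pivots with conditional aggregation, is an alternative I am adding for comparison and is not from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, tag TEXT, value TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, occupation TEXT);
    INSERT INTO table1 VALUES
        (1, 'first', 'john'), (1, 'last', 'smith'),
        (2, 'first', 'sally'), (2, 'last', 'smithers');
    INSERT INTO table2 VALUES (1, 'carpenter'), (2, 'homemaker');
""")

# Self-join: table1 is joined once per tag, with the tag filter in ON.
self_join = conn.execute("""
    SELECT t2.occupation, f.value AS first_name, l.value AS last_name
    FROM table2 AS t2
    INNER JOIN table1 AS f ON f.tag = 'first' AND f.id = t2.id
    INNER JOIN table1 AS l ON l.tag = 'last' AND l.id = t2.id
    ORDER BY t2.id
""").fetchall()

# Alternative: join table1 once and pivot the tags with CASE inside MAX.
pivot = conn.execute("""
    SELECT t2.occupation,
           MAX(CASE WHEN t1.tag = 'first' THEN t1.value END) AS first_name,
           MAX(CASE WHEN t1.tag = 'last' THEN t1.value END) AS last_name
    FROM table2 AS t2
    INNER JOIN table1 AS t1 ON t1.id = t2.id
    GROUP BY t2.id, t2.occupation
    ORDER BY t2.id
""").fetchall()

print(self_join)
print(pivot)
```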
I think what you're asking for is a COMMON TABLE EXPRESSION. If your platform doesn't implement them, then a temporary table may be the best alternative.
I'm a little confused. Your query looks okay . . . although it looks better with proper join syntax.
select table2.occupation, firsttable.first, lasttable.last
from table2 join
(select id, value as first from table1 where tag = 'first') firsttable
on table2.id = firsttable.id join
(select id, value as last from table1 where tag = 'last') lasttable
on table2.id = lasttable.id
This query does what you are asking it to do. SQL is a declarative language, not a procedural language. This means that you describe the result set and rely on the database SQL compiler to turn it into the right set of commands. (That said, sometimes how a query is structured does make it easier or harder for some engines to produce efficient query plans.)

Nested records in SQL Server

select Table1.colID, Table1.colName,
(select * from Table2 where Table2.colID = Table1.colID) as NestedRows
from Table1
The above query gives you this error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used.....
Can anybody explain why this limitation exists?
I had the idea that this kind of multidimensional query would be nice for building OO objects directly from the database with 1 query.
EDIT:
This question is pretty theoretical. To solve this in practice I would use a join or simply run 2 queries, but I wondered if there was anything stopping you from returning a column as a table type (in SQL Server 2008 you can create table types).
Say you have corresponding classes in code, think Linq2Sql:
public class Table1
{
public int colID;
public string colName;
public List<Table2> table2s;
}
I would like to be able to fill instances of this class directly with 1 query
It appears as though you want a recordset (multiple columns and multiple rows) returned from Table2 for each row in table1. If this is correct, perhaps you could return the data as XML from the DB. Something like this...
select Table1.colID, Table1.colName, Table2.*
from Table1
Inner Join Table2
On Table1.ColId = Table2.ColId
Order By Table1.ColId
For XML Auto
Then, for each row in Table1, you'll get multiple sub-nodes in your XML for table2 data.
There are likely to be performance implications with returning XML from your database, as well as with loading your data structure on the front end. I'm not necessarily suggesting this is the best approach, but it's probably worth investigating.
Because the subquery in a select clause must be "inserted" into a column value in every row of the result set from the outer query. You cannot put a set of values into a single cell (a single column of a single row) of the result set.
You need to use an inner join. The multiple rows returned by the joined table will be output as multiple rows in the final result set.
You would be better off using an INNER JOIN between the two tables and simply selecting the rows you want from each table.
SELECT tab1.colID, tab1.colName, tab2.Column1, tab2.column2
FROM dbo.Table1 AS tab1
INNER JOIN dbo.Table2 AS tab2
ON tab1.colID = tab2.colID
However, remember that the data from table1 will be repeated for each matching record in table2. Although I believe the query I posted will get the data in the form you are looking for, I don't think it's the best method for querying the database. I would either execute separate queries, or put the separate queries into a stored procedure and return multiple result sets.
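If the join route is taken, the repeated Table1 values can be folded back into nested objects on the client side. Below is a minimal sketch in Python with sqlite3 standing in for SQL Server; the Table2 columns id and detail and the sample rows are assumptions, and plain dictionaries stand in for the Table1 class.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (colID INTEGER PRIMARY KEY, colName TEXT);
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, colID INTEGER, detail TEXT);
    INSERT INTO Table1 VALUES (1, 'one'), (2, 'two');
    INSERT INTO Table2 VALUES (10, 1, 'x'), (11, 1, 'y'), (12, 2, 'z');
""")

# One flat result set; Table1 columns repeat for every matching Table2 row.
rows = conn.execute("""
    SELECT t1.colID, t1.colName, t2.detail
    FROM Table1 AS t1
    INNER JOIN Table2 AS t2 ON t1.colID = t2.colID
    ORDER BY t1.colID, t2.id
""").fetchall()

# groupby needs its input sorted by the grouping key, hence the ORDER BY.
nested = [
    {
        "colID": col_id,
        "colName": col_name,
        "table2s": [detail for _, _, detail in group],
    }
    for (col_id, col_name), group in groupby(rows, key=itemgetter(0, 1))
]

print(nested)
```

Because this uses an inner join, Table1 rows without any Table2 match do not appear at all; a LEFT JOIN would keep them at the cost of handling NULL details.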
I think the query you're looking for is probably:
select Table1.colID, Table1.colName,Table2.*
from Table1 inner join Table2 ON Table1.colID = Table2.colID
Subqueries are more typically used (at least by me) in the WHERE clause.