MSSQL 2012 - Returning multiple columns in a subquery - sql

I'd like to return multiple columns with a sub query.
E.g.:
select a.name, a.age
from table1 a, ( select b.race, b.weight from table2 b where dateDiff(dd, b.date1, b.date2 ) < 30 )
where a.age > 24
Some of you have said "Just use a join", but I do not want the dateDiff in the subquery affecting the results of the parent query. Again, my real query is more complex than this, but this should be sufficient to explain my issue.

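Use a LEFT JOIN for this. Put each filter inside its own derived table: parent rows with no matching row in b are still returned, with NULL values in b's columns.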
SELECT a.name, b.score, ...
FROM (select id, name, ... from table1 where ???) a
LEFT JOIN (select id, score, ... from table2 where ???) b on (a.id = b.id)
WHERE clause
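To see that the filter inside the derived table does not remove parent rows, here is a minimal sketch using Python's built-in sqlite3 module. It is SQLite rather than SQL Server, so julianday() stands in for DATEDIFF; the id column and all the data are made up for illustration.

```python
import sqlite3

# Hypothetical toy data; the id column and every value are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, name TEXT, age INTEGER);
CREATE TABLE table2 (id INTEGER, race TEXT, weight INTEGER, date1 TEXT, date2 TEXT);
INSERT INTO table1 VALUES (1, 'Ann', 30), (2, 'Bob', 40), (3, 'Cy', 20);
INSERT INTO table2 VALUES
    (1, 'x', 70, '2020-01-01', '2020-01-10'),  -- 9 days apart: kept
    (2, 'y', 80, '2020-01-01', '2020-03-01');  -- 60 days apart: filtered out
""")

rows = conn.execute("""
SELECT a.name, a.age, b.race, b.weight
FROM (SELECT id, name, age FROM table1 WHERE age > 24) AS a
LEFT JOIN (
    SELECT id, race, weight
    FROM table2
    -- SQLite stand-in for DATEDIFF(dd, date1, date2) < 30
    WHERE julianday(date2) - julianday(date1) < 30
) AS b ON a.id = b.id
ORDER BY a.id
""").fetchall()

for row in rows:
    print(row)
```

Bob's table2 row fails the date filter, yet Bob is still returned, with NULL (None in Python) in the race and weight columns. Only the age filter on the parent side removes a parent row (Cy).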

Related

oracle12c,sql,difference between count(*) and sum()

Tell me the difference between sql1 and sql2:
sql1:
select count(1)
from table_1 a
inner join table_2 b on a.key = b.key where a.id in (
select id from table_1 group by id having count(1) > 1
)
sql2:
select sum(a) from (
select count(1) as a
from table_1 a
inner join table_2 b on a.key = b.key group by a.id having count(1) > 1
)
Why is the output not the same?
The queries are not even similar. They are very different. Let's check the first one:
select count(1)
from table_1 a
inner join table_2 b
on a.key = b.key
where a.id in (
select id from table_1 group by id having count(1) > 1
) ;
You are first making an inner join:
select count(1)
from table_1 a
inner join table_2 b
on a.key = b.key
In this case count(1) and count(*) are equivalent, and count(id) gives the same result as long as id is never NULL. You are counting the pairs of rows from the two tables that have the key field in common.
After that, you are enforcing this:
where a.id in (
select id from table_1 group by id having count(1) > 1
)
In other words, every "id" that is kept must appear at least twice in table_1.
And lastly, you are doing this:
select count(1)
In other words, counting those rows. So, translated into English, you have done this:
1. take every record of table_1, pair it with the records of table_2 on the key field, and keep only the pairs that match
2. from that result, keep only the rows whose id appears more than once in table_1
3. count what is left
Let's see what happens with the second query:
select sum(a) from (
select count(1) as a
from table_1 a
inner join table_2 b
on a.key = b.key
group by a.id
having count(1) > 1
);
You are making the same inner join:
select count(1) as a
from table_1 a
inner join table_2 b
on a.key = b.key
but, you are grouping it by the id of the table:
group by a.id
and then keeping only the groups that contain more than one joined row:
having count(1) > 1
The result so far is one row per id of table_1, holding the number of joined rows for that id, and only for ids with at least two joined rows. Note that the HAVING here counts rows after the join, so it depends on how many matches table_2 supplies, whereas the IN subquery of the first query counts rows in table_1 alone. An id that occurs once in table_1 but matches two rows in table_2 passes the second filter and fails the first; an id that occurs twice in table_1 but produces only one joined row does the opposite.
And lastly, you sum those per-id counts.
When you use count(*) you count ALL the rows. The SUM() function is an aggregate function that returns the sum of all or distinct values in a set of values.
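A small made-up data set shows the two queries disagreeing. This is only a sketch run through Python's sqlite3 module (SQLite, not Oracle); the rows are invented, and the inner column alias is renamed to cnt for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (id INTEGER, key TEXT);
CREATE TABLE table_2 (key TEXT);
-- id 1 occurs twice in table_1 but has only one match in table_2;
-- id 2 occurs once in table_1 but matches two rows in table_2.
INSERT INTO table_1 VALUES (1, 'x'), (1, 'y'), (2, 'z');
INSERT INTO table_2 VALUES ('x'), ('z'), ('z');
""")

# sql1: filter on ids duplicated in table_1, then count joined rows.
sql1 = conn.execute("""
SELECT COUNT(1)
FROM table_1 a
INNER JOIN table_2 b ON a.key = b.key
WHERE a.id IN (SELECT id FROM table_1 GROUP BY id HAVING COUNT(1) > 1)
""").fetchone()[0]

# sql2: count joined rows per id, keep ids with more than one, then sum.
sql2 = conn.execute("""
SELECT SUM(cnt) FROM (
    SELECT COUNT(1) AS cnt
    FROM table_1 a
    INNER JOIN table_2 b ON a.key = b.key
    GROUP BY a.id
    HAVING COUNT(1) > 1
)
""").fetchone()[0]

print(sql1, sql2)
```

With this data sql1 counts only the single joined row of id 1, while sql2 sums the two joined rows of id 2, so the results are 1 and 2.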

Written a subquery that can return more than one field without using the Exists

The query below is supposed to pull records for fields with the max date.
I am getting an error
You have written a subquery that can return more than one field without using EXISTS reserved word in the Main query's FROM clause. Revise the SELECT statement of the subquery to request only one column.
Code:
SELECT *
FROM TableName
WHERE (((([Project_Name], [Date])) IN (SELECT Project_Name, MAX(Date)
FROM TableName
GROUP BY Project)));
You're probably thinking of a nested subquery used as a table, like the one below:
select a.*, b.col1, b.col2
from FirstTable a
inner join (select Id, firstcolumn as col1, secondcolumn as col2
from SecondTable) b on b.Id = a.Id
This works pretty much like a regular join, except that you are joining to a subquery. Hope that helps.
SELECT A.*
FROM TableName A
INNER JOIN (select Project_Name, max([Date]) AS MaxDate
from TableName
group by Project_Name) B
ON A.[Project_Name] = B.[Project_Name]
AND A.[Date] = B.MaxDate
A version using EXISTS() looks like this:
SELECT *
FROM TableName AS A
WHERE EXISTS(
SELECT * FROM (
SELECT B.Project_Name, MAX( B.Date ) AS MaxDate
FROM TableName AS B
GROUP BY B.Project_Name ) AS C
WHERE C.Project_Name = A.Project_Name AND C.MaxDate = A.Date
);
Although I have the feeling this will have poorer performance than a JOIN because the GROUP BY statement might have to be executed for each record and each call to the EXISTS() function...
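For what it's worth, the join against the grouped subquery can be checked on a toy table. The sketch below uses Python's sqlite3 module, so it is SQLite rather than Access; the Note column and the rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableName (Project_Name TEXT, [Date] TEXT, Note TEXT);
INSERT INTO TableName VALUES
    ('A', '2020-01-01', 'a1'),
    ('A', '2020-02-01', 'a2'),
    ('B', '2020-01-15', 'b1');
""")

# Join each row to the max date of its project; only the latest rows match.
latest = conn.execute("""
SELECT A.Project_Name, A.[Date], A.Note
FROM TableName A
INNER JOIN (
    SELECT Project_Name, MAX([Date]) AS MaxDate
    FROM TableName
    GROUP BY Project_Name
) B ON A.Project_Name = B.Project_Name AND A.[Date] = B.MaxDate
ORDER BY A.Project_Name
""").fetchall()

for row in latest:
    print(row)
```

Each project comes back once, with the row carrying its latest date. If two rows of a project share the same maximum date, both are returned.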

Exposing more fields on group by sql

I know that in a GROUP BY query you can't select a field that is neither in an aggregate function nor in the GROUP BY clause.
However, there must be a workaround using joins or something else.
I have TWO tables BMP_VISITS_SITES and BMP_VISITS_COMMENTS which are connected by StationID in a one-to-many relationship. One Site can have many comments.
I'm trying to write a query that returns all Sites and the latest (only 1) comment. I have a "working" query but it only returns two columns which are in either an aggregate function or group by.
Here is my "working" query:
select a.StationID,
MAX(b.[dateobserved]) as LastDateObserved,
a.Status
from BMP_VISITS_SITES a
left outer join BMP_VISITS_COMMENTS as b
on a.[StationID] = b.[StationID]
group by a.StationID;
But how can I access all the columns in both tables?
I've tried inner joins with 1/2 success. When I join my BMP_VISITS_SITES to the above query I get all the fields of the table (t1). Great, but as soon as I try joining on BMP_VISITS_COMMENTS (t3) I get more results than I should.
select t1.*, t2.*
--,t3.*
from BMP_VISITS_SITES t1
inner join (
select a.StationID, MAX(b.[dateobserved]) as LastDateObserved from BMP_VISITS_SITES a
left outer join BMP_VISITS_COMMENTS as b
on a.[StationID] = b.[StationID]
group by a.StationID
) t2 on t2.StationID = t1.StationID
--inner join sde.BMP_VISITS_COMMENTS t3 on t3.StationID = t2.StationID;
SELECT a.*, b.* FROM
BMP_VISITS_SITES a
OUTER APPLY
(
SELECT TOP 1 *
FROM BMP_VISITS_COMMENTS b
WHERE b.StationID = a.StationID
ORDER BY b.[dateobserved] DESC
) b
You can use apply to get the last comment record and return all fields from both sides of the query.
Use row_number()
select *
from
(
select a.StationID,
a.Status,
b.*,
row_number() over (partition by a.stationid, a.status order by b.[dateobserved] desc) as rn
from BMP_VISITS_SITES a
left outer join BMP_VISITS_COMMENTS as b
on a.[StationID] = b.[StationID]
) v
where rn = 1
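A quick way to try the row_number() idea without a SQL Server instance is Python's sqlite3 module. This sketch assumes SQLite 3.25 or newer (needed for window functions), partitions by StationID only, and uses a hypothetical comment column and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BMP_VISITS_SITES (StationID INTEGER, Status TEXT);
CREATE TABLE BMP_VISITS_COMMENTS (StationID INTEGER, dateobserved TEXT, comment TEXT);
INSERT INTO BMP_VISITS_SITES VALUES (1, 'open'), (2, 'closed'), (3, 'open');
INSERT INTO BMP_VISITS_COMMENTS VALUES
    (1, '2020-01-01', 'old'),
    (1, '2020-03-01', 'new'),
    (2, '2020-02-01', 'only');
""")

# Number each site's comments from newest to oldest, then keep number 1.
rows = conn.execute("""
SELECT StationID, Status, dateobserved, comment
FROM (
    SELECT a.StationID, a.Status, b.dateobserved, b.comment,
           ROW_NUMBER() OVER (PARTITION BY a.StationID
                              ORDER BY b.dateobserved DESC) AS rn
    FROM BMP_VISITS_SITES a
    LEFT JOIN BMP_VISITS_COMMENTS b ON a.StationID = b.StationID
) v
WHERE rn = 1
ORDER BY StationID
""").fetchall()

for row in rows:
    print(row)
```

Every site appears exactly once. Site 3 has no comments, so the LEFT JOIN keeps it with NULL (None) in the comment columns.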

SQL: Turn a subquery into a join: How to refer to outside table in nested join where clause?

I am trying to change my sub-query into a join that selects only one record per outer row. It seems to run the sub-query for each found record, taking over a minute to execute:
select afield1, afield2, (
select top 1 b.field1
from anothertable as b
where b.aForeignKey = a.id
order by field1
) as bfield1
from sometable as a
If I try to only select related records, it doesn't know how to bind a.id in the nested select.
select afield1, afield2, bfield1
from sometable a left join (
select top 1 id, bfield, aForeignKey
from anothertable
where anothertable.aForeignKey = a.id
order by bfield) b on
b.aForeignKey = a.id
-- Results in the multi-part identifier "a.id" could not be bound
If I hard code values in the nested where clause, the select duration drops from 60 seconds to under five. Anyone have any suggestions on how to join the two tables while not processing every record in the inner table?
EDIT:
I ended up adding
left outer join (
select *, row_number() over (partition by / order by) as rank) b on
b.aforeignkey = a.id and b.rank = 1
went from ~50 seconds to 8 for 22M rows.
Try this:
WITH qry AS
(
SELECT afield1,
afield2,
b.field1 AS bfield1,
ROW_NUMBER() OVER(PARTITION BY a.id ORDER BY field1) rn
FROM sometable a LEFT JOIN anothertable b
ON b.aForeignKey = a.id
)
SELECT *
FROM qry
WHERE rn = 1
Try this. A derived table in a plain join cannot reference a.id, so use OUTER APPLY, which allows the correlation:
select a.afield1,
a.afield2,
b.bfield as bfield1
from sometable a
outer apply
(select top 1 id, bfield, aForeignKey from anothertable where aForeignKey = a.id order by bfield) b

How to select records from a Table that has a certain number of rows in a related table in SQL Server?

Not quite sure how to ask this, but I have 2 tables that are related in a one-to-many relationship. I need to select all records in the "1" table that have fewer than three records in the "many" table.
select b.foreignkey,count(b.foreignkey) as bidcount
from b
where b.foreignkey in (select a.id from a) and bidcount< 3
group by b.foreignkey
I know this doesn't work at all, but I am at a loss as to how to do this.
In the end I need to select all the records from the "a" table based on this criterion. Sorry if that is confusing!
Just using your code, not tested:
SELECT
b.foreignkey,
count(b.foreignkey) as bidcount
FROM
b
WHERE
b.foreignkey IN (SELECT a.id FROM a)
GROUP BY
b.foreignkey
HAVING
count(b.foreignkey) < 3
Try this:
SELECT t1.id,COUNT(t2.parentId)
FROM table1 as t1
INNER JOIN table2 as t2
ON t1.id = t2.parentId
GROUP BY t1.id
HAVING COUNT(t2.parentId) < 3
You didn't mention which version of SQL Server you're using - if you're on SQL Server 2005 or newer, you could use this CTE (Common Table Expression):
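;WITH ChildRows AS
(
SELECT A.Id, COUNT(B.Id) AS BCount
FROM
dbo.TableA A
INNER JOIN
dbo.TableB B ON B.TableAId = A.Id
GROUP BY A.Id -- required: A.Id is selected next to an aggregate
)
SELECT A.*, R.BCount
FROM dbo.TableA A
INNER JOIN ChildRows R ON A.Id = R.Id
WHERE R.BCount < 3 -- keep only parents with fewer than three child rows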
The inner SELECT lists the Id columns from TableA and the count of the child rows associated with those (using the INNER JOIN to TableB) - and the outer SELECT just builds on top of that result set and shows all fields from table A (and the count from the B table)
If you want to return all fields of your "1" table in one query, I suggest you consider using CROSS APPLY:
SELECT t1.* FROM table_1 t1
CROSS APPLY (SELECT COUNT(*) cnt FROM Table_Many t2 WHERE t2.fk = t1.pk) a
where a.cnt < 3
In some particular cases, depending on your indexes and DB structure, this query may run 4 times faster than the GROUP BY method.
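One detail worth checking with any of these approaches: a parent with no child rows at all has a count of 0, which is also below three. An INNER JOIN with GROUP BY drops such parents, while a LEFT JOIN (or the correlated COUNT in the APPLY version) keeps them. A minimal sketch with Python's sqlite3 module, using invented data and SQLite rather than SQL Server:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY, foreignkey INTEGER);
INSERT INTO a VALUES (1), (2), (3), (4);
-- parent 1 has three children, 2 has one, 3 has none, 4 has two
INSERT INTO b (foreignkey) VALUES (1), (1), (1), (2), (4), (4);
""")

# Same query twice; only the join type differs.
query = """
SELECT a.id, COUNT(b.id)
FROM a
{join} JOIN b ON b.foreignkey = a.id
GROUP BY a.id
HAVING COUNT(b.id) < 3
ORDER BY a.id
"""

inner_rows = conn.execute(query.format(join="INNER")).fetchall()
left_rows = conn.execute(query.format(join="LEFT")).fetchall()

print(inner_rows)
print(left_rows)
```

The INNER JOIN version returns parents 2 and 4 only; the LEFT JOIN version also returns parent 3 with a count of 0. Which one is right depends on whether childless parents should count as "fewer than three".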
You posted this question under SQL Server; my answer was written for Oracle (I don't know whether it will run on SQL Server as well).
It is as follows:
select [desired column list] from
(select b.*, count(*) over (partition by b.foreignkey) c_1
from b
where b.foreignkey in (select a.id from a) )
where c_1 < 3 ;
I hope it works on SQL Server as well.
If not, please let me know.