SQL alternative to Cross apply - sql

I have a requirement to return every record from the left table for each matching record in the right table.
A sample query is below. The temp table #Dates_Test in the query holds the dates for the past week.
For each record in Employee, if a Date in the temp table (#Dates_Test) is between MvIn_DT and MvOut_DT, I have to return a row for it (7 rows when the whole week matches). I can achieve the expected output using CROSS APPLY, but I am looking for alternatives to CROSS APPLY. Thanks in advance.
Dates_test Result set:
Expected output:
Query:
SELECT c.Name
,c.ID
,COUNT(DISTINCT c.ID) AS [ID_Count]
,t2.DATE AS SvcDate
INTO #Test
FROM Employee c
CROSS APPLY (
SELECT [Date]
FROM #Dates_Test t
WHERE t.DATE BETWEEN c.MvIn_DT
AND c.MvOut_DT
) t2
WHERE c.[State] = 'NY'
GROUP BY t2.DATE
,c.ID

This would more normally be written using JOIN:
SELECT c.Name, c.ID,
COUNT(DISTINCT c.ID) AS [ID_Count],
t.DATE AS SvcDate
INTO #Test
FROM Employee c JOIN
#Dates_Test t
ON t.DATE BETWEEN c.MvIn_DT AND c.MvOut_DT
WHERE c.[State] = 'NY'
GROUP BY t.DATE, c.Name, c.ID ;
Don't expect a performance improvement from this, though: the two versions will probably perform about the same.
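To try the range join on a small data set, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, so the temp table becomes a regular table named Dates_Test, dates are ISO-formatted strings, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (ID INTEGER, Name TEXT, State TEXT, MvIn_DT TEXT, MvOut_DT TEXT);
CREATE TABLE Dates_Test (Date TEXT);  -- stands in for #Dates_Test
INSERT INTO Employee VALUES
    (1, 'Ann', 'NY', '2024-01-01', '2024-01-03'),
    (2, 'Bob', 'NY', '2024-01-03', '2024-01-10'),
    (3, 'Cy',  'NJ', '2024-01-01', '2024-01-10');
INSERT INTO Dates_Test VALUES ('2024-01-02'), ('2024-01-03'), ('2024-01-04');
""")

# Plain JOIN with a range condition: one output row per (employee, date)
# pair where the date falls inside the employee's move-in/move-out window.
rows = conn.execute("""
    SELECT c.Name, c.ID, t.Date AS SvcDate
    FROM Employee c
    JOIN Dates_Test t
      ON t.Date BETWEEN c.MvIn_DT AND c.MvOut_DT
    WHERE c.State = 'NY'
    ORDER BY c.ID, t.Date
""").fetchall()

for row in rows:
    print(row)
```

Each NY employee appears once per date inside their window, which is the same shape the CROSS APPLY version produces.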

Related

Combine a CROSS JOIN and a LEFT JOIN

I have two tables named author and commit_metrics. Both of them have an id field. Author has author_name and author_email. Commit_metrics has author_id and author_date.
I am trying to write a query that will get the number of commits that each author had in a given week, even if that number is 0. Here's what I have so far:
SELECT a.id, a.author_name, a.author_email, c.week_num, COUNT(c.id)
FROM author AS a
CROSS JOIN generate_series(1, 610) AS s(n)
LEFT JOIN (SELECT c.id,
c.author_id,
c.author_date,
WEEK_NUMBER(c.author_date) AS week_num
FROM commit_metrics c) AS c ON s.n = c.week_num AND a.id = c.author_id
WHERE c.week_num IS NOT NULL
GROUP BY a.id, a.author_name, a.author_email, c.week_num
ORDER BY c.week_num DESC, a.author_name;
WEEK_NUMBER is a function I wrote for this query:
CREATE OR REPLACE FUNCTION WEEK_NUMBER(date TIMESTAMP) RETURNS INTEGER AS
$$
SELECT TRUNC(DATE_PART('day', date - '2008-01-01') / 7)::INTEGER;
$$ LANGUAGE SQL;
Currently, the query works like a charm with one major caveat. It doesn't properly calculate 0 when the author made no commits in a given week. I'm not sure why it doesn't. When I do the query with just the FROM and CROSS JOIN, it properly prints the many thousand combined authors/weeks. However, when I add the LEFT JOIN, it loses any week where the author did not make a commit.
Any help would be greatly appreciated. I'm open to doing away with the generate_series call if it's unnecessary.
Also, I found this post, but I don't think it's helpful for my case.
Although you are using a left join, "WHERE c.week_num IS NOT NULL" filters out all of the cases where there is no post. Try this:
SELECT a.id, a.author_name, a.author_email, s.n as week_num, COUNT(c.id) as post_count
FROM author AS a
CROSS JOIN generate_series(1, 610) AS s(n)
LEFT JOIN (SELECT c.id,
c.author_id,
c.author_date,
WEEK_NUMBER(c.author_date) AS week_num
FROM commit_metrics c) AS c ON s.n = c.week_num AND a.id = c.author_id
GROUP BY a.id, a.author_name, a.author_email, s.n
ORDER BY s.n DESC, a.author_name;
Your WHERE clause is excluding the records on commit_metrics that are null, which is the case when the author has no commits during the week selected. You should just remove this from the WHERE clause to get your desired output.
If you need the WHERE clause to eliminate some of the CROSS JOIN records based on your data, you will need that CROSS JOIN and WHERE to be in a sub-select that you LEFT JOIN to, or create some more complicated logic in the current WHERE clause.
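The effect of that WHERE clause can be reproduced on a tiny data set. This is a sketch only: it assumes SQLite via Python's sqlite3 module, replaces generate_series with a recursive CTE, and uses a hypothetical commits table that already stores week_num.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author (id INTEGER, author_name TEXT);
CREATE TABLE commits (id INTEGER, author_id INTEGER, week_num INTEGER);
INSERT INTO author VALUES (1, 'ann'), (2, 'bob');
INSERT INTO commits VALUES (10, 1, 1), (11, 1, 1), (12, 2, 2);
""")

# weeks(n) yields 1 and 2; it plays the role of generate_series here.
BASE = """
    WITH RECURSIVE weeks(n) AS (
        SELECT 1 UNION ALL SELECT n + 1 FROM weeks WHERE n < 2
    )
    SELECT a.author_name, w.n AS week_num, COUNT(c.id) AS commit_count
    FROM author a
    CROSS JOIN weeks w
    LEFT JOIN commits c ON c.week_num = w.n AND c.author_id = a.id
    {where}
    GROUP BY a.author_name, w.n
    ORDER BY a.author_name, w.n
"""

# Filtering on the left-joined column discards the NULL-extended rows,
# so author/week pairs without commits disappear.
filtered = conn.execute(BASE.format(where="WHERE c.week_num IS NOT NULL")).fetchall()

# Without the filter every author/week pair survives and COUNT(c.id) is 0
# where there was no match.
unfiltered = conn.execute(BASE.format(where="")).fetchall()

print(filtered)
print(unfiltered)
```

The filtered query returns only the pairs that have commits, while the unfiltered one also returns the zero-count pairs.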
Remove the filtering condition. Also a subquery is not needed and you want to select s.n instead of c.week_num:
SELECT a.id, a.author_name, a.author_email, s.n as week_num, COUNT(c.id)
FROM author a CROSS JOIN
generate_series(1, 610) AS s(n) LEFT JOIN
commit_metrics c
ON s.n = WEEK_NUMBER(c.author_date) AND a.id = c.author_id
GROUP BY a.id, a.author_name, a.author_email, s.n
ORDER BY s.n DESC, a.author_name;

SQL MAX() value across 2 or more queries

This seems like a basic action in SQL, but it has me stumped.
I have about 2 different subqueries, each grouped by LOCATION_ID, that contain a date column. For example, one query lists WORKORDER records while another pulls records from the NOTE table. Both of these queries include a join to the LOCATION table, allowing me to group by LOCATION_ID.
My goal is to pull the latest date of contact at that particular location and that can be in the form of a workorder, note date, followup date, etc. which are stored in different tables. So ideally I would have a query grouped by LOCATION_ID that shows the latest date of contact for that location.
I would post SQL but I don't have anything that is currently working for me. Any ideas on how to approach this type of scenario?
Thanks!
SELECT
L.LOCATION_ID, Max(MaxDate)
FROM
LOCATION AS L
LEFT JOIN
(SELECT
LOCATION_ID, Max(dbo.LeadNote.NoteDate) AS MaxDate
FROM
LeadNote
INNER JOIN
LOCATION ON LeadNote.LOCATION_ID = LOCATION.LOCATION_ID
GROUP BY
LOCATION_ID) T1 ON L.LOCATION_ID = T1.CONTACTLOCATION_LOCATION_ID
LEFT JOIN
(SELECT
LOCATION_ID, Max(dbo.WORKORDER.WORKORDER_DATECREATED) AS MaxDate
FROM
WORKORDER
INNER JOIN
LOCATION ON LOCATION_ID = WORKORDER_LOCATION_ID
GROUP BY
LOCATION_ID) T2 ON L.LOCATION_ID = T2.CONTACTLOCATION_LOCATION_ID
Perhaps you could use UNION to combine both queries into a single result set, wrap it in a derived table with an alias, and then apply MAX to the date column that both queries return. Keep in mind that to use UNION both queries must return the same number of columns, in the same order and with compatible types.
Ex:
Query A:
Select a, b, c from T1 where....
Query B:
Select a, f, e from T2 where...
you would have:
SELECT MAX(e)
FROM
(
(Select a, b, c, NULL as f, NULL as e from T1 where....)
UNION
(Select a, NULL as b, NULL as c, f, e from T2 where...)
) t
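Applied to the question's tables, the same idea might look like the sketch below. It assumes SQLite via Python's sqlite3 module, ISO-formatted date strings, invented sample rows, and hypothetical names ContactDate, LastContact and contacts. UNION ALL is used instead of UNION because duplicates do not change the MAX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LeadNote (LOCATION_ID INTEGER, NoteDate TEXT);
CREATE TABLE WORKORDER (WORKORDER_LOCATION_ID INTEGER, WORKORDER_DATECREATED TEXT);
INSERT INTO LeadNote VALUES (1, '2024-01-05'), (1, '2024-02-01'), (2, '2024-01-10');
INSERT INTO WORKORDER VALUES (1, '2024-01-20'), (2, '2024-03-15'), (3, '2024-02-02');
""")

# Stack every contact date into one two-column result, then take the
# latest date per location. Column names come from the first SELECT.
rows = conn.execute("""
    SELECT LOCATION_ID, MAX(ContactDate) AS LastContact
    FROM (
        SELECT LOCATION_ID, NoteDate AS ContactDate FROM LeadNote
        UNION ALL
        SELECT WORKORDER_LOCATION_ID, WORKORDER_DATECREATED FROM WORKORDER
    ) AS contacts
    GROUP BY LOCATION_ID
    ORDER BY LOCATION_ID
""").fetchall()

for row in rows:
    print(row)
```

Adding another source of contact dates (follow-ups, for example) would then only need one more SELECT in the union.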
If you need to use a join you can use a case statement to fetch the larger date.
SELECT
L.LOCATION_ID,
(CASE WHEN(T2.MaxDate IS NULL OR T1.MaxDate > T2.MaxDate)
THEN T1.MaxDate
ELSE T2.MaxDate
END) MaxDate
...
Try:
SELECT
L.LOCATION_ID, Max(L.MaxDate) AS MaxDate
FROM
(
(
SELECT
LOCATION.LOCATION_ID, Max(LeadNote.NoteDate) AS MaxDate
FROM
LeadNote
JOIN
LOCATION
ON
LeadNote.LOCATION_ID = LOCATION.LOCATION_ID
GROUP BY
LOCATION.LOCATION_ID
)
UNION
(
SELECT
LOCATION.LOCATION_ID, Max(WORKORDER.WORKORDER_DATECREATED) AS MaxDate
FROM
WORKORDER
JOIN
LOCATION ON LOCATION.LOCATION_ID = WORKORDER.WORKORDER_LOCATION_ID
GROUP BY
LOCATION.LOCATION_ID
)
) AS L
GROUP BY
L.LOCATION_ID
It may need a little tweaking... but kudos to the comment by @DrCopyPaste if this works :)
You can do this with a simple join and a case statement:
SELECT
L.LOCATION_ID,
CASE WHEN (Max(WORKORDER.WORKORDER_DATECREATED) IS NULL OR Max(LeadNote.NoteDate) > Max(WORKORDER.WORKORDER_DATECREATED))
THEN Max(LeadNote.NoteDate)
ELSE Max(WORKORDER.WORKORDER_DATECREATED) END AS maxDate
FROM
LOCATION AS L
LEFT JOIN LeadNote ON LeadNote.LOCATION_ID = L.LOCATION_ID
LEFT JOIN WORKORDER ON L.LOCATION_ID = WORKORDER.WORKORDER_LOCATION_ID
GROUP BY L.LOCATION_ID

Problems with SQL Inner join

Having some problems while trying to optimize my SQL.
I got 2 tables like this:
Names
id, analyseid, name
Analyses
id, date, analyseid.
I want to get the newest analysis from Analyses (ordered by date) for every name (they are unique) in Names. I can't really see how to do this without using two nested selects.
My attempt (don't be confused by the names; it's the same principle):
SELECT
B.id,
B.chosendatetime,
vStockNames.name
FROM
vStockNames
INNER JOIN
(
SELECT TOP 1
vAnalysesHistory.id,
vAnalysesHistory.chosendatetime,
vAnalysesHistory.companyid
FROM
vAnalysesHistory
ORDER BY
vAnalysesHistory.chosendatetime DESC
) AS B
ON
B.companyid = vStockNames.stockid
In my example the problem is that I only get 1 row returned (because of TOP 1). But if I remove it, I get multiple analyses for the same name.
Can you help me? Thanks in advance.
SQL Server 2000+:
SELECT (SELECT TOP 1
a.id
FROM vAnalysesHistory AS a
WHERE a.companyid = n.stockid
ORDER BY a.chosendatetime DESC) AS id,
n.name,
(SELECT TOP 1
a.chosendatetime
FROM vAnalysesHistory AS a
WHERE a.companyid = n.stockid
ORDER BY a.chosendatetime DESC) AS chosendatetime
FROM vStockNames AS n
SQL Server 2005+, using CTE:
WITH cte AS (
SELECT a.id,
a.date,
a.analyseid,
ROW_NUMBER() OVER(PARTITION BY a.analyseid
ORDER BY a.date DESC) AS rk
FROM ANALYSES a)
SELECT n.id,
n.name,
c.date
FROM NAMES n
JOIN cte c ON c.analyseid = n.analyseid
AND c.rk = 1
...without CTE:
SELECT n.id,
n.name,
c.date
FROM NAMES n
JOIN (SELECT a.id,
a.date,
a.analyseid,
ROW_NUMBER() OVER(PARTITION BY a.analyseid
ORDER BY a.date DESC) AS rk
FROM ANALYSES a) c ON c.analyseid = n.analyseid
AND c.rk = 1
You're only asking for the TOP 1, so that's all you're getting. If you want one per companyid, the SELECT on vAnalysesHistory has to be evaluated once per company. A derived table in a plain JOIN cannot reference columns of the outer table, so it cannot do this. Fortunately, CROSS APPLY comes to the rescue in cases like this.
SELECT
B.id,
B.chosendatetime,
vStockNames.name
FROM
vStockNames
CROSS APPLY
(
SELECT TOP 1
vAnalysesHistory.id,
vAnalysesHistory.chosendatetime,
vAnalysesHistory.companyid
FROM
vAnalysesHistory
WHERE companyid = vStockNames.stockid
ORDER BY
vAnalysesHistory.chosendatetime DESC
) AS B
You could also use ROW_NUMBER() to do the same:
SELECT
B.id,
B.chosendatetime,
vStockNames.name
FROM
vStockNames
INNER JOIN
(
SELECT
vAnalysesHistory.id,
vAnalysesHistory.chosendatetime,
vAnalysesHistory.companyid,
ROW_NUMBER() OVER (PARTITION BY companyid ORDER BY chosendatetime DESC) AS row
FROM
vAnalysesHistory
) AS B
ON
B.companyid = vStockNames.stockid AND b.row = 1
Personally I'm a fan of the first approach. It will likely be faster and is easier to read IMO.
Will something like this work for you?
;with RankedAnalysesHistory as
(
SELECT
vah.id,
vah.chosendatetime,
vah.companyid
,rank() over (partition by vah.companyid order by vah.chosendatetime desc) rnk
FROM
vAnalysesHistory vah
)
SELECT
rah.id,
rah.chosendatetime,
vsn.name
FROM
vStockNames vsn
join RankedAnalysesHistory as rah on rah.companyid = vsn.stockid and rah.rnk = 1
It seems to me that you only need SQL-92 for this. Of course, explicit documentation of the joining columns between the tables would help.
Simple names
SELECT B.ID, C.ChosenDate, N.Name
FROM (SELECT A.AnalyseID, MAX(A.Date) AS ChosenDate
FROM Analyses AS A
GROUP BY A.AnalyseID) AS C
JOIN Analyses AS B ON C.AnalyseID = B.AnalyseID AND C.ChosenDate = B.Date
JOIN Names AS N ON N.AnalyseID = C.AnalyseID
The sub-select generates the latest analysis for each company; the join with Analyses picks up the Analyse.ID value corresponding to that latest analysis, and the join with Names picks up the company name. (The C.ChosenDate in the select-list could be replaced by B.Date AS ChosenDate, of course.)
Complicated names
SELECT B.ID, C.ChosenDateTime, N.Name
FROM (SELECT A.CompanyID, MAX(A.ChosenDateTime) AS ChosenDateTime
FROM vAnalysesHistory AS A
GROUP BY A.CompanyID) AS C
JOIN vAnalysesHistory AS B ON C.CompanyID = B.CompanyID
AND C.ChosenDateTime = B.ChosenDateTime
JOIN vStockNames AS N ON N.StockID = C.CompanyID
Same query with systematic renaming (and slightly different layout to avoid horizontal scrollbars).
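The "simple names" variant can be checked against sample data with a short sketch like this one. It assumes SQLite via Python's sqlite3 module, the Names and Analyses layouts given in the question, ISO-formatted date strings, and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Names (id INTEGER, analyseid INTEGER, name TEXT);
CREATE TABLE Analyses (id INTEGER, date TEXT, analyseid INTEGER);
INSERT INTO Names VALUES (1, 100, 'Acme'), (2, 200, 'Globex');
INSERT INTO Analyses VALUES
    (1, '2024-01-01', 100),
    (2, '2024-03-01', 100),
    (3, '2024-02-01', 200);
""")

# C finds the latest date per analyseid; joining back to Analyses (B)
# recovers the id of that latest row; Names (N) supplies the name.
rows = conn.execute("""
    SELECT B.id, C.ChosenDate, N.name
    FROM (SELECT A.analyseid, MAX(A.date) AS ChosenDate
          FROM Analyses AS A
          GROUP BY A.analyseid) AS C
    JOIN Analyses AS B ON C.analyseid = B.analyseid AND C.ChosenDate = B.date
    JOIN Names AS N ON N.analyseid = C.analyseid
    ORDER BY N.name
""").fetchall()

for row in rows:
    print(row)
```

One caveat with this approach: if two analyses for the same analyseid share the latest date, both rows are returned.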

mysql left join question

I've got two tables, one holds reservations for a room, and the other is a "mid" table to hold the dates that the room is reserved on (since a reservation could have multiple non-sequential dates).
It looks something like:
Res_table:
id, room_id, owner_id
Res_table_mid:
id, res_id, date
The res_id column in the res_table_mid references the id of the res_table. I need to get the start and end date of the reservation.
So the query looks something like this:
SELECT * FROM res_table a
LEFT JOIN (SELECT min(date) as start_date, res_id FROM res_table_mid) AS min ON a.id = min.res_id
LEFT JOIN (SELECT max(date) as end_date, res_id FROM res_table_mid) AS max ON a.id = max.res_id
This works as expected, unless the tables are empty or there are no results, in which case it errors with
#1048 - Column 'res_id' cannot be null
Is there a way to write this so that I get the data I need but if there's no results there's also no error?
Thanks!
Select Res_table.id, Res_table.room_id, Res_table.owner_id, MinMax.start_date, MinMax.end_date
From Res_table
Left Join (
Select R2.res_id, Min(R2.Date) As start_date, Max(R2.Date) As end_date
From Res_table_mid As R2
Group By R2.res_id
) As MinMax
On MinMax.res_Id = Res_table.Id
In your original query, neither derived table indicates the Group By column. Instead, you are relying on MySQL to guess that it should group by res_id. If I had to wager a guess, I'd say that this might be the source of the problem.
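A quick way to see the grouped derived table handle reservations without dates is the sketch below. It assumes SQLite via Python's sqlite3 module rather than MySQL, so it does not reproduce error #1048; the alias mm and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE res_table (id INTEGER, room_id INTEGER, owner_id INTEGER);
CREATE TABLE res_table_mid (id INTEGER, res_id INTEGER, date TEXT);
INSERT INTO res_table VALUES (1, 10, 7), (2, 11, 8), (3, 12, 9);
INSERT INTO res_table_mid VALUES
    (1, 1, '2024-01-01'),
    (2, 1, '2024-01-03'),
    (3, 2, '2024-02-05');
""")

# The derived table is grouped explicitly by res_id, so it yields one
# (start_date, end_date) pair per reservation. Reservation 3 has no
# dates and simply gets NULLs from the LEFT JOIN.
rows = conn.execute("""
    SELECT r.id, r.room_id, r.owner_id, mm.start_date, mm.end_date
    FROM res_table r
    LEFT JOIN (
        SELECT res_id, MIN(date) AS start_date, MAX(date) AS end_date
        FROM res_table_mid
        GROUP BY res_id
    ) AS mm ON mm.res_id = r.id
    ORDER BY r.id
""").fetchall()

for row in rows:
    print(row)
```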
SELECT a.id,
a.room_id,
a.owner_id,
MAX(m.date) AS end_date ,
MIN(m.date) AS start_date
FROM res_table a
LEFT JOIN res_table_mid m
ON a.id = m.res_id
GROUP BY a.id,
a.room_id,
a.owner_id;
SELECT min(date) AS start_date FROM (
SELECT b.date FROM res_table a
LEFT JOIN res_table_mid AS b
ON a.id = b.res_id
WHERE a.id = @reservation) AS r;
SELECT max(date) AS end_date FROM (
SELECT b.date FROM res_table a
LEFT JOIN res_table_mid AS b
ON a.id = b.res_id
WHERE a.id = @reservation) AS r;

SQL Return only where more than one join

Not sure how to ask this as I'm a bit of a database noob,
What I want to do is the following.
table tb_Company
table tb_Division
I want to return companies that have more than one division and I don't know how to do the where clause.
SELECT dbo.tb_Company.CompanyID, dbo.tb_Company.CompanyName,
dbo.tb_Division.DivisionName FROM dbo.tb_Company INNER JOIN dbo.tb_Division ON
dbo.tb_Company.CompanyID = dbo.tb_Division.DivisionCompanyID
Any help or links much appreciated.
You'll need another JOIN where you only return companies having more than one division by using a GROUP BY and a HAVING clause.
You can read up on grouping in the GROUP BY documentation:
Groups a selected set of rows into a set of summary rows by the values of one or more columns or expressions. One row is returned for each group. Aggregate functions in the SELECT clause list provide information about each group instead of individual rows.
SELECT dbo.tb_Company.CompanyID
, dbo.tb_Company.CompanyName
, dbo.tb_Division.DivisionName
FROM dbo.tb_Company
INNER JOIN dbo.tb_Division ON dbo.tb_Company.CompanyID = dbo.tb_Division.DivisionCompanyID
INNER JOIN (
SELECT DivisionCompanyID
FROM dbo.tb_Division
GROUP BY
DivisionCompanyID
HAVING COUNT(*) > 1
) d ON d.DivisionCompanyID = dbo.tb_Company.CompanyID
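To see which rows this returns, here is a small sketch of the same query. It assumes SQLite via Python's sqlite3 module, so the dbo. schema prefix is dropped, and the table contents and the alias m are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb_Company (CompanyID INTEGER, CompanyName TEXT);
CREATE TABLE tb_Division (DivisionName TEXT, DivisionCompanyID INTEGER);
INSERT INTO tb_Company VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO tb_Division VALUES ('Sales', 1), ('R&D', 1), ('Ops', 2);
""")

# The derived table m keeps only company ids with more than one division;
# joining to it filters the company/division rows down to those companies.
rows = conn.execute("""
    SELECT c.CompanyID, c.CompanyName, d.DivisionName
    FROM tb_Company c
    JOIN tb_Division d ON d.DivisionCompanyID = c.CompanyID
    JOIN (
        SELECT DivisionCompanyID
        FROM tb_Division
        GROUP BY DivisionCompanyID
        HAVING COUNT(*) > 1
    ) m ON m.DivisionCompanyID = c.CompanyID
    ORDER BY d.DivisionName
""").fetchall()

for row in rows:
    print(row)
```

Globex has a single division, so only the two Acme rows come back.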
another alternative...
SELECT c.CompanyId, c.CompanyName, COUNT(*) AS DivisionCount
FROM tb_Company c
INNER JOIN tb_Division d ON c.CompanyId=d.DivisionCompanyId
GROUP BY c.CompanyId, c.CompanyName
HAVING COUNT(*) > 1
How about?
WITH COUNTED AS
(
SELECT C.CompanyID, C.CompanyName, D.DivisionName,
COUNT(*) OVER(PARTITION BY C.CompanyID) AS Cnt
FROM dbo.tb_Company C
INNER JOIN dbo.tb_Division D ON C.CompanyID = D.DivisionCompanyID
)
SELECT *
FROM COUNTED
WHERE Cnt > 1
With the other solutions (that join onto Division table twice), a single company/division can be returned under a heavy insert load.
If a row is inserted into the Division table between the time the first join occurs and the time the second join (with the group by/having) is evaluated, the first Division join will return a single row. However, the second one will return a count of 2.
How about...
SELECT dbo.tb_Company.CompanyID,
dbo.tb_Company.CompanyName
FROM dbo.tb_Company
WHERE (SELECT COUNT(*)
FROM dbo.tb_Division
WHERE dbo.tb_Company.CompanyID =
dbo.tb_Division.DivisionCompanyID) > 1;