SQL query execution takes too much time - sql

I have a SQL query which sometimes takes 15 seconds and sometimes 1 minute.
Please tell me how I can make my query lighter.
SELECT TOP 100
u.firstName,
u.id as userID,
ueh.targetID,
ueh.opened,
ueh.emailID,
u.phone,
u.gender
FROM dbo.Students u
INNER JOIN dbo.tblEmailHistory ueh
ON ueh.studentID = u.ID
WHERE (CONVERT(date,DATEADD(day,6,ueh.sDate))=CONVERT(date,getdate()))
AND IsNull(u.firstName, '') != ''
AND IsNull(u.email, '') != ''
AND IsNull(u.phone, '') != ''
AND ueh.status = 'sent'
AND ueh.reject_reason = 'null'
AND ueh.targetID = 28
AND ueh.opened = 0
AND u.deleted = 0
AND NOT EXISTS (SELECT ush.smsSendFullDate, ush.studentID FROM dbo.UsersSmsHistory ush WHERE u.id = ush.studentID AND DATEDIFF(DAY,ush.smsSendFullDate,GETDATE()) = 0)

This is your query greatly simplified:
SELECT TOP 100 u.firstName, u.id as userID,
ueh.targetID, ueh.opened, ueh.emailID,
u.phone, u.gender
FROM dbo.Students u INNER JOIN
dbo.tblEmailHistory ueh
ON ueh.studentID = u.ID
WHERE ueh.sDate >= cast(getdate() - 6 as date) AND -- sDate + 6 days falls on today
ueh.sDate < cast(getdate() - 5 as date) AND
u.firstName <> '' AND
u.email <> '' AND
u.phone <> '' AND
ueh.status = 'sent' AND
ueh.reject_reason = 'null' AND -- sure you mean a string here?
ueh.targetID = 28 AND
ueh.opened = 0 AND
u.deleted = 0 AND
NOT EXISTS (SELECT ush.smsSendFullDate, ush.studentID
FROM dbo.UsersSmsHistory ush
WHERE u.id = ush.studentID AND
convert(date, ush.smsSendFullDate) = convert(date, GETDATE())
);
Note: Almost all comparisons with NULL are never true (they evaluate to unknown), so ISNULL()/COALESCE() is unnecessary here.
Then start adding indexes. I would recommend:
tblEmailHistory(targetID, status, opened, reject_reason, sDate)
UsersSmsHistory(studentID, smsSendFullDate)
I am guessing most students have names and phone numbers, so indexes on those columns would not help.
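The note above about NULL comparisons can be checked on a small scale. This is only a sketch: it uses Python's sqlite3 module instead of SQL Server, a hypothetical three-row Students table, and SQLite's IFNULL in place of ISNULL. It shows that the wrapped and unwrapped predicates select the same rows.

```python
import sqlite3

# Hypothetical miniature of dbo.Students; SQLite stands in for SQL Server
# and IFNULL plays the role of ISNULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (id INTEGER PRIMARY KEY, firstName TEXT)")
conn.executemany(
    "INSERT INTO Students (id, firstName) VALUES (?, ?)",
    [(1, "Ann"), (2, ""), (3, None)],
)

# Predicate as written in the question.
wrapped = conn.execute(
    "SELECT id FROM Students WHERE IFNULL(firstName, '') != '' ORDER BY id"
).fetchall()

# Simplified predicate: NULL <> '' is not true, so the NULL row drops out anyway.
plain = conn.execute(
    "SELECT id FROM Students WHERE firstName <> '' ORDER BY id"
).fetchall()

print(wrapped, plain)
```

Both queries return only the row with a real name, which is why the function call around the column can be dropped.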

Your query looks okay, with no redundant parts. It probably takes a long time because it touches three tables that may each hold a lot of data. So instead of rewriting the query, try to improve the performance of the tables by adding indexes on columns like dbo.tblEmailHistory.studentID, dbo.Students.ID, etc.

Related

TSQL - Multiple Values in INSERT (because of joins)

I'm trying to insert data from one database into another. This is what I have so far, on the select side:
USE [db2]
SELECT
sP.pers_FirstName
,sp.pers_LastName
,sPH.Phon_Number
,CASE WHEN LEFT(sPH.Phon_Number, 2) = '04' THEN sPH.Phon_number ELSE NULL END
,CASE WHEN sp.pers_gender = 1 THEN 'M' WHEN sp.pers_gender = 2 THEN 'F' ELSE 'U' END
,CASE
WHEN sP.pers_salutation = '10' THEN 8
WHEN sp.pers_salutation = '6' THEN 2
WHEN sp.pers_salutation = '7' THEN 1
WHEN sp.pers_salutation = '8' THEN 4
WHEN sp.pers_salutation = '9' THEN 5
WHEN sp.pers_salutation = 'APROF' THEN 6
WHEN sp.pers_salutation = 'Ms.' THEN 4
WHEN sp.pers_salutation = 'PROF' THEN 6
END
,sp.pers_dob
,sp.pers_CreatedDate
,sp.pers_UpdatedDate
,'Candidate'
,1
,e.Emai_EmailAddress
,sP.pers_personID
FROM [db1].dbo.person sP
LEFT JOIN [db1].dbo.PhoneLink sPL ON sp.pers_personID = sPL.PLink_recordID
LEFT JOIN [db1].dbo.Phone sPH ON sPL.PLink_PhoneId = sPH.Phon_PhoneID
LEFT JOIN [db1].dbo.EmailLink eL ON sP.pers_personID = eL.ELink_RecordID
LEFT JOIN [db1].dbo.Email e ON eL.Elink_EmailID = e.Emai_EmailID
WHERE
(
sP.pers_employedby NOT IN (
'Aspen'
,'ACH'
)
)
OR
(
sP.pers_employedby IN (
'Aspen'
,'ACH'
)
AND sP.pers_personID NOT IN (
SELECT c.oppo_PrimaryPersonID FROM [SageCRM].dbo.Opportunity c
WHERE (c.oppo_contractcompleted <= '2016-01-01' OR c.oppo_contractterminated <= '2016-01-01') and c.Oppo_Deleted is null)
AND
sp.pers_isanemployee != 'ECHO'
AND sP.pers_personID IN (
SELECT c.oppo_PrimaryPersonID FROM [SageCRM].dbo.Opportunity c
WHERE c.oppo_Status != 'In Progress' OR c.oppo_Status = 'Completed')
AND sP.pers_dod IS NULL
AND sP.pers_FirstName NOT LIKE '%test%'
AND sP.pers_LastName NOT LIKE '%test%'
AND sp.pers_isanemployee != 'SalesContact'
)
Because each person record can have multiple phone numbers linked to it, I end up with multiple rows for each person, which obviously won't work, as I will end up with duplicates when I actually insert the data.
The problem is that I need all of the phone numbers for each record, just displayed in different fields (home phone, work phone, mobile phone).
Any ideas, other than doing this in a separate insert statement for each phone / email link?
-------- EDIT: -----------------------------------------------------------------
OK, my bad for not giving you enough information. Both of your answers were good, so thanks for that (@Horaciux, @John Wu).
However, there is no phoneType column, just a phone number. That said, since every mobile number starts with 04 and every home phone with anything else, I can pretty easily distinguish between the two phone types.
There are duplicates in the phone table though, so I will have to delete these, most likely via a CTE, which shouldn't be too hard.
So I will end up with something like this for the two phone numbers:
(SELECT phon_number FROM phone p INNER JOIN PhoneLink p1 on p1.PhoneLinkID = p.PhoneLink WHERE LEFT(p.Phon_Number, 2) = '04')
(SELECT phon_number FROM phone p INNER JOIN PhoneLink p1 on p1.PhoneLinkID = p.PhoneLink WHERE LEFT(p.Phon_Number, 2) != '04')
My duplicate removal will be something like this:
WITH CTE AS
(
SELECT phon_linkID, phon_phonNumber, ROW_NUMBER() OVER (PARTITION BY phon_phonNumber ORDER BY phon_linkID) AS RN
FROM phone
)
DELETE FROM CTE WHERE RN<>1
Two easy steps:
1. Get rid of the joins to the phone number table.
2. Look up the phone numbers per record by using a subquery in the select clause, one for each type of phone. Example:
SELECT sP.pers_FirstName,
sP.pers_LastName,
(SELECT Phon_Number FROM Phone p JOIN PhoneLink pl ON pl.PhoneLinkID = p.PhoneLinkID WHERE pl.Person_ID = sP.pers_personID AND pl.Type = 'WORK') WorkPhone,
(SELECT Phon_Number FROM Phone p JOIN PhoneLink pl ON pl.PhoneLinkID = p.PhoneLinkID WHERE pl.Person_ID = sP.pers_personID AND pl.Type = 'HOME') HomePhone
FROM person sP
Without knowing your table structure, here is an example.
select person.id,
max(case when phone.type='home' then phone.value else null end) 'home',
max(case when phone.type='work' then phone.value else null end) 'work'
from person,phone where...
group by person.id
Then use this query to join all the other tables needed.
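As a rough illustration of the conditional-aggregation idea above, here is a sketch using Python's sqlite3 module. The person and phone tables, their columns, and the sample numbers are all hypothetical. Since the edit to the question says there is no type column, the sketch classifies numbers by the '04' prefix instead. SQLite's substr stands in for LEFT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE phone (person_id INTEGER, number TEXT);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO phone VALUES
    (1, '0412345678'), (1, '0298765432'), (2, '0387654321');
""")

# One output row per person: each CASE yields NULL for the "wrong" kind of
# number, and MAX ignores NULLs, so each column keeps the matching number.
rows = conn.execute("""
SELECT p.id,
       MAX(CASE WHEN substr(ph.number, 1, 2) = '04' THEN ph.number END) AS mobile,
       MAX(CASE WHEN substr(ph.number, 1, 2) <> '04' THEN ph.number END) AS home
FROM person p
LEFT JOIN phone ph ON ph.person_id = p.id
GROUP BY p.id
ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

A person with no mobile number simply gets NULL in that column. If a person has several numbers of the same kind, MAX keeps only one of them.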

Can I modify indexes or views to improve 'is null' clause performance

I have a query that runs in 4 seconds without an is null in the WHERE, but takes almost a minute with an is null. I've read up on the performance impact of the null check, but in this case, I can't modify the query being run.
select
view_scores.*
from
view_scores
inner join licenses AS l on view_scores.studentId = l.account_id
where view_scores.archived_date is null
and l.school_id = 'aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'
and l.is_current = 1
and l.expiration_date >= SYSDATETIME()
view_scores is a view that aggregates other views of data in other tables, one of which ultimately holds the archived_date field. A null value in that field means it hasn't been archived. Again, the data structure is outside of my control. All I can currently change is the internals of the views involved and indexes on the tables. Do I have any hope of dramatically improving the null check on archived_date without changing the query or schema?
view_scores is created with this SQL
SELECT
ueh.user_id AS studentId,
vu.first_name + ' ' + vu.last_name AS studentName,
ueh.archived_date as archived_date,
MIN([ueh].[date_taken]) AS [started_date],
MAX(ueh.date_taken) AS last_date,
SUM(CAST([ueh].[actual_time] AS FLOAT) / 600000000) AS [total_time_minutes],
SUM([exercise_scores].[earned_score]) AS [earned_score],
SUM([exercise_scores].[possible_score]) AS [possible_score],
AVG([exercise_scores].[percent_score]) AS [percent_score],
COUNT(ueh.exercise_id) AS total_exercises
FROM [user_exercise_history] AS [ueh]
LEFT JOIN
(
SELECT
coding_exercise_score.exercise_id AS exercise_id,
coding_exercise_score.assessment_id AS assessment_id,
coding_exercise_score.user_id AS user_id,
coding_exercise_score.archived_date AS archived_date,
score.earned AS earned_score,
score.possible AS possible_score,
CASE score.possible
WHEN 0 THEN 0
WHEN score.earned THEN 100
ELSE 9.5 * POWER(CAST(score.earned AS DECIMAL) / score.possible * 100, 0.511)
END AS percent_score
FROM coding_exercise_score
INNER JOIN
coding_exercise_score_detail AS score_detail
ON coding_exercise_score.id = score_detail.exercise_score_id
INNER JOIN
score
ON score.id = score_detail.score_id
WHERE score_detail.is_best_score = 'True'
UNION
SELECT
mc_score.exercise_id AS exercise_id,
mc_score.assessment_id AS assessment_id,
mc_score.user_id AS user_id,
mc_score.archived_date AS archived_date,
score.earned AS earned_score,
score.possible AS possible_score,
CASE score.possible
WHEN 0 THEN 0
WHEN score.earned THEN 100
ELSE 9.5 * POWER(CAST(score.earned AS DECIMAL) / score.possible * 100, 0.511)
END AS percent_score
FROM
multiple_choice_exercise_score AS mc_score
INNER JOIN score
ON score.id = mc_score.score_id
) AS [exercise_scores]
ON
(
(ueh.exercise_id = [exercise_scores].exercise_id
AND ueh.user_id = [exercise_scores].user_id
AND (
(ueh.assessment_id IS NULL AND [exercise_scores].assessment_id IS NULL)
OR ueh.assessment_id = [exercise_scores].assessment_id
)
AND (ueh.archived_date IS NULL)
)
)
INNER JOIN entity_account AS vu ON ((ueh.user_id = vu.account_id))
INNER JOIN (
select
g.group_id,
g.entity_name,
g.entity_description,
g.created_on_date,
g.modified_date,
g.created_by,
g.modified_by,
agj.account_id
from entity_group as g
inner join
account_group_join as agj
on agj.group_id = g.group_id
where g.entity_name <> 'Administrators'
and g.entity_name <> 'Group 1'
and g.entity_name <> 'Group 2'
and g.entity_name <> 'Group 3'
and g.entity_name <> 'Group 4'
and g.entity_name <> 'Group 5'
) AS g ON ueh.user_id = g.account_id
WHERE ueh.status = 'Completed'
GROUP BY ueh.user_id, vu.first_name, vu.last_name, ueh.archived_date
user_exercise_history.archived_date (selected as archived_date) is the field that the null check is ultimately executed against. I can modify the view in any way I want and index in any way I want, but that's about it.
The execution plan with the null check in it includes a pretty crazy set of sorting and Hash Matches that pertain to the score and coding_exercise_score_detail.
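You can put an index on a view; see "Create Indexed Views" in the SQL Server documentation.
Try an index on view_scores.archived_date. Note that indexed views are restricted (for example, no outer joins, UNION, or derived tables), so this view would need restructuring before it could be indexed.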
Generally, the columns involved in a JOIN ... ON condition, a WHERE clause, or an ORDER BY are good candidates for indexing. Since view_scores is a view, check whether the archived_date column in the underlying table is indexed. If not, consider creating an index on that column.
You may also consider adding that condition to the view creation logic itself:
view_scores.archived_date is null
ON ueh.exercise_id = [exercise_scores].exercise_id
AND ueh.user_id = [exercise_scores].user_id
AND ueh.archived_date IS NULL
AND ( ( ueh.assessment_id IS NULL
AND [exercise_scores].assessment_id IS NULL
)
OR ueh.assessment_id = [exercise_scores].assessment_id
)
I would look at this part.
An OR in a join condition is typically slow.
Pick an ID that will not be used and compare against it instead:
ON ueh.exercise_id = [exercise_scores].exercise_id
AND ueh.user_id = [exercise_scores].user_id
AND ueh.archived_date IS NULL
AND isnull(ueh.assessment_id, -1) = isnull([exercise_scores].assessment_id, -1)
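To see that the sentinel form matches the same rows as the OR form, here is a small sketch with Python's sqlite3 module. The history and scores tables and their data are hypothetical, IFNULL stands in for ISNULL, and the sketch assumes -1 never occurs as a real assessment_id. It only checks that the results agree; it says nothing about speed on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE history (exercise_id INTEGER, assessment_id INTEGER);
CREATE TABLE scores (exercise_id INTEGER, assessment_id INTEGER, earned INTEGER);
INSERT INTO history VALUES (1, NULL), (2, 7);
INSERT INTO scores VALUES (1, NULL, 10), (2, 7, 20), (2, 8, 30);
""")

# Original shape: NULL matches NULL through an explicit OR branch.
or_form = conn.execute("""
SELECT h.exercise_id, s.earned
FROM history h
JOIN scores s
  ON h.exercise_id = s.exercise_id
 AND ((h.assessment_id IS NULL AND s.assessment_id IS NULL)
      OR h.assessment_id = s.assessment_id)
ORDER BY h.exercise_id
""").fetchall()

# Sentinel shape: both sides map NULL to -1, so a plain equality suffices.
sentinel_form = conn.execute("""
SELECT h.exercise_id, s.earned
FROM history h
JOIN scores s
  ON h.exercise_id = s.exercise_id
 AND IFNULL(h.assessment_id, -1) = IFNULL(s.assessment_id, -1)
ORDER BY h.exercise_id
""").fetchall()

print(or_form, sentinel_form)
```

Whether the rewritten condition is actually faster depends on the execution plan, so it is worth comparing plans before and after.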

sqlite query not getting all records if 1 table has missing data

I've got a very complex database with a lot of tables in SQLite. I'm trying to design a query that will report out a lot of data from those tables and also report out those sheep who may not have a record in one or more tables.
My query is:
SELECT sheep_table.sheep_id,
(SELECT tag_number FROM id_info_table WHERE official_id = "1" AND id_info_table.sheep_id = sheep_table.sheep_id AND (tag_date_off IS NULL or tag_date_off = '')) AS fedtag,
(SELECT tag_number FROM id_info_table WHERE tag_type = "4" AND id_info_table.sheep_id = sheep_table.sheep_id AND (tag_date_off IS NULL or tag_date_off = '')) AS farmtag,
(SELECT tag_number FROM id_info_table WHERE tag_type = "2" AND id_info_table.sheep_id = sheep_table.sheep_id AND (tag_date_off IS NULL or tag_date_off = '') and ( id_info_table.official_id is NULL or id_info_table.official_id = 0 )) AS eidtag,
sheep_table.sheep_name, codon171_table.codon171_alleles, sheep_ebv_table.usa_maternal_index, sheep_ebv_table.self_replacing_carcass_index, cluster_table.cluster_name, sheep_evaluation_table.id_evaluationid,
(sheep_table.birth_type +
sheep_table.codon171 +
sheep_evaluation_table.trait_score01 +
sheep_evaluation_table.trait_score02 +
sheep_evaluation_table.trait_score03 +
sheep_evaluation_table.trait_score04 +
sheep_evaluation_table.trait_score05 +
sheep_evaluation_table.trait_score06 +
sheep_evaluation_table.trait_score07 +
sheep_evaluation_table.trait_score08 +
sheep_evaluation_table.trait_score09 +
sheep_evaluation_table.trait_score10 +
(sheep_evaluation_table.trait_score11 / 10 )) as overall_score, sheep_evaluation_table.sheep_rank, sheep_evaluation_table.number_sheep_ranked,
sheep_table.alert01,
sheep_table.birth_date, sheep_sex_table.sex_abbrev, birth_type_table.birth_type,
sire_table.sheep_name as sire_name, dam_table.sheep_name as dam_name
FROM sheep_table
join codon171_table on sheep_table.codon171 = codon171_table.id_codon171id
join sheep_cluster_table on sheep_table.sheep_id = sheep_cluster_table.sheep_id
join cluster_table on cluster_table.id_clusternameid = sheep_cluster_table.which_cluster
join birth_type_table on sheep_table.birth_type = birth_type_table.id_birthtypeid
join sheep_sex_table on sheep_table.sex = sheep_sex_table.sex_sheepid
join sheep_table as sire_table on sheep_table.sire_id = sire_table.sheep_id
join sheep_table as dam_table on sheep_table.dam_id = dam_table.sheep_id
left outer join sheep_ebv_table on sheep_table.sheep_id = sheep_ebv_table.sheep_id
left outer join sheep_evaluation_table on sheep_table.sheep_id = sheep_evaluation_table.sheep_id
WHERE (sheep_table.remove_date IS NULL or sheep_table.remove_date is '' )
and (eval_date > "2014-10-03%" and eval_date < "2014-11%")
and sheep_ebv_table.ebv_date = "2014-11-01"
order by sheep_sex_table.sex_abbrev asc, cluster_name asc, self_replacing_carcass_index desc, usa_maternal_index desc, overall_score desc
If a given sheep does not have a record in the evaluation table or does not have a record in the EBV table no record is returned. I need all the current animals returned with all available data on them and just leave the fields for EBVs and evaluations null if they have no data.
I'm not understanding why I'm not getting them all since none of the sheep have all 3 ID types (federal, farm and EID) so there are nulls in those fields and I was expecting nulls in the evaluation sum and ebv fields as well.
Totally lost in what to do to fix it.
The problem appears to be that you're using eval_date in the WHERE clause. I'm assuming that eval_date is in the sheep_evaluation_table, so when you use it in WHERE, it gets rid of any rows where eval_date is NULL, which it would be when you're using a LEFT OUTER JOIN and there's no matching record in sheep_evaluation_table. The ebv_date condition on sheep_ebv_table has the same effect, so it needs the same treatment.
Try putting the eval_date filter on the join instead, like this:
left outer join sheep_evaluation_table on sheep_table.sheep_id = sheep_evaluation_table.sheep_id
AND (eval_date > "2014-10-03%" and eval_date < "2014-11%")
WHERE (sheep_table.remove_date IS NULL or sheep_table.remove_date is '' )
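The difference between filtering in WHERE and filtering in the join condition can be reproduced with two tiny tables. This is a sketch using Python's sqlite3 module; the sheep and evaluation tables and their rows are hypothetical stand-ins for sheep_table and sheep_evaluation_table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sheep (sheep_id INTEGER PRIMARY KEY, sheep_name TEXT);
CREATE TABLE evaluation (sheep_id INTEGER, eval_date TEXT, sheep_rank INTEGER);
INSERT INTO sheep VALUES (1, 'Dolly'), (2, 'Shaun');
INSERT INTO evaluation VALUES (1, '2014-10-15', 3);
""")

# Filter in WHERE: the NULL eval_date produced by the outer join fails the
# comparison, so the sheep without an evaluation disappears.
filter_in_where = conn.execute("""
SELECT s.sheep_name, e.sheep_rank
FROM sheep s
LEFT OUTER JOIN evaluation e ON e.sheep_id = s.sheep_id
WHERE e.eval_date > '2014-10-03' AND e.eval_date < '2014-11'
ORDER BY s.sheep_id
""").fetchall()

# Filter in ON: the condition only decides which evaluation rows match;
# every sheep is still returned.
filter_in_on = conn.execute("""
SELECT s.sheep_name, e.sheep_rank
FROM sheep s
LEFT OUTER JOIN evaluation e
  ON e.sheep_id = s.sheep_id
 AND e.eval_date > '2014-10-03' AND e.eval_date < '2014-11'
ORDER BY s.sheep_id
""").fetchall()

print(filter_in_where)
print(filter_in_on)
```

The first query returns only the evaluated sheep. The second returns both, with NULL in the rank column for the one that has no evaluation.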

Issue with GROUP-BY using alias SELECT

I am using InterBase and struggle to put my query together.
This is my current query:
SELECT a.employee_no, ea.comment, ea.status as EmpadStatus, a.advices_value
FROM advices a
JOIN EMPAD EA ON (A.CODE = EA.CODE and ea.employee_no = A.employee_no ) AND EA.SUPER_FUND_CODE NOT IN ('000038','000113', '', ' ')
JOIN ALLDED ad ON AD.CODE = EA.CODE
WHERE a.employee_no = 13844 and a.advices_date between '1.10.2014' and '31.10.2014'
My result is this:
Employee_No Comment EmpadStatus Advices_value
1 aaa 0 10.20
1 1 30.50
1 bbb 0 69.30
What I need to do is to display Employee_no, Comment and SUM of Advices_Value, but Comment has to be first comment where empad.status = 0.
I was trying to use an alias, but I know that you can't group by it, so this query is not going to work:
SELECT a.employee_no, SUM(a.advices_value), (SELECT DISTINCT ea.comment
FROM EMPAD ea
JOIN allded ad on ea.code = ad.code
WHERE ea.employee_no = a.employee_no and ea.status = 0 and ad.super_type = 1) as comment
FROM advices a
JOIN EMPAD EA ON (A.CODE = EA.CODE and ea.employee_no = A.employee_no ) AND EA.SUPER_FUND_CODE NOT IN ('000038','000113', '', ' ')
JOIN ALLDED ad ON AD.CODE = EA.CODE
WHERE a.employee_no = 13844 and a.advices_date between '1.10.2014' and '31.10.2014'
GROUP BY a.employee_no, comment
So I need result like this :
Employee_no Comment Total
1 aaa 110.00
I normally work with MySQL, but I believe what I am going to say also applies to InterBase.
You could use a CASE or an IF, whichever exists in your database, inside a MIN, MAX or any other group function.
MIN(IF(EA.status = 0, EA.comment, NULL))
or
MIN(CASE EA.status WHEN 0 THEN EA.comment ELSE NULL END)
This works because group functions ignore NULL values.
The only catch is that, if there is more than one row with status = 0, you can't really choose between the first and the last one (you can't control the order in which a group function processes rows unless the function itself supports ORDER BY), unless your database has a group function that does the trick.
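Here is a minimal sketch of the MIN(CASE ...) idea, run with Python's sqlite3 module rather than InterBase. The advice_rows table is hypothetical: it flattens the joined result shown in the question into one table so the example stays self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advice_rows (
    employee_no INTEGER, comment TEXT, status INTEGER, advices_value REAL
);
INSERT INTO advice_rows VALUES
    (1, 'aaa', 0, 10.20),
    (1, '',    1, 30.50),
    (1, 'bbb', 0, 69.30);
""")

# CASE turns comments on rows with status <> 0 into NULL; MIN skips NULLs
# and returns the smallest remaining comment, while SUM adds every row.
row = conn.execute("""
SELECT employee_no,
       MIN(CASE status WHEN 0 THEN comment ELSE NULL END) AS comment,
       SUM(advices_value) AS total
FROM advice_rows
GROUP BY employee_no
""").fetchone()

print(row[0], row[1], round(row[2], 2))
```

Note that MIN picks the alphabetically smallest comment among the status = 0 rows. That happens to be 'aaa' here, but it is not guaranteed to be the first row in any other sense.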

TOP Returning null

I have the following view below. The second nested select is always returning null when I use the TOP(1) clause, but when I remove this clause it returns the data as expected, just more rows than is needed. Does anyone see anything that would explain this?
SELECT TOP (100) PERCENT
a.ITEMID AS Model
,id.CONFIGID
,id.INVENTSITEID AS SiteId
,id.INVENTSERIALID AS Serial
,it.ITEMNAME AS Description
,CASE WHEN it.DIMGROUPID LIKE '%LR-Y' THEN 'Y'
ELSE 'N'
END AS SerialNumberReqd
,ISNULL(it.PRIMARYVENDORID, N'') AS Vendor
,ISNULL(vt.NAME, N'') AS VendorName
,id.INVENTLOCATIONID AS Warehouse
,id.WMSLOCATIONID AS Bin
,ISNULL(CONVERT(varchar(12), CASE WHEN C.DatePhysical < '1901-01-01'
THEN NULL
ELSE C.DatePhysical
END, 101), N' ') AS DeliveryDate
,CASE WHEN (a.RESERVPHYSICAL > 0
OR C.StatusIssue = 1)
AND c.TransType = 0 THEN C.PONumber
ELSE ''
END AS SoNumber
,'' AS SoDetail
,ISNULL(C.PONumber, N'') AS RefNumber
,ISNULL(CONVERT(varchar(12), CASE WHEN ins.ProdDate < '1901-01-01'
THEN NULL
ELSE ins.PRODDATE
END, 101), N' ') AS DateReceived
,it.STKSTORISGROUPID AS ProdGroup
,ISNULL(CONVERT(varchar(12), CASE WHEN ins.ProdDate < '1901-01-01'
THEN NULL
ELSE ins.PRODDATE
END, 101), N' ') AS ProductionDate
,it.ITEMGROUPID
,it.STKSTORISGROUPID AS MerchandisingGroup
,CASE WHEN a.postedValue = 0
THEN (CASE WHEN D.CostAmtPosted = 0 THEN D.CostAmtPhysical
ELSE D.CostAmtPosted
END)
ELSE a.POSTEDVALUE
END AS Cost
,CASE WHEN a.PHYSICALINVENT = 0 THEN a.Picked
ELSE a.PhysicalInvent
END AS PhysicalOnHand
,ins.STKRUGSQFT AS RugSqFt
,ins.STKRUGVENDSERIAL AS RugVendSerial
,ins.STKRUGVENDDESIGN AS RugVendDesign
,ins.STKRUGEXACTSIZE AS RugExactSize
,ins.STKRUGCOUNTRYOFORIGIN AS RugCountryOfOrigin
,ins.STKRUGQUALITYID AS RugQualityId
,ins.STKRUGCOLORID AS RugColorId
,ins.STKRUGDESIGNID AS RugDesignId
,ins.STKRUGSHAPEID AS RugShapeId
,CASE WHEN (a.AVAILPHYSICAL > 0) THEN 'Available'
WHEN (id.WMSLOCATIONID = 'NIL') THEN 'Nil'
WHEN (a.RESERVPHYSICAL > 0)
AND (c.TransType = 0) THEN 'Committed'
WHEN (a.RESERVPHYSICAL > 0) THEN 'Reserved'
WHEN (id.WMSLOCATIONID LIKE '%-Q') THEN 'Damaged'
WHEN (a.Picked > 0) THEN 'Picked'
ELSE 'UNKNOWN'
END AS Status
,'' AS ReasonCode
,'' AS BaseModel
,ISNULL(CAST(ins.STKSTORISCONFIGINFO AS nvarchar(1000)), N'') AS StorisConfigInfo
,ISNULL(C.ConfigSummary, N'') AS ConfigSummary
FROM
dbo.INVENTSUM AS a WITH (NOLOCK)
INNER JOIN dbo.INVENTDIM AS id WITH (NOLOCK)
ON id.DATAAREAID = a.DATAAREAID
AND id.INVENTDIMID = a.INVENTDIMID
LEFT OUTER JOIN dbo.INVENTTABLE AS it WITH (NOLOCK)
ON it.DATAAREAID = a.DATAAREAID
AND it.ITEMID = a.ITEMID
LEFT OUTER JOIN dbo.VENDTABLE AS vt WITH (NOLOCK)
ON vt.DATAAREAID = it.DATAAREAID
AND vt.ACCOUNTNUM = it.PRIMARYVENDORID
LEFT OUTER JOIN dbo.INVENTSERIAL AS ins WITH (NOLOCK)
ON ins.DATAAREAID = id.DATAAREAID
AND ins.INVENTSERIALID = id.INVENTSERIALID
LEFT OUTER JOIN (SELECT TOP (1)
itt.ITEMID
,invt.INVENTSERIALID
,itt.DATEPHYSICAL AS DatePhysical
,itt.TRANSREFID AS PONumber
,itt.TRANSTYPE AS TransType
,itt.STATUSISSUE AS StatusIssue
,dbo.stkRowsToColumn(itt.INVENTTRANSID, 'STI') AS ConfigSummary
,itt.RECID
FROM
dbo.INVENTTRANS AS itt WITH (NOLOCK)
INNER JOIN dbo.INVENTDIM AS invt WITH (NOLOCK)
ON invt.DATAAREAID = itt.DATAAREAID
AND invt.INVENTDIMID = itt.INVENTDIMID
WHERE
(itt.DATAAREAID = 'STI')
AND (itt.TRANSTYPE IN (0, 2, 3, 8))
AND (invt.INVENTSERIALID <> '')
ORDER BY
itt.RECID DESC) AS C
ON C.ITEMID = a.ITEMID
AND C.INVENTSERIALID = id.INVENTSERIALID
LEFT OUTER JOIN (SELECT TOP (1)
itt2.ITEMID
,invt2.INVENTSERIALID
,itt2.COSTAMOUNTPOSTED AS CostAmtPosted
,itt2.COSTAMOUNTPHYSICAL + itt2.COSTAMOUNTADJUSTMENT AS CostAmtPhysical
,itt2.RECID
FROM
dbo.INVENTTRANS AS itt2 WITH (NOLOCK)
INNER JOIN dbo.INVENTDIM AS invt2 WITH (NOLOCK)
ON invt2.DATAAREAID = itt2.DATAAREAID
AND invt2.INVENTDIMID = itt2.INVENTDIMID
WHERE
(itt2.DATAAREAID = 'STI')
AND (itt2.TRANSTYPE IN (0, 2, 3, 4, 6, 8))
AND (invt2.INVENTSERIALID <> '')
ORDER BY
itt2.RECID DESC) AS D
ON D.ITEMID = a.ITEMID
AND D.INVENTSERIALID = id.INVENTSERIALID
WHERE
(a.DATAAREAID = 'STI')
AND (a.CLOSED = 0)
AND (a.PHYSICALINVENT > 0)
AND (it.ITEMGROUPID LIKE 'FG-%'
OR it.ITEMGROUPID = 'MULTISHIP')
ORDER BY
SiteId
,Warehouse
Presumably, the top value in the subquery doesn't meet the subsequent join conditions. That is, this condition is not met:
D.ITEMID = a.ITEMID AND D.INVENTSERIALID = id.INVENTSERIALID
You are using a left outer join, so NULL values are filled in.
EDIT:
To reiterate: when you run it with TOP 1, there are no matching values (for at least some combinations of the two join columns), so NULL is filled in for them. After all, TOP 1 (with or without the parentheses) returns only one row.
When you run it returning multiple rows, presumably there are matches. For the rows that match, the corresponding values are put in. This is the way that a left outer join works.
Gordon's answer is correct as to why I was getting a few rows when removing TOP and none when I had it. The subquery in question was returning all the rows in the InventTrans table (5 million+), so when I used TOP, it was just getting the first row, which didn't match anything. I realized this was the case when I was trying random high values (e.g. 50000) in the TOP clause.
The ultimate fix was to change the left outer joins on the C and D subqueries to CROSS APPLY, and then change the WHERE clauses to better filter the table (e.g. itt.itemid = a.itemid and invt1.inventserialid = id.inventserialid). Using that, I was able to use TOP 1 as expected.
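The effect described above can be reproduced in miniature. This sketch uses Python's sqlite3 module with hypothetical items and trans tables. SQLite has no TOP or CROSS APPLY, so LIMIT 1 stands in for TOP (1) and a correlated scalar subquery stands in for the per-row lookup. That subquery behaves more like OUTER APPLY, since a row with no match still appears, with NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (item_id TEXT PRIMARY KEY);
CREATE TABLE trans (rec_id INTEGER PRIMARY KEY, item_id TEXT, po_number TEXT);
INSERT INTO items VALUES ('A'), ('B');
INSERT INTO trans VALUES (1, 'A', 'PO-1'), (2, 'B', 'PO-2'), (3, 'B', 'PO-3');
""")

# Uncorrelated derived table: LIMIT 1 keeps a single row for the whole
# table, so only the item that happens to own that row finds a match.
global_top = conn.execute("""
SELECT i.item_id, c.po_number
FROM items i
LEFT JOIN (SELECT item_id, po_number
           FROM trans
           ORDER BY rec_id DESC
           LIMIT 1) AS c
  ON c.item_id = i.item_id
ORDER BY i.item_id
""").fetchall()

# Correlated lookup: the filter on item_id sits inside the subquery, so
# LIMIT 1 is applied once per outer row.
per_row_top = conn.execute("""
SELECT i.item_id,
       (SELECT t.po_number
        FROM trans t
        WHERE t.item_id = i.item_id
        ORDER BY t.rec_id DESC
        LIMIT 1) AS po_number
FROM items i
ORDER BY i.item_id
""").fetchall()

print(global_top)
print(per_row_top)
```

In the first result item A gets NULL, because the single surviving row belongs to item B. In the second, each item gets its own latest transaction.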