SQL Where clause greatly increases query time - sql

I have a table that I do some joins and operations on. The table has about 150,000 rows, and if I select everything the query returns in about 10 seconds. If I wrap the query in a derived table and filter out all the rows where a certain field is NULL, the query takes 10 minutes to run. Is it supposed to be like this, or is there any way to fix it? Here is the query:
SELECT *
FROM
(
Select
I.Date_Created
,I.Company_Code
,I.Division_Code
,I.Invoice_Number
,Sh.CUST_PO
,I.Total_Quantity
,ID.Total
,SH.Ship_City City
,CASE WHEN SH.Ship_Cntry <> 'US' THEN 'INT' ELSE SH.Ship_prov END State
,SH.Ship_Zip Zip
,SH.Ship_Cntry Country
,S.CustomerEmail
from [JMNYC-AMTDB].[AMTPLUS].[dbo].Invoices I (nolock)
LEFT JOIN (SELECT
ID.Company_Code
,ID.Division_Code
,ID.Invoice_Number
,SUM (ID.Price* ID.Quantity) Total
FROM [JMNYC-AMTDB].[AMTPLUS].[dbo].Invoices_Detail ID (nolock)
GROUP BY ID.Company_Code, ID.Division_Code, ID.Invoice_Number) ID
ON I.Company_Code = ID.Company_Code
AND I.Division_Code = ID.Division_Code
AND I.Invoice_Number = ID.Invoice_Number
LEFT JOIN
[JMDNJ-ACCELSQL].[A1WAREHOUSE].[dbo].SHIPHIST SH (nolock) ON I.Pickticket_Number = SH.Packslip
LEFT JOIN
[JMDNJ-ACCELSQL].[A1WAREHOUSE].[dbo].[MagentoCustomerEmailData] S on SH.CUST_PO = S.InvoiceNumber
Where I.Company_Code ='09' AND I.Division_Code = '001'
AND I.Customer_Number = 'ECOM2X'
)T
Where T.CustomerEmail IS NOT NULL -- This is the problematic line
Order By T.Date_Created desc

If you are aware of the index considerations and are sure this is the problem point, you can add an index to improve it:
USE A1WAREHOUSE;
GO
CREATE NONCLUSTERED INDEX IX_MagentoCustomerEmailData_CustomerEmail
ON [dbo].[MagentoCustomerEmailData] (CustomerEmail ASC);
GO
In general, consider adding indexes on the columns used in the ORDER BY, WHERE, GROUP BY, and ON clauses. Before adding indexes, be sure you are aware of the consequences: each index takes extra storage and adds overhead to inserts, updates, and deletes.
Read more about indexes:
https://www.mssqltips.com/sqlservertutorial/9133/sql-server-nonclustered-indexes/
https://www.itprotoday.com/sql-server/indexing-dos-and-don-ts
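As a rough illustration of indexing a column used in an ON clause, the sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server. That substitution is an assumption made only so the example is self-contained; the table and index names are hypothetical, and SQL Server's optimizer and plan output differ. It creates an index on the join column and then asks SQLite which access path it chose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for the tables in the question.
    CREATE TABLE shiphist (packslip TEXT, cust_po TEXT);
    CREATE TABLE customer_email (invoice_number TEXT, email TEXT);
    -- Index the column that appears in the ON clause of the join.
    CREATE INDEX ix_customer_email_invoice ON customer_email (invoice_number);
""")

plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT sh.packslip, ce.email
    FROM shiphist sh
    LEFT JOIN customer_email ce ON sh.cust_po = ce.invoice_number
""").fetchall()

# The last column of each plan row is a human-readable description.
plan_details = [row[-1] for row in plan]
for detail in plan_details:
    print(detail)
```

If the index is usable, its name should appear in one of the printed plan lines. On SQL Server the equivalent check is the execution plan, which requires SHOWPLAN permission.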

Related

SQL View slow when filtered. Is there a clean way to improve performance?

Let me open with:
SHOWPLAN permission denied in database 'MyDatabase'.
With that out of the way, I'll lay out my situation.
The database I work with has a view that executes fairly quickly.
SELECT * FROM MyView
returns 32 rows in 1 second and includes a non-indexed column of values (IDs) I need to filter on.
If I filter on these IDs directly in the view:
SELECT * FROM MyView WHERE MyView.SomeId = 18
Things slow immensely and it takes 21 seconds to return the 20 rows with that ID.
As an experiment, I pushed the unfiltered results into a temporary table and executed the filtered query on the temporary table:
IF OBJECT_ID('tempdb..#TEMP_TABLE') IS NOT NULL
BEGIN
DROP TABLE #TEMP_TABLE
END
SELECT * INTO #TEMP_TABLE
FROM MyView;
SELECT *
FROM #TEMP_TABLE
WHERE #TEMP_TABLE.SomeId = 18
DROP TABLE #TEMP_TABLE
I found that it returns the filtered results far faster (roughly 1 second).
Is there a cleaner syntax or pattern that can be implemented to achieve the same performance?
UPDATE: View Definition and Description
Manually obfuscated, but I was careful so hopefully there aren't many errors. Still waiting on SHOWPLAN permissions, so Execution Plan is still pending.
The view's purpose is to provide a count of all the records that belong to a specific component (CMP.COMPONENT_ID = '100') grouped by location.
"Belonging" is determined by the record's PROC_CODE (mapped through PROC_ID) being within the CMP's inclusion range (CMP_INCs) and not in the CMP's exclusion range (CMP_EXCs).
In practice, exclusion ranges are created for individual codes (the bounds are always equal), making it sufficient to check that the code is not equal to a bound.
PROC_CODEs can (but don't always) have an alphabetic prefix or suffix, which makes the ISNUMERIC() comparison necessary.
Records store PROC_IDs for their PROC_CODEs, so it's necessary to convert the CMP's PROC_CODE ranges into a set of PROC_IDs for identifying which records belong to that component.
The performance issue occurs when trying to filter by DEPARTMENT_ID or LOCATION_ID.
[CO_RECORDS] is also a view, but if it's that deep I'm going to turf this to someone with less red tape to fight through.
CREATE VIEW [ViewsSchema].[MyView] AS
WITH
CMP_INCs AS (SELECT RNG.*, COALESCE(RNG.RANGE_END, RNG.RANGE_BEG) [SAFE_END] FROM DBEngine.DBO.DB_CMP_RANGE [RNG] WHERE [RNG].COMPONENT_ID = '100'),
CMP_EXCs AS (SELECT CER.* FROM DBEngine.DBO.DB_CMP_EXC_RANGE CER WHERE CER.COMPONENT_ID = '100'),
CMP_PROC_IDs AS (
SELECT
DBEngine_ProcTable.PROC_ID [CMP_PROC_ID],
DBEngine_ProcTable.PROC_CODE [CMP_PROC_CODE],
DB_CmpTable.COMPONENT_ID [CMP_ID],
MAX(DB_CmpTable.COMPONENT_NAME) [CMP_NAME]
FROM [DBEngine].DBO.DBEngine_ProcTable DBEngine_ProcTable
LEFT JOIN CMP_INCs ON ISNUMERIC(DBEngine_ProcTable.PROC_CODE) = ISNUMERIC(CMP_INCs.RANGE_BEG)
AND(DBEngine_ProcTable.PROC_CODE = CMP_INCs.RANGE_BEG
OR DBEngine_ProcTable.PROC_CODE BETWEEN CMP_INCs.RANGE_BEG AND CMP_INCs.SAFE_END)
INNER JOIN DBEngine.DBO.DB_CmpTable ON CMP_INCs.COMPONENT_ID = DB_CmpTable.COMPONENT_ID
LEFT JOIN CMP_EXCs EXCS ON EXCS.COMPONENT_ID = DB_CmpTable.COMPONENT_ID AND EXCS.EXCL_RANGE_END = DBEngine_ProcTable.PROC_CODE
WHERE EXCS.EXCL_RANGE_BEG IS NULL
GROUP BY
DBEngine_ProcTable.PROC_ID,
DBEngine_ProcTable.PROC_CODE,
DBEngine_ProcTable.BILL_DESC,
DBEngine_ProcTable.PROC_NAME,
DB_CmpTable.COMPONENT_ID
)
SELECT
[RECORDS].LOCATION_NAME [LOCATION_NAME]
, [RECORDS].LOCATION_ID [LOCATION_ID]
, MAX([RECORDS].[Department]) [DEPARTMENT]
, [RECORDS].[Department ID] [DEPARTMENT_ID]
, SUM([RECORDS].PROCEDURE_QUANTITY) [PROCEDURE_COUNT]
FROM DBEngineCUSTOMRPT.ViewsSchema.CO_RECORDS [RECORDS]
INNER JOIN CMP_PROC_IDs [CV] ON [CV].CMP_PROC_ID = [RECORDS].PROC_ID
CROSS JOIN (SELECT DATEADD(M, DATEDIFF(M, 0,GETDATE()), 0) [FIRSTOFTHEMONTH]) VARS
WHERE [RECORDS].TYPE = 1
AND ([RECORDS].VOID_DATE IS NULL OR [RECORDS].VOID_DATE >= VARS.[FIRSTOFTHEMONTH] )
AND [RECORDS].POST_DATE < VARS.[FIRSTOFTHEMONTH]
AND [RECORDS].DOS_MONTHS_BACK = 2
GROUP BY [RECORDS].LOCATION_NAME, [RECORDS].LOCATION_ID, [RECORDS].[Department ID]
GO
Based on the swift down votes, the answer to my question is
'No, there is not a clean syntax based solution for the improved
performance, and asking for one is ignorant of the declarative nature
of SQL you simple dirty plebeian'.
From the requests for the view's definition, it's clear that performance issues in simple queries should be addressed by fixing the structure of the objects being queried ('MyView' in this case) rather than syntactical gymnastics.
For interested parties the issue was resolved by adding a Row_Number() column to the final select in the view definition, wrapping it in a CTE, and using the new column in an always true filter while selecting the original columns.
I have no idea if this is the optimal solution. It doesn't feel good to me, but it appears to be working.
CREATE VIEW [ViewsSchema].[MyView] AS
WITH
CMP_INCs AS (SELECT RNG.*, COALESCE(RNG.RANGE_END, RNG.RANGE_BEG) [SAFE_END] FROM DBEngine.DBO.DB_CMP_RANGE [RNG] WHERE [RNG].COMPONENT_ID = '100'),
CMP_EXCs AS (SELECT CER.* FROM DBEngine.DBO.DB_CMP_EXC_RANGE CER WHERE CER.COMPONENT_ID = '100'),
CMP_PROC_IDs AS (
SELECT
DBEngine_ProcTable.PROC_ID [CMP_PROC_ID],
DBEngine_ProcTable.PROC_CODE [CMP_PROC_CODE],
DB_CmpTable.COMPONENT_ID [CMP_ID],
MAX(DB_CmpTable.COMPONENT_NAME) [CMP_NAME]
FROM [DBEngine].DBO.DBEngine_ProcTable DBEngine_ProcTable
LEFT JOIN CMP_INCs ON ISNUMERIC(DBEngine_ProcTable.PROC_CODE) = ISNUMERIC(CMP_INCs.RANGE_BEG)
AND(DBEngine_ProcTable.PROC_CODE = CMP_INCs.RANGE_BEG
OR DBEngine_ProcTable.PROC_CODE BETWEEN CMP_INCs.RANGE_BEG AND CMP_INCs.SAFE_END)
INNER JOIN DBEngine.DBO.DB_CmpTable ON CMP_INCs.COMPONENT_ID = DB_CmpTable.COMPONENT_ID
LEFT JOIN CMP_EXCs EXCS ON EXCS.COMPONENT_ID = DB_CmpTable.COMPONENT_ID AND EXCS.EXCL_RANGE_END = DBEngine_ProcTable.PROC_CODE
WHERE EXCS.EXCL_RANGE_BEG IS NULL
GROUP BY
DBEngine_ProcTable.PROC_ID,
DBEngine_ProcTable.PROC_CODE,
DBEngine_ProcTable.BILL_DESC,
DBEngine_ProcTable.PROC_NAME,
DB_CmpTable.COMPONENT_ID
),
RESULTS as (
SELECT
[RECORDS].LOCATION_NAME [LOCATION_NAME]
, [RECORDS].LOCATION_ID [LOCATION_ID]
, MAX([RECORDS].[Department]) [DEPARTMENT]
, [RECORDS].[Department ID] [DEPARTMENT_ID]
, SUM([RECORDS].PROCEDURE_QUANTITY) [PROCEDURE_COUNT]
-- The ordering here is arbitrary; [ROW] is only used in the always-true filter below
, ROW_NUMBER() OVER (ORDER BY [RECORDS].[Department ID], [RECORDS].LOCATION_NAME) [ROW]
FROM DBEngineCUSTOMRPT.ViewsSchema.CO_RECORDS [RECORDS]
INNER JOIN CMP_PROC_IDs [CV] ON [CV].CMP_PROC_ID = [RECORDS].PROC_ID
CROSS JOIN (SELECT DATEADD(M, DATEDIFF(M, 0,GETDATE()), 0) [FIRSTOFTHEMONTH]) VARS
WHERE [RECORDS].TYPE = 1
AND ([RECORDS].VOID_DATE IS NULL OR [RECORDS].VOID_DATE >= VARS.[FIRSTOFTHEMONTH] )
AND [RECORDS].POST_DATE < VARS.[FIRSTOFTHEMONTH]
AND [RECORDS].DOS_MONTHS_BACK = 2
GROUP BY [RECORDS].LOCATION_NAME, [RECORDS].LOCATION_ID, [RECORDS].[Department ID]
)
SELECT
[LOCATION_NAME]
, [LOCATION_ID]
, [DEPARTMENT]
, [DEPARTMENT_ID]
, [PROCEDURE_COUNT]
FROM RESULTS
WHERE [ROW] > 0
GO

slow performance with exists case statement

Essentially, I am trying to see whether c_DSS_PG_Submission.PtNum is in the ED_MLP_ATTN temp table and, if it is, assign 'MLP+ATTN'. The temp table alone takes about 2 minutes to generate and has ~1000 rows; the PG table has ~300 rows, so these are not big tables. However, the query below runs for 20+ minutes. Would you recommend anything different with the query? I've tried changing EXISTS to IN, but I get the same slow performance.
WITH ed_mlp_attn
AS ( SELECT smsdss.c_cfvhs_emstat_chart.visitno ,
CAST (smsdss.c_cfvhs_emstat_chart.dschdate AS DATE) AS dschdate
FROM smsdss.c_cfvhs_emstat_chart
INNER JOIN smsdss.c_cfvhs_emstat_oi_header ON smsdss.c_cfvhs_emstat_chart.chrtno = c_cfvhs_emstat_oi_header.chartno
COLLATE SQL_Latin1_General_Pref_CP1_CI_AS
INNER JOIN smsdss.c_cfvhs_emstat_oi_detail ON c_cfvhs_emstat_oi_header.oi_header_id = smsdss.c_cfvhs_emstat_oi_detail.oi_header_id
INNER JOIN smsdss.c_cfvhs_emstat_physician ON smsdss.c_cfvhs_emstat_chart.erphys = smsdss.c_cfvhs_emstat_physician.physid
COLLATE SQL_Latin1_General_Pref_CP1_CI_AS
WHERE smsdss.c_cfvhs_emstat_chart.dschdate >= DATEADD(mm, -1,
GETDATE())
AND smsdss.c_cfvhs_emstat_chart.dispocd <> 'DXERR'
AND c_cfvhs_emstat_oi_detail.VALUE IN ( '21504',
'21505' )
AND smsdss.c_cfvhs_emstat_physician.code1 = 'RES'
)
SELECT atndrname ,
atndrno ,
CASE WHEN EXISTS ( SELECT 1
FROM ed_mlp_attn
WHERE visitno COLLATE SQL_Latin1_General_Pref_CP1_CI_AS = smsdss.c_dss_pg_submission.ptnum )
THEN 'MLP+ATTN'
ELSE 'NO'
END AS ed_prov_type
FROM smsdss.c_dss_pg_submission
WHERE date_run = '2014-12-12'
AND surveydesignator IN ( 'ER0101', 'PE0101' )
ORDER BY surveydesignator ,
ptnum
Firstly, a CTE is not the same as a temp table; note the information in @JodyT's comment.
Because a CTE is not materialized, the query in the CTE may be executed again for each row returned by the outer query.
I'd expect this to slow the query down a great deal. As a starting point to improve performance, I would break the current CTE out into an actual temp table.
NOTE: I've used aliases for table names to reduce the amount of SQL and make it a little easier to read.
SELECT chart.visitno , CAST (chart.dschdate AS DATE) AS dschdate
INTO #TEMP
FROM smsdss.c_cfvhs_emstat_chart chart
INNER JOIN smsdss.c_cfvhs_emstat_oi_header header
ON chart.chrtno = header.chartno COLLATE SQL_Latin1_General_Pref_CP1_CI_AS
INNER JOIN smsdss.c_cfvhs_emstat_oi_detail detail
ON header.oi_header_id = detail.oi_header_id
INNER JOIN smsdss.c_cfvhs_emstat_physician physician
ON chart.erphys = physician.physid COLLATE SQL_Latin1_General_Pref_CP1_CI_AS
WHERE chart.dschdate >= DATEADD(mm, -1, GETDATE())
AND chart.dispocd <> 'DXERR'
AND detail.VALUE IN ( '21504', '21505' )
AND physician.code1 = 'RES'
Then query that:
SELECT atndrname ,
atndrno ,
CASE WHEN EXISTS ( SELECT 1
FROM #TEMP
WHERE visitno COLLATE SQL_Latin1_General_Pref_CP1_CI_AS = smsdss.c_dss_pg_submission.ptnum )
THEN 'MLP+ATTN'
ELSE 'NO'
END AS ed_prov_type
FROM smsdss.c_dss_pg_submission
WHERE date_run = '2014-12-12'
AND surveydesignator IN ( 'ER0101', 'PE0101' )
ORDER BY surveydesignator , ptnum
Breaking it down like this should improve performance by a degree, but without information on indexes and an execution plan, it's difficult to provide further advice.
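The materialize-then-probe pattern described above can be sketched end to end. This is a minimal illustration using Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption, chosen so the example runs on its own); the tables are hypothetical, cut-down versions of the ones in the question, and the "expensive" query is reduced to a single filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified versions of the chart and submission tables.
    CREATE TABLE chart (visitno TEXT, dispocd TEXT);
    CREATE TABLE submission (ptnum TEXT, atndrname TEXT);
    INSERT INTO chart VALUES ('V1', 'HOME'), ('V2', 'DXERR'), ('V3', 'HOME');
    INSERT INTO submission VALUES ('V1', 'Dr. A'), ('V2', 'Dr. B'), ('V9', 'Dr. C');

    -- Step 1: run the expensive query once and store its rows.
    CREATE TEMP TABLE ed_mlp_attn AS
        SELECT visitno FROM chart WHERE dispocd <> 'DXERR';
""")

# Step 2: probe the stored rows from the outer query.
rows = conn.execute("""
    SELECT s.atndrname,
           CASE WHEN EXISTS (SELECT 1 FROM ed_mlp_attn t
                             WHERE t.visitno = s.ptnum)
                THEN 'MLP+ATTN' ELSE 'NO' END AS ed_prov_type
    FROM submission s
    ORDER BY s.ptnum
""").fetchall()
print(rows)
```

The point of the split is that the stored rows are computed exactly once, so each EXISTS probe only touches the small temp table rather than re-running the joins.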

How to find duplicates in a large table based on matching and non matching fields?

I have a very large table with more than 10 million records. I want to find duplicates in it based on some fields matching and some fields not matching.
The query I am currently using is below:
SELECT DISTINCT MainTable.[lineitemid]
FROM [dbo].[lineitem] MainTable
INNER JOIN [dbo].[lineitem] AS ChildTable
ON ChildTable.invoicedate = MainTable.invoicedate
AND LEFT(ChildTable.vendorname, 4) = LEFT(MainTable.vendorname, 4)
AND ChildTable.invoiceid <> MainTable.invoiceid AND -- Invoice ID column not matching
ChildTable.documentcurrencyamount = MainTable.documentcurrencyamount
WHERE ChildTable.lineitemid <> MainTable.lineitemid AND -- LineItemId is PK
MainTable.projectid = 1125 AND ChildTable.projectid = 1125 -- Duplicates should be identified with specific ProjectId
This query works fine if the number of records for the ProjectId is under 100,000.
When the ProjectId has more than 1 million records, executing this query makes tempdb grow to 100 GB, which causes low disk space issues. The query takes forever to execute.
Please help me optimize the query.
I added the lines below after getting an answer to the query above.
Thanks a lot, @Gordon-Linoff. The query you suggested worked much faster. The VendorName is from a different table. Can I include an inner join as shown below?
SELECT li1.[LineItemId]
FROM [dbo].[LineItem] li1
INNER JOIN VendorMaster vm1 ON li1.VendorNumber = vm1.VendorNumber
AND vm1.CompanyCode = li1.CompanyCode
WHERE EXISTS (SELECT 1
FROM [dbo].[LineItem] as li2
INNER JOIN VendorMaster vm2 on li2.VendorNumber = vm2.VendorNumber
AND vm2.CompanyCode = li2.CompanyCode
WHERE li2.InvoiceDate = li1.InvoiceDate and
LEFT(vm2.VendorName, 4) = LEFT(vm1.VendorName, 4) and -- VendorName comes from VendorMaster
li2.InvoiceId <> li1.InvoiceId and -- Invoice ID column not matching
li2.DocumentCurrencyAmount = li1.DocumentCurrencyAmount and
li2.LineItemId <> li1.LineItemId and
li2.ProjectId = li1.ProjectId and
li2.VendorNumber = li1.VendorNumber)
AND li1.ProjectId = 1125
Is it an efficient approach?
A less expensive way to run this query is to use exists and dispense with the distinct:
SELECT li.[LineItemId]
FROM [dbo].[LineItem] li
WHERE EXISTS (SELECT 1
FROM [dbo].[LineItem] as li2
WHERE li2.InvoiceDate = li.InvoiceDate and
LEFT(li2.VendorName, 4) = LEFT(li.VendorName, 4) and
li2.InvoiceId <> li.InvoiceId and -- Invoice ID column not matching
li2.DocumentCurrencyAmount = li.DocumentCurrencyAmount and
li2.LineItemId <> li.LineItemId and
li2.ProjectId = li.ProjectId
) AND
li.ProjectId = 1125;
For performance, an index on LineItem(ProjectId, InvoiceDate, DocumentCurrencyAmount, VendorName, InvoiceId, LineItemId) would help. You could further speed the query by declaring LEFT(LineItem.VendorName, 4) as a computed column and adding it to the index before VendorName.
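To make the EXISTS form concrete, here is a small self-contained sketch. It runs on Python's built-in sqlite3 module rather than SQL Server (an assumption made so the example is runnable as-is), uses `substr(x, 1, 4)` in place of `LEFT(x, 4)`, and works on made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lineitem (
        lineitemid INTEGER PRIMARY KEY,
        projectid INTEGER, invoiceid INTEGER, invoicedate TEXT,
        vendorname TEXT, documentcurrencyamount NUMERIC
    );
    -- Made-up sample data.
    INSERT INTO lineitem VALUES
        (1, 1125, 10, '2020-01-01', 'Acme Corp', 100),
        (2, 1125, 11, '2020-01-01', 'Acme Inc',  100),  -- other invoice, same prefix
        (3, 1125, 12, '2020-01-01', 'Other Ltd', 100),  -- different vendor prefix
        (4, 1125, 10, '2020-01-01', 'Acme Corp', 100),  -- same invoice as 1, still matches 2
        (5, 9999, 13, '2020-01-01', 'Acme Corp', 100);  -- different project
""")

# EXISTS stops at the first match per outer row, so no DISTINCT is needed.
dup_ids = [r[0] for r in conn.execute("""
    SELECT li.lineitemid
    FROM lineitem li
    WHERE li.projectid = 1125
      AND EXISTS (SELECT 1 FROM lineitem li2
                  WHERE li2.projectid = li.projectid
                    AND li2.invoicedate = li.invoicedate
                    AND substr(li2.vendorname, 1, 4) = substr(li.vendorname, 1, 4)
                    AND li2.documentcurrencyamount = li.documentcurrencyamount
                    AND li2.invoiceid <> li.invoiceid
                    AND li2.lineitemid <> li.lineitemid)
    ORDER BY li.lineitemid
""")]
print(dup_ids)
```

Each outer row is reported at most once however many partners it has, which is what lets the query avoid building and de-duplicating the full self-join.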

Select statement to show the corresponding user with the lowest/highest amount?

I want to write a select statement whose output, among other things, has both a lowest_bid and a highest_bid column. I know how to do that bit, but what I also want is to show the user (user_firstname and user_lastname combined into their own column) as lowest_bidder and highest_bidder. What I have so far is:
select item_name, item_reserve, count(bid_id) as number_of_bids,
min(bid_amount) as lowest_bid, ???, max(bid_amount) as highest_bid,
???
from vb_items
join vb_bids on item_id=bid_item_id
join vb_users on item_seller_user_id=user_id
where bid_status = 'ok' and
item_sold = 'no'
group by item_name, item_reserve
order by item_reserve
(The ???'s are where the columns should go, once I figure out what to put there!)
This seems like a good use of window functions. I've assumed a column vb_bids.bid_user_id. If there's no link between a bid and a user, you can't answer this question.
With x as (
Select
b.bid_item_id,
count(*) over (partition by b.bid_item_id) as number_of_bids,
row_number() over (
partition by b.bid_item_id
order by b.bid_amount desc
) as high_row,
row_number() over (
partition by b.bid_item_id
order by b.bid_amount
) as low_row,
b.bid_amount,
u.user_firstname + ' ' + u.user_lastname username
From
vb_bids b
inner join
vb_users u
on b.bid_user_id = u.user_id
Where
b.bid_status = 'ok'
)
Select
i.item_name,
i.item_reserve,
min(x.number_of_bids) number_of_bids,
min(case when x.low_row = 1 then x.bid_amount end) lowest_bid,
min(case when x.low_row = 1 then x.username end) low_bidder,
min(case when x.high_row = 1 then x.bid_amount end) highest_bid,
min(case when x.high_row = 1 then x.username end) high_bidder
From
vb_items i
inner join
x
on i.item_id = x.bid_item_id
Where
i.item_sold = 'no'
Group By
i.item_name,
i.item_reserve
Order By
i.item_reserve
In order to get the users, I broke out the aggregates into their own tables, joined them by the item_id and filtered them by a derived value that is either the min or max of bid_amount. I could have joined to vb_bids for a third time, and kept the aggregate functions, but that would've been redundant.
This will fail (returning more than one row per item) if you have two low bids of the exact same amount for the same item, since the join is on bid_amount. If you use this, you'd want to create an index on vb_bids covering bid_amount.
select i.item_name, i.item_reserve,
-- count all valid bids per item; the joins below keep only the low and high bid rows
(select count(*) from vb_bids b
where b.bid_item_id = i.item_id and b.bid_status = 'ok') as number_of_bids,
low_bid.bid_amount as lowest_bid,
low_user.user_firstname + ' ' + low_user.user_lastname as lowest_bidder,
high_bid.bid_amount as highest_bid,
high_user.user_firstname + ' ' + high_user.user_lastname as highest_bidder
from vb_items i
join vb_bids AS low_bid on i.item_id = low_bid.bid_item_id
AND low_bid.bid_status = 'ok'
AND low_bid.bid_amount = (
SELECT MIN(bid_amount)
FROM vb_bids
WHERE bid_item_id = low_bid.bid_item_id AND bid_status = 'ok')
join vb_bids AS high_bid on i.item_id = high_bid.bid_item_id
AND high_bid.bid_status = 'ok'
AND high_bid.bid_amount = (
SELECT MAX(bid_amount)
FROM vb_bids
WHERE bid_item_id = high_bid.bid_item_id AND bid_status = 'ok')
-- assumes vb_bids has a user_id column linking each bid to its bidder
join vb_users AS low_user on low_bid.user_id = low_user.user_id
join vb_users AS high_user on high_bid.user_id = high_user.user_id
where i.item_sold = 'no'
order by i.item_reserve
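The tie problem mentioned above (two bids of the same amount on one item) can be avoided by ranking with ROW_NUMBER() and a tie-breaker column instead of joining on the amount. The following is only a sketch under assumptions: it runs on Python's built-in sqlite3 module (SQLite 3.25 or newer is needed for window functions) rather than SQL Server, and the `bids` table and its columns are hypothetical simplifications.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, flattened bid table with the bidder name inline.
    CREATE TABLE bids (bid_id INTEGER PRIMARY KEY, item_id INTEGER,
                       bidder TEXT, amount INTEGER);
    INSERT INTO bids VALUES
        (1, 1, 'Ann', 10),
        (2, 1, 'Bob', 10),   -- ties with Ann for the lowest bid
        (3, 1, 'Cid', 30);
""")

rows = conn.execute("""
    WITH ranked AS (
        SELECT item_id, bidder, amount,
               -- bid_id breaks ties, so exactly one row gets rank 1
               ROW_NUMBER() OVER (PARTITION BY item_id
                                  ORDER BY amount, bid_id) AS low_row,
               ROW_NUMBER() OVER (PARTITION BY item_id
                                  ORDER BY amount DESC, bid_id) AS high_row
        FROM bids
    )
    SELECT item_id,
           MIN(CASE WHEN low_row = 1 THEN bidder END) AS low_bidder,
           MIN(CASE WHEN high_row = 1 THEN bidder END) AS high_bidder
    FROM ranked
    GROUP BY item_id
""").fetchall()
print(rows)
```

With the tie-breaker, the earliest of the tied bids is reported and the item still yields a single row. Whether "earliest bid wins" is the right rule is a business decision, not something the source specifies.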
I am a big fan of using Common Table Expressions (CTEs) for situations like this, because of the following advantages:
Separating different parts of the logic, adding to readability, and
Reducing complexity (for example, the need to GROUP BY a large number of fields, or to repeat the same join multiple times.)
So, my suggested approach would be something like this:
-- semi-colon must precede CTE
;
-- collect bid info
WITH item_bids AS (
SELECT
i.item_id, i.item_name, i.item_reserve, b.bid_id, b.bid_amount,
(u.first_name + ' ' + u.last_name) AS bid_user_name
FROM vb_items i
JOIN vb_bids b ON i.item_id = b.bid_item_id
JOIN vb_users u ON b.user_id = u.user_id
WHERE b.bid_status = 'ok'
AND i.item_sold = 'no'
),
-- group bid info
item_bid_info AS (
SELECT item_id, item_name, item_reserve,
COUNT(bid_id) AS number_of_bids, MIN(bid_amount) AS lowest_bid, MAX(bid_amount) AS highest_bid
FROM item_bids
GROUP BY item_id, item_name, item_reserve
)
-- assemble final result
SELECT
bi.item_name, bi.item_reserve, bi.number_of_bids,
bi.lowest_bid, low_bid.bid_user_name AS low_bid_user,
bi.highest_bid, high_bid.bid_user_name AS high_bid_user
FROM item_bid_info bi
JOIN item_bids AS low_bid ON bi.lowest_bid = low_bid.bid_amount AND bi.item_id = low_bid.item_id
JOIN item_bids AS high_bid ON bi.highest_bid = high_bid.bid_amount AND bi.item_id = high_bid.item_id
ORDER BY bi.item_reserve;
Note that the entire SQL statement (from the starting WITH all the way down to the final semi-colon after the ORDER BY) is a single statement, and is evaluated by the optimizer as such. (Some people think each part is evaluated separately, like temp tables, and then all the rows are joined together at the end in a final step. That's not how it works. CTEs are just as efficient as sub-queries.)
Also note that this approach does a JOIN on the bid amount, so if there are identical bids for a single item, it will fail. (Seems like that should be an invalid state anyway, though, right?) Also you may have efficiency concerns depending on:
The size of your table
Whether the lookup can use an index
You could address both issues by including a unique constraint (which has the added advantage of indexing the foreign key bid_item_id as well; always a good practice):
ALTER TABLE [dbo].[vb_bids] ADD CONSTRAINT [UK_vbBids_item_amount]
UNIQUE NONCLUSTERED (bid_item_id, bid_amount)
GO
Hope that helps!

Compare values from one table with the results from a query?

First, I will explain what is being captured. Users have a member level associated with their accounts (Bronze, Gold, Diamond, etc.). A nightly job needs to run to calculate the orders from today back one year. If the order total for a given user goes over or under a certain amount, their level is upgraded or downgraded. The table where the level information is stored will not change much, but the minimum and maximum amount thresholds may change over time. This is what the table looks like:
CREATE TABLE [dbo].[MemberAdvantageLevels] (
[Id] int NOT NULL IDENTITY(1,1) ,
[Name] varchar(255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[MinAmount] int NOT NULL ,
[MaxAmount] int NOT NULL ,
CONSTRAINT [PK__MemberAd__3214EC070D9DF1C7] PRIMARY KEY ([Id])
)
ON [PRIMARY]
GO
I wrote a query that will group the orders by user for the year to date. The query includes their current member level.
SELECT
Sum(dbo.tbh_Orders.SubTotal) AS OrderTotals,
Count(dbo.UserProfile.UserId) AS UserOrders,
dbo.UserProfile.UserId,
dbo.UserProfile.UserName,
dbo.UserProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent as IsCurrentLevel,
dbo.MemberAdvantageLevels.Id as MemberLevelId
FROM
dbo.tbh_Orders
INNER JOIN dbo.tbh_OrderStatuses ON dbo.tbh_Orders.StatusID = dbo.tbh_OrderStatuses.OrderStatusID
INNER JOIN dbo.UserProfile ON dbo.tbh_Orders.CustomerID = dbo.UserProfile.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ON dbo.UserProfile.UserId = dbo.UserMemberAdvantageLevels.UserId
INNER JOIN dbo.MemberAdvantageLevels ON dbo.UserMemberAdvantageLevels.MemberAdvantageLevelId = dbo.MemberAdvantageLevels.Id
WHERE
dbo.tbh_OrderStatuses.OrderStatusID = 4 AND
(dbo.tbh_Orders.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE()) and IsCurrent = 1
GROUP BY
dbo.UserProfile.UserId,
dbo.UserProfile.UserName,
dbo.UserProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent,
dbo.MemberAdvantageLevels.Id
So, I need to check OrderTotals and, if it exceeds the current level's threshold, find the level that fits the user's current order total and create a new record with their new level.
For example, let's say jon@doe.com is currently at Bronze. The MinAmount for Bronze is 0 and the MaxAmount is 999. Currently his orders for the year are at $2500. I need to find the level that $2500 fits within and upgrade his account. I also need to check the LevelAchievmentDate, and if it is outside of the current year we may need to demote the user if there has been no activity.
I was thinking I could create a temp table that holds all the levels and then somehow use a CASE expression in the query above to determine the new level. I don't know if that is possible. Or is it better to iterate over my order results and perform additional queries? If I use the iteration pattern, I know I can use a WHILE statement to iterate over the rows.
Update
I updated my query a bit and so far came up with this, but I may need more information than just the Id from the subquery:
Select * into #memLevels from MemberAdvantageLevels
SELECT
Sum(dbo.tbh_Orders.SubTotal) AS OrderTotals,
Count(dbo.AZProfile.UserId) AS UserOrders,
dbo.AZProfile.UserId,
dbo.AZProfile.UserName,
dbo.AZProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent as IsCurrentLevel,
dbo.MemberAdvantageLevels.Id as MemberLevelId,
(Select Id from #memLevels where Sum(dbo.tbh_Orders.SubTotal) >= #memLevels.MinAmount and Sum(dbo.tbh_Orders.SubTotal) <= #memLevels.MaxAmount) as NewLevelId
FROM
dbo.tbh_Orders
INNER JOIN dbo.tbh_OrderStatuses ON dbo.tbh_Orders.StatusID = dbo.tbh_OrderStatuses.OrderStatusID
INNER JOIN dbo.AZProfile ON dbo.tbh_Orders.CustomerID = dbo.AZProfile.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ON dbo.AZProfile.UserId = dbo.UserMemberAdvantageLevels.UserId
INNER JOIN dbo.MemberAdvantageLevels ON dbo.UserMemberAdvantageLevels.MemberAdvantageLevelId = dbo.MemberAdvantageLevels.Id
WHERE
dbo.tbh_OrderStatuses.OrderStatusID = 4 AND
(dbo.tbh_Orders.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE()) and IsCurrent = 1
GROUP BY
dbo.AZProfile.UserId,
dbo.AZProfile.UserName,
dbo.AzProfile.Email,
dbo.MemberAdvantageLevels.Name,
dbo.MemberAdvantageLevels.MinAmount,
dbo.MemberAdvantageLevels.MaxAmount,
dbo.UserMemberAdvantageLevels.LevelAchievmentDate,
dbo.UserMemberAdvantageLevels.LevelAchiementAmount,
dbo.UserMemberAdvantageLevels.IsCurrent,
dbo.MemberAdvantageLevels.Id
This hasn't been syntax checked or tested, but it should handle the inserts and updates you describe. The insert can be done as a single statement using a derived (virtual) table that contains the grouped order calculation. Note that both the insert and the update statement should be run within the same transaction to ensure that no two records for the same user end up with IsCurrent = 1.
INSERT UserMemberAdvantageLevels (UserId, MemberAdvantageLevelId, IsCurrent,
LevelAchiementAmount, LevelAchievmentDate)
SELECT t.UserId, mal.Id, 1, t.OrderTotals, GETDATE()
FROM
(SELECT ulp.UserId, SUM(ord.SubTotal) OrderTotals, COUNT(ulp.UserId) UserOrders
FROM UserLevelProfile ulp
INNER JOIN tbh_Orders ord ON (ord.CustomerId = ulp.UserId)
WHERE ord.StatusID = 4
AND ord.AddedDate BETWEEN DATEADD(year,-1,GETDATE()) AND GETDATE()
GROUP BY ulp.UserId) AS t
INNER JOIN MemberAdvantageLevels mal
ON (t.OrderTotals BETWEEN mal.MinAmount AND mal.MaxAmount)
-- Left join needed on next line in case user doesn't currently have a level
LEFT JOIN UserMemberAdvantageLevels umal ON (umal.UserId = t.UserId)
WHERE umal.MemberAdvantageLevelId IS NULL -- First time user has been awarded a level
OR (mal.Id <> umal.MemberAdvantageLevelId -- Level has changed
AND (t.OrderTotals > umal.LevelAchiementAmount -- Achievement has increased (promotion)
OR t.UserOrders = 0)) -- No. of orders placed is zero (demotion)
/* Reset IsCurrent flag where new record has been added */
UPDATE umal1
SET IsCurrent = 0
FROM UserMemberAdvantageLevels umal1
INNER JOIN UserMemberAdvantageLevels umal2 ON (umal2.UserId = umal1.UserId)
WHERE umal1.IsCurrent = 1
AND umal2.IsCurrent = 1 -- the newly inserted row is also flagged as current
AND umal1.LevelAchievmentDate < umal2.LevelAchievmentDate
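The advice to keep both statements in one transaction can be sketched as follows. This is an illustration only, using Python's built-in sqlite3 module in place of SQL Server; the `user_levels` table and the `promote` helper are hypothetical names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified version of UserMemberAdvantageLevels.
    CREATE TABLE user_levels (user_id INTEGER, level_id INTEGER, is_current INTEGER);
    INSERT INTO user_levels VALUES (1, 1, 1);  -- user 1 is currently at level 1
""")

def promote(conn, user_id, new_level_id):
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the flag reset and the insert take effect together or not at all.
    with conn:
        conn.execute(
            "UPDATE user_levels SET is_current = 0 "
            "WHERE user_id = ? AND is_current = 1",
            (user_id,))
        conn.execute(
            "INSERT INTO user_levels (user_id, level_id, is_current) "
            "VALUES (?, ?, 1)",
            (user_id, new_level_id))

promote(conn, 1, 2)
current = conn.execute(
    "SELECT level_id FROM user_levels WHERE user_id = 1 AND is_current = 1"
).fetchall()
print(current)
```

This sketch resets the old flag before inserting the new row, which is the reverse of the statement order above. Inside a single transaction either order leaves exactly one current row per user once the transaction commits.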
One approach:
with cte as
(SELECT Sum(o.SubTotal) AS OrderTotals,
Count(p.UserId) AS UserOrders,
p.UserId,
p.UserName,
p.Email,
l.Name,
l.MinAmount,
l.MaxAmount,
ul.LevelAchievmentDate,
ul.LevelAchiementAmount,
ul.IsCurrent as IsCurrentLevel,
l.Id as MemberLevelId
FROM dbo.tbh_Orders o
INNER JOIN dbo.UserProfile p ON o.CustomerID = p.UserId
INNER JOIN dbo.UserMemberAdvantageLevels ul ON p.UserId = ul.UserId
INNER JOIN dbo.MemberAdvantageLevels l ON ul.MemberAdvantageLevelId = l.Id
WHERE o.StatusID = 4 AND
o.AddedDate BETWEEN dateadd(year,-1,getdate()) AND GETDATE() and
IsCurrent = 1
GROUP BY
p.UserId, p.UserName, p.Email, l.Name, l.MinAmount, l.MaxAmount,
ul.LevelAchievmentDate, ul.LevelAchiementAmount, ul.IsCurrent, l.Id)
select cte.*, ml.*
from cte
join #memLevels ml
on cte.OrderTotals >= ml.MinAmount and cte.OrderTotals <= ml.MaxAmount
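The range join at the end of that query is what maps each user's total to a level. A reduced, runnable sketch of just that step is shown below. It uses Python's built-in sqlite3 module instead of SQL Server, and the Gold and Diamond thresholds are made-up values (the question only gives Bronze as 0 to 999).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE levels (id INTEGER PRIMARY KEY, name TEXT,
                         min_amount INTEGER, max_amount INTEGER);
    -- Bronze is from the question; the other thresholds are invented.
    INSERT INTO levels VALUES (1, 'Bronze', 0, 999),
                              (2, 'Gold', 1000, 4999),
                              (3, 'Diamond', 5000, 999999);
    CREATE TABLE orders (user_id INTEGER, subtotal INTEGER);
    INSERT INTO orders VALUES (1, 1500), (1, 1000), (2, 200);
""")

rows = conn.execute("""
    WITH totals AS (
        -- Aggregate first, then look the total up in the level ranges.
        SELECT user_id, SUM(subtotal) AS order_total
        FROM orders
        GROUP BY user_id
    )
    SELECT t.user_id, t.order_total, l.name
    FROM totals t
    JOIN levels l ON t.order_total BETWEEN l.min_amount AND l.max_amount
    ORDER BY t.user_id
""").fetchall()
print(rows)
```

The join returns exactly one level per user only if the ranges do not overlap and leave no gaps, so that is worth enforcing when the thresholds are edited.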