MS Access Complicated Order By - sql

For an invoicing app I'm working on my PHB has decided that the parts sold listed on the invoice need to go in a very unique order. I would like to accomplish this with a single sql statement if possible.
Essentially the order needs to be as such
The most expensive part (but only if there is another part listed at $0)
All parts listed at $0
All other parts (or all parts) listed by order of part_id
All parts with a part_id of ("MISC-30","MISC-31","TEMP")
All parts with a negative qty [returns]
There is also a jumble of comments that will need to be added but That will have to be handled by the code
So far I have:
SELECT *
FROM order_part
WHERE ordid = 1234
ORDER BY qty > 0, part_id NOT IN("MISC-30","MISC-31","TEMP"), part_id
However I cannot figure out how to incorporate the first 2 rules

Since you've had to give up being messing long ago on this project ;)
Select *
, IIF(((Select Count(*) from order_part
where orderid = 1234 and price = 0))=0
and price = ((select max(price) from
order_part where orderid = 1234
and qty >0 and part_id not in(("MISC-30","MISC-31","TEMP")
)), 1
, IIf(price = 0, 2
, IIf(part_id IN("MISC-30","MISC-31","TEMP"), 4
, IIf(qty < 0, 5
, 3)))) AS Part_Sort
from order_part
Order By Part Sort, part_id
Really wish Access had case statement. But you can build these nested IIf's and provide a sorting number based on your logic. The final "ELSE" part is the #3 since just sorting by the part ID is the third choice/ doesn't fall under these other categories. Sorry, I know the parenthesis are wrong.

Perhaps you mean you want to have a single recordset i.e. output from your SQL that can be processed by your invoicing App?
Have you thought of the folling -- Its not pretty but it might work.
Select * From
(
Select 1 as MyOrder .... rest of criteria 1
Union
Select 2 as MyOrder .... rest of criteria 2
Union
Select 3 as MyOrder .... rest of criteria 3
Union
Select 4 as MyOrder .... rest of criteria 4
Union
Select 5 as MyOrder .... rest of criteria 5
)
Order by MyOrder

If it's not possible to do this with one select statement, I would write 5 queries that each get the parts of this end query you need with no intersections. Then add a SortBy integer value to each query and union them together (sorting by the SortBy value).
I've done this in SQL Server and I'm guessing this is possible in Access...

Even if this is possible with a single query, I think you owe it to yourself and future developers to make individual queries and join the results together some other way.
The only exception to this is if performance is 100% completely critical and you need to save every microsecond.
But as a developer and manager I'd rather see maintainable code that a junior team member can figure out than some uber-messy SQL statement.

My advice: have the guy that makes you waste your precious time doing such dummy things fired! He must be a slave driver or something? If you can't have him fired, leave the company.

Related

SQL question with attempt on customer information

Schema
Question: List all paying customers with users who had 4 or 5 activities during the week of February 15, 2021; also include how many of the activities sent were paid, organic and/or app store. (i.e. include a column for each of the three source types).
My attempt so far:
SELECT source_type, COUNT(*)
FROM activities
WHERE activity_time BETWEEN '02-15-21' AND '02-19-21'
GROUP BY source_type
I would like to get a second opinion on it. I didn't include the accounts table because I don't believe that I need it for this query, but I could be wrong.
Have you tried to run this? It doesn't satisfy the brief on FOUR counts:
List all the ... customers (that match criteria)
There is no customer information included in the results at all, so this is an outright fail.
paying customers
This is the top level criteria, only customers that are not free should be included in the results.
Criteria: users who had 4 or 5 activities
There has been no attempt to evaluate this user criteria in the query, and the results do not provide enough information to deduce it.
there is further ambiguity in this requirement, does it mean that it should only include results if the account has individual users that have 4 or 5 acitvities, or is it simply that the account should have 4 or 5 activities overall.
If this is a test question (clearly this is contrived, if it is not please ask for help on how to design a better schema) then the use of the term User is usually very specific and would suggest that you need to group by or otherwise make specific use of this facet in your query.
Bonus: (i.e. include a column for each of the three source types).
This is the only element that was attempted, as the data is grouped by source_type but the information cannot be correlated back to any specific user or customer.
Next time please include example data and the expected outcome with your post. In preparing the data for this post you would have come across these issues yourself and may have been inspired to ask a different question, or through the process of writing the post up you may have resolved the issue yourself.
without further clarification, we can still start to evolve this query, a good place to start is to exclude the criteria and focus on the format of the output. the requirement mentions the following output requirements:
List Customers
Include a column for each of the source types.
Firstly, even though you don't think you need to, the request clearly states that Customer is an important facet in the output, and in your schema account holds the customer information, so although we do not need to, it makes the data readable by humans if we do include information from the account table.
This is a standard PIVOT style response then, we want a row for each customer, presenting a count that aggregates each of the values for source_type. Most RDBMS will support some variant of a PIVOT operator or function, however we can achieve the same thing with simple CASE expressions to conditionally put a value into projected columns in the result set that match the values we want to aggregate, then we can use GROUP BY to evaluate the aggregation, in this case a COUNT
The following syntax is for MS SQL, however you can achieve something similar easily enough in other RBDMS
OP please tag this question with your preferred database engine...
NOTE: there is NO filtering in this query... yet
SELECT accounts.company_id
, accounts.company_name
, paid = COUNT(st_paid)
, organic = COUNT(st_organic)
, app_store = COUNT(st_app_store)
FROM activities
INNER JOIN accounts ON activities.company_id = accounts.company_id
-- PIVOT the source_type
CROSS APPLY (SELECT st_paid = CASE source_type WHEN 'paid' THEN 1 END
,st_organic = CASE source_type WHEN 'organic' THEN 1 END
,st_app_store = CASE source_type WHEN 'app store' THEN 1 END
) as PVT
GROUP BY accounts.company_id, accounts.company_name
This results in the following shape of result:
company_id
company_name
paid
organic
app_store
apl01
apples
4
8
0
ora01
oranges
6
12
0
Criteria
When you are happy with the shpe of the results and that all the relevant information is available, it is time to apply the criteria to filter this data.
From the requirement, the following criteria can be identified:
paying customers
The spec doesn't mention paying specifically, but it does include a note that (free customers have current_mrr = 0)
Now aren't we glad we did join on the account table :)
users who had 4 or 5 activities
This is very specific about explicitly 4 or 5 activities, no more, no less.
For the sake of simplicity, lets assume that the user facet of this requirement is not important and that is is simply a reference to all users on an account, not just users who have individually logged 4 or 5 activities on their own - this would require more demo data than I care to manufacture right now to prove.
during the week of February 15, 2021.
This one was correctly identified in the original post, but we need to call it out just the same.
OP has used Monday to Friday of that week, there is no mention that weeks start on a Monday or that they end on Friday but we'll go along, it's only the syntax we need to explore today.
In the real world the actual values specified in the criteria should be parameterised, mainly because you don't want to manually re-construct the entire query every time, but also to sanitise input and prevent SQL injection attacks.
Even though it seems overkill for this post, using parameters even in simple queries helps to identify the variable elements, so I will use parameters for the 2nd criteria to demonstrate the concept.
DECLARE #from DateTime = '2021-02-15' -- Date in ISO format
DECLARE #to DateTime = (SELECT DateAdd(d, 5, #from)) -- will match Friday: 2021-02-19
/* NOTE: requirement only mentioned the start date, not the end
so your code should also only rely on the single fixed start date */
SELECT accounts.company_id, accounts.company_name
, paid = COUNT(st_paid), organic = COUNT(st_organic), app_store = COUNT(st_app_store)
FROM activities
INNER JOIN accounts ON activities.company_id = accounts.company_id
-- PIVOT the source_type
CROSS APPLY (SELECT st_paid = CASE source_type WHEN 'paid' THEN 1 END
,st_organic = CASE source_type WHEN 'organic' THEN 1 END
,st_app_store = CASE source_type WHEN 'app store' THEN 1 END
) as PVT
WHERE -- paid accounts = exclude 'free' accounts
accounts.current_mrr > 0
-- Date range filter
AND activity_time BETWEEN #from AND #to
GROUP BY accounts.company_id, accounts.company_name
-- The fun bit, we use HAVING to apply a filter AFTER the grouping is evaluated
-- Wording was explicitly 4 OR 5, not BETWEEN so we use IN for that
HAVING COUNT(source_type) IN (4,5)
I believe you are missing some information there.
without more information on the tables, I can only guess that you also have a customer table. i am going to assume there is a customer_id key that serves as key between both tables
i would take your query and do something like:
SELECT customer_id,
COUNT() AS Total,
MAX(CASE WHEN source_type = "app" THEN "numoperations" END) "app_totals"),
MAX(CASE WHEN source_type = "paid" THEN "numoperations" END) "paid_totals"),
MAX(CASE WHEN source_type = "organic" THEN "numoperations" END) "organic_totals"),
FROM (
SELECT source_type, COUNT() AS num_operations
FROM activities
WHERE activity_time BETWEEN '02-15-21' AND '02-19-21'
GROUP BY source_type
) tb1 GROUP BY customer_id
This is the most generic case i can think of, but does not scale very well. If you get new source types, you need to modify the query, and the structure of the output table also changes. Depending on the sql engine you are using (i.e. mysql vs microsoft sql) you could also use a pivot function.
The previous query is a little bit rough, but it will give you a general idea. You can add "ELSE" statements to the clause, to zero the fields when they have no values, and join with the customer table if you want only active customers, etc.

Trouble with pulling distinct data

Ok this is hard to explain partially because I'm bad at sql but this code isn't doing exactly what I want it to do. I'll try to explain what it is supposed to do as best I can and hopefully someone can spot a glaring mistake. I'm sorry about the long winded explanation but there is a lot going on here and I really could use the help.
The point of this script is to search for parts which need to be obsoleted. in other words they haven't been used in three years and are still active.
When we obsolete part, "part.status" is set to 'O'. It is normally null. Also, the word 'OBSOLETE' is usually written in to "part.description"
The "WORK_ORDER" contains every scheduled work order. These are defined by base,lot, and sub ID's. It also contains many dates such as the date when the work order was closed.
the "REQUIREMENT" table contains all the parts require for each job. many jobs may require multiple parts, some at different legs of the job. The way this is handled is that for a given "REQUIREMENT.WORKORDER_BASE_ID" and "REQUIREMENT.WORKORDER_LOT_ID", they may be listed on a dozen or so subsequent rows. Each line specifies a different "REQUIREMENT.PART_ID". The sub id separates what leg of the job that the part is needed. All of the parts I care about start with 'PCH'
When I run this code it returns 14 lines, I happen to know it should be returning about 39 right now. I believe the screwy part starts at line 17. I found that code on another form hoping that it would help solve the original problem. Without that code, I get like 27K lines because the DB is pulling every criteria matching requirement from every criteria matching work order. Many of these parts are used on multiple jobs. I've also tried using DISTINCT on REQUIREMENT.PART_ID which seems like it should solve the problem. Alas it doesn't.
So I know despite all the information I probably still didn't give nearly enough. Does anyone have any suggestions?
SELECT
PART.ID [Engr Master]
,PART.STATUS [Master Status]
,WO.CLOSE_DATE
,PT.ID [Die]
,PT.STATUS [Die Status]
FROM PART
CROSS APPLY(
SELECT
WORK_ORDER.BASE_ID
,WORK_ORDER.LOT_ID
,WORK_ORDER.SUB_ID
,WORK_ORDER.PART_ID
,WORK_ORDER.CLOSE_DATE
FROM WORK_ORDER
WHERE
GETDATE() - (360*3) > WORK_ORDER.CLOSE_DATE
AND PART.ID = WORK_ORDER.PART_ID
AND PART.STATUS ='O'
)WO
CROSS APPLY(
SELECT
REQUIREMENT.WORKORDER_BASE_ID
,REQUIREMENT.WORKORDER_LOT_ID
,REQUIREMENT.WORKORDER_SUB_ID
,REQUIREMENT.PART_ID
FROM REQUIREMENT
WHERE
WO.BASE_ID = REQUIREMENT.WORKORDER_BASE_ID
AND WO.LOT_ID = REQUIREMENT.WORKORDER_LOT_ID
AND WO.SUB_ID = REQUIREMENT.WORKORDER_SUB_ID
AND REQUIREMENT.PART_ID LIKE 'PCH%'
)REQ
CROSS APPLY(
SELECT
PART.ID
,PART.STATUS
FROM PART
WHERE
REQ.PART_ID = PART.ID
AND PART.STATUS IS NULL
)PT
ORDER BY PT.ID
This is difficult to understand without any sample data, but I took a stab at it anyway. I removed the second JOIN to PART (that had alias PART1) as it seemed unecessary. I also removed the subquery that was looking for parts HAVING COUNT(PART_ID) = 1
The first JOIN to PART should be done on REQUIREMENT.PART_ID = PART.PART_ID as the relationship as already been defined from WORK_ORDER to REQUIREMENT, hence you can JOIN PART directly to REQUIREMENT at this point.
EDIT 03/23/2015
If I understand this correctly, you just need a distinct list of PCH parts, and their respective last (read: MAX) CLOSE_DATE. If that is the case, here is what I propose.
I broke the query up into a couple of CTE's. The first CTE is simply going through the PART table and pulling out a DISTINCT list of PCH parts, grouping by PART_ID and DESCRIPTION.
The second CTE, is going through the REQUIREMENT table, joining to the WORK_ORDER table and, for each PART_ID (handled by the PARTITION) assigning the CLOSE_DATE a ROW_NUMBER in descending order. This will ensure that each ROW_NUMBER with a value of "1" will be the Max CLOSE_DATE for each PART_ID.
The final SELECT statement simply JOINS the two Cte's on PART_ID, filtering where LastCloseDate = 1 (the ROW_NUMBER assigned in the second CTE).
If I understand the requirements correctly, this should give you the desired results.
Additionally, I removed the filter WHERE PART.DESCRIPTION NOT LIKE 'OB%' because we're already filtering by PART.STATUS IS NULL and you stated above that an 'O' is placed in this field for Obsolete parts. Also, [DIE] and [ENGR MASTER] have the same value in the 27 rows being pulled before, so I just used the same field and labeled them differently.
; WITH Parts AS(
SELECT prt.PART_ID AS [ENGR MASTER]
, prt.DESCRIPTION
FROM PART prt
WHERE prt.STATUS IS NULL
AND prt.PART_ID LIKE 'PCH%'
GROUP BY prt.ID, prt.DESCRIPTION
)
, LastCloseDate AS(
SELECT req.PART_ID
, wrd.CLOSE_DATE
, ROW_NUMBER() OVER(PARTITION BY req.PART_ID ORDER BY wrd.CLOSE_DATE DESC) AS LastCloseDate
FROM REQUIREMENT req
INNER JOIN WORK_ORDER wrd
ON wrd.BASE_ID = req.WORKORDER_BASE_ID
AND wrd.LOT_ID = req.WORKORDER_LOT_ID
AND wrd.SUB_ID = req.WORKORDER_SUB_ID
WHERE wrd.CLOSE_DATE IS NOT NULL
AND GETDATE() - (365 * 3) > wrd.CLOSE_DATE
)
SELECT prt.PART_ID AS [DIE]
, prt.PART_ID AS [ENGR MASTER]
, prt.DESCRIPTION
, lst.CLOSE_DATE
FROM Parts prt
INNER JOIN LastCloseDate lst
ON prt.PART_ID = lst.PART_ID
WHERE LastCloseDate = 1

Oracle SQL Newbie needs help finding similar records

Please forgive my naivety, I’m an Oracle SQL newbie using Toad. I have a table with sales records, call it Sales. It has records of customers (by CustID) the date of a sale (SaleDate) and the item sold (by ItemID). It’s an Mview actually of other tables with final sales status in it.
I am trying to construct a query to return CustID, SaleDate, and ItemID if there is a sale on the same day for that customer for both ItemID=A and ItemID=B if between SaleDate 7/1/2013 and 7/31/2013. If this condition exists I want both records returned with the CustID, SaleDate and ItemID. I assume the two records would be on separate rows.
I’ve been researching IN, EXISTS, and sub queries but have yet to strike upon the right approach. The table has about 7 million records on it so I need something fairly efficient. Can someone point me in the right direction to achieve this? I’m learning, but I need to learn faster :)
GOT IT WOKING!
2/24/2014: Hey, I got it working and it returns the results on thesame row. One caveat to this. In my orginal example I was looking for dates when both 5P311 and 6R641 existed. In actuality I wanted all the days where 5P311 and any of the values from the RES group exists - of which 6R641 is a member. The code below achieves the results as I need them:
SELECT ItemA.CLM_SSN,
ItemA.CLM_SERV_STRT Service_Date,
ItemA.CLM_COST_CTR_NBR,
ItemA.CLM_RECV_AMT,
ItemB.CLM_COST_CTR_NBR RES_Cost_Center,
ItemB.CLM_RECV_AMT,
GroupCode,
Service
FROM DDIS.PTS_MV_CLM_STAT ItemA,
DDIS.PTS_MV_CLM_STAT ItemB,
DDIS.CST_SERV
WHERE TRUNC(ItemA.CLM_SERV_STRT) between to_date ('01-07-2013','dd-mm-yyyy') and to_date('31- 07-2013','dd-mm-yyyy')
and TRUNC(ItemA.CLM_SERV_STRT) = TRUNC(ItemB.CLM_SERV_STRT)
and TRIM(ItemA.CLM_COST_CTR_NBR) = '5P311'
and ITEMB.FK_SERV = CST_SERV.PKSERVICE
and CST_SERV.GroupCode = 'RES'
and Itema.CLM_SSN = ItemB.CLM_SSN
and ItemA.CLM_RECV_AMT <> 0
and ItemB.CLM_RECV_AMT <> 0
ORDER BY ItemA.CLM_SSN, ItemA.CLM_SERV_STRT
Try this, replace 'A' and 'B' values of course
SELECT CustID, SaleDate, ItemID
FROM Mview AS mv
WHERE EXISTS(SELECT 1 FROM Mview AS itemA WHERE itemA.ItemID = 'A'
AND TRUNC(itemA.SaleDate) = TRUN(mv.SaleDate) )
AND EXISTS(SELECT 1 FROM Mview AS itemB WHERE itemB.ItemID = 'B'
AND TRUNC(itemB.SaleDate) = TRUNC(mv.SaleDate) )
AND mv.SaleDate BETWEEN TO_DATE ('2003/01/07', 'yyyy/mm/dd')
AND TO_DATE ('2003/01/31', 'yyyy/mm/dd');
The exists combined ensures you that there is a sell that day that had those 2 items, the TRUNC in the date is to get rid of the hours and minutes of the date.
The between lets you seek the current range of dates, you have to convert it to date, since you are passing a string.
Edit:
ItemA is a alias for the table Mview inside the exists oracle: can you assign an alias to the from clause? sql understand alias without the AS, but you can put it if it makes it easier for you to read.
In the full example you posted, you are not using any alias for DDIS.PTS_MV_CLM_STAT, so, the database motor doesnt distict wich table you are refering and that's why you dont get the values you want.

looping through a numeric range for secondary record ID

So, I figure I could probably come up with some wacky solution, but i figure i might as well ask up front.
each user can have many orders.
each desk can have many orders.
each order has maximum 3 items in it.
trying to set things up so a user can create an order and the order auto generates a reference number and each item has a reference letter. reference number is 0-99 and loops back around to 0 once it hits 99, so orders throughout the day are easy to reference for the desks.
So, user places an order for desk #2 of 3 items:
78A: red stapler
78B: pencils
78C: a kangaroo foot
not sure if this would be done in the program logic or done at the SQL level somehow.
was thinking something like neworder = order.last + 1 and somehow tying that into a range on order create. pretty fuzzy on specifics.
Without knowing the answer to my comment above, I will assume you want to have the full audit stored, rather than wiping historic records; as such the 78A 78B 78C type orders are just a display format.
If you have a single Order table (containing your OrderId, UserId, DeskId, times and any other top-level stuff) and an OrderItem table (containing your OrderItemId, OrderId, LineItemId -- showing 1,2 or 3 for your first and optional second and third line items in the order, and ProductId) and a Product table (ProductId, Name, Description)
then this is quite simple (thankfully) using the modulo operator, which gives the remainder of a division, allowing you in this case to count in groups of 3 and 100 (or any other number you wish).
Just do something like the following:
(you will want to join the items into a single column, I have just kept them distinct so that you can see how they work)
Obviously join/query/filter on user, desk and product tables as appropriate
select
o.OrderId,
o.UserId,
o.DeskId
o.OrderId%100 + 1 as OrderNumber,
case when LineItem%3 = 1 then 'A'
when LineItem%3 = 2 then 'B'
when LineItem%3 = 0 then 'C'
end as ItemLetter,
oi.ProductId
from tb_Order o inner join tb_OrderItem oi on o.OrderId=oi.OrderId
Alternatively, you can add the itemLetter (A,B,C) and/or the OrderNumber (1-100) as computed (and persisted) columns on the tables themselves, so that they are calculated once when inserted, rather than recalculating/formatting when they are selected.
This sort-of breaks some best practice that you store the raw data in the DB and you format on retrieval; but if you are not going to update the data and you are going to select the data for more than you are going to write the data; then I would break this rule and calculate your formatting at insert time

How to optimize group by in table with huge number of records

I have a Person table with huge number of records(for about 16 million), and have a requirement to find all persons, with same lastname, first letter of firstname and birthyear, in other worlds I want to show assuming duplicate persons in UI for users to analyze and decide are there a same person or not.
Here is the query I write
SELECT *
FROM Person INNER JOIN
(
SELECT SUBSTRING(firstName, 1, 1) firstNameF,lastName,YEAR(birthDate) birthYear
FROM Person
GROUP BY SUBSTRING(firstName, 1,1),lastName,YEAR(birthDate)
HAVING count(*) > 1
) as dupPersons
ON SUBSTRING(Person.firstName,1,1) = dupPersons.firstNameF and Person.lastName = dupPersons.lastName and YEAR(Person.birthDate) = dupPersons.birthYear
order by Person.lastName,Person.firstName
but as I am not SQL expert, want too know, is this good way to do that? are there more optimized way?
EDIT
Note that I can cut data, which can have contribution in optimization
for example if I want to cut data by 2 it could return two persons
Johan Smith |
Jane Smith | have same lastname and first name inita
Jack Smith |
Mark Tween | have same lastname and first name inita
Mac Tween |
If the performance using a GROUP BY is not adequate, You could try using an INNER JOIN
SELECT *
FROM Person p1
INNER JOIN Person p2 ON p2.PersonID > p1.PersonID
WHERE SUBSTRING(p2.Firstname, 1, 1) = SUBSTRING(p1.Firstname, 1, 1)
AND p2.LastName = p1.LastName
AND YEAR(p2.BirthDate) = YEAR(p1.BirthDate)
ORDER BY
p1.LastName, p1.FirstName
Well, if you're not an expert, the query you wrote says to me that you're at least pretty competent. When we look at whether a query is "optimized", there are two immediate parts to that: 1. The query just on its own has something notably wrong with it - a bad join, keyword misuse, exploding result set size, supersitions about NOT IN, etc. 2. The context that the query operates within - DB specifics, task specifics, etc.
Your query passes #1, no problem. I would have written it differently - aliased the Person table, used LEFT(P.FirstName, 1) instead of SUBSTRING, and used a CTE (WITH-clause) instead of a subquery. But these aren't optimization issues. Maybe I'd use WITH(READUNCOMMITTED) if the results weren't sensitive to dirty reads. Out of any further context, your query doesn't look like a bomb waiting to go off.
As for #2 - You should probably switch to specifics. Like "I have to run this every week. It takes 17 minutes. How can I get it down to under a minute?" Then people will ask you what your plan looks like, what indexes you have, etc.
Things I'd want to know:
How long does it already take to run?
What's your runtime window? (User & app tolerance for query time.)
Is this run once a day? Week? Month? Quarter?
Do you have the permission to create tables, change current tables, or alter indexes?
Maybe based on having run it, what's the ratio of duplicates you're expecting to find? 5%? 90%?
How stable is the matching criteria requirement?
Example scenario: If this was a run-on-command feature, it will be in my app indefinitely, it will get run weekly, with 10% or fewer records expected to be duplicates, with ability to change the DB how I'd like, if the duplicate matching criteria is firm (not fluctuating), and I wan to cut it from 90s to 5s, I'd create a dedicated BirthYear column (possibly a persisted computed column off of BirthDate), and an index on LastName ASC, BirthYear ASC, FirstName ASC. If too many of those stipulations change, I might to a different direction entirely.
You can try something like this and see the difference on the execution plans, or benchmark the results on performance:
;WITH DupPersons AS
(
SELECT *, COUNT(1) OVER(PARTITION BY SUBSTRING(firstName, 1, 1), lastName, YEAR(birthDate)) Quant
FROM Person
)
SELECT *
FROM DupPersons
WHERE Quant > 1
Of course, it would also help to know your table definition and the indexes you created. I think that maybe it can help to add a computed column with the year of birthdate and create an index on it, the same with the first letter of firstname.