sql server LIMIT with where but include at least once - sql

I have a simple query like this:
SELECT subs.id_number,
op.dtupdated as dtopened
FROM subscribers AS subs
LEFT OUTER JOIN messages AS msg
ON msg.id_number = subs.id_number
LEFT JOIN email_opens AS op
ON op.message_id = msg.message_id
WHERE op.dtupdated > dateadd(month,-3,CURRENT_TIMESTAMP)
Basically, I'm trying to get all the records from a table that tracks when an email was opened (email_opens), which is associated with subscribers by the id_number field. I want all email opens within the past 3 months and the associated id_number, but I also want every id_number in the subscribers table to be included at least once.
The problem is that my WHERE clause eliminates every subscriber who had no email opens in the past 3 months, but I want one record with the id_number and NULL as dtopened for subscribers who haven't had any opens.
I tried LEFT OUTER JOIN and UNION (which works, but then I have duplicates), and I just can't seem to find out how to do this. I'm sure there has to be a simple way and I just haven't had enough coffee yet.

Checking op.dtupdated in the WHERE clause means the records that don't match get filtered out. If you do that in your ON clause of the LEFT OUTER JOIN instead, you'll get at least one row for each record on subscribers:
SELECT subs.id_number
,op.dtupdated AS dtopened
FROM subscribers AS subs
LEFT OUTER JOIN messages AS msg ON msg.id_number = subs.id_number
LEFT OUTER JOIN email_opens AS op ON op.message_id = msg.message_id
AND op.dtupdated > dateadd(month, - 3, CURRENT_TIMESTAMP)
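The difference is easy to check on a toy dataset. Here is a minimal sketch using Python's built-in sqlite3 rather than SQL Server; the rows are made up, and a fixed cutoff date (CUTOFF, an assumption) stands in for dateadd(month, -3, CURRENT_TIMESTAMP). It runs the same joins with the date filter in the WHERE clause and then in the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscribers (id_number INTEGER);
    CREATE TABLE messages (message_id INTEGER, id_number INTEGER);
    CREATE TABLE email_opens (message_id INTEGER, dtupdated TEXT);
    INSERT INTO subscribers VALUES (1), (2), (3);
    INSERT INTO messages VALUES (10, 1), (20, 2);
    -- subscriber 1 opened recently, subscriber 2 only long ago, subscriber 3 got no mail
    INSERT INTO email_opens VALUES (10, '2024-05-01'), (20, '2023-01-01');
""")

CUTOFF = "2024-03-01"  # assumed stand-in for dateadd(month, -3, CURRENT_TIMESTAMP)

base = """
    SELECT subs.id_number, op.dtupdated AS dtopened
    FROM subscribers AS subs
    LEFT OUTER JOIN messages AS msg ON msg.id_number = subs.id_number
    LEFT OUTER JOIN email_opens AS op ON op.message_id = msg.message_id
"""

# Filter applied after the joins: subscribers without a recent open disappear.
where_rows = conn.execute(
    base + " WHERE op.dtupdated > ? ORDER BY subs.id_number", (CUTOFF,)
).fetchall()

# Filter applied as part of the last join condition: every subscriber survives.
on_rows = conn.execute(
    base + " AND op.dtupdated > ? ORDER BY subs.id_number", (CUTOFF,)
).fetchall()

print(where_rows)
print(on_rows)
```

With this data the WHERE version keeps only subscriber 1, while the ON version also returns subscribers 2 and 3 with NULL (None in Python) as dtopened.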

Related

Record with latest date, where date comes from a joined table

I have tried every answer that I have found to finding the last record, and I have failed in getting a successful result. I currently have a query that lists active trailers. I am needing it to only show a single row for each trailer entry, where that row is based on a date in a joined table.
I have tables
trailer, company, equipment_group, movement, stop
In order to connect trailer to stop (which is where the date is), I have to join it to equipment_group, which joins to movement, which then joins to stop.
I have tried using MAX and GROUP BY, and PARTITION BY, both of which error out.
I have tried many solutions here, as well as these
https://thoughtbot.com/blog/ordering-within-a-sql-group-by-clause
https://www.geeksengine.com/article/get-single-record-from-duplicates.html
It seems that all of these solutions have the date in the same table as the thing that they want to group by, which I do not.
SELECT
trailer.*,
company.name,
equipment_group.currentmovement_id,
equipment_group.company_id,
movement.dest_stop_id, stop.location_id,
stop.*
FROM trailer
LEFT OUTER JOIN company ON (company.id = trailer.company_id)
LEFT OUTER JOIN equipment_group ON (equipment_group.id =
trailer.currenteqpgrpid)
LEFT OUTER JOIN movement ON (movement.id =
equipment_group.currentmovement_id)
LEFT OUTER JOIN stop ON (stop.id = movement.dest_stop_id)
WHERE trailer.is_active = 'A'
Using MAX and GROUP BY gives error "invalid in the select list... not contained in...aggregate function"
Well, I never did end up figuring that out, but once I joined movement to equipment_group on two conditions, all was well. The extra records were being created once per company id, and company id is in EVERY table.

What's the most efficient way to exclude possible results from an SQL query?

I have a subscription database containing Customers, Subscriptions and Publications tables.
The Subscriptions table contains ALL subscription records and each record has three flags to mark the status: isActive, isExpired and isPending. These are Booleans and only one flag can be True at a time; this is handled by the application.
I need to identify all customers who have not renewed any magazines to which they have previously subscribed and I'm not sure that I've written the most efficient SQL query. If I find a lapsed subscription I need to ignore it if they already have an active or pending subscription for that particular magazine.
Here's what I have:
SELECT DISTINCT Customers.id, Subscriptions.publicationName
FROM Subscriptions
LEFT JOIN Customers
ON Subscriptions.id_Customer = Customers.id
LEFT JOIN Publications
ON Subscriptions.id_Publication = Publications.id
WHERE Subscriptions.isExpired = 1
AND NOT EXISTS
( SELECT * FROM Subscriptions s2
WHERE s2.id_Publication = Subscriptions.id_Publication
AND s2.id_Customer = Subscriptions.id_Customer
AND s2.isPending = 1 )
AND NOT EXISTS
( SELECT * FROM Subscriptions s3
WHERE s3.id_Publication = Subscriptions.id_Publication
AND s3.id_Customer = Subscriptions.id_Customer
AND s3.isActive = 1 )
I have just over 50,000 subscription records and this query takes almost an hour to run which tells me that there's a lot of looping or something going on where for each record the SQL engine is having to search again to find any 'isPending' and 'isActive' records.
This is my first post so please be gentle if I've missed out any information in my question :) Thanks.
I don't have your complete database structure, so I can't test the following query, but it may run somewhat faster. I will leave it to you to test, and I'll explain what I changed and why.
SELECT DISTINCT Customers.id, Subscriptions.publicationName
FROM Subscriptions
JOIN Customers ON Subscriptions.id_Customer = Customers.id
JOIN Publications
ON Subscriptions.id_Publication = Publications.id
WHERE Subscriptions.isExpired = 1
AND NOT EXISTS
(SELECT * FROM Subscriptions s2
JOIN Customers ON s2.id_Customer = Customers.id
JOIN Publications
ON s2.id_Publication = Publications.id
-- correlate the subquery with the outer row (same customer, same publication)
WHERE s2.id_Customer = Subscriptions.id_Customer
AND s2.id_Publication = Subscriptions.id_Publication
AND (s2.isPending = 1 OR s2.isActive = 1))
If a subscription has no matching row in Customers or Publications, the subscription information isn't useful, so I replaced the LEFT joins with plain joins. I also combined the two NOT EXISTS subqueries into one. These are pretty intensive, if I recall, so the fewer the better. One last thing, which I did not include above but which may be worth looking into: can you return only specific fields from the subquery used in the EXISTS clause? Using SELECT * returns all fields, which may slow down processing. I'm not sure whether that makes a difference here, unfortunately, because I don't have an equivalent DB available to test on (the google probably knows).
I suspect there are further optimizations that could be made on this query. Eliminating the Exists clause in favor of an 'IN' clause may help, but I can't think of a way right now, seeing how you've got to match two unique fields (customer id and the relevant subscription). Let me know if this helps at all.
With a table of 50k rows, you should be able to run a query like this in seconds.
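To make the combined NOT EXISTS concrete, here is a small sketch in Python's sqlite3 (not SQL Server). It is stripped down to the Subscriptions table only, the rows are invented, and the index name ix_sub_cust_pub is hypothetical; whether an index on those two columns is what the original query was missing is an assumption, not something the question confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Subscriptions (
        id_Customer INTEGER, id_Publication INTEGER,
        isActive INTEGER, isExpired INTEGER, isPending INTEGER);
    -- hypothetical index on the columns the subquery correlates on
    CREATE INDEX ix_sub_cust_pub ON Subscriptions (id_Customer, id_Publication);
    INSERT INTO Subscriptions VALUES
        (1, 100, 0, 1, 0),  -- customer 1 let publication 100 lapse ...
        (1, 100, 1, 0, 0),  -- ... but has an active renewal
        (1, 200, 0, 1, 0),  -- customer 1 let publication 200 lapse for good
        (2, 100, 0, 1, 0),  -- customer 2 let publication 100 lapse for good
        (3, 200, 0, 1, 0),  -- customer 3 lapsed ...
        (3, 200, 0, 0, 1);  -- ... with a renewal pending
""")

lapsed = conn.execute("""
    SELECT DISTINCT s.id_Customer, s.id_Publication
    FROM Subscriptions s
    WHERE s.isExpired = 1
      AND NOT EXISTS (
          SELECT 1 FROM Subscriptions s2
          WHERE s2.id_Customer = s.id_Customer
            AND s2.id_Publication = s.id_Publication
            AND (s2.isPending = 1 OR s2.isActive = 1))
    ORDER BY s.id_Customer, s.id_Publication
""").fetchall()

print(lapsed)
```

Only the customer/publication pairs with an expired row and no active or pending row for the same pair come back; the key point is that the subquery is correlated on both the customer and the publication.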

How to get the records with count zero if there are no records

I have three tables like
I want to display the leave types with the count. For that I have written a query like
SELECT VM.vacation_id,
VM.vacation_desc,
isnull(sum(VR.total_hours_applied),0) AS totalCount
FROM EMPTYPE_VACATIONCONFIG VC
LEFT JOIN HR_Vacation_Master VM ON VC.VACATIONID=VM.vacation_id
INNER JOIN HR_Employee_Vacation_Request VR ON VR.vacation_id=VM.vacation_id
WHERE VR.employee_id=156
AND VC.BRANCHID=20
GROUP BY VM.vacation_desc,
VM.vacation_id
My query works, but it only returns the vacation ids that already have requests.
I want the third leave type as well, with a total of zero.
If the employee has not applied for a leave type (in the second table), that record does not appear in the list. I want that record too, with totalCount zero.
How can I do this?
This happens because the condition VR.employee_id=156 in the WHERE clause (together with the INNER JOIN) rejects the rows where VR is NULL.
You can do this instead:
SELECT VM.vacation_id,
VM.vacation_desc,
isnull(sum(VR.total_hours_applied),0) AS totalCount
FROM EMPTYPE_VACATIONCONFIG VC
LEFT JOIN HR_Vacation_Master VM ON VC.VACATIONID=VM.vacation_id
LEFT JOIN HR_Employee_Vacation_Request VR
ON VR.vacation_id=VM.vacation_id AND VR.employee_id=156
WHERE VC.BRANCHID=20
GROUP BY VM.vacation_desc,
VM.vacation_id
Leave me a comment if this doesn't work; I have some other ideas.
If an employee hasn't applied for any leave, there won't be a record with his or her employee id in HR_Employee_Vacation_Request, right? So I think you should LEFT OUTER JOIN HR_Vacation_Master to HR_Employee_Vacation_Request.
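Here is a rough way to see the effect, sketched with Python's sqlite3 instead of SQL Server. It leaves out EMPTYPE_VACATIONCONFIG for brevity, uses invented rows, and swaps isnull for COALESCE (which SQL Server also accepts). The only thing that changes between the two runs is whether the employee filter sits in the WHERE clause or in the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HR_Vacation_Master (vacation_id INTEGER, vacation_desc TEXT);
    CREATE TABLE HR_Employee_Vacation_Request
        (employee_id INTEGER, vacation_id INTEGER, total_hours_applied INTEGER);
    INSERT INTO HR_Vacation_Master VALUES (1, 'Casual'), (2, 'Sick'), (3, 'Earned');
    -- employee 156 never applied for vacation 3; only employee 200 did
    INSERT INTO HR_Employee_Vacation_Request VALUES
        (156, 1, 8), (156, 1, 4), (156, 2, 16), (200, 3, 8);
""")

template = """
    SELECT VM.vacation_id, VM.vacation_desc,
           COALESCE(SUM(VR.total_hours_applied), 0) AS totalCount
    FROM HR_Vacation_Master VM
    LEFT JOIN HR_Employee_Vacation_Request VR
           ON VR.vacation_id = VM.vacation_id {placement} VR.employee_id = 156
    GROUP BY VM.vacation_id, VM.vacation_desc
    ORDER BY VM.vacation_id
"""

# Employee filter in WHERE: vacation types without a request from 156 are dropped.
filtered_in_where = conn.execute(template.format(placement="WHERE")).fetchall()

# Employee filter in ON: every vacation type is kept, with 0 where nothing matched.
filtered_in_on = conn.execute(template.format(placement="AND")).fetchall()

print(filtered_in_where)
print(filtered_in_on)
```

With the filter in the ON clause, the 'Earned' row survives with a total of 0 instead of vanishing.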

Full outer join joining together every record multiple times

Query below:
select
cu.course_id as 'bb_course_id',
cu.user_id as 'bb_user_id',
cu.role as 'bb_role',
cu.available_ind as 'bb_available_ind',
CASE cu.row_status WHEN 0 THEN 'ENABLED' ELSE 'DISABLED' END AS 'bb_row_status',
eff.course_id as 'registrar_course_id',
eff.user_id as 'registrar_user_id',
eff.role as 'registrar_role',
eff.available_ind as 'registrar_available_ind',
CASE eff.row_status WHEN 'DISABLE' THEN 'DISABLED' END as 'registrar_row_status'
into enrollments_comparison_temp
from narrowed_users_enrollments cu
full outer join enrollments_feed_file eff on cu.course_id = eff.course_id
Quick background: I'm taking the data from a replicated table and selecting it into narrowed_users_enrollments based on some criteria. In a script I'm taking a text feed file, with enrollment data, and inserting it into enrollments_feed_file. The purpose is to compare the most recent enrollment data with enrollments already in the database.
However the issue is that joining these tables results in about 160,000 rows when I'm really only expecting about 22,000. The point of doing this comparison is so that I can look for nulled values on either side of the join. For example, if the table on the right contains a null, then disable the enrollment record. If the table on the left contains a null, then add this student's enrollment.
I know it's a little off because I'm not using PKs or FKs. This is what is selected into the table:
Here's a screenshot showing a select * from the enrollments table on the left and a feed file on the right.
http://i.imgur.com/0ZPZ9HS.png
Here's a screenshot showing the newly created table from the full outer join.
http://i.imgur.com/89ssAkS.png
As you can see, even though there's only one matching enrollment (the matching jmartinez12 columns), 4 extra rows are created for the same record on the left, one for each of the other enrollments on the right. What I'm trying to get is 5 rows, with the first being how it is in the screenshot (matching pre-existing enrollment and enrollment in the feed file), BUT in the next 4 rows the bb_* columns should be NULL up to the registrar_course_id.
Am I overlooking something simple here? I've tried a select distinct and I've added a where clause specifying when the course_ids are equal however that ensures that I won't get null rows which I need. I have also joined the tables on the user_id however the results are still the same.
One quick suggestion is to add the DISTINCT clause. If the records you are getting are complete duplicates, that may cut it down to what you are expecting.
The fix was to also join on:
ON cu.course_id = eff.course_id AND cu.user_id = eff.user_id
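The row multiplication can be reproduced with a few made-up rows. The sketch below uses Python's sqlite3 and, as a stand-in, a LEFT JOIN from the feed file side rather than a FULL OUTER JOIN (older SQLite builds don't support FULL OUTER JOIN); the user ids other than jmartinez12 are invented. The effect of the join condition on the matching is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE narrowed_users_enrollments (course_id TEXT, user_id TEXT);
    CREATE TABLE enrollments_feed_file (course_id TEXT, user_id TEXT);
    -- one existing enrollment, five enrollments in the feed for the same course
    INSERT INTO narrowed_users_enrollments VALUES ('C1', 'jmartinez12');
    INSERT INTO enrollments_feed_file VALUES
        ('C1', 'jmartinez12'), ('C1', 'u2'), ('C1', 'u3'), ('C1', 'u4'), ('C1', 'u5');
""")

query = """
    SELECT cu.user_id AS bb_user_id, eff.user_id AS registrar_user_id
    FROM enrollments_feed_file eff
    LEFT JOIN narrowed_users_enrollments cu ON {condition}
    ORDER BY eff.user_id
"""

# Joining on course_id alone pairs the one existing enrollment with every feed row.
course_only = conn.execute(
    query.format(condition="cu.course_id = eff.course_id")
).fetchall()

# Joining on course_id AND user_id pairs it only with its own feed row.
both_keys = conn.execute(
    query.format(condition="cu.course_id = eff.course_id AND cu.user_id = eff.user_id")
).fetchall()

print(course_only)
print(both_keys)
```

With course_id alone, jmartinez12 shows up on the left of all five rows; with both keys, the four unmatched feed rows get NULL on the left, which is the shape the question was after.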

Left Join with all rows from the left not matching the Where Clause

I have the following problem:
I have an Account table and an Entries table that holds the entries for each account.
Account: account_id, account_name
Entries: entry_id, account_idfk, entry_date, entry_amount
Now I want to query all entries for all accounts in a given period. Eg. I want all Entries for all accounts from October 2008 - October 2009. If there are no entries for this account at all, or there are only entries in other timeperiods for this account, I want the account returned as well.
My current query works, if there are no entries at all, or there are entries for this timeperiod for this account. However - it leaves out the Accounts which have only entries for other timeperiods.
SELECT * FROM Account a
LEFT JOIN Entries e ON e.account_idfk = a.account_id
WHERE e.entry_date BETWEEN '2009-08-13' AND '2009-08-13'
OR e.entry_date IS NULL
I know that the problem is in the where clause - I eliminate all Accounts for which only entries in other time periods exist.
But I have no idea how to restate the query to get the desired result...
Thanks,
Martin
Move that condition to the join:
SELECT
*
FROM
Account a
LEFT JOIN Entries e ON
e.account_idfk = a.account_id
AND e.entry_date BETWEEN '2009-08-13' AND '2009-08-13'
What you see here is the difference between a join and a where condition. The join will only join rows that meet that condition. However, with a left join, you still return all the rows in the left table. With the where clause, you're filtering rows after the join. In this case, you only want to join entries where the date is 8/13/09 (or 13/8/09, for those across the pond), but you want to return all accounts. Therefore, the condition needs to go into the join clause, and not the where.
People often get this wrong with outer joins, because with an inner join the result is the same no matter whether the condition is in the join or in the where clause. However, this does not mean that the two are equivalent, as you demonstrated today!
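For anyone who wants to see the three cases side by side, here is a small sketch with Python's sqlite3 (not the asker's database). The rows and the date range are made up: account 1 has no entries, account 2 has an entry inside the period, and account 3 only has an entry outside it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Account (account_id INTEGER, account_name TEXT);
    CREATE TABLE Entries (entry_id INTEGER, account_idfk INTEGER,
                          entry_date TEXT, entry_amount REAL);
    INSERT INTO Account VALUES (1, 'no entries'), (2, 'in period'), (3, 'other period only');
    INSERT INTO Entries VALUES (1, 2, '2009-08-13', 50.0), (2, 3, '2007-01-01', 75.0);
""")

# Date filter in WHERE, with the "OR ... IS NULL" workaround from the question.
where_rows = conn.execute("""
    SELECT a.account_id, e.entry_id
    FROM Account a
    LEFT JOIN Entries e ON e.account_idfk = a.account_id
    WHERE e.entry_date BETWEEN '2008-10-01' AND '2009-10-31'
       OR e.entry_date IS NULL
    ORDER BY a.account_id
""").fetchall()

# Date filter moved into the join condition.
on_rows = conn.execute("""
    SELECT a.account_id, e.entry_id
    FROM Account a
    LEFT JOIN Entries e ON e.account_idfk = a.account_id
                       AND e.entry_date BETWEEN '2008-10-01' AND '2009-10-31'
    ORDER BY a.account_id
""").fetchall()

print(where_rows)
print(on_rows)
```

Account 3 does join to a row, so its entry_date is not NULL and the WHERE version throws it away; with the condition in the join, the out-of-period entry simply doesn't match and the account comes back with NULLs.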