Record with latest date, where date comes from a joined table - sql

I have tried every answer I could find on selecting the latest record, and none of them has given me a working result. I currently have a query that lists active trailers. I need it to show a single row for each trailer, where that row is chosen by a date in a joined table.
I have the tables
trailer, company, equipment_group, movement, stop
To connect trailer to stop (which is where the date is), I have to join it to equipment_group, which joins to movement, which then joins to stop.
I have tried using MAX and GROUP BY, and PARTITION BY, both of which error out.
I have tried many solutions here, as well as these
https://thoughtbot.com/blog/ordering-within-a-sql-group-by-clause
https://www.geeksengine.com/article/get-single-record-from-duplicates.html
It seems that all of these solutions have the date in the same table as the thing that they want to group by, which I do not.
SELECT
    trailer.*,
    company.name,
    equipment_group.currentmovement_id,
    equipment_group.company_id,
    movement.dest_stop_id,
    stop.location_id,
    stop.*
FROM trailer
LEFT OUTER JOIN company ON (company.id = trailer.company_id)
LEFT OUTER JOIN equipment_group ON (equipment_group.id = trailer.currenteqpgrpid)
LEFT OUTER JOIN movement ON (movement.id = equipment_group.currentmovement_id)
LEFT OUTER JOIN stop ON (stop.id = movement.dest_stop_id)
WHERE trailer.is_active = 'A'
Using MAX and GROUP BY gives error "invalid in the select list... not contained in...aggregate function"

Well, I never did end up figuring that out, but once I joined movement to equipment_group on two conditions instead of one, all was well. Each extra row came from a different company id; company id is in EVERY table.
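For anyone hitting the same thing, here is a minimal sketch of why the second join condition helps. It runs the SQL through Python's sqlite3 module so it can be tried locally. The column layout, the sample rows, and the guess that the second condition is company_id are all assumptions for illustration, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, cut-down versions of the tables from the question.
CREATE TABLE equipment_group (id INTEGER, company_id TEXT, currentmovement_id INTEGER);
CREATE TABLE movement (id INTEGER, company_id TEXT, dest_stop_id INTEGER);
INSERT INTO equipment_group VALUES (1, 'A', 10);
-- Assumed data: two companies reuse the same movement id.
INSERT INTO movement VALUES (10, 'A', 100), (10, 'B', 200);
""")

# Joining on the id alone matches one movement row per company.
one_condition = conn.execute("""
SELECT eg.id, m.company_id, m.dest_stop_id
FROM equipment_group AS eg
LEFT OUTER JOIN movement AS m
  ON m.id = eg.currentmovement_id
ORDER BY m.company_id
""").fetchall()

# Adding company_id to the join keeps only the matching company's row.
two_conditions = conn.execute("""
SELECT eg.id, m.company_id, m.dest_stop_id
FROM equipment_group AS eg
LEFT OUTER JOIN movement AS m
  ON m.id = eg.currentmovement_id
 AND m.company_id = eg.company_id
""").fetchall()

print(len(one_condition), len(two_conditions))
```

If ids are only unique within a company, every join in the chain probably needs the same extra condition.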

Related

Why do I get extra rows in LEFT JOIN when joining to an ID and TIMESTAMP column?

I have a table that contains multiple registration periods (date and time for the start of the registration, as well as date and time for when that instance of registration ends). For each row (registration period), there is a status column that contains the status at the end of the registration period. I was trying to get the status associated with the most recent registration end date for a given ID. I've used a window function to get the most recent end date of interest per ID, and then I wanted to LEFT JOIN on ID and end date to get the status from the same table on which I used the window function. There should really be just one combination of end date and status per ID, but somehow I get more rows than what's in the left table.
Like I mentioned earlier, my approach was to use a window function to get MAX(end_date) per ID and some other column, let's call it enrollment_number. Then use LEFT JOIN on this table and its parent table to bring in status associated with that date only. Later, I'd like to use the result of this join to bring in the status associated with the end date into other tables where I need it.
WITH my_first_test AS
(
    SELECT my_id,
           enrollment_number,
           MAX(end_date_of_enrollment) OVER (PARTITION BY my_id, enrollment_number) AS end_date_enrolled
    FROM enrollments
)
SELECT mft.my_id, mft.end_date_enrolled, e.status
FROM my_first_test AS mft
LEFT JOIN enrollments AS e
    ON mft.my_id = e.my_id AND mft.end_date_enrolled = e.end_date_enrolled;
The CTE returns 42917 rows, same number of rows as in the enrollments table, which it should be if I understand it correctly.
Then, I LEFT JOIN enrollments, to bring in information from the status column also contained in the enrollments table. The LEFT JOIN is done on my_id and end_date_enrolled.
I expect 42917 rows in the resulting table, because my_id and end_date_enrolled together should be unique. However, I get slightly more rows in my final table - 44408. I was wondering if the StackOverflow community would be able to help me solve this mystery. I am using SQL in AWS Redshift.
You have duplicates in enrollments. You can find them with aggregation:
SELECT my_id, end_date_enrolled, COUNT(*)
FROM enrollments AS e
GROUP BY my_id, end_date_enrolled
HAVING COUNT(*) > 1;
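To see how such duplicates inflate the row count, here is a small sketch run through Python's sqlite3 module (it needs an SQLite build with window function support, 3.25 or later). The sample rows and the CTE name latest are made up, and the table is assumed to store the end date in a column called end_date_enrolled, as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enrollments (my_id INTEGER, enrollment_number INTEGER,
                          end_date_enrolled TEXT, status TEXT);
INSERT INTO enrollments VALUES
  (1, 1, '2020-01-31', 'active'),
  (1, 1, '2020-01-31', 'closed'),  -- same id and end date as the row above
  (2, 1, '2020-02-29', 'active');
""")

total_rows = conn.execute("SELECT COUNT(*) FROM enrollments").fetchone()[0]

# The window function keeps one row per input row, so the CTE has 3 rows.
# Each of the two duplicate rows then matches both duplicates in the join.
joined_rows = conn.execute("""
WITH latest AS (
  SELECT my_id,
         enrollment_number,
         MAX(end_date_enrolled) OVER (PARTITION BY my_id, enrollment_number) AS latest_end
  FROM enrollments
)
SELECT COUNT(*)
FROM latest AS l
LEFT JOIN enrollments AS e
  ON l.my_id = e.my_id AND l.latest_end = e.end_date_enrolled
""").fetchone()[0]

# The aggregation from the answer pinpoints the offending key.
duplicates = conn.execute("""
SELECT my_id, end_date_enrolled, COUNT(*)
FROM enrollments
GROUP BY my_id, end_date_enrolled
HAVING COUNT(*) > 1
""").fetchall()

print(total_rows, joined_rows, duplicates)
```

With this data the table has 3 rows but the join returns 5, because a key that appears twice on both sides of the join produces 2 x 2 rows.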

Select distinct record with join count records

I have two tables: Company and Contact, with a relationship of one-to-many.
I have another table Track which identifies some of the companies as parent companies to other companies.
I want to write a SQL query that selects the parent companies from Track and the number of contacts that each parent has.
SELECT Track.ParentId, Count(Contact.companyId)
FROM Track
INNER JOIN Contact
ON Track.ParentId = Contact.companyId
GROUP BY Track.ParentId
However, the result holds fewer records than when I run the following query:
SELECT DISTINCT Track.ParentId
FROM Track
I tried the first query with an added DISTINCT and it returned the same results (fewer than it was meant to).
You're performing an INNER JOIN with the Contact table, which means that any rows from the first table (Track in this case) with no matches to the JOINed table will not show up in your results. Try using a LEFT OUTER JOIN instead.
The COUNT with Contact.companyId will only count rows where there is a match (Contact.companyId is not NULL). Since you're counting contacts that's fine as they will count as 0. If you were trying to count some other set of data and tried to do a COUNT on a specific column (rather than COUNT(*)) then any NULL values in that column would not count towards your total, which might or might not be what you want.
I used an INNER JOIN, which returns only the records that have a match in both tables.
To return all records from the Track table, along with any matching records from the Contact table, I need to use LEFT JOIN.
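A quick way to check this is a sketch like the one below, which runs both joins through Python's sqlite3 module. The two-parent sample data and the Contact.id column are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Track (ParentId INTEGER);
CREATE TABLE Contact (id INTEGER, companyId INTEGER);
INSERT INTO Track VALUES (1), (2);
INSERT INTO Contact VALUES (10, 1), (11, 1);  -- parent 2 has no contacts
""")

# INNER JOIN: parents without contacts disappear from the result.
inner_rows = conn.execute("""
SELECT Track.ParentId, COUNT(Contact.companyId)
FROM Track
INNER JOIN Contact ON Track.ParentId = Contact.companyId
GROUP BY Track.ParentId
ORDER BY Track.ParentId
""").fetchall()

# LEFT JOIN: every parent is kept; COUNT(column) skips the NULLs, giving 0.
left_rows = conn.execute("""
SELECT Track.ParentId, COUNT(Contact.companyId)
FROM Track
LEFT OUTER JOIN Contact ON Track.ParentId = Contact.companyId
GROUP BY Track.ParentId
ORDER BY Track.ParentId
""").fetchall()

print(inner_rows)
print(left_rows)
```

Note that COUNT(*) in the LEFT JOIN version would report 1 for parent 2, since the unmatched parent still produces one row.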

How to get the records with count zero if there are no records

I have three tables like
I want to display the leave types with the count. For that I have written a query like
SELECT VM.vacation_id,
VM.vacation_desc,
isnull(sum(VR.total_hours_applied),0) AS totalCount
FROM EMPTYPE_VACATIONCONFIG VC
LEFT JOIN HR_Vacation_Master VM ON VC.VACATIONID=VM.vacation_id
INNER JOIN HR_Employee_Vacation_Request VR ON VR.vacation_id=VM.vacation_id
WHERE VR.employee_id=156
AND VC.BRANCHID=20
GROUP BY VM.vacation_desc,
VM.vacation_id
My query works, but it returns only the vacation IDs that already have requests.
I want the third leave type as well, with a total of zero.
If the employee has not applied for a leave type (in the second table), that record does not appear in the list. I want that record too, with a totalCount of zero.
How can I do it?
This is because the condition VR.employee_id=156 in the WHERE clause rejects the rows where VR is NULL.
You can do this instead:
SELECT VM.vacation_id,
VM.vacation_desc,
isnull(sum(VR.total_hours_applied),0) AS totalCount
FROM EMPTYPE_VACATIONCONFIG VC
LEFT JOIN HR_Vacation_Master VM ON VC.VACATIONID=VM.vacation_id
LEFT JOIN HR_Employee_Vacation_Request VR
ON VR.vacation_id=VM.vacation_id AND VR.employee_id=156
WHERE VC.BRANCHID=20
GROUP BY VM.vacation_desc,
VM.vacation_id
Leave me a comment if this does not work; I have some other ideas.
If an employee didn't apply for any leave, there shouldn't be a record with his (or her) employee id in table HR_Employee_Vacation_Request, right? So I think you should LEFT OUTER JOIN HR_Vacation_Master to HR_Employee_Vacation_Request.
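Here is a rough sketch comparing the two queries, run through Python's sqlite3 module. The leave names and hours are invented sample data, and COALESCE stands in for SQL Server's ISNULL because SQLite has no two-argument ISNULL function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPTYPE_VACATIONCONFIG (VACATIONID INTEGER, BRANCHID INTEGER);
CREATE TABLE HR_Vacation_Master (vacation_id INTEGER, vacation_desc TEXT);
CREATE TABLE HR_Employee_Vacation_Request
  (vacation_id INTEGER, employee_id INTEGER, total_hours_applied INTEGER);
-- Assumed sample data: employee 156 never requested leave type 3.
INSERT INTO EMPTYPE_VACATIONCONFIG VALUES (1, 20), (2, 20), (3, 20);
INSERT INTO HR_Vacation_Master VALUES (1, 'Annual'), (2, 'Sick'), (3, 'Casual');
INSERT INTO HR_Employee_Vacation_Request VALUES (1, 156, 8), (2, 156, 4), (3, 999, 16);
""")

# Original shape: the employee filter in WHERE drops leave type 3.
where_filter = conn.execute("""
SELECT VM.vacation_id, VM.vacation_desc,
       COALESCE(SUM(VR.total_hours_applied), 0) AS totalCount
FROM EMPTYPE_VACATIONCONFIG VC
LEFT JOIN HR_Vacation_Master VM ON VC.VACATIONID = VM.vacation_id
INNER JOIN HR_Employee_Vacation_Request VR ON VR.vacation_id = VM.vacation_id
WHERE VR.employee_id = 156 AND VC.BRANCHID = 20
GROUP BY VM.vacation_desc, VM.vacation_id
ORDER BY VM.vacation_id
""").fetchall()

# Suggested shape: the employee filter lives in the LEFT JOIN's ON clause.
on_filter = conn.execute("""
SELECT VM.vacation_id, VM.vacation_desc,
       COALESCE(SUM(VR.total_hours_applied), 0) AS totalCount
FROM EMPTYPE_VACATIONCONFIG VC
LEFT JOIN HR_Vacation_Master VM ON VC.VACATIONID = VM.vacation_id
LEFT JOIN HR_Employee_Vacation_Request VR
  ON VR.vacation_id = VM.vacation_id AND VR.employee_id = 156
WHERE VC.BRANCHID = 20
GROUP BY VM.vacation_desc, VM.vacation_id
ORDER BY VM.vacation_id
""").fetchall()

print(where_filter)
print(on_filter)
```

The BRANCHID filter can stay in the WHERE clause because it applies to the leftmost table, whose rows are never NULL-padded.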

sql server LIMIT with where but include at least once

I have a simple query like this:
SELECT subs.id_number,
op.dtupdated as dtopened
FROM subscribers AS subs
LEFT OUTER JOIN messages AS msg
ON msg.id_number = subs.id_number
LEFT JOIN email_opens AS op
ON op.message_id = msg.message_id
WHERE op.dtupdated > dateadd(month,-3,CURRENT_TIMESTAMP)
Basically I'm trying to get all the records in a table that tracks when an email was opened (email_opens), which is associated with subscribers by the id_number field. I want to get all email opens within the past 3 months and the associated id_number, but I also want every id_number in the subscribers table to appear at least once.
The problem is that my WHERE clause eliminates all subscribers who had no email opens in the past 3 months, but I want to include one record with id_number and "NULL" as dtopened for subscribers who haven't any opens.
I tried left outer join and union (which works, but then I have duplicates), and I just can't seem to find out how to do this. I'm sure there has to be a simple way and I just haven't had enough coffee yet.
Checking op.dtupdated in the WHERE clause means the records that don't match get filtered out. If you do that in your ON clause of the LEFT OUTER JOIN instead, you'll get at least one row for each record on subscribers:
SELECT subs.id_number
,op.dtupdated AS dtopened
FROM subscribers AS subs
LEFT OUTER JOIN messages AS msg ON msg.id_number = subs.id_number
LEFT OUTER JOIN email_opens AS op ON op.message_id = msg.message_id
AND op.dtupdated > dateadd(month, - 3, CURRENT_TIMESTAMP)

Left Join with all rows from the left not matching the Where Clause

I have the following problem:
I have an Account table and an Entries table (entries for an account).
Account: account_id, account_name
Entries: entry_id, account_idfk, entry_date, entry_amount
Now I want to query all entries for all accounts in a given period. E.g. I want all entries for all accounts from October 2008 - October 2009. If there are no entries for an account at all, or there are only entries in other time periods for that account, I want the account returned as well.
My current query works if there are no entries at all, or if there are entries in this time period for the account. However, it leaves out the accounts which only have entries in other time periods.
SELECT * FROM Account a
LEFT JOIN Entries e ON e.account_idfk = a.account_id
WHERE e.entry_date BETWEEN '2009-08-13' AND '2009-08-13'
OR e.entry_date IS NULL
I know that the problem is in the where clause - I eliminate all Accounts for which only entries in other time periods exist.
But I have no idea how to restate the query to get the desired result...
Thanks,
Martin
Move that condition to the join:
SELECT
*
FROM
Account a
LEFT JOIN Entries e ON
e.account_idfk = a.account_id
AND e.entry_date BETWEEN '2009-08-13' AND '2009-08-13'
What you see here is the difference between a join and a where condition. The join will only join rows that meet that condition. However, with a left join, you still return all the rows in the left table. With the where clause, you're filtering rows after the join. In this case, you only want to join entries where the date is 8/13/09 (or 13/8/09, for those across the pond), but you want to return all accounts. Therefore, the condition needs to go into the join clause, and not the where.
This is a common source of confusion with outer joins, because with an inner join the result is the same whether the condition is in the join or the where clause. However, this does not mean that the two are equivalent, as demonstrated by you today!
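A small sketch of that difference, run through Python's sqlite3 module: the three accounts and two entries are invented sample data, and the run helper is only there to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (account_id INTEGER, account_name TEXT);
CREATE TABLE Entries (entry_id INTEGER, account_idfk INTEGER,
                      entry_date TEXT, entry_amount REAL);
-- Assumed data: Cash has an entry in range, Bank only outside it, Empty has none.
INSERT INTO Account VALUES (1, 'Cash'), (2, 'Bank'), (3, 'Empty');
INSERT INTO Entries VALUES (1, 1, '2009-08-13', 50.0), (2, 2, '2008-01-01', 75.0);
""")

def run(sql):
    """Hypothetical helper: run a query and return all rows."""
    return conn.execute(sql).fetchall()

select = "SELECT a.account_name, e.entry_amount FROM Account a "
in_range = "e.entry_date BETWEEN '2009-08-13' AND '2009-08-13'"

# LEFT JOIN, date filter in WHERE: Bank's only entry is out of range and
# not NULL, so Bank is filtered out after the join.
left_where = run(select + "LEFT JOIN Entries e ON e.account_idfk = a.account_id "
                 "WHERE " + in_range + " OR e.entry_date IS NULL ORDER BY a.account_id")

# LEFT JOIN, date filter in ON: every account survives, NULL-padded if needed.
left_on = run(select + "LEFT JOIN Entries e ON e.account_idfk = a.account_id "
              "AND " + in_range + " ORDER BY a.account_id")

# INNER JOIN: the placement of the filter makes no difference to the result.
inner_where = run(select + "INNER JOIN Entries e ON e.account_idfk = a.account_id "
                  "WHERE " + in_range)
inner_on = run(select + "INNER JOIN Entries e ON e.account_idfk = a.account_id "
               "AND " + in_range)

print(left_where)
print(left_on)
print(inner_where == inner_on)
```

Only the LEFT JOIN with the filter in the ON clause returns all three accounts, which is the behavior asked for in the question.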