Sql inner join only with last row in second table - sql

I have two tables: leads and tracking_leads.
Table structure is as below,
---------------------------- ----------------------
| leads | | tracking_leads |
---------------------------- ----------------------
| id | | tracking_id |
| lead_id | | lead_id |
| anzahl_tickets | | field_name |
| bearbeitungs_id_einkauf | | date |
---------------------------- -----------------------
I need SQL that joins the leads table with the tracking_leads table but matches only the LAST row in tracking_leads for each lead.
Sql example:
SELECT DATE_FORMAT(tracking_leads.date, "%d.%m.%Y") as trackDate, SUM(l.anzahl_tickets)
as sumValue FROM leads as l INNER JOIN tracking_leads ON l.lead_id=tracking_leads.lead_id
WHERE bearbeitungs_id_einkauf <> '' AND tracking_leads.field_name='bearbeitungs_id_einkauf'
GROUP BY DATE_FORMAT(tracking_leads.date, "%d.%m.%Y")
In this part: INNER JOIN tracking_leads ON l.lead_id=tracking_leads.lead_id I need only the last record from the tracking_leads table.
For example, leads data:
id lead_id anzahl_tickets bearbeitungs_id_einkauf
1 20 2 100
tracking_leads data:
tracking_id lead_id field_name date
1 20 bearbeitungs_id_einkauf 2019-05-31 13:55
2 20 bearbeitungs_id_einkauf 2019-05-31 15:00
As a result I need to get:
2019-05-31 2
But currently I get:
2019-05-31 4
because lead_id is duplicated in tracking_leads (I need only the last record).
How can I solve this problem?
Thanks!

My preference would be to use an inline view to get the max dates.
A correlated subquery would be executed once for each row, while the inline view would only need to be executed once.
This should work:
SELECT DATE_FORMAT(tl.date, "%d.%m.%Y") as trackDate,
SUM(l.anzahl_tickets) as sumValue
FROM leads as l
INNER JOIN (
select x.lead_id, max(x.date) date from tracking_leads x where x.field_name = 'bearbeitungs_id_einkauf' group by x.lead_id
) tl ON l.lead_id=tl.lead_id
WHERE bearbeitungs_id_einkauf <> ''
GROUP BY DATE_FORMAT(tl.date, "%d.%m.%Y")
Side note: the test for an empty value of bearbeitungs_id_einkauf in the WHERE clause is database-specific, so watch out for issues there. In Oracle, for example, an empty string is treated as NULL, so you would have to test for IS NOT NULL. I'm assuming this is not Oracle.
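To check the inline-view approach against the sample rows from the question, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the asker's database, so date() replaces DATE_FORMAT; the variable names naive and latest_only are only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (id INTEGER, lead_id INTEGER, anzahl_tickets INTEGER,
                        bearbeitungs_id_einkauf TEXT);
    CREATE TABLE tracking_leads (tracking_id INTEGER, lead_id INTEGER,
                                 field_name TEXT, date TEXT);
    -- Sample rows copied from the question.
    INSERT INTO leads VALUES (1, 20, 2, '100');
    INSERT INTO tracking_leads VALUES
        (1, 20, 'bearbeitungs_id_einkauf', '2019-05-31 13:55'),
        (2, 20, 'bearbeitungs_id_einkauf', '2019-05-31 15:00');
""")

# Plain join: the lead matches both tracking rows, so its tickets count twice.
naive = conn.execute("""
    SELECT date(tl.date), SUM(l.anzahl_tickets)
    FROM leads l
    JOIN tracking_leads tl ON l.lead_id = tl.lead_id
    WHERE l.bearbeitungs_id_einkauf <> ''
      AND tl.field_name = 'bearbeitungs_id_einkauf'
    GROUP BY date(tl.date)
""").fetchall()

# Inline view: one row per lead_id carrying only its latest tracking date.
latest_only = conn.execute("""
    SELECT date(tl.max_date), SUM(l.anzahl_tickets)
    FROM leads l
    JOIN (SELECT lead_id, MAX(date) AS max_date
          FROM tracking_leads
          WHERE field_name = 'bearbeitungs_id_einkauf'
          GROUP BY lead_id) tl ON l.lead_id = tl.lead_id
    WHERE l.bearbeitungs_id_einkauf <> ''
    GROUP BY date(tl.max_date)
""").fetchall()

print(naive)        # [('2019-05-31', 4)]
print(latest_only)  # [('2019-05-31', 2)]
```

Because the inline view returns at most one row per lead_id, the join can no longer multiply the rows of leads.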

First, I don't like the date format DD.MM.YYYY, because you cannot sort by it. Just use YYYY-MM-DD.
Second, you can use a correlated subquery to get the most recent date:
SELECT DATE(tl.date) as trackDate, SUM(l.anzahl_tickets) as sumValue
FROM leads l INNER JOIN
tracking_leads tl
ON l.lead_id = tl.lead_id
WHERE l.bearbeitungs_id_einkauf <> '' AND
tl.field_name = 'bearbeitungs_id_einkauf' AND
tl.date = (SELECT MAX(tl2.date)
FROM tracking_leads tl2
WHERE tl2.lead_id = tl.lead_id AND
tl2.field_name = tl.field_name
)
GROUP BY DATE(tl.date);
Of course, you can leave your original date format if you prefer. If you do, you can use:
ORDER BY MIN(tl.date)
so the results are ordered by date.

Related

Creating user time report that includes zero hour weeks

I'm having a heck of a time putting together a query that I thought would be quite simple. I have a table that records total hours spent on a task and the user that reported those hours. I need to put together a query that returns how many hours a given user charged to each week of the year (including weeks where no hours were charged).
Expected Output:
|USER_ID | START_DATE | END_DATE | HOURS |
-------------------------------------------
|'JIM' | 4/28/2019 | 5/4/2019 | 6 |
|'JIM' | 5/5/2019 | 5/11/2019 | 0 |
|'JIM' | 5/12/2019 | 5/18/2019 | 16 |
I have a function that returns the start and end date of the week for each day, so I used that and joined it to the task table by date and summed up the hours. This gets me very close, but since I'm joining on date I obviously end up with NULL for the USER_ID on all zero hour rows.
Current Output:
|USER_ID | START_DATE | END_DATE | HOURS |
-------------------------------------------
|'JIM' | 4/28/2019 | 5/4/2019 | 6 |
| NULL | 5/5/2019 | 5/11/2019 | 0 |
|'JIM' | 5/12/2019 | 5/18/2019 | 16 |
I've tried a few other approaches, but each time I end up hitting the same problem. Any ideas?
Schema:
---------------------------------
| TASK_LOG |
---------------------------------
|USER_ID | DATE_ENTERED | HOURS |
-------------------------------
|'JIM' | 4/28/2019 | 6 |
|'JIM' | 5/12/2019 | 6 |
|'JIM' | 5/13/2019 | 10 |
------------------------------------
| DATE_HELPER_TABLE |
|(This is actually a function, but I|
| put it in a table to simplify) |
-------------------------------------
|DATE | START_OF_WEEK | END_OF_WEEK |
-------------------------------------
|5/3/2019 | 4/28/2019 | 5/4/2019 |
|5/4/2019 | 4/28/2019 | 5/4/2019 |
|5/5/2019 | 5/5/2019 | 5/11/2019 |
| ETC ... |
Query:
SELECT HRS.USER_ID
,DHT.START_OF_WEEK
,DHT.END_OF_WEEK
,SUM(HOURS)
FROM DATE_HELPER_TABLE DHT
LEFT JOIN (
SELECT TL.USER_ID
,TL.HOURS
,DHT2.START_OF_WEEK
,DHT2.END_OF_WEEK
FROM TASK_LOG TL
JOIN DATE_HELPER_TABLE DHT2 ON DHT2.DATE_VALUE = TL.DATE_ENTERED
WHERE TL.USER_ID = 'JIM1'
) HRS ON HRS.START_OF_WEEK = DHT.START_OF_WEEK
GROUP BY USER_ID
,DHT.START_OF_WEEK
,DHT.END_OF_WEEK
ORDER BY DHT.START_OF_WEEK
http://sqlfiddle.com/#!18/02d43/3 (note: for this sql fiddle, I converted my date helper function into a table to simplify)
Cross join the users (in question) and include them in the join condition. Use coalesce() to get 0 instead of NULL for the hours of weeks where no work was done.
SELECT u.user_id,
dht.start_of_week,
dht.end_of_week,
coalesce(sum(hrs.hours), 0)
FROM date_helper_table dht
CROSS JOIN (VALUES ('JIM1')) u (user_id)
LEFT JOIN (SELECT tl.user_id,
dht2.start_of_week,
tl.hours
FROM task_log tl
INNER JOIN date_helper_table dht2
ON dht2.date_value = tl.date_entered) hrs
ON hrs.user_id = u.user_id
AND hrs.start_of_week = dht.start_of_week
GROUP BY u.user_id,
dht.start_of_week,
dht.end_of_week
ORDER BY dht.start_of_week;
I used a VALUES clause here to list the users. If you only want to get the times for particular users you can do so too (or use any other subquery, or ...). Otherwise you can use your user table (which you didn't post, so I had to use that substitute).
However, the figures produced by this (and by your original query) look strange to me. In the fiddle your user has worked a total of 23 hours in the task_log table, yet the sums in the result are 24 and 80. That is way too much on its own, and even worse considering that 1 hour in task_log isn't even on a date listed in date_helper_table.
I suspect you get more accurate figures if you just join task_log, not that weird derived table.
SELECT u.user_id,
dht.start_of_week,
dht.end_of_week,
coalesce(sum(tl.hours), 0)
FROM date_helper_table dht
CROSS JOIN (VALUES ('JIM1')) u (user_id)
LEFT JOIN task_log tl
ON tl.user_id = u.user_id
AND tl.date_entered = dht.date_value
GROUP BY u.user_id,
dht.start_of_week,
dht.end_of_week
ORDER BY dht.start_of_week;
But maybe that's just me.
SQL Fiddle
http://sqlfiddle.com/#!18/02d43/65
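As a rough check of the cross-join idea, the sketch below rebuilds the question's TASK_LOG rows in SQLite via Python's sqlite3 module. Several things here are assumptions for illustration: ISO-formatted dates, a date_helper_table cut down to four days, the user 'JIM' from the schema listing, and a one-row SELECT in place of the VALUES clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task_log (user_id TEXT, date_entered TEXT, hours INTEGER);
    CREATE TABLE date_helper_table (date_value TEXT, start_of_week TEXT,
                                    end_of_week TEXT);
    INSERT INTO task_log VALUES
        ('JIM', '2019-04-28', 6),
        ('JIM', '2019-05-12', 6),
        ('JIM', '2019-05-13', 10);
    -- Hypothetical subset of the helper table: a few days from three weeks.
    INSERT INTO date_helper_table VALUES
        ('2019-04-28', '2019-04-28', '2019-05-04'),
        ('2019-05-05', '2019-05-05', '2019-05-11'),
        ('2019-05-12', '2019-05-12', '2019-05-18'),
        ('2019-05-13', '2019-05-12', '2019-05-18');
""")

rows = conn.execute("""
    SELECT u.user_id, dht.start_of_week, dht.end_of_week,
           COALESCE(SUM(tl.hours), 0) AS hours
    FROM date_helper_table dht
    CROSS JOIN (SELECT 'JIM' AS user_id) u   -- every day is paired with the user
    LEFT JOIN task_log tl
           ON tl.user_id = u.user_id
          AND tl.date_entered = dht.date_value
    GROUP BY u.user_id, dht.start_of_week, dht.end_of_week
    ORDER BY dht.start_of_week
""").fetchall()

for row in rows:
    print(row)
# ('JIM', '2019-04-28', '2019-05-04', 6)
# ('JIM', '2019-05-05', '2019-05-11', 0)
# ('JIM', '2019-05-12', '2019-05-18', 16)
```

The user_id comes from the cross-joined side rather than from task_log, which is why the zero-hour week still shows 'JIM' instead of NULL.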
Using your SQL fiddle, I simply updated the select statement to account for and convert null values. As far as I can tell, there is nothing in your post that makes this option not viable. Please let me know if this is not the case and I will update. (This is not intended to detract from sticky bit's answer, but to offer an alternative)
SELECT ISNULL(HRS.USER_ID, '') as [USER_ID]
,DHT.START_OF_WEEK
,DHT.END_OF_WEEK
,SUM(ISNULL(HOURS,0)) as [SUM]
FROM DATE_HELPER_TABLE DHT
LEFT JOIN (
SELECT TL.USER_ID
,TL.HOURS
,DHT2.START_OF_WEEK
,DHT2.END_OF_WEEK
FROM TASK_LOG TL
JOIN DATE_HELPER_TABLE DHT2 ON DHT2.DATE_VALUE = TL.DATE_ENTERED
WHERE TL.USER_ID = 'JIM1'
) HRS ON HRS.START_OF_WEEK = DHT.START_OF_WEEK
GROUP BY USER_ID
,DHT.START_OF_WEEK
,DHT.END_OF_WEEK
ORDER BY DHT.START_OF_WEEK
Create a dates table that holds every date for the next 100 years in the first column, with the week of the year, day of the month, etc. in the following columns.
Then select from that dates table and left join everything else to it. Use the ISNULL function to replace NULLs with zeros.

How to get data by max(date) group by column in SQL Server

In the table below, I need the lastservice value from the row with the latest date for each nopol.
Table data:
I tried to use the query below
SELECT
TGL, A.NOPOL, LASTSERVICE
FROM
TRREALPRWT AS A
INNER JOIN
(SELECT
MAX(TGLREAL) AS TGL, NOPOL
FROM
TRREALPRWT
GROUP BY
NOPOL) AS B ON B.NOPOL = A.NOPOL
GROUP BY
A.NOPOL, TGL,LASTSERVICE
but the results I get are not one row per nopol.
How should I change the query so that it produces data like the following?
| NOPOL | LASTSERVICE | TGLREAL |
| L9235VB | 224270 | 2018-01-26 00:00:00.000 |
| B9891JT | 219270 | 2018-02-28 00:00:00.000 |
Take all MAX(TGL) for all NOPOL in an inner query and join it again with your original table matching on NOPOL and MAX(TGL) values.
SELECT T.NOPOL, T.TGL, T.LASTSERVICE
FROM
YOUR_TABLE T
INNER JOIN
(SELECT NOPOL, MAX(TGL) AS MAX_TGL FROM YOUR_TABLE GROUP BY NOPOL) A
ON T.NOPOL = A.NOPOL AND T.TGL = A.MAX_TGL;
I believe this should work, assuming TGLREAL is your date.
SELECT A.TGLREAL, A.NOPOL, A.LASTSERVICE FROM TRREALPRWT A
WHERE A.TGLREAL = (SELECT MAX(B.TGLREAL) FROM TRREALPRWT B WHERE B.NOPOL = A.NOPOL)
The sample output in the question seems to indicate that you want the first tgl for each nopol, max(lastservice)
SELECT nopol,MIN(lastservice) lastservice, MAX(tgl) tglreal
FROM trreal
GROUP BY nopol
It seems you can use just a GROUP BY clause; there is no need for an inner query:
select nopol, min(LASTSERVICE) LASTSERVICE, max(TGL) TGLREAL
from TRREALPRWT
group by nopol
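One caveat with the pure GROUP BY variants: MIN and MAX are computed independently per column, so the values returned need not come from the same row. The sketch below illustrates this with SQLite through Python's sqlite3 module. The earlier row for L9235VB (2018-01-10, 214270) is invented for the demonstration; the other two rows are taken from the desired output in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trrealprwt (nopol TEXT, tglreal TEXT, lastservice INTEGER);
    INSERT INTO trrealprwt VALUES
        ('L9235VB', '2018-01-10', 214270),  -- hypothetical earlier service
        ('L9235VB', '2018-01-26', 224270),
        ('B9891JT', '2018-02-28', 219270);
""")

# Join back on the per-nopol maximum date: every value comes from one row.
joined = conn.execute("""
    SELECT t.nopol, t.lastservice, t.tglreal
    FROM trrealprwt t
    JOIN (SELECT nopol, MAX(tglreal) AS max_tgl
          FROM trrealprwt
          GROUP BY nopol) m
      ON t.nopol = m.nopol AND t.tglreal = m.max_tgl
    ORDER BY t.nopol
""").fetchall()

# Independent aggregates: the date of one row paired with the value of another.
aggregated = conn.execute("""
    SELECT nopol, MIN(lastservice), MAX(tglreal)
    FROM trrealprwt
    GROUP BY nopol
    ORDER BY nopol
""").fetchall()

print(joined)      # [('B9891JT', 219270, '2018-02-28'), ('L9235VB', 224270, '2018-01-26')]
print(aggregated)  # [('B9891JT', 219270, '2018-02-28'), ('L9235VB', 214270, '2018-01-26')]
```

The two results agree only when the aggregated columns happen to peak (or bottom out) on the same row, so the join is the safer choice if that is not guaranteed by the data.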

How to fill in empty date rows multiple times?

I am trying to fill in dates with empty data, so that my query returned has every date and does not skip any.
My application needs to count bookings for activities by date in a report, and I cannot have skipped dates in what is returned by my SQL
I am trying to use a date table (I have a table with every date from 1/1/2000 to 12/31/2030) to accomplish this by doing a RIGHT OUTER JOIN on this date table, which works when dealing with one set of activities. But I have multiple sets of activities, each needing their own full range of dates regardless if there were bookings on that date.
I also have a function (DateRange) I found that allows for this:
SELECT IndividualDate FROM DateRange('d', '11/01/2017', '11/10/2018')
Let me give an example of what I am getting and what I want to get:
BAD: Without empty date rows:
date | activity_id | bookings
-----------------------------
1/2 | 1 | 5
1/4 | 1 | 4
1/3 | 2 | 6
1/4 | 2 | 2
GOOD: With empty date rows:
date | activity_id | bookings
-----------------------------
1/2 | 1 | 5
1/3 | 1 | NULL
1/4 | 1 | 4
1/2 | 2 | NULL
1/3 | 2 | 6
1/4 | 2 | 2
I hope this makes sense. I get the whole point of joining to a table of just a list of dates OR using the DateRange table function. But neither get me the "GOOD" result above.
Use a cross join to generate the rows and then left join to fill in the values:
select d.date, a.activity_id, t.bookings
from DateRange('d', '2017-11-01', '2018-11-10') d cross join
(select distinct activity_id from t) a left join
t
on t.date = d.date and t.activity_id = a.activity_id;
It is a bit hard to follow what your data is and what comes from the function. But the idea is the same, wherever the data comes from.
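A small sketch of the same pattern on the question's sample data, run in SQLite through Python's sqlite3 module. The dates table stands in for the DateRange function, and the year 2018 and the table name t are assumptions, since the question only shows month and day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (date TEXT);  -- stand-in for DateRange(...)
    CREATE TABLE t (date TEXT, activity_id INTEGER, bookings INTEGER);
    INSERT INTO dates VALUES ('2018-01-02'), ('2018-01-03'), ('2018-01-04');
    INSERT INTO t VALUES
        ('2018-01-02', 1, 5),
        ('2018-01-04', 1, 4),
        ('2018-01-03', 2, 6),
        ('2018-01-04', 2, 2);
""")

rows = conn.execute("""
    SELECT d.date, a.activity_id, t.bookings
    FROM dates d
    CROSS JOIN (SELECT DISTINCT activity_id FROM t) a  -- every date x activity
    LEFT JOIN t
           ON t.date = d.date AND t.activity_id = a.activity_id
    ORDER BY a.activity_id, d.date
""").fetchall()

for row in rows:
    print(row)
# ('2018-01-02', 1, 5)
# ('2018-01-03', 1, None)
# ('2018-01-04', 1, 4)
# ('2018-01-02', 2, None)
# ('2018-01-03', 2, 6)
# ('2018-01-04', 2, 2)
```

The cross join produces the full date-by-activity grid first, and the left join then only fills in bookings where they exist, which matches the "GOOD" layout in the question.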
I figured it out:
SELECT TOP 100 PERCENT masterlist.dt, masterlist.activity_id, count(r_activity_sales_bymonth.bookings) AS totalbookings
FROM (SELECT c.activity_id, dateadd(d, b.incr, '2016-12-31') AS dt
FROM (SELECT TOP 365 incr = row_number() OVER (ORDER BY object_id, column_id), *
FROM (SELECT a.object_id, a.column_id
FROM sys.all_columns a CROSS JOIN
sys.all_columns b) AS a) AS b CROSS JOIN
(SELECT DISTINCT activity_id
FROM r_activity_sales_bymonth) AS c) AS masterlist LEFT OUTER JOIN
r_activity_sales_bymonth ON masterlist.dt = r_activity_sales_bymonth.purchase_date AND masterlist.activity_id = r_activity_sales_bymonth.activity_id
GROUP BY masterlist.dt, masterlist.activity_id
ORDER BY masterlist.dt, masterlist.activity_id

joins in sql giving me weird results

I have two queries, Q1 and Q2.
Q1 produces one result for each demo and date.
Q2 produces one result for each demo, date and site.
Also, the dates for a given demo and site from Q2 will have some overlap with Q1,
but all dates from Q1 won't be there and there might even be some new dates in Q2 that were not there in Q1.
What I want to do is produce a resulting table that has the results of Q1 basically repeated (rows beneath rows) equal to the number of sites in Q2.
And the results from Q2 should be in the second column with a match on the date and demo.
If a date in Q1 doesn't exist in that site of Q2, the entry should be zero or null. I know this can be achieved with joins, but I can't get it to work. I tried -
select a.result, b.site, b.result from
(Q1) as a right join (Q2) as b on a.demo = b.demo and a.date=b.date
but this is producing some weird results. The entries of a.result are different for each site of Q2 though they shouldn't be.
edit - here is what I'm trying to do -
Q1 -
demo | date
------------------------------
1 | 10/31/2013
1 | 11/01/2013
2 | 11/02/2013
Q2 -
demo | site | date
------------------------------
1 | A | 10/31/2013
1 | A | 11/01/2013
2 | B | 11/01/2013
2 | B | 11/02/2013
desired result -
demo | date | site
---------------------------------------
1 | 10/31/2013 | A
1 | 11/01/2013 | A
2 | 11/02/2013 | null
1 | 10/31/2013 | null
1 | 11/01/2013 | B
2 | 11/02/2013 | B
Use inner join instead of right join
select a.result, b.site, b.result from (Q1) as a
inner join (Q2) as b on a.demo = b.demo and a.date=b.date
Here is an SQL Fiddle example of what I think you are asking for:
SELECT M.demo, M.date, M.site FROM
(
SELECT 2 AS FromQuery, Q2.demo, Q2.date, Q2.site
FROM Q2
UNION
SELECT 1 AS FromQuery, Q1.demo, Q1.date, null AS site
FROM Q1
) AS M
ORDER BY M.FromQuery
Based on your clarification, you could get that result with this query.
SELECT
a.demo,
a.date,
b.site
FROM (Q1) a
LEFT JOIN (Q2) b ON b.date = a.date
Sorting it as you have in your result list would require more information in the subqueries, however. You'd need to use a function like Row_Number() (assuming you're using MSSQL) to generate unique IDs in the sub-queries to use for sorting.

Perform right outer join with a condition for left table

I have two tables,
Student:
rollno | name
1 | Abc
2 | efg
3 | hij
4 | klm
Attendance:
name | date |status
Abc | 10-10-2013 | A
efg | 10-10-2013 | A
Abc | 11-10-2013 | A
hij | 25-10-2013 | A
My required output is:
Some query with where condition as "where date between '10-09-2013' and '13-10-2013' "
rollno| name |count
1 | Abc | 2
2 | efg | 1
3 | hij | 0
4 | klm | 0
I tried using:
SELECT p.rollno,p.name,case when s.statuss='A' then COUNT(p.rollno) else '0' end as count
from attendance s
right outer join student p
on s.rollno=p.rollno
where s.date between '10-09-2013' and '13-10-2013'
group by p.rollno,p.regno,p.name,s.statuss
order by p.rollno
And the Output is:
rollno| name |count
1 | Abc | 2
2 | efg | 1
I want the remaining values from the student table to also be appended. I have tried many different queries, but all have been unsuccessful. Is there a query that will return the required output above?
You need to move the criteria from the where to the join:
SELECT p.rollno,p.name,case when s.statuss='A' then COUNT(p.rollno) else 0 end as count
from attendance s
right outer join student p
on s.rollno=p.rollno
and s.date between '10-09-2013' and '13-10-2013'
group by p.rollno,p.regno,p.name,s.statuss
order by p.rollno;
At the moment even though you have an outer join, by referring to the outer table in the where clause you effectively turn it into an inner join. Where there is no match in attendance, s.Date will be NULL, and because NULL is not between '10-09-2013' and '13-10-2013' the rows are excluded.
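The difference between filtering in WHERE and filtering in ON can be seen in a short sketch, here using SQLite through Python's sqlite3 module. It makes a few assumptions: attendance is keyed by rollno (as the queries above imply, although the sample table shows name), dates are stored in ISO format, and the join is written as a LEFT JOIN from student, which is equivalent to the RIGHT OUTER JOIN above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (rollno INTEGER, name TEXT);
    CREATE TABLE attendance (rollno INTEGER, date TEXT, status TEXT);
    INSERT INTO student VALUES (1, 'Abc'), (2, 'efg'), (3, 'hij'), (4, 'klm');
    INSERT INTO attendance VALUES
        (1, '2013-10-10', 'A'),
        (2, '2013-10-10', 'A'),
        (1, '2013-10-11', 'A'),
        (3, '2013-10-25', 'A');
""")

# Date filter in WHERE: rows without a matching date are dropped after the join.
filter_in_where = conn.execute("""
    SELECT p.rollno, p.name, COUNT(s.rollno)
    FROM student p
    LEFT JOIN attendance s ON s.rollno = p.rollno
    WHERE s.date BETWEEN '2013-09-10' AND '2013-10-13'
    GROUP BY p.rollno, p.name
    ORDER BY p.rollno
""").fetchall()

# Date filter in ON: unmatched students survive with NULLs and count as 0.
filter_in_on = conn.execute("""
    SELECT p.rollno, p.name, COUNT(s.rollno)
    FROM student p
    LEFT JOIN attendance s
           ON s.rollno = p.rollno
          AND s.date BETWEEN '2013-09-10' AND '2013-10-13'
    GROUP BY p.rollno, p.name
    ORDER BY p.rollno
""").fetchall()

print(filter_in_where)  # [(1, 'Abc', 2), (2, 'efg', 1)]
print(filter_in_on)     # [(1, 'Abc', 2), (2, 'efg', 1), (3, 'hij', 0), (4, 'klm', 0)]
```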
It is not apparent from the question, but I would imagine that what you are actually looking for is this. It appears you are just after a count of entries in attendance where status = 'A' by student:
SELECT p.rollno,
p.name,
COUNT(s.statuss) as count
from attendance s
right outer join student p
on s.rollno=p.rollno
and s.date between '10-09-2013' and '13-10-2013'
AND s.statuss = 'A'
group by p.rollno,p.regno,p.name
order by p.rollno;
I have removed s.statuss from the group by, and changed the count so that there is only one row per student, rather than one row per status per student. I have changed the column within the count to a column in the attendance table, to ensure that you get a count of 0 when there are no entries in attendance. If you use a column in student you will get a count of 1 even when there are no entries. Finally, since you are only interested in entries with statuss = 'A' I have also moved this to the join condition.
On one final note, it is advisable when using strings for dates to use the culture-insensitive format yyyyMMdd, as this is completely unambiguous: '20130201' is always the 1st of February, and never the 2nd of January, whereas in your query '10-09-2013' could be the 10th of September or the 9th of October, depending on your settings.
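The ambiguity is easy to reproduce outside the database. The sketch below uses Python's datetime module purely as an illustration; the two format strings stand in for two different regional settings.

```python
from datetime import datetime

text = "10-09-2013"

# The same string parsed under two conventions yields two different dates.
as_day_first = datetime.strptime(text, "%d-%m-%Y").date()
as_month_first = datetime.strptime(text, "%m-%d-%Y").date()

# yyyyMMdd has a single reading.
unambiguous = datetime.strptime("20130201", "%Y%m%d").date()

print(as_day_first)    # 2013-09-10
print(as_month_first)  # 2013-10-09
print(unambiguous)     # 2013-02-01
```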