When Joining a SQL Table to Itself, Return only One Result. Time Log Entry

I am working with some data from Jira. In Jira there are "Issues" and each issue can go through states such as New, In Progress, Review, etc. I want to measure the time something stays in each state. It is possible for things to move back and forth and return to a state multiple times. Jira produces a log table which logs the event of the issue moving from one state to another. To measure how long it’s been in a state you need to have the entry from the log for that state (end) and the previous change of state (start).
I can’t return one entry for each state change. For Issues that moved between a state multiple times I get multiple rows.
I tried the gaps and islands approach. Also a select within the top select. Min or Max in the join was atrociously slow.
The desired result would be a column added to the table in the select which gives the duration for the State in the column ItemFromString. The date difference is between this entry’s Created date and the previous state change entry’s created date, which shows when the issue moved to this state. In the example data below the first entry for Assessment, History ID 436260, would be a duration of 9/19–9/14. When I join I get multiple entries for this History ID since there are multiple Assessments. I filter the join by Issue key and Item from/to String where they match; however, I need to also add a filter where it looks at any entries created before the current items created date and selects the most recent, or largest, one. This is where I am hung up.
Fields:
Created - This is when the log entry was created which is the date time it changed state from ItemFromString to ItemToString.
IssueCreatedDate - This is when the issue that the log entry is about was created. For example, issues start in the New state, so we need this date to figure out how long an issue sat in New, as the first log entry will be it moving from New to something else.
IssueKey and IssueID are almost the same thing; they are key IDs for the issue in a different table.
HistoryID is the key for each log entry in this table.
Assessment
IssueKey  HistoryID  IssueId  Created           IssueCreatedDate  ItemFromString   ItemToString
--------  ---------  -------  ----------------  ----------------  ---------------  ---------------
TPP-16    434905     208965   9/14/2022 14:33   9/14/2022 8:56    New              Assessment
TPP-16    436260     208965   9/19/2022 8:32    9/14/2022 8:56    Assessment       Internal Review
TPP-16    437795     208965   9/19/2022 16:11   9/14/2022 8:56    Internal Review  New
TPP-16    437796     208965   9/19/2022 16:11   9/14/2022 8:56    New              Assessment
TPP-16    439006     208965   9/20/2022 15:08   9/14/2022 8:56    Assessment       New
TPP-16    457786     208965   10/17/2022 11:02  9/14/2022 8:56    New              Assessment
TPP-16    457789     208965   10/17/2022 11:03  9/14/2022 8:56    Assessment       Internal Review
TPP-16    490205     208965   10/27/2022 15:15  9/14/2022 8:56    Internal Review  On Hold
TPP-16    539391     208965   1/11/2023 15:24   9/14/2022 8:56    On Hold          Backlog
This query does not get a duration as the last column in the query. The query creates a table that is then published and utilized by BI products for graphing and analysis.
SELECT
IssueChangelogs.IssueKey IssueKey,
IssueChangelogs.HistoryId HistoryId,
IssueChangelogs.IssueId IssueId,
IssueChangelogs.IssueCreatedDate IssueCreatedDate,
IssueChangelogs.ItemFromString ItemFromString,
IssueChangelogs.ItemToString ItemToString,
ICLPrev.Created PrevCreated, --For Testing
IssueChangelogs.Created Created,
ICLPrev.HistoryID PrevHistoryID, --For Testing
CASE
-- If the join found a match for a previous status, then we can calculate the Duration it was in that state.
WHEN ICLPrev.HistoryID IS NOT NULL
THEN DATEDIFF(hour, ICLPrev.Created, IssueChangeLogs.Created)/24
-- If the state was new then we need to use the IssueCreatedDate as the start date as the default state is New for each issue.
WHEN IssueChangeLogs.ItemFromString LIKE '%New%'
THEN Round(DATEDIFF(hour, IssueChangeLogs.IssueCreatedDate, IssueChangeLogs.Created), 2)/24
-- Else, let's add something easy to identify so when we test and look at the table we know what occurred.
ELSE 0.01
END AS Duration
FROM
TableNameRedacted AS IssueChangelogs
LEFT JOIN
TableNameRedacted AS ICLPrev
ON ICLPrev.IssueKey = IssueChangeLogs.IssueKey AND
ICLPrev.ItemToString = IssueChangeLogs.ItemFromString
WHERE
IssueChangelogs.IssueKey LIKE '%TPP%'

I used the ROW_NUMBER function to identify the rows I wanted, wrapped the query in a CTE to pull those row numbers, and added one more criterion to the join to rule out oddities.
First we select what we need from the table and pull out the previous entry related to the state (status) change, so we can calculate the time it spent in that state as Duration. The query is wrapped in a common table expression (CTE) and ordered so that ROW_NUMBER can tag the rows we want; then we select only those rows from the CTE.
WITH CTE AS (
SELECT
IssueChangelogs.IssueKey IssueKey,
IssueChangelogs.HistoryId HistoryId,
IssueChangelogs.IssueId IssueId,
IssueChangelogs.AuthorDisplayName AuthorDisplayName,
IssueChangelogs.IssueCreatedDate IssueCreatedDate,
IssueChangelogs.ItemFromString ItemFromString,
IssueChangelogs.ItemToString ItemToString,
ICLPrev.Created PrevRelatedCreatedDate,
ICLPrev.HistoryId PrevRelatedHistoryID,
IssueChangelogs.Created Created,
CASE
-- If the join found a match for a previous status, then we can calculate the Duration it was in that state.
WHEN ICLPrev.HistoryID IS NOT NULL
-- Funky math is so we can get the number of days with 2 decimal points.
THEN CAST((DATEDIFF(second, ICLPrev.Created, IssueChangeLogs.Created)/86400.00)AS DECIMAL(6,2))
-- If the state was new then we need to use the IssueCreatedDate as the start date as the default state is New for each issue.
WHEN IssueChangeLogs.ItemFromString LIKE '%New%'
-- Again calculate to get 2 decimal places for duration in days.
THEN CAST((DATEDIFF(second, IssueChangeLogs.IssueCreatedDate, IssueChangeLogs.Created)/86400.00)AS DECIMAL(6,2))
-- Otherwise leave Duration as NULL so unmatched rows are easy to spot when we test and look at the table.
ELSE NULL
END AS Duration,
-- Here we are going to assign the number 1 to the rows we want to keep.
ROW_NUMBER() OVER(PARTITION BY IssueChangeLogs.IssueKey, IssueChangeLogs.HistoryID, IssueChangeLogs.Created
-- Since duplicates can exist we only want the most recent match. The join ensures we don't have any start dates greater than the end date and the order by here puts the most recent (DESC) on top so it gets tagged with a 1.
ORDER BY ICLPrev.Created DESC) AS RowNo
FROM
/shared/"Space"/CUI/TandTE/Physical/Metadata/JIRA/CIS/JIRA/IssueChangelogs IssueChangelogs
LEFT JOIN
/shared/"Space"/CUI/TandTE/Physical/Metadata/JIRA/CIS/JIRA/IssueChangelogs AS ICLPrev
-- These are the closest we can get to a key; however, it will still produce duplicates since an issue can return to a previous state multiple times.
ON ICLPrev.IssueKey = IssueChangeLogs.IssueKey AND ICLPrev.ItemToString = IssueChangeLogs.ItemFromString
-- This will remove anything that doesn't make sense in the join, for example a state change where the start date is greater than the end date. Still leaves duplicates which the row number will fix.
AND ICLPrev.Created < IssueChangeLogs.Created
WHERE
IssueChangelogs.IssueKey LIKE '%TPP%'
)
-- Now that the CTE has been defined and we have identified the rows we want from the join with a 1 in RowNo, we filter by those and only pull the columns we want.
SELECT
CTE.IssueKey,
CTE.HistoryId,
CTE.IssueId,
CTE.AuthorDisplayName,
CTE.IssueCreatedDate,
CTE.ItemFromString,
CTE.ItemToString,
CTE.PrevRelatedHistoryID,
CTE.PrevRelatedCreatedDate,
CTE.Created,
CTE.Duration
FROM CTE
WHERE CTE.RowNo = 1
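The same idea can be tried on the first five sample rows. This is only a minimal sketch, run through Python's sqlite3 module rather than the original database: the timestamps are rewritten in ISO format, julianday() stands in for DATEDIFF, and only the columns needed for the join are kept. Those substitutions are assumptions made for illustration, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IssueChangelogs ("
    "IssueKey TEXT, HistoryId INTEGER, Created TEXT, "
    "ItemFromString TEXT, ItemToString TEXT)"
)
# First five rows of the sample data, timestamps converted to ISO format.
conn.executemany(
    "INSERT INTO IssueChangelogs VALUES (?, ?, ?, ?, ?)",
    [
        ("TPP-16", 434905, "2022-09-14 14:33", "New", "Assessment"),
        ("TPP-16", 436260, "2022-09-19 08:32", "Assessment", "Internal Review"),
        ("TPP-16", 437795, "2022-09-19 16:11", "Internal Review", "New"),
        ("TPP-16", 437796, "2022-09-19 16:11", "New", "Assessment"),
        ("TPP-16", 439006, "2022-09-20 15:08", "Assessment", "New"),
    ],
)

rows = conn.execute(
    """
    WITH CTE AS (
        SELECT cur.HistoryId,
               prev.HistoryId AS PrevHistoryId,
               -- Duration in days, two decimals (julianday replaces DATEDIFF).
               ROUND(julianday(cur.Created) - julianday(prev.Created), 2) AS Duration,
               -- Most recent earlier match gets RowNo = 1.
               ROW_NUMBER() OVER (PARTITION BY cur.HistoryId
                                  ORDER BY prev.Created DESC) AS RowNo
        FROM IssueChangelogs AS cur
        LEFT JOIN IssueChangelogs AS prev
               ON prev.IssueKey = cur.IssueKey
              AND prev.ItemToString = cur.ItemFromString
              AND prev.Created < cur.Created
    )
    SELECT HistoryId, PrevHistoryId, Duration
    FROM CTE
    WHERE RowNo = 1
    ORDER BY HistoryId
    """
).fetchall()

# Map each log entry to (previous entry, duration in days).
result = {history_id: (prev_id, days) for history_id, prev_id, days in rows}
```

Each HistoryId comes back exactly once. Entry 439006 (leaving Assessment) pairs with 437796, the most recent move into Assessment, rather than with the older 434905. Note that the strict `<` comparison finds no predecessor when two entries share the same timestamp, as 437795 and 437796 do at minute resolution.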

Related

Get a unique current user, depending on what Online Platform they are using

Our goal is to measure if they are still active over the last month, and if not the last month the last time the user was active. We are storing two tables, Employees and Online Transactions. The Online transactions table is built from monthly reports where we go out to the Online Platforms and get a listing of all users, with the date their id was activated and when it was last logged in. We get other data that matters to us, like what role they have in the Online Platform and how much data they are storing.
The Employee table has termination dates for employees that have left the company, as well as a unique id. The unique id is what we join to the Online table.
We use these two tables to manage the Online tool access as well as to report overall usage of the platforms by individual user per platform. There are three different platforms used, Portal, AGOL or Training. The Online transactions table stores all three in the same table.
The issue is I am not understanding the proper way to get the results I’m looking for. I am attaching example code, which I know by the result set isn’t what I want. Of course, since this is PII, I must scrub my example results to remove information but still show what I am having trouble with.
SELECT DISTINCT
U.EMPLOYEE_TRACKING_ID,U.LAST_NAME, U.FIRST_NAME, U.EMAIL_ADDRESS, U.PERSON_TYPE,
U.SERVICE_LINE, U.SUPERVISOR_NAME, U.OFFICE_LOCATION, U.OFFICE_CITY, U.OFFICE_STATE,
U.OFFICE_COUNTRY, U.OFFICE_POSTAL_CODE, U.ACTUAL_TERMINATION_DATETIME, O.FiscalPeriod,
O.Role, O.Source, O.LogDate
FROM dbo.Users U
INNER JOIN dbo.Online_Transactions O ON U.EMPLOYEE_TRACKING_ID = O.TrackingId
WHERE O.Source = 'AGOL'
In the example result set you can see that there are repeating records, because I have more than just the Tracking ID as part of the distinct, thus resulting in each field’s distinct value being considered.
Do I need to do some type of inner select to get down to the distinct user, by office and the latest log date they were reported? For example, our records go back 3.5 years, but what I want is all unique users by the last time they logged in. I can leave out the termination date for current users and keep it for the others so I can remove users who have left. I thought I would do a series of views to get me each scenario; there would be a total of 6, one per Online type, each split by terminated or not?
If anyone can help me learn how to do this.
EMPLOYEE_TRACKING_ID,LAST_NAME,FIRST_NAME,EMAIL_ADDRESS,PERSON_TYPE,SERVICE_LINE,SUPERVISOR_NAME,OFFICE_LOCATION,OFFICE_CITY,OFFICE_STATE,OFFICE_COUNTRY,OFFICE_POSTAL_CODE,ACTUAL_TERMINATION_DATETIME,FiscalPeriod,Role,Source,LogDate
111483,Name4,User4,User4.Name4#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,4/30/2021,FY2020-Q4-08,AECOM PUBLISHER,AGOL,2020-09-02 00:00:00.000
111483,Name4,User4,User4.Name4#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,4/30/2021,FY2020-Q4-09,AECOM PUBLISHER,AGOL,2020-09-30 00:00:00.000
111483,Name4,User4,User4.Name4#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,4/30/2021,FY2021-Q1-11,AECOM PUBLISHER,AGOL,2021-02-02 00:00:00.000
111483,Name4,User4,User4.Name4#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,4/30/2021,FY2021-Q1-12,AECOM PUBLISHER,AGOL,2021-03-01 00:00:00.000
111483,Name4,User4,User4.Name4#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,4/30/2021,FY2021-Q2-01,AECOM PUBLISHER,AGOL,2021-05-01 00:00:00.000
111483,Name4,User4,User4.Name4#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,4/30/2021,FY2021-Q2-02,AECOM PUBLISHER,AGOL,2020-09-02 00:00:00.000
111483,Name4,User4,User4.Name4#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,4/30/2021,FY2021-Q2-03,AECOM PUBLISHER,AGOL,2021-01-04 00:00:00.000
113311,Name3,User3,User3.Name3#mycompany.com,Employee,NULL,,,Sydney,NSW,Australia,2000,5/21/2021,FY2020-Q3-06,AECOM COLLECTOR,AGOL,2020-09-02 00:00:00.000
14001627,Name1,User1,user1.name1#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,3/3/2021,FY2021-Q1-12,AECOM COLLECTOR,AGOL,2021-02-02 00:00:00.000
14001627,Name1,User1,user1.name1#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,3/3/2021,FY2021-Q2-01,AECOM COLLECTOR,AGOL,2021-04-01 00:00:00.000
14001627,Name1,User1,user1.name1#mycompany.com,Employee,NULL,,,Melbourne,VIC,Australia,3008,3/3/2021,FY2021-Q2-02,AECOM COLLECTOR,AGOL,2020-09-02 00:00:00.000
14007604,Name2,User2,User2.Name2#mycompany.com,Employee,NULL,,,Newcastle upon Tyne,POST-TWR,United Kingdom,NE1 2HF,9/30/2020,FY2020-Q2-03,AECOM COLLECTOR,AGOL,2020-09-02 00:00:00.000
It looks like your Online_Transactions table has multiple entries for each user?
You could use a CROSS APPLY to get just one row per user; note the ORDER BY, which picks the latest one.
SELECT DISTINCT
U.EMPLOYEE_TRACKING_ID,U.LAST_NAME, U.FIRST_NAME, U.EMAIL_ADDRESS, U.PERSON_TYPE,
U.SERVICE_LINE, U.SUPERVISOR_NAME, U.OFFICE_LOCATION, U.OFFICE_CITY, U.OFFICE_STATE,
U.OFFICE_COUNTRY, U.OFFICE_POSTAL_CODE, U.ACTUAL_TERMINATION_DATETIME,
T.FiscalPeriod, T.Role, T.Source, T.LogDate
FROM dbo.Users AS U
CROSS APPLY (
SELECT TOP 1 O.FiscalPeriod, O.Role, O.Source, O.LogDate
FROM dbo.Online_Transactions AS O
WHERE O.TrackingId = U.EMPLOYEE_TRACKING_ID
AND O.Source = 'AGOL'
ORDER BY O.LogDate DESC
) AS T
This could also be done with an inner select sub query, but I like this syntax.
You might want to change CROSS APPLY for OUTER APPLY just to see the results, but I normally use CROSS APPLY which is like an INNER JOIN, in that it will only show users that have a match to an Online_Transaction.
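For readers who want to experiment, here is a small sketch of the same "latest row per user" idea, run through Python's sqlite3 module. SQLite has no CROSS APPLY, so a ROW_NUMBER ranking is swapped in to pick the latest AGOL row; the trimmed-down columns, the Portal row, and the user with no logins (999) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Users (EMPLOYEE_TRACKING_ID INTEGER PRIMARY KEY, LAST_NAME TEXT);
    CREATE TABLE Online_Transactions (
        TrackingId INTEGER, FiscalPeriod TEXT, Source TEXT, LogDate TEXT);

    INSERT INTO Users VALUES (111483, 'Name4'), (113311, 'Name3'), (999, 'NoLogins');
    INSERT INTO Online_Transactions VALUES
        (111483, 'FY2020-Q4-08', 'AGOL',   '2020-09-02'),
        (111483, 'FY2021-Q2-01', 'AGOL',   '2021-05-01'),
        (111483, 'FY2021-Q2-03', 'AGOL',   '2021-01-04'),
        (111483, 'FY2021-Q2-03', 'Portal', '2021-06-01'),  -- hypothetical row
        (113311, 'FY2020-Q3-06', 'AGOL',   '2020-09-02');
    """
)

latest = conn.execute(
    """
    WITH Ranked AS (
        SELECT O.TrackingId, O.FiscalPeriod, O.LogDate,
               -- Rank each user's AGOL rows, newest first.
               ROW_NUMBER() OVER (PARTITION BY O.TrackingId
                                  ORDER BY O.LogDate DESC) AS RowNo
        FROM Online_Transactions AS O
        WHERE O.Source = 'AGOL'
    )
    SELECT U.EMPLOYEE_TRACKING_ID, U.LAST_NAME, R.FiscalPeriod, R.LogDate
    FROM Users AS U
    -- Inner join behaves like CROSS APPLY: users with no AGOL rows are dropped.
    JOIN Ranked AS R
      ON R.TrackingId = U.EMPLOYEE_TRACKING_ID AND R.RowNo = 1
    ORDER BY U.EMPLOYEE_TRACKING_ID
    """
).fetchall()
```

Switching the JOIN to a LEFT JOIN would mirror OUTER APPLY and keep users who have no matching transaction.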

Access query, grouped sum of 2 columns where either column contains values

Another team has an Access database that they use to track call logs. It's very basic, really just a table with a few lookups, and they enter data directly in the datasheet view. They've asked me to assist with writing a report to sum up their calls by week and reason and I'm a bit stumped on this problem because I'm not an Access guy by any stretch.
The database consists of two core tables, one holding the call log entries (Calls) and one holding the lookup list of call reasons (ReasonLookup). Relevant table structures are:
Calls
-----
ID (autonumber, PK)
DateLogged (datetime)
Reason (int, FK to ReasonLookup.ID)
Reason 2 (int, FK to ReasonLookup.ID)
ReasonLookup
------------
ID (autonumber PK)
Reason (text)
What they want is a report that looks like this:
WeekNum Reason Total
------- ---------- -----
10 Eligibility Request 24
10 Extension Request 43
10 Information Question 97
11 Eligibility Request 35
11 Information Question 154
... ... etc ...
My problem is that there are TWO columns in the Calls table, because they wanted to log a primary and secondary reason for receiving the call, i.e. someone calls for reason A and while on the phone also requests something under reason B. Every call will have a primary reason column value (Calls.Reason not null) but not necessarily a secondary reason column value (Calls.[Reason 2] is often null).
What they want is, for each WeekNum, a single (distinct) entry for each possible Reason, and a Total of how many times that Reason was used in either the Calls.Reason or Calls.[Reason 2] column for that week. So in the example above for Eligibility Request, they want to see one entry for Eligibility Request for the week, counting every record in Calls for that week that has Calls.Reason = Eligibility Request OR Calls.[Reason 2] = Eligibility Request.
What is the best way to approach a query that will display as shown above? Ideally this is a straight query, no VBA required. They are non-technical so the simpler and easier to maintain the better if possible.
Thanks in advance, any help much appreciated.
The "normal" approach would be to use a UNION ALL query as a subquery to create a set of weeks and reasons. Access doesn't support this, however, so what should work instead is to first define a query that makes the union and then use that query as the source for the "main" query.
So the first query would be
SELECT datepart("ww",datelogged) as week, Reason from calls
UNION ALL
SELECT datepart("ww",datelogged), [Reason 2] from calls;
Save this as UnionQuery and make another query mainQuery:
SELECT uq.week, rl.reason, Count(*) AS Total
FROM UnionQuery AS uq
INNER JOIN reasonlookup AS rl ON uq.reason = rl.id
GROUP BY uq.week, rl.reason;
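As a quick check of the union-then-count idea, here is a sketch using Python's sqlite3 module. It is not Access: SQLite accepts the union as an inline subquery, so the two saved queries collapse into one, and a precomputed WeekNum column stands in for DatePart("ww", DateLogged). The five call rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ReasonLookup (ID INTEGER PRIMARY KEY, Reason TEXT);
    -- WeekNum is an assumption standing in for DatePart("ww", DateLogged).
    CREATE TABLE Calls (
        ID INTEGER PRIMARY KEY, WeekNum INTEGER, Reason INTEGER, [Reason 2] INTEGER);

    INSERT INTO ReasonLookup VALUES
        (1, 'Eligibility Request'), (2, 'Extension Request'), (3, 'Information Question');
    INSERT INTO Calls VALUES
        (1, 10, 1, NULL),
        (2, 10, 2, 1),
        (3, 10, 3, NULL),
        (4, 11, 1, 3),
        (5, 11, 3, NULL);
    """
)

totals = conn.execute(
    """
    SELECT uq.WeekNum, rl.Reason, COUNT(*) AS Total
    FROM (
        -- Stack primary and secondary reasons into one column.
        SELECT WeekNum, Reason FROM Calls
        UNION ALL
        SELECT WeekNum, [Reason 2] FROM Calls
    ) AS uq
    -- Rows with a NULL secondary reason have no lookup match and drop out here.
    INNER JOIN ReasonLookup AS rl ON uq.Reason = rl.ID
    GROUP BY uq.WeekNum, rl.Reason
    ORDER BY uq.WeekNum, rl.Reason
    """
).fetchall()
```

Call 2 lists Eligibility Request only as its secondary reason, yet it still contributes to the week 10 total for Eligibility Request, which is the behaviour the report needs.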
You can use a Union query to append individual Group By aggregate queries for both Reason and Reason 2:
SELECT DatePart("ww", Calls.DateLogged) AS WeekNum, ReasonLookup.Reason,
Count(*) AS [Total]
FROM Calls
INNER JOIN ReasonLookup ON Calls.Reason = ReasonLookup.ID
GROUP BY DatePart("ww", Calls.DateLogged), ReasonLookup.Reason
UNION ALL
SELECT DatePart("ww", Calls.DateLogged) AS WeekNum, ReasonLookup.Reason,
Count(*) AS [Total]
FROM Calls
INNER JOIN ReasonLookup ON Calls.[Reason 2] = ReasonLookup.ID
GROUP BY DatePart("ww", Calls.DateLogged), ReasonLookup.Reason;
DatePart() outputs the specific date's week number in the calendar year. Note that each half of the union is aggregated separately, so a reason used as both a primary and a secondary reason in the same week appears as two rows; sum them in an outer query to get a single total per week and reason. UNION ALL is used rather than UNION because UNION removes duplicate rows, which would drop one of the two rows whenever their counts happen to match.

How Do I Get Total 1 Time for Multiple Rows

I've been asked to modify a report (which unfortunately was written horribly!! not by me!) to include a count of days. Please note the "Days" is not calculated using "StartDate" & "EndDate" below. The problem is, there are multiple rows per record (users want to see the detail for start & enddate), so my total for "Days" are counting for each row. How can I get the total 1 time without the total in column repeating?
This is what the data looks like right now:
ID Description startdate enddate Days
REA145681 Emergency 11/17/2011 11/19/2011 49
REA145681 Emergency 12/6/2011 12/9/2011 49
REA145681 Emergency 12/10/2011 12/14/2011 49
REA146425 Emergency 11/23/2011 12/8/2011 54
REA146425 Emergency 12/9/2011 12/12/2011 54
I need this:
ID Description startdate enddate Days
REA145681 Emergency 11/17/2011 11/19/2011 49
REA145681 Emergency 12/6/2011 12/9/2011
REA145681 Emergency 12/10/2011 12/14/2011
REA146425 Emergency 11/23/2011 12/8/2011 54
REA146425 Emergency 12/9/2011 12/12/2011
Help please. This is how the users want to see the data.
Thanks in advance!
Liz
--- Here is the query simplified:
select id
,description
,startdate -- users want to see all start dates and enddates
,enddate
,days = datediff(d,Isnull(actualstardate,anticipatedstartdate) ,actualenddate)
from table
As you didn't provide the data of your tables I'll operate over your result as if it was a table. This will result in what you're looking for:
select *,
case row_number() over (partition by id order by startdate)
when 1 then days
end
from t
Edit:
Looks like you did add some SQL code. This should be what you're looking for:
select *,
case row_number() over (partition by id order by startdate)
when 1 then
datediff(d,Isnull(actualstardate,anticipatedstartdate) ,actualenddate)
end
from t
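The row-number trick can be checked against the sample rows from the question. This is a sketch using Python's sqlite3 module, with dates rewritten in ISO format so they sort correctly as text; the column alias days_once is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (id TEXT, description TEXT, startdate TEXT, enddate TEXT, days INTEGER);
    INSERT INTO t VALUES
        ('REA145681', 'Emergency', '2011-11-17', '2011-11-19', 49),
        ('REA145681', 'Emergency', '2011-12-06', '2011-12-09', 49),
        ('REA145681', 'Emergency', '2011-12-10', '2011-12-14', 49),
        ('REA146425', 'Emergency', '2011-11-23', '2011-12-08', 54),
        ('REA146425', 'Emergency', '2011-12-09', '2011-12-12', 54);
    """
)

rows = conn.execute(
    """
    SELECT id, startdate,
           -- Only the earliest row of each id keeps its days value.
           CASE ROW_NUMBER() OVER (PARTITION BY id ORDER BY startdate)
                WHEN 1 THEN days
           END AS days_once
    FROM t
    ORDER BY id, startdate
    """
).fetchall()

days_column = [days_once for _, _, days_once in rows]
```

Ordering the window by startdate, and sorting the final result the same way, keeps the single populated value on the first displayed row of each id.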
That is a task for the reporting tool. You will have to write something like the following code in the Display Properties of the Days field:
if RowNumber > 1 AND id = previous_row(id)
then -- hide the value of Days
Colour = BackgroundColour
Days = NULL
Days = ' '
Display = false
... (anything that works)
So they want the output to be exactly the same except that they don't want to see the days listed multiple times for each ID value? And they're quite happy to see the ID and Description repeatedly but the Days value annoys them?
That's not really an SQL question. SQL is about which rows, columns and derived values are supposed to be presented in what order and that part seems to be working fine.
Suppressing the redundant occurrences of the Days value is more a matter of using the right tool. I'm not up on the current tools but the last time I was, QMF was very good for this kind of thing. If a column was the basis for a control break, you could, in effect, select an option for that column that told it not to repeat the value of the control break repeatedly. That way, you could keep it from repeating ID, Description AND Days if that's what you wanted. But I don't know if people are still using QMF and I have no idea if you are. And unless the price has come way down, you don't want to go out and buy QMF just to suppress those redundant values.
Other tools might do the same kind of thing but I can't tell you which ones. Perhaps the tool you are using to do your reporting - Crystal Reports or whatever - has that feature. Or not. I think it was called Outlining in QMF but it may have a different name in your tool.
Now, if this report is being generated by an application program, that is a different kettle of fish. An application could handle that quite nicely. But most people use end-user reporting tools to do this kind of thing to avoid the greater cost involved in writing programs.
We might be able to help further if you specify what tool you are using to generate this report.

Splitting one table based on criteria and comparing

I'm not quite sure on the best way to phrase this particular query, so I hope the title is adequate, however, I will attempt to describe what it is I need to be able to understand how to do. Just to clarify, this is for oracle sql.
We have a table called assessments. There are different kinds of assessments within this table, however, some assessments should follow others in a logical order and within set time frames. The problems come in when a client has multiple assessments of the same type, as we have to use a fairly inefficient array formula in excel to identify which 'full' assessment corresponds with the 'initial' assessment.
I have an earlier query that was resolved on this site (Returning relevant date from multiple tables including additional table info) which I believe includes a lot of the logic for what is required (particularly in identifying a corresponding event which has occurred within a specified timeframe). However, whilst that query pulls data from 3 separate tables (assessments, events, responsibilities), I now need to create a query that generates a similar outcome but pulls from 1 main table and a 2nd table to return worker information. I thought the most logical way would be to create a query that looks at the assessment table with one type of assessment, and then joins to the assessment table again (possibly a temporary table?) with the assessment type that would follow the initial one.
For example:
Table 1 (Assessments):
Client ID Assessment Type Start End
P1 1 Initial 01/01/2012 05/01/2012
Table 2 (Assessments temp?):
Client ID Assessment Type Start End
P1 2 Full 12/01/2012
Table 3:
ID Worker Team
1 Bob Team1
2 Lyn Team2
Result:
Client ID Initial Start Initial End Initial Worker Full Start Full End
P1 1 01/01/2012 05/01/2012 Bob 12/01/2012
So table 1 and table 2 draw from the same table, except it's bringing back different assessments. Ideally, there'd be a check to make sure that the 'full' assessment started within X days of the end of the 'initial' assessment (similar to the 'likely' check in the previous query mentioned earlier). If this can be achieved, it's probably worth mentioning that I'd also be interested in expanding this to look at multiple assessment types, as roughly in the cycle a client could be expected to have between 4 or 5 different types of assessment. Any pointers would be appreciated, I've already had a great deal of help from this community which is very valuable.
Edit:
Edited to include the solution, following MB's advice.
Select
*
From(
Select
I.ASM_SUBJECT_ID as PNo,
I.ASM_ID As IAID,
I.ASM_QSA_ID as IAType,
I.ASM_START_DATE as IAStart,
I.ASM_END_DATE as IAEnd,
nvl(olm_bo.get_ref_desc(I.ASM_OUTCOME,'ASM_OUTCOME'),'') as IAOutcome,
C.ASM_ID as CAID,
C.ASM_QSA_ID as CAType,
C.ASM_START_DATE as CAStart,
C.ASM_END_DATE as CAEnd,
nvl(olm_bo.get_ref_desc(C.ASM_OUTCOME,'ASM_OUTCOME'),'') as CAOutcome,
ROUND(C.ASM_START_DATE -I.ASM_START_DATE,0) as "Likely",
row_number() over(PARTITION BY I.ASM_ID
ORDER BY
abs(I.ASM_START_DATE - C.ASM_START_DATE))as "Row Number"
FROM
O_ASSESSMENTS I
left join O_ASSESSMENTS C
on I.ASM_SUBJECT_ID = C.ASM_SUBJECT_ID
and C.ASM_QSA_ID IN ('AA523','AA1326') and
ROUND(C.ASM_START_DATE - I.ASM_START_DATE,0) >= -2
AND
ROUND(C.ASM_START_DATE - I.ASM_START_DATE,0) <= 25
and C.ASM_OUTCOME <>'ABANDON'
Where I.ASM_QSA_ID IN ('AA501','AA1323')
AND I.ASM_OUTCOME <> 'ABANDON'
AND
I.ASM_END_DATE >= '01-04-2011') WHERE "Row Number" = 1
You can access the same table multiple times in a given query in SQL, simply by using table aliases. So one way of doing this would be:
select i.client,
i.id initial_id,
i.start initial_start,
i.end initial_end,
w.worker initial_worker,
f.id full_id,
f.start full_start,
f.end full_end
from assessments i
join workers w on i.id = w.id
left join assessments f
on i.client = f.client and
f.assessment_type = 'Full' and
f.start between i.end and i.end + X
/* replace X with appropriate number of days */
where i.assessment_type = 'Initial'
Note: column names such as end (that are reserved words in Oracle SQL) should normally be double-quoted, but from the previous question it looks as though these are simplified versions of the actual column names.
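To see the aliasing in action, here is a small sketch run through Python's sqlite3 module rather than Oracle. The column names start_date and end_date (chosen to avoid the reserved word end), the 14-day window for X, and the second Full assessment in June are all assumptions made for illustration; the dates from the question are read as day/month/year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE assessments (
        client TEXT, id INTEGER, assessment_type TEXT, start_date TEXT, end_date TEXT);
    CREATE TABLE workers (id INTEGER, worker TEXT, team TEXT);

    INSERT INTO assessments VALUES
        ('P1', 1, 'Initial', '2012-01-01', '2012-01-05'),
        ('P1', 2, 'Full',    '2012-01-12', NULL),
        ('P1', 3, 'Full',    '2012-06-01', NULL);  -- hypothetical, outside the window
    INSERT INTO workers VALUES (1, 'Bob', 'Team1'), (2, 'Lyn', 'Team2');
    """
)

window = "+14 days"  # stands in for X; the real threshold is not given in the question

matches = conn.execute(
    """
    SELECT i.client, i.id, i.start_date, i.end_date, w.worker,
           f.id, f.start_date
    FROM assessments AS i
    JOIN workers AS w ON i.id = w.id
    -- Same table under a second alias, restricted to Full assessments
    -- that start within the window after the Initial one ends.
    LEFT JOIN assessments AS f
           ON i.client = f.client
          AND f.assessment_type = 'Full'
          AND f.start_date BETWEEN i.end_date AND date(i.end_date, ?)
    WHERE i.assessment_type = 'Initial'
    """,
    (window,),
).fetchall()
```

Because the date filter sits in the ON clause rather than the WHERE clause, an Initial assessment with no Full assessment inside the window would still be returned, with NULLs in the Full columns.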
From your post, I assume that you're using Oracle here (as I see "Oracle" in the question).
In terms of "temp" tables, Views come right to mind. An Oracle View can give you different looks of a table which is what it sounds like you're looking for with different kinds of assessments.
Don Burleson is a good source for anything Oracle related and he gives some tips on Oracle Views at http://www.dba-oracle.com/concepts/views.htm

Group by run when there is no run number in data (was Show how changing the length of a production run affects time-to-build)

It would seem that there is a much simpler way to state the problem. Please see Edit 2, following the sample table.
I have a number of different products on a production line. I have the date that each product entered production. Each product has two identifiers: item number and serial number. I have the total number of labour hours for each product by item number and by serial number (i.e. I can tell you how many hours went into each object that was manufactured and what the average build time is for each kind of object).
I want to determine how (if) varying the length of production runs affects the average time it takes to build a product (item number). A production run is the sequential production of multiple serial numbers for a single item number. We have historical records going back several years with production runs varying in length from 1 to 30.
I think to achieve this, I need to be able to assign 'run id'. To me, that means building a query that sorts by start date and calculates a new unique value at each change in item number. If I knew how to do that, I could solve the rest of the problem on my own.
So that suggests a series of related questions:
Am I thinking about this the right way?
If I am on the right track, how do I generate those run id values? Calculate and store is an option, although I have a (misguided?) preference for direct queries. I know exactly how I would generate the run numbers in Excel, but I have a (misguided?) preference to do this in the database.
If I'm not on the right track, where might I find that track? :)
Edit:
Table structure (simplified) with sample data:
AutoID Item Serial StartDate Hours RunID (proposed calculation)
1 Legend 1234 2010-06-06 10 1
3 Legend 1235 2010-06-07 9 1
2 Legend 1237 2010-06-08 8 1
4 Apex 1236 2010-06-09 12 2
5 Apex 1240 2010-06-10 11 2
6 Legend 1239 2010-06-11 10 3
7 Legend 1238 2010-06-12 8 3
I have shown that start date, serial, and autoID are mutually unrelated. I have shown the expectation that labour goes down as the run length increases (but this is a 'fact' only via received wisdom, not data analysis). I have shown what I envision as the heart of the solution, that being a RunID that reflects sequential builds of a single item. I know that if I could get that runID, I could group by run to get counts, averages, totals, max, min, etc. In addition, I could do something like hours/ to get percentage change from the start of the run. At that point I could graph the trends associated with different run lengths either globally across all items or on a per item basis. (At least I think I could do all that. I might have to muck about a bit, but I think I could get it done.)
Edit 2: This problem would appear to be: how do I get the 'starting' member (earliest start date) of each run when I don't already have a runID? (The runID shown in the sample table does not exist and I was originally suggesting that being able to calculate runID was a potentially viable solution.)
AutoID Item
1 Legend
4 Apex
6 Legend
I'm assuming that having learned how to find the first member of each run that I would then be able to use what I've learned to find the last member of each run and then use those two results to get all other members of each run.
Edit 3: my version of a query that uses the AutoID of the first item in a run as the RunID for all units in a run. This was built entirely from samples and direction provided by Simon, who has the accepted answer. Using this as the basis for grouping by run, I can produce a variety of run statistics.
SELECT first_product_of_run.AutoID AS runID, run_sibling.AutoID AS itemID, run_sibling.Item, run_sibling.Serial, run_sibling.StartDate, run_sibling.Hours
FROM (SELECT first_of_run.AutoID, first_of_run.Item, first_of_run.Serial, first_of_run.StartDate, first_of_run.Hours
FROM dbo.production AS first_of_run LEFT OUTER JOIN
dbo.production AS earlier_in_run ON first_of_run.AutoID - 1 = earlier_in_run.AutoID AND
first_of_run.Item = earlier_in_run.Item
WHERE (earlier_in_run.AutoID IS NULL)) AS first_product_of_run LEFT OUTER JOIN
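dbo.production AS run_sibling ON first_product_of_run.Item = run_sibling.Item AND first_product_of_run.AutoID <> run_sibling.AutoID AND
first_product_of_run.StartDate < run_sibling.StartDate LEFT OUTER JOIN
dbo.production AS product_between ON first_product_of_run.Item <> product_between.Item AND
first_product_of_run.StartDate < product_between.StartDate AND
product_between.StartDate < run_sibling.StartDate
WHERE (product_between.AutoID IS NULL)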
Could you describe your table structure some more? If the "date that each product entered production" is a full time stamp, or if there is a sequential identifier across products, you can write queries to identify the first and last products of a run. From that, you can assign IDs to or calculate the length of the runs.
Edit:
Once you've identified 1,4, and 6 as the start of a run, you can use this query to find the other IDs in the run:
select first_product_of_run.AutoID, run_sibling.AutoID
from first_product_of_run
left join production run_sibling on first_product_of_run.Item = run_sibling.Item
and first_product_of_run.AutoID <> run_sibling.AutoID
and first_product_of_run.StartDate < run_sibling.StartDate
left join production product_between on first_product_of_run.Item <> product_between.Item
and first_product_of_run.StartDate < product_between.StartDate
and product_between.StartDate < run_sibling.StartDate
where product_between.AutoID is null
first_product_of_run can be a temp table, table variable, or sub-query that you used to find the start of a run. The key is the where product_between.AutoID is null. That restricts the results to only pairs where no different items were produced between them.
Edit 2, here's how to get the first of each run:
select first_of_run.AutoID
from
(
select product.AutoID, product.Item, MAX(previous_product.StartDate) as PreviousDate
from production product
left join production previous_product on product.AutoID <> previous_product.AutoID
and product.StartDate > previous_product.StartDate
group by product.AutoID, product.Item
) first_of_run
left join production earlier_in_run
on first_of_run.PreviousDate = earlier_in_run.StartDate
and first_of_run.Item = earlier_in_run.Item
where earlier_in_run.AutoID is null
It's not pretty, and will break if StartDate is not unique. The query could be simplified by adding a sequential and unique identifier with no gaps. In fact, that step will probably be necessary if StartDate is not unique. Here's how it would look:
select first_of_run.AutoID
from production first_of_run
left join production earlier_in_run
on (first_of_run.Sequence - 1) = earlier_in_run.Sequence
and first_of_run.Item = earlier_in_run.Item
where earlier_in_run.AutoID is null
Using outer joins to find where things aren't still twists my brain, but it's a very powerful technique.
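The Sequence-based version can be tried on the sample table from the question. This is a sketch using Python's sqlite3 module; since the table has no gap-free Sequence column, one is generated with ROW_NUMBER over StartDate, which is an assumption added here and relies on StartDate being unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE production (
        AutoID INTEGER, Item TEXT, Serial INTEGER, StartDate TEXT, Hours INTEGER);
    INSERT INTO production VALUES
        (1, 'Legend', 1234, '2010-06-06', 10),
        (3, 'Legend', 1235, '2010-06-07', 9),
        (2, 'Legend', 1237, '2010-06-08', 8),
        (4, 'Apex',   1236, '2010-06-09', 12),
        (5, 'Apex',   1240, '2010-06-10', 11),
        (6, 'Legend', 1239, '2010-06-11', 10),
        (7, 'Legend', 1238, '2010-06-12', 8);
    """
)

first_ids = [
    auto_id
    for (auto_id,) in conn.execute(
        """
        WITH sequenced AS (
            -- Gap-free, unique ordering by production start date.
            SELECT AutoID, Item,
                   ROW_NUMBER() OVER (ORDER BY StartDate) AS Sequence
            FROM production
        )
        SELECT first_of_run.AutoID
        FROM sequenced AS first_of_run
        LEFT JOIN sequenced AS earlier_in_run
               ON first_of_run.Sequence - 1 = earlier_in_run.Sequence
              AND first_of_run.Item = earlier_in_run.Item
        -- No same-item row immediately before means this row starts a run.
        WHERE earlier_in_run.AutoID IS NULL
        ORDER BY first_of_run.Sequence
        """
    )
]
```

The anti-join keeps only the rows whose immediate predecessor is a different item (or does not exist), which matches the run starts listed in Edit 2 of the question.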