In Firebird 2.5 I have a table of hardware device events; each row contains a timestamp, a device ID and an integer status of the event. I need to retrieve a rowset of the subset of IDs with non-0 statuses and the number of instances of the non-0 events for each ID, within a specified date range. I can get the subset of IDs with non-0 statuses in the specified date range, but I can't figure out how to get the count of non-0-status rows associated with each ID in the same rowset. I'd prefer to do this in a query rather than a stored proc, if possible.
The table is:
RPR_HISTORY
TSTAMP timestamp
RPRID integer
PARID integer
LASTRES integer
LASTCUR float
The rowset I want is like
RPRID ERRORCOUNT
-------------------
18 4
19 2
66 7
The query
select distinct RPRID from RPR_HISTORY
where (LASTRES <> 0)
and (TSTAMP >= :STARTSTAMP);
gives me the IDs I'm looking for, but obviously not the count of non-0-status rows for each ID. I've tried a bunch of combinations of nested queries derived from the above; all generate errors, usually grouping or aggregation errors. It seems like a straightforward thing to do, but it is just escaping me.
Got it! The query
select rh.RPRID, count(rh.RPRID) from RPR_HISTORY rh
where (rh.LASTRES <> 0)
and (rh.TSTAMP >= :STARTSTAMP)
and rh.RPRID in
(select distinct rd.RPRID from RPR_HISTORY rd where rd.LASTRES <> 0)
group by rh.RPRID;
returns the rowset I need.
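A note on the query above: since the WHERE clause already keeps only rows with LASTRES <> 0, the IN (...) subquery looks redundant, and a plain GROUP BY should give the same rowset. The following is a minimal sketch of that idea, using SQLite through Python as a stand-in for Firebird. The sample rows and the :ENDSTAMP parameter (added to close the date range) are assumptions for illustration, not part of the original query.

```python
import sqlite3

# SQLite stands in for Firebird here; all sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RPR_HISTORY ("
    "TSTAMP TEXT, RPRID INTEGER, PARID INTEGER, LASTRES INTEGER, LASTCUR REAL)"
)
rows = [
    ("2015-01-01 08:00:00", 18, 1, 3, 0.5),  # before the start stamp, ignored
    ("2015-02-01 08:00:00", 18, 1, 3, 0.5),
    ("2015-02-02 08:00:00", 18, 1, 0, 0.5),  # status 0, ignored
    ("2015-02-03 08:00:00", 18, 1, 7, 0.5),
    ("2015-02-01 09:00:00", 19, 1, 1, 0.5),
    ("2015-03-05 09:00:00", 19, 1, 2, 0.5),  # after the end stamp, ignored
    ("2015-02-01 10:00:00", 20, 1, 0, 0.5),  # only status 0, so ID 20 is absent
]
conn.executemany("INSERT INTO RPR_HISTORY VALUES (?, ?, ?, ?, ?)", rows)

# :ENDSTAMP is a hypothetical parameter; the original query only has :STARTSTAMP.
query = """
    SELECT RPRID, COUNT(*) AS ERRORCOUNT
    FROM RPR_HISTORY
    WHERE LASTRES <> 0
      AND TSTAMP >= :STARTSTAMP
      AND TSTAMP < :ENDSTAMP
    GROUP BY RPRID
    ORDER BY RPRID
"""
params = {"STARTSTAMP": "2015-02-01 00:00:00", "ENDSTAMP": "2015-03-01 00:00:00"}
error_counts = conn.execute(query, params).fetchall()
print(error_counts)
```

IDs that have only 0-status rows in the range never reach the GROUP BY, so they drop out without any extra filtering.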
Yes, I know this seems simple:
SELECT DISTINCT(...)
Except it apparently isn't.
Here is my actual Query:
SELECT
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
CompletedTrainings.DecShotDate,
CompletedTrainings.DecShotLocation,
CompletedTrainings.DecReason,
CompletedTrainings.DecExplanation,
IIf([DecShotLocation]="MCS","Yes","No") AS YesMCS,
IIf([DecReason]=1,1,0) AS YesAllergy,
IIf([DecReason]=2,1,0) AS YesImmune,
IIf([DecReason]=3,1,0) AS YesAdverse,
IIf([DecReason]=4,1,0) AS YesMedical,
IIf([DecReason]=5,1,0) AS YesSpiritual,
IIf([DecReason]=6,1,0) AS YesOther,
IIf([DecReason]=7,1,0) AS YesAlready
FROM
EmployeeInformation
INNER JOIN (CompletedTrainings
LEFT JOIN DeclinationReasons ON CompletedTrainings.DecReason = DeclinationReasons.ReasonID)
ON EmployeeInformation.ID = CompletedTrainings.Employee
GROUP BY
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
CompletedTrainings.DecShotDate,
CompletedTrainings.DecShotLocation,
CompletedTrainings.DecReason,
CompletedTrainings.DecExplanation,
IIf([DecShotLocation]="MCS","Yes","No"),
IIf([DecReason]=1,1,0),
IIf([DecReason]=2,1,0),
IIf([DecReason]=3,1,0),
IIf([DecReason]=4,1,0),
IIf([DecReason]=5,1,0),
IIf([DecReason]=6,1,0),
IIf([DecReason]=7,1,0)
HAVING
((((EmployeeInformation.Active) Like -1)
AND ((CompletedTrainings.DecShotDate + 365 >= DATE())
OR (CompletedTrainings.DecShotDate IS NULL))));
This joins a few tables (obviously) to get a number of records. The problem is that if someone appears more than once in the table, with a NULL in the date field of one record and a date in another, the query pulls both the NULL and the date, or pulls multiple NULLs. It might also pull multiple dates, but none of those are present at the moment.
I need the NULLs; they are actual data in this particular case. But if someone has both a date and a NULL, I need to pull only the newest record. I thought I could add MAX(RecordID) from the table, but that didn't change the results of the query either.
That code:
SELECT
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
MAX(CompletedTrainings.RecordID),
CompletedTrainings.DecShotDate
...
It returned the same problem: duplicated EmployeeInformation.ID values with different DecShotDate values.
Currently it returns:
ID | Active | DecShotDate | etc. x a bunch
---+--------+-------------+---------------
1  | -1     | date date   | whatever goes
2  | -1     |             | in these
2  | -1     | date date   | columns
These results feed a report that determines the total number of employees who fit the report's criteria. The NULLs in DecShotDate are needed because they show people who did not refuse a flu vaccine in the current year, while the dates show people who did refuse.
I have come up with one simple solution: I could add a column to the CompletedTrainings table that contains a date or other value, and add that to the HAVING clause. This might be the right solution, as this is a yearly training questionnaire that employees have to fill out, but I am asking for advice before doing this.
Am I right in thinking I need to add a column to filter by so that older data isn't pulled, or should I be able to do this by pulling RecordID, and did I just bork that part of the query?
Edited to add raw table views:
EmployeeInformation Table:
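ID | Last | First  | empID | Active | Termdate | DoH  | Title | PT/FT/PD | PI
---+------+--------+-------+--------+----------+------+-------+----------+---
1  | Doe  | Jane   | 982   | -1     |          | date | Sr    | PD       | X
2  | Roe  | John   | 278   | 0      | date     | date | Jr    | PD       | X
3  | Moe  | Larry  | 1232  | -1     |          | date | Sr    | FT       | X
4  | Zoe  | Debbie | 1424  | -1     |          | date | Sr    | PT       | X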
DeclinationReasons Table:
ReasonID | Reason
---------+---------------
1        | Allergy
2        | Already got it
3        | Illness
CompletedTrainings Table:
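RecordID | Employee | Training | ... | DecShotdate | DecShotLocation | DecShotReason | DecExp
---------+----------+----------+-----+-------------+-----------------+---------------+-------
1        | 1        | 4        | ... | date        | location        | 2             | text
2        | 1        | 4        | ... |             |                 |               |
3        | 2        | 4        | ... |             |                 |               |
4        | 3        | 4        | ... | date        | location        | 3             | text
5        | 3        | 4        | ... | date        | location        | 1             | text
6        | 4        | 4        | ... |             |                 |               |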
After some serious soul searching, I decided to use another column and filter by that.
In the end my query looks like this:
SELECT *
FROM (
    (
        SELECT RecordID, DecShotDate, DecShotLocation, DecReason, DecExplanation, Employee,
            IIf([DecShotLocation]="MCS","Yes","No") AS YesMCS,
            IIf([DecReason]=1,1,0) AS YesAllergy,
            IIf([DecReason]=2,1,0) AS YesImmune,
            IIf([DecReason]=3,1,0) AS YesAdverse,
            IIf([DecReason]=4,1,0) AS YesMedical,
            IIf([DecReason]=5,1,0) AS YesSpiritual,
            IIf([DecReason]=6,1,0) AS YesOther,
            IIf([DecReason]=7,1,0) AS YesAlready
        FROM CompletedTrainings
        WHERE (CompletedDate > DATE() - 365 ) AND (Training = 69)
    ) AS T1
    LEFT JOIN
    (
        SELECT ID, Active FROM EmployeeInformation
    ) AS T2 ON T1.Employee = T2.ID
)
LEFT JOIN
(
    SELECT Reason, ReasonID FROM DeclinationReasons
) AS T3 ON T1.DecReason = T3.ReasonID;
This may not have been the best solution, but it did exactly what I needed, which is to get the information from the latest entry in the database.
Previously I had tried to use MAX(), DISTINCT(), etc., but always had the problem of multiple records being retrieved. In this case, I intentionally SELECT the most recent records first, then join them to the results of the next query, and so on, until I have all the required data for my report.
I write this in the hope that someone else finds it useful or, even better, that someone tells me why this is wrong so I can improve my own skills.
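On why the earlier MAX(RecordID) attempt changed nothing: DecShotDate was still selected as a plain column, so it presumably stayed in the GROUP BY, and every distinct date (or NULL) per employee formed its own group with its own MAX. One common alternative, sketched below, is to find the highest RecordID per employee in a subquery first and join back to it. This assumes RecordID increases with each new entry, which the post does not confirm. SQLite via Python stands in for Access, and the date strings are hypothetical placeholders for the "date" cells in the raw table view.

```python
import sqlite3

# SQLite stands in for Access; only the columns needed for the sketch are created.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CompletedTrainings ("
    "RecordID INTEGER, Employee INTEGER, Training INTEGER, DecShotDate TEXT)"
)
# Rows mirror the raw table view in the question; the dates are invented placeholders.
conn.executemany(
    "INSERT INTO CompletedTrainings VALUES (?, ?, ?, ?)",
    [
        (1, 1, 4, "2020-10-01"),
        (2, 1, 4, None),
        (3, 2, 4, None),
        (4, 3, 4, "2020-10-03"),
        (5, 3, 4, "2020-10-05"),
        (6, 4, 4, None),
    ],
)

# Step 1 (subquery): highest RecordID per employee.
# Step 2 (join): fetch the full row for exactly that RecordID.
query = """
    SELECT ct.RecordID, ct.Employee, ct.DecShotDate
    FROM CompletedTrainings ct
    JOIN (
        SELECT Employee, MAX(RecordID) AS LatestID
        FROM CompletedTrainings
        GROUP BY Employee
    ) latest ON ct.RecordID = latest.LatestID
    ORDER BY ct.Employee
"""
latest_rows = conn.execute(query).fetchall()
print(latest_rows)
```

Each employee appears once, and a NULL DecShotDate survives whenever the newest record carries one.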
OK, another newbie SQL question which I'm sure has a simple solution, and I'll kick myself when someone posts the answer!
I have two tables as follows
PRICE_DTA
PRC_DATE PRC_TIME PRC_PRICE PRC_ITEM
2008-01-01 06.00.00 1.05 JUMPER
2008-01-01 09.00.00 1.20 JUMPER
2008-01-25 17.00.00 1.75 JUMPER
2008-01-02 09.00.00 2.25 TROUSERS
2008-10-25 12.00.00 2.95 TROUSERS
SALE_DTA
TRN_DATE TRN_TIME TRN_PRICE_PAID TRN_ITEM
2008-01-01 08.30.00 JUMPER
2008-01-03 10.00.00 JUMPER
2008-01-03 17.00.00 JUMPER
2008-01-01 13.00.00 TROUSERS
2008-01-02 09.00.00 TROUSERS
The way the prices work is that you get the NEXT available price (prices aren't set until after the purchase because we bulk all the orders up and get a cheaper price the more we buy in one go). So, in the example, the 08.30.00 purchase on 2008-01-01 will have been for 1.20, because that is the first available price after the purchase date/time.
So, I need to populate the prices in the SALE_DTA table, using the TRN_DATE/TRN_TIME fields to go and get the next available price from the PRICE_DTA table. NOTE: The DATE and TIME fields on both tables are CHAR fields, not date/timestamp fields.
I can concatenate the date and time easily enough, but I'm not sure how to find the FIRST record on PRICE_DTA with a date/time stamp greater than that. I know on UNISYS DMS II I can use a 'FIND NEXT GREATER THAN', but I can't find a similar command in SQL.
I'm happy to create a temporary table as part of the solution if that makes it simpler.
The generic SQL solution for this can be done with a couple of joins:
SELECT
* --TODO - Pick appropriate columns
FROM
SALE_DTA s
INNER JOIN
PRICE_DTA p
ON
p.PRC_ITEM = s.TRN_ITEM and
(p.PRC_DATE > s.TRN_DATE or
(p.PRC_DATE = s.TRN_DATE and
p.PRC_TIME > s.TRN_TIME
))
LEFT JOIN
PRICE_DTA p2
ON
p2.PRC_ITEM = s.TRN_ITEM and
(p2.PRC_DATE > s.TRN_DATE or
(p2.PRC_DATE = s.TRN_DATE and
p2.PRC_TIME > s.TRN_TIME
)) and
(p2.PRC_DATE < p.PRC_DATE or
(p2.PRC_DATE = p.PRC_DATE and
p2.PRC_TIME < p.PRC_TIME
))
WHERE
p2.PRC_ITEM IS NULL
Hopefully, you can see the logic here. The INNER JOIN matches each row in SALE_DTA with all rows in PRICE_DTA that occur afterwards. We then do a second join (the LEFT JOIN) to PRICE_DTA again, this time trying to locate a row (p2) that still occurs after the s date/time but before the p date/time.
Finally, in the WHERE clause, we eliminate any rows where this LEFT JOIN actually succeeded. Therefore, by deduction, we know that the row that we matched in p is the earliest row from PRICE_DTA which occurs after the SALE_DTA date/time.
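To check the logic against the sample data from the question, here is a small sketch that runs the same join pattern in SQLite through Python. SQLite is only a stand-in for the real database, and the column types (TEXT and REAL) are assumptions. The fixed-width date and time strings compare correctly as plain text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRICE_DTA (PRC_DATE TEXT, PRC_TIME TEXT, PRC_PRICE REAL, PRC_ITEM TEXT)")
conn.execute("CREATE TABLE SALE_DTA (TRN_DATE TEXT, TRN_TIME TEXT, TRN_PRICE_PAID REAL, TRN_ITEM TEXT)")
conn.executemany("INSERT INTO PRICE_DTA VALUES (?, ?, ?, ?)", [
    ("2008-01-01", "06.00.00", 1.05, "JUMPER"),
    ("2008-01-01", "09.00.00", 1.20, "JUMPER"),
    ("2008-01-25", "17.00.00", 1.75, "JUMPER"),
    ("2008-01-02", "09.00.00", 2.25, "TROUSERS"),
    ("2008-10-25", "12.00.00", 2.95, "TROUSERS"),
])
conn.executemany("INSERT INTO SALE_DTA (TRN_DATE, TRN_TIME, TRN_ITEM) VALUES (?, ?, ?)", [
    ("2008-01-01", "08.30.00", "JUMPER"),
    ("2008-01-03", "10.00.00", "JUMPER"),
    ("2008-01-03", "17.00.00", "JUMPER"),
    ("2008-01-01", "13.00.00", "TROUSERS"),
    ("2008-01-02", "09.00.00", "TROUSERS"),
])

# p  = any price set strictly after the sale
# p2 = a price that is also after the sale but earlier than p
# Keeping only rows where no p2 exists leaves the earliest later price.
query = """
    SELECT s.TRN_ITEM, s.TRN_DATE, s.TRN_TIME, p.PRC_PRICE
    FROM SALE_DTA s
    INNER JOIN PRICE_DTA p
        ON p.PRC_ITEM = s.TRN_ITEM
       AND (p.PRC_DATE > s.TRN_DATE
            OR (p.PRC_DATE = s.TRN_DATE AND p.PRC_TIME > s.TRN_TIME))
    LEFT JOIN PRICE_DTA p2
        ON p2.PRC_ITEM = s.TRN_ITEM
       AND (p2.PRC_DATE > s.TRN_DATE
            OR (p2.PRC_DATE = s.TRN_DATE AND p2.PRC_TIME > s.TRN_TIME))
       AND (p2.PRC_DATE < p.PRC_DATE
            OR (p2.PRC_DATE = p.PRC_DATE AND p2.PRC_TIME < p.PRC_TIME))
    WHERE p2.PRC_ITEM IS NULL
    ORDER BY s.TRN_ITEM, s.TRN_DATE, s.TRN_TIME
"""
next_prices = conn.execute(query).fetchall()
for row in next_prices:
    print(row)
```

Note that the TROUSERS sale at 2008-01-02 09.00.00 picks up 2.95 rather than 2.25, because the comparison is strictly "greater than" and the 2.25 price carries exactly the same date and time as the sale.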
You can certainly get the data required, but DB2 doesn't support a JOIN in an UPDATE statement, so you can take a different route:
Create an auxiliary table
create table SALE_DTA_temp like SALE_DTA
Insert into the temp table from the query
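insert into SALE_DTA_temp (TRN_DATE, TRN_TIME, TRN_PRICE_PAID, TRN_ITEM)
select sd.TRN_DATE,
sd.TRN_TIME,
-- price of the earliest PRICE_DTA row for the same item after the sale;
-- the CHAR date and time are concatenated so they compare as one key
(select p.PRC_PRICE
from PRICE_DTA p
where p.PRC_ITEM = sd.TRN_ITEM
and (p.PRC_DATE || p.PRC_TIME) =
(select min(p2.PRC_DATE || p2.PRC_TIME)
from PRICE_DTA p2
where p2.PRC_ITEM = sd.TRN_ITEM
and (p2.PRC_DATE || p2.PRC_TIME) > (sd.TRN_DATE || sd.TRN_TIME))
) as TRN_PRICE_PAID,
sd.TRN_ITEM
from SALE_DTA sd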
Drop the old table
drop table SALE_DTA
Rename the table
RENAME TABLE SALE_DTA_temp TO SALE_DTA
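As an alternative to rebuilding the table, a correlated scalar subquery in the SET clause can fill TRN_PRICE_PAID in place without any JOIN in the UPDATE. The sketch below shows the idea in SQLite through Python, so the ORDER BY ... LIMIT 1 form is SQLite syntax and would need adapting for DB2. The column types are assumptions, and the sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRICE_DTA (PRC_DATE TEXT, PRC_TIME TEXT, PRC_PRICE REAL, PRC_ITEM TEXT);
    CREATE TABLE SALE_DTA (TRN_DATE TEXT, TRN_TIME TEXT, TRN_PRICE_PAID REAL, TRN_ITEM TEXT);
    INSERT INTO PRICE_DTA VALUES
        ('2008-01-01', '06.00.00', 1.05, 'JUMPER'),
        ('2008-01-01', '09.00.00', 1.20, 'JUMPER'),
        ('2008-01-25', '17.00.00', 1.75, 'JUMPER'),
        ('2008-01-02', '09.00.00', 2.25, 'TROUSERS'),
        ('2008-10-25', '12.00.00', 2.95, 'TROUSERS');
    INSERT INTO SALE_DTA (TRN_DATE, TRN_TIME, TRN_ITEM) VALUES
        ('2008-01-01', '08.30.00', 'JUMPER'),
        ('2008-01-03', '10.00.00', 'JUMPER'),
        ('2008-01-03', '17.00.00', 'JUMPER'),
        ('2008-01-01', '13.00.00', 'TROUSERS'),
        ('2008-01-02', '09.00.00', 'TROUSERS');
""")

# For each sale, take the first price row (ordered by date + time) that is
# strictly later than the sale. Sales with no later price are left as NULL.
conn.execute("""
    UPDATE SALE_DTA
    SET TRN_PRICE_PAID = (
        SELECT p.PRC_PRICE
        FROM PRICE_DTA p
        WHERE p.PRC_ITEM = SALE_DTA.TRN_ITEM
          AND (p.PRC_DATE || p.PRC_TIME) > (SALE_DTA.TRN_DATE || SALE_DTA.TRN_TIME)
        ORDER BY p.PRC_DATE || p.PRC_TIME
        LIMIT 1
    )
""")

updated = conn.execute(
    "SELECT TRN_ITEM, TRN_DATE, TRN_TIME, TRN_PRICE_PAID "
    "FROM SALE_DTA ORDER BY TRN_ITEM, TRN_DATE, TRN_TIME"
).fetchall()
for row in updated:
    print(row)
```

Concatenating the CHAR date and time works here only because both are fixed-width and ordered from most to least significant unit.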
I have a state machine architecture, where a record will have many state transitions, the one with the greatest sort_key column being the current state. My problem is to determine which records held a particular state (or states) for a given date.
Example data:
items table
id
1
item_transitions table
id item_id created_at to_state sort_key
1 1 05/10 "state_a" 1
2 1 05/12 "state_b" 2
3 1 05/15 "state_a" 3
4 1 05/16 "state_b" 4
Problem:
Determine all records from items table which held state "state_a" on date 05/15. This should obviously return the item in the example data, but if you query with date "05/16", it should not.
I presume I'll be using a LEFT OUTER JOIN to join the item_transitions table to itself and narrow down the possibilities until I have something to query on that will give me the items that I need. Perhaps I am overlooking something much simpler.
Your question rephrased means "give me all items which were changed to state_a on 05/15 or before and were not changed to another state between that change and 05/15". Please note that for the example I added 2001 as the year to get a valid date. If your "created_at" column is not a datetime, I strongly suggest changing it.
So first you can retrieve the last sort_key for all items before the threshold date:
SELECT item_id,max(sort_key) last_change_sort_key
FROM item_transitions it
WHERE created_at<='05/15/2001'
GROUP BY item_id
Next step is to join this result back to the item_transitions table to see to which state the item was switched at this specific sort_key:
SELECT *
FROM item_transitions it
JOIN (SELECT item_id,max(sort_key) last_change_sort_key
FROM item_transitions it
WHERE created_at<='05/15/2001'
GROUP BY item_id) tmp ON it.item_id=tmp.item_id AND it.sort_key=tmp.last_change_sort_key
Finally you only want those who switched to 'state_a' so just add a condition:
SELECT DISTINCT it.item_id
FROM item_transitions it
JOIN (SELECT item_id,max(sort_key) last_change_sort_key
FROM item_transitions it
WHERE created_at<='05/15/2001'
GROUP BY item_id) tmp ON it.item_id=tmp.item_id AND it.sort_key=tmp.last_change_sort_key
WHERE it.to_state='state_a'
You did not mention which DBMS you use, but I think this query should work with the most common ones.
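To try the three steps on the example data, here is a minimal sketch using SQLite through Python. SQLite is just a convenient stand-in, the dates are stored as ISO text with the year 2001 added as above, and items_in_state is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_transitions ("
    "id INTEGER, item_id INTEGER, created_at TEXT, to_state TEXT, sort_key INTEGER)"
)
conn.executemany("INSERT INTO item_transitions VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "2001-05-10", "state_a", 1),
    (2, 1, "2001-05-12", "state_b", 2),
    (3, 1, "2001-05-15", "state_a", 3),
    (4, 1, "2001-05-16", "state_b", 4),
])


def items_in_state(state, as_of):
    """Return ids of items whose last transition on or before as_of was to state."""
    query = """
        SELECT DISTINCT it.item_id
        FROM item_transitions it
        JOIN (SELECT item_id, MAX(sort_key) AS last_change_sort_key
              FROM item_transitions
              WHERE created_at <= :as_of
              GROUP BY item_id) tmp
          ON it.item_id = tmp.item_id
         AND it.sort_key = tmp.last_change_sort_key
        WHERE it.to_state = :state
        ORDER BY it.item_id
    """
    return [row[0] for row in conn.execute(query, {"state": state, "as_of": as_of})]


print(items_in_state("state_a", "2001-05-15"))
print(items_in_state("state_a", "2001-05-16"))
```

With this data the first call finds item 1 and the second finds nothing, matching the expectation stated in the question.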
I am working on a project that keeps a track of repaired cell phones.
In the select statement, I would like to find the duplicate IMEI numbers and check whether the AddedDate values of the duplicates are less than 30 days apart. In other words, the select should list all the phones, including those with duplicated IMEI numbers if their AddedDate values are more than 30 days apart.
I hope I described it clear enough. Thank you.
Additional notes:
I have tried it by including groupBy under a sub-select which did find the duplicates, but I wasn't able to implement an if condition. Instead, I was going to place all duplicates into a dynamic table and then use a select statement against this table. Before doing so, I thought of posting my question here.
For example DB_Phones has the following rows
ID - AddedDate - IMEI
1 - 01.10.2012 - 123456789012345
2 - 15.10.2012 - 987654321012345
3 - 20.10.2012 - 123456789012345
Based on the table above, I would like to list only the second row (ID# 2) because the last duplicate (ID# 3) wasn't added 30 days after the row with the ID# 1. If rows were as below:
ID - AddedDate - IMEI
1 - 01.10.2012 - 123456789012345
2 - 15.10.2012 - 987654321012345
3 - 20.10.2012 - 123456789012345
4 - 21.11.2012 - 123456789012345
Then the second and fourth row should be returned. I need to return just one of the duplicates (last one) if the 30 day condition is met.
I hope it makes more sense now. Thanks again.
A guess at what you're after:
SELECT
r.*,
(SELECT COUNT(*) FROM Repairs r2 WHERE r.IMEI = r2.IMEI
AND r.ID != r2.ID) as NumberOfAllDuplicates,
(SELECT COUNT(*) FROM Repairs r2 WHERE r.IMEI = r2.IMEI
AND ABS(DATEDIFF(day, r.AddedDate, r2.AddedDate)) < 30
AND r.ID != r2.ID) as NumberOfNearDuplicates
FROM
Repairs r
This depends on having an ID field and on everything existing in one table. Because of the correlated subqueries, it may not be very fast on large tables.
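If the goal is to return the rows themselves rather than counts, one possible reading of the two examples in the question is: keep a row only if it is the latest entry for its IMEI and no other entry for that IMEI falls within the 30 days before it. The sketch below implements that reading with two NOT EXISTS checks, using SQLite through Python. The julianday() date arithmetic is SQLite-specific, the dates are rewritten in ISO format, and reportable_ids is a hypothetical helper name.

```python
import sqlite3


def reportable_ids(rows):
    """Return IDs of rows that are the latest for their IMEI and have no
    other row with the same IMEI added less than 30 days earlier."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Repairs (ID INTEGER, AddedDate TEXT, IMEI TEXT)")
    conn.executemany("INSERT INTO Repairs VALUES (?, ?, ?)", rows)
    query = """
        SELECT r.ID
        FROM Repairs r
        WHERE NOT EXISTS (            -- no later row for this IMEI
                SELECT 1 FROM Repairs n
                WHERE n.IMEI = r.IMEI AND n.AddedDate > r.AddedDate)
          AND NOT EXISTS (            -- no other row within the previous 30 days
                SELECT 1 FROM Repairs p
                WHERE p.IMEI = r.IMEI
                  AND p.ID <> r.ID
                  AND julianday(r.AddedDate) - julianday(p.AddedDate) < 30)
        ORDER BY r.ID
    """
    ids = [row[0] for row in conn.execute(query)]
    conn.close()
    return ids


first_example = [
    (1, "2012-10-01", "123456789012345"),
    (2, "2012-10-15", "987654321012345"),
    (3, "2012-10-20", "123456789012345"),
]
second_example = first_example + [(4, "2012-11-21", "123456789012345")]

print(reportable_ids(first_example))
print(reportable_ids(second_example))
```

In SQL Server, which the DATEDIFF(day, ...) call in the query above suggests, the julianday() subtraction would be replaced by DATEDIFF.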