I have a table like the one below. I need to count how many times each ID went from 'Waste Sale' in one row to 'On Stop' in the very next row, ordered by ascending date. If an ID has no such transition, its count should be 0.
ID  Stage name  Stage Changed Date
1   Waste Sale  06-05-2022
1   On Stop     08-06-2022
1   Cancelled   09-02-2022
2   Waste Sale  06-05-2022
2   On Stop     07-05-2022
2   Waste Sale  08-06-2022
2   On Stop     10-07-2022
3   Cancelled   10-07-2022
3   On Stop     11-07-2022
The result I would be looking for based on the above table would be something like this:
ID  Count of 'Waste Sales to On Stops'
1   1
2   2
3   0
ID 1 has a count of 1 because there was one instance of 'Waste Sale' changing to 'On Stop' in the very next row by date order.
ID 3 has a count of 0 because, even though the stage name changed to 'On Stop', the previous row by date order wasn't 'Waste Sale'.
I have a hunch I would have to use something like LEAD() and GROUP BY / ORDER BY, but since I'm so new to SQL I would really appreciate some help with the specific syntax and coding. Any version of SQL is okay.
We can use the window function LEAD() to peek at the next row's value in the query result.
select distinct id,
    (
        select count(*)
        from (
            select *,
                lead(stage_name) over (
                    partition by id
                    order by stage_changed_date
                ) as stage_next
            from sales s2
        ) s3
        where s3.id = s1.id
            and s3.stage_name = 'waste sale'
            and s3.stage_next = 'on stop'
    ) as count_of_waste_sales_to_on_stop
from sales s1
order by id;
The query above uses lead(stage_name) over(partition by id order by stage_changed_date) to get the next stage_name in the query result, partitioned by id and ordered by stage_changed_date. Check the query on DB Fiddle.
Note:
I have no experience with Zoho, so I'm unsure whether the query will work there as-is. They say it supports ANSI SQL, but there might be some differences from MySQL.
The column names are not exactly the same as in the OP's question, because testing was only done on DB Fiddle.
There might be a better query out there waiting to be written.
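The same idea can also be written without the correlated subquery, by computing LEAD() once and then aggregating per id. The following is only a sketch, run against SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later). The table name sales, the column names, and the ISO-formatted dates are assumptions: the question's dates are read here as MM-DD-YYYY and rewritten as YYYY-MM-DD so that a text sort is chronological. The string literals match the capitalisation in the table because SQLite's = comparison is case-sensitive by default.

```python
import sqlite3

# Assumed table/column names; dates rewritten as ISO text (YYYY-MM-DD),
# reading the question's dates as MM-DD-YYYY.
ROWS = [
    (1, "Waste Sale", "2022-06-05"),
    (1, "On Stop", "2022-08-06"),
    (1, "Cancelled", "2022-09-02"),
    (2, "Waste Sale", "2022-06-05"),
    (2, "On Stop", "2022-07-05"),
    (2, "Waste Sale", "2022-08-06"),
    (2, "On Stop", "2022-10-07"),
    (3, "Cancelled", "2022-10-07"),
    (3, "On Stop", "2022-11-07"),
]

QUERY = """
SELECT id,
       SUM(CASE WHEN stage_name = 'Waste Sale'
                 AND stage_next = 'On Stop' THEN 1 ELSE 0 END) AS transitions
FROM (
    SELECT id,
           stage_name,
           LEAD(stage_name) OVER (
               PARTITION BY id
               ORDER BY stage_changed_date
           ) AS stage_next
    FROM sales
) AS s
GROUP BY id
ORDER BY id
"""


def count_transitions(rows):
    """Return (id, transitions) pairs, including ids with zero transitions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (id INTEGER, stage_name TEXT, stage_changed_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(count_transitions(ROWS))  # [(1, 1), (2, 2), (3, 0)]
```

Because every id appears in the inner query regardless of its stages, ids with no matching transition still get a row with a count of 0.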
Yes, I know this seems simple:
SELECT DISTINCT(...)
Except, it apparently isn't
Here is my actual Query:
SELECT
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
CompletedTrainings.DecShotDate,
CompletedTrainings.DecShotLocation,
CompletedTrainings.DecReason,
CompletedTrainings.DecExplanation,
IIf([DecShotLocation]="MCS","Yes","No") AS YesMCS,
IIf([DecReason]=1,1,0) AS YesAllergy,
IIf([DecReason]=2,1,0) AS YesImmune,
IIf([DecReason]=3,1,0) AS YesAdverse,
IIf([DecReason]=4,1,0) AS YesMedical,
IIf([DecReason]=5,1,0) AS YesSpiritual,
IIf([DecReason]=6,1,0) AS YesOther,
IIf([DecReason]=7,1,0) AS YesAlready
FROM
EmployeeInformation
INNER JOIN (CompletedTrainings
LEFT JOIN DeclinationReasons ON CompletedTrainings.DecReason = DeclinationReasons.ReasonID)
ON EmployeeInformation.ID = CompletedTrainings.Employee
GROUP BY
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
CompletedTrainings.DecShotDate,
CompletedTrainings.DecShotLocation,
CompletedTrainings.DecReason,
CompletedTrainings.DecExplanation,
IIf([DecShotLocation]="MCS","Yes","No"),
IIf([DecReason]=1,1,0),
IIf([DecReason]=2,1,0),
IIf([DecReason]=3,1,0),
IIf([DecReason]=4,1,0),
IIf([DecReason]=5,1,0),
IIf([DecReason]=6,1,0),
IIf([DecReason]=7,1,0)
HAVING
((((EmployeeInformation.Active) Like -1)
AND ((CompletedTrainings.DecShotDate + 365 >= DATE())
OR (CompletedTrainings.DecShotDate IS NULL))));
This joins a few tables (obviously) in order to get a number of records. The problem is that if someone appears more than once in the table, with a NULL in the date field of one record and a date in another, the query pulls both the NULL and the date, or pulls multiple NULLs. It might also pull multiple dates, but no such records are present at the moment.
I need the NULLs; they are actual data in this particular case. But if someone has both a date and a NULL, I need to pull only the newest record. I thought I could add MAX(RecordID) from the table, but that didn't change the results of the query either.
That code:
SELECT
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
MAX(CompletedTrainings.RecordID),
CompletedTrainings.DecShotDate
...
It showed the same issue: duplicated EmployeeInformation.ID values with different DecShotDate values.
Currently it returns:
ID  Active  DecShotDate  etc. x a bunch
1   -1      date date    whatever goes
2   -1      (NULL)       in these
2   -1      date date    columns
These are used in a report that determines the total number of employees who fit the report's criteria. The NULLs in DecShotDate are needed because they show people who did not refuse a flu vaccine in the current year, while the dates show people who did refuse.
I have come up with one simple solution: I could add a column to the CompletedTrainings table that contains a date or other value, and add that to the HAVING clause. This might be the right solution, as this is a yearly training questionnaire that employees have to fill out. But I am asking for advice before doing this.
Am I right in thinking I need to add a column to filter by so that older data isn't being pulled, or should I be able to do this by pulling recordID, and did I just bork that part of the query up?
Edited to add raw table views:
EmployeeInformation Table:
ID  Last  First   empID  Active  Termdate  DoH   Title  PT/FT/PD  PI
1   Doe   Jane    982    -1                date  Sr     PD        X
2   Roe   John    278    0       date      date  Jr     PD        X
3   Moe   Larry   1232   -1                date  Sr     FT        X
4   Zoe   Debbie  1424   -1                date  Sr     PT        X
DeclinationReasons Table:
ReasonID  Reason
1         Allergy
2         Already got it
3         Illness
CompletedTrainings Table:
RecordID  Employee  Training  ...  DecShotdate  DecShotLocation  DecShotReason  DecExp
1         1         4              date         location         2              text
2         1         4
3         2         4
4         3         4              date         location         3              text
5         3         4              date         location         1              text
6         4         4
After some serious soul searching, I decided to use another column and filter by that.
In the end my query looks like this:
SELECT *
FROM (
    (
        SELECT RecordID, DecShotDate, DecShotLocation, DecReason, DecExplanation, Employee,
            IIf([DecShotLocation]="MCS","Yes","No") AS YesMCS,
            IIf([DecReason]=1,1,0) AS YesAllergy,
            IIf([DecReason]=2,1,0) AS YesImmune,
            IIf([DecReason]=3,1,0) AS YesAdverse,
            IIf([DecReason]=4,1,0) AS YesMedical,
            IIf([DecReason]=5,1,0) AS YesSpiritual,
            IIf([DecReason]=6,1,0) AS YesOther,
            IIf([DecReason]=7,1,0) AS YesAlready
        FROM CompletedTrainings
        WHERE (CompletedDate > DATE() - 365) AND (Training = 69)
    ) AS T1
    LEFT JOIN
    (
        SELECT ID, Active FROM EmployeeInformation
    ) AS T2 ON T1.Employee = T2.ID
)
LEFT JOIN
(
    SELECT Reason, ReasonID FROM DeclinationReasons
) AS T3 ON T1.DecReason = T3.ReasonID;
This may not be the best solution, but it did exactly what I needed, which is to get the information from the latest entry in the database.
Previously I had tried MAX(), DISTINCT(), etc., but always ended up with multiple records being retrieved. In this case, I intentionally SELECT the most recent records first, then join them to the results of the next query, and so on, until I have all the required data for my report.
I write this in the hope that someone else finds it useful, or even better, that someone tells me why this is wrong so I can improve my own skills.
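For anyone who still wants the RecordID route: adding MAX(RecordID) to the select list while DecShotDate stays in the GROUP BY cannot collapse the duplicates, because every distinct DecShotDate (including NULL) forms its own group. One common alternative is to find the highest RecordID per employee in a derived table and join back to it. The sketch below runs in SQLite through Python's sqlite3 module, not in Access, so treat the syntax as illustrative. It assumes that a higher RecordID means a newer record, uses a cut-down copy of CompletedTrainings, and the dates in it are made up.

```python
import sqlite3

# Cut-down, hypothetical copy of CompletedTrainings:
# (RecordID, Employee, DecShotDate, DecReason). Dates are invented.
ROWS = [
    (1, 1, "2022-10-01", 2),
    (2, 1, None, None),
    (3, 2, None, None),
    (4, 3, "2022-10-03", 3),
    (5, 3, "2022-10-05", 1),
    (6, 4, None, None),
]

QUERY = """
SELECT t.RecordID, t.Employee, t.DecShotDate, t.DecReason
FROM CompletedTrainings AS t
INNER JOIN (
    -- one row per employee: the highest (assumed newest) RecordID
    SELECT Employee, MAX(RecordID) AS LatestID
    FROM CompletedTrainings
    GROUP BY Employee
) AS m ON t.RecordID = m.LatestID
ORDER BY t.Employee
"""


def latest_per_employee(rows):
    """Return only the newest CompletedTrainings row for each employee."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CompletedTrainings "
        "(RecordID INTEGER, Employee INTEGER, DecShotDate TEXT, DecReason INTEGER)"
    )
    conn.executemany("INSERT INTO CompletedTrainings VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in latest_per_employee(ROWS):
    print(row)
```

With this data, employee 1 keeps only record 2 (the NULL date) and employee 3 keeps only record 5, so NULLs survive when they are the newest entry.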
I'm developing a system using Trac, and I want to limit the number of "changelog" entries returned. The issue is that Trac collates these entries from multiple tables using a union, and then later combines them into single 'changesets' based on their timestamp. I wish to limit the results to the latest e.g. 3 changesets, but this requires retrieving as many rows as necessary until I've got 3 unique timestamps. Solution needs to work for SQLite/Postgres.
Trac's current SQL
Current SQL Result
Time User Field oldvalue newvalue permanent
=======================================================================
1371806593507544 a owner b c 1
1371806593507544 a comment 2 lipsum 1
1371806593507544 a description foo bar 1
1371806593324529 b comment hello world 1
1371806593125677 c priority minor major 1
1371806592492812 d comment x y 1
Intended SQL Result (Limited to 1 timestamp e.g.)
Time User Field oldvalue newvalue permanent
=======================================================================
1371806593507544 a owner b c 1
1371806593507544 a comment 2 lipsum 1
1371806593507544 a description foo bar 1
As you already pointed out on your own, this cannot be resolved in SQL due to the undetermined number of results. And I think this is not even required.
You can use a slightly modified trac/ticket/templates/ticket.html Genshi template to get what you want. Change
<div id="changelog">
<py:for each="change in changes">
into
<div id="changelog">
<py:for each="change in changes[-3:]">
and place the file into <env>/templates/, then restart your web server. But watch out for changes to ticket.html whenever you upgrade your Trac install: each time you do that, you might need to re-apply this change to the current template of the respective version. But IMHO it's still a lot faster and cleaner than patching Trac core code.
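The slice in that template expression is ordinary Python slicing. As a small sketch of what it does, assuming changes behaves like a list of already-grouped changesets ordered oldest first (the dictionaries below are hypothetical stand-ins, not Trac's real data structure):

```python
# Hypothetical stand-in for the template's `changes` list, oldest first.
changes = [
    {"date": 1371806592492812, "author": "d"},
    {"date": 1371806593125677, "author": "c"},
    {"date": 1371806593324529, "author": "b"},
    {"date": 1371806593507544, "author": "a"},
]

# A negative slice keeps the last three items without changing their order.
latest = changes[-3:]
print([change["author"] for change in latest])  # ['c', 'b', 'a']

# If there are fewer than three items, the slice simply returns all of them.
print(len(changes[:2][-3:]))  # 2
```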
If you want just three records (as in the "Data Limit 1" result set), you can use limit:
select *
from t
order by time desc
limit 3
If you want all records for the three most recent time stamps, you can use a join:
select t.*
from t
join (
    select distinct time
    from t
    order by time desc
    limit 3
) tt on tt.time = t.time
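To check the join approach against the sample data from the question, here is a sketch using SQLite through Python's sqlite3 module. The table name t and the column names are assumptions (the real Trac query is a union over several tables), and the limit is passed as a parameter so that both the 1-timestamp and the 3-timestamp case can be tried.

```python
import sqlite3

# Sample rows from the question; table and column names are assumed.
ROWS = [
    (1371806593507544, "a", "owner", "b", "c", 1),
    (1371806593507544, "a", "comment", "2", "lipsum", 1),
    (1371806593507544, "a", "description", "foo", "bar", 1),
    (1371806593324529, "b", "comment", "hello", "world", 1),
    (1371806593125677, "c", "priority", "minor", "major", 1),
    (1371806592492812, "d", "comment", "x", "y", 1),
]

QUERY = """
SELECT t.*
FROM t
JOIN (
    SELECT DISTINCT time
    FROM t
    ORDER BY time DESC
    LIMIT ?
) AS tt ON tt.time = t.time
ORDER BY t.time DESC
"""


def latest_changesets(n):
    """Return every row belonging to the n most recent distinct timestamps."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (time INTEGER, author TEXT, field TEXT, "
        "oldvalue TEXT, newvalue TEXT, permanent INTEGER)"
    )
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY, (n,)).fetchall()
    conn.close()
    return result


print(len(latest_changesets(1)))  # 3 rows, all sharing the newest timestamp
print(len(latest_changesets(3)))  # 5 rows across the three newest timestamps
```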
I have an SQL query giving me X results. I want the query output to have a column called
count, making the output something like this:
count id section
1 15 7
2 3 2
3 54 1
4 7 4
How can I make this happen?
So in your example, "count" is the derived sequence number? I don't see what pattern is used to determine the count must be 1 for id=15 and 2 for id=3.
count id section
1 15 7
2 3 2
3 54 1
4 7 4
If id contained unique values, and you order by id you could have this:
count id section
1 3 2
2 7 4
3 15 7
4 54 1
Looks to me like mikeY's DSum approach could work. Or you could use a different approach to a ranking query as Allen Browne described at this page
Edit: You could use DCount instead of DSum. I don't know how the speed would compare between the two, but DCount avoids creating a field in the table simply to store a 1 for each row.
DCount("*","YourTableName","id<=" & [id]) AS counter
Whether you go with DCount or DSum, the counter values can include duplicates if the id values are not unique. If id is a primary key, no worries.
I frankly don't understand what it is you want, but if all you want is a sequence number displayed on your form, you can use a control bound to the form's CurrentRecord property. A control with the ControlSource =CurrentRecord will have an always-accurate "record number" that is in sequence, and that will update when the form's Recordsource changes (which may or may not be desirable).
You can then use that number to navigate around the form, if you like.
But this may not be anything like what you're looking for -- I simply can't tell from the question you've posted and the "clarifications" in comments.
The only trick I have seen is if you have a sequential id field, you can create a new field in which the value for each record is 1. Then you do a running sum of that field.
Add to your query
DSum("[New field with 1 in it]","[Table Name]","[ID field]<=" & [ID Field])
as counterthing
That should produce a sequential count in Access which is what I think you want.
HTH.
(Stolen from Rob Mills here:
http://www.access-programmers.co.uk/forums/showthread.php?p=160386)
Alright, I guess this comes close enough to constitute an answer: the following link specifies two approaches: http://www.techrepublic.com/blog/microsoft-office/an-access-query-that-returns-every-nth-record/
The first approach assumes that you have an ID value and uses DCount (similar to @mikeY's solution).
The second approach assumes you're OK creating a VBA function that will run once for EACH record in the recordset, and will need to be manually reset (with some VBA) every time you want to run the count - because it uses a "static" value to run its counter.
As long as you have a reasonable number of records (hundreds, not thousands), the second approach looks like the easiest/most powerful to me.
This function can be called for each record, provided it is defined in a module.
Example: incrementingCounterTimeFlaged(10,[anyField]) should give your query rows an int incrementing from 0.
'provides incrementing int values 0 to n
'resets to 0 some seconds after first call
Function incrementingCounterTimeFlaged(resetAfterSeconds As Integer, anyField As Variant) As Integer
    Static resetAt As Date
    Static i As Integer
    'if reset date < now() set the flag and return 0
    If DateDiff("s", resetAt, Now()) > 0 Then
        resetAt = DateAdd("s", resetAfterSeconds, Now())
        i = 0
        incrementingCounterTimeFlaged = i
    'if reset date > now increments and returns
    Else
        i = i + 1
        incrementingCounterTimeFlaged = i
    End If
End Function
autoincrement in SQL
SELECT (Select COUNT(*) FROM table A where A.id<=b.id),B.id,B.Section FROM table AS B ORDER BY B.ID Asc
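The correlated count above can be tried out with the sample data from the question. This is a sketch in SQLite through Python's sqlite3 module, not Access; the table name items is an assumption (the word table itself is reserved). The second call adds a made-up duplicate id to show the caveat that non-unique ids produce repeated counter values.

```python
import sqlite3

QUERY = """
SELECT (SELECT COUNT(*) FROM items AS a WHERE a.id <= b.id) AS counter,
       b.id,
       b.section
FROM items AS b
ORDER BY b.id ASC
"""


def numbered(rows):
    """Number rows by counting how many ids are <= the current id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, section INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# Sample data from the question: unique ids give a gap-free sequence.
unique_ids = [(15, 7), (3, 2), (54, 1), (7, 4)]
print(numbered(unique_ids))  # [(1, 3, 2), (2, 7, 4), (3, 15, 7), (4, 54, 1)]

# Invented duplicate id 7: both rows get the same counter and 2 is skipped.
with_duplicate = [(3, 2), (7, 4), (7, 9), (15, 7)]
print([row[0] for row in numbered(with_duplicate)])  # [1, 3, 3, 4]
```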
You can use ROW_NUMBER(), which is available in SQL Server 2005 and later:
SELECT ROW_NUMBER() OVER (ORDER BY ID DESC) RowNum,
    ID,
    Section
FROM myTable
RowNum then holds the sequence of row numbers.
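ROW_NUMBER() is not specific to SQL Server. As a sketch, the same query shape runs in SQLite 3.25 or later, shown here through Python's sqlite3 module with the sample data from the question. The table name myTable follows the answer; the final ORDER BY is added only so the printed order is deterministic.

```python
import sqlite3

QUERY = """
SELECT ROW_NUMBER() OVER (ORDER BY id DESC) AS row_num,
       id,
       section
FROM myTable
ORDER BY row_num
"""


def with_row_numbers(rows):
    """Attach a sequence number to each row, highest id first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (id INTEGER, section INTEGER)")
    conn.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


sample = [(15, 7), (3, 2), (54, 1), (7, 4)]
print(with_row_numbers(sample))
# [(1, 54, 1), (2, 15, 7), (3, 7, 4), (4, 3, 2)]
```

Unlike the correlated count, ROW_NUMBER() never repeats a value, even when ids are duplicated.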
I'm using Access via OleDb. I have a table with columns ID, GroupID, Time and Place. An application inserts new records into the table, but unfortunately the Place isn't calculated correctly.
I want to update each record in a group with its correct place, based on ascending time within the group.
So assume the following data:
ID GroupId Time Place
Chuck 1 10:01 2
Alice 1 09:01 3
Bob 1 09:31 1
should result in:
ID GroupId Time Place
Chuck 1 10:01 3
Alice 1 09:01 1
Bob 1 09:31 2
I could come up with a solution using a cursor but that's AFAIK not possible in Access.
I just did a search on performing "ranking in Access" and I got this support.microsoft result.
It seems you create a query with a field that has the following expression:
Place: (Select Count(*) from table1 Where [Time] < [table1alias].[Time]) + 1
I can't test this, so I hope it works.
Using this you may be able to do (where queryAbove is the above query):
UPDATE table1
SET [Place] = queryAbove.[Place]
FROM queryAbove
WHERE table1.ID = queryAbove.ID
It's a long shot but please give it a go.
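The counting idea can also be applied in a single UPDATE with a correlated subquery. The sketch below does this in SQLite through Python's sqlite3 module, so it illustrates the logic rather than Access syntax, which may need a different form. It adds a GroupId condition that the expression above does not have, so that places restart in each group. The table name results is an assumption, the times are zero-padded text so that text comparison matches time order, and the row for Dave in group 2 is invented to show the per-group behaviour.

```python
import sqlite3

# Rows from the question plus one invented row (Dave) in a second group.
ROWS = [
    ("Chuck", 1, "10:01", 2),
    ("Alice", 1, "09:01", 3),
    ("Bob", 1, "09:31", 1),
    ("Dave", 2, "08:15", 9),
]

UPDATE = """
UPDATE results
SET place = (
    -- rows in the same group with an earlier time, plus one
    SELECT COUNT(*)
    FROM results AS r2
    WHERE r2.group_id = results.group_id
      AND r2.time < results.time
) + 1
"""


def recompute_places(rows):
    """Recalculate place per group from ascending time; return {id: place}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (id TEXT, group_id INTEGER, time TEXT, place INTEGER)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)
    conn.execute(UPDATE)
    places = dict(conn.execute("SELECT id, place FROM results").fetchall())
    conn.close()
    return places


print(recompute_places(ROWS))
```

Two records with identical times in the same group would receive the same place with this approach.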
I don't think Time is a numeric or time-typed column; unfortunately it is a text string containing the digits and delimiters of the time format. That is why sorting on the Time column does not work. Removing the delimiters ":" and ",", casting to an integer, and then sorting numerically could do the job.
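A small sketch of that idea, written in Python only to show the effect: the function name time_key and the sample values are made up, and it assumes every value has the same structure (for example, minutes always written with two digits), since otherwise the resulting integers are not comparable.

```python
def time_key(text):
    """Strip the ':' and ',' delimiters and return the digits as an integer."""
    digits = text.replace(":", "").replace(",", "")
    return int(digits)


times = ["10:01", "9:01", "9:31"]

# Plain text sorting compares character by character, so "10:01" sorts first.
print(sorted(times))                # ['10:01', '9:01', '9:31']

# Sorting on the integer key gives the chronological order.
print(sorted(times, key=time_key))  # ['9:01', '9:31', '10:01']
```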