Is there a way to get the first record of each day, taking the time into account?
Example:
get the first record today, the first record yesterday, the first record the day before yesterday, ...
Note: I want all of these records, taking the time into account.
The expected output should look like:
first_record_today,
first_record_yesterday, ...
As I understand the question, the "first" record per day is the earliest one.
For that, we can use RANK and partition by the day only, truncating the time.
In the ORDER BY clause of the window, we sort by the time:
SELECT sub.yourdate FROM (
SELECT yourdate,
RANK() OVER
(PARTITION BY DATE_TRUNC('DAY',yourdate)
ORDER BY DATE_TRUNC('SECOND',yourdate)) rk
FROM yourtable
) AS sub
WHERE sub.rk = 1
ORDER BY sub.yourdate DESC;
In the main query, we will sort the data beginning with the latest date, meaning today's one, if available.
We can try out here: db<>fiddle
If this understanding of the question is incorrect, please let us know what to change by editing your question.
A note: according to your description, a window function is not strictly necessary. A shorter GROUP BY, as shown in the other answer, can produce the correct result too and might be perfectly fine. I chose the window function approach because it makes it easy to add or change conditions that a simple GROUP BY might not be able to express.
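To compare the two approaches from the note, here is a minimal sketch that runs both against an in-memory SQLite database from Python. The sample timestamps are made up, and because SQLite has no DATE_TRUNC, date(yourdate) is used as a stand-in for the day truncation; this is an assumption of the sketch, not part of the original query.

```python
import sqlite3

# Hypothetical sample timestamps; table and column names follow the answer.
rows = [
    ("2024-05-03 08:15:00",),
    ("2024-05-03 17:40:00",),
    ("2024-05-02 09:00:00",),
    ("2024-05-02 09:30:00",),
    ("2024-05-01 23:59:59",),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (yourdate TEXT)")
conn.executemany("INSERT INTO yourtable VALUES (?)", rows)

# Window-function form: rank the rows of each calendar day by timestamp.
# date(yourdate) replaces DATE_TRUNC('DAY', yourdate), which SQLite lacks.
window_sql = """
SELECT sub.yourdate FROM (
    SELECT yourdate,
           RANK() OVER (PARTITION BY date(yourdate) ORDER BY yourdate) AS rk
    FROM yourtable
) AS sub
WHERE sub.rk = 1
ORDER BY sub.yourdate DESC
"""

# GROUP BY form: the earliest timestamp of each calendar day.
group_sql = """
SELECT MIN(yourdate) AS first_ts
FROM yourtable
GROUP BY date(yourdate)
ORDER BY first_ts DESC
"""

window_result = [r[0] for r in conn.execute(window_sql)]
group_result = [r[0] for r in conn.execute(group_sql)]
print(window_result)
print(group_result)
```

On this data both queries list one timestamp per day, newest day first. They can differ once two rows share the earliest timestamp of a day: RANK then returns both rows, while MIN returns the value once.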
EDIT because the question's author provided further information:
Here is the query that also fetches the first message:
SELECT sub.yourdate, sub.message FROM (
SELECT yourdate, message,
RANK() OVER (PARTITION BY DATE_TRUNC('DAY',yourdate)
ORDER BY DATE_TRUNC('SECOND',yourdate)) rk
FROM yourtable
) AS sub
WHERE sub.rk = 1
ORDER BY sub.yourdate DESC;
Or if only the message without the date should be selected:
SELECT sub.message FROM (
SELECT yourdate, message,
RANK() OVER (PARTITION BY DATE_TRUNC('DAY',yourdate)
ORDER BY DATE_TRUNC('SECOND',yourdate)) rk
FROM yourtable
) AS sub
WHERE sub.rk = 1
ORDER BY sub.yourdate DESC;
Updated fiddle here: db<>fiddle
Related
I have a table 'exam_table' containing : User_ID, Exam_date, Exam_status.
Exam_status = ['Success' or 'Fail']
The question is :
Based on the above data, propose an SQL
query to find the 5 candidates with the most failures. In case
of equality, we wish to obtain first the students whose date of first exam is the most distant in time.
I found the 5 candidates with the most failures but I still don't know how to sort them according to exam_date in case of equality.
Do you have any suggestions? Thank you in advance for helping !
ORDER BY is a clause that takes a comma-separated list of ordering criteria, so you can simply add another criterion, like below:
SELECT User_ID, count(exam_status) as nb_Failures
FROM exam_table
WHERE exam_status = 'Fail'
GROUP BY User_ID
ORDER BY nb_Failures DESC, min(exam_date)
LIMIT 5;
UPDATED:
corrected by the date of the first exam:
SELECT
user_id,
MIN(exam_date) AS first_exam_date,
SUM(
CASE exam_status
WHEN 'Fail' THEN 1
ELSE 0
END
) AS nb_failures
FROM exam_table
GROUP BY user_id
ORDER BY nb_failures DESC, first_exam_date ASC
LIMIT 5;
or like this:
SELECT
user_id,
MIN(exam_date) AS first_exam_date,
COUNT(exam_status) AS nb_failures
FROM exam_table
WHERE exam_status = 'Fail'
GROUP BY user_id
ORDER BY nb_failures DESC, first_exam_date ASC
LIMIT 5;
PS: because exam_date is not in the GROUP BY, an aggregate function must be applied to the date as well.
PPS: the first and second queries give different results. The first selects the date of the user's first exam, whether or not it was successful. The second selects only the date of the first failed exam.
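The difference described in the PPS can be seen on a small made-up data set. The sketch below uses SQLite from Python and assumes the status values 'Success' and 'Fail' listed in the question; the users and dates are hypothetical.

```python
import sqlite3

# Hypothetical exams. User 1 passes first, then fails twice.
exams = [
    (1, "2020-01-01", "Success"),
    (1, "2020-02-10", "Fail"),
    (1, "2020-03-10", "Fail"),
    (2, "2020-01-05", "Fail"),
    (2, "2020-02-05", "Fail"),
    (3, "2020-01-20", "Fail"),
    (4, "2020-01-02", "Success"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam_table (user_id INTEGER, exam_date TEXT, exam_status TEXT)")
conn.executemany("INSERT INTO exam_table VALUES (?, ?, ?)", exams)

# Variant 1: conditional sum, so MIN(exam_date) sees every exam of the user.
case_sql = """
SELECT user_id,
       MIN(exam_date) AS first_exam_date,
       SUM(CASE exam_status WHEN 'Fail' THEN 1 ELSE 0 END) AS nb_failures
FROM exam_table
GROUP BY user_id
ORDER BY nb_failures DESC, first_exam_date ASC
LIMIT 5
"""

# Variant 2: filter first, so MIN(exam_date) only sees failed exams.
where_sql = """
SELECT user_id,
       MIN(exam_date) AS first_exam_date,
       COUNT(exam_status) AS nb_failures
FROM exam_table
WHERE exam_status = 'Fail'
GROUP BY user_id
ORDER BY nb_failures DESC, first_exam_date ASC
LIMIT 5
"""

case_result = conn.execute(case_sql).fetchall()
where_result = conn.execute(where_sql).fetchall()
print(case_result)
print(where_result)
```

With this data, users 1 and 2 are tied at two failures. Variant 1 puts user 1 first (first exam on 2020-01-01) and also lists user 4 with zero failures; variant 2 puts user 2 first (first failure on 2020-01-05) and drops user 4 entirely.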
I have been trying to write a query for this case but can't seem to get it right, because I am still receiving duplicates. I am hoping to get help on how to fix this issue.
SELECT DISTINCT
s.Client,
s.ID,
s.Thing,
s.Status,
MIN(s.StatusDate) as statdate
FROM
SAMPLE s
WHERE
[]
GROUP BY
s.Client,
s.ID,
s.Thing,
s.Status
My output is as follows
Client Id Thing Status Statdate
CompanyA 123 Thing1 Approved 12/9/2019
CompanyA 123 Thing1 Denied 12/6/2019
So although the query does what I asked and shows the minimum status date per status, I want only the first status date. I have about 30k rows to filter through, so I need something that does not overload the query and keep it from running. Any help would be appreciated.
Use window functions:
SELECT s.*
FROM (SELECT s.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY StatusDate) as seqnum
FROM SAMPLE s
WHERE []
) s
WHERE seqnum = 1;
This returns the first row for each id.
Use whichever of these you feel more comfortable with/understand:
SELECT
*
FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY statusdate) as rn
FROM sample
WHERE ...
) x
WHERE rn = 1
The way that one works is to number all rows sequentially in order of StatusDate, restarting the numbering from 1 every time ID changes. If you then collect all the rows numbered 1, you have your set of "first records".
Or you can join against a MIN:
SELECT
*
FROM
sample s
INNER JOIN
(SELECT ID, MIN(statusDate) as minDate FROM sample WHERE ... GROUP BY ID) mins
ON s.ID = mins.ID and s.StatusDate = mins.MinDate
WHERE
...
This one prepares a list of every ID with its minimum date, then joins it back to the main table, which brings back all the data that was lost during the grouping operation. You cannot simultaneously "keep data" and "throw away data" during a group: if you group by more than just ID, you get more groups (as you have found), and if you group only by ID, you lose the other columns. There isn't any way to say "GROUP BY id, AND take the MIN date, AND also take all the other data from the same row as the min date" without doing a "group by id, take min date, then join this data set back to the main dataset to get the other data for that min date". If you try to do it all in a single grouping you'll fail, because you either have to group by more columns or use aggregate functions for the other data in the SELECT, which mixes your data up; once the groups are formed, the concept of "other data from the same row" is gone.
Be aware that this can return duplicate rows if two records have identical min dates. The ROW_NUMBER form doesn't return duplicate records, but if two records have the same minimum StatusDate, which one you get is arbitrary. To force a specific one, add more columns to the ORDER BY so you can be sure which row ends up numbered 1.
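The tie behaviour of the two forms can be checked with a small sketch using SQLite from Python. The CompanyB rows are invented to create a tie, dates are written in ISO format so they sort correctly as text, and Status is used as the extra ORDER BY column purely for illustration.

```python
import sqlite3

# Hypothetical rows; ID 456 has two records sharing the earliest StatusDate.
rows = [
    ("CompanyA", 123, "Thing1", "Approved", "2019-12-09"),
    ("CompanyA", 123, "Thing1", "Denied", "2019-12-06"),
    ("CompanyB", 456, "Thing2", "Pending", "2019-12-01"),
    ("CompanyB", 456, "Thing2", "Review", "2019-12-01"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sample (Client TEXT, ID INTEGER, Thing TEXT, Status TEXT, StatusDate TEXT)"
)
conn.executemany("INSERT INTO sample VALUES (?, ?, ?, ?, ?)", rows)

# ROW_NUMBER form: exactly one row per ID. Adding Status to the ORDER BY
# makes the choice between tied rows deterministic.
row_number_sql = """
SELECT ID, Status, StatusDate FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY StatusDate, Status) AS rn
    FROM sample
) x
WHERE rn = 1
ORDER BY ID
"""

# MIN-join form: every row whose StatusDate equals the minimum for its ID,
# so tied rows all come back.
min_join_sql = """
SELECT s.ID, s.Status, s.StatusDate
FROM sample s
INNER JOIN (SELECT ID, MIN(StatusDate) AS minDate FROM sample GROUP BY ID) mins
    ON s.ID = mins.ID AND s.StatusDate = mins.minDate
ORDER BY s.ID, s.Status
"""

row_number_result = conn.execute(row_number_sql).fetchall()
min_join_result = conn.execute(min_join_sql).fetchall()
print(row_number_result)
print(min_join_result)
```

The ROW_NUMBER query yields one row for each of the two IDs, while the MIN join yields three rows because both tied records for ID 456 match.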
I have the following problem with my data on a DB2 database. I want to create an overview when a machine was used for a project with a begin and end date.
The following data is available:
||Machine name||Description||Project||Start date|
|Mach1|DB2_AIX|Team_1_PERS|TEST_1|07-03-2017|
|Mach1|DB2_AIX|Team_1_PERS|TEST_1|16-03-2017|
|Mach1|DB2_AIX|Team_1_PERS|TEST_1|24-04-2017|
|Mach1|DB2_AIX|Team_1_PERS|TEST_1|07-05-2017|
|Mach1|DB2_AIX|Team_1_PERS|TEST_2|13-05-2016|
|Mach1|DB2_AIX|Team_1_PERS|TEST_1|22-05-2017|
|Mach1|DB2_AIX|Team_1_PERS|TEST_1|12-06-2017|
The result that I'm looking for is:
|Machine name||Description||Project||Start date||Last date|
|Mach1|DB2_AIX|Team_1_PERS|TEST_1|07-03-2017|07-05-2017|
|Mach1|DB2_AIX|Team_1_PERS|TEST_2|13-05-2016|13-05-2017|
|Mach1|DB2_AIX|Team_1_PERS|TEST_1|22-05-2017|12-06-2017|
Does anybody have an idea how to create this result with a statement?
This is a classic gaps-and-islands problem, and the standard solutions will work just fine:
WITH Grouped_Run AS (SELECT name, description, project, test, executedOn,
ROW_NUMBER() OVER(ORDER BY executedOn) -
ROW_NUMBER() OVER(PARTITION BY name, description, project, test ORDER BY executedOn) AS groupingId
FROM Machine)
SELECT name, description, project, test,
MIN(executedOn) as testStart, MAX(executedOn) as testEnd
FROM Grouped_Run
GROUP BY name, description, project, test, groupingId
ORDER BY testStart
Fiddle example
It's a little unclear whether the group is going to be the whole row, but that's adjustable.
This will produce the results you're looking for.
Note that depending on what specific version you're on, there may be other/faster ways to achieve these results.
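To make the ROW_NUMBER difference trick concrete, here is a minimal sketch using SQLite from Python rather than DB2. It keeps only two of the columns, rewrites the dates in ISO format, and places the TEST_2 row on 2017-05-13 so that it falls between the two TEST_1 runs; all of these are assumptions of the sketch.

```python
import sqlite3

# Simplified, hypothetical version of the question's data.
runs = [
    ("Mach1", "TEST_1", "2017-03-07"),
    ("Mach1", "TEST_1", "2017-03-16"),
    ("Mach1", "TEST_1", "2017-04-24"),
    ("Mach1", "TEST_1", "2017-05-07"),
    ("Mach1", "TEST_2", "2017-05-13"),
    ("Mach1", "TEST_1", "2017-05-22"),
    ("Mach1", "TEST_1", "2017-06-12"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Machine (name TEXT, test TEXT, executedOn TEXT)")
conn.executemany("INSERT INTO Machine VALUES (?, ?, ?)", runs)

# The overall row number minus the per-test row number stays constant
# for as long as the same test keeps running without interruption,
# so it can serve as an id for each consecutive run ("island").
islands_sql = """
WITH Grouped_Run AS (
    SELECT name, test, executedOn,
           ROW_NUMBER() OVER (ORDER BY executedOn)
           - ROW_NUMBER() OVER (PARTITION BY name, test ORDER BY executedOn) AS groupingId
    FROM Machine
)
SELECT test, MIN(executedOn) AS testStart, MAX(executedOn) AS testEnd
FROM Grouped_Run
GROUP BY name, test, groupingId
ORDER BY testStart
"""

islands = conn.execute(islands_sql).fetchall()
for island in islands:
    print(island)
```

Here the first four TEST_1 rows get groupingId 0, the TEST_2 row gets 4, and the last two TEST_1 rows get 1, which is why TEST_1 comes back as two separate date ranges.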
It seems like you're trying to get the first and last of "Start date". Write a GROUP BY query with MIN(Start date) and another with MAX(Start date) then union the results. You'll have to select DISTINCT or do another GROUP BY to eliminate the duplicates that will occur when there's only one date.
I only have basic SQL skills. I'm working in SQL in Navicat. I've looked through the threads of people who were also trying to get latest date, but not yet been able to apply it to my situation.
I am trying to get the latest date for each name, for each chemical. I think of it this way: "Within each chemical, look at data for each name, choose the most recent one."
I have tried using max(date(date)) but it needs to be nested or subqueried within chemical.
I also tried ranking by date(date) DESC, then using LIMIT 1. But I was not able to nest this within chemical either.
When I try to write it as a subquery, I keep getting an error on the ( . I've switched it up so that I am beginning the subquery a number of different ways, but the error returns near that area always.
Here is what the data looks like:
[image: sample of the data]
Here is one of my failed queries:
SELECT
WELL_NAME,
CHEMICAL,
RESULT,
APPROX_LAT,
APPROX_LONG,
DATE
FROM
data_all
ORDER BY
CHEMICAL ASC,
date( date ) DESC (
SELECT
WELL_NAME,
CHEMICAL,
APPROX_LAT,
APPROX_LONG,
DATE
FROM
data_all
WHERE
WELL_NAME = WELL_NAME
AND CHEMICAL = CHEMICAL
AND APPROX_LAT = APPROX_LAT
AND APPROX_LONG = APPROX_LONG,
LIMIT 2
)
If someone does have a response, it would be great if it is in as lay language as possible. I've only had one coding class. Thanks very much.
Maybe something like this?
SELECT WELL_NAME, CHEMICAL, MAX(DATE)
FROM data_all
GROUP BY WELL_NAME, CHEMICAL
If you want all information, then use the ANSI-standard ROW_NUMBER():
SELECT da.*
FROM (SELECT da.*,
ROW_NUMBER() OVER (PARTITION BY chemical, well_name ORDER BY date DESC) as seqnum
FROM data_all da
) da
WHERE seqnum = 1;
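In plain terms, the ROW_NUMBER query numbers the rows for each chemical and well from newest to oldest, then keeps only row number 1. A minimal sketch with SQLite from Python follows; the wells, chemicals, and values are invented, only some of the question's columns are included, and dates are assumed to be stored in ISO format.

```python
import sqlite3

# Hypothetical measurements: (WELL_NAME, CHEMICAL, RESULT, DATE).
measurements = [
    ("Well A", "Arsenic", 1.2, "2019-01-10"),
    ("Well A", "Arsenic", 1.5, "2020-06-01"),
    ("Well A", "Nitrate", 4.0, "2018-03-15"),
    ("Well B", "Arsenic", 0.7, "2017-09-30"),
    ("Well B", "Arsenic", 0.9, "2019-11-20"),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE data_all (WELL_NAME TEXT, CHEMICAL TEXT, RESULT REAL, "DATE" TEXT)')
conn.executemany("INSERT INTO data_all VALUES (?, ?, ?, ?)", measurements)

# Number the rows of each (chemical, well) pair from newest to oldest,
# then keep only the newest one (seqnum = 1).
latest_sql = """
SELECT WELL_NAME, CHEMICAL, RESULT, "DATE"
FROM (
    SELECT da.*,
           ROW_NUMBER() OVER (PARTITION BY CHEMICAL, WELL_NAME ORDER BY "DATE" DESC) AS seqnum
    FROM data_all da
) da
WHERE seqnum = 1
ORDER BY CHEMICAL, WELL_NAME
"""

latest = conn.execute(latest_sql).fetchall()
for row in latest:
    print(row)
```

Unlike the GROUP BY version, this keeps the RESULT value that belongs to the most recent date instead of only the date itself.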
I am trying to count the items in a database table of deployments. I have counted the total number of items (3879) by doing this:
use Bamboo_Version6
go
SELECT Count(STARTED_DATE)
FROM [Bamboo_Version6].[dbo].[DEPLOYMENT_RESULT]
But I have been struggling to get the number of items for each day, back to the start. I have tried adapting some answers to similar questions, like:
select STARTED_Date, count(deploymentID)
from [Bamboo_Version6].[dbo].[DEPLOYMENT_RESULT]
WHERE STARTED_Date>=dateadd(day,datediff(day,0,STARTED_Date)- 7,0)
GROUP BY STARTED_Date
But this returns every id with a 1 beside it, because the dates include times, which makes each one unique. So I tried CONVERT(varchar(12),STARTED_DATE,110) to fix the problem, but it still happens. How can I count per day without getting every id back with a count of 1?
Remove the time component:
select cast(STARTED_Date as date) as dte, count(deploymentID)
from [Bamboo_Version6].[dbo].[DEPLOYMENT_RESULT]
group by cast(STARTED_Date as date)
order by dte;
I'm not sure what the WHERE clause is supposed to be doing, so I just removed it. If it is useful, add it back in.
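The effect of removing the time component can be reproduced with a small sketch using SQLite from Python. SQLite's date() function stands in for SQL Server's CAST(... AS date) here, and the deployment rows are made up.

```python
import sqlite3

# Hypothetical deployments: (deploymentID, STARTED_DATE).
deployments = [
    (1, "2021-03-01 09:00:00"),
    (2, "2021-03-01 14:30:00"),
    (3, "2021-03-02 08:00:00"),
    (4, "2021-03-03 10:00:00"),
    (5, "2021-03-03 11:00:00"),
    (6, "2021-03-03 23:59:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DEPLOYMENT_RESULT (deploymentID INTEGER, STARTED_DATE TEXT)")
conn.executemany("INSERT INTO DEPLOYMENT_RESULT VALUES (?, ?)", deployments)

# Grouping by the full timestamp: every timestamp is unique, so each count is 1.
per_timestamp = conn.execute(
    "SELECT STARTED_DATE, COUNT(deploymentID) FROM DEPLOYMENT_RESULT GROUP BY STARTED_DATE"
).fetchall()

# Grouping by the date part only: one row per day with the real count.
per_day = conn.execute(
    """
    SELECT date(STARTED_DATE) AS dte, COUNT(deploymentID)
    FROM DEPLOYMENT_RESULT
    GROUP BY date(STARTED_DATE)
    ORDER BY dte
    """
).fetchall()

print(per_timestamp)
print(per_day)
```

The first result has six rows, each with a count of 1, which matches the symptom in the question; the second has one row per day.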
Another way of doing this is with an OVER clause. Leave the ORDER BY out of the window so the count covers the whole day rather than a running total, and use DISTINCT to get one row per day:
SELECT DISTINCT cast(STARTED_DATE AS DATE) AS Deployment_date,
COUNT(deploymentID) OVER ( PARTITION BY cast(STARTED_DATE AS DATE) ) AS NumberofDeployments
FROM [Bamboo_Version6].[dbo].[DEPLOYMENT_RESULT]