I have a table 'task' with three relevant fields: date_created, date_updated, and is_closed.
I have a simple query that counts the number of tasks created:
SELECT task.date_created, count(task.is_closed)
FROM task
GROUP BY task.date_created
ORDER BY task.date_created
What I'd like is to also get the number of tasks closed per day. For our purposes, a task was closed on its date_updated when is_closed = 'true'.
So, the final table should look like this:
date      opened  closed
04/01/13  8       6
04/02/13  9       5
I think you need to do this as two subqueries. Here is one approach, using full outer join:
select coalesce(a.date_created, c.date_updated) as thedate,
coalesce(a.opened, 0) as opened,
coalesce(c.closed, 0) as closed
from (select date_created, count(*) as opened
from task
group by date_created
) a full outer join
(select date_updated, count(*) as closed
from task
where is_closed = 1 -- or whatever the value is
group by date_updated
) c
on a.date_created = c.date_updated
The full outer join guarantees that all dates are present, even when you have only closes or opens.
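For engines that lack FULL OUTER JOIN (older SQLite builds, for example), here is a minimal sketch of a different technique that gives the same shape of result: stack the open and close events with UNION ALL and sum them per date. The sample rows and the is_closed = 1 encoding are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task (date_created TEXT, date_updated TEXT, is_closed INTEGER)"
)
# Assumed sample data: three tasks opened on 04-01, one on 04-02.
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?)",
    [
        ("2013-04-01", "2013-04-01", 1),
        ("2013-04-01", "2013-04-02", 1),
        ("2013-04-01", "2013-04-01", 0),  # still open, so never counted as closed
        ("2013-04-02", "2013-04-03", 1),
    ],
)

# Each task contributes one "opened" event and, if closed, one "closed" event.
daily_counts = conn.execute(
    """
    SELECT thedate, SUM(opened) AS opened, SUM(closed) AS closed
    FROM (
        SELECT date_created AS thedate, 1 AS opened, 0 AS closed FROM task
        UNION ALL
        SELECT date_updated, 0, 1 FROM task WHERE is_closed = 1
    ) AS events
    GROUP BY thedate
    ORDER BY thedate
    """
).fetchall()

for row in daily_counts:
    print(row)
```

Like the full outer join, this keeps dates that have only opens or only closes (2013-04-03 has closes but no opens in the sample data).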
You can use a CASE expression inside COUNT. Note that this counts closed tasks by the date they were created, not the date they were closed:
SELECT task.date_created,
count(1) opened,
count(case when is_closed = 'true' then 1 end) closed
FROM task
GROUP BY task.date_created
ORDER BY task.date_created
Given your comments, here is an approach using a couple of common table expressions (YourDateTable and YourDateField are placeholders for your own dates table):
WITH OPENED AS (
SELECT date_created, count(1) opened
FROM task
GROUP BY date_created
) ,
CLOSED AS (
SELECT date_updated, count(1) closed
FROM task
WHERE is_closed = 'true'
GROUP BY date_updated
)
SELECT D.YourDateField, O.opened, C.closed
FROM YourDateTable D
LEFT JOIN OPENED O ON D.YourDateField = O.date_created
LEFT JOIN CLOSED C ON D.YourDateField = C.date_updated
As Gordon points out, a FULL OUTER JOIN would also work. I just prefer using a dates table to seed from. Create the table once, and use it wherever you may need to.
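A minimal sketch of the dates-table idea, run against an in-memory SQLite database. The calendar table, its day column, and the sample rows are hypothetical; the COALESCE calls are my addition so that days with no activity show 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE task (date_created TEXT, date_updated TEXT, is_closed TEXT);
    INSERT INTO task VALUES
        ('2013-04-01', '2013-04-03', 'true'),
        ('2013-04-01', '2013-04-01', 'false'),
        ('2013-04-03', '2013-04-03', 'true');
    -- Hypothetical dates table: one row per calendar day.
    CREATE TABLE calendar (day TEXT PRIMARY KEY);
    INSERT INTO calendar VALUES ('2013-04-01'), ('2013-04-02'), ('2013-04-03');
    """
)

report = conn.execute(
    """
    WITH opened AS (
        SELECT date_created, COUNT(*) AS opened
        FROM task
        GROUP BY date_created
    ),
    closed AS (
        SELECT date_updated, COUNT(*) AS closed
        FROM task
        WHERE is_closed = 'true'
        GROUP BY date_updated
    )
    SELECT d.day, COALESCE(o.opened, 0), COALESCE(c.closed, 0)
    FROM calendar d
    LEFT JOIN opened o ON d.day = o.date_created
    LEFT JOIN closed c ON d.day = c.date_updated
    ORDER BY d.day
    """
).fetchall()

for row in report:
    print(row)
```

Seeding from the dates table also returns a row for 2013-04-02, a day with no opens and no closes, which a full outer join of the two subqueries would not produce.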
I'm trying to create a query that shows the items with the MAX date, but I don't know how.
Here is the script:
Select
results.severity As "Count_severity",
tasks.name As task,
results.host,
to_timestamp(results.date)::date
From
tasks Inner Join
results On results.task = tasks.id
Where
tasks.name Like '%CORP 0%' And
results.severity >= 7 And
results.qod > 70
I need to show only the row with the latest date for each task.
Can you help me?
You seem to be using Postgres (as suggested by the :: cast operator). If so, and if I follow you correctly, you can use DISTINCT ON:
select distinct on(t.name)
r.severity, t.name as task, r.host, to_timestamp(r.date::bigint)::date
from tasks t
inner join results r on r.task = t.id
where t.name like '%CORP 0%' and r.severity >= 7 and r.qod > 70
order by t.name, to_timestamp(r.date::bigint)::date desc
This guarantees one row per task; the ORDER BY clause controls which row is picked, so the query above returns the row with the latest date (ignoring the time portion). If several rows tie on that date, which one is returned is undefined. Adapt the ORDER BY clause if your exact requirement differs from what I understood.
If, on the other hand, you want to keep all tied rows, use a window function:
select *
from (
select r.severity, t.name as task, r.host, to_timestamp(r.date::bigint)::date,
rank() over(partition by t.name order by to_timestamp(r.date::bigint)::date desc) rn
from tasks t
inner join results r on r.task = t.id
where t.name like '%CORP 0%' and r.severity >= 7 and r.qod > 70
) t
where rn = 1
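To make the tie behaviour concrete, here is a small sketch that runs the same ranked subquery twice in SQLite (3.25 or later, which supports window functions), once with RANK() and once with ROW_NUMBER(). The simplified results table, the scan_date column, and the host names are assumptions for illustration, not the schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE results (task TEXT, host TEXT, scan_date TEXT);
    INSERT INTO results VALUES
        ('CORP 01', 'host-a', '2020-01-01'),
        ('CORP 01', 'host-b', '2020-01-05'),
        ('CORP 01', 'host-c', '2020-01-05'),  -- ties with host-b on the latest date
        ('CORP 02', 'host-d', '2020-01-03');
    """
)

# {fn} is replaced by a fixed function name below, never by user input.
query = """
    SELECT task, host FROM (
        SELECT task, host,
               {fn}() OVER (PARTITION BY task ORDER BY scan_date DESC) AS rn
        FROM results
    ) AS ranked
    WHERE rn = 1
    ORDER BY task, host
"""

# RANK() gives every row on the latest date rank 1, so ties are kept.
with_ties = conn.execute(query.format(fn="RANK")).fetchall()
# ROW_NUMBER() numbers tied rows 1, 2, ... so exactly one row per task survives.
one_per_task = conn.execute(query.format(fn="ROW_NUMBER")).fetchall()

print(with_ties)
print(one_per_task)
```

With ROW_NUMBER(), which of the two tied hosts is kept for 'CORP 01' is not defined unless you add a tie-breaker to the ORDER BY inside OVER.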
This question already has answers here:
Fetch the rows which have the Max value for a column for each distinct value of another column
(35 answers)
GROUP BY with MAX(DATE) [duplicate]
(6 answers)
Oracle SQL query: Retrieve latest values per group based on time [duplicate]
(2 answers)
Closed 5 years ago.
It's been marked as a duplicate and seems to be explained a bit in the linked questions, but I'm still trying to get the separate DEBIT and CREDIT columns on the same row.
I've created a View and I am currently self joining it. I'm trying to get the max Header_ID for each date.
My SQL is currently:
SELECT DISTINCT
TAB1.id,
TAB1.glperiods_id,
MAX(TAB2.HEADER_ID),
TAB1.batch_date,
TAB1.debit,
TAB2.credit,
TAB1.descrip
FROM
IQMS.V_TEST_GLBATCH_GJ TAB1
LEFT OUTER JOIN
IQMS.V_TEST_GLBATCH_GJ TAB2
ON
TAB1.ID = TAB2.ID AND TAB1.BATCH_DATE = TAB2.BATCH_DATE AND TAB1.GLPERIODS_ID = TAB2.GLPERIODS_ID AND TAB1.DESCRIP = TAB2.DESCRIP AND TAB1.DEBIT <> TAB2.CREDIT
WHERE
TAB1.ACCT = '3648-00-0'
AND
TAB1.DESCRIP NOT LIKE '%INV%'
AND TAB1.DEBIT IS NOT NULL
GROUP BY
TAB1.id,
TAB1.glperiods_id,
TAB1.batch_date,
TAB1.debit,
TAB2.credit,
TAB1.descrip
ORDER BY TAB1.batch_date
And the output for this is (37 rows in total):
I'm joining the table onto itself to get DEBIT and CREDIT on the same line. How do I select only the rows with the max HEADER_ID per BATCH_DATE ?
Update
For @sagi
Those highlighted with the red box are the rows I want and the ones in blue would be the ones I'm filtering out.
Fixed mistake
I recently noticed I had joined my table onto itself without making sure TAB2 ACCT='3648-00-0'.
The corrected SQL is here:
SELECT DISTINCT
TAB1.id,
TAB1.glperiods_id,
Tab1.HEADER_ID,
TAB1.batch_date,
TAB1.debit,
TAB2.credit,
TAB1.descrip
FROM
IQMS.V_TEST_GLBATCH_GJ TAB1
LEFT OUTER JOIN
IQMS.V_TEST_GLBATCH_GJ TAB2
ON
TAB1.ID = TAB2.ID AND TAB1.BATCH_DATE = TAB2.BATCH_DATE AND TAB2.ACCT = '3648-00-0' AND TAB1.GLPERIODS_ID = TAB2.GLPERIODS_ID AND TAB1.DESCRIP = TAB2.DESCRIP AND TAB1.DEBIT <> TAB2.CREDIT
WHERE
TAB1.ACCT = '3648-00-0'
AND
TAB1.DESCRIP NOT LIKE '%INV%'
AND TAB1.DEBIT IS NOT NULL
ORDER BY TAB1.BATCH_DATE
Use a window function such as ROW_NUMBER():
SELECT s.* FROM (
SELECT t.*,
ROW_NUMBER() OVER(PARTITION BY t.batch_date ORDER BY t.header_id DESC) as rnk
FROM YourTable t
WHERE t.ACCT = '3648-00-0'
AND t.DESCRIP NOT LIKE '%INV%'
AND t.DEBIT IS NOT NULL) s
WHERE s.rnk = 1
ROW_NUMBER() is an analytic function that numbers your records according to the OVER clause:
PARTITION BY - defines the group
ORDER BY - defines the order within each group (the first row gets 1, the second gets 2, etc.)
It is usually more efficient than a self-join because it reads the table only once. (Your problem could have been solved in many ways.)
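Here is a small sketch of how the numbering works, partitioning by batch_date as the question asks. The gl table and its rows are made up for illustration; it needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE gl (batch_date TEXT, header_id INTEGER, debit REAL);
    INSERT INTO gl VALUES
        ('2016-01-04', 101, 50.0),
        ('2016-01-04', 107, 75.0),
        ('2016-01-05', 110, 20.0),
        ('2016-01-05', 108, 30.0),
        ('2016-01-05', 112, 10.0);
    """
)

# Numbering restarts at 1 for every batch_date; the highest header_id gets 1.
numbered = conn.execute(
    """
    SELECT batch_date, header_id,
           ROW_NUMBER() OVER (PARTITION BY batch_date
                              ORDER BY header_id DESC) AS rnk
    FROM gl
    ORDER BY batch_date, rnk
    """
).fetchall()

# Keeping rnk = 1 mirrors the outer WHERE s.rnk = 1 filter.
latest_per_date = [(d, h) for d, h, rnk in numbered if rnk == 1]

for row in numbered:
    print(row)
print(latest_per_date)
```

Printing the numbered rows before filtering is a handy way to check that the PARTITION BY and ORDER BY columns are the ones you intended.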
I need some help with the following query.
Select date, Source,count(*) as TOTALCOUNT
from Table A
where date = '2014-10-15' and Source = 'EMAIL'
group by date,source
There is no EMAIL source for that particular day, so the query returns no rows.
I want to get 0 in TOTALCOUNT even when EMAIL is absent for that day (it may be present the next day).
The result should look like this:
Date,Source,Totalcount
15/10/14,Email,0
I tried the ISNULL function, but it does not help because no rows are returned.
You could perform a join with a "constant" query. It's an ugly hack, but it should do the trick:
SELECT c.date, c.source, COALESCE(a.totalcount, 0) AS totalcount
FROM (SELECT '2014-10-15' AS date, 'EMAIL' AS source) c
LEFT JOIN (SELECT date, source, COUNT(*) AS totalcount
FROM a
GROUP BY date, source) a
ON a.date = c.date AND a.source = c.source
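A quick sketch of the constant-row join in SQLite, run once for a day without EMAIL rows and once for a day with them. The table contents are assumed sample data, and passing the date as a bound parameter is my own variation on the hard-coded literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (date TEXT, source TEXT);
    INSERT INTO a VALUES
        ('2014-10-15', 'WEB'),
        ('2014-10-16', 'EMAIL'),
        ('2014-10-16', 'EMAIL');
    """
)

# The one-row "constant" query c always exists, so the LEFT JOIN always
# returns a row; COALESCE turns the missing count into 0.
sql = """
    SELECT c.date, c.source, COALESCE(t.totalcount, 0) AS totalcount
    FROM (SELECT ? AS date, 'EMAIL' AS source) AS c
    LEFT JOIN (SELECT date, source, COUNT(*) AS totalcount
               FROM a
               GROUP BY date, source) AS t
      ON t.date = c.date AND t.source = c.source
"""

no_email_day = conn.execute(sql, ("2014-10-15",)).fetchall()
email_day = conn.execute(sql, ("2014-10-16",)).fetchall()

print(no_email_day)
print(email_day)
```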
I am writing a program for amateur radio. Some callsigns will appear more than once in the data but the qsodate will be different. I only want the first occurrence of a call sign after a given date.
The query
select distinct
a.callsign,
a.SKCC_Number,
a.qsodate,
b.name,
a.SPC,
a.Band
from qso a, skccdata b
where SKCC_Number like '%[CTS]%'
AND QSODate >= '2014-08-01'
and b.callsign = a.callsign
order by a.QSODate
The problem:
Because contacts occur on different dates, I get all of the contacts - I have tried adding min(a.qsodate) to get only the first but then I run into all sorts of issues regarding grouping.
This query will be in a stored procedure, so creating temp tables or cursors will not be a problem.
You can use ROW_NUMBER() to get the row with the earliest date for each callsign, like this:
WITH CTE
AS
(
select
a.callsign,
a.SKCC_Number,
a.qsodate,
b.name,
a.SPC,
a.Band,
ROW_NUMBER() OVER(PARTITION BY a.callsign ORDER BY a.QSODate) AS RN
from qso a
join skccdata b on b.callsign = a.callsign
where a.SKCC_Number like '%[CTS]%'
and a.QSODate >= '2014-08-01'
)
SELECT *
FROM CTE
WHERE RN = 1;
ROW_NUMBER() OVER(PARTITION BY a.callsign ORDER BY a.QSODate) numbers the rows within each callsign in QSODate order. WHERE RN = 1 then keeps only the first row of each group, which is the one with the minimum QSODate.
Have you tried starting your query with SELECT TOP 1 ...(fields)? Then you will get only one row. You can use TOP x for x rows, or TOP 50 PERCENT for the top half of the rows, and so on. In that case you can drop DISTINCT.
EDIT: I misunderstood the question. How about this?
select
a.callsign,
a.SKCC_Number,
a.qsodate,
(SELECT TOP 1 b.name FROM skccdata b WHERE b.callsign = a.callsign) as NAME,
a.SPC,
a.Band
from qso a
where SKCC_Number like '%[CTS]%'
AND QSODate >= '2014-08-01'
GROUP BY a.QSODate, a.callsign, a.SKCC_Number, a.SPC, a.Band
order by a.QSODate
Add a callsign condition to your WHERE clause to isolate individual callsigns.
I have a table "defects" in the following format:
id  status  stat_date  line  div  area
1   Open    09/21/09   F     A    cube
1   closed  01/01/10   F     A    cube
2   Open    10/23/09   B     C    Back
3   Open    11/08/09   S     B    Front
3   closed  12/12/09   S     B    Front
My problem is that I want to write a query that extracts only the "Open" defects. If I simply select every row with an open status, I get the wrong result, because some defects have two records associated with them. For example, with the query that I wrote I would get defect ids 1 and 3 in my result even though they are closed. I hope I have explained my problem well. Thank you.
Use:
SELECT t.*
FROM DEFECTS t
JOIN (SELECT d.id,
MAX(d.stat_date) AS msd
FROM DEFECTS d
GROUP BY d.id) x ON x.id = t.id
AND x.msd = t.stat_date
WHERE t.status != 'closed'
The subquery gets the most recent date for each id value.
Joining back to the original table on the id and date keeps only the most recent row for each id.
Filtering out the rows with the closed status leaves the defects that are currently open.
So you want to get the most recent row per id and of those, only select those that are open. This is a variation of the common greatest-n-per-group problem.
I would do it this way:
SELECT d1.*
FROM defects d1
LEFT OUTER JOIN defects d2
ON (d1.id = d2.id AND d1.stat_date < d2.stat_date)
WHERE d2.id IS NULL
AND d1.status = 'Open';
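As a quick check of the self-join above, here is a sketch that runs it in SQLite against the sample rows from the question. Two things are my assumptions: the dates are rewritten in ISO format (YYYY-MM-DD) so that plain text comparison orders them chronologically, and the line, div, and area columns are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE defects (id INTEGER, status TEXT, stat_date TEXT);
    INSERT INTO defects VALUES
        (1, 'Open',   '2009-09-21'),
        (1, 'closed', '2010-01-01'),
        (2, 'Open',   '2009-10-23'),
        (3, 'Open',   '2009-11-08'),
        (3, 'closed', '2009-12-12');
    """
)

# d2 matches any later row for the same defect. Where no such row exists
# (d2.id IS NULL), d1 is the most recent row for that id.
open_defects = conn.execute(
    """
    SELECT d1.id, d1.status, d1.stat_date
    FROM defects d1
    LEFT OUTER JOIN defects d2
      ON d1.id = d2.id AND d1.stat_date < d2.stat_date
    WHERE d2.id IS NULL
      AND d1.status = 'Open'
    """
).fetchall()

print(open_defects)
```

Only defect 2 comes back, since the latest rows for defects 1 and 3 are both closed.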
Select *
from defects d
where status = 'Open'
and not exists (
select 1 from defects d1
where d1.status = 'closed'
and d1.id = d.id
and d1.stat_date > d.stat_date
)
This should get what you want. I wouldn't keep separate records for opening and closing a defect; I would track each defect in a single record. But that may not be something you can change easily.
SELECT id FROM defects
WHERE status = 'Open' AND id NOT IN
(SELECT id FROM defects WHERE status = 'closed')
This query handles multiple opens/closes/opens, and only does one pass through the data (i.e. no self-joins):
SELECT * FROM
(SELECT DISTINCT
id
,FIRST_VALUE(status)
OVER (PARTITION BY id
ORDER BY stat_date desc)
as last_status
,FIRST_VALUE(stat_date)
over (PARTITION BY id
ORDER BY stat_date desc)
AS last_stat_date
,line
,div
,area
FROM defects) latest
WHERE last_status = 'Open';
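To see why handling reopened defects matters, here is a sketch comparing the FIRST_VALUE approach with the simpler NOT IN query on data where defect 1 is reopened. The extra reopen row, the ISO date format, and the reduced column list are assumptions for illustration; it needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE defects (id INTEGER, status TEXT, stat_date TEXT);
    INSERT INTO defects VALUES
        (1, 'Open',   '2009-09-21'),
        (1, 'closed', '2010-01-01'),
        (1, 'Open',   '2010-02-01'),  -- defect 1 reopened
        (2, 'Open',   '2009-10-23'),
        (3, 'Open',   '2009-11-08'),
        (3, 'closed', '2009-12-12');
    """
)

# Latest status per defect: the first value when ordered by date descending.
currently_open = conn.execute(
    """
    SELECT id FROM (
        SELECT DISTINCT id,
               FIRST_VALUE(status) OVER (PARTITION BY id
                                         ORDER BY stat_date DESC) AS last_status
        FROM defects
    ) AS latest
    WHERE last_status = 'Open'
    ORDER BY id
    """
).fetchall()

# For comparison: defects that were never closed at any point.
never_closed = conn.execute(
    """
    SELECT DISTINCT id FROM defects
    WHERE status = 'Open'
      AND id NOT IN (SELECT id FROM defects WHERE status = 'closed')
    ORDER BY id
    """
).fetchall()

print(currently_open)
print(never_closed)
```

The NOT IN query drops defect 1 because it was closed once, while the latest-status query correctly reports it as open again.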