Counting an already counted column in SQL (db2) - sql

I'm pretty new to SQL and have this problem:
I have a filled table with a date column and other not interesting columns.
date | name | name2
2015-03-20 | peter | pan
2015-03-20 | john | wick
2015-03-18 | harry | potter
What im doing right now is counting everything for a date
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
what i want to do now is counting the resulting lines and only returning them if there are less then 10 resulting lines.
What i tried so far is surrounding the whole query with a temp table and the counting everything which gives me the number of resulting lines (yeah)
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select count(*)
from temp_count
What is still missing the check if the number is smaller then 10.
I was searching in this Forum and came across some "having" structs to use, but that forced me to use a "group by", which i can't.
I was thinking about something like this :
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
)
select *
from temp_count
having count(*) < 10
maybe im too tired to think of an easy solution, but i can't solve this so far
Edit: A picture for clarification since my english is horrible
http://imgur.com/1O6zwoh
I want to see the 2 columned results ONLY IF there are less then 10 rows overall

I think you just need to move your having clause to the inner query so that it is paired with the GROUP BY:
with temp_count (date, counter) as
(
select date, count(*)
from testtable
where date >= current date - 10 days
group by date
having count(*) < 10
)
select *
from temp_count
If what you want is to know whether the total # of records (after grouping), are returned, then you could do this:
with temp_count (date, counter) as
(
select date, counter=count(*)
from testtable
where date >= current date - 10 days
group by date
)
select date, counter
from (
select date, counter, rseq=row_number() over (order by date)
from temp_count
) x
group by date, counter
having max(rseq) >= 10
This will return 0 rows if there are less than 10 total, and will deliver ALL the results if there are 10 or more (you can just get the first 10 rows if needed with this also).

In your temp_count table, you can filter results with the WHERE clause:
with temp_count (date, counter) as
(
select date, count(distinct date)
from testtable
where date >= current date - 10 days
group by date
)
select *
from temp_count
where counter < 10

Something like:
with t(dt, rn, cnt) as (
select dt, row_number() over (order by dt) as rn
, count(1) as cnt
from testtable
where dt >= current date - 10 days
group by dt
)
select dt, cnt
from t where 10 >= (select max(rn) from t);
will do what you want (I think)

Related

Selecting the difference between dates in a stored procedure using a subquery

I can't get my head around whether this is even possible, but I feel like I might have done it before and lost that bit of code. I am trying to craft a select statement that contains an inner join on a subquery to show the number of days between two dates from the same table.
A simple example of the data structure would look like:
Name ID Date Day Hours
Bill 1 3/3/20 Thursday 8
Fred 2 4/3/20 Monday 6
Bill 1 8/3/20 Tuesday 2
Based on this data, I want to select each row plus an extra column which is the number of days between the date from each row for each ID. Something like:
Select * from tblData
Inner join (datediff(Select Top(1) Date from tblData where Date < Date), Date) And ID = ID)
or for simplicity:
Select * from tblData
Inner join (datediff(Select Top(1) Date from tblData where Date < 8/3/20), 8/3/20) And ID = 1)
The resulting dataset would look like:
Name ID Date Day Hours DaysBtwn
Bill 1 3/3/20 Thursday 8 4 (Assuming there was an earlier row in the table)
Fred 2 4/3/20 Monday 6 5 (Assuming there was an earlier row in the table)
Bill 1 8/3/20 Tuesday 2 5 (Based on the previous row date being 3/3/20 for Bill)
Does this make sense and am I trying to do this the wrong way? I want to do this for about 600000 rows in table and therefore efficiency is the key, so if there is a better way to do this, i'm open to suggestions.
You can use lag():
select t.*, datediff(day, lag(date) over(partition by id order by date), date) diff
from mytable t
I think you just want lag():
select t.*,
datediff(day,
lag(date) over (partition by name order by date),
date
) as diff
from tblData t;
Note: If you want to filter the data so rows in the result set are used for the lag() but not in the result set, then use a subquery:
select t.*
from (select t.*,
datediff(day,
lag(date) over (partition by name order by date),
date
) as diff
from tblData t
) t
where date < '2020-08-03';
Also note the use of the date constant as a string in YYYY-MM-DD format.

SQL - Find the two closest date after a specific date

Dear Stack Overflow community,
I am looking for the patient id where the two consecutive dates after the very first one are less than 7 days.
So differences between 2nd and 1st date <= 7 days
and differences between 3rd and 2nd date <= 7 days
Example:
ID Date
1 9/8/2014
1 9/9/2014
1 9/10/2014
2 5/31/2014
2 7/20/2014
2 9/8/2014
For patient 1, the two dates following it are less than 7 days apart.
For patient 2 however, the following date are more than 7 days apart (50 days).
I am trying to write an SQL query that just output the patient id "1".
Thanks for your help :)
You want to use lead(), but this is complicated because you want this only for the first three rows. I think I would go for:
select t.*
from (select t.*,
lead(date, 1) over (partition by id order by date) as next_date,
lead(date, 2) over (partition by id order by date) as next_date_2,
row_number() over (partition by id order by date) as seqnum
from t
) t
where seqnum = 1 and
next_date <= date + interval '7' day and
next_date2 <= next_date + interval '7' day;
You can try using window function lag()
select * from
(
select id,date,lag(date) over(order by date) as prevdate
from tablename
)A where datediff(day,date,prevdate)<=7

How to query database for rows from next 5 days

How can I make a query in SQL Server to query for all rows for the next 5 days.
The problem is that it has to be days with records, so the next 5 days, might become something like, Today, Tomorrow, some day in next month, etc...
Basically I want to query the database for the records for the next non empty X days.
The table has a column called Date, which is what I want to filter.
Why not split the search into 2 queries. First one searches for the date part, the second uses that result to search for records IN the dates returned by the first query.
#Anagha is close, just a little modification and it is OK.
SELECT *
FROM TABLE
WHERE DATE IN (
SELECT DISTINCT TOP 5 DATE
FROM TABLE
WHERE DATE >= referenceDate
ORDER BY DATE
)
You can use following SQL query where 5 different dates are fetched at first then all rows for those selected dates are displayed
declare #n int = 5;
select *
from myData
where
datecol in (
SELECT distinct top (#n) cast(datecol as date) as datecol
FROM myData
WHERE datecol >= '20180101'
ORDER BY datecol
)
Try this:
select date from table where date in (select distinct top 5 date
from table where date >= getdate() order by date)
If your values are dates, you can use `dense_rank():
select t.*
from (select t.*, dense_rank() over (order by datecol) as seqnum
from t
where datecol >= cast(getdate() as date)
) t
where seqnum <= 5;
If the column has a time component and you still want to define days by midnight-to-midnight (as suggested by the question), just convert to date:
select t.*
from (select t.*,
dense_rank() over (order by cast(datetimecol as date)) as seqnum
from t
where datetimecol >= cast(getdate() as date)
) t
where seqnum <= 5;

SQL Count of date values that dont match hour and day in select statement

I have a problem unique to a business process. My user needs to know how many dates, counted, are before a specific end time that do not match on the hour or the day.
Here is an example.
AAA, 2016-03-15 16:00:28.967, 2016-03-15 16:02:58.487, 2016-03-17 14:01:24.243
In the example above id AAA has 3 entries. I need to count only the ones that don't have a matching hour and day. So the actual count should come out to be 2.
I have to do this all in SQL and can't use a CTE. It needs to be either a sub select or some type of join.
Something like this.
SELECT id, date, (
SELECT COUNT(*)
FROM x
WHERE day!=day
AND hour!=hour AND date < z
) AS DateCount
Results would be AAA, 2
I am thinking some type of recursive comparison but I am not sure how to accomplish this without a CTE.
In SQL Server you can try something like this:
SELECT id, CONVERT(VARCHAR(13), [date], 120) AS [Date], COUNT(*) AS DateCount
FROM YourTable
WHERE [date] < #ENDDATE
GROUP BY id, CONVERT(VARCHAR(13), [date], 120)
SELECT a AS current_a, COUNT(*) AS b,day AS day, hour as hour,
(SELECT COUNT(*)
FROM t
WHERE day != day
AND hour != hour
AND date < z ) as datecount
FROM t GROUP BY a ORDER by b DESC

Modulo Time in SQL Server 2005 - Return data every n hours

I have something like this:
SELECt *
FROM (
SELECT prodid, date, time, tmp, rowid
FROM live_pilot_plant
WHERE date BETWEEN CONVERT(DATETIME, '3/19/2012', 101)
AND CONVERT(DATETIME, '3/31/2012', 101)
) b
WHERE b.rowid % 400 = 0
FYI: The reason for the convert in the where clause, is because my date is stored as a varchar(10), I had to convert it to datetime in order to get the correct range of data. (I tried a bunch of different things and this worked)
I'm wondering how I can return the data I want every 4 hours during those selected dates. I have data collected approximately every 5 seconds (with some breaks in data) - ie data wasn't collected during a 2 hour period, but then continues at 5 second increments.
In my example I just used a modulo with my rowid - and the syntax works, but as I mentioned above there are some periods where data isnt collected so using logic like: if you take data every 5 seconds and multiple that by 4 hours you can approximately say how many rows are in between wont work.
My time column is a varchar column and is in the form hh:mm:ss
My ideal output is:
| prodid | date | time | tmp |
| 4 | 3/19/2012 | 10:00:00 | 2.3 |
| 7 | 3/19/2012 | 14:00:24 | 3.2 |
As you can see I can be a bit off (in terms of seconds) - I more so need the approximate value in terms of time.
Thank you in advance.
This should work
select prodid, date, time, tmp, rowid
from live_pilot_plant as lpp
inner join (
select min(prodid) as prodid -- is prodid your PK?? if not change it to rowid or whatelse is your PK
from live_pilot_plant
WHERE date BETWEEN CONVERT(DATETIME, '3/19/2012', 101) -- or whatever you want
AND CONVERT(DATETIME, '3/31/2012', 101) -- for better performance it is on the inner select
group by date,
floor( -- floor makes the trick
convert(float,convert(datetime, time)) -- assumes "time" column is a varchar containing data like '19:23:05'
* 6 -- 6 comes form 24 hours / 4 hours
)
) as filter on lpp.prodid = filter.prodid -- if prodid is not the PK also correct here.
A side note for everyone else who have date + time data in only one datetime field, suppose named "when_it_was", the group by can be as simple as:
group by floor(when_it_was * 6) -- again, 6 comes from 24/4
something along the lines of the following should work. Basically create date + time partitions, each partition representing a block of 4 hours and pick the record with the highest rank from each partition
select * from (
select *,
row_number() over (partition by date,cast(left( time, charindex( ':', time) - 1) as int) / 4 order by
date, time) as ranker from live_pilot_plant
) Z where ranker = 1
Assuming rowid is a PK and increased with date/time. Just convert time field to 4 hours interval number substring(time,1,2))/4 and select MIN(rowid) from each of 4 hours groups in a day:
select prodid, date, time, tmp, rowid from live_pilot_plant where rowid in
(
select min(rowid)
from live_pilot_plant
WHERE CONVERT(DATETIME, date, 101) BETWEEN CONVERT(DATETIME, '3/19/2012', 101)
AND CONVERT(DATETIME, '3/31/2012', 101)
group by date,convert(int,substring(time,1,2))/4
)
order by CONVERT(DATETIME, date, 101),time