Add one for every row that fulfills where criteria between period - sql

I have a Postgres table that I'm trying to analyze based on some date columns.
I'm basically trying to count the number of rows in my table that fulfill this requirement, and then group them by month and year. Instead of my query looking like this:
SELECT * FROM $TABLE WHERE date1::date <= '2012-05-31'
and date2::date > '2012-05-31';
it should be able to display this for the months available in my data so that I don't have to change the months manually every time I add new data, and so I can get everything with one query.
In the case above I'd like it to group the sum of rows which fit the criteria into the year 2012 and month 05. Similarly, if my WHERE clause looked like this:
date1::date <= '2012-06-31' and date2::date > '2012-06-31'
I'd like it to group this sum into the year 2012 and month 06.

This isn't entirely clear to me:
I'd like it to group the sum of rows
I'll interpret it this way: you want to list all rows "per month" matching the criteria:
WITH x AS (
SELECT date_trunc('month', min(date1)) AS start
,date_trunc('month', max(date2)) + interval '1 month' AS stop
FROM tbl
)
SELECT to_char(y.mon, 'YYYY-MM') AS mon, t.*
FROM (
SELECT generate_series(x.start, x.stop, '1 month') AS mon
FROM x
) y
LEFT JOIN tbl t ON t.date1::date <= y.mon
AND t.date2::date > y.mon -- why the explicit cast to date?
ORDER BY y.mon, t.date1, t.date2;
Assuming date2 >= date1.
Compute lower and upper border of time period and truncate to month (adding 1 to upper border to include the last row, too.
Use generate_series() to create the set of months in question
LEFT JOIN rows from your table with the declared criteria and sort by month.
You could also GROUP BY at this stage to calculate aggregates ..

Here is the reasoning. First, create a list of all possible dates. Then get the cumulative number of date1 up to a given date. Then get the cumulative number of date2 after the date and subtract the results. The following query does this using correlated subqueries (not my favorite construct, but handy in this case):
select thedate,
(select count(*) from t where date1::date <= d.thedate) -
(select count(*) from t where date2::date > d.thedate)
from (select distinct thedate
from ((select date1::date as thedate from t) union all
(select date2::date as thedate from t)
) d
) d
This is assuming that date2 occurs after date1. My model is start and stop dates of customers. If this isn't the case, the query might not work.

It sounds like you could benefit from the DATEPART T-SQL method. If I understand you correctly, you could do something like this:
SELECT DATEPART(year, date1) Year, DATEPART(month, date1) Month, SUM(value_col)
FROM $Table
-- WHERE CLAUSE ?
GROUP BY DATEPART(year, date1),
DATEPART(month, date1)

Related

PL-SQL query to calculate customers per period from start and stop dates

I have a PL-SQL table with a structure as shown in the example below:
I have customers (customer_number) with insurance cover start and stop dates (cover_start_date and cover_stop_date). I also have dates of accidents for those customers (accident_date). These customers may have more than one row in the table if they have had more than one accident. They may also have no accidents. And they may also have a blank entry for the cover stop date if their cover is ongoing. Sorry I did not design the data format, but I am stuck with it.
I am looking to calculate the number of accidents (num_accidents) and number of customers (num_customers) in a given time period (period_start), and from that the number of accidents-per-customer (which will be easy once I've got those two pieces of information).
Any ideas on how to design a PL-SQL function to do this in a simple way? Ideally with the time periods not being fixed to monthly (for example, weekly or fortnightly too)? Ideally I will end up with a table like this shown below:
Many thanks for any pointers...
You seem to need a list of dates. You can generate one in the query and then use correlated subqueries to calculate the columns you want:
select d.*,
(select count(distinct customer_id)
from t
where t.cover_start_date <= d.dte and
(t.cover_end_date > d.date + interval '1' month or t.cover_end_date is null)
) as num_customers,
(select count(*)
from t
where t.accident_date >= d.dte and
t.accident_date < d.date + interval '1' month
) as accidents,
(select count(distinct customer_id)
from t
where t.accident_date >= d.dte and
t.accident_date < d.date + interval '1' month
) as num_customers_with_accident
from (select date '2020-01-01' as dte from dual union all
select date '2020-02-01' as dte from dual union all
. . .
) d;
If you want to do arithmetic on the columns, you can use this as a subquery or CTE.

Appending the result query in bigquery

I am doing a query where the query will append the data from previous date as the outcome in BigQuery.
So, the result data for today will be higher than yesterdays as the data is appending by days.
So far, what I only managed to get the outcome is the data by days (where you can see the number of ID declining and is not appending from previous day) as this result:
What should I do to add appending function in the query so each day will get the result of data from the previous day in bigquery?
code:
WITH
table1 AS (
SELECT
ID,
...
FROM t
WHERE DATE_SUB('2020-01-31', INTERVAL 31 DAY) and '2020-01-31'
),
table2 AS (
SELECT
ID,
COUNTIF((rating < 7) as bad,
COUNTIF((rating >= 7 AND SAFE_CAST(NPS_Rating as INT64) < 9) as intermediate,
COUNTIF((rating as good
FROM
t
WHERE DATE_SUB('2020-01-31', INTERVAL 31 DAY) and '2020-01-31'
)
SELECT
DATE_SUB('2020-01-31', INTERVAL 31 DAY) as date,
*
FROM table1
FULL OUTER JOIN table2 USING (ID)
If you have counts that you want to accumulate, then you want a cumulative sum. The query would look something like this:
select datecol, count(*), sum(count(*)) over (order by datecol)
from t
group by datecol
order by datecol;

How can I get a user's activity count for today and this month in a single SELECT query

In my table I have:
Activity : Date
---------------
doSomething1 : June 1, 2020
doSomething2 : June 14, 2020
I want to be able to make a query so that I can get the following result (assuming today is June 1, 2020):
Today : ThisMonth
1 : 2
I looked at group by but I wasn't sure how to do that without a lot of additional code and I think there's very likely a much simpler solution that I'm missing. Something that will just return a single row with two results. Is this possible and if so how?
you can write subqueries to get data in single row,
Select today , month
from
(
( query to get today's count ) as today,
( query to get month's count ) as month
) t;
yes, u can do group by on dates to get todays nd months count.
Hope this will give u some perception to go on.
Is this what you want?
select array_agg(activity) filter (where date = current_date) as today,
array_agg(activity) filter (where date <> current_date) as rest_of_month
from t
where date_trunc('month', date) = current_date;
This uses arrays so it can handle more than one activity in either category.
Assume you want to query based on a particular date -
select count(case when d.date = :p_query_date then 0 end) day_count
,count(0) month_count
from d -- your table name
where d.date between date_trunc('month', :p_query_date)
and date_trunc('month', :p_query_date + interval '1 month') - interval '1 day'
The above query assumes you have index defined on d.date column. If you have index defined on date_trunc('month', date), the query condition can be simplified to:
date_trunc('month', d.date) = date_trunc('month', :p_query_date)

Return just the last day of each month with SQL

I have a table that contains multiple records for each day of the month, over a number of years. Can someone help me out in writing a query that will only return the last day of each month.
SQL Server (other DBMS will work the same or very similarly):
SELECT
*
FROM
YourTable
WHERE
DateField IN (
SELECT MAX(DateField)
FROM YourTable
GROUP BY MONTH(DateField), YEAR(DateField)
)
An index on DateField is helpful here.
PS: If your DateField contains time values, the above will give you the very last record of every month, not the last day's worth of records. In this case use a method to reduce a datetime to its date value before doing the comparison, for example this one.
The easiest way I could find to identify if a date field in the table is the end of the month, is simply adding one day and checking if that day is 1.
where DAY(DATEADD(day, 1, AsOfDate)) = 1
If you use that as your condition (assuming AsOfDate is the date field you are looking for), then it will only returns records where AsOfDate is the last day of the month.
Use the EOMONTH() function if it's available to you (E.g. SQL Server). It returns the last date in a month given a date.
select distinct
Date
from DateTable
Where Date = EOMONTH(Date)
Or, you can use some date math.
select distinct
Date
from DateTable
where Date = DATEADD(MONTH, DATEDIFF(MONTH, -1, Date)-1, -1)
In SQL Server, this is how I usually get to the last day of the month relative to an arbitrary point in time:
select dateadd(day,-day(dateadd(month,1,current_timestamp)) , dateadd(month,1,current_timestamp) )
In a nutshell:
From your reference point-in-time,
Add 1 month,
Then, from the resulting value, subtract its day-of-the-month in days.
Voila! You've the the last day of the month containing your reference point in time.
Getting the 1st day of the month is simpler:
select dateadd(day,-(day(current_timestamp)-1),current_timestamp)
From your reference point-in-time,
subtract (in days), 1 less than the current day-of-the-month component.
Stripping off/normalizing the extraneous time component is left as an exercise for the reader.
A simple way to get the last day of month is to get the first day of the next month and subtract 1.
This should work on Oracle DB
select distinct last_day(trunc(sysdate - rownum)) dt
from dual
connect by rownum < 430
order by 1
I did the following and it worked out great. I also wanted the Maximum Date for the Current Month. Here is what I my output is. Notice the last date for July which is 24th. I pulled it on 7/24/2017, hence the result
Year Month KPI_Date
2017 4 2017-04-28
2017 5 2017-05-31
2017 6 2017-06-30
2017 7 2017-07-24
SELECT B.Year ,
B.Month ,
MAX(DateField) KPI_Date
FROM Table A
INNER JOIN ( SELECT DISTINCT
YEAR(EOMONTH(DateField)) year ,
MONTH(EOMONTH(DateField)) month
FROM Table
) B ON YEAR(A.DateField) = B.year
AND MONTH(A.DateField) = B.Month
GROUP BY B.Year ,
B.Month
SELECT * FROM YourTableName WHERE anyfilter
AND "DATE" IN (SELECT MAX(NameofDATE_Column) FROM YourTableName WHERE
anyfilter GROUP BY
TO_CHAR(NameofDATE_Column,'MONTH'),TO_CHAR(NameofDATE_Column,'YYYY'));
Note: this answer does apply for Oracle DB
Here's how I just solved this. day_date is the date field, calendar is the table that holds the dates.
SELECT cast(datepart(year, day_date) AS VARCHAR)
+ '-'
+ cast(datepart(month, day_date) AS VARCHAR)
+ '-'
+ cast(max(DATEPART(day, day_date)) AS VARCHAR) 'DATE'
FROM calendar
GROUP BY datepart(year, day_date)
,datepart(month, day_date)
ORDER BY 1

How do I get a maximium daily value of a numerical field over a year in SQL

How do I get a maximium daily value of a numerical field over a year in MS-SQL
This would query the daily maximum of value over 2008:
select
datepart(dayofyear,datecolumn)
, max(value)
from yourtable
where '2008-01-01' <= datecolumn and datecolumn < '2009-01-01'
group by datepart(dayofyear,datecolumn)
Or the daily maximum over each year:
select
datepart(year,datecolumn),
, datepart(dayofyear,datecolumn)
, max(value)
from yourtable
group by datepart(year,datecolumn), datepart(dayofyear,datecolumn)
Or the day(s) with the highest value in a year:
select
Year = datepart(year,datecolumn),
, DayOfYear = datepart(dayofyear,datecolumn)
, MaxValue = max(MaxValue)
from yourtable
inner join (
select
Year = datepart(year,datecolumn),
, MaxValue = max(value)
from yourtable
group by datepart(year,datecolumn)
) sub on
sub.Year = yourtable.datepart(year,datecolumn)
and sub.MaxValue = yourtable.value
group by
datepart(year,datecolumn),
datepart(dayofyear,datecolumn)
You didn't mention which RDBMS or SQL dialect you're using. The following will work with T-SQL (MS SQL Server). It may require some modifications for other dialects since date functions tend to change a lot between them.
SELECT
DATEPART(dy, my_date),
MAX(my_number)
FROM
My_Table
WHERE
my_date >= '2008-01-01' AND
my_date < '2009-01-01'
GROUP BY
DATEPART(dy, my_date)
The DAY function could be any function or combination of functions which gives you the days in the format that you're looking to get.
Also, if there are days with no rows at all then they will not be returned. If you need those days as well with a NULL or the highest value from the previous day then the query would need to be altered a bit.
Something like
SELECT dateadd(dd,0, datediff(dd,0,datetime)) as day, MAX(value)
FROM table GROUP BY dateadd(dd,0, datediff(dd,0,datetime)) WHERE
datetime < '2009-01-01' AND datetime > '2007-12-31'
Assuming datetime is your date column, dateadd(dd,0, datediff(dd,0,datetime)) will extract only the date part, and then you can group by that value to get a maximum daily value. There might be a prettier way to get only the date part though.
You can also use the between construct to avoid the less than and greater than.
Group on the date, use the max delegate to get the highest value for each date, sort on the value, and get the first record.
Example:
select top 1 theDate, max(theValue)
from TheTable
group by theDate
order by max(theValue) desc
(The date field needs to only contain a date for this grouping to work, i.e. the time component has to be zero.)
If you need to limit the query for a specific year, use a starting and ending date in a where claues:
select top 1 theDate, max(theValue)
from TheTable
where theDate between '2008-01-01' and '2008-12-13'
group by theDate
order by max(theValue) desc