How to select all dates in SQL query - sql

SELECT oi.created_at, count(oi.id_order_item)
FROM order_item oi
The result is the follwoing:
2016-05-05 1562
2016-05-06 3865
2016-05-09 1
...etc
The problem is that I need information for all days even if there were no id_order_item for this date.
Expected result:
Date Quantity
2016-05-05 1562
2016-05-06 3865
2016-05-07 0
2016-05-08 0
2016-05-09 1

You can't count something that is not in the database. So you need to generate the missing dates in order to be able to "count" them.
SELECT d.dt, count(oi.id_order_item)
FROM (
select dt::date
from generate_series(
(select min(created_at) from order_item),
(select max(created_at) from order_item), interval '1' day) as x (dt)
) d
left join order_item oi on oi.created_at = d.dt
group by d.dt
order by d.dt;
The query gets the minimum and maximum date form the existing order items.
If you want the count for a specific date range you can remove the sub-selects:
SELECT d.dt, count(oi.id_order_item)
FROM (
select dt::date
from generate_series(date '2016-05-01', date '2016-05-31', interval '1' day) as x (dt)
) d
left join order_item oi on oi.created_at = d.dt
group by d.dt
order by d.dt;
SQLFiddle: http://sqlfiddle.com/#!15/49024/5

Friend, Postgresql Count function ignores Null values. It literally does not consider null values in the column you are searching. For this reason you need to include oi.created_at in a Group By clause
PostgreSql searches row by row sequentially. Because an integral part of your query is Count, and count basically stops the query for that row, your dates with null id_order_item are being ignored. If you group by oi.created_at this column will trump the count and return 0 values for you.
SELECT oi.created_at, count(oi.id_order_item)
FROM order_item oi
Group by io.created_at
From TechontheNet (my most trusted source of information):
Because you have listed one column in your SELECT statement that is not encapsulated in the count function, you must use a GROUP BY clause. The department field must, therefore, be listed in the GROUP BY section.
Some info on Count in PostgreSql
http://www.postgresqltutorial.com/postgresql-count-function/
http://www.techonthenet.com/postgresql/functions/count.php

Solution #1 You need Date Table where you stored all date data. Then do a left join depending on period.
Solution #2
WITH DateTable AS
(
SELECT DATEADD(dd, 1, CONVERT(DATETIME, GETDATE())) AS CreateDateTime, 1 AS Cnter
UNION ALL
SELECT DATEADD(dd, -1, CreateDateTime), DateTable.Cnter + 1
FROM DateTable
WHERE DateTable.Cnter + 1 <= 5
)
Generate Temporary table based on your input and then do a left Join.

Related

Show all results in date range replacing null records with zero

I am querying the counts of logs that appear on particular days. However on some days, no log records I'm searching for are recorded. How can I set count to 0 for these days and return a result with the full set of dates in a date range?
SELECT r.LogCreateDate, r.Docs
FROM(
SELECT SUM(TO_NUMBER(REPLACE(ld.log_detail, 'Total Documents:' , ''))) AS Docs, to_char(l.log_create_date,'YYYY-MM-DD') AS LogCreateDate
FROM iwbe_log l
LEFT JOIN iwbe_log_detail ld ON ld.log_id = l.log_id
HAVING to_char(l.log_create_date , 'YYYY-MM-DD') BETWEEN '2020-01-01' AND '2020-01-07'
GROUP BY to_char(l.log_create_date,'YYYY-MM-DD')
ORDER BY to_char(l.log_create_date,'YYYY-MM-DD') DESC
) r
ORDER BY r.logcreatedate
Current Result - Id like to include the 01, 04, 05 with 0 docs.
LOGCREATEDATE
Docs
2020-01-02
7
2020-01-03
3
2020-01-06
6
2020-01-07
1
You need a full list of dates first, then outer join the log data to that. There are several ways to generate the list of dates but now common table expressions (cte) are an ANSI standard way to do this, like so:
with cte (dt) as (
select to_date('2020-01-01','yyyy-mm-dd') as dt from dual -- start date here
union all
select dt + 1 from cte
where dt + 1 < to_date('2020-02-01','yyyy-mm-dd') -- finish (the day before) date here
)
select to_char(cte.dt,'yyyy-mm-dd') as LogCreateDate
, r.Docs
from cte
left join (
SELECT SUM(TO_NUMBER(REPLACE(ld.log_detail, 'Total Documents:' , ''))) AS Docs
, trunc(l.log_create_date) AS LogCreateDate
FROM iwbe_log l
LEFT JOIN iwbe_log_detail ld ON ld.log_id = l.log_id
HAVING trunc(l.log_create_date) BETWEEN to_date('2020-01-01','yyyy-mm-dd' AND to_date('2020-01-07','yyyy-mm-dd')
GROUP BY trunc(l.log_create_date)
) r on cte.dt = r.log_create_date
order by cte.dt
also, when dealing with dates I prefer to not convert them to strings until final output which allows you to still get proper date order and maximum query efficiency.

How can I get the count to display zero for months that have no records

I am pulling transactions that happen on an attribute (attribute ID 4205 in table 1235) by the date that a change happened to the attribute (found in the History table) and counting up the number of changes that occurred by month. So far I have
SELECT TOP(100) PERCENT MONTH(H.transactiondate) AS Month, COUNT(*) AS Count
FROM hsi.rmObjectInstance1235 AS O LEFT OUTER JOIN
hsi.rmObjectHistory AS H ON H.objectID = O.objectID
WHERE H.attributeid = 4205) AND Year(H.transaction date) = '2020'
GROUP BY MONTH(H.transactiondate)
And I get
Month Count
---------------
1 9
2 4
3 11
4 14
5 1
I need to display a zero for months June - December instead of excluding those months.
One option uses a recursive query to generate the dates, and then brings the original query with a left join:
with all_dates as (
select cast('2020-01-01' as date) dt
union all
select dateadd(month, 1, dt) from all_dates where dt < '2020-12-01'
)
select
month(d.dt) as month,
count(h.objectid) as cnt
from all_dates d
left join hsi.rmobjecthistory as h
on h.attributeid = 4205
and h.transaction_date >= d.dt
and h.transaction_date < dateadd(month, 1, d.dt)
and exists (select 1 from hsi.rmObjectInstance1235 o where o.objectID = h.objectID)
group by month(d.dt)
I am quite unclear about the intent of the table hsi.rmObjectInstance1235 in the query, as none of its column are used in the select and group by clauses; it it is meant to filter hsi.rmobjecthistory by objectID, then you can rewrite this as an exists condition, as shown in the above solution. Possibly, you might as well be able to just remove that part of the query.
Also, note that
top without order by does not really make sense
top (100) percent is a no op
As a consequence, I removed that row-limiting clause.

cross join to get all dates and hours and avoid duplicate values

We have 2 tables:
sales
hourt (only 1 field (hourt) of numbers: 0 to 23)
The goal is to list all dates and all 24 hours for each day and group hours that have sales. For hours that do not have sales, zero will be shown.
This query cross joins the sales table with the hourt table and does list all dates and 24 hours. However, there are also many duplicate rows. How can we avoid the duplicates?
We're using Amazon Redshift (based on Postgres 8.0).
with h as (
SELECT
a.purchase_date,
CAST(DATE_PART("HOUR", AT_TIME_ZONE(AT_TIME_ZONE(CAST(a.purchase_date AS
DATETIME), "0:00"), "PST")) as INTEGER) AS Hour,
COUNT(a.quantity) AS QtyCount,
SUM(a.quantity) AS QtyTotal,
SUM((a.price) AS Price
FROM sales a
GROUP BY CAST(DATE_PART("HOUR",
AT_TIME_ZONE(AT_TIME_ZONE(CAST(a.purchase_date AS DATETIME), "0:00"),
"PST")) as INTEGER),
DATE_FORMAT(AT_TIME_ZONE(AT_TIME_ZONE(CAST(a.purchase_date AS DATETIME),
"0:00"), "PST"), "yyyy-MM-dd")
ORDER by a.purchase_date
),
hr as (
SELECT
CAST(hourt AS INTEGER) AS hourt
FROM hourt
),
joined as (
SELECT
purchase_date,
hourt,
QtyCount,
QtyTotal,
Price
FROM h
cross JOIN hr
)
SELECT *
FROM joined
Order by purchase_date,hourt
Sample Tables:
Before the cross join, query returned correct sales and grouped hours, as seen in the below table.
Desired results table:
Need to create a series of all the hour values and left join your data back to that. Comments inline explain the logic.
WITH data AS (-- Do the basic aggregation first
SELECT DATE_TRUNC('hour',a.purchase_date) purchase_hour --Truncate timestamp to the hour is simpler
,COUNT(a.quantity) AS QtyCount
,SUM(a.quantity) AS QtyTotal
,SUM((a.price) AS Price
FROM sales a
GROUP BY DATE_TRUNC('hour',a.purchase_date)
ORDER BY DATE_TRUNC('hour',a.purchase_date)
-- SELECT '2017-01-13 12:00:00'::TIMESTAMP purchase_hour, 1 qty_count, 1 qty_total, 119 price
-- UNION ALL SELECT '2017-01-13 15:00:00'::TIMESTAMP purchase_hour, 1 qty_count, 1 qty_total, 119 price
-- UNION ALL SELECT '2017-01-14 21:00:00'::TIMESTAMP purchase_hour, 1 qty_count, 1 qty_total, 119 price
)
,time_range AS (--Calculate the start and end **date** values
SELECT DATE_TRUNC('day',MIN(purchase_hour)) start_date
, DATE_TRUNC('day',MAX(purchase_hour))+1 end_date
FROM data
)
,hr AS (--Generate all hours between start and end
SELECT (SELECT start_date
FROM time_range
LIMIT 1) --Limit 1 so the optimizer knows it's not a correlated subquery
+ ((n-1) --Make the series start at zero so we don't miss the starting value
* INTERVAL '1 hour') AS "hour"
FROM (SELECT ROW_NUMBER() OVER () n
FROM stl_query --Can use any table here as long as it enough rows
LIMIT 100) series
WHERE "hour" < (SELECT end_date FROM time_range LIMIT 1)
)
--Use NVL to replace missing values with zeroes
SELECT hr.hour AS purchase_hour --Timestamp like `2017-01-13 12:00:00`
, NVL(data.qty_count, 0) AS qty_count
, NVL(data.qty_total, 0) AS qty_total
, NVL(data.price, 0) AS price
FROM hr
LEFT JOIN data
ON hr.hour = data.purchase_hour
ORDER BY hr.hour
;
I achieved the desired results by using Left Join (table A with table B) instead of Cross Join of these two tables:
Table A has all the dates and hours
Table B is the first part of the original query

Calculating business days in Teradata

I need help in business days calculation.
I've two tables
1) One table ACTUAL_TABLE containing order date and contact date with timestamp datatypes.
2) The second table BUSINESS_DATES has each of the calendar dates listed and has a flag to indicate weekend days.
using these two tables, I need to ensure business days and not calendar days (which is the current logic) is calculated between these two fields.
My thought process was to first get a range of dates by comparing ORDER_DATE with TABLE_DATE field and then do a similar comparison of CONTACT_DATE to TABLE_DATE field. This would get me a range from the BUSINESS_DATES table which I can then use to calculate count of days, sum(Holiday_WKND_Flag) fields making the result look like:
Order# | Count(*) As DAYS | SUM(WEEKEND DATES)
100 | 25 | 8
However this only works when I use a specific order number and cant' bring all order numbers in a sub query.
My Query:
SELECT SUM(Holiday_WKND_Flag), COUNT(*) FROM
(
SELECT
* FROM
BUSINESS_DATES
WHERE BUSINESS.Business BETWEEN (SELECT ORDER_DATE FROM ACTUAL_TABLE
WHERE ORDER# = '100'
)
AND
(SELECT CONTACT_DATE FROM ACTUAL_TABLE
WHERE ORDER# = '100'
)
TEMP
Uploading the table structure for your reference.
SELECT ORDER#, SUM(Holiday_WKND_Flag), COUNT(*)
FROM business_dates bd
INNER JOIN actual_table at ON bd.table_date BETWEEN at.order_date AND at.contact_date
GROUP BY ORDER#
Instead of joining on a BETWEEN (which always results in a bad Product Join) followed by a COUNT you better assign a bussines day number to each date (in best case this is calculated only once and added as a column to your calendar table). Then it's two Equi-Joins and no aggregation needed:
WITH cte AS
(
SELECT
Cast(table_date AS DATE) AS table_date,
-- assign a consecutive number to each busines day, i.e. not increased during weekends, etc.
Sum(CASE WHEN Holiday_WKND_Flag = 1 THEN 0 ELSE 1 end)
Over (ORDER BY table_date
ROWS Unbounded Preceding) AS business_day_nbr
FROM business_dates
)
SELECT ORDER#,
Cast(t.contact_date AS DATE) - Cast(t.order_date AS DATE) AS #_of_days
b2.business_day_nbr - b1.business_day_nbr AS #_of_business_days
FROM actual_table AS t
JOIN cte AS b1
ON Cast(t.order_date AS DATE) = b1.table_date
JOIN cte AS b2
ON Cast(t.contact_date AS DATE) = b2.table_date
Btw, why are table_date and order_date timestamp instead of a date?
Porting from Oracle?
You can use this query. Hope it helps
select order#,
order_date,
contact_date,
(select count(1)
from business_dates_table
where table_date between a.order_date and a.contact_date
and holiday_wknd_flag = 0
) business_days
from actual_table a

sql count statement with multiple date ranges

I have two table with different appointment dates.
Table 1
id start date
1 5/1/14
2 3/2/14
3 4/5/14
4 9/6/14
5 10/7/14
Table 2
id start date
1 4/7/14
1 4/10/14
1 7/11/13
2 2/6/14
2 2/7/14
3 1/1/14
3 1/2/14
3 1/3/14
If i had set date ranges i can count each appointment date just fine but i need to change the date ranges.
For each id in table 1 I need to add the distinct appointment dates from table 2 BUT only
6 months prior to the start date from table 1.
Example: count all distinct appointment dates for id 1 (in table 2) with appointment dates between 12/1/13 and 5/1/14 (6 months prior). So the result is 2...4/7/14 and 4/10/14 are within and 7/1/13 is outside of 6 months.
So my issue is that the range changes for each record and i can not seem to figure out how to code this.For id 2 the date range will be 9/1/14-3/2/14 and so on.
Thanks everyone in advance!
Try this out:
SELECT id,
(
SELECT COUNT(*)
FROM table2
WHERE id = table1.id
AND table2.start_date >= DATEADD(MM,-6,table1.start_date)
) AS table2records
FROM table1
The DATEADD subtracts 6 months from the date in table1 and the subquery returns the count of related records.
I think what you want is a type of join.
select t1.id, count(t2.id) as numt2dates
from table1 t1 left outer join
table2 t2
on t1.id = t2.id and
t2.startdate between dateadd(month, -6, t1.startdate) and t1.startdate
group by t1.id;
The exact syntax for the date arithmetic depends on the database.
Thank you this solved my issue. Although this may not help you since you are not attempting to group by date. But the answer gave me the insights to resolve the issue I was facing.
I was attempting to gather the total users a date criteria that had to be evaluated by multiple fields.
WITH data AS (
SELECT generate_series(
(date '2020-01-01')::timestamp,
NOW(),
INTERVAL '1 week'
) AS date
)
SELECT d.date, (SELECT COUNT(DISTINCT h.id) AS user_count
FROM history h WHERE h.startDate < d.date AND h.endDate > d.date
ORDER BY 1 DESC) AS total_records
FROM data d ORDER BY d.date DESC
2022-05-16, 15
2022-05-09, 13
2022-05-02, 13
...