Group By Week on Redshift

I'm trying to count the rows in my table by week, but DATE_TRUNC('week', date) treats Monday as the start of the week and I need the week to start on Sunday.
This is the query, which runs properly but with weeks starting on Monday:
SELECT DATE_TRUNC('week',myDate) AS Reference,
column1 AS Item1,
column2 AS Item2,
COUNT(*) AS Volume,
COUNT(CASE WHEN status = 'status1' THEN 1 END) AS Status1,
COUNT(CASE WHEN status = 'status2' THEN 1 END) AS Status2,
COUNT(CASE WHEN status = 'status2' AND fase = '1' THEN 1 END) AS Fase1,
COUNT(CASE WHEN status = 'status2' AND fase = '2' THEN 1 END) AS Fase2,
COUNT(CASE WHEN status = 'status2' AND fase = '3' THEN 1 END) AS Fase3
FROM myTable
WHERE DATE_TRUNC('week',myDate) = DATE_TRUNC('week',TO_DATE('12/25/2016 00:00:00','MM/dd/yyyy'))
GROUP BY 1,
2,
3;
So far I have only tried one other query, which doesn't even run and I don't know why. It just says: syntax error at or near "integer"
SELECT DATE_TRUNC('week',myDate) - integer '1' AS Reference,
column1 AS Item1,
column2 AS Item2,
COUNT(*) AS Item3,
COUNT(CASE WHEN status = 'status1' THEN 1 END) AS Status1,
COUNT(CASE WHEN status = 'status2' THEN 1 END) AS Status2,
COUNT(CASE WHEN status = 'status2' AND fase = '1' THEN 1 END) AS Fase1,
COUNT(CASE WHEN status = 'status2' AND fase = '2' THEN 1 END) AS Fase2,
COUNT(CASE WHEN status = 'status2' AND fase = '3' THEN 1 END) AS Fase3
FROM myTable
WHERE myDate between ( DATE_TRUNC('week', TO_DATE('12/25/2016 00:00:00','MM/dd/yyyy' ) - integer '1' ) and ( DATE_TRUNC('week', TO_DATE('12/25/2016 00:00:00','MM/dd/yyyy' ) ) + integer '5' )
GROUP BY 1,
2,
3;
Also, even if this query ran properly, for 25/Dec it would show the count for the week 18/Dec - 24/Dec and not for the week 25/Dec - 31/Dec. The same would happen for any other date that falls on a Sunday.
EDIT:
I just found the solution in this blog:
https://blog.modeanalytics.com/date-trunc-sql-timestamp-function-count-on/
The post introduces the date_trunc function, and someone asked the same question in the comments. This is my solved query, for future reference:
SELECT date_trunc('WEEK',(myDate + interval '1 day'))- interval '1 day' AS Reference,
column1 AS Item1,
column2 AS Item2,
COUNT(*) AS Volume,
COUNT(CASE WHEN status = 'status1' THEN 1 END) AS Status1,
COUNT(CASE WHEN status = 'status2' THEN 1 END) AS Status2,
COUNT(CASE WHEN status = 'status2' AND fase = '1' THEN 1 END) AS Fase1,
COUNT(CASE WHEN status = 'status2' AND fase = '2' THEN 1 END) AS Fase2,
COUNT(CASE WHEN status = 'status2' AND fase = '3' THEN 1 END) AS Fase3
FROM myTable
WHERE ( date_trunc('WEEK',(myDate + interval '1 day'))- interval '1 day') = ( DATE_TRUNC('week',TO_DATE('12/24/2016 00:00:00','MM/dd/yyyy') + interval '1 day' ) - interval '1 day' )
GROUP BY 1,
2,
3;

I couldn't find any simple way to make the week run from Sunday to Saturday, but you can try this:
select date_trunc('week', myDate + 1) - 1 as Reference,
...
from myTable
where ...
group by date_trunc('week', myDate + 1), ...
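The trick is to shift the date forward by one day before truncating to the week, then shift the result back by one day. The GROUP BY uses the same shifted expression.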
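The shift-by-one-day arithmetic can be checked outside the database. Below is a minimal Python sketch, assuming (as the question states) that DATE_TRUNC('week', ...) returns the Monday of the week. The function names are made up for illustration and are not part of Redshift.

```python
from datetime import date, timedelta


def monday_week_start(d: date) -> date:
    # Stand-in for DATE_TRUNC('week', d): weeks start on Monday (weekday() == 0).
    return d - timedelta(days=d.weekday())


def sunday_week_start(d: date) -> date:
    # Shift forward one day, truncate to Monday, then shift back one day.
    return monday_week_start(d + timedelta(days=1)) - timedelta(days=1)


# 25 Dec 2016 is a Sunday, the case raised in the question.
for d in (date(2016, 12, 24), date(2016, 12, 25), date(2016, 12, 31)):
    print(d, "->", sunday_week_start(d))
```

Saturday 24 Dec maps to Sunday 18 Dec, while 25 Dec and 31 Dec both map to Sunday 25 Dec, which is the Sunday-to-Saturday grouping the question asks for.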

Related

Regarding cast/safe_cast in BigQuery for yesterday data

I'm trying to get yesterday's data for the value MainTable.Profit_Amount and I get an error:
No matching signature for operator = for argument types: TIMESTAMP, DATE. Supported signature: ANY = ANY
OrderDate is a TIMESTAMP, so I tried SAFE_CAST, but no luck.
What should I change? I'm referring to the last line of the first part (the line with DATE_ADD):
SELECT *,
SUM(CASE WHEN MainTable.status = 2 THEN MainTable.Price ELSE 0 END) AS NB,
SUM(CASE WHEN MainTable.status = 3 THEN MainTable.Price ELSE 0 END) AS Cancelled,
SUM(CASE WHEN MainTable.status = 2 THEN MainTable.Profit_Amount ELSE 0 END) AS NR,
SUM(CASE WHEN MainTable.status = 3 THEN MainTable.Profit_Amount ELSE 0 END) AS GR_Cancelled,
ROUND(SUM(CASE WHEN MainTable.NewCheck = 0 then (MainTable.Profit_Amount) ELSE 0 END),2) as Existing_Suppliers_NR,
ROUND(SUM(CASE WHEN MainTable.NewCheck = 1 then (MainTable.Profit_Amount) ELSE 0 END),2) as New_Suppliers_NR,
SUM(CASE WHEN SAFE_CAST(MainTable.OrderDate AS STRING) = DATE_ADD(CURRENT_DATE(), interval -1 day) THEN MainTable.Profit_Amount else 0 end) AS YESTERDAY_NR
FROM
(
SELECT SNAP.opportunity_agent_name AS AgentName, SNAP.order_item_cda_price AS Price, ROUND(SNAP.order_item_cda_margin_percentage,2) AS Profit_Percantage, ROUND(SNAP.order_item_cda_margin_nis,2) AS Profit_Amount, SNAP.opportunity_meta_title AS OppName, SNAP.opportunity_date_live AS DateLive, SUPP.is_new AS NewCheck, SNAP.order_status AS status, SNAP.order_creation_date AS OrderDate, SNAP.opportunity_supplier_name AS SupplierName
FROM `Reports_for_BI.order_item_snapshot_clean_view_for_analysis` as SNAP, `grouponi_groupon.tb_suppliers` as SUPP
WHERE EXTRACT(MONTH from SNAP.order_creation_date at time zone "UTC") = EXTRACT(MONTH FROM DATE_ADD(CURRENT_DATE(), INTERVAL -1 DAY))
AND EXTRACT(YEAR from SNAP.order_creation_date at time zone "UTC") = EXTRACT(YEAR FROM DATE_ADD(CURRENT_DATE(), INTERVAL -1 DAY))
AND EXTRACT(DATE from SNAP.order_creation_date at time zone "UTC") != CURRENT_DATE()
AND SNAP.opportunity_agent_id in (453,714,571,68,98)
AND SNAP.order_status in (2,3)
AND SNAP.order_item_supplier_id = SUPP.id) AS MainTable
Group by AgentName, Price, Profit_Percantage, Profit_Amount,OppName, DateLive, NewCheck, Status, OrderDate, SupplierName

How do I get SUM by date without losing a calculation that relies on summing instance

CODE:
SELECT occurred_at,
SUM(CASE WHEN activity_name = 'create' THEN 1 ELSE 0 END) AS has_create,
SUM(CASE WHEN activity_name = 'Resolve' THEN 1 ELSE 0 END) AS has_resolve
FROM Activity_Data_Table$
WHERE (occurred_at > CURRENT_TIMESTAMP - 31)
AND (activity_name IN ('create', 'Resolve'))
GROUP BY session_id, occurred_at
HAVING (MAX(CASE WHEN activity_name = 'create' THEN 1 ELSE 0 END) <> 0)
I'm trying to identify the percentage of tickets, by day, that are created and resolved in the same session. Tickets can't be resolved before they're created. Timing doesn't matter, only whether both events are in the same session. Tickets can definitely be created in one session and resolved in another. There are other statuses, but they don't matter, and I don't care if something is resolved in a different session. The finished result would contain an additional column with has_resolve/has_create as a percentage at the daily level.

Something like this:
WITH dat
AS
(
SELECT occurred_at,
SUM(CASE WHEN activity_name = 'create' THEN 1 ELSE 0 END) AS has_create,
SUM(CASE WHEN activity_name = 'Resolve' THEN 1 ELSE 0 END) AS has_resolve
FROM Activity_Data_Table$
WHERE (occurred_at > CURRENT_TIMESTAMP - 31)
AND (activity_name IN ('create', 'Resolve'))
GROUP BY session_id, occurred_at
HAVING (MAX(CASE WHEN activity_name = 'create' THEN 1 ELSE 0 END) <> 0)
)
SELECT occurred_at, SUM(CASE WHEN has_create + has_resolve = 2 THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS pct_resolved_in_same_session
FROM dat
GROUP BY occurred_at;
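To see the shape of the result, here is a small sketch that runs the same idea against an in-memory SQLite database from Python. The table name activity and the sample rows are hypothetical, occurred_at is assumed to already hold a day, and the 31-day filter is left out because its syntax is dialect-specific. It uses MAX rather than SUM for the per-session flags, so a session with several create rows is still counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (session_id INTEGER, occurred_at TEXT, activity_name TEXT)"
)
conn.executemany(
    "INSERT INTO activity VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "create"),
        (1, "2024-01-01", "Resolve"),  # created and resolved in session 1
        (2, "2024-01-01", "create"),   # created only
        (3, "2024-01-01", "Resolve"),  # resolved only: dropped by HAVING
        (4, "2024-01-02", "create"),
        (4, "2024-01-02", "create"),   # two creates in the same session
        (4, "2024-01-02", "Resolve"),
    ],
)

query = """
WITH per_session AS (
    -- One row per (session, day) with 0/1 flags; keep only sessions with a create.
    SELECT occurred_at,
           session_id,
           MAX(CASE WHEN activity_name = 'create'  THEN 1 ELSE 0 END) AS has_create,
           MAX(CASE WHEN activity_name = 'Resolve' THEN 1 ELSE 0 END) AS has_resolve
    FROM activity
    WHERE activity_name IN ('create', 'Resolve')
    GROUP BY session_id, occurred_at
    HAVING MAX(CASE WHEN activity_name = 'create' THEN 1 ELSE 0 END) = 1
)
SELECT occurred_at,
       SUM(has_resolve) * 1.0 / COUNT(*) AS pct_resolved_in_same_session
FROM per_session
GROUP BY occurred_at
ORDER BY occurred_at
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With these rows, half of the sessions that created a ticket on the first day also resolved one, and the single qualifying session on the second day did.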

Subquery In Postgresql not matching data expected

When the query below is used I get 7 records from the database.
SELECT pickup_date::date,
SUM(CASE WHEN paid ='Yes' THEN price ELSE 0 END) AS TotalMoMoPaid
from requests
where order_status = 'Done'
and payment_mode = 'MoMo'
and pickup_date::date >= current_timestamp::date - INTERVAL '7 days'
GROUP BY pickup_date::date,paid,order_status,price
When the same query is used as a subquery, I get 2 records, which is not what I expect:
SELECT pickup_date::date,
sub.TotalMoMoPaid,
SUM(CASE WHEN order_status ='Done' THEN price ELSE 0 END) AS "TotalCashSales"
from (
SELECT paid as subPaid,
order_status as subStatus,
price as subPrice,
SUM(CASE WHEN paid ='Yes' THEN price ELSE 0 END) AS TotalMoMoPaid
from requests
where order_status = 'Done'
and payment_mode = 'MoMo'
and pickup_date::date >= current_timestamp::date - INTERVAL '7 days'
GROUP BY pickup_date::date,subPaid,subStatus,subPrice
) AS sub, requests
where order_status ='Done'
and payment_mode = 'Cash'
and pickup_date::date >= current_timestamp::date - INTERVAL '7 days'
GROUP BY sub.TotalMoMoPaid,subPaid,pickup_date::date
ORDER BY sub.TotalMoMoPaid,pickup_date::date

This query should work without using a subquery:
SELECT pickup_date::date,
SUM(CASE WHEN payment_mode = 'MoMo' and paid = 'Yes' THEN price ELSE 0 END) AS TotalMoMoPaid,
SUM(CASE WHEN payment_mode = 'Cash' and paid = 'Yes' THEN price ELSE 0 END) AS TotalCashSales
FROM requests
WHERE order_status = 'Done'
and pickup_date::date >= current_timestamp::date - INTERVAL '7 days'
GROUP BY pickup_date::date,paid,order_status,price
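Note that in the question's second query, "AS sub, requests" has no join condition, so every subquery row is paired with every row of requests. The conditional-aggregation approach avoids the join altogether. Here is a hedged sketch on an in-memory SQLite database with made-up rows. It assumes one row per day is the desired output, so it groups by the date only rather than by paid, order_status and price, and it leaves out the 7-day filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests ("
    "pickup_date TEXT, payment_mode TEXT, paid TEXT, order_status TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?, ?, ?)",
    [
        ("2024-03-01", "MoMo", "Yes", "Done", 10.0),
        ("2024-03-01", "MoMo", "Yes", "Done", 5.0),
        ("2024-03-01", "Cash", "Yes", "Done", 7.0),
        ("2024-03-02", "Cash", "Yes", "Done", 3.0),
        ("2024-03-02", "MoMo", "No", "Done", 4.0),      # unpaid: contributes 0
        ("2024-03-02", "Cash", "Yes", "Pending", 9.0),  # not Done: filtered out
    ],
)

rows = conn.execute(
    """
    SELECT pickup_date,
           SUM(CASE WHEN payment_mode = 'MoMo' AND paid = 'Yes' THEN price ELSE 0 END) AS total_momo_paid,
           SUM(CASE WHEN payment_mode = 'Cash' AND paid = 'Yes' THEN price ELSE 0 END) AS total_cash_sales
    FROM requests
    WHERE order_status = 'Done'
    GROUP BY pickup_date
    ORDER BY pickup_date
    """
).fetchall()
for row in rows:
    print(row)
```

Each day appears once, with the MoMo and Cash totals side by side.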

How to combine 2 queries in same table with different group by ? (ORACLE)

I need to calculate On Time Arrival and Departure. Query to get On Time Departure:
SELECT DEPAIRPORT as AIRPORT,
COUNT(case when A.STATUS = 'Scheduled' and
A.ACTUAL_BLOCKOFF is not null then 1 else NULL END) as SCHEDULED,
COUNT(case when ((A.ACTUAL_BLOCKOFF+ interval '7' hour) - (A.SCHEDULED_DEPDT+ interval '7' hour))*24*60 <= '+000000015 00:00:00.000000000' and
A.ACTUAL_BLOCKOFF is not null then 1 else NULL END) as ONTIME
FROM TABLE A GROUP BY DEPAIRPORT
And the query to calculate On Time Arrival:
SELECT COUNT(case when ((A.ACTUAL_BLOCKON + interval '7' hour) - (A.SCHEDULED_ARRDT+ interval '7' hour))*24*60 <= '+000000015 00:00:00.000000000' and
A.ACTUAL_BLOCKON is not null then 1 else NULL END) as ARRONTIME
FROM TABLE A
GROUP BY ARRIVALAIRPORT
How can I combine these queries into a single query so that I can display the result like this table:
Name #Schedule #OnTimeDeparture #ArrivalOntime
AIRPORTX 41 35 20

Without sample data and the expected output, it is difficult to tell exactly what you want. If you want to combine the two datasets, you can put them in WITH clauses and then join them together (LEFT JOIN or INNER JOIN, depending on the output required for airports that have departures but no arrivals yet).
WITH dep
AS (SELECT depairport AS airport,
count(CASE
WHEN a.status = 'Scheduled'
AND a.actual_blockoff IS NOT NULL THEN 1
END) AS scheduled,
count(CASE
WHEN( ( a.actual_blockoff + interval '7' hour ) - (
a.scheduled_depdt + interval '7' hour ) ) *
24 *
60
<=
'+000000015 00:00:00.000000000'
AND a.actual_blockoff IS NOT NULL THEN 1
END) AS ontime
FROM tablea a
GROUP BY depairport),
arr
AS (SELECT arrivalairport AS airport,
count(CASE
WHEN( ( a.actual_blockon + interval '7' hour ) - (
a.scheduled_arrdt + interval '7' hour ) ) *
24 *
60
<=
'+000000015 00:00:00.000000000'
AND a.actual_blockon IS NOT NULL THEN 1
END) AS arrontime
FROM tablea a
GROUP BY arrivalairport)
SELECT dep.airport AS Name,
dep.scheduled AS "#Schedule",
dep.ontime AS "#OnTimeDeparture",
arr.arrontime AS "#ArrivalOntime"
FROM dep
left join arr -- Or Inner join depending on the expected output.
ON ( dep.airport = arr.airport );
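The two-CTE pattern can be tried on a small data set. This is only a sketch on an in-memory SQLite database: the flights table and its rows are invented, the status filter and the 7-hour offset are omitted, and the Oracle interval comparison is replaced by a difference in minutes computed with SQLite's julianday().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE flights (
        depairport TEXT, arrivalairport TEXT,
        scheduled_depdt TEXT, actual_blockoff TEXT,
        scheduled_arrdt TEXT, actual_blockon TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO flights VALUES (?, ?, ?, ?, ?, ?)",
    [
        # AAA -> BBB: departs 5 min late, arrives 30 min late
        ("AAA", "BBB", "2024-01-01 08:00", "2024-01-01 08:05", "2024-01-01 10:00", "2024-01-01 10:30"),
        # AAA -> BBB: departs 40 min late, arrives 5 min late
        ("AAA", "BBB", "2024-01-01 12:00", "2024-01-01 12:40", "2024-01-01 14:00", "2024-01-01 14:05"),
        # BBB -> AAA: departs on time, arrives 40 min late
        ("BBB", "AAA", "2024-01-01 09:00", "2024-01-01 09:00", "2024-01-01 11:00", "2024-01-01 11:40"),
    ],
)

query = """
WITH dep AS (
    -- Aggregated per departure airport
    SELECT depairport AS airport,
           COUNT(actual_blockoff) AS departed,
           COUNT(CASE WHEN (julianday(actual_blockoff) - julianday(scheduled_depdt)) * 24 * 60 <= 15
                      THEN 1 END) AS dep_ontime
    FROM flights
    GROUP BY depairport
),
arr AS (
    -- Aggregated per arrival airport
    SELECT arrivalairport AS airport,
           COUNT(CASE WHEN (julianday(actual_blockon) - julianday(scheduled_arrdt)) * 24 * 60 <= 15
                      THEN 1 END) AS arr_ontime
    FROM flights
    GROUP BY arrivalairport
)
SELECT dep.airport, dep.departed, dep.dep_ontime, arr.arr_ontime
FROM dep
LEFT JOIN arr ON dep.airport = arr.airport
ORDER BY dep.airport
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each aggregation keeps its own GROUP BY key, and the airport code is the only thing the final join needs.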

You can use something like this:
select
max(SCHEDULED) as SCHEDULED,
max(ONTIME) as ONTIME,
max(ARRONTIME) as ARRONTIME
from (select
count(case when ... ) over(partition by DEPAIRPORT) as SCHEDULED,
count(case when ... ) over(partition by DEPAIRPORT) as ONTIME,
count(case when ... ) over(partition by ARRIVALAIRPORT) as ARRONTIME
from a );
But I guess that your question is not complete. You also need a key to join the different flights.

Limit SQL query to days

I use this table and query to produce a status report by day:
CREATE TABLE TICKET(
ID INTEGER NOT NULL,
TITLE TEXT,
STATUS INTEGER,
LAST_UPDATED DATE,
CREATED DATE
)
;
Query:
SELECT t.created,
COUNT(CASE WHEN t.status = '1' THEN 1 END) as cnt_status1,
COUNT(CASE WHEN t.status = '2' THEN 1 END) as cnt_status2,
COUNT(CASE WHEN t.status = '3' THEN 1 END) as cnt_status3,
COUNT(CASE WHEN t.status = '4' THEN 1 END) as cnt_status4
FROM ticket t
GROUP BY t.created
How can I limit this query to the last 7 days?
I would also like the results split by day. For example, I would like to group the dates in the first 24 hours together, then those in the next 24 hours, and so on.

This might help:
SELECT TO_CHAR(t.created, 'YYYY-MM-DD') AS created_date,
COUNT(CASE WHEN t.status = '1' THEN 1 END) as cnt_status1,
COUNT(CASE WHEN t.status = '2' THEN 1 END) as cnt_status2,
COUNT(CASE WHEN t.status = '3' THEN 1 END) as cnt_status3,
COUNT(CASE WHEN t.status = '4' THEN 1 END) as cnt_status4
FROM ticket t
WHERE t.created >= SYSDATE-7
GROUP BY TO_CHAR(t.created, 'YYYY-MM-DD')
ORDER BY created_date;
I used the Oracle functions for the date conversion. I'm sure you'll find the corresponding ones for PostgreSQL.
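The same filter-then-group-by-day idea, sketched on an in-memory SQLite database so it can be run anywhere. The sample tickets are made up, the date columns are stored as text, and a fixed reference day is passed in place of SYSDATE so the output is reproducible. SQLite's date() function stands in for TO_CHAR here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ticket ("
    "id INTEGER NOT NULL, title TEXT, status INTEGER, last_updated TEXT, created TEXT)"
)
conn.executemany(
    "INSERT INTO ticket VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", 1, None, "2024-05-10 09:15:00"),
        (2, "b", 2, None, "2024-05-10 17:40:00"),
        (3, "c", 1, None, "2024-05-09 08:00:00"),
        (4, "d", 3, None, "2024-04-01 12:00:00"),  # older than 7 days: excluded
    ],
)

reference_day = "2024-05-10"  # hypothetical "today", fixed for the example

query = """
SELECT date(created) AS created_date,
       COUNT(CASE WHEN status = 1 THEN 1 END) AS cnt_status1,
       COUNT(CASE WHEN status = 2 THEN 1 END) AS cnt_status2,
       COUNT(CASE WHEN status = 3 THEN 1 END) AS cnt_status3,
       COUNT(CASE WHEN status = 4 THEN 1 END) AS cnt_status4
FROM ticket
WHERE created >= date(?, '-7 days')
GROUP BY date(created)
ORDER BY created_date
"""
rows = conn.execute(query, (reference_day,)).fetchall()
for row in rows:
    print(row)
```

Truncating created to a day in both the SELECT and the GROUP BY is what collapses all tickets from the same calendar day into one row.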
