I want to find out how many users paid within 15, 30 and 60 minutes of trigger_time, using my payment_time and trigger_time columns.
I have the following query:
with redshift_direct() as conn:
trigger_time_1 = pd.read_sql(f"""
with new_data as
(
select
cycle_end_date
, prime_tagging_by_issuer_and_product
, u.user_id
, settled_status
, delay,
ots_created_at + interval '5:30 hours' as payment_time
,case when to_char(cycle_end_date,'DD') = '15' then 'Odd' else 'Even' end as cycle_order
from
settlement_summary_from_snapshot s
left join (select distinct user_phone_number, user_id from user_events where event_name = 'UserCreatedEvent') u
on u.user_id = s.user_id
and cycle_type = 'end_cycle'
and cycle_end_date > '2021-11-30' and cycle_end_date < '2022-01-15'
)
select
bucket_id
, cycle_end_date, d.cycle_order
, date(cycle_end_date) as t_cycle_end_date
,d.prime_tagging_by_issuer_and_product
,source
,status as cause
,split_part(campaign_name ,'|', 1) as campaign
,split_part(campaign_name ,'|', 2) as sms_cycle_end_date
,split_part(campaign_name ,'|', 3) as day
,split_part(campaign_name ,'|', 4) as type
,to_char(to_date(split_part(campaign_name ,'|', 2) , 'DD/MM/YYYY'), 'YYYY-MM-DD') as campaign_date,
d.payment_time, payload_event_timestamp + interval '5:30 hours' as trigger_time
,count( s.user_id) as count
from sms_callback_events s
inner join new_data d
on s.user_id = d.user_id
where bucket_id > 'date_2021_11_30' and bucket_id < 'date_2022_01_15'
and campaign_name like '%RC%'
and event_name = 'SmsStatusUpdatedEvent'
group by 1,2,3,4,5,6,7,8,9,10,11,12,13,14
""",conn)
How do I add 3 columns with the number of users who paid within 15, 30 and 60 minutes after trigger_time in this query? I was doing it with Pandas, but I want to find a way to do it in the query itself. Can someone help?
I wrote my own DATEDIFF function, which returns the difference between two dates as an integer, measured in days, months, years, hours, minutes, and so on. You can use this function in your queries.
DATEDIFF Function SQL Code on GitHub
Sample query using the DATEDIFF function:
select
datediff('minute', mm.start_date, mm.end_date) as diff_minute
from
(
select
'2022-02-24 09:00:00.100'::timestamp as start_date,
'2022-02-24 09:15:21.359'::timestamp as end_date
) mm;
Result:
---------------
diff_minute
---------------
15
---------------
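Once a minute difference is available, the three columns asked for in the question could be built with conditional aggregation. The following is only a minimal sketch: it runs against SQLite through Python's sqlite3 so that it is self-contained, the table payments_vs_triggers and its rows are made up, and the strftime('%s', ...) arithmetic stands in for whatever DATEDIFF the target database offers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one row per user with the SMS trigger and payment times.
    CREATE TABLE payments_vs_triggers (user_id INTEGER, trigger_time TEXT, payment_time TEXT);
    INSERT INTO payments_vs_triggers VALUES
        (1, '2022-01-01 10:00:00', '2022-01-01 10:10:00'),
        (2, '2022-01-01 10:00:00', '2022-01-01 10:25:00'),
        (3, '2022-01-01 10:00:00', '2022-01-01 10:50:00'),
        (4, '2022-01-01 10:00:00', '2022-01-01 12:00:00'),
        (5, '2022-01-01 10:00:00', '2022-01-01 09:00:00');
""")

query = """
    WITH diffs AS (
        -- Minutes between trigger and payment (negative if paid before the trigger).
        SELECT user_id,
               (CAST(strftime('%s', payment_time) AS INTEGER)
                - CAST(strftime('%s', trigger_time) AS INTEGER)) / 60.0 AS diff_min
        FROM payments_vs_triggers
    )
    SELECT COUNT(DISTINCT CASE WHEN diff_min BETWEEN 0 AND 15 THEN user_id END) AS paid_15,
           COUNT(DISTINCT CASE WHEN diff_min BETWEEN 0 AND 30 THEN user_id END) AS paid_30,
           COUNT(DISTINCT CASE WHEN diff_min BETWEEN 0 AND 60 THEN user_id END) AS paid_60
    FROM diffs
"""
paid_15, paid_30, paid_60 = conn.execute(query).fetchone()
print(paid_15, paid_30, paid_60)
```

The buckets here are cumulative (a user who paid after 10 minutes is counted in all three). If disjoint buckets are wanted, the CASE conditions would need lower bounds of 15 and 30 instead of 0.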
I have a query that returns the ratio of issuances per network: the issuances from a specific network in a specific time period, divided by the total issuances from all networks in that period. Right now it only returns the ratios for the current year (year-to-date). I want to include several time periods, such as one month ago, two months ago, and so on. A LEFT JOIN usually works, but I couldn't figure it out for this one. How do I do it?
Here is the query:
SELECT IR1.network,
count(*) / ((select count(*) FROM issuances_extended
where status = 'completed' and
issued_at >= date_trunc('year',current_date)) * 1.) as issuance_ratio_ytd
FROM issuances_extended as IR1 WHERE status = 'completed' and
(issued_at >= date_trunc('year',current_date))
GROUP BY
IR1.network
order by IR1.network
I would break your query into CTEs something like this:
with periods (period_name, period_range) as (
values
('YTD', daterange(date_trunc('year', current_date)::date, null)),
('LY', daterange(date_trunc('year', current_date - interval '1 year')::date,
date_trunc('year', current_date)::date)),
-- 'LM' = last calendar month (previous month start up to this month start)
('LM', daterange(date_trunc('month', current_date - interval '1 month')::date,
date_trunc('month', current_date)::date))
-- Add whatever other intervals you want to see
), period_totals as ( -- Get period totals
select p.period_name, p.period_range, count(*) as total_issuances
from periods p
join issuances_extended i
on i.status = 'completed'
and i.issued_at::date <@ p.period_range -- <@ is the "contained in range" operator
group by p.period_name, p.period_range
)
select p.period_name, p.period_range,
i.network, count(*) as network_issuances,
1.0 * count(*) / p.total_issuances as issuance_ratio
from period_totals p
join issuances_extended i
on i.status = 'completed'
and i.issued_at::date <@ p.period_range
group by p.period_name, p.period_range, i.network, p.total_issuances;
The problem with this is that you get rows instead of columns, but you can use a spreadsheet program or reporting tool to pivot if you need to. This method simplifies the calculations and lets you add whatever period ranges you want by adding more values to the periods CTE.
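To make the rows-versus-columns point concrete, here is a small sketch of the same periods idea, followed by a pivot done on the client side. It is an illustration under assumptions, not the answer's exact query: it uses SQLite via Python's sqlite3, replaces daterange with explicit half-open start/end columns, and uses fixed dates and made-up rows so that the result is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issuances_extended (network TEXT, status TEXT, issued_at TEXT);
    INSERT INTO issuances_extended VALUES
        ('A', 'completed', '2022-01-10'),
        ('A', 'completed', '2022-03-05'),
        ('B', 'completed', '2022-03-20'),
        ('B', 'completed', '2021-06-01'),
        ('B', 'pending',   '2022-03-21');
    -- Half-open ranges [range_start, range_end); names and dates are illustrative.
    CREATE TABLE periods (period_name TEXT, range_start TEXT, range_end TEXT);
    INSERT INTO periods VALUES
        ('YTD',   '2022-01-01', '9999-12-31'),
        ('MARCH', '2022-03-01', '2022-04-01');
""")

query = """
    WITH period_totals AS (
        SELECT p.period_name, p.range_start, p.range_end, COUNT(*) AS total_issuances
        FROM periods p
        JOIN issuances_extended i
          ON i.status = 'completed'
         AND i.issued_at >= p.range_start AND i.issued_at < p.range_end
        GROUP BY p.period_name, p.range_start, p.range_end
    )
    SELECT p.period_name, i.network,
           1.0 * COUNT(*) / p.total_issuances AS issuance_ratio
    FROM period_totals p
    JOIN issuances_extended i
      ON i.status = 'completed'
     AND i.issued_at >= p.range_start AND i.issued_at < p.range_end
    GROUP BY p.period_name, i.network, p.total_issuances
    ORDER BY p.period_name, i.network
"""
ratios = {(period, network): ratio for period, network, ratio in conn.execute(query)}

# Pivot the long result into one entry per network with one key per period.
pivot = {}
for (period, network), ratio in ratios.items():
    pivot.setdefault(network, {})[period] = ratio
print(pivot)
```

Adding another period is then just one more row in the periods table; the pivot step does not need to change.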
Something like this? Obviously not tested
SELECT
mon.t AS period_start,
IR1.network,
count(*)/((select count(*) FROM issuances_extended
where status = 'completed' and
issued_at between mon.t and current_date ) * 1.) as issuance_ratio
FROM
issuances_extended as IR1 ,
(
SELECT
generate_series('2022-01-01'::date,
'2022-07-01'::date, interval '1 month') AS t)
AS mon
WHERE
status = 'completed' and
(issued_at between mon.t and current_date)
-- mon.t must be grouped, otherwise the correlated subquery cannot reference it
GROUP BY
mon.t, IR1.network
ORDER BY
mon.t, IR1.network
I've managed to join these tables, so I am answering my own question for anyone who needs help with the same thing. To add more periods, put each new query in its own LEFT JOIN and reference it in the base query (IR3, IR4, and so on).
SELECT
IR1.network,
count(*) / (
(
select
count(*)
FROM
issuances_extended
where
status = 'completed'
and issued_at >= date_trunc('year', current_date)
) * 1./ 100
) as issuances_ratio_ytd,
max(coalesce(IR2.issuances_ratio_m0, 0)) as issuances_ratio_m0
FROM
issuances_extended as IR1
LEFT JOIN (
SELECT
network,
count(*) / (
(
select
count(*)
FROM
issuances_extended
where
status = 'completed'
and issued_at >= date_trunc('month', current_date)
) * 1./ 100
) as issuances_ratio_m0
FROM
issuances_extended
WHERE
status = 'completed'
and (issued_at >= date_trunc('month', current_date))
GROUP BY
network
) AS IR2 ON IR1.network = IR2.network
WHERE
status = 'completed'
and (issued_at >= date_trunc('year', current_date))
GROUP BY
IR1.network,
IR2.issuances_ratio_m0
order by
IR1.network
I need help with a query where I need to count the consecutive days, like this:
select
a.numcad, a.datapu , f.datapu , nvl(to_char(f.datapu, 'DD'),0)dia,
row_number() over (partition by a.numcad, f.datapu order by f.datapu)particao
from
ronda.r066apu a
left join (select t.numcad, t.numemp, t.datacc, t.datapu
from ronda.r070acc t
where t.datacc >= '21/01/2022'
and t.datacc <= trunc(sysdate)
group by t.numcad, t.numemp, t.datacc, t.datapu)f
on a.numemp = f.numemp
and a.numcad = f.numcad
and a.datapu = f.datapu
where a.numcad = 2675
and A.DATAPU >= '21/01/2022'
and A.DATAPU <= trunc(sysdate)
group by a.numcad, a.datapu, f.datapu, f.datacc
order by a.datapu
The result is:
Between 24/01/2022 and 04/02/2022 there are 12 days. I need to get this count, but the start date will always be the 21st ('21/month/year').
You can try:
SELECT TO_DATE('2022-02-04', 'YYYY-MM-DD') -
TO_DATE('2022-01-24', 'YYYY-MM-DD') + 1
FROM dual
This returns 12: the 11-day difference between the two dates, plus 1 so that both the start and end dates are counted.
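The date arithmetic can be checked outside the database. The sketch below, in plain Python, shows the inclusive day count for the two dates from the question and one possible way to measure the longest run of consecutive days. The helper names and the sample list are made up for illustration.

```python
from datetime import date, timedelta


def inclusive_days(start: date, end: date) -> int:
    """Number of calendar days from start to end, counting both endpoints."""
    return (end - start).days + 1


def longest_run(days) -> int:
    """Length of the longest run of consecutive calendar days in `days`."""
    best = current = 0
    previous = None
    for day in sorted(set(days)):
        if previous is not None and day - previous == timedelta(days=1):
            current += 1  # continues the current run
        else:
            current = 1   # a gap (or the first day) starts a new run
        best = max(best, current)
        previous = day
    return best


span = inclusive_days(date(2022, 1, 24), date(2022, 2, 4))

# Made-up sample: the 12 days from the question plus one isolated earlier day.
sample = [date(2022, 1, 24) + timedelta(days=n) for n in range(12)] + [date(2022, 1, 21)]
run = longest_run(sample)
print(span, run)
```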
My table structure is like this:
root_tstamp                      userId
2022-01-26T00:13:24.725+00:00    d2212
2022-01-26T00:13:24.669+00:00    ad323
2022-01-26T00:13:24.629+00:00    adfae
2022-01-26T00:13:24.573+00:00    adfa3
2022-01-26T00:13:24.552+00:00    adfef
...                              ...
2021-01-26T00:12:24.725+00:00    d2212
2021-01-26T00:15:24.669+00:00    daddfe
2021-01-26T00:14:24.629+00:00    adfda
2021-01-26T00:12:24.573+00:00    466eff
2021-01-26T00:12:24.552+00:00    adfafe
I want to get the number of users in the current year and in the previous year, like below, using SQL.
Date          Users    previous_year
2022-01-01    10       5
2022-01-02    20       15
The query I have used is:
with base as (
select
date(root_tstamp) as current_date
, count(distinct userid) as signup_counts
from table1
group by 1
)
select
t1.current_date
, t1.signup_counts as signups_this_year
, t2.signup_counts as signups_last_year
, t1.signup_counts - t2.signups_counts as difference
from base t1
left join base t2 on t1.current_date = t2.current_date + interval '1 year'
group by t1.current_date
order by t1.current_date Desc
But I am getting this error:
ERROR: column t2.signups_counts does not exist
It's because t2.signup_counts is misspelled as t2.signups_counts.
Another note: your query only has a GROUP BY on current_date, and since the other selected columns are not aggregates, you have to include them in the GROUP BY too.
Here is the modified query:
with base as (
select
date(root_tstamp) as current_date
, count(distinct userid) as signup_counts
from table1
group by 1
)
select
t1.current_date
, t1.signup_counts as signups_this_year
, t2.signup_counts as signups_last_year
, t1.signup_counts - t2.signup_counts as difference
from base t1
left join base t2 on t1.current_date = t2.current_date + interval '1 year' -- t2 is the same day one year earlier
group by t1.current_date, t2.signup_counts, t1.signup_counts
order by t1.current_date Desc
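The direction of the one-year shift in the join is easy to get backwards, so here is a small runnable check. It is a sketch under assumptions: SQLite via Python's sqlite3 stands in for the real database, the rows are a subset of the sample data, the date column is renamed signup_date to avoid clashing with the current_date keyword, and last year's date is shifted forward by one year so it lines up with this year's row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (root_tstamp TEXT, userid TEXT);
    INSERT INTO table1 VALUES
        ('2022-01-26T00:13:24.725+00:00', 'd2212'),
        ('2022-01-26T00:13:24.669+00:00', 'ad323'),
        ('2022-01-26T00:13:24.629+00:00', 'adfae'),
        ('2021-01-26T00:12:24.725+00:00', 'd2212'),
        ('2021-01-26T00:15:24.669+00:00', 'daddfe');
""")

query = """
    WITH base AS (
        -- The first 10 characters of the ISO timestamp are the calendar date.
        SELECT substr(root_tstamp, 1, 10) AS signup_date,
               COUNT(DISTINCT userid) AS signup_counts
        FROM table1
        GROUP BY substr(root_tstamp, 1, 10)
    )
    SELECT t1.signup_date,
           t1.signup_counts AS signups_this_year,
           t2.signup_counts AS signups_last_year
    FROM base t1
    LEFT JOIN base t2 ON t1.signup_date = date(t2.signup_date, '+1 year')
    ORDER BY t1.signup_date DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because base already has one row per date, the outer query needs no GROUP BY; dates with no counterpart a year earlier simply get NULL for signups_last_year.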
I want to retrieve records where a customer makes more than 4 cash deposits totaling at least 1000000 in a day, and this pattern continues for 5 or more days.
I have come up with the query below.
SELECT COUNT(a.txamt) AS "txcount"
, SUM(a.txamt) AS "txsum"
, b.custcd
, a.txdate
FROM tb_transactions a
INNER JOIN tb_accounts b
ON a.acctno = b.acctno
WHERE a.cashflowtype = 'CR'
GROUP BY b.custcd, a.txdate
HAVING COUNT(a.txamt)>4 and SUM(a.txamt)>='1000000'
ORDER BY a.txdate;
But I'm stuck on how to fetch the records if the pattern continues for 5 days.
How to achieve the desired result?
Something like:
SELECT *
FROM (
SELECT t.*,
COUNT( txdate ) OVER ( PARTITION BY custcd
ORDER BY txdate
RANGE BETWEEN INTERVAL '0' DAY PRECEDING
AND INTERVAL '4' DAY FOLLOWING ) AS
num_days
FROM (
select count(a.txamt) as "txcount",
sum(a.txamt) as "txsum",
b.custcd,
a.txdate
from tb_transactions a inner join tb_accounts b on a.acctno=b.acctno
where a.cashflowtype='CR'
group by b.custcd, a.txdate
having count(a.txamt)>4 and sum(a.txamt)>=1000000
) t
)
WHERE num_days = 5
order by txdate; -- the alias "a" is not visible outside the inner query
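The same "five qualifying days inside a five-day window" idea can be tried without interval-based window frames. The sketch below is an assumption-laden illustration: it uses SQLite via Python's sqlite3, a hypothetical table daily that already holds one row per customer per qualifying day (that is, the output of the GROUP BY ... HAVING step), and a self-join in place of the RANGE window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated input: one row per customer per qualifying day.
conn.execute("CREATE TABLE daily (custcd TEXT, txdate TEXT)")
c1_days = [("C1", f"2022-03-0{d}") for d in range(1, 7)]  # six days in a row
c2_days = [("C2", d) for d in ("2022-03-01", "2022-03-02", "2022-03-04",
                               "2022-03-05", "2022-03-06")]  # gap on the 3rd
conn.executemany("INSERT INTO daily VALUES (?, ?)", c1_days + c2_days)

query = """
    -- For each qualifying day, count qualifying days in [txdate, txdate + 4 days].
    -- Exactly 5 means every day of that window qualified.
    SELECT a.custcd, a.txdate AS streak_start
    FROM daily a
    JOIN daily b
      ON b.custcd = a.custcd
     AND b.txdate BETWEEN a.txdate AND date(a.txdate, '+4 days')
    GROUP BY a.custcd, a.txdate
    HAVING COUNT(*) = 5
    ORDER BY a.custcd, a.txdate
"""
streaks = conn.execute(query).fetchall()
print(streaks)
```

Each returned row marks the first day of a five-day streak, so a six-day run shows up as two overlapping streaks. Customer C2 never appears because of the missing day.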
I have a stored procedure that retrieves each employee's daily summary (intime and outtime):
SELECT ads.attendancesumid,
ads.employeeid,
ads.date,
ads.day, -- month day number
ads.intime,
ads.outtime,
-- employee shift intime and outtime
ss.intime,
ss.outtime
FROM employee_attendance_daily_summary ads
JOIN employee emp
ON emp.employeeid = ads.employeeid
JOIN setup_shift ss
ON ss.shiftcode = emp.shiftcode
AND DATEPART(dw, ads.date) = ss.day
WHERE ads.employeeid = 4 -- just to filter one employee
The result of the query is something like this:
Each day is repeated 3 times because table setup_shift (employee shifts) has:
Monday to Sunday for 3 different shift types: DAY, AFTERNOON and NIGHT.
Here is the same info but with the shift type column:
What I need is to get ONLY 1 row per day, with the closest employee shift based on the intime and outtime.
So the desired result should look like this:
Any clue on how to do this? I appreciate it in advance.
I also have cases where intime is 00:00:00 but outtime has a value:
UPDATE:
HERE IS THE SQL FIDDLE
http://sqlfiddle.com/#!6/791cb/7
select ads.attendancesumid,
ads.employeeid,
ads.date,
ads.day,
ads.intime,
ads.outtime,
ss.intime,
ss.outtime
from employee_attendance_daily_summary ads
join employee emp
on emp.employeeid = ads.employeeid
join setup_shift ss
on ss.shiftcode = emp.shiftcode
and datepart(dw, ads.date) = ss.day
where ads.employeeid = 4
and ((abs(datediff(hh,
cast(ads.intime as datetime),
cast(ss.intime as datetime))) between 0 and 2) or
(ads.intime = '00:00:00' and
ss.intime =
(select min(x.intime)
from setup_shift x
where x.shiftcode = ss.shiftcode
and x.intime > (select min(y.intime)
from setup_shift y
where y.shiftcode = x.shiftcode))))
This would be much easier if the times were in seconds after midnight, rather than in a time, datetime, or string format. You can convert them using the formula:
select datepart(hour, intime) * 3600 + datepart(minute, intime) * 60 + datepart(second, intime)
(Part of this is just my own discomfort with all the nested functions needed to handle other data types.)
So, let me assume that you have a series of similar columns measured in seconds. You can then approach this problem by taking the overlap with each shift and choosing the shift with the largest overlap.
with t as (
<your query here>
),
ts as (
select t.*,
(datepart(hour, ads.intime) * 3600 + datepart(minute, ads.intime) * 60 +
datepart(second, ads.intime)
) as e_intimes,
. . .
from t
),
tss as (
select ts.*,
(case when e_intimes >= s_outtimes then 0
when e_outtimes <= s_intimes then 0
else (case when e_outtimes < s_outtimes then e_outtimes else s_outtimes end) -
(case when e_intimes > s_intimes then e_intimes else s_intimes end)
end) as overlap
from ts
)
select tss.*
from (select tss.*,
row_number() over (partition by employeeid, date
order by overlap desc
) as seqnum
from tss -- overlap is defined in tss, not ts
) tss
where seqnum = 1;
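The overlap rule is easier to verify in isolation. Here is a minimal Python sketch of the same idea: convert times to seconds after midnight, compute the overlap with each shift, and keep the shift with the largest overlap. The shift boundaries and function names are hypothetical, and a night shift that crosses midnight would need extra handling that is not shown here.

```python
from datetime import time


def seconds_after_midnight(t: time) -> int:
    return t.hour * 3600 + t.minute * 60 + t.second


def overlap_seconds(e_in: int, e_out: int, s_in: int, s_out: int) -> int:
    """Seconds shared by [e_in, e_out) and [s_in, s_out); 0 if they are disjoint."""
    return max(0, min(e_out, s_out) - max(e_in, s_in))


# Hypothetical shift definitions (start, end); not taken from setup_shift.
shifts = {
    "DAY": (time(6, 0), time(14, 0)),
    "AFTERNOON": (time(14, 0), time(22, 0)),
}


def closest_shift(intime: time, outtime: time) -> str:
    """Name of the shift with the largest overlap with the attendance interval."""
    e_in = seconds_after_midnight(intime)
    e_out = seconds_after_midnight(outtime)
    return max(
        shifts,
        key=lambda name: overlap_seconds(
            e_in, e_out, *(seconds_after_midnight(t) for t in shifts[name])
        ),
    )


best = closest_shift(time(13, 30), time(21, 45))
print(best)
```

In this example the employee overlaps the DAY shift by 30 minutes and the AFTERNOON shift by 7 hours 45 minutes, so AFTERNOON is chosen.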
Try this: I just take the minimum time difference within each employee/day set, using datediff(mi, intime, shift_intime).
Select * from
(select
row_number() over(partition by employeeid, date
order by abs(datediff(mi,intime,shift_intime)) asc) as id,
attendance,employeeid,date,day,intime,outtime,shift_intime,shift_outtime from table
) t -- SQL Server requires an alias on the derived table
where id=1