Want a SQL statement to count number - sql

I have a table have 3 columns id, open_time, close_time, the data looks like this:
then I want a SQL to get result like this:
the rule is : if the date equals to open time then New, if the date > open_time and date < close_time then Open, if the date equals close_time then Closed
how can I write the SQL in Oracle?

First build a table on-the-fly containing all dates from the minimum date in the table until today. You need a recursive query for this.
Then build a table on-the-fly for the three statuses.
Now cross join the two to get all combinations. These are the rows you want.
The rest is counting per day and status, which can be achieved with a join and grouping or with one or more subqueries. I'm showing the join:
with days(day) as
(
select min(open_time) as day from opentimes
union all
select day + 1 from days where day < trunc(sysdate)
)
, statuses as
(
select 'New' as status, 1 as sortkey from dual
union all
select 'Open' as status, 2 as sortkey from dual
union all
select 'Close' as status, 3 as sortkey from dual
)
select
d.day,
s.status,
count(case when (s.status = 'New' and d.day = o.open_time)
or (s.status = 'Open' and d.day = o.close_time)
or (s.status = 'Close' and d.day > cls.open_time and d.day < cls.close_time)
then 1 end) as cnt
from days d
cross join statuses s
join opentimes o on d.day between o.open_time and o.close_time
group by d.day, s.status
order by d.day, max(s.sortkey);

Related

Oracle query to fill in the missing data in the same table

I have a table in oracle which has missing data for a given id. I am trying to figure out the sql to fill in the data from start date: 01/01/2019 to end_dt: 10/1/2020. see the input data below. for status key the data can be filled based on its previous status key. see input:
expected output:
You can use a recursive query to generate the dates, then cross join that with the list of distinct ids available in the table. Then, use window functions to bring the missing key values:
with recursive cte (mon) as (
select date '2019-01-01' mon from dual
union all select add_months(mon, 1) from cte where mon < date '2020-10-01'
)
select i.id,
coalesce(
t.status_key,
lead(t.previous_status_key ignore nulls) over(partition by id order by c.mon)
) as status_key,
coalesce(
t.status_key,
lag(t.status_key ignore nulls, 1, -1) over(partition by id order by c.mon)
) previous_status_key,
c.mon
from cte c
cross join (select distinct id from mytable) i
left join mytable t on t.mon = c.mon and t.id = i.id
You did not give a lot of details on how to bring the missing status_keys and previous_status_keys. Here is what the query does:
status_key is taken from the next non-null previous_status_key
previous_status_key is taken from the last non-null status_key, with a default of -1
You can generate the dates and then use cross join and some additional logic to get the information you want:
with dates (mon) as (
select date '2019-01-01' as mon
from dual
union all
select mon + interval '1' month
from dates
where mon < date '2021-01-01'
)
select d.mon, i.id,
coalesce(t.status_key,
lag(t.status_key ignore nulls) over (partition by i.id order by d.mon)
) as status_key,
coalesce(t.previous_status_key,
lag(t.previous_status_key ignore nulls) over (partition by i.id order by d.mon)
) as previous_status_key
from dates d cross join
(select distinct id from t) i left join
t
on d.mon = t.mon and i.id = i.id;

Same output in two different lateral joins

I'm working on a bit of PostgreSQL to grab the first 10 and last 10 invoices of every month between certain dates. I am having unexpected output in the lateral joins. Firstly the limit is not working, and each of the array_agg aggregates is returning hundreds of rows instead of limiting to 10. Secondly, the aggregates appear to be the same, even though one is ordered ASC and the other DESC.
How can I retrieve only the first 10 and last 10 invoices of each month group?
SELECT first.invoice_month,
array_agg(first.id) first_ten,
array_agg(last.id) last_ten
FROM public.invoice i
JOIN LATERAL (
SELECT id, to_char(invoice_date, 'Mon-yy') AS invoice_month
FROM public.invoice
WHERE id = i.id
ORDER BY invoice_date, id ASC
LIMIT 10
) first ON i.id = first.id
JOIN LATERAL (
SELECT id, to_char(invoice_date, 'Mon-yy') AS invoice_month
FROM public.invoice
WHERE id = i.id
ORDER BY invoice_date, id DESC
LIMIT 10
) last on i.id = last.id
WHERE i.invoice_date BETWEEN date '2017-10-01' AND date '2018-09-30'
GROUP BY first.invoice_month, last.invoice_month;
This can be done with a recursive query that will generate the interval of months for who we need to find the first and last 10 invoices.
WITH RECURSIVE all_months AS (
SELECT date_trunc('month','2018-01-01'::TIMESTAMP) as c_date, date_trunc('month', '2018-05-11'::TIMESTAMP) as end_date, to_char('2018-01-01'::timestamp, 'YYYY-MM') as current_month
UNION
SELECT c_date + interval '1 month' as c_date,
end_date,
to_char(c_date + INTERVAL '1 month', 'YYYY-MM') as current_month
FROM all_months
WHERE c_date + INTERVAL '1 month' <= end_date
),
invocies_with_month as (
SELECT *, to_char(invoice_date::TIMESTAMP, 'YYYY-MM') invoice_month FROM invoice
)
SELECT current_month, array_agg(first_10.id), 'FIRST 10' as type FROM all_months
JOIN LATERAL (
SELECT * FROM invocies_with_month
WHERE all_months.current_month = invoice_month AND invoice_date >= '2018-01-01' AND invoice_date <= '2018-05-11'
ORDER BY invoice_date ASC limit 10
) first_10 ON TRUE
GROUP BY current_month
UNION
SELECT current_month, array_agg(last_10.id), 'LAST 10' as type FROM all_months
JOIN LATERAL (
SELECT * FROM invocies_with_month
WHERE all_months.current_month = invoice_month AND invoice_date >= '2018-01-01' AND invoice_date <= '2018-05-11'
ORDER BY invoice_date DESC limit 10
) last_10 ON TRUE
GROUP BY current_month;
In the code above, '2018-01-01' and '2018-05-11' represent the dates between we want to find the invoices. Based on those dates, we generate the months (2018-01, 2018-02, 2018-03, 2018-04, 2018-05) that we need to find the invoices for.
We store this data in all_months.
After we get the months, we do a lateral join in order to join the invoices for every month. We need 2 lateral joins in order to get the first and last 10 invoices.
Finally, the result is represented as:
current_month - the month
array_agg - ids of all selected invoices for that month
type - type of the selected invoices ('first 10' or 'last 10').
So in the current implementation, you will have 2 rows for each month (if there is at least 1 invoice for that month). You can easily join that in one row if you need to.
LIMIT is working fine. It's your query that's broken. JOIN is just 100% the wrong tool here; it doesn't even do anything close to what you need. By joining up to 10 rows with up to another 10 rows, you get up to 100 rows back. There's also no reason to self join just to combine filters.
Consider instead window queries. In particular, we have the dense_rank function, which can number every row in the result set according to groups:
SELECT
invoice_month,
time_of_month,
ARRAY_AGG(id) invoice_ids
FROM (
SELECT
id,
invoice_month,
-- Categorize as end or beginning of month
CASE
WHEN month_rank <= 10 THEN 'beginning'
WHEN month_reverse_rank <= 10 THEN 'end'
ELSE 'bug' -- Should never happen. Just a fall back in case of a bug.
END AS time_of_month
FROM (
SELECT
id,
invoice_month,
dense_rank() OVER (PARTITION BY invoice_month ORDER BY invoice_date) month_rank,
dense_rank() OVER (PARTITION BY invoice_month ORDER BY invoice_date DESC) month_rank_reverse
FROM (
SELECT
id,
invoice_date,
to_char(invoice_date, 'Mon-yy') AS invoice_month
FROM public.invoice
WHERE invoice_date BETWEEN date '2017-10-01' AND date '2018-09-30'
) AS fiscal_year_invoices
) ranked_invoices
-- Get first and last 10
WHERE month_rank <= 10 OR month_reverse_rank <= 10
) first_and_last_by_month
GROUP BY
invoice_month,
time_of_month
Don't be intimidated by the length. This query is actually very straightforward; it just needed a few subqueries.
This is what it does logically:
Fetch the rows for the fiscal year in question
Assign a "rank" to the row within its month, both counting from the beginning and from the end
Filter out everything that doesn't rank in the 10 top for its month (counting from either direction)
Adds an indicator as to whether it was at the beginning or end of the month. (Note that if there's less than 20 rows in a month, it will categorize more of them as "beginning".)
Aggregate the IDs together
This is the tool set designed for the job you're trying to do. If really needed, you can adjust this approach slightly to get them into the same row, but you have to aggregate before joining the results together and then join on the month; you can't join and then aggregate.

How to fill missing dates between empty records?

I am trying to fill dates between empty records but without success. Tried to do multiple selects method, tried to join, but it seems like I am missing the point. I would like to generate records with missing dates, to generate chart from this block of code. Firstly I would like to have dates filled "manually", later I will reorganise this code and swap that method for an argument.
Can someone help me with that expression?
SELECT
LOG_LAST AS "data",
SUM(run_cnt) AS "Number of runs"
FROM
dual l
LEFT OUTER JOIN "LOG_STAT" stat ON
stat."LOG_LAST" = l."CLASS"
WHERE
new_class = '$arg[klasa]'
--SELECT to_date(TRUNC (SYSDATE - ROWNUM), 'DD-MM-YYYY'),
--0
--FROM dual CONNECT BY ROWNUM < 366
GROUP BY
LOG_LAST
ORDER BY
LOG_LAST
//Edit:
LOG_LAST is just a column with date (for example: 25.04.2018 15:44:21), run_cnt is a column with just a simple number, LOG_STAT is a table that contains LOG_LAST and run_cnt, new_class is a column with name of the record I would like to list records even when they are no existing. For example: I have a records with date 24-09-2018, 23-09-2018, 20-09-2018, 18-09-2018, and I would like to list records even without names and run_cnt, but to generate missing dates in some period
try to fill with isnull:
SELECT
case when trim(LOG_LAST) is null then '01-01-2018'
else isnull(LOG_LAST,'01-01-2018')end AS data,
SUM(isnull(run_cnt,0)) AS "Number of runs"
FROM
dual l
LEFT OUTER JOIN "LOG_STAT" stat ON
stat."LOG_LAST" = l."CLASS"
WHERE
new_class = '$arg[klasa]'
--SELECT to_date(TRUNC (SYSDATE - ROWNUM), 'DD-MM-YYYY'),
--0
--FROM dual CONNECT BY ROWNUM < 366
GROUP BY
LOG_LAST
ORDER BY
LOG_LAST
What you want is more or less:
select d.day, sum(ls.run_cnt)
from all_dates d
left join log_stat ls on trunc(ls.log_last) = d.day
where ls.new_class = :klasa
group by d.day
order by d.day;
The all_dates table in above query is supposed to contain all dates beginning with the minimum klasa log_last date and ending with the maximum klasa log_last date. You get these dates with a recursive query.
with ls as
(
select trunc(log_last) as day, sum(run_cnt) as total
from log_stat
where new_class = :klasa
group by trunc(log_last)
)
, all_dates(day) as
(
select min(day) from ls
union all
select day + 1 from all_dates where day < (select max(day) from ls)
)
select d.day, ls.total
from all_dates d
left join ls on ls.day = d.day
order by d.day;
It's called data densification. From oracle doc Data Densification for Reporting, An example data densification
with ls as
(
select trunc(created) as day,object_type new_class, sum(1) as total
from user_objects
group by trunc(created),object_type
)
, all_dates(day) as
(
select min(day) from ls
union all
select day + 1 from all_dates where day < (select max(day) from ls)
)
select d.day, nvl(ls.total,0),new_class
from all_dates d
left join ls partition by (ls.new_class) on ls.day = d.day
order by d.day;

SQL count display 0s using 1 table

I have a table called requests, and Im looking to count how many requests where the column rideId is not null each day during the last week. I have the following query:
Select count(*), dayname(time) as Day
from request
where time >= (select current_timestamp - interval 7 day) and rideId is not null
group by dayname(time)
order by dayofweek(Day);
How can I make it so it shows me those days where there is no request with rideId and count should be 0
Table is: Request(userId, time, rideId)
Move the not null check into your count, and join to a calendar table to bring in the missing days.
SELECT
t1.dname,
COALESCE(t2.numRides, 0) AS numRides
FROM
(
SELECT 'Monday' AS dname, 2 AS dow UNION ALL
SELECT 'Tuesday', 3 UNION ALL
SELECT 'Wednesday', 4 UNION ALL
SELECT 'Thursday', 5 UNION ALL
SELECT 'Friday', 6 UNION ALL
SELECT 'Saturday', 7 UNION ALL
SELECT 'Sunday', 1
) t1
LEFT JOIN
(
SELECT DAYNAME(time) AS dname, COUNT(rideId) AS numRides
FROM request
WHERE time >= DATE_SUB(CURDATE(),INTERVAL 7 DAY)
GROUP BY DAYNAME(time)
) t2
ON t1.dname = t2.dname
ORDER BY t1.dow;
Select a.day, coalesce(b.cnt, 0) as cnt
from (--select all days here) a
left join
(select dayname(time) as day, count(*) as cnt
from requests
where some_condition
group by day) b
using a.day = b.day
order by day;

BigQuery Monthly Active Users?

I'm currently working off a query from this post. That query is written in Legacy SQL and will not work in my environment. I've modified the query to use the modern SQL functions and updated the SELECT date as date to use timestamp_micros.
I should also mention that the rows I'm trying to select are coming in from Firebase Analytics.
My Query:
SELECT
FORMAT_TIMESTAMP('%Y-%m-%d', TIMESTAMP_MICROS(event.timestamp_micros)) as date,
SUM(CASE WHEN period = 7 THEN users END) as days_07,
SUM(CASE WHEN period = 14 THEN users END) as days_14,
SUM(CASE WHEN period = 30 THEN users END) as days_30
FROM (
SELECT
FORMAT_TIMESTAMP('%Y-%m-%d', TIMESTAMP_MICROS(event.timestamp_micros)) as date,
periods.period as period,
COUNT(DISTINCT user_dim.app_info.app_instance_id) as users
FROM `com_sidearm_fanapp_uiowa_IOS.*` as activity
CROSS JOIN
UNNEST(event_dim) as event
CROSS JOIN (
SELECT
FORMAT_TIMESTAMP('%Y-%m-%d', TIMESTAMP_MICROS(event.timestamp_micros)) as date
FROM `com_sidearm_fanapp_uiowa_IOS.*`
CROSS JOIN
UNNEST(event_dim) as event
GROUP BY event.timestamp_micros
) as dates
CROSS JOIN (
SELECT
period
FROM
(
SELECT 7 as period
UNION ALL
SELECT 14 as period
UNION ALL
SELECT 30 as period
)
) as periods
WHERE
dates.date >= activity.date
AND
SAFE_CAST(FLOOR(TIMESTAMP_DIFF(dates.date, activity.date, DAY)/periods.period) AS INT64) = 0
GROUP BY 1,2
)
CROSS JOIN
UNNEST(event_dim) as event
GROUP BY date
ORDER BY date DESC
Column name period is ambiguous at [24:13] error.
to fix this particular error - you should fix below
CROSS JOIN (
SELECT
period
FROM
(SELECT 7 as period),
(SELECT 14 as period),
(SELECT 30 as period)
) as periods
so it should look like:
CROSS JOIN (
SELECT
period
FROM
(SELECT 7 as period UNION ALL
SELECT 14 as period UNION ALL
SELECT 30 as period)
) as periods
Answer on your updated question
Try below. I didn't have chance to test it but hope it can help you fix your query
SELECT
date,
SUM(CASE WHEN period = 7 THEN users END) as days_07,
SUM(CASE WHEN period = 14 THEN users END) as days_14,
SUM(CASE WHEN period = 30 THEN users END) as days_30
FROM (
SELECT
activity.date as date,
periods.period as period,
COUNT(DISTINCT user) as users
FROM (
SELECT
event.timestamp_micros as date,
user_dim.app_info.app_instance_id as user
FROM `yourTable` CROSS JOIN UNNEST(event_dim) as event
) as activity
CROSS JOIN (
SELECT
event.timestamp_micros as date
FROM `yourTable` CROSS JOIN UNNEST(event_dim) as event
GROUP BY event.timestamp_micros
) as dates
CROSS JOIN (
SELECT period
FROM
(SELECT 7 as period UNION ALL
SELECT 14 as period UNION ALL
SELECT 30 as period)
) as periods
WHERE dates.date >= activity.date
AND SAFE_CAST(FLOOR(TIMESTAMP_DIFF(TIMESTAMP_MICROS(dates.date), TIMESTAMP_MICROS(activity.date), DAY)/periods.period) AS INT64) = 0
GROUP BY 1,2
)
GROUP BY date
ORDER BY date DESC