Count of On-Going Transaction in BigQuery - google-bigquery

I have this table:
book_name  borrow_date  return_date
A          2022-08-01   2022-08-03
B          2022-08-03   2022-09-01
C          2022-08-15   2022-09-25
D          2022-09-15   2022-09-18
E          2022-09-17   2022-10-15
And a table with the first date of each month:
summary_month
2022-08-01
2022-09-01
2022-10-01
I would like to count how many books are currently borrowed based on the summary_month. The result I am looking for is:
summary_month  count_book  list_book
2022-08-01     3           A,B,C
2022-09-01     4           B,C,D,E
2022-10-01     1           E
I am stuck: I am only able to aggregate them by borrow date, with this query:
count(distinct case when summary_month = date_trunc(borrow_date,month) then book_name end) count_book
Is it possible to get the result I am hoping for? Really need anyone's help and advice. Thank you.

Consider the option below:
select summary_month,
  count(distinct book_name) as count_book,
  string_agg(book_name) as list_book
from your_table, unnest(generate_date_array(
    date_trunc(borrow_date, month),
    date_trunc(return_date, month),
    interval 1 month)
  ) as summary_month
group by summary_month
Applied to the sample data in your question, this returns the expected counts of 3, 4, and 1 books for August, September, and October.

Something like this can work:
with
input as (
  select 'A' book_name, cast('2022-08-01' as date) borrow_date, cast('2022-08-03' as date) return_date union all
  select 'B', '2022-08-03', '2022-09-01' union all
  select 'C', '2022-08-15', '2022-09-25' union all
  select 'D', '2022-09-15', '2022-09-18' union all
  select 'E', '2022-09-17', '2022-10-15'
),
list_month as (
  select distinct
    * except(days_borrowed),
    date_trunc(days_borrowed, month) as month
  from input,
    unnest(generate_date_array(borrow_date, return_date)) as days_borrowed
)
select
  month,
  count(distinct book_name) as count_distinct_book,
  string_agg(distinct book_name) as book_name_list
from list_month
group by 1
order by 1
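
Both answers rest on the same idea: expand each loan into every month it touches, then group by month. As a cross-check outside BigQuery, here is a minimal Python sketch of that expansion using the sample rows from the question. The names `months_touched` and `books_per_month` are made up for illustration and do not come from either answer.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (book_name, borrow_date, return_date).
LOANS = [
    ("A", date(2022, 8, 1), date(2022, 8, 3)),
    ("B", date(2022, 8, 3), date(2022, 9, 1)),
    ("C", date(2022, 8, 15), date(2022, 9, 25)),
    ("D", date(2022, 9, 15), date(2022, 9, 18)),
    ("E", date(2022, 9, 17), date(2022, 10, 15)),
]


def months_touched(start, end):
    """Yield the first day of every month from start's month to end's month."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


def books_per_month(loans):
    """Map each month start to the sorted list of books on loan during it."""
    by_month = defaultdict(set)
    for book, borrowed, returned in loans:
        for month_start in months_touched(borrowed, returned):
            by_month[month_start].add(book)
    return {m: sorted(books) for m, books in sorted(by_month.items())}


for month_start, books in books_per_month(LOANS).items():
    print(month_start, len(books), ",".join(books))
```

For the sample rows this prints 2022-08-01 with 3 books (A,B,C), 2022-09-01 with 4 (B,C,D,E), and 2022-10-01 with 1 (E), which is the result the question asks for.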

Related

BigQuery Order By one financial year(52 weeks)

I have this dataset right now.
Date        Sales     Group
2022-11-02  xxxxxxxx  A
2022-11-03  xxxxxx    A
2022-11-03  xxxxxx    B
2021-11-03  xxxxxx    A
2021-11-04  xxxxxx    B
2021-11-04  xxxxxx    A
I want to order my data like this, so that dates one year apart sit next to each other:
Date        Sales     Group
2022-11-02  xxxxxxxx  A
2021-11-03  xxxxxx    A
2022-11-03  xxxxxx    A
2021-11-04  xxxxxx    A
2022-11-03  xxxxxx    B
2021-11-04  xxxxxx    B
(because the dates are 52 weeks apart)
Is there a way to do this?
I want to avoid a join!
To be clear, I need the first row's date and the second row's date to be exactly 52 weeks apart,
i.e. date_sub(second_row['date'],interval 52 week) == first row['date']
Sorry for the confusion.
Consider below instead of previous answer.
WITH sample_data AS (
SELECT DATE '2022-11-02' Date, 'xxxxxxxx' Sales, 'A' `Group` UNION ALL
SELECT '2022-11-03' Date, 'xxxxxx' Sales, 'A' `Group` UNION ALL
SELECT '2022-11-03' Date, 'xxxxxx' Sales, 'B' `Group` UNION ALL
SELECT '2021-11-03' Date, 'xxxxxx' Sales, 'A' `Group` UNION ALL
SELECT '2021-11-04' Date, 'xxxxxx' Sales, 'B' `Group` UNION ALL
SELECT '2021-11-04' Date, 'xxxxxx' Sales, 'A' `Group`
)
SELECT *
FROM sample_data
ORDER BY `Group`,
EXTRACT(WEEK FROM Date) || EXTRACT(DAYOFWEEK FROM Date),
EXTRACT(YEAR FROM Date) DESC;
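
The ordering idea in this query (group first, then week and day of week, then year descending) can be sketched in Python for a quick sanity check. This is only an approximation under one loud assumption: it uses ISO week numbers from `date.isocalendar()`, which are not the same as BigQuery's Sunday-based `WEEK`, and dates 52 weeks apart can land in different week numbers around 53-week years. `sort_key` is a hypothetical helper name.

```python
from datetime import date

# Sample rows from the question: (Date, Group). Sales values are omitted.
ROWS = [
    (date(2022, 11, 2), "A"),
    (date(2022, 11, 3), "A"),
    (date(2022, 11, 3), "B"),
    (date(2021, 11, 3), "A"),
    (date(2021, 11, 4), "B"),
    (date(2021, 11, 4), "A"),
]


def sort_key(row):
    """Order by group, then ISO week and weekday, then year descending."""
    day, group = row
    iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
    # Assumption: ISO weeks stand in for BigQuery's WEEK part here.
    return (group, iso[1], iso[2], -iso[0])


ordered = sorted(ROWS, key=sort_key)
for day, group in ordered:
    print(day, group)
```

With the sample rows, each 2022 date is followed directly by the 2021 date that falls 52 weeks earlier within the same group.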

Grouping by Date inclusivity

Here is the data I'm working with:
Accountid  Month
123        08/01/2021
123        09/01/2021
123        03/01/2022
123        04/01/2022
123        05/01/2022
123        06/01/2022
I'm trying to insert it into a new table where the data looks like this:
Accountid  Start Month  End Month
123        08/01/2021   09/01/2021
123        03/01/2022   06/01/2022
I'm not sure how to separate them with the gap, and group by the account id in this case.
Thanks in advance
In Oracle 12c and later you can also use match_recognize for gaps-and-islands problems, which lets you define the grouping rules (islands) in a more readable and natural way.
select *
from input_
match_recognize(
  partition by accountid
  order by month asc
  measures
    first(month) as start_month,
    last(month) as end_month
  /*Any month followed by any number of subsequent month */
  pattern(any_ next*)
  define
    /*Next is the month right after the previous one*/
    next as months_between(month, prev(month)) = 1
)
ACCOUNTID  START_MONTH  END_MONTH
123        2021-08-01   2021-09-01
123        2022-03-01   2022-06-01
That's a gaps and islands problem; one option to do it is:
Sample data:
with test (accountid, month) as
  (select 123, date '2021-01-08' from dual union all
   select 123, date '2021-01-09' from dual union all
   select 123, date '2021-01-03' from dual union all
   select 123, date '2021-01-04' from dual union all
   select 123, date '2021-01-05' from dual union all
   select 123, date '2021-01-06' from dual
  ),
Query begins here:
temp as
  (select accountid, month,
          to_char(month, 'J') - row_number() over
            (partition by accountid order by month) diff
   from test
  )
select accountid,
       min(month) as start_month,
       max(month) as end_month
from temp
group by accountid, diff
order by accountid, start_month;

 ACCOUNTID START_MONT END_MONTH
---------- ---------- ----------
       123 03/01/2021 06/01/2021
       123 08/01/2021 09/01/2021
Although related to MS SQL Server, have a look at Introduction to Gaps and Islands Analysis; should be interesting reading for you, I presume.
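
The sample data in the answer above steps by consecutive days, while the question's rows are one month apart. Here is a hedged Python sketch of the same "value minus row number" trick adapted to monthly data. It assumes each account has at most one row per month; `month_index` and `islands` are illustrative names, not part of either answer.

```python
from datetime import date
from itertools import groupby

# Sample rows from the question: (accountid, month).
ROWS = [
    (123, date(2021, 8, 1)),
    (123, date(2021, 9, 1)),
    (123, date(2022, 3, 1)),
    (123, date(2022, 4, 1)),
    (123, date(2022, 5, 1)),
    (123, date(2022, 6, 1)),
]


def month_index(d):
    """Number months consecutively so adjacent months differ by exactly 1."""
    return d.year * 12 + d.month


def islands(rows):
    """Return (accountid, start_month, end_month) for each run of months."""
    result = []
    for account, account_rows in groupby(sorted(rows), key=lambda r: r[0]):
        months = [m for _, m in account_rows]
        # Within a run, month_index minus position stays constant.
        keyed = [(month_index(m) - i, m) for i, m in enumerate(months)]
        for _, run in groupby(keyed, key=lambda k: k[0]):
            run_months = [m for _, m in run]
            result.append((account, run_months[0], run_months[-1]))
    return result


for account, start, end in islands(ROWS):
    print(account, start, end)
```

For the question's rows this yields two islands for account 123: 2021-08-01 to 2021-09-01 and 2022-03-01 to 2022-06-01.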

Expand a query from a date to a range of dates

I have a query as below:
SELECT
  "2022-05-10 00:00:00 UTC" AS date_,
  COUNT(salesId) AS `total-sales`
FROM
  `project1.sales.sales-growth`
WHERE
  (promoDate BETWEEN "2022-05-10 00:00:00 UTC"
    AND "2022-05-11 00:00:00 UTC")
  OR
  (purchaseDate BETWEEN "2022-05-10 00:00:00 UTC"
    AND "2022-05-11 00:00:00 UTC")
This shows the total sales for a particular date (2022-05-10), as below:
date_ total-sales
2022-05-10 560
I am wondering how I can change the query to show the sales per day for the whole month of May (desired output):
date_ total-sales
2022-05-01 567
2022-05-02 687
2022-05-03 878
... ...
2022-05-31 500
One option: generate a date array for the target time range, group by those dates and compare those dates in the WHERE clause with your two date columns.
With an assumed table of yours:
WITH your_table AS
(
  SELECT TIMESTAMP("2022-05-01 15:30:00+00") AS promoDate, NULL AS purchaseDate, 1 AS salesId
  UNION ALL
  SELECT NULL AS promoDate, TIMESTAMP("2022-05-01 18:30:00+00") AS purchaseDate, 1 AS salesId
  UNION ALL
  SELECT TIMESTAMP("2022-05-02 15:30:00+00") AS promoDate, NULL AS purchaseDate, 1 AS salesId
  UNION ALL
  SELECT TIMESTAMP("2022-05-03 15:30:00+00") AS promoDate, NULL AS purchaseDate, 1 AS salesId
  UNION ALL
  SELECT TIMESTAMP("2022-05-04 15:30:00+00") AS promoDate, NULL AS purchaseDate, 1 AS salesId
  UNION ALL
  SELECT NULL AS promoDate, TIMESTAMP("2022-05-04 18:30:00+00") AS purchaseDate, 1 AS salesId
)
SELECT
  date_,
  COUNT(salesId) AS total_sales
FROM
  UNNEST(GENERATE_DATE_ARRAY("2022-05-01", "2022-05-31")) AS date_, your_table
WHERE
  date_ = EXTRACT(DATE FROM promoDate)
  OR
  date_ = EXTRACT(DATE FROM purchaseDate)
GROUP BY
  date_
Output:
Row  date_       total_sales
1    2022-05-01  2
2    2022-05-02  1
3    2022-05-03  1
4    2022-05-04  2
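
To make the matching rule concrete, here is a small Python sketch of the same per-day count, using the assumed sample rows from the answer. A row counts toward a day when either its promo date or its purchase date falls on that day. The timestamps are written as naive datetimes and treated as UTC, which is an assumption, and `daily_sales` is a made-up helper name.

```python
from collections import Counter
from datetime import date, datetime

# Assumed sample rows from the answer: (promoDate, purchaseDate, salesId).
SALES = [
    (datetime(2022, 5, 1, 15, 30), None, 1),
    (None, datetime(2022, 5, 1, 18, 30), 1),
    (datetime(2022, 5, 2, 15, 30), None, 1),
    (datetime(2022, 5, 3, 15, 30), None, 1),
    (datetime(2022, 5, 4, 15, 30), None, 1),
    (None, datetime(2022, 5, 4, 18, 30), 1),
]


def daily_sales(sales, first_day, last_day):
    """Count rows per day where the promo or purchase date is that day."""
    counts = Counter()
    for promo, purchase, _sales_id in sales:
        # A set keeps a row from counting twice when both dates match a day.
        days = {ts.date() for ts in (promo, purchase) if ts is not None}
        for day in days:
            if first_day <= day <= last_day:
                counts[day] += 1
    return dict(sorted(counts.items()))


for day, total in daily_sales(SALES, date(2022, 5, 1), date(2022, 5, 31)).items():
    print(day, total)
```

As with the query, days that have no matching rows do not appear in the result at all.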

I want to get the count of roll number for each month

ID    STUDENT_ID    STATUS_DATE
1002  434120010026  25-FEB-22
1000  434120010026  03-MAY-03
1001  434120010026  25-FEB-22
1020  434120020023  18-MAR-22
1021  434120020025  18-MAR-22
1022  434120020025  16-MAR-22
I tried this:
select count(*),
trunc(status_date, 'mm')
from test_studentattendance
group by trunc(status_date, 'mm');
This returns the number of rows in each month, not the number of roll numbers.
COUNT(*)  TRUNC(STATUS_DATE,'MM')
1         01-MAY-03
2         01-FEB-22
3         01-MAR-22
COUNT(*) counts the number of rows in each group, so a student who appears several times in the same month is counted several times.
To get the number of students on the roll each month, you need to COUNT the DISTINCT identifier for each student (which I assume is student_id):
SELECT COUNT(DISTINCT student_id) AS number_of_students,
TRUNC(status_date, 'mm') AS month
FROM test_studentattendance
GROUP BY TRUNC(status_date, 'mm');
Which, for your sample data:
CREATE TABLE test_studentattendance (ID, STUDENT_ID, STATUS_DATE) AS
SELECT 1002, 434120010026, DATE '2022-02-25' FROM DUAL UNION ALL
SELECT 1000, 434120010026, DATE '2003-05-03' FROM DUAL UNION ALL
SELECT 1001, 434120010026, DATE '2022-02-25' FROM DUAL UNION ALL
SELECT 1020, 434120020023, DATE '2022-03-18' FROM DUAL UNION ALL
SELECT 1021, 434120020025, DATE '2022-03-18' FROM DUAL UNION ALL
SELECT 1022, 434120020025, DATE '2022-03-16' FROM DUAL;
Outputs:
NUMBER_OF_STUDENTS  MONTH
1                   2022-02-01 00:00:00
1                   2003-05-01 00:00:00
2                   2022-03-01 00:00:00
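
The difference between counting rows and counting distinct students can be sketched in Python with the same sample rows. This is only an illustration of the COUNT(DISTINCT ...) idea, and `students_per_month` is a hypothetical name.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (id, student_id, status_date).
ROWS = [
    (1002, 434120010026, date(2022, 2, 25)),
    (1000, 434120010026, date(2003, 5, 3)),
    (1001, 434120010026, date(2022, 2, 25)),
    (1020, 434120020023, date(2022, 3, 18)),
    (1021, 434120020025, date(2022, 3, 18)),
    (1022, 434120020025, date(2022, 3, 16)),
]


def students_per_month(rows):
    """Count distinct student ids per month (month = first day of month)."""
    seen = defaultdict(set)
    for _row_id, student_id, status_date in rows:
        # A set drops repeat appearances of a student within the month.
        seen[status_date.replace(day=1)].add(student_id)
    return {month: len(ids) for month, ids in sorted(seen.items())}


for month, count in students_per_month(ROWS).items():
    print(month, count)
```

February 2022 has two rows but only one student, so it counts as 1, while March 2022 has three rows from two students and counts as 2.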

oracle count based on month

I am attempting to write an Oracle SQL query.
I am looking for a solution to something like the following. Here is the data I have:
start_date end_date customer
01-01-2012 31-06-2012 a
01-01-2012 31-01-2012 b
01-02-2012 31-03-2012 c
I want the count of customers for each month in that date period. My result should look like this:
Month : Customer Count
JAN-12 : 2
FEB-12 : 2
MAR-12 : 2
APR-12 : 1
MAY-12 : 1
JUN-12 : 1
One option would be to generate the months separately in another query and join that to your data table (note that I'm assuming that you intended customer A to have an end-date of June 30, 2012 since there is no June 31).
with mnths as (
  select add_months( date '2012-01-01', level - 1 ) mnth
  from dual
  connect by level <= 6 ),
data as (
  select date '2012-01-01' start_date, date '2012-06-30' end_date, 'a' customer from dual union all
  select date '2012-01-01', date '2012-01-31', 'b' from dual union all
  select date '2012-02-01', date '2012-03-31', 'c' from dual
)
select mnths.mnth, count(*)
from data,
     mnths
where mnths.mnth between data.start_date and data.end_date
group by mnths.mnth
order by mnths.mnth

MNTH        COUNT(*)
--------- ----------
01-JAN-12          2
01-FEB-12          2
01-MAR-12          2
01-APR-12          1
01-MAY-12          1
01-JUN-12          1
6 rows selected.
WITH TMP(monthyear,start_date,end_date,customer) AS (
  select LAST_DAY(start_date),
         CAST(ADD_MONTHS(start_date, 1) AS DATE),
         end_date,
         customer
  from data
  union all
  select LAST_DAY(start_date),
         CAST(ADD_MONTHS(start_date, 1) AS DATE),
         end_date,
         customer
  from TMP
  where LAST_DAY(end_date) >= LAST_DAY(start_date)
)
SELECT TO_CHAR(MonthYear, 'MON-YY') TheMonth,
       Count(Customer) Customers
FROM TMP
GROUP BY MonthYear
ORDER BY MonthYear;
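
The calendar-join approach from the first answer (generate the month starts, then match each one against every customer's date range) can be sketched in Python as follows. It assumes, as that answer does, that customer a ends on 30 June 2012. `month_starts` and `customers_per_month` are illustrative names only.

```python
from datetime import date

# Sample rows from the question: (start_date, end_date, customer).
# Assumption carried over from the answer: customer a ends on 2012-06-30.
PERIODS = [
    (date(2012, 1, 1), date(2012, 6, 30), "a"),
    (date(2012, 1, 1), date(2012, 1, 31), "b"),
    (date(2012, 2, 1), date(2012, 3, 31), "c"),
]


def month_starts(first, count):
    """Return `count` consecutive month starts beginning at first's month."""
    starts = []
    year, month = first.year, first.month
    for _ in range(count):
        starts.append(date(year, month, 1))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return starts


def customers_per_month(periods, months):
    """Count the periods that contain each month start (inclusive bounds)."""
    return {
        m: sum(1 for start, end, _customer in periods if start <= m <= end)
        for m in months
    }


for month, count in customers_per_month(PERIODS, month_starts(date(2012, 1, 1), 6)).items():
    print(month, count)
```

Unlike the inner join in the SQL, this sketch reports a month with no active customers as 0 instead of leaving it out. It also only tests the first day of each month, so a period that starts mid-month is not counted for that month, which matches the BETWEEN condition in the first answer.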