Use of week of year & subsequent days in BigQuery

I need to show distinct users per week. I have a visit-date column and a user ID; it is a big table with 1 billion rows.
I can change the date column in the CSVs to year, month, and day columns, but how do I deduce the week from that in the query?
I could calculate the week when generating the CSVs, but that is a big processing step.
I also need to show how many distinct users visit on consecutive days, and I am looking for a workaround as there is no date type.
Any ideas?

To get the week of year number:
SELECT STRFTIME_UTC_USEC(TIMESTAMP('2015-5-19'), '%W')
20
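As a quick cross-check outside BigQuery, Python's strftime uses the C-style %W convention (weeks start on Monday, and days before the first Monday fall in week 00), which appears to match the result above. This is only a sanity check of the number, not BigQuery code.

```python
from datetime import date

# %W: week of the year, with Monday as the first day of the week.
week_of_year = date(2015, 5, 19).strftime("%W")
print(week_of_year)  # 20
```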

If you have your date as a timestamp (i.e., microseconds since the epoch), you can use the UTC_USEC_TO_DAY/UTC_USEC_TO_WEEK functions. Alternatively, if you have a formatted date string (e.g. "2012/03/13 19:00:06 -0700"), you can call PARSE_UTC_USEC to turn the string into a timestamp and then use that to get the week or day.
To see an example, try:
SELECT LEFT((format_utc_usec(day)),10) as day, cnt
FROM (
SELECT day, count(*) as cnt
FROM (
SELECT UTC_USEC_TO_DAY(PARSE_UTC_USEC(created_at)) as day
FROM [publicdata:samples.github_timeline])
GROUP BY day
ORDER BY cnt DESC)
To show week, just change UTC_USEC_TO_DAY(...) to UTC_USEC_TO_WEEK(..., 0) (the 0 at the end is to indicate the week starts on Sunday). See the documentation for the above functions at https://developers.google.com/bigquery/docs/query-reference for more information.
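The two counts the question asks for (distinct users per week, and users who come back the day after a visit) can be sketched in plain Python to make the grouping logic explicit. This is a minimal illustration, not BigQuery code: the sample visits and the names week_start, distinct_per_week, and returning_per_day are assumptions made up for the example, and weeks are taken to start on Sunday as in UTC_USEC_TO_WEEK(..., 0).

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample data: (user_id, visit_date) pairs.
visits = [
    ("u1", date(2015, 5, 17)),  # a Sunday
    ("u1", date(2015, 5, 18)),
    ("u2", date(2015, 5, 18)),
    ("u2", date(2015, 5, 19)),
    ("u3", date(2015, 5, 24)),  # the following Sunday
]

def week_start(d):
    """Return the Sunday on or before d (weeks start on Sunday)."""
    # date.weekday(): Monday=0 ... Sunday=6, so Sunday maps to an offset of 0.
    return d - timedelta(days=(d.weekday() + 1) % 7)

# Distinct users per week: collect user ids into one set per week start.
users_by_week = defaultdict(set)
for user, day in visits:
    users_by_week[week_start(day)].add(user)
distinct_per_week = {wk: len(users) for wk, users in users_by_week.items()}

# Distinct users per day who also visited on the previous day.
seen = set(visits)
returning = defaultdict(set)
for user, day in visits:
    if (user, day - timedelta(days=1)) in seen:
        returning[day].add(user)
returning_per_day = {day: len(users) for day, users in returning.items()}

print(distinct_per_week)
print(returning_per_day)
```

In SQL the same idea is a GROUP BY on the truncated week with a distinct count of the user id, and a self-join on day = previous day + 1 for the day-after-day figure.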

Related

In Bigquery SQL: How to fetch previous week, specified week and next week data?

Scenario: From BigQuery, I have to fetch the data for the specified date's week, plus the previous week and the following week. The week starts on Wednesday.
Tried Query:
Select * from table
and extract(week(wednesday) from Calendar_Day) >= (extract(week(wednesday) from PARSE_DATE('%d/%m/%Y','21/10/2020')) - 1)
and extract(week(wednesday) from Calendar_Day) >= (extract(week(wednesday) from PARSE_DATE('%d/%m/%Y','21/10/2020') ))
and extract(week(wednesday) from Calendar_Day) <= (extract(week(wednesday) from PARSE_DATE('%d/%m/%Y','21/10/2020')) + 1)
But this is not working for me.
Need help in resolving this. Thanks in Advance!
EXTRACT the week, as the code already does, and also the year, because week numbers repeat every year.
GROUP BY the week and year. At this point I find it handy to make a STRUCT from the remaining fields, as it simplifies the remaining code.
Make another query on top of the one that did the GROUP BY (I used a WITH clause). In this last query, use LEAD and LAG over a WINDOW ordered by year and week.
Here's an example from a public dataset.
WITH
data_by_week AS (
SELECT
EXTRACT(year FROM date) AS year,
EXTRACT(week(wednesday) FROM date) AS week,
struct(
SUM(new_tested) as total_new_tested,
sum(new_recovered) as total_new_recovered
) as week_data
FROM
`bigquery-public-data.covid19_open_data.covid19_open_data`
GROUP BY
year,
week )
SELECT
year,
week,
LAG(week_data) OVER window_by_week AS previous_week,
week_data AS current_week,
LEAD(week_data) OVER window_by_week AS following_week
FROM
data_by_week
WINDOW
window_by_week AS ( ORDER BY year, week)
ORDER BY
year,
week
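If the goal is only to filter rows to the three weeks around a given date, a different technique from the LAG/LEAD query above is to compute a date range instead of comparing week numbers, which also avoids trouble at year boundaries. The sketch below shows the arithmetic in Python; the helper names week_start and three_week_window are hypothetical. In BigQuery Standard SQL the week start could presumably be obtained with DATE_TRUNC(..., WEEK(WEDNESDAY)) and the filter written as Calendar_Day >= start AND Calendar_Day < end.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday(): Monday=0, ..., Sunday=6

def week_start(d, first_weekday=WEDNESDAY):
    """Return the first day of the week containing d."""
    return d - timedelta(days=(d.weekday() - first_weekday) % 7)

def three_week_window(d):
    """Half-open range [start, end) covering the previous, current and next week."""
    start = week_start(d) - timedelta(days=7)
    return start, start + timedelta(days=21)

start, end = three_week_window(date(2020, 10, 21))
print(start, end)  # 2020-10-14 2020-11-04
```

Because the range is half-open, the end date itself (the Wednesday that begins the week after next) is excluded.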

Why do I get different results from my weekly code vs. per-week code?

Why am I getting different results when I compare the weekly results with the result of a query run for an individual week? Does it have something to do with the timestamp?
This is the code for all the weeks:
select date_trunc('week',date_joined) as week, COUNT(*) as count from auth_user
where date_joined>='01-01-2019' and date_joined<='31-03-2019'
group by week
order by week
This is the resulting table:
first result
This is the code for getting an individual week:
select COUNT(*) from auth_user where date_joined>='31-12-2018' and date_joined<='06-01-2019'
This is the result for the first week: second result
I'd say that date_joined is a timestamp, so '06-01-2019' is taken as midnight at the start of January 6th and your second query misses the entries from later that day.
Try with
AND date_joined < '2019-01-07'
Also, you should use ISO notation: YYYY-MM-DD
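The effect is easy to reproduce outside the database. The sketch below uses made-up sign-up timestamps (an assumption, not data from the question) to show why an inclusive upper bound at midnight undercounts and a half-open range does not.

```python
from datetime import datetime

# Hypothetical sign-up timestamps.
joined = [
    datetime(2018, 12, 31, 9, 0),
    datetime(2019, 1, 6, 0, 0),    # exactly midnight
    datetime(2019, 1, 6, 15, 30),  # later on January 6th
]

start = datetime(2018, 12, 31)

# Inclusive upper bound at midnight drops rows later on January 6th.
closed = [t for t in joined if start <= t <= datetime(2019, 1, 6)]
# Half-open range [start, next day) keeps the whole of January 6th.
half_open = [t for t in joined if start <= t < datetime(2019, 1, 7)]

print(len(closed), len(half_open))  # 2 3
```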

How to get last and first date of every week from predefined date table (oracle SQL)

I have a table D_date in which all dates of a year are defined, along with attributes such as week number and quarter number. I just want to get the first and last date of every week of the year 2015.
A sample of the D_date table is attached.
It is simple min/max if I understand you right
SELECT calendar_year_nbr, week, min(actual_date),max(actual_date)
FROM D_date
GROUP BY calendar_year_nbr, week
I just want to get first and last date of every week of year 2015.
Since you have precomputed values already stored in the table, you could directly use MIN and MAX as aggregate functions along with GROUP BY.
For example,
SELECT MIN(actual_date) min_date,
MAX(actual_date) max_date,
calendar_week_nbr
FROM d_date
WHERE calendar_year_nbr = 2015
GROUP BY calendar_week_nbr
ORDER BY min_date;
Another way is to use the ROW_NUMBER() OVER() analytic function.

How can I cross join the following query results with a table of dates

I am looking for a query that gives me the daily playing time. The start (first_date) and end (last_update) dates are given as shown in the table. The following query gives me the sum of playing time on a given date. How can I extend it to produce a row for every day from the first day to the last day, with the query data filled in and 0 on dates when no game is played?
SELECT startTime, SUM(duration) as sum
FROM myTable
WHERE startTime = endTime
GROUP BY startTime
To show dates when no one plays, you will need to create a table days with a date field day so that you can do a left join (100 years is only 36,500 rows).
Using a select: Generate days from date range
This one uses a stored procedure in MSQL
I will assume that if a play passes midnight a new record begins, so I can simplify my code and remove the time from the datetime field.
SELECT d.day, COALESCE(SUM(m.duration), 0) as sum
FROM
days d
left join myTable m
on CONVERT(date, m.starttime) = d.day
GROUP BY d.day
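The calendar-table idea can be sketched outside SQL to show what the left join is doing: every day in the range gets a row first, and play durations are then added onto it, so idle days stay at 0. The sample rows and the function name daily_totals are assumptions for illustration only.

```python
from datetime import date, timedelta

# Hypothetical rows: (play_date, duration_in_minutes).
plays = [
    (date(2014, 4, 1), 30),
    (date(2014, 4, 1), 15),
    (date(2014, 4, 3), 20),
]

def daily_totals(rows, first_day, last_day):
    """Total duration per day from first_day to last_day, with 0 for idle days."""
    # Start every day in the range at 0 (this plays the role of the days table).
    totals = {first_day + timedelta(days=i): 0
              for i in range((last_day - first_day).days + 1)}
    for day, duration in rows:
        if day in totals:  # ignore plays outside the requested range
            totals[day] += duration
    return totals

totals = daily_totals(plays, date(2014, 4, 1), date(2014, 4, 4))
for day, total in totals.items():
    print(day, total)
```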
If I understand correctly, you could try:
SELECT SUM(duration) AS duration, date
FROM myTable
WHERE date <= 20140430
AND date >= 20140401
GROUP BY date
This would get the total time played for each date between April 1 and April 30.
As far as showing 0 for dates not in the table, I don't know.
Also, the table you posted doesn't show a duration column, but the query you posted does, so I went ahead and used it.

Data for specific date

My report gets data for the 1st of the current month. If the 1st has not yet arrived, how would I make the report show the data for the 1st of the previous month?
Thanks.
Simply use a select top 1 from your table, filtering by extract(day from yourDateColumn) = 1 to get only the rows with the data for the 1st day of any month, and order them in descending order by your date column (order by yourDateColumn desc), so that you always get the 1st day of the last available month in your table.
Docs for Oracle EXTRACT function
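The selection rule described above (keep only rows dated the 1st, then take the most recent one) can be sketched in Python as follows. The function name latest_first_of_month and the sample dates are hypothetical.

```python
from datetime import date

def latest_first_of_month(available_dates):
    """Return the most recent date that falls on the 1st, or None if there is none."""
    firsts = [d for d in available_dates if d.day == 1]
    return max(firsts) if firsts else None

# Hypothetical dates present in the report table; 1 June is not loaded yet.
available = [date(2015, 4, 1), date(2015, 4, 15), date(2015, 5, 1), date(2015, 5, 28)]
print(latest_first_of_month(available))  # 2015-05-01
```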