SQL count recurring values per week - sql

I've asked the same question for pandas: link
Now I'm struggling to do the same thing in BigQuery SQL. This is what I'm trying to achieve:
I have a table containing dates and ids, grouped by week:
items_per_week:
date id
2022-02-07 1
2022-02-07 3
2022-02-07 5
2022-02-07 4
2022-02-14 2
2022-02-14 1
2022-02-14 3
2022-02-21 9
2022-02-21 10
2022-02-21 1
...
...
2022-05-16 ....
For each week, I want to count how many of the ids also appeared in the previous week.
For example, the desired output for the table above would be:
date count
2022-02-07 0
2022-02-14 2 # because id 1 and 3 are present in previous week
2022-02-21 1 # because id 1 is present in previous week
...
I tried grouping the id and counting for each id how many are repeating for each date but it didn't work out as planned.

Try a self-join and count the matches:
SELECT t1.date,
       COUNT(t2.id) AS count
FROM items_per_week t1
LEFT JOIN items_per_week t2
  ON t2.date = DATE_SUB(t1.date, INTERVAL 7 DAY) -- t2 is the previous week
 AND t2.id = t1.id -- identifying matching ids
GROUP BY 1
A couple of assumptions here:
id is unique per week (i.e. you can't have duplicates within the week of 2022-02-07)
dates are exactly 7 days apart (i.e. you have one date per week)
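A minimal sketch for checking the self-join idea against the sample data from the question. It runs in an in-memory SQLite database rather than BigQuery, so `date(t1.date, '-7 days')` is used in place of `DATE_SUB(t1.date, INTERVAL 7 DAY)`; the alias `repeat_count` is my own naming. For each row, the join looks for the same id dated exactly seven days earlier:

```python
import sqlite3

# In-memory SQLite stands in for BigQuery (assumption); rows are the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items_per_week (date TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO items_per_week VALUES (?, ?)",
    [
        ("2022-02-07", 1), ("2022-02-07", 3), ("2022-02-07", 5), ("2022-02-07", 4),
        ("2022-02-14", 2), ("2022-02-14", 1), ("2022-02-14", 3),
        ("2022-02-21", 9), ("2022-02-21", 10), ("2022-02-21", 1),
    ],
)

query = """
SELECT t1.date, COUNT(t2.id) AS repeat_count
FROM items_per_week t1
LEFT JOIN items_per_week t2
  ON t2.date = date(t1.date, '-7 days')  -- same id, one week earlier
 AND t2.id = t1.id
GROUP BY t1.date
ORDER BY t1.date
"""
result = conn.execute(query).fetchall()
print(result)
```

With the sample rows this gives 0, 2 and 1 for the three weeks, which matches the desired output in the question.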

Related

Counting from monday to friday consecutively bigquery sql

I'm trying to get a result that satisfies two conditions in Google BigQuery.
Employees who worked every day from Monday through Friday get additional pay equal to 8 hours of wages.
This only applies to workers who worked more than 15 hours that week.
id date hours
abc123 2020-01-05 12
abc123 2020-01-06 5
abc123 2020-01-07 14
abc123 2020-01-08 7
abc123 2020-01-09 6
abc123 2020-01-10 12
Thanks in advance.
Assuming you have one row per employee per day, then you can handle this by using window functions. The focus will be on Mondays, but the idea is to count the hours and days for a given row and the four days following.
So, to get the Mondays where a given id matches the conditions and is eligible for a bonus:
select id, date
from (select t.*,
             count(*) over (partition by id
                            order by unix_date(date)
                            range between current row and 4 following
                           ) as day_count,
             sum(hours) over (partition by id
                              order by unix_date(date)
                              range between current row and 4 following
                             ) as hours_count
      from t
     ) t
where extract(dayofweek from date) = 2  -- 2 = Monday (1 = Sunday)
  and day_count = 5
  and hours_count > 15;  -- strictly more than 15 hours
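Here is a rough way to try the window-function approach on the sample rows from the question. It is only a sketch: SQLite (3.28 or later, for `RANGE` with a numeric offset) stands in for BigQuery, `julianday(date)` replaces `unix_date(date)`, and `strftime('%w', date) = '1'` replaces the `dayofweek` check. I read "more than 15 hours" as a strict comparison:

```python
import sqlite3

# In-memory SQLite stands in for BigQuery (assumption); rows are the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, date TEXT, hours INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("abc123", "2020-01-05", 12),  # Sunday
        ("abc123", "2020-01-06", 5),   # Monday
        ("abc123", "2020-01-07", 14),
        ("abc123", "2020-01-08", 7),
        ("abc123", "2020-01-09", 6),
        ("abc123", "2020-01-10", 12),  # Friday
    ],
)

query = """
SELECT id, date
FROM (SELECT t.*,
             COUNT(*)   OVER w AS day_count,
             SUM(hours) OVER w AS hours_count
      FROM t
      WINDOW w AS (PARTITION BY id
                   ORDER BY julianday(date)
                   RANGE BETWEEN CURRENT ROW AND 4 FOLLOWING)
     ) AS s
WHERE strftime('%w', date) = '1'   -- SQLite: 0 = Sunday, 1 = Monday
  AND day_count = 5                -- a row for each of Mon..Fri
  AND hours_count > 15             -- strictly more than 15 hours
"""
result = conn.execute(query).fetchall()
print(result)
```

The only Monday in the sample is 2020-01-06, and the five days starting there total 44 hours, so that row is the one returned.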

SQL : How to count number of times each ID exists continuously from previous period

My SQL data set is like this;
Date firm_id
======================
2010-01 1
2010-01 2
2010-01 3
----------------------
2010-02 1
2010-02 2
----------------------
2010-03 1
2010-03 2
2010-03 3
----------------------
2010-04 1
2010-04 3
How can I create a variable, named firm_age, that represents how long each firm has existed continuously up to the current period? Like this:
Date firm_id firm_age
=================================
2010-01 1 0
2010-01 2 0
2010-01 3 0
-----------------------------------
2010-02 1 1
2010-02 2 1
-----------------------------------
2010-03 1 2
2010-03 2 2
2010-03 3 0
-----------------------------------
2010-04 1 3
2010-04 3 1
Thank you
This is a use case for the PACK operator from "Time & Relational Theory", which is not supported, at least not directly, in SQL.
You are trying to find [for each given row of the table] the smallest month such that there does not exist any intervening month between that smallest month and the month of the given row such that the company of the given row did not exist at that intervening month. Given two months, assessing the [non-]existence of such an intervening month is relatively trivial, however, finding the smallest month that makes the condition true for all intervening months is another order (*). I wouldn't try to do this completely in plain SQL.
(*) Which set of months are you going to SELECT that "smallest month" from? You cannot rely on every month being mentioned in your table, as there is always the slight theoretical possibility that in one particular month no companies existed at all. (This possibility also breaks any attack on the problem based on window functions and row numbers.)
This is a gaps-and-islands problem. You want "islands" where the values are sequential. Then you want to enumerate them. You can use row_number() for this:
select t.*,
       row_number() over (partition by firm_id, date - seqnum * interval '1 month'
                          order by date
                         ) - 1 as firm_age  -- subtract 1 so each island starts at age 0
from (select t.*,
             row_number() over (partition by firm_id order by date) as seqnum
      from t
     ) t;
Note that date functions are not standard across databases. This makes some assumptions about the data representation, but the idea for the processing should work in almost any database.
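To make the gaps-and-islands idea concrete, here is a small sketch run on the question's data. It uses SQLite (3.25 or later for window functions) and assumes the dates are stored as 'YYYY-MM' text. Instead of interval arithmetic it computes a hypothetical `month_index` (year * 12 + month), and subtracts 1 from the row number so that the first month of each island has age 0, as in the desired output:

```python
import sqlite3

# In-memory SQLite; 'YYYY-MM' text dates are an assumption about the data representation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date TEXT, firm_id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2010-01", 1), ("2010-01", 2), ("2010-01", 3),
        ("2010-02", 1), ("2010-02", 2),
        ("2010-03", 1), ("2010-03", 2), ("2010-03", 3),
        ("2010-04", 1), ("2010-04", 3),
    ],
)

query = """
SELECT date, firm_id,
       ROW_NUMBER() OVER (PARTITION BY firm_id, month_index - seqnum
                          ORDER BY date) - 1 AS firm_age
FROM (SELECT date, firm_id,
             -- month_index grows by 1 per calendar month
             CAST(substr(date, 1, 4) AS INTEGER) * 12
               + CAST(substr(date, 6, 2) AS INTEGER) AS month_index,
             ROW_NUMBER() OVER (PARTITION BY firm_id ORDER BY date) AS seqnum
      FROM t) AS s
ORDER BY date, firm_id
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Within a run of consecutive months, `month_index - seqnum` stays constant, so it identifies the island. Firm 3 skips 2010-02, which starts a new island and resets its age to 0 in 2010-03.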

MS SQL - finding period a date belongs to given only period start date

Long time reader, first time poster.
I did search and could not find any other question matching this. All I could find involved having 2 date columns to compare. If this has indeed been answered before, I apologize for creating a duplicate post.
I have been given a table with 3 columns:
Period (int) - Name (varchar) - StartDate (datetime)
1 Period 1 2016-01-04 00:00:00.000
2 Period 2 2016-02-01 00:00:00.000
3 Period 3 2016-02-29 00:00:00.000
4 Period 4 2016-03-28 00:00:00.000
5 Period 5 2016-04-25 00:00:00.000
6 Period 6 2016-05-23 00:00:00.000
Unfortunately, I do not have the option of altering the table to add a second date column containing the end date of each period. Each period ends on the date immediately preceding the next period's StartDate.
I will be passed a date and need to find the period to which the date belongs.
Can anyone provide some direction?
Try this query:
SELECT TOP 1 *
FROM PeriodDetail
WHERE StartDate <= @GivenDate
ORDER BY StartDate DESC
Given the hint / comment from Tab Alleman above, I built the following:
select top 1 period from PeriodDetail
where StartDate <= getdate()
order by startdate desc
As GetDate (at this moment) returns 2/19/2016, the correct result is period 2.
Thank you all.
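For anyone who wants to check the "latest StartDate on or before the given date" logic, here is a small sketch using the table from the question. It runs on SQLite rather than SQL Server, so `LIMIT 1` replaces `TOP 1`, the dates are stored as ISO text, and `find_period` is a hypothetical helper name:

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption); rows are from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PeriodDetail (Period INTEGER, Name TEXT, StartDate TEXT)")
conn.executemany(
    "INSERT INTO PeriodDetail VALUES (?, ?, ?)",
    [
        (1, "Period 1", "2016-01-04"),
        (2, "Period 2", "2016-02-01"),
        (3, "Period 3", "2016-02-29"),
        (4, "Period 4", "2016-03-28"),
        (5, "Period 5", "2016-04-25"),
        (6, "Period 6", "2016-05-23"),
    ],
)

def find_period(given_date):
    """Return the period whose StartDate is the latest one on or before given_date."""
    row = conn.execute(
        "SELECT Period FROM PeriodDetail "
        "WHERE StartDate <= ? ORDER BY StartDate DESC LIMIT 1",
        (given_date,),
    ).fetchone()
    return row[0] if row else None  # None: the date is before the first period

print(find_period("2016-02-19"))
```

A date before 2016-01-04 matches no row, so the helper returns None; a date after the last StartDate falls into period 6, because the table gives no end date for it.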

creating weekly buckets in date column in sql

How can I create weekly buckets for a date column?
My data looks like :
ID LOC DATE Amount
1 AAA 21-07-2015 3000
2 AAA 22-07-2015 1000
3 AAA 23-07-2015 0
4 AAA 27-07-2015 300
5 AAA 29-07-2015 700
I also have a Financial Year Calendar file containing the start and end date of each week and the week number each range corresponds to. It looks like this:
Year WeekStart WeekEnd Week
2015 20-07-2015 26-07-2015 1
2015 27-07-2015 02-08-2015 2
so on till 2020...
The task is to assign each line item in the first table to its weekly bucket and find the total amount per week.
Output:
ID LOC WEEk Amount
1 AAA 1 4000
2 AAA 2 1000
I'm not sure how to start, or how to link the two files. I would appreciate your help.
You need a correlated subquery here: https://technet.microsoft.com/en-us/library/ms187638(v=sql.105).aspx. Let's assume the data is in table data and the calendar is in table calendar. Then your query will look like this:
select
    loc, week, sum(amount) as amount
from
    (select
        -- look up the calendar week whose range contains this row's date
        (select top 1 week from calendar t1 where t1.WeekStart <= t2.date and t2.date <= t1.WeekEnd) as week,
        loc,
        amount
    from
        data t2) as subsel1
group by
    loc, week
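A quick way to sanity-check the correlated-subquery lookup on the question's sample rows. This is a sketch on SQLite, not SQL Server, so `LIMIT 1` replaces `top 1`. It also assumes the dd-mm-yyyy dates have been converted to ISO yyyy-mm-dd text, since the range comparison only works on sortable dates:

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (assumption); dates converted to ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, loc TEXT, date TEXT, amount INTEGER)")
conn.execute("CREATE TABLE calendar (year INTEGER, WeekStart TEXT, WeekEnd TEXT, week INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        (1, "AAA", "2015-07-21", 3000),
        (2, "AAA", "2015-07-22", 1000),
        (3, "AAA", "2015-07-23", 0),
        (4, "AAA", "2015-07-27", 300),
        (5, "AAA", "2015-07-29", 700),
    ],
)
conn.executemany(
    "INSERT INTO calendar VALUES (?, ?, ?, ?)",
    [
        (2015, "2015-07-20", "2015-07-26", 1),
        (2015, "2015-07-27", "2015-08-02", 2),
    ],
)

query = """
SELECT loc, week, SUM(amount) AS amount
FROM (SELECT (SELECT c.week
              FROM calendar c
              WHERE c.WeekStart <= d.date AND d.date <= c.WeekEnd
              LIMIT 1) AS week,      -- week whose range contains this date
             d.loc, d.amount
      FROM data d) AS s
GROUP BY loc, week
ORDER BY loc, week
"""
result = conn.execute(query).fetchall()
print(result)
```

The first three rows fall in week 1 (3000 + 1000 + 0) and the last two in week 2 (300 + 700), which matches the amounts in the expected output.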

SQL: different aggregate values for different condition

I have a problem that is beyond my current knowledge.
I need to write a single query that does the following:
for Monday, return the total number of computers sold (laptops plus desktops),
and for every other day return only the number of laptops sold.
For example I have this table
Day LaptopSold DesktopSold Total
Monday 2 2 4
Tuesday 2 3 5
Monday 1 1 2
Wednesday 2 2 4
Tuesday 1 4 5
The result should be this:
Day QtySold
Monday 6
Tuesday 3
Wednesday 2
I can reach the goal by writing two separate queries, each with a GROUP BY on the Day field, but I can't work out how to do it in a single query!
Could you help me, please!!!
Thanks in advance
Lu
You can choose which column to sum with a CASE expression:
SELECT DAY,
       SUM(CASE DAY
             WHEN 'Monday' THEN TOTAL  -- match the stored spelling; some collations are case-sensitive
             ELSE LAPTOPSOLD
           END) AS QtySold
FROM TBL
GROUP BY DAY
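Here is a small sketch that runs the conditional sum on the question's table. It uses an in-memory SQLite database (the question does not say which engine is in use), and the table name `tbl` is a placeholder. String comparison in SQLite is case-sensitive by default, so the literal is written as 'Monday' to match the stored values:

```python
import sqlite3

# In-memory SQLite (assumption); rows are the sample table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (Day TEXT, LaptopSold INTEGER, DesktopSold INTEGER, Total INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        ("Monday", 2, 2, 4),
        ("Tuesday", 2, 3, 5),
        ("Monday", 1, 1, 2),
        ("Wednesday", 2, 2, 4),
        ("Tuesday", 1, 4, 5),
    ],
)

query = """
SELECT Day,
       SUM(CASE WHEN Day = 'Monday' THEN Total   -- laptops + desktops
                ELSE LaptopSold                  -- laptops only
           END) AS QtySold
FROM tbl
GROUP BY Day
"""
result = dict(conn.execute(query).fetchall())
print(result)
```

Monday sums Total (4 + 2), while Tuesday and Wednesday sum only LaptopSold (2 + 1 and 2), giving the 6, 3 and 2 from the expected result.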