How to write the SQL query to select an ID if all 7 day week numbers exists - sql

I have a table similar to the one in the attached picture.
The IDs in the ID-2 column have 1-7 day of week numbers in 'Day of week'column e.g ID-2 = 100.
However, sometimes there won't have all 7 day of week numbers. For example in ID-2= 500 has the Day of Week numbers 3,4 and 6 missing.
In my SQL query I want to select the distinct ID when only if it has all 1-7 day of week numbers.
I wonder if someone could guide me on this? In SQL what concept can be used for it, i.e. Case, partition, self join etc?

You can use group by and having. Assming that day_of_week is always between 1 and 7:
select id_2
from mytable
group by id_2
having count(distinct day_of_week) = 7
If there are no duplicates (id2, day_of_week) tuples, then the having clause can be more efficiently phrased as:
having count(*) = 7

Related

adjust recursive sql query to exclude holidays and weekends

I have a dataset like this called data_per_day
instructional_day
points
2023-01-24
2
2023-01-23
2
2023-01-20
1
2023-01-19
0
and so on. the table shows weekdays (days minus holidays and weekends) and the number of points someone has earned. 1 is the start of a streak and 0 is the end of a streak. 2 is max points after a streak has started.
I need to find how long is the latest streak. so in this case the result should be 3
I created a recursive cte but the query returns 2 as the streak count because i'm using lag mechanism with days. instead I need to adjust so that the instructional days are used rather than all dates.
RECURSIVE cte AS (
SELECT
student_unique_id,
instructional_day,
points,
1 AS cnt
FROM
`data_per_day`
WHERE
instructional_day = DATE_ADD(CURRENT_DATE('America/Chicago'), INTERVAL -1 DAY)
UNION ALL
SELECT
a.student_unique_id,
a.instructional_day,
a.points,
c.cnt+1
FROM (
SELECT
*
FROM
`data_per_day`
WHERE
points > 0 ) a
INNER JOIN
cte c
ON
a.student_unique_id = c.student_unique_id
AND a.instructional_day = c.instructional_day - INTERVAL '1' day )
SELECT
student_unique_id,
MAX(cnt) AS streak
FROM
cte --
WHERE
student_unique_id = "419"
GROUP BY
student_unique_id
How do I adjust the query?
This is not a trivial coding exercise, so I won't actually write the code and provide it.
What you have here is a gaps and islands question. You want to identify the largest "island" of days with points within a date range. Depending upon what dates are contained in your data, you may need to generate a list of sequential dates that meet your criteria.
One problem I see is that you are trying to combine the steps to generate the date range (the recursive CTE) with the points. You'll need to separate those steps.
Define the date range.
Generate the dates within the range.
Filter the dates with isweekday = 'no' and isholiday = 'no'. You will probably want to add a row number during this step.
[left] join the dates to your data, including coalesce(points, 0)
Filter the data to points > 0.
Identify the islands.
Identify the largest island per student.

How to run a query for multiple independent date ranges?

I would like to run the below query that looks like this for week 1:
Select week(datetime), count(customer_call) from table where week(datetime) = 1 and week(orderdatetime) < 7
... but for weeks 2, 3, 4, 5 and 6 all in one query and with the 'week(orderdatetime)' to still be for the 6 weeks following the week(datetime) value.
This means that for 'week(datetime) = 2', 'week(orderdatetime)' would be between 2 and 7 and so on.
'datetime' is a datetime field denoting registration.
'customer_call' is a datetime field denoting when they called.
'orderdatetime' is a datetime field denoting when they ordered.
Thanks!
I think you want group by:
Select week(datetime), count(customer_call)
from table
where week(datetime) = 1 and week(orderdatetime) < 7
group by week(datetime);
I would also point out that week doesn't take the year into account, so you might want to include that in the group by or in a where filter.
EDIT:
If you want 6 weeks of cumulative counts, then use:
Select week(datetime), count(customer_call),
sum(count(customer_call)) over (order by week(datetime)
rows between 5 preceding and current row) as running_sum_6
from table
group by week(datetime);
Note: If you want to filter this to particular weeks, then make this a subquery and filter in the outer query.

How to list records with conditional values and non-missing records

I have a view that produces the result shown in the image below. I need help with the logic.
Requirement:
List of all employees who achieved no less than 100% target in ALL Quarters in past two years.
"B" received 90% in two different quarters. An employee who received less than 100% should NOT be listed.
Notice that "A" didn't work for Q2-2016. An employee who didn't work for that quarter should NOT be listed.
"C" is the only one who worked full two years, and received 100% in each quarter.
Edit: added image link showing Employee name,Quarter, Year, and the score.
https://i.imgur.com/FIXR0YF.png
The logic is pretty easy, it's math with quarters that is a bit of a pain.
There are 8 quarters in the last two years, so you simply need to select all the employee names in the last two years with a target >= 100%, group by employee name, and apply a HAVING clause to limit the output to those employees with count(*) = 8.
To get the current year and quarter, you can use these expressions:
cast(extract('year' from current_date) as integer) as yr,
(cast(extract('month' from current_date) as integer)-1) / 3 + 1 as quarter;
Subtract 2 from the current year to find the previous year and quarter. The code will be clearer if you put these expressions in a subquery because you will need them multiple times for the quarter arithmetic. To do the quarter arithmetic you must extract the integer value of the quarter from the text values you have stored.
Altogether, the solution should look something like this:
select
employee
from
(select employee, cast(right(quarter,1) as integer) as qtr, year
from your_table
where target >= 100
) as tgt
cross join (
select
cast(extract('year' from current_date) as integer) as yr,
(cast(extract('month' from current_date) as integer)-1) / 3 + 1 as quarter
) as qtr
where
tgt.year between qtr.yr-1 and qtr.yr
or (tgt.year = qtr.yr - 2 and tgt.qtr > qtr.quarter)
group by
employee
having
count(*) = 8;
This is untested.
If you happen to be using Postgres and expect to be doing a lot of quarter arithmetic you may want to define a custom data type as described in A Year and Quarter Data Type for PostgreSQL

SQL statement to match dates that are the closest?

I have the following table, let's call it Names:
Name Id Date
Dirk 1 27-01-2015
Jan 2 31-01-2015
Thomas 3 21-02-2015
Next I have the another table called Consumption:
Id Date Consumption
1 26-01-2015 30
1 01-01-2015 20
2 01-01-2015 10
2 05-05-2015 20
Now the problem is, that I think that doing this using SQL is the fastest, since the table contains about 1.5 million rows.
So the problem is as follows, I would like to match each Id from the Names table with the Consumption table provided that the difference between the dates are the lowest, so we have: Dirk consumes on 27-01-2015 about 30. In case there are two dates that have the same "difference", I would like to calculate the average consumption on those two dates.
While I know how to join, I do not know how to code the difference part.
Thanks.
DBMS is Microsoft SQL Server 2012.
I believe that my question differs from the one mentioned in the comments, because it is much more complicated since it involves comparison of dates between two tables rather than having one date and comparing it with the rest of the dates in the table.
This is how you could it in SQL Server:
SELECT Id, Name, AVG(Consumption)
FROM (
SELECT n.Id, Name, Consumption,
RANK() OVER (PARTITION BY n.Id
ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date]))) AS rnk
FROM Names AS n
INNER JOIN Consumption AS c ON n.Id = c.Id ) t
WHERE t.rnk = 1
GROUP BY Id, Name
Using RANK with PARTITION BY n.Id and ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date])) you can locate all matching records per Id: all records with the smallest difference in days are going to have rnk = 1.
Then, using AVG in the outer query, you are calculating the average value of Consumption between all matching records.
SQL Fiddle Demo

how to calculate separate averages for multiple columns

I have a table that has "months" as columns and "customer ID" as primary key.
I want to average all the values for each month separately for values not equal to 99999.
My current query for a single month is as follows and is working fine:
SELECT Avg([Table1]![Dec10]) AS Expr1
FROM Table1
WHERE ((([Table1]![Dec10])<>99999);
However, when I am trying to add the 2nd month, it is combining the first month's condition with the 2nd month's condition.
SELECT Avg([Table1]![Dec10) AS Expr1, Avg([Table1]![Dec11]) AS Expr2
FROM Table1
WHERE ((([Table1]![Dec10])<>99999) And ([Table1]![Dec11])<>99999);
I need to have each month separate, i.e. calculate the average of Dec10<>99999, and in the second column, calculate the average of Dec11<>99999.
You need to use a Group By clause in your query, and then you can separate your output by months.
In this case it would be convenient to use use GROUP BY.
If you have distinct month values e.g. "jan10", "feb10", "mar12" etc. you can group on the months, and then check that the values is not 99999.
SELECT avg(value), months
FROM tablename
WHERE value <> 99999
GROUP BY months
That is if you have the months stored as literals within a column, but from your database design this may be stored in an other way?
I need to have each month separate, i.e. calculate the average of Dec10<>99999, and in the second column, calculate the average of Dec11<>99999.
In Access 2010, for [Table1]...
CustomerID Dec10 Dec11
---------- ----- -----
1 1 5
2 2 99999
3 99999 0
4 3 7
...the query...
SELECT
DAvg("Dec10", "Table1", "Dec10<>99999") AS AvgOfDec10,
DAvg("Dec11", "Table1", "Dec11<>99999") AS AvgOfDec11
FROM (SELECT COUNT(*) AS n FROM Table1)
...produces:
AvgOfDec10 AvgOfDec11
---------- ----------
2 4