Is there a simple line (or two) of code that will pull records before a minimum date in another table? - sql

I want to pull emergency room visits before a member's first treatment date. Everyone has a different first treatment date, and none occur before Jan 01 2012.
So if a member has a first treatment date of Feb 24 2013, I want to know how many times they visited the ER in the one year prior to that date.
These min dates are located in another table, and I cannot use the min date in my DATEADD function. Thoughts?

One possible solution is to use a CTE to capture the visits between the dates you're interested in and then join to that in your select.
Here is an example:
Rextester

Edit:
I just completely updated my answer. Sorry for the confusion.
So you have at least two tables:
Emergency room visits
Treatment information
Let's call these two tables [ERVisits] and [Treatments].
I suppose both tables have some id-field for the patient/member. Let's call it [MemberId].
How about this conceptual query:
WITH [FirstTreatments] AS
(
SELECT [MemberId], MIN([TreatmentDate]) AS [FirstTreatmentDate]
FROM [Treatments]
GROUP BY [MemberId]
)
SELECT V.[MemberId], T.[FirstTreatmentDate], COUNT(*) AS [ERVisitCount]
FROM [ERVisits] AS V INNER JOIN [FirstTreatments] AS T ON T.[MemberId] = V.[MemberId]
WHERE DATEDIFF(DAY, V.[VisitDate], T.[FirstTreatmentDate]) BETWEEN 0 AND 365
GROUP BY V.[MemberId], T.[FirstTreatmentDate]
This query should show the number of times a patient/member has visited the ER in the year before his/her first treatment date.
Here is a tester: https://rextester.com/UXIE4263
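If you want to try the idea without a SQL Server instance, here is a minimal sketch of the same CTE-plus-join pattern run through Python's built-in sqlite3 module. The table and column names follow the conceptual query above; the sample rows are made up, and julianday() stands in for DATEDIFF because SQLite has no DATEDIFF. Note one deliberate variation: it uses a LEFT JOIN so members with no ER visits show up with a count of 0, which the INNER JOIN version would drop.

```python
import sqlite3


def er_visits_before_first_treatment(conn):
    """Count ER visits in the 365 days up to each member's first treatment."""
    query = """
        WITH FirstTreatments AS (
            SELECT MemberId, MIN(TreatmentDate) AS FirstTreatmentDate
            FROM Treatments
            GROUP BY MemberId
        )
        SELECT T.MemberId, T.FirstTreatmentDate, COUNT(V.VisitDate) AS ERVisitCount
        FROM FirstTreatments AS T
        LEFT JOIN ERVisits AS V
               ON V.MemberId = T.MemberId
              -- SQLite stand-in for DATEDIFF(DAY, VisitDate, FirstTreatmentDate)
              AND julianday(T.FirstTreatmentDate) - julianday(V.VisitDate)
                  BETWEEN 0 AND 365
        GROUP BY T.MemberId, T.FirstTreatmentDate
        ORDER BY T.MemberId
    """
    return conn.execute(query).fetchall()


def build_er_demo():
    """Build an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Treatments (MemberId INTEGER, TreatmentDate TEXT)")
    conn.execute("CREATE TABLE ERVisits (MemberId INTEGER, VisitDate TEXT)")
    conn.executemany("INSERT INTO Treatments VALUES (?, ?)", [
        (1, "2013-02-24"),  # first treatment for member 1
        (1, "2013-06-01"),  # later treatment, ignored by MIN()
        (2, "2012-05-10"),  # member 2 has no ER visits at all
    ])
    conn.executemany("INSERT INTO ERVisits VALUES (?, ?)", [
        (1, "2012-03-01"),  # within the year before 2013-02-24
        (1, "2013-01-15"),  # within the year before 2013-02-24
        (1, "2011-12-31"),  # more than a year before: excluded
        (1, "2013-03-01"),  # after the first treatment: excluded
    ])
    return conn
```

With the sample rows above, er_visits_before_first_treatment(build_er_demo()) returns one row per member: 2 visits for member 1 and 0 for member 2.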

Related

How to write an SQL query to get max number of counts for the most number of travelling of a user within a month

I have been given a task by my manager to write a SQL query that returns the highest record count for the user who has travelled the most within a month, with the rule that if the user travels to multiple places on the same date, it should be counted as one. For instance, if you look at the following table design, my query must return a count of 2 for this scenario. Although traveller_id "1" has travelled three times within the month, he travelled to Thailand and the USA on the same date, which is why his count is reduced to 2.
I have also developed my logic for this query but I am unable to write it due to lack of syntax knowledge. I split up this query into 3 parts:
Select All records from the table within a month using the MONTH function of SQL
Select All distinct DateTime records from the above result so that the same DateTime gets eliminated.
Select max number of counts for the traveller who visited most places.
Please help me in completing my query. You can also use a different approach from mine.
You can use the count aggregation in a cte then select top(1):
with u as
(select traveller_id,
-- cast to date so several trips on one day count once
count(distinct cast(visit_date as date)) as n
from travellers_log
-- half-open range also catches visits late on March 31
where visit_date >= '2022-03-01' and visit_date < '2022-04-01'
group by traveller_id)
select top(1) u.traveller_id, t.name, u.n from u inner join table_travellers as t
on u.traveller_id = t.id
order by u.n desc;
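For experimenting with the "count distinct dates, then take the top row" idea, here is a rough sketch using Python's sqlite3 module. The table names come from the query above; the destination column, the traveller names and the sample rows are assumptions for illustration. SQLite has no TOP, so LIMIT 1 plays that role, and date() strips the time part so two trips on the same day count once.

```python
import sqlite3


def top_traveller(conn, month_start, next_month_start):
    """Return (traveller_id, name, distinct_travel_days) for the busiest traveller."""
    query = """
        SELECT l.traveller_id, t.name, COUNT(DISTINCT date(l.visit_date)) AS n
        FROM travellers_log AS l
        JOIN table_travellers AS t ON t.id = l.traveller_id
        -- half-open range: includes every moment of the month
        WHERE l.visit_date >= ? AND l.visit_date < ?
        GROUP BY l.traveller_id, t.name
        ORDER BY n DESC
        LIMIT 1
    """
    return conn.execute(query, (month_start, next_month_start)).fetchone()


def build_travel_demo():
    """In-memory tables with hypothetical travellers and trips."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_travellers (id INTEGER, name TEXT)")
    conn.execute(
        "CREATE TABLE travellers_log "
        "(traveller_id INTEGER, destination TEXT, visit_date TEXT)"
    )
    conn.executemany("INSERT INTO table_travellers VALUES (?, ?)", [
        (1, "Ann"),
        (2, "Bob"),
    ])
    conn.executemany("INSERT INTO travellers_log VALUES (?, ?, ?)", [
        (1, "Thailand", "2022-03-05 09:00:00"),
        (1, "USA", "2022-03-05 18:00:00"),    # same day as Thailand: counted once
        (1, "Japan", "2022-03-20 10:00:00"),
        (2, "France", "2022-03-10 08:00:00"),
    ])
    return conn
```

Calling top_traveller(build_travel_demo(), "2022-03-01", "2022-04-01") gives traveller 1 with a count of 2, matching the scenario in the question.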

Cohort retention with SQL BigQuery

I am trying to create a retention table like the following using SQL in Big Query but with MONTHLY cohorts;
I have the following columns in my dataset. I am only using one table, and its name is 'curious-furnace-341507.TEST.Test_Dataset_-_Orders':
order_date   order_id   customer_id
2020-01-02   12345      6789
I do not need the new user column, and the data goes through June 2020. I think ideally I would have a cohort month column that lists the January-June cohorts and then 5 periods across.
I have tried so many different things and keep getting errors in BigQuery, so I think I am approaching it all wrong. The online queries I am trying to adapt seem to use dates rather than months, which is also causing some confusion. I think I need to truncate my date column to months in the query?
Does anyone have a go-to query that will work in BigQuery for a retention table or can help me approach this? Thanks!
This may help you:
WITH cohorts AS (
SELECT
customer_id,
MIN(DATE(order_date)) AS cohort_date
FROM `curious-furnace-341507.TEST.Test_Dataset_-_Orders`
GROUP BY 1)
SELECT
FORMAT_DATE("%Y-%m", c.cohort_date) AS cohort_mth,
t.customer_id AS cust_id,
DATE_DIFF(DATE(t.order_date), c.cohort_date, MONTH) AS order_period
FROM `curious-furnace-341507.TEST.Test_Dataset_-_Orders` t
JOIN cohorts c ON t.customer_id = c.customer_id
WHERE c.cohort_date >= '2020-01-01'
AND DATE_DIFF(DATE(t.order_date), c.cohort_date, MONTH) <= 5
GROUP BY 1, 2, 3
I typically do pivots and % calcs in Excel/Sheets, so this will give you just the input data you need for that.
NOTE:
This gives one row per customer per period, so counting rows per cohort and period yields the unique customers who ordered in period X (repeat orders within a period are ignored).
This also has period 0 (ordered again in cohort_mth) which you may wish to keep/ exclude.
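If you would rather get the per-cohort counts straight from SQL instead of counting in a spreadsheet, here is a small sketch of the same cohort logic using Python's sqlite3 module. It is not BigQuery syntax: the table is simply called orders here, the rows are invented, and the month difference is computed as (year * 12 + month) because SQLite has no DATE_DIFF. Treat it as an illustration of the steps (first order per customer, months since that order, distinct customers per cohort and period).

```python
import sqlite3


def monthly_retention(conn):
    """Return (cohort_mth, order_period, customers) rows, one per cohort and period."""
    query = """
        WITH cohorts AS (
            -- first order date per customer defines the cohort
            SELECT customer_id, MIN(order_date) AS cohort_date
            FROM orders
            GROUP BY customer_id
        ),
        periods AS (
            SELECT strftime('%Y-%m', c.cohort_date) AS cohort_mth,
                   -- whole months between the cohort month and the order month
                   (CAST(strftime('%Y', o.order_date) AS INTEGER) * 12
                    + CAST(strftime('%m', o.order_date) AS INTEGER))
                 - (CAST(strftime('%Y', c.cohort_date) AS INTEGER) * 12
                    + CAST(strftime('%m', c.cohort_date) AS INTEGER)) AS order_period,
                   o.customer_id
            FROM orders AS o
            JOIN cohorts AS c ON c.customer_id = o.customer_id
        )
        SELECT cohort_mth, order_period, COUNT(DISTINCT customer_id) AS customers
        FROM periods
        WHERE order_period <= 5
        GROUP BY cohort_mth, order_period
        ORDER BY cohort_mth, order_period
    """
    return conn.execute(query).fetchall()


def build_orders_demo():
    """In-memory orders table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_date TEXT, order_id INTEGER, customer_id INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
        ("2020-01-02", 12345, 6789),
        ("2020-02-10", 12346, 6789),
        ("2020-02-20", 12347, 6789),  # repeat order in the same period: counted once
        ("2020-01-15", 12348, 6790),
        ("2020-03-05", 12349, 6790),
        ("2020-02-01", 12350, 6791),
        ("2020-03-01", 12351, 6791),
    ])
    return conn
```

On the sample rows, the January cohort has 2 customers in period 0 and 1 each in periods 1 and 2, and the February cohort has 1 customer in each of periods 0 and 1. Pivoting periods into columns can still be done in a spreadsheet afterwards.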

Records between 2 dates Oracle SQL

I am looking to filter records between 2 dates. Here is a list of start and end dates. I need to identify the records that fall under the respective periods.
I am able to identify the records that fall in the first and last period i.e. first (9/07/2020 - 22/07/2020) and last (10/11/2020 - 23/12/2020) by using MIN and MAX. I am not able to find records that fall in between i.e. 2-11?
I have another table that shows a date when the records were updated. For instance,
I need to identify which period each record falls under. For instance,
Any kind of help would be appreciated!
Thanks
You need to do a join on your two tables. You don't give your table names, so this is a bit of guesswork, but try something like this:
select *
from period_table p
inner join record_table r on r.changed_date between p.startdate and p.enddate
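To see the range join in action, here is a minimal sketch run through Python's sqlite3 module. The table names match the guess above; the period_no and record_id columns, the second period and all the records are hypothetical. Dates are stored as ISO strings so that BETWEEN compares them correctly as text in SQLite; in Oracle you would compare real DATE columns instead.

```python
import sqlite3


def assign_periods(conn):
    """Match each record to the period whose date range contains its changed_date."""
    query = """
        SELECT r.record_id, r.changed_date, p.period_no
        FROM period_table AS p
        INNER JOIN record_table AS r
                ON r.changed_date BETWEEN p.startdate AND p.enddate
        ORDER BY r.record_id
    """
    return conn.execute(query).fetchall()


def build_period_demo():
    """In-memory tables with hypothetical periods and records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE period_table (period_no INTEGER, startdate TEXT, enddate TEXT)"
    )
    conn.execute("CREATE TABLE record_table (record_id INTEGER, changed_date TEXT)")
    conn.executemany("INSERT INTO period_table VALUES (?, ?, ?)", [
        (1, "2020-07-09", "2020-07-22"),  # first period from the question
        (2, "2020-07-23", "2020-08-05"),  # hypothetical second period
    ])
    conn.executemany("INSERT INTO record_table VALUES (?, ?)", [
        (101, "2020-07-10"),  # falls in period 1
        (102, "2020-07-23"),  # first day of period 2 (BETWEEN is inclusive)
        (103, "2020-08-05"),  # last day of period 2
        (104, "2020-09-01"),  # outside every period: dropped by the inner join
    ])
    return conn
```

Because BETWEEN is inclusive on both ends, periods must not share a boundary date, otherwise a record on that date would match two periods.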

TSQL query to find latest (current) record from period column when there are past present and future records

edited as requested:
My apologies. I've been dealing with this a bit and it's well and truly in my head, but not for the reader.
We have multiple records in table A which have multiple entries in the Period column. Say it's like a football schedule. Teams will have multiple dates/times in the Period column.
When we run query:
We want records selected for the most recent games only.
We don't want the earlier games.
We don't want the games "scheduled" and not yet played.
"Last game played" i.e. Period for teams are often on different days.
Table like:
Team Period
Reds 2021020508:00
Reds 2021011107:00
City 2021030507:00
Reds 2021032607:00
City 2021041607:00
Reds 2021050707:00
When I run query, I want to see the records for last game played regardless of date. So if I run the query on 27 Mar 2021, I want:
City 2021030507:00
Reds 2021032607:00
Keep in mind I used the above as an easily understandable example. In my case I have 1000s of "Teams" each of which may have 100+ different date entries in the Period column and I would like the solution to be applicable regardless of number of records, dates, or when the query is run.
What can I do?
Thanks!
So this gives you your desired output using the sample data, does it fulfil your requirement?
create table x (Team varchar(10), period varchar(20))
insert into x values
('Reds','2021020508:00'),
('Reds','2021011107:00'),
('City','2021030507:00'),
('Reds','2021032607:00'),
('City','2021041607:00'),
('Reds','2021050707:00')
select Team, Max(period) LastPeriod
from x
where period <= Format(GetDate(), 'yyyyMMddHH:mm') -- HH = 24-hour clock, matching the stored values
group by Team
Because the dates are stored as fixed-width strings, they sort correctly as text, so I think this would work:
SELECT TOP 2 *
FROM tableA
WHERE period <= FORMAT( GETDATE(), 'yyyyMMddHH:mm' )
ORDER BY period DESC
Perhaps you want:
where period = (select max(t2.period) from t t2)
This returns all rows with the last period in the table.
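For reference, here is a small sketch of the per-team MAX approach with an explicit "as of" time, so you can check it against the 27 Mar 2021 example from the question. It uses Python's sqlite3 module rather than SQL Server, reuses the table x and the sample rows from above, and builds the cutoff string with a 24-hour clock (%H). The function name and the as_of parameter are my own for illustration.

```python
import sqlite3
from datetime import datetime


def last_played(conn, as_of):
    """Return the latest period per team that is not later than as_of."""
    # Same fixed-width layout as the stored strings, 24-hour clock
    cutoff = as_of.strftime("%Y%m%d%H:%M")
    query = """
        SELECT Team, MAX(period) AS LastPeriod
        FROM x
        WHERE period <= ?
        GROUP BY Team
        ORDER BY Team
    """
    return conn.execute(query, (cutoff,)).fetchall()


def build_games_demo():
    """In-memory copy of the sample table from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE x (Team TEXT, period TEXT)")
    conn.executemany("INSERT INTO x VALUES (?, ?)", [
        ("Reds", "2021020508:00"),
        ("Reds", "2021011107:00"),
        ("City", "2021030507:00"),
        ("Reds", "2021032607:00"),
        ("City", "2021041607:00"),
        ("Reds", "2021050707:00"),
    ])
    return conn
```

Calling last_played(build_games_demo(), datetime(2021, 3, 27, 12, 0)) returns City 2021030507:00 and Reds 2021032607:00, the two rows the question asks for. This works only because the strings are fixed-width and ordered year, month, day, hour, minute.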

GROUP BY with date range

I have a table with 4 columns, id, Stream which is text, Duration (int), and Timestamp (datetime). There is a row inserted for every time someone plays a specific audio stream on my website. Stream is the name, and Duration is the time in seconds that they are listening. I am currently using the following query to figure up total listen hours for each week in a year:
SELECT YEARWEEK(`Timestamp`), (SUM(`Duration`)/60/60) FROM logs_main
WHERE `Stream`="asdf" GROUP BY YEARWEEK(`Timestamp`);
This does what I expect... presenting a total of listen time for each week in the year that there is data.
However, I would like to build a query where I have a result row for weeks that there may not be any data. For example, if the 26th week of 2006 has no rows that fall within that week, then I would like the SUM result to be 0.
Is it possible to do this? Maybe via a JOIN over a date range somehow?
The tried and true old-school solution is to set up another table with a bunch of date ranges that you can outer join with for the grouping (that is, the other table would have all of the weeks in it, each with a begin and end date).
In this case, you could just get by with a table full of the values from YEARWEEK:
201100
201101
201102
201103
201104
And here is a sketch of a sql statement:
SELECT year_weeks.yearweek, COALESCE(SUM(logs_main.`Duration`), 0)/60/60
FROM year_weeks LEFT OUTER JOIN logs_main
ON year_weeks.yearweek = YEARWEEK(logs_main.`Timestamp`)
AND logs_main.`Stream` = 'asdf' -- filter in ON, not WHERE, so empty weeks survive the outer join
GROUP BY year_weeks.yearweek;
Here is a suggestion. It might not be exactly what you are looking for.
But say you had a simple table with one column [year_week] that contained the YEARWEEK values for the weeks you care about (200601, 200602, 200603... 200652).
You could then theoretically:
SELECT
A.year_week,
(SELECT COALESCE(SUM(`Duration`), 0)/60/60 FROM logs_main WHERE
`Stream` = 'asdf' AND YEARWEEK(`Timestamp`) = A.year_week) AS listen_hours
FROM
tblYearWeeks A
This obviously needs some tweaking. I've done several similar queries in other projects, and this works well enough depending on the situation.
If you're looking for a one-table, SQL-only solution, then that is definitely something I would be interested in as well!
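For anyone who wants to try the calendar-table approach end to end, here is a rough sketch using Python's sqlite3 module. It is not MySQL: SQLite has no YEARWEEK, so strftime('%Y%W', ...) is used as a stand-in (Monday-based weeks, which can differ from MySQL's default YEARWEEK mode). The year_weeks table and the sample log rows are assumptions for illustration. The key points carry over: the stream filter sits in the ON clause, and COALESCE turns the NULL sum of an empty week into 0.

```python
import sqlite3


def weekly_listen_hours(conn, stream):
    """Return (yearweek, hours) for every week in year_weeks, 0.0 when no plays."""
    query = """
        SELECT w.yearweek,
               COALESCE(SUM(l.Duration), 0) / 3600.0 AS hours
        FROM year_weeks AS w
        LEFT OUTER JOIN logs_main AS l
               ON w.yearweek = strftime('%Y%W', l.Timestamp)
              AND l.Stream = ?  -- filter here, not in WHERE, to keep empty weeks
        GROUP BY w.yearweek
        ORDER BY w.yearweek
    """
    return conn.execute(query, (stream,)).fetchall()


def build_logs_demo():
    """In-memory tables with a three-week calendar and hypothetical plays."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE year_weeks (yearweek TEXT)")
    conn.execute(
        "CREATE TABLE logs_main "
        "(id INTEGER PRIMARY KEY, Stream TEXT, Duration INTEGER, Timestamp TEXT)"
    )
    conn.executemany("INSERT INTO year_weeks VALUES (?)", [
        ("200626",), ("200627",), ("200628",),
    ])
    conn.executemany(
        "INSERT INTO logs_main (Stream, Duration, Timestamp) VALUES (?, ?, ?)",
        [
            ("asdf", 3600, "2006-06-26 10:00:00"),   # week 26
            ("asdf", 1800, "2006-06-27 11:30:00"),   # week 26
            ("other", 7200, "2006-07-04 09:00:00"),  # week 27, different stream
            ("asdf", 7200, "2006-07-11 20:00:00"),   # week 28
        ],
    )
    return conn
```

With these rows, week 200626 totals 1.5 hours, week 200627 shows 0.0 because only another stream played, and week 200628 totals 2.0 hours.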