How to write an SQL query to get max number of counts for the most number of travelling of a user within a month - sql

My manager has asked me to write a SQL query that returns the maximum number of records for the user who travelled the most within a month, with the condition that multiple places visited on the same date count as one. For instance, for the following table design, the query must return a count of 2. Although traveller_id "1" travelled three times within the month, he travelled to Thailand and the USA on the same date, so his count is reduced to 2.
I have also developed my logic for this query but I am unable to write it due to lack of syntax knowledge. I split up this query into 3 parts:
Select All records from the table within a month using the MONTH function of SQL
Select All distinct DateTime records from the above result so that the same DateTime gets eliminated.
Select max number of counts for the traveller who visited most places.
Please help me in completing my query. You can also use a different approach from mine.

You can use a COUNT aggregation in a CTE and then select top (1):
with u as (
    select traveller_id,
           -- cast to date so several visits on the same day count once
           count(distinct cast(visit_date as date)) as n
    from travellers_log
    -- half-open range covers all of March 2022, including times on the 31st
    where visit_date >= '2022-03-01' and visit_date < '2022-04-01'
    group by traveller_id
)
select top (1) u.traveller_id, t.name, u.n
from u
inner join table_travellers as t on u.traveller_id = t.id
order by u.n desc;
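The same idea can be tried locally with a minimal sketch in Python's built-in sqlite3 module. This is an illustration under assumptions, not the asker's real schema: the column names country and visit_date, the sample rows (including "Japan" and "France"), and the helper name busiest_traveller are all hypothetical. SQLite has no TOP, so LIMIT 1 is used instead.

```python
import sqlite3

# Hypothetical sample data: traveller 1 visits two countries on the same day.
SAMPLE_ROWS = [
    (1, "Thailand", "2022-03-05 09:00:00"),
    (1, "USA", "2022-03-05 18:00:00"),
    (1, "Japan", "2022-03-20 12:00:00"),
    (2, "France", "2022-03-11 08:00:00"),
    (1, "Italy", "2022-04-02 10:00:00"),  # outside March, ignored
]


def busiest_traveller(rows, start, end_exclusive):
    """Return (traveller_id, distinct_travel_days) for the traveller with
    the most distinct travel days in [start, end_exclusive)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE travellers_log "
        "(traveller_id INTEGER, country TEXT, visit_date TEXT)"
    )
    conn.executemany("INSERT INTO travellers_log VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT traveller_id,
               COUNT(DISTINCT date(visit_date)) AS n  -- one count per calendar day
        FROM travellers_log
        WHERE visit_date >= ? AND visit_date < ?
        GROUP BY traveller_id
        ORDER BY n DESC, traveller_id
        LIMIT 1
        """,
        (start, end_exclusive),
    ).fetchone()
    conn.close()
    return result


print(busiest_traveller(SAMPLE_ROWS, "2022-03-01", "2022-04-01"))  # (1, 2)
```

Traveller 1 has three March records but only two distinct travel days, which matches the count of 2 expected in the question.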

Related

Is there a simple line (or two) of code that will pull records before a minimum date in another table?

I want to pull emergency room visits before a member's first treatment date. Everyone has a different first treatment date, and none occur before Jan 01 2012.
So if a member has a first treatment date of Feb 24 2013, I want to know how many times they visited the ER one year prior to that date.
These min dates are located in another table and I can not use the Min date in my DATEADD function. Thoughts?
One possible solution is to use a CTE to capture the visits between the dates you're interested in and then join to that with your select.
Here is an example:
Rextester
Edit:
I just completely updated my answer. Sorry for the confusion.
So you have at least two tables:
Emergency room visits
Treatment information
Let's call these two tables [ERVisits] and [Treatments].
I suppose both tables have some id-field for the patient/member. Let's call it [MemberId].
How about this conceptual query:
WITH [FirstTreatments] AS
(
SELECT [MemberId], MIN([TreatmentDate]) AS [FirstTreatmentDate]
FROM [Treatments]
GROUP BY [MemberId]
)
SELECT V.[MemberId], T.[FirstTreatmentDate], COUNT(*) AS [ERVisitCount]
FROM [ERVisits] AS V INNER JOIN [FirstTreatments] AS T ON T.[MemberId] = V.[MemberId]
WHERE DATEDIFF(DAY, V.[VisitDate], T.[FirstTreatmentDate]) BETWEEN 0 AND 365
GROUP BY V.[MemberId], T.[FirstTreatmentDate]
This query should show the number of times a patient/member has visited the ER in the year before his/her first treatment date.
Here is a tester: https://rextester.com/UXIE4263
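For readers who want to experiment without the online tester, here is a minimal sketch of the same conceptual query using Python's built-in sqlite3 module. The table and column names follow the answer above; the sample dates and the helper name er_visits_before_first_treatment are assumptions for illustration. SQLite has no DATEDIFF, so the day difference is computed with julianday().

```python
import sqlite3


def er_visits_before_first_treatment(treatments, visits):
    """Count ER visits per member in the 365 days up to and including
    the member's first treatment date."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Treatments (MemberId INTEGER, TreatmentDate TEXT);
        CREATE TABLE ERVisits (MemberId INTEGER, VisitDate TEXT);
        """
    )
    conn.executemany("INSERT INTO Treatments VALUES (?, ?)", treatments)
    conn.executemany("INSERT INTO ERVisits VALUES (?, ?)", visits)
    rows = conn.execute(
        """
        WITH FirstTreatments AS (
            SELECT MemberId, MIN(TreatmentDate) AS FirstTreatmentDate
            FROM Treatments
            GROUP BY MemberId
        )
        SELECT V.MemberId, T.FirstTreatmentDate, COUNT(*) AS ERVisitCount
        FROM ERVisits AS V
        JOIN FirstTreatments AS T ON T.MemberId = V.MemberId
        -- days from the visit to the first treatment, 0..365
        WHERE julianday(T.FirstTreatmentDate) - julianday(V.VisitDate)
              BETWEEN 0 AND 365
        GROUP BY V.MemberId, T.FirstTreatmentDate
        ORDER BY V.MemberId
        """
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data.
TREATMENTS = [(1, "2013-02-24"), (1, "2013-06-01"), (2, "2012-05-10")]
VISITS = [
    (1, "2012-03-01"),  # 360 days before first treatment: counted
    (1, "2012-01-15"),  # more than a year before: not counted
    (1, "2013-02-01"),  # 23 days before: counted
    (1, "2013-03-10"),  # after first treatment: not counted
    (2, "2012-04-01"),  # 39 days before: counted
]

print(er_visits_before_first_treatment(TREATMENTS, VISITS))
```

Because this is an inner join, members with no ER visits in the window do not appear in the result; a LEFT JOIN from the first-treatment CTE would be needed to report them with a count of 0.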

Calculation of weighted average counts in SQL

I have a query that I am currently using to find counts
select Name, Count(Distinct(ID)), Status, Team, Date from list
In addition to the counts, I need to calculate a goal based on weighted average of counts per status and team, for each day.
For example, if Name 1 counts are divided into 50% Status1-Team1(X) and 50% Status2-Team2(Y) yesterday, then today's goal for Name1 needs to be (X+Y)/2.
The table would look like this, with the 'Goal' field needed as the output:
What is the best way to do this in the same query?
I'm almost guessing here since you did not provide more details but maybe you want to do this:
SELECT name, status, team, data,
       (select sum(data)/(select count(*) from list where name = q.name))
FROM (SELECT Name, Count(Distinct(ID)) as data, Status, Team, Date
      FROM list
      GROUP BY Name, Status, Team, Date) as q

SQL query that calculates historical average and checks if current value is greater multiple than 3

I am trying to calculate the average since the last time stamp and pull all records where the average is greater than 3. My current query is:
SELECT AVG(BID)/BID AS Multiple
FROM cdsData
where Multiple > 3
and SqlUnixTime > 1492225582
group by ID_BB_RT;
The table is cdsData, and the SqlUnixTime value is April 15th converted to Unix time. I want the average calculated per ID_BB_RT, as the GROUP BY shows. I'm not sure why it's failing, but the error says that the field Multiple is unknown in the WHERE clause.
I am try to calculate the average since the last time stamp and pull all records where the average is greater than 3.
I think your intention is correctly stated as follows, "I am trying to calculate the average since the last time stamp and select all rows where the average is greater than 3 times the individual bid".
In fact, a still better restatement of your objective would be, "I want to select all rows since the last time stamp, where the bid is less than 1/3rd the average bid".
For this, the steps are as follows:
1) A sub-query finds the average bid divided by 3, of rows since the last time stamp.
2) The outer query selects rows since the last time stamp, where the individual bid is < the value returned by the sub-query.
The following SQL statement does that:
SELECT BID
FROM cdsData
WHERE SqlUnixTime > 1492225582
AND BID <
(
SELECT AVG(BID) / 3
FROM cdsData
WHERE SqlUnixTime > 1492225582
)
ORDER BY BID;
1)
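SQL clauses are not evaluated in the order they are written. Logically, the WHERE clause is evaluated before the SELECT clause, so the alias Multiple for AVG(BID)/BID does not exist yet when the WHERE clause runs. In addition, WHERE filters rows before grouping, so it cannot reference an aggregate such as AVG(BID); that is what HAVING is for.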
You can try this.
SELECT AVG(BID)/BID AS Multiple
FROM cdsData
WHERE SqlUnixTime > 1492225582
GROUP BY ID_BB_RT Having (AVG(BID)/BID)>3 ;
Or
SELECT Multiple
FROM (SELECT AVG(BID)/BID AS Multiple
      FROM cdsData
      WHERE SqlUnixTime > 1492225582
      GROUP BY ID_BB_RT) X
WHERE Multiple > 3
2)
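Once you have corrected the above error, you will get one more error:
Column 'BID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
To correct this you have to add the BID column to the GROUP BY clause. Be aware that grouping by BID makes AVG(BID)/BID equal to 1 within every group, so the filter can never match. To compare each row with the average of its ID_BB_RT group, compute the average in a subquery (or with a window function) and join it back to the individual rows.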
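One way to compare each row with the average of its own group is to compute the per-group average in a derived table and join it back to the rows. The sketch below uses Python's built-in sqlite3 module purely as a test bed; the sample rows, the column types, and the helper name rows_with_high_multiple are assumptions, not the asker's data.

```python
import sqlite3

SINCE = 1492225582  # cutoff timestamp taken from the question

# Hypothetical sample rows: (ID_BB_RT, BID, SqlUnixTime)
SAMPLE_BIDS = [
    ("A", 1.0, SINCE + 10),
    ("A", 10.0, SINCE + 20),
    ("A", 10.0, SINCE + 30),
    ("A", 100.0, SINCE - 50),  # before the cutoff, ignored
    ("B", 5.0, SINCE + 10),
    ("B", 5.0, SINCE + 20),
]


def rows_with_high_multiple(rows, since, threshold=3):
    """Return (ID_BB_RT, BID, Multiple) for rows after `since` whose group
    average is more than `threshold` times the row's own BID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cdsData (ID_BB_RT TEXT, BID REAL, SqlUnixTime INTEGER)"
    )
    conn.executemany("INSERT INTO cdsData VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT d.ID_BB_RT, d.BID, g.AvgBid / d.BID AS Multiple
        FROM cdsData AS d
        JOIN (
            -- average bid per ID_BB_RT since the cutoff
            SELECT ID_BB_RT, AVG(BID) AS AvgBid
            FROM cdsData
            WHERE SqlUnixTime > ?
            GROUP BY ID_BB_RT
        ) AS g ON g.ID_BB_RT = d.ID_BB_RT
        WHERE d.SqlUnixTime > ?
          AND g.AvgBid / d.BID > ?
        ORDER BY d.ID_BB_RT, d.BID
        """,
        (since, since, threshold),
    ).fetchall()
    conn.close()
    return result


print(rows_with_high_multiple(SAMPLE_BIDS, SINCE))  # [('A', 1.0, 7.0)]
```

Group A averages 7 after the cutoff, so only its bid of 1 has a multiple above 3; group B has a multiple of 1 for every row.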

Is there a way to handle immutability that's robust and scalable?

Since BigQuery is append-only, I was thinking about stamping each record I upload with an 'effective date', similar to how PeopleSoft works, if anybody is familiar with that pattern.
Then I could issue a select statement and join on the max effective date:
select UTC_USEC_TO_MONTH(timestamp) as month, sum(amt)/100 as sales
from foo.orders as all
join (select id, max(effdt) as max_effdt from foo.orders group by id) as latest
on all.effdt = latest.max_effdt and all.id = latest.id
group by month
order by month;
Unfortunately, I believe this won't scale because of the BigQuery 'small joins' restriction, so I wanted to see if anyone else had thought about this use case.
Yes, adding a timestamp for each record (or in some cases, a flag that captures the state of a particular record) is the right approach. The small side of a BigQuery "Small Join" can actually return at least 8MB (this value is compressed on our end, so is usually 2 to 10 times larger), so for "lookup" table type subqueries, this can actually provide a lot of records.
In your case, it's not clear to me exactly what query you are trying to run. It looks like you are trying to return the most recent sales times of every individual item, and then JOIN this information with the SUM of sales amt per month of each item? Can you provide more info about the query?
It might be possible to do this all in one query. For example, in our wikipedia dataset, an example might look something like...
SELECT contributor_username,
       UTC_USEC_TO_MONTH(timestamp * 1000000) AS month,
       SUM(num_characters) AS total_characters_used
FROM [publicdata:samples.wikipedia]
WHERE contributor_username != ''
  AND contributor_username IS NOT NULL
  AND timestamp > 1133395200
  AND timestamp < 1157068800
GROUP BY contributor_username, month
ORDER BY contributor_username DESC, month DESC;
...to provide wikipedia contributions per user per month (like sales per month per item). This result is actually really large, so you would have to limit by date range.
UPDATE (based on comments below) a similar query that finds "num_characters" for the latest wikipedia revisions by contributors after a particular time...
SELECT current.contributor_username, current.num_characters
FROM
(SELECT contributor_username, num_characters, timestamp as time FROM [publicdata:samples.wikipedia] WHERE contributor_username != '' AND contributor_username IS NOT NULL)
AS current
JOIN
(SELECT contributor_username, MAX(timestamp) as time FROM [publicdata:samples.wikipedia] WHERE contributor_username != '' AND contributor_username IS NOT NULL AND timestamp > 1265073722 GROUP BY contributor_username) AS latest
ON
current.contributor_username = latest.contributor_username
AND
current.time = latest.time;
If your query requires you to first build a large aggregate (for example, you need to run essentially an accurate COUNT DISTINCT), another option is to break this query up into two queries. The first query could provide the max effective date by month along with a count and save this result as a new table. Then you could run a sum query on the resulting table.
You could also store monthly sales records in separate tables, and only query the particular table for the months you are interested in, simplifying your monthly sales summaries (this could also be a more economical use of BigQuery). When you need to find aggregates across all tables, you could run your queries with multiple tables listed after the FROM clause.
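The "keep only the latest effective-dated version of each record, then aggregate" pattern discussed in this thread can be sketched locally with Python's built-in sqlite3 module. This only illustrates the join logic and says nothing about BigQuery's join limits or performance. The order_month column, the sample rows, and the helper name monthly_sales_from_latest_versions are assumptions; amt is taken to be in cents, as the /100 in the question suggests.

```python
import sqlite3

# Hypothetical append-only rows: (id, effdt, order_month, amt in cents).
# A later effdt for the same id supersedes the earlier version.
SAMPLE_ORDERS = [
    (1, "2013-01-05", "2013-01", 1000),
    (1, "2013-01-09", "2013-01", 1250),  # correction of order 1
    (2, "2013-01-20", "2013-01", 500),
    (3, "2013-02-02", "2013-02", 700),
    (3, "2013-02-03", "2013-02", 0),     # order 3 later zeroed out
]


def monthly_sales_from_latest_versions(rows):
    """Sum amounts per month using only the newest version of each id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders "
        "(id INTEGER, effdt TEXT, order_month TEXT, amt INTEGER)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT o.order_month AS month, SUM(o.amt) / 100.0 AS sales
        FROM orders AS o
        JOIN (
            -- newest effective date per record id
            SELECT id, MAX(effdt) AS max_effdt
            FROM orders
            GROUP BY id
        ) AS latest
          ON o.id = latest.id AND o.effdt = latest.max_effdt
        GROUP BY o.order_month
        ORDER BY o.order_month
        """
    ).fetchall()
    conn.close()
    return result


print(monthly_sales_from_latest_versions(SAMPLE_ORDERS))
```

Superseded versions (the first row for order 1 and the first row for order 3) drop out of the sums because they do not match the maximum effective date for their id.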

MS Access SQL statement count usage

I am new to SQL. I was given a coursework assignment to report usage data over the last 2 months. Can someone help me with the SQL statement?
SELECT COUNT(Member_ID,Non_Member_Name) AS Pool_usage_last_2_months
FROM Use_of_pool
WHERE DATEDIFF(‘2012-04-21’,’2012-02-21’)
What I meant to do is to count the total number of member usage(member_ID) and non member usage(no ID,name only) from the last two months and then output the name and date and time,etc. on the same report. Is there any SQL statement to output that kind of information? Correction/Suggestions are welcomed.
You need a different WHERE clause. Assuming your Use_of_pool table includes a Date/Time field, date_field:
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
If date_field values can include a time component other than midnight, advance the end of the date range by one day and use a strict less-than comparison, so that all Date/Time values from Apr. 21 are captured but midnight on Apr. 22 is not:
WHERE date_field >= #2012-02-21# AND date_field < #2012-04-22#
That should restrict the rows to match what I think you want. It should offer fast performance with an index on date_field.
I'm unclear about the count(s) you want ... whether it is to be one count for all visits (both member and non-member), or separate counts for members and non-members.
Edit: If each row of the table represents a visit by one person, you can simply count the rows to determine the number of visits during your selected time frame.
SELECT Count(*) AS CountOfVisits
FROM Use_of_pool
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
Notice that each visit by the same person will contribute to CountOfVisits, which is what I think you want. If you want to know how many different people visited, we will need a different approach.
Edit2:
It sounds like you can use Member_ID and Non_Member_Name to distinguish between member and nonmember visits. Member_ID is Null for nonmembers and non-Null for members. And Non_Member_Name is Null for members and non-Null for nonmembers.
If that is true, try this query to count member and nonmember visits separately.
SELECT
Sum(IIf(Member_ID Is Not Null, 1, 0)) AS member_visits,
Sum(IIf(Non_Member_Name Is Not Null, 1, 0)) AS non_member_visits
FROM Use_of_pool
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
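The conditional-sum idea can be checked outside Access with a minimal sketch in Python's built-in sqlite3 module. This is an illustration under assumptions: the sample rows and names ("Alex", "Sam"), the text date format, and the helper name count_pool_visits are hypothetical. CASE replaces Access's IIf for portability, and the date range is half-open so that visits at any time on the last day are included.

```python
import sqlite3

# Hypothetical rows: (Member_ID, Non_Member_Name, date_field)
SAMPLE_VISITS = [
    (1, None, "2012-02-21 10:00:00"),
    (1, None, "2012-03-15 09:30:00"),
    (2, None, "2012-04-21 17:45:00"),   # late on the last day: counted
    (None, "Alex", "2012-04-01 12:00:00"),
    (None, "Sam", "2012-01-05 08:00:00"),  # before the range
    (3, None, "2012-04-22 07:00:00"),      # after the range
]


def count_pool_visits(rows, start, end_exclusive):
    """Return (member_visits, non_member_visits, distinct_members)
    for visits with start <= date_field < end_exclusive."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Use_of_pool "
        "(Member_ID INTEGER, Non_Member_Name TEXT, date_field TEXT)"
    )
    conn.executemany("INSERT INTO Use_of_pool VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT
            COALESCE(SUM(CASE WHEN Member_ID IS NOT NULL THEN 1 ELSE 0 END), 0),
            COALESCE(SUM(CASE WHEN Non_Member_Name IS NOT NULL THEN 1 ELSE 0 END), 0),
            COUNT(DISTINCT Member_ID)  -- NULL ids are not counted
        FROM Use_of_pool
        WHERE date_field >= ? AND date_field < ?
        """,
        (start, end_exclusive),
    ).fetchone()
    conn.close()
    return result


print(count_pool_visits(SAMPLE_VISITS, "2012-02-21", "2012-04-22"))  # (3, 1, 2)
```

The third value shows the difference between counting visits and counting distinct people: member 1 visited twice but is counted once.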
Aggregate functions of SQL use all the data in a column (more precisely, all the data your WHERE clause selects) to produce a single datum. COUNT gives you the number of data rows that matched your WHERE clause. So for example:
SELECT COUNT(*) AS Non_members FROM Use_of_pool WHERE Member_ID IS NULL
will give you the number of times the pool was used by a non-member, and
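SELECT COUNT(*) AS Members FROM (SELECT DISTINCT Member_ID FROM Use_of_pool WHERE Member_ID IS NOT NULL) AS m
will give you the number of members who have used the pool at least once. Access SQL does not support COUNT(DISTINCT ...), so the inner SELECT DISTINCT removes the duplicates before the outer query counts the rows.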
You can expand the WHERE clause to further specify what you want to count. If "last two months" means the current and previous calendar month, you'll need:
... WHERE DateDiff("m",Date_field,Date())<=1
If it means a rolling 2-month period, I'd approximate that with 60 days and say
... WHERE DateDiff("d",Date_field,Date())<60
(Replace Date_field with the name of the field containing the date.)
If you want to count rows according to multiple different criteria, or output both aggregate data and individual data, you'll be best off using separate SELECT statements.