I am new to SQL. I was given a coursework to report data of usage over the last 2 month. Can someone help me with the SQL statement?
SELECT COUNT(Member_ID,Non_Member_Name) AS Pool_usage_last_2_months
FROM Use_of_pool
WHERE DATEDIFF(‘2012-04-21’,’2012-02-21’)
What I meant to do is to count the total number of member usage(member_ID) and non member usage(no ID,name only) from the last two months and then output the name and date and time,etc. on the same report. Is there any SQL statement to output that kind of information? Correction/Suggestions are welcomed.
You need a different WHERE clause. Assuming your Use_of_pool table includes a Date/Time field, date_field:
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
If date_field values can include a time component other than midnight, advance the end date range by one day to capture all the possible Date/Time values from Apr. 21:
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-22#
That should restrict the rows to match what I think you want. It should offer fast performance with an index on date_field.
I'm unclear about the count(s) you want ... whether it is to be one count for all visits (both member and non-member), or separate counts for members and non-members.
Edit: If each row of the table represents a visit by one person, you can simply count the rows to determine the number of visits during your selected time frame.
SELECT Count(*) AS CountOfVisits
FROM Use_of_pool
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
Notice each visit by the same person will contribute to CountOfVisits, which is what I think you want. If you wanted to know how many different people visited, we will need a different approach.
Edit2:
It sounds like you can use Member_ID and Non_Member_Name to distinguish between member and nonmember visits. Member_ID is Null for nonmembers and non-Null for members. And Non_Member_Name is Null for members and non-Null for nonmembers.
If that is true, try this query to count member and nonmember visits separately.
SELECT
Sum(IIf(Member_ID Is Not Null, 1, 0)) AS member_visits,
Sum(IIf(Non_Member_Name Is Not Null, 1, 0)) AS non_member_visits
FROM Use_of_pool
WHERE date_field >= #2012-02-21# AND date_field <= #2012-04-21#
Aggregate functions of SQL use all the data in a column (more precisely, all the data your WHERE clause selects) to produce a single datum. COUNT gives you the number of data rows that matched your WHERE clause. So for example:
SELECT COUNT(*) AS Non_members FROM Use_of_pool WHERE Member_ID IS NULL
will give you the number of times the pool was used by a non-member, and
SELECT COUNT(DISTINCT Member_ID) AS Members FROM Use_of_pool
will give you the number of members who have used the pool at least once (the DISTINCT tells the database engine to ignore duplicates when counting).
You can expand the WHERE clause to further specify what you want to count. If "last two months" means the current and previous calendar month, you'll need:
... WHERE DateDiff("m",Date_field,Date())<=1
If it means a rolling 2-month period, I'd approximate that with 60 days and say
... WHERE DateDiff("d",Date_field,Date())<60
(Replace Date_field with the name of the field containing the date.)
If you want to count rows according to multiple different criteria, or output both aggregate data and individual data, you'll be best off using separate SELECT statements.
Related
I have been given a task by my manager to write a SQL query to select the max number of counts (no of records) for a user who has travelled the most within a month provided that if the user travels multiple places on the same date, then it should be counted as one. For instance, if you look at the following table design; according to this scenario, my query must return me a count of 2. Although traveller_id "1" has traveled three times within a month, but he traveled to Thailand and USA on the same date, that is why its count is reduced to 2.
I have also developed my logic for this query but I am unable to write it due to lack of syntax knowledge. I split up this query into 3 parts:
Select All records from the table within a month using the MONTH function of SQL
Select All distinct DateTime records from the above result so that the same DateTime gets eliminated.
Select max number of counts for the traveller who visited most places.
Please help me in completing my query. You can also use a different approach from mine.
You can use the count aggregation in a cte then select top(1):
with u as
(select traveller_id,
count(distinct visit_date) as n
from travellers_log
where visit_date between '2022-03-01' and '2022-03-31'
group by traveller_id)
select top(1) traveller_id, name, n from u inner join table_travellers
on u.traveller_id = table_travellers.id
order by n desc;
I am try to calculate the average since the last time stamp and pull all records where the average is greater than 3. My current query is:
SELECT AVG(BID)/BID AS Multiple
FROM cdsData
where Multiple > 3
and SqlUnixTime > 1492225582
group by ID_BB_RT;
I have a table cdsData and the unix time is april 15th converted. Finally I want the group by calculated within the ID as I show. I'm not sure why it's failing but it says that the field Multiple is unknown in the where clause.
I am try to calculate the average since the last time stamp and pull all records where the average is greater than 3.
I think your intention is correctly stated as follows, "I am trying to calculate the average since the last time stamp and select all rows where the average is greater than 3 times the individual bid".
In fact, a still better restatement of your objective would be, "I want to select all rows since the last time stamp, where the bid is less than 1/3rd the average bid".
For this, the steps are as follows:
1) A sub-query finds the average bid divided by 3, of rows since the last time stamp.
2) The outer query selects rows since the last time stamp, where the individual bid is < the value returned by the sub-query.
The following SQL statement does that:
SELECT BID
FROM cdsData
WHERE SqlUnixTime > 1492225582
AND BID <
(
SELECT AVG(BID) / 3
FROM cdsData
WHERE SqlUnixTime > 1492225582
)
ORDER BY BID;
1)
SQL is evaluated backwards, from right to left. So the where clause is parsed and evaluate prior to the select clause. Because of this the aliasing of AVG(BID)/BID to Multiple has not yet occurred.
You can try this.
SELECT AVG(BID)/BID AS Multiple
FROM cdsData
WHERE SqlUnixTime > 1492225582
GROUP BY ID_BB_RT Having (AVG(BID)/BID)>3 ;
Or
Select Multiple
From (SELECT AVG(BID)/BID AS Multiple
FROM cdsData
Where SqlUnixTime > 1492225582 group by ID_BB_R)X
Where multiple >3
2)
Once you corrected the above error, you will be having one more error:
Column 'BID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
To correct this you have to insert BID column in group by clause.
I'm a receptionist keeping track of incoming calls in MS-Access 2010. The table has Date column. I can get count of calls per day but am having trouble with SQL to get average calls per day.
Assuming your table has one record per call, you can use a query like this, just replace the table and field names:
SELECT Avg(TotalCalls.DailyCalls) AS AverageCalls
FROM
(
SELECT MyTable.MyDateField, Count(MyTable.MyDateField) AS DailyCalls
FROM MyTable
WHERE MyDate > #1-Feb-2017# AND MyDate <= #28-Feb-2017#
GROUP BY MyTable.MyDateField
) AS TotalCalls
This won't take into account days that have no calls, just the ones that do. The WHERE clause is optional, but you might want to use that to pick a specific date range.
Since bigquery is append-only, I was thinking about stamping each record I upload to it with an 'effective date' similar to how peoplesoft works, if anybody is familiar with that pattern.
Then, I could issue a select statement and join on the max effective date
select UTC_USEC_TO_MONTH(timestamp) as month, sum(amt)/100 as sales
from foo.orders as all
join (select id, max(effdt) as max_effdt from foo.orders group by id) as latest
on all.effdt = latest.max_effdt and all.id = latest.id
group by month
order by month;
Unfortunately, I believe this won't scale because of the big query 'small joins' restriction, so I wanted to see if anyone else had thought around this use case.
Yes, adding a timestamp for each record (or in some cases, a flag that captures the state of a particular record) is the right approach. The small side of a BigQuery "Small Join" can actually return at least 8MB (this value is compressed on our end, so is usually 2 to 10 times larger), so for "lookup" table type subqueries, this can actually provide a lot of records.
In your case, it's not clear to me what the exact query you are trying to run is.. it looks like you are trying to return the most recent sales times of every individual item - and then JOIN this information with the SUM of sales amt per month of each item? Can you provide more info about the query?
It might be possible to do this all in one query. For example, in our wikipedia dataset, an example might look something like...
SELECT contributor_username, UTC_USEC_TO_MONTH(timestamp * 1000000) as month,
SUM(num_characters) as total_characters_used FROM
[publicdata:samples.wikipedia] WHERE (contributor_username != '' or
contributor_username IS NOT NULL) AND timestamp > 1133395200
AND timestamp < 1157068800 GROUP BY contributor_username, month
ORDER BY contributor_username DESC, month DESC;
...to provide wikipedia contributions per user per month (like sales per month per item). This result is actually really large, so you would have to limit by date range.
UPDATE (based on comments below) a similar query that finds "num_characters" for the latest wikipedia revisions by contributors after a particular time...
SELECT current.contributor_username, current.num_characters
FROM
(SELECT contributor_username, num_characters, timestamp as time FROM [publicdata:samples.wikipedia] WHERE contributor_username != '' AND contributor_username IS NOT NULL)
AS current
JOIN
(SELECT contributor_username, MAX(timestamp) as time FROM [publicdata:samples.wikipedia] WHERE contributor_username != '' AND contributor_username IS NOT NULL AND timestamp > 1265073722 GROUP BY contributor_username) AS latest
ON
current.contributor_username = latest.contributor_username
AND
current.time = latest.time;
If your query requires you to use first build a large aggregate (for example, you need to run essentially an accurate COUNT DISTINCT) another option is to break this query up into two queries. The first query could provide the max effective date by month along with a count and save this result as a new table. Then, could run a sum query on the resulting table.
You could also store monthly sales records in separate tables, and only query the particular table for the months you are interested in, simplifying your monthly sales summaries (this could also be a more economical use of BigQuery). When you need to find aggregates across all tables, you could run your queries with multiple tables listed after the FROM clause.
I have an SQLite database with the following fields for example:
date (yyyymmdd fomrat)
total (0.00 format)
There is typically 2 months of records in the database. Does anyone know a SQL query to find a weekly average?
I could easily just execute:
SELECT COUNT(1) as total_records, SUM(total) as total FROM stats_adsense
Then just divide total by 7 but unless there is exactly x days that are divisible by 7 in the db I don't think it will be very accurate, especially if there is less than 7 days of records.
To get a daily summary it's obviously just total / total_records.
Can anyone help me out with this?
You could try something like this:
SELECT strftime('%W', thedate) theweek, avg(total) theaverage
FROM table GROUP BY strftime('%W', thedate)
I'm not sure how the syntax would work in SQLite, but one way would be to parse out the date parts of each [date] field, and then specifying which WEEK and DAY boundaries in your WHERE clause and then GROUP by the week. This will give you a true average regardless of whether there are rows or not.
Something like this (using T-SQL):
SELECT DATEPART(w, theDate), Avg(theAmount) as Average
FROM Table
GROUP BY DATEPART(w, theDate)
This will return a row for every week. You could filter it in your WHERE clause to restrict it to a given date range.
Hope this helps.
Your weekly average is
daily * 7
Obviously this doesn't take in to account specific weeks, but you can get that by narrowing the result set in a date range.
You'll have to omit those records in the addition which don't belong to a full week. So, prior to summing up, you'll have to find the min and max of the dates, manipulate them such that they form "whole" weeks, and then run your original query with a WHERE that limits the date values according to the new range. Maybe you can even put all this into one query. I'll leave that up to you. ;-)
Those values which are "truncated" are not used then, obviously. If there's not enough values for a week at all, there's no result at all. But there's no solution to that, apparently.