SQL - how to include average of a column in a report

Here's the scenario:
I have an Access project with tables consisting of the following data:
Activity table: ActivityName, ActivityPopularityRating
Volunteer table: VolunteerName, VolunteerRating
Each activity can have many volunteers and each volunteer has a rating.
I have to create a report which indicates, for each activity, the names and ratings of the volunteers taking part in that activity, as well as the average VolunteerRating of those volunteers. An example is attached.
I have created the SQL query but I am not sure if I should generate the average value needed in the query, or if there is some function in Access that would allow me to do that in the report.
Here is my Query:
SELECT Activity.ActivityName,
Activity.ActivityPopularityRating,
StudentVolunteer.VolunteerName,
StudentVolunteer.VolunteerRating,
AVG(StudentVolunteer.VolunteerRating)
FROM Activity
INNER JOIN StudentVolunteer ON Activity.ActivityName = StudentVolunteer.ActivityName
GROUP BY Activity.ActivityName
All help is appreciated
Thanks

To add a total line (here showing the average rating) to your results, you would use UNION ALL in MS Access (or ROLLUP in another DBMS). However, it is not necessary to add such a line to your query. The data is already there: the average can easily be calculated from the selected ratings.
So remove GROUP BY and the AVG line from your query and add AVG(VolunteerRating) in your report layer instead. The report will then calculate the average from the ratings.
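As a rough illustration of computing the average in the report layer rather than in the query, here is a minimal sketch. It uses SQLite through Python as a stand-in for Access, and the sample rows and the helper names activity_report and demo_connection are invented for the example. The query returns only detail rows; the grouping and averaging happen afterwards.

```python
import sqlite3
from itertools import groupby
from statistics import mean


def activity_report(conn):
    """Return {activity: (detail_rows, average_rating)} built from ungrouped rows."""
    rows = conn.execute(
        """
        SELECT a.ActivityName, a.ActivityPopularityRating,
               v.VolunteerName, v.VolunteerRating
        FROM Activity AS a
        INNER JOIN StudentVolunteer AS v ON a.ActivityName = v.ActivityName
        ORDER BY a.ActivityName, v.VolunteerName
        """
    ).fetchall()
    report = {}
    # groupby needs rows sorted by the key; the ORDER BY above provides that.
    for activity, group in groupby(rows, key=lambda r: r[0]):
        details = [(r[2], r[3]) for r in group]
        # The "report layer" computes the average from the detail rows.
        report[activity] = (details, mean(rating for _, rating in details))
    return report


def demo_connection():
    """Build an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Activity (ActivityName TEXT PRIMARY KEY,
                               ActivityPopularityRating INTEGER);
        CREATE TABLE StudentVolunteer (VolunteerName TEXT,
                                       VolunteerRating INTEGER,
                                       ActivityName TEXT);
        INSERT INTO Activity VALUES ('Gardening', 4), ('Tutoring', 5);
        INSERT INTO StudentVolunteer VALUES
            ('Ann', 3, 'Gardening'), ('Bob', 5, 'Gardening'), ('Cy', 4, 'Tutoring');
    """)
    return conn


for activity, (details, avg) in activity_report(demo_connection()).items():
    print(activity, details, "average:", avg)
```

In an Access report the same effect would normally come from a calculated control in the group footer rather than from code; the sketch only shows that the detail rows are sufficient to derive the average.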

Related

How to show the combinations of a column in SQL along with the aggregated results?

The question is on SQL:
Create a summary table to show how many customers use different GO-JEK services on a daily basis, along with the combination of services used (please see the screenshot below for more details).
Several conditions for this task:
Take only order_status = “Completed”
No repetition in the details of order_type
You can combine order types in any order, but every combination is allowed only once. E.g. equivalent combinations like RIDE, CAR, SEND and CAR, RIDE, SEND are unacceptable
Group each order_payment and its combination, i.e. aggregations by CASH, GOPAY, CASH&GOPAY (ALL)
Use date in UTC timezone
Dataset sample: link
I'm not sure how to get the order_type combinations. Please suggest an approach.

How to create an aggregate table (data mart) that will improve chart performance?

I created a table named user_preferences where user preferences have been grouped by user_id and month.
Table:
Each month I collect all user_ids and assign all preferences:
city
district
number of rooms
the maximum price they can spend
The plan assumes displaying a graph showing users' shopping intentions like this:
The blue line is the number of interested users for the selected values in the filters.
The graph should enable filtering by parameters marked in red.
What you see above is a simplified form for clarifying the subject. In fact, there are many more users. Every month, the table grows by several hundred thousand records. The SQL query that feeds the chart takes up to 50 seconds. That is far too long; I can't afford it.
So, I need to create a table (table/aggregation/data mart) into which I can insert the precalculated number of interested users for all combinations. That way, the end user will not have to wait for the data to be counted.
Details below:
Now the question is - how to create such a table in PostgreSQL?
I know how to write a SQL query that will calculate a specific example.
SELECT
month,
count(DISTINCT user_id) interested_users
FROM
user_preferences
WHERE
month BETWEEN '2020-01' AND '2020-03'
AND city = 'Madrid'
AND district = 'Latina'
AND rooms IN (1,2)
AND price_max BETWEEN 400001 AND 500000
GROUP BY
1
The question is - how to calculate all possible combinations? Can I write multiple nested loops in SQL?
The topic is extremely important to me, I think it will also be useful to others for the future.
I will be extremely grateful for any tips.
Well, based on your query, you have the following filters:
month
city
district
rooms
price_max
You can try creating a view with the following structure:
SELECT month
,city
,district
,rooms
,price_max
,count(DISTINCT user_id) AS interested_users
FROM user_preferences
GROUP BY month
,city
,district
,rooms
,price_max
You can make this view materialized, so the query behind the view is not executed each time the view is queried; it behaves like a table.
When you add new records to the base table, you will need to refresh the view (unfortunately, PostgreSQL does not refresh materialized views automatically, unlike some other databases):
REFRESH MATERIALIZED VIEW my_view;
Alternatively, you can schedule a task to do the refresh.
If you only filter each field by an exact value, this will work. But your example uses range and list criteria such as:
month BETWEEN '2020-01' AND '2020-03'
AND rooms IN (1,2)
AND price_max BETWEEN 400001 AND 500000
In such cases, I usually write the same query but SUM the counts from the materialized view. In your case, however, you are using COUNT(DISTINCT ...), and summing per-group distinct counts may count the same user multiple times when that user has rows in more than one of the selected groups.
If this is an issue, you would need to precalculate far too many combinations, and I doubt that is the answer. Alternatively, you can try to normalize your data; this should improve the performance of the aggregations.
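To make the double-counting risk concrete, here is a small sketch. It uses SQLite through Python rather than PostgreSQL, with a plain table (the hypothetical name preference_counts) standing in for the materialized view and three invented rows. User 1 has preferences for both 1-room and 2-room flats, so summing the pre-aggregated counts gives a larger number than counting distinct users on the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_preferences (
        user_id INTEGER, month TEXT, city TEXT, district TEXT,
        rooms INTEGER, price_max INTEGER
    );
    -- Hypothetical sample: user 1 is interested in both 1- and 2-room flats.
    INSERT INTO user_preferences VALUES
        (1, '2020-01', 'Madrid', 'Latina', 1, 450000),
        (1, '2020-01', 'Madrid', 'Latina', 2, 450000),
        (2, '2020-01', 'Madrid', 'Latina', 2, 480000);
    -- Plain table standing in for the materialized view.
    CREATE TABLE preference_counts AS
        SELECT month, city, district, rooms, price_max,
               COUNT(DISTINCT user_id) AS interested_users
        FROM user_preferences
        GROUP BY month, city, district, rooms, price_max;
""")

# The filter from the question, applied to both sources.
FILTER = """
    WHERE month BETWEEN '2020-01' AND '2020-03'
      AND city = 'Madrid' AND district = 'Latina'
      AND rooms IN (1, 2)
      AND price_max BETWEEN 400001 AND 500000
"""

# Exact answer: distinct users counted on the base table.
exact = conn.execute(
    "SELECT month, COUNT(DISTINCT user_id) FROM user_preferences"
    + FILTER + "GROUP BY month"
).fetchall()

# Shortcut: sum the pre-aggregated counts (user 1 is counted once per group).
summed = conn.execute(
    "SELECT month, SUM(interested_users) FROM preference_counts"
    + FILTER + "GROUP BY month"
).fetchall()

print("exact:", exact)
print("summed:", summed)
```

With this data the two results differ for January 2020, which is the effect described above: the shortcut is only safe when each user can fall into at most one of the summed groups.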

Calculating the difference in days from two records in an Access Database

I am creating an Access Database from a very complex Excel Spreadsheet. The process has been going well until I got to this problem. The solution is easy in Excel, but I cannot figure out how to do it in Access.
Here is what I had before in Excel.
I had a list of Customers on one sheet with multiple fields. I then had another sheet act as a report that would run a VBA macro to search through the table of all customers and list out every customer by name that was an inbound call from our contact center (Que Call), when that call came and then would calculate a third column for the number of days between calls. This last column is where I am running into difficulties translating to Access. In Excel, I would just have it do something like in cell C3 =SUM(B3-B2). Given that the table looked like this:
        Column A     Column B    Column C
Row 1   Name         Date        Time Lapse
Row 2   Customer 1   7/1/2019    ----------
Row 3   Customer 2   7/2/2019    =SUM(B3-B2)  <-- 1 day
Row 4   Customer 3   7/4/2019    =SUM(B4-B3)  <-- 2 days
In Access:
I have a report that goes through my table of customers and lists only those from our contact center (Que Call), but I can't figure out how to calculate the time between calls, as the design only allows me to work with one row at a time. How do I make this calculation? Is it a SQL query that I need? I would prefer not to have a separate table for call center calls or a separate column in my customers table for this, as some customers are not from the call center. Can I just run a report or a query? Any advice or help would be greatly appreciated.
Current SQL Code:
SELECT
[Customers].FullName,
[Customers].ID,
[Customers].QueCall,
[Customers].Status,
[Customers].InterestLevel,
[Customers].State,
[Customers].Product,
[Customers].Created,
[Customers].LastContact,
[Customers].PrimaryNote
FROM
Customers
WHERE
((([Customers].QueCall)=True))
ORDER BY
[Customers].Created;
Describe exactly how it isn't working (error message, unexpected results, etc...)
It just lists the customers and does not allow me to calculate the difference between when the records were created (i.e. when they were first contacted). I have found many things online about how to calculate the difference between two columns of the same record, but not between two different records, let alone two records that may not be adjacent, as there may be other non-Que Call customers between them in the customer table.
Describe the desired results
I would like to have a column in the final report that shows how many days elapsed between records that were Que Calls.
Thank you in advance for any input that you may have.
Consider a correlated aggregate subquery in which an inner query on the same source, Customers, is correlated with the outer query by a date comparison on the Created field, so that each row looks up the most recent earlier Que Call record. ID is assumed to be a unique identifier per customer, so the correlation is on the date and the QueCall flag, not on ID. Note the table aliases c and sub used for the correlation. Grouping by sub.Created keeps the subquery to one row per date, so TOP 1 returns a single value even when several records share a date.
Use DateDiff for the difference between dates. To use it, paste the query below into the SQL view of the Query Designer and save it; the saved query can then serve as a record source for forms and reports, be opened on its own, or be used in application code as a recordset.
SELECT
c.FullName,
c.ID,
c.QueCall,
c.Status,
c.InterestLevel,
c.State,
c.Product,
c.Created,
c.LastContact,
c.PrimaryNote,
(SELECT TOP 1 Max(DateDiff("d", sub.Created, c.Created))
FROM Customers sub
WHERE sub.QueCall = True
AND sub.Created < c.Created
GROUP BY sub.Created
ORDER BY sub.Created DESC) AS TimeElapsed
FROM
Customers c
WHERE
(((c.QueCall)=True))
ORDER BY
c.Created;
Be aware that on large tables this correlated subquery can be slow, since it runs once per outer row. Allow time for it to complete, and consider storing the output in a temporary table with a Make-Table Query to avoid re-running it.
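The previous-record lookup can also be tried outside Access. The following is a minimal sketch in SQLite through Python, not Access SQL: julianday stands in for DateDiff, the dates are stored as ISO text, and the sample rows (including the non-Que Call "Walk-in" row) are invented to mirror the spreadsheet example in the question. As a variant, it uses MAX on the earlier dates instead of TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (ID INTEGER PRIMARY KEY, FullName TEXT,
                            QueCall INTEGER, Created TEXT);
    -- Hypothetical rows; the walk-in customer is not a Que Call.
    INSERT INTO Customers VALUES
        (1, 'Customer 1', 1, '2019-07-01'),
        (2, 'Walk-in',    0, '2019-07-01'),
        (3, 'Customer 2', 1, '2019-07-02'),
        (4, 'Customer 3', 1, '2019-07-04');
""")

rows = conn.execute("""
    SELECT c.FullName, c.Created,
           -- Days since the latest earlier Que Call; NULL for the first one.
           CAST(julianday(c.Created) - julianday(
               (SELECT MAX(sub.Created) FROM Customers AS sub
                WHERE sub.QueCall = 1 AND sub.Created < c.Created)
           ) AS INTEGER) AS TimeElapsed
    FROM Customers AS c
    WHERE c.QueCall = 1
    ORDER BY c.Created
""").fetchall()

for row in rows:
    print(row)
```

The first Que Call has no predecessor, so its TimeElapsed comes back as NULL (None in Python), matching the empty cell in the spreadsheet.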

Count Distinct Records and

I have a very large database and I need to extract information from 3 columns:
I am trying to determine how many unique customers the user is processing.
Username. This is unique name.
CustomerNumber. A customer number will appear on many lines, as they could have ordered many products and each product is a line.
Date Range. I need to be able to define a date range.
The code I have tried and found counts the customer numbers, but not just the distinct customer numbers.
I have not tried the date range as yet.
I have attached 2 images to show an example of the database and the end result. We used a pivot table to produce this result, but the data changes all the time and we don't want to recreate the pivot table every time.
Image of Sample Data in Excel:
Image of Required Final Result
SELECT `'All data$'`.Username, Count(`'All data$'`.CustomerNumber)
FROM `C:\Users\rhynto\Desktop\Darren Qwix\QWIX_PICKED.xlsx`.`'All data$'` `'All data$'`
GROUP BY `'All data$'`.Username
I will appreciate any advice on this please.

Create Select distinct query with criteria of having the latest date

I have been struggling to create a query in Access that selects a distinct field, restricted to the newest entry in the database.
Here's a brief summary of what my table consists of. I have a table with surveying data collected from 2007 to the present. There is a field with the survey mark's name, along with the corresponding adjustment data, which includes a field for the adjustment date. Many of the marks have been occupied multiple times, and I only want to retrieve the most recent occupation information.
Roughly, I want to
SELECT DISTINCT STATUS_POINT_DESIGNATION
FROM __ALL_ADJUSTMENTS
WHERE [__ALL_ADJUSTMENTS]![ADJ_DATE]=MAX(ADJ_DATE)
I seem to be confused about how to combine selecting a distinct value with a constraint. Any suggestions?
DH
It seems you could get the latest observation date for each survey point with an aggregate query:
SELECT STATUS_POINT_DESIGNATION, Max(ADJ_DATE) AS LatestDate, Count(STATUS_POINT_DESIGNATION) AS Observations
FROM __ALL_ADJUSTMENTS
GROUP BY STATUS_POINT_DESIGNATION;
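The aggregate query above returns only the latest date per mark. If the other fields of that most recent occupation are also needed, one common approach is to join the grouped result back to the table. The following is a sketch in SQLite through Python rather than Access; the ELEVATION column and the sample rows are hypothetical, introduced only to have something to retrieve.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE __ALL_ADJUSTMENTS (
        STATUS_POINT_DESIGNATION TEXT, ADJ_DATE TEXT, ELEVATION REAL);
    -- Hypothetical data: MARK_A was occupied twice, MARK_B once.
    INSERT INTO __ALL_ADJUSTMENTS VALUES
        ('MARK_A', '2007-05-01', 10.10),
        ('MARK_A', '2012-08-15', 10.12),
        ('MARK_B', '2009-03-20', 55.70);
""")

latest = conn.execute("""
    SELECT a.STATUS_POINT_DESIGNATION, a.ADJ_DATE, a.ELEVATION
    FROM __ALL_ADJUSTMENTS AS a
    INNER JOIN (
        -- Latest adjustment date for each survey mark.
        SELECT STATUS_POINT_DESIGNATION, MAX(ADJ_DATE) AS LatestDate
        FROM __ALL_ADJUSTMENTS
        GROUP BY STATUS_POINT_DESIGNATION
    ) AS m
      ON a.STATUS_POINT_DESIGNATION = m.STATUS_POINT_DESIGNATION
     AND a.ADJ_DATE = m.LatestDate
    ORDER BY a.STATUS_POINT_DESIGNATION
""").fetchall()

for row in latest:
    print(row)
```

If a mark has two rows with the same latest date, both rows are returned, so a further tie-breaker would be needed in that case.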