I'm trying to calculate the 90th percentile on a column of my data. MS Access doesn't have a PERCENTILE function, so I'm taking the top 100 values (of 1000 in total), and then taking the minimum of the values that are returned. I'm however having some difficulty using the MS Access MIN() function. My query currently looks as follows:
SELECT MIN(*)
FROM (SELECT TOP 100 ([table1].[field1] + [table1].[field2]) FROM [table1]);
This gives me the error:
Syntax error (missing operator) in query expression 'MIN(*'.
Why am I not allowed to use the asterisk with the MIN function? Am I calculating this value completely incorrectly?
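First, TOP needs an ORDER BY: without one, Access returns an arbitrary set of rows rather than the largest values.
Second, MIN() takes an expression, not *, so give the computed column a name and aggregate that. Taking the top 10 percent in descending order and then the minimum of those rows gives the 90th percentile:
SELECT MIN(val)
FROM (SELECT TOP 10 PERCENT ([table1].[field1] + [table1].[field2]) as val
FROM [table1]
ORDER BY ([table1].[field1] + [table1].[field2]) DESC
) as t;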
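For anyone who wants to check the "top rows, then MIN" idea on known data, here is a rough sketch using Python's built-in sqlite3 module rather than Access. The data is made up (the sums run from 1 to 1000), and SQLite writes LIMIT 100 where Access writes TOP 100, so treat it as an illustration of the technique and not as Access syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (field1 INTEGER, field2 INTEGER)")
# Hypothetical data: field1 + field2 takes each value 1..1000 exactly once.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(i, 0) for i in range(1, 1001)],
)

# Keep the 100 largest sums (the top 10% of 1000 rows), then take the
# smallest of them. The derived column needs an alias so MIN() can name it.
query = """
    SELECT MIN(val)
    FROM (SELECT field1 + field2 AS val
          FROM table1
          ORDER BY val DESC
          LIMIT 100) AS t
"""
p90 = conn.execute(query).fetchone()[0]
print(p90)  # 901, the smallest of the 100 largest values
```

Without the ORDER BY in the subquery, the 100 rows kept would be arbitrary and the result would not be a percentile at all.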
Apologies if this has been asked. For what I thought was a relatively simple question, I couldn't seem to find the answer in my searches.
linesInserted and linesDeleted are two columns in the table. I'm trying to return rows where linesInserted + linesDeleted >= 200. The query I have is
SELECT *, (linesInserted + linesDeleted) as total
FROM table
WHERE total >= 200
GROUP BY id
This doesn't work as I'm getting an error saying:
Unknown column 'TOTAL' in where clause
I'm using RMySQL for those curious.
If you sum over linesInserted + linesDeleted then there would be no problem:
SELECT id, sum(linesInserted + linesDeleted) as total
FROM table
GROUP BY id
having total >= 200
Try the following:
SELECT id, sum(linesInserted + linesDeleted) as total
FROM table
GROUP BY id
having sum(linesInserted + linesDeleted)>=200
SUM is an aggregate function, so its result can't be tested in WHERE, which filters individual rows before they are grouped. Use HAVING, which filters after aggregation, to keep only the groups whose total is at least 200.
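To make the WHERE versus HAVING distinction concrete, here is a small sketch in Python's sqlite3 (not MySQL). The table name changes and its rows are invented for the example. It shows two different filters: a per-row test, where the expression is repeated in WHERE instead of using the alias, and a per-id test on the summed value in HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "changes" is a hypothetical table name; the rows are made-up sample data.
conn.execute(
    "CREATE TABLE changes (id INTEGER, linesInserted INTEGER, linesDeleted INTEGER)"
)
conn.executemany(
    "INSERT INTO changes VALUES (?, ?, ?)",
    [(1, 150, 60), (1, 10, 5), (2, 90, 80), (2, 70, 40), (3, 20, 10)],
)

# Per-row filter: repeat the expression in WHERE rather than the alias.
row_level = conn.execute("""
    SELECT id, linesInserted + linesDeleted AS total
    FROM changes
    WHERE linesInserted + linesDeleted >= 200
    ORDER BY id
""").fetchall()

# Per-id filter: aggregate first, then test the sum in HAVING.
per_id = conn.execute("""
    SELECT id, SUM(linesInserted + linesDeleted) AS total
    FROM changes
    GROUP BY id
    HAVING SUM(linesInserted + linesDeleted) >= 200
    ORDER BY id
""").fetchall()

print(row_level)  # [(1, 210)]
print(per_id)     # [(1, 225), (2, 280)]
```

The two queries answer different questions, so which one is right depends on whether the 200 threshold is meant per row or per id.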
I have a database function that I am able to get the correct results from, but when I try to use SUM to total the results it returns a much higher value.
SELECT
SUM([DATABASE].[dbo].[fn_GetCharges]([TABLE1_DATE],[TABLE1_CUST],[TABLE1_SITE],[TABLE1_SERV])) AS [GROSS_REVENUE]
FROM [DATABASE].[dbo].[TABLE1]
WHERE
[TABLE1_ROUT] = '1234'
AND [TABLE1_DATE] = '2018-05-08'
This returns a SUM value of about 15,740.
If I remove the SUM and GROUP BY the function then it shows each of the returned values, which I manually totalled to about 750.
I'm using Microsoft SQL Server 2014.
We don't know what is in your function, but you could probably output your raw data first and then sum it next.
;WITH cte AS (
SELECT
[DATABASE].[dbo].[fn_GetCharges]([TABLE1_DATE],[TABLE1_CUST],[TABLE1_SITE],[TABLE1_SERV]) AS [GROSS_REVENUE]
FROM [DATABASE].[dbo].[TABLE1]
WHERE
[TABLE1_ROUT] = '1234'
AND [TABLE1_DATE] = '2018-05-08'
GROUP BY whatever_you_grouped_by_that_gave_correct_results
)
SELECT SUM(gross_revenue) AS gross_revenue
FROM cte
Wondering if you used group by clause in the query since I don't see one posted.
Group By (columns) Having (condition). What you specified in the 'where' goes in the 'Having' clause
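One possible explanation, which nothing in the question confirms, is that grouping by the function's result collapses rows that return the same value, so a manual total of the grouped output undercounts what SUM adds up. The sketch below uses Python's sqlite3 with an invented charges table, where the charge column stands in for the fn_GetCharges result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in: "charge" plays the role of the fn_GetCharges result.
conn.execute("CREATE TABLE charges (cust TEXT, charge REAL)")
conn.executemany(
    "INSERT INTO charges VALUES (?, ?)",
    [("a", 25.0), ("b", 25.0), ("c", 25.0), ("d", 40.0), ("e", 40.0)],
)

# Grouping by the value returns one row per distinct value...
distinct_values = conn.execute(
    "SELECT charge FROM charges GROUP BY charge ORDER BY charge"
).fetchall()

# ...while SUM adds up every underlying row.
total = conn.execute("SELECT SUM(charge) FROM charges").fetchone()[0]

# Counting rows per group shows how many rows hide behind each value.
per_group = conn.execute(
    "SELECT charge, COUNT(*) FROM charges GROUP BY charge ORDER BY charge"
).fetchall()

print(distinct_values)  # [(25.0,), (40.0,)]  -> adds up to 65 by hand
print(total)            # 155.0
print(per_group)        # [(25.0, 3), (40.0, 2)]
```

Adding COUNT(*) to the grouped query is a quick way to see whether this is what is happening with the real data.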
I'm making a cumulative query that shows the evolution of clients in my database. To build this query, I use the year and the week of the year in which they joined the client database.
I have the following query to search for the relevant data:
SELECT DD.CAL_YEAR, DD.WEEK_OF_YEAR, SUM(COUNT(DISTINCT FAB.ID)) OVER ( ORDER BY DD.CAL_DATE ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) AS "Number of account statements"
FROM CLIENT_DATABASE FAB
JOIN DIM_DATE DD ON FAB.BALANCE_DATE_ID = DD.ID
GROUP BY DD.CAL_YEAR, DD.WEEK_OF_YEAR;
But when I run this query, I get the following error:
Error: ORA-00979: not a GROUP BY expression
SQLState: 42000 ErrorCode: 979
How can I fix this?
Since you are grouping by DD.CAL_YEAR, DD.WEEK_OF_YEAR, you can't use DD.CAL_DATE in the order by clause of your cumulative sum function.
It's hard for me to say exactly what you are trying to do without fully understanding your data. But, logically, it does seem like you should be able to simply use DD.CAL_YEAR, DD.WEEK_OF_YEAR in the order by clause instead of DD.CAL_DATE, and still get the results the way you are expecting.
So something like this:
SUM(COUNT(DISTINCT FAB.ID)) OVER ( ORDER BY DD.CAL_YEAR, DD.WEEK_OF_YEAR ...
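Here is a rough sketch of the same idea on made-up data, using Python's sqlite3 instead of Oracle. It needs a SQLite build with window functions (3.25 or later). To keep it simple it computes the weekly counts in a subquery and applies the running SUM in the outer query, which is a variation on the nested form above; the point is that the window is ordered by the grouped columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_date (id INTEGER, cal_year INTEGER, week_of_year INTEGER)"
)
conn.execute("CREATE TABLE client_database (id INTEGER, balance_date_id INTEGER)")
# Hypothetical data: two dates in week 1, one in week 2, one in week 3 of 2018.
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(1, 2018, 1), (2, 2018, 1), (3, 2018, 2), (4, 2018, 3)],
)
conn.executemany(
    "INSERT INTO client_database VALUES (?, ?)",
    [(10, 1), (11, 2), (12, 2), (13, 3), (14, 4), (15, 4)],
)

# Inner query: one row per (year, week) with its distinct client count.
# Outer query: running total ordered by the grouped columns only.
query = """
    SELECT cal_year, week_of_year,
           SUM(n) OVER (ORDER BY cal_year, week_of_year
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running
    FROM (SELECT dd.cal_year, dd.week_of_year, COUNT(DISTINCT fab.id) AS n
          FROM client_database fab
          JOIN dim_date dd ON fab.balance_date_id = dd.id
          GROUP BY dd.cal_year, dd.week_of_year) AS weekly
    ORDER BY cal_year, week_of_year
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(2018, 1, 3), (2018, 2, 4), (2018, 3, 6)]
```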
Example data:
table A
part rating numReviews
A308 100 7
A308 98 89
I'm trying to get the average rating for the above data.
What it needs to be is the sum of rating*numReviews for each line divided by the total numReviews
This is what I'm trying but it's giving incorrect result (49.07, should be 98.15):
select part,
cast((AVG(rating*numReviews)/sum(numReviews)) as decimal(8,2)) as rating_average
from A group by part order by part
Can this be done in a single query? I'm using SQL Server
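Just go back to the definition of a weighted average and use sum()s and division. AVG(rating * numReviews) already divides by the number of rows (2 here), which is why the attempt came out at half the expected value: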
select part, sum(rating * numreviews) / sum(numreviews) as rating_average
from a
group by part
order by part;
You can convert this to a decimal if you like:
select part,
cast(sum(rating * numreviews) / sum(numreviews) as decimal(8, 2)) as rating_average
from a
group by part
order by part;
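The two sample rows are enough to check the arithmetic. This is a quick sketch in Python's sqlite3 rather than SQL Server, with the rating column declared REAL as an assumption so the division is not truncated; it runs the attempted formula and the weighted-average formula side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: rating is stored as a non-integer type, so division keeps decimals.
conn.execute("CREATE TABLE A (part TEXT, rating REAL, numReviews INTEGER)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [("A308", 100, 7), ("A308", 98, 89)],
)

row = conn.execute("""
    SELECT part,
           AVG(rating * numReviews) / SUM(numReviews) AS attempted,
           SUM(rating * numReviews) / SUM(numReviews) AS weighted
    FROM A
    GROUP BY part
""").fetchone()

part, attempted, weighted = row
# (100*7 + 98*89) / (7 + 89) = 9422 / 96
print(part, round(attempted, 2), round(weighted, 2))  # A308 49.07 98.15
```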
I want to use the AVG function in sql to return a working average for some values (ie based on the last week not an overall average). I have two values I am calculating, weight and restingHR (heart rate). I have the following sql statements for each:
SELECT AVG( weight ) AS average
FROM stats
WHERE userid='$userid'
ORDER BY date DESC LIMIT 7
SELECT AVG( restingHR ) AS average
FROM stats
WHERE userid='$userid'
ORDER BY date DESC LIMIT 7
The value I get for weight is 82.56 but it should be 83.35
This is not a massive error, and I'm rounding the value when I use it, so it's not too big a deal.
However for restingHR I get 45.96 when it should be 57.57 which is a massive difference.
I don't understand why this is going so wrong. Any help is much appreciated.
Thanks
Use a subquery to separate selecting the rows from computing the average:
SELECT AVG(weight) average
FROM (SELECT weight
FROM stats
WHERE userid = '$userid'
ORDER BY date DESC
LIMIT 7) subq
It seems you want to restrict your data with ORDER BY date DESC LIMIT 7, but ORDER BY and LIMIT are applied after the aggregate has been computed. AVG() therefore sees every restingHR value for your $userid, not just the 7 latest, and LIMIT 7 merely caps an output that has only one row anyway.
To overcome this...okay, Barmar just posted a query.
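The difference is easy to reproduce. Below is a small sketch in Python's sqlite3 (not MySQL) with invented readings: ten days of weights, 80.0 for the first three days and 84.0 for the last seven. The user id u1 and the dates are hypothetical, and the user id is passed as a bound parameter instead of being interpolated into the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (userid TEXT, date TEXT, weight REAL)")
# Hypothetical data: 80.0 on days 1-3, 84.0 on days 4-10.
conn.executemany(
    "INSERT INTO stats VALUES (?, ?, ?)",
    [("u1", f"2020-01-{d:02d}", 80.0 if d <= 3 else 84.0) for d in range(1, 11)],
)

# LIMIT applies to the single aggregated row, so every reading is averaged.
# (The ORDER BY is left out here because ordering one row changes nothing.)
all_rows = conn.execute(
    "SELECT AVG(weight) FROM stats WHERE userid = ? LIMIT 7", ("u1",)
).fetchone()[0]

# Select the 7 latest rows first, then average only those.
last_seven = conn.execute("""
    SELECT AVG(weight)
    FROM (SELECT weight
          FROM stats
          WHERE userid = ?
          ORDER BY date DESC
          LIMIT 7) AS subq
""", ("u1",)).fetchone()[0]

print(round(all_rows, 2), round(last_seven, 2))  # 82.8 84.0
```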