Referring to dynamic columns in a postgres query? - sql

Let's say I have something like this:
select sum(points) as total_points
from sometable
where total_points > 25
group by username
I am unable to refer to total_points in the where clause because I get the following error: ERROR: column "total_points" does not exist. In this case I'd have no problem rewriting sum(points) in the where clause, but I'd like some way of doing what I have above.
Is there any way to store the result in a variable without using a stored procedure?
If I do rewrite sum(points) in the where clause, is postgres smart enough to not recalculate it?

SELECT SUM(points) AS total_points
FROM sometable
GROUP BY
username
HAVING SUM(points) > 25
PostgreSQL won't calculate the sum twice.

I believe PostgreSQL is like other brands of sql, where you need to do:
SELECT t.*
FROM (
SELECT SUM(points) AS total_points
FROM sometable
GROUP BY username
) t
WHERE total_points > 25
EDIT: Forgot to alias subquery.

You have an error in the statement:
select sum(points) as total_points
from sometable
where total_points > 25 -- <- error here
group by username
You can't limit rows by total_points, because sometable doesn't have that column. What you want is to limit the grouped result rows by total_points, computed for each group, so:
select sum(points) as total_points
from sometable
group by username
having sum(points) > 25
If you replace total_points with sum(points) in the where clause of your example, the query still fails: where is checked against individual rows before they are grouped, so aggregate functions are not allowed there.
Edit:
Always remember the logical order of evaluation:
1. FROM with JOINs, to get the tables
2. WHERE, to limit rows from the tables
3. GROUP BY, to group rows into related groups
4. HAVING, to limit the resulting groups
5. SELECT, to limit columns
6. ORDER BY, to order the results
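To make the roles of the clauses concrete, here is a minimal sketch against a made-up table. The name demo_points and its rows are assumptions for illustration, not part of the question. WHERE drops individual rows before grouping, HAVING drops whole groups after aggregation, and ORDER BY can use the alias because it runs after SELECT.

```sql
-- Hypothetical sample data, for illustration only.
CREATE TEMPORARY TABLE demo_points (username text, points integer);
INSERT INTO demo_points (username, points) VALUES
  ('alice', 20), ('alice', 10),
  ('bob', 5),    ('bob', 15),
  ('carol', 40), ('carol', -3);

SELECT username, SUM(points) AS total_points
FROM demo_points
WHERE points > 0               -- filters individual rows before grouping
GROUP BY username
HAVING SUM(points) > 25        -- filters groups after aggregation
ORDER BY total_points DESC;    -- the output alias is visible here
-- Rows returned: carol 40, alice 30 (bob totals 20 and is filtered out).
```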

Related

How to find AVG of Count in SQL

This is what I have
select avg(visit_count) from ( SELECT count(user_id) as visit_count from table )group by user_id;
But I get the below error
ERROR 1248 (42000): Every derived table must have its own alias
if I add alias
then I get avg for only one user_id
What I want is the avg of visit_count for all user ids
SEE the picture for reference
Example 3,2.5,1.5
It means that your subquery needs to have an alias.
Like this:
select avg(visit_count) from (
select count(user_id) as visit_count from table
group by user_id) a
Your subquery is missing an alias. I think this is the version you want:
SELECT AVG(visit_count)
FROM
(
SELECT COUNT(user_id) AS visit_count
FROM yourTable
GROUP BY user_id
) t;
Note that GROUP BY belongs inside the subquery, as you want to find counts for all users.
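A small self-contained sketch of the same shape, using a made-up table (demo_visits and its rows are assumptions, not from the question): the inner query produces one count per user, and the outer query averages those counts.

```sql
-- Hypothetical sample data: one row per visit.
CREATE TEMPORARY TABLE demo_visits (user_id integer);
INSERT INTO demo_visits (user_id) VALUES (1), (1), (1), (2), (2), (3);

SELECT AVG(visit_count) AS avg_visits
FROM (
  -- One row per user: the counts are 3, 2 and 1.
  SELECT COUNT(user_id) AS visit_count
  FROM demo_visits
  GROUP BY user_id
) AS per_user;                 -- the derived table needs an alias
-- The average of 3, 2 and 1 is 2.
```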

How to query a column created by aggregate function in hive?

In Hive, I want to select the records with users >= 40. My table has a userid column, so I used:
select title,sum(rating),count(userid) from table_name where count(userid)>=40
group by title order by rating desc
But it showed an error saying you can't use count in the where clause. I also tried using an alias, like:
select title,sum(rating) as ratings,count(userid) as users where users>=40 group by title order by ratings desc
Here I also got stuck, with an error saying users is not a column name in the table.
I need to get the titles with the highest ratings that have at least 40 users.
You want the having clause:
select title, sum(rating), count(userid)
from table_name
group by title
having count(userid) >= 40
order by sum(rating) desc;
In Hive, you may need to use a column alias, though:
select title, sum(rating) as rating, count(userid) as cnt
from table_name
group by title
having cnt >= 40
order by rating desc;

SQL - Select Query to get group by records whose sum(data) > 24

I need to select the UserIDs from the table whose sum of Data is greater than 24.
I am able to group and sum the records using
SELECT SUM(DATA),UserID FROM TableName GROUP BY UserID
But how can I select only the records for which SUM(DATA) > 24?
I have tried
SELECT SUM(DATA),UserID FROM #tempTimesheetValue where SUM(DATA)>24 GROUP BY UserID
But it's not working.
Thanks in advance for any suggestions.
You can do this with the query below:
select UserID, DATA from (
SELECT SUM(DATA) as DATA, UserID FROM #tempTimesheetValue GROUP BY UserID
) A where DATA > 24
The question might as well have the straightforward answer, which is:
SELECT SUM(DATA), UserID
FROM #tempTimesheetValue
GROUP BY UserID
HAVING SUM(DATA) > 24;
A subquery could be used, but it is an unnecessary complication.

Specific sql query with count of records

I have a simple table.
I need to build a SQL query that returns the count of records where a user re-played the same game. So in this case the result would be 3.
You want a count group by
select user_id, count(*)
from your_table
group by user_id ;
You'll want to group by user_id and ensure the game count is greater than 1.
Assuming you're using SQL Server, please try something like:
SELECT user_id, count(*) as GamesPlayed
FROM table_name
GROUP BY user_id
HAVING count(*) > 1;
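If "re-played the same game" means the same user appears more than once for the same game, grouping by both columns may be closer to what is wanted. The sketch below is only a guess at the table layout, since the original table is not shown: demo_plays, user_id, game_id and the rows are all assumptions.

```sql
-- Hypothetical table: one row per play of a game by a user.
CREATE TEMPORARY TABLE demo_plays (user_id integer, game_id integer);
INSERT INTO demo_plays (user_id, game_id) VALUES
  (1, 10), (1, 10), (1, 11),
  (2, 10), (2, 12), (2, 12),
  (3, 11), (3, 11), (3, 11);

-- Count the (user, game) pairs that occur more than once.
SELECT COUNT(*) AS replayed_pairs
FROM (
  SELECT user_id, game_id
  FROM demo_plays
  GROUP BY user_id, game_id
  HAVING COUNT(*) > 1
) AS replays;
-- With this sample data the pairs are (1,10), (2,12) and (3,11), so the result is 3.
```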

Sum two columns using alias in order by

I have this query:
select
id,
count(1) as "visits",
count(distinct visitor_id) as "visitors"
from my_table
where timestamp > '2016-01-14'
group by id
order by "visits", "visitors"
It works.
If I change to this
select
id,
count(1) as "visits",
count(distinct visitor_id) as "visitors"
from my_table
where timestamp > '2016-01-14'
group by id
order by (("visits") + ("visitors"))
I get
column "visits" does not exist
If I change to
select
id,
count(1) as "visits",
count(distinct visitor_id) as "visitors"
from my_table
where timestamp > '2016-01-14'
group by id
order by count(1) + count(distinct visitor_id)
it works again.
Why does it work for examples 1 and 3, but not for example 2? Is there any way to order by the sum of two columns using their aliases?
The alternatives I could think of:
Create an outer select and order it, but that would create extra code and I would like to avoid that
Recalculate the values in the order by statement. But that would make the query more complex and maybe I would lose performance due to recalculating stuff.
PS: This query is a toy-query. The real one is much more complicated. I would like to reuse the value calculated in the select statement in the order by, but all summed up together.
In PostgreSQL, an output column alias can be used in ORDER BY only when it stands alone. Inside an expression, names are looked up as input columns of the table, so "visits" + "visitors" fails with the error shown above.
Instead of using the alias, try using the actual column expressions, and also try casting them to varchar or nvarchar, like this:
select
id,
count(1) as "visits",
count(distinct visitor_id) as "visitors"
from my_table
where timestamp > '2016-01-14'
group by id
order by (CAST(count(1) AS VARCHAR) + CAST(count(distinct visitor_id) AS VARCHAR))
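The outer-select alternative mentioned in the question can be sketched as follows. Once the aggregates are computed in a derived table, their aliases are ordinary columns of that table and can be combined freely in ORDER BY. The table demo_hits, the column ts (standing in for the question's timestamp column) and the rows are assumptions for illustration only.

```sql
-- Hypothetical sample data: one row per visit.
CREATE TEMPORARY TABLE demo_hits (id integer, visitor_id integer, ts date);
INSERT INTO demo_hits (id, visitor_id, ts) VALUES
  (1, 100, '2016-01-15'), (1, 100, '2016-01-16'), (1, 101, '2016-01-17'),
  (2, 200, '2016-01-15'), (2, 201, '2016-01-10'),
  (3, 300, '2016-01-15'), (3, 301, '2016-01-15'),
  (3, 302, '2016-01-16'), (3, 303, '2016-01-16');

SELECT id, visits, visitors
FROM (
  SELECT id,
         COUNT(1) AS visits,
         COUNT(DISTINCT visitor_id) AS visitors
  FROM demo_hits
  WHERE ts > DATE '2016-01-14'
  GROUP BY id
) AS agg
ORDER BY visits + visitors;    -- aliases are plain columns of agg here
-- Order with this data: id 2 (1 + 1), id 1 (3 + 2), id 3 (4 + 4).
```

The aggregates are written once in the inner query, so nothing has to be repeated in the ORDER BY.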