SQL Server: How to get the number of unique values in a column and the average score per value?

I have a table like this
city metric_name metric_value id
Berlin likes 1 1a
Berlin dislikes 2 1a
Berlin comments 3 1a
Berlin likes 4 1b
Berlin dislikes 5 1b
Berlin comments 3 1b
Hamburg likes 1 1c
Hamburg dislikes 2 1c
Hamburg comments 3 1c
Hamburg likes 2 1d
Hamburg dislikes 4 1d
Hamburg comments 5 1d
and so on
My ideal result is this
city city_count_unique average_metric_score
Berlin 2 3 (sum metric_value / sum metric_names)
Hamburg 2 2,8
What I've done
I got the distinct count for every city and the average metric value:
SELECT AVG(T.metric_value), T.city,
COUNT(*) AS 'city_count_unique'
FROM
(SELECT DISTINCT metric_value, city
FROM dbo.Table) as T
GROUP BY T.city
But the result is wrong.
I'd appreciate any help.
Update
There is also an additional column, id, of type varchar.

The answer here depends on this assumption:
You always have exactly 3 metrics per 'group' (i.e. likes, dislikes
and comments)
If that assumption is correct, the following will output what you are looking for:
SELECT city,
COUNT(metric_name) / 3 AS city_count_unique,
CAST(SUM(metric_value) AS FLOAT) / COUNT(metric_value) AS average_metric_score
FROM #Table
GROUP BY city
Output:
city city_count_unique average_metric_score
Berlin 2 3
Hamburg 2 2.83333333333333
How does this work?
By grouping on the city, we combine results for each city individually.
The count of metric_name gives the total metrics for that city (which is 6 in the case of your example). I divide this by 3 to give the unique count (based on the assumption I stated).
The average_metric_score calculation is the total of the metric_value for each city divided by the number of metrics (so 18 / 6 for Berlin). The reason for the CAST to FLOAT is to allow for a floating-point answer. You could also use CONVERT if you prefer this to CAST.
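As a rough check of the arithmetic described above, here is a minimal sketch that replays the query on the question's sample rows. It assumes an in-memory SQLite database as a stand-in for SQL Server, a hypothetical table name metrics instead of #Table, and `* 1.0` in place of CAST to FLOAT.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server temp table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metrics (city TEXT, metric_name TEXT, metric_value INTEGER, id TEXT)"
)
rows = [
    ("Berlin", "likes", 1, "1a"), ("Berlin", "dislikes", 2, "1a"), ("Berlin", "comments", 3, "1a"),
    ("Berlin", "likes", 4, "1b"), ("Berlin", "dislikes", 5, "1b"), ("Berlin", "comments", 3, "1b"),
    ("Hamburg", "likes", 1, "1c"), ("Hamburg", "dislikes", 2, "1c"), ("Hamburg", "comments", 3, "1c"),
    ("Hamburg", "likes", 2, "1d"), ("Hamburg", "dislikes", 4, "1d"), ("Hamburg", "comments", 5, "1d"),
]
conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?)", rows)

# Expose the intermediate row count and total next to the two derived columns.
per_city = conn.execute("""
    SELECT city,
           COUNT(metric_name)                        AS metric_rows,
           SUM(metric_value)                         AS metric_total,
           COUNT(metric_name) / 3                    AS city_count_unique,
           SUM(metric_value) * 1.0 / COUNT(metric_value) AS average_metric_score
    FROM metrics
    GROUP BY city
    ORDER BY city
""").fetchall()

for row in per_city:
    print(row)
```

Berlin has 6 metric rows totalling 18, so the score is 18 / 6; Hamburg has 6 rows totalling 17, so the score is 17 / 6.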
Edit following OP update to question
OP edited the question to indicate that there is an ID column that allows the detection of metric grouping. This is an update to use that rather than assuming there are always 3 metrics per group.
SELECT city,
COUNT(id) AS city_count_unique,
CAST(SUM(metric_value_total) AS FLOAT) / SUM(metric_value_count) AS average_metric_score
FROM (
SELECT city,
id,
SUM(metric_value) metric_value_total,
COUNT(metric_value) AS metric_value_count
FROM #Table
GROUP BY city, id
) a
GROUP BY city

You seem to want:
SELECT city,
COUNT(DISTINCT id) as city_count_unique,
AVG(metric_value * 1.0) as average_metric_score
FROM t
GROUP BY city;
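To illustrate why counting distinct id values is more robust than dividing by 3, here is a small sketch, again assuming SQLite as a stand-in for SQL Server. The second group (id 1b) is hypothetical and deliberately has only two metric rows, which is not the case in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (city TEXT, metric_name TEXT, metric_value INTEGER, id TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", [
    ("Berlin", "likes", 1, "1a"), ("Berlin", "dislikes", 2, "1a"), ("Berlin", "comments", 3, "1a"),
    # Hypothetical group with a missing "comments" row.
    ("Berlin", "likes", 4, "1b"), ("Berlin", "dislikes", 5, "1b"),
])

row = conn.execute("""
    SELECT city,
           COUNT(*) / 3            AS count_by_assumption,   -- integer division: 5 / 3
           COUNT(DISTINCT id)      AS city_count_unique,     -- counts groups directly
           AVG(metric_value * 1.0) AS average_metric_score
    FROM t
    GROUP BY city
""").fetchone()

print(row)
```

With five rows, the divide-by-3 count drops to 1, while COUNT(DISTINCT id) still reports 2 groups.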

Related

SQL query count rows with the same entry

Given a dataset Roster_table as such:
Group ID  Group Name    Name  Phone
42        Red Dragon    Jon   123455678
32        Green Lizard  Liz   932143211
19        Blue Falcon   Ben   134554678
42        Red Dragon    Reed  432143211
42        Red Dragon    Brad  231314155
19        Blue Falcon   Chad  214124412
How do I get the following query output combining rows with the same Group ID from the dataset, and the new column Count in descending order:
Group ID  Group Name    Count
42        Red Dragon    3
19        Blue Falcon   2
32        Green Lizard  1
SELECT * FROM Roster_table
Please try this, where the alias tot_count is used in the ORDER BY clause.
-- PostgreSQL(v11)
SELECT Group_ID
, MAX(Group_Name) Group_Name
, COUNT(1) tot_count
FROM Roster_table
GROUP BY Group_ID
ORDER BY tot_count DESC;
Please check from url https://dbfiddle.uk/?rdbms=postgres_11&fiddle=b66f9f0d40e804e89be12e3530fe00a0
Based on Rahul Biswas's answer:
Solution without using Max function
SELECT Group_ID, Group_Name, COUNT(*)
FROM Roster_table
GROUP BY Group_ID, Group_Name
ORDER BY COUNT(*) DESC
Credit goes to Eric S.
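For a quick way to try the GROUP BY version on the question's rows, here is a minimal sketch. It assumes an in-memory SQLite database rather than PostgreSQL, and the column types are guesses since the question does not state them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the question only shows the values.
conn.execute(
    "CREATE TABLE Roster_table (Group_ID INTEGER, Group_Name TEXT, Name TEXT, Phone TEXT)"
)
conn.executemany("INSERT INTO Roster_table VALUES (?, ?, ?, ?)", [
    (42, "Red Dragon", "Jon", "123455678"),
    (32, "Green Lizard", "Liz", "932143211"),
    (19, "Blue Falcon", "Ben", "134554678"),
    (42, "Red Dragon", "Reed", "432143211"),
    (42, "Red Dragon", "Brad", "231314155"),
    (19, "Blue Falcon", "Chad", "214124412"),
])

# One row per group, largest group first.
counts = conn.execute("""
    SELECT Group_ID, Group_Name, COUNT(*) AS tot_count
    FROM Roster_table
    GROUP BY Group_ID, Group_Name
    ORDER BY tot_count DESC
""").fetchall()

for row in counts:
    print(row)
```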

SQL query to get only rows match the condition based on two separated columns under one 'group by'

The simple SELECT query would return the data as below:
Select ID, User, Country, TimeLogged from Data
ID User Country TimeLogged
1 Samantha SCO 10
1 John UK 5
1 Andrew NZL 15
2 John UK 20
3 Mark UK 10
3 Mark UK 20
3 Steven UK 10
3 Andrew NZL 15
3 Sharon IRL 5
4 Andrew NZL 25
4 Michael AUS 5
5 Jessica USA 30
I would like to return the sum of time logged for each user, grouped by ID,
but only for ID numbers whose rows include both Country = UK and User = Andrew.
So the output in the above example would be
ID User Country TimeLogged
1 John UK 5
1 Andrew NZL 15
3 Mark UK 30
3 Steven UK 10
3 Andrew NZL 15
First you need to identify which IDs you're going to be returning
SELECT ID FROM MyTable WHERE Country='UK'
INTERSECT
SELECT ID FROM MyTable WHERE [User]='Andrew';
and based on that, you can then filter to aggregate the expected rows.
SELECT ID,
[User],
Country,
SUM(Timelogged) as Timelogged
FROM mytable
WHERE (Country='UK' OR [User]='Andrew')
AND ID IN( SELECT ID FROM MyTable WHERE Country='UK'
INTERSECT
SELECT ID FROM MyTable WHERE [User]='Andrew')
GROUP BY ID, [User], country;
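The two steps can be tried together on the question's sample rows. This is only a sketch: it assumes SQLite in place of SQL Server (SQLite also accepts the [User] bracket quoting) and uses the table name MyTable from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, [User] TEXT, Country TEXT, TimeLogged INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?)", [
    (1, "Samantha", "SCO", 10), (1, "John", "UK", 5), (1, "Andrew", "NZL", 15),
    (2, "John", "UK", 20),
    (3, "Mark", "UK", 10), (3, "Mark", "UK", 20), (3, "Steven", "UK", 10),
    (3, "Andrew", "NZL", 15), (3, "Sharon", "IRL", 5),
    (4, "Andrew", "NZL", 25), (4, "Michael", "AUS", 5),
    (5, "Jessica", "USA", 30),
])

# Step 1: IDs that have at least one UK row AND at least one Andrew row.
ids = [r[0] for r in conn.execute("""
    SELECT ID FROM MyTable WHERE Country = 'UK'
    INTERSECT
    SELECT ID FROM MyTable WHERE [User] = 'Andrew'
    ORDER BY ID
""")]

# Step 2: aggregate only the matching rows within those IDs.
result = conn.execute("""
    SELECT ID, [User], Country, SUM(TimeLogged) AS TimeLogged
    FROM MyTable
    WHERE (Country = 'UK' OR [User] = 'Andrew')
      AND ID IN (SELECT ID FROM MyTable WHERE Country = 'UK'
                 INTERSECT
                 SELECT ID FROM MyTable WHERE [User] = 'Andrew')
    GROUP BY ID, [User], Country
    ORDER BY ID, [User]
""").fetchall()

print(ids)
for row in result:
    print(row)
```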
So, you have described what you need almost perfectly, but not quite. Your result table indicates that you want Country = UK OR User = Andrew, rather than AND.
You need to select and group by, then include a WHERE:
Select ID, [User], Country, SUM(Timelogged) as Timelogged from mytable
WHERE Country='UK' OR [User]='Andrew'
Group by ID, [User], Country
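A quick check of this filter against the question's sample rows, sketched with SQLite as a stand-in for SQL Server (an assumption), may be useful. On that data the OR filter on its own also returns IDs 2 and 4, which the expected output in the question leaves out, so an additional restriction on ID would still be needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, [User] TEXT, Country TEXT, TimeLogged INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", [
    (1, "Samantha", "SCO", 10), (1, "John", "UK", 5), (1, "Andrew", "NZL", 15),
    (2, "John", "UK", 20),
    (3, "Mark", "UK", 10), (3, "Mark", "UK", 20), (3, "Steven", "UK", 10),
    (3, "Andrew", "NZL", 15), (3, "Sharon", "IRL", 5),
    (4, "Andrew", "NZL", 25), (4, "Michael", "AUS", 5),
    (5, "Jessica", "USA", 30),
])

# Row-level OR filter only, with no restriction on which IDs qualify.
or_only = conn.execute("""
    SELECT ID, [User], Country, SUM(TimeLogged) AS TimeLogged
    FROM mytable
    WHERE Country = 'UK' OR [User] = 'Andrew'
    GROUP BY ID, [User], Country
    ORDER BY ID, [User]
""").fetchall()

returned_ids = sorted({row[0] for row in or_only})
print(returned_ids)
```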

Hive sql: count and avg

I've recently been trying to learn Hive and I have a problem with a SQL query.
I have a JSON file with some information. I want to get the average for each record. An example makes it clearer:
country times
USA 1
USA 1
USA 1
ES 1
ES 1
ENG 1
FR 1
Then, with the following query:
select country, count(*) from data group by country;
I obtain:
country times
USA 3
ES 2
ENG 1
FR 1
Then I should get the following output:
country avg
USA 0,42 (3/7)
ES 0,28 (2/7)
ENG 0,14 (1/7)
FR 0,14 (1/7)
I don't know how to obtain this output from the first table.
I tried:
select t1.country, avg(t1.tm),
from (
select country,count(*)as tm from data where not country is null group by country
) t1
group by t1.country;
but my output is wrong.
Thanks for your help! Best regards.
Divide each group's count by the total count to get the result. Use a subquery to find the total number of records in your table.
Try this:
select country, count(*) / (select cast(count(*) as float) from data)
from data
group by country;
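The idea of dividing each group's count by the total can be checked on the question's rows. This sketch assumes SQLite in place of Hive, so it only illustrates the arithmetic; whether Hive accepts a scalar subquery in this position may depend on the Hive version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (country TEXT, times INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, 1)",
    [("USA",), ("USA",), ("USA",), ("ES",), ("ES",), ("ENG",), ("FR",)],
)

# Per-country row count divided by the total row count (7 here).
shares = dict(conn.execute("""
    SELECT country,
           COUNT(*) * 1.0 / (SELECT COUNT(*) FROM data) AS share
    FROM data
    GROUP BY country
""").fetchall())

for country, share in sorted(shares.items()):
    print(country, round(share, 2))
```

The shares are 3/7, 2/7, 1/7 and 1/7, and they add up to 1.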

query to find more than one name with different values

This is my table. I need the output to show how many times each name appears. I used COUNT in my query, but the name Timur has different companies, so it can't be counted as 1; I need it counted as 2.
Name ID Company Name CompanyID Role Name
Ahmed 73 King & Spalding 55 Counsel
Timur 78 Chance CIS Ltd 39 Partner
Timur 78 Clifford LLP 28 Counsel
Rahail 80 Reed Smith ltd 97 Partner
The output should look like this:
Name ID Company Name CompanyID Role Name count
Ahmed 73 King & Spalding 55 Counsel 1
Timur 78 Chance CIS Ltd 39 Partner 2
Timur 78 Clifford LLP 28 Counsel 2
Rahail 80 Reed Smith ltd 97 Partner 1
I am assuming that Name and ID correspond to each other. In case different people share the same name, I partition by ID:
SELECT
*,
count(*) over (partition by ID) as [count]
FROM yourtable
Use correlated sub-query:
select t.*, (select count(*) from tablename where name = t.name) as count
from tablename t
If you're using SQL Server 2005 or above then you can use a window function to achieve this easily:
SELECT
T.Name,
T.ID,
T.CompanyName,
T.CompanyID,
T.RoleName,
COUNT(*) OVER (PARTITION BY T.Name)
FROM
My_Table T
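The windowed count can be tried on the question's four rows. This is a sketch only: it assumes SQLite 3.25 or later (for window functions) in place of SQL Server, partitions by ID as in the first answer, and adds a hypothetical alias name_count for the count column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE My_Table (Name TEXT, ID INTEGER, CompanyName TEXT, CompanyID INTEGER, RoleName TEXT)"
)
conn.executemany("INSERT INTO My_Table VALUES (?, ?, ?, ?, ?)", [
    ("Ahmed", 73, "King & Spalding", 55, "Counsel"),
    ("Timur", 78, "Chance CIS Ltd", 39, "Partner"),
    ("Timur", 78, "Clifford LLP", 28, "Counsel"),
    ("Rahail", 80, "Reed Smith ltd", 97, "Partner"),
])

# Every row is kept; the window adds the number of rows sharing the same ID.
counted = conn.execute("""
    SELECT Name, CompanyID,
           COUNT(*) OVER (PARTITION BY ID) AS name_count
    FROM My_Table
    ORDER BY CompanyID
""").fetchall()

for row in counted:
    print(row)
```

Unlike GROUP BY, the window function does not collapse rows, so both Timur rows remain and each carries a count of 2.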

How to select students who got above average?

How can I list all students who got above the average grade of their group in a SQL table? We have 6 group_ids, so there are six different average grades.
group_id student grade
1 James 85
1 Adam 96
2 Tom 56
2 Jane 89
2 Anny 90
Result:
group_id student grade
1 Adam 96
2 Jane 89
2 Anny 90
ashkufaraz's answer is closer but not quite right
select group_id,student,grade from students one where grade >
(select avg(grade) from students two where two.group_id = one.group_id)
The question is just tagged SQL, so this is an answer using standard SQL:
One option is to use a window function:
select group_id,student,grade
from (
select group_id,student,grade,
avg(grade) over (partition by group_id) as group_avg
from students
) t
where grade > group_avg;
This has the additional benefit that you can also display the group average along with the result with no additional join or sub-select.
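Both approaches, the correlated subquery and the window function, can be compared on the question's sample rows. This sketch assumes SQLite 3.25 or later as a stand-in for whatever database is actually used; the ORDER BY clauses are added only to make the two results directly comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (group_id INTEGER, student TEXT, grade INTEGER)")
conn.executemany("INSERT INTO students VALUES (?, ?, ?)", [
    (1, "James", 85), (1, "Adam", 96),
    (2, "Tom", 56), (2, "Jane", 89), (2, "Anny", 90),
])

# Correlated subquery: the group average is recomputed per outer row.
by_subquery = conn.execute("""
    SELECT group_id, student, grade
    FROM students one
    WHERE grade > (SELECT AVG(grade) FROM students two
                   WHERE two.group_id = one.group_id)
    ORDER BY group_id, student
""").fetchall()

# Window function: the group average is attached to each row, then filtered.
by_window = conn.execute("""
    SELECT group_id, student, grade
    FROM (
        SELECT group_id, student, grade,
               AVG(grade) OVER (PARTITION BY group_id) AS group_avg
        FROM students
    ) t
    WHERE grade > group_avg
    ORDER BY group_id, student
""").fetchall()

print(by_subquery)
print(by_window)
```

Group 1 averages 90.5 and group 2 averages 235 / 3, so Adam, Anny and Jane are returned by both queries.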