Counting occurrences in several columns - sql

I'm making an app that shows people movies to rate, a-la hot-or-not. I'd like to write a query that gets me the number of times a movie has been rated. The table for ratings looks like this:
| id | winner | loser |
| 1 | 1 | 2 |
| 2 | 2 | 3 |
| 3 | 1 | 3 |
I can get the number of times a movie has "won" by running a query like this:
SELECT winner, count(winner) AS number_of_wins
FROM movie_results
GROUP BY winner
ORDER BY number_of_wins DESC;
But I'd like to get another query that shows the total number of times a movie was pitched against other movies, i.e. the number of times a movie has appeared to be rated, whether it was rated above or below the other movie. What is the easiest way to achieve this, using only SQL queries?

Here is one method, using union all:
select movie, count(*) as nummatches, sum(win) as numwins
from ((select winner as movie, 1 as win from movie_results) union all
(select loser, 0 from movie_results)
) wl
group by movie;
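To try the union all idea quickly, here is a minimal sketch that runs it from Python against an in-memory SQLite database. The choice of SQLite and Python is an assumption made only for the demo; the table name movie_results and the three sample rows come from the question. SQLite does not accept parentheses around the individual selects of a compound query, so they are dropped here.

```python
import sqlite3

# In-memory database holding the three sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie_results (id INTEGER PRIMARY KEY, winner INTEGER, loser INTEGER)"
)
conn.executemany(
    "INSERT INTO movie_results VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 2, 3), (3, 1, 3)],
)

# Stack winners (win = 1) and losers (win = 0) into one column, then aggregate.
query = """
SELECT movie, COUNT(*) AS nummatches, SUM(win) AS numwins
FROM (
    SELECT winner AS movie, 1 AS win FROM movie_results
    UNION ALL
    SELECT loser, 0 FROM movie_results
) wl
GROUP BY movie
ORDER BY movie
"""
match_counts = conn.execute(query).fetchall()
print(match_counts)
```

Each movie in the sample data appears in two matches, so nummatches is 2 for all three, while numwins differs per movie.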

You can do a full join between two derived tables, one holding the number of wins per movie and the other the number of losses (the query below labels the movie column player).
select
coalesce(winner,loser) player,
coalesce(number_of_wins,0) number_of_wins,
coalesce(number_of_losses,0) number_of_losses,
coalesce(number_of_wins,0) + coalesce(number_of_losses,0) number_of_matches
from (
select winner, count(*) number_of_wins
from movie_results
group by winner
) winners full join (
select loser, count(*) number_of_losses
from movie_results
group by loser
) losers on losers.loser = winners.winner
http://sqlfiddle.com/#!15/980d6/3

Related

Printing Average Score for Unique ID's

I have a table
id | playerID | score
1 | 1 | 100
2 | 1 | 155
3 | 5 | 132
etc..
What I want to do is get the Average 'score' for each unique playerID. I can get the DISTINCT playerIDs
SELECT DISTINCT
playerID
FROM
dbo.scores
and I can get the average of all scores,
SELECT
AVG(score)
FROM
dbo.scores
but I can't seem to figure out how to combine the two.
Use group by!
SELECT playerID, AVG(score) AS avg_score
FROM dbo.scores
GROUP BY playerID
The use of dbo suggests that you are using SQL Server. SQL Server does an integer average of integer values, so the average of 0 and 1 is 0, not 0.5.
I would recommend:
SELECT s.playerID, AVG(1.0 * s.score) AS avg_score
FROM dbo.scores s
GROUP BY s.playerID
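The truncation is easy to see in a small experiment. This is only a sketch and it runs on SQLite rather than SQL Server (an assumption made so it can be executed from Python without a server). SQLite's AVG always returns a floating-point value, so the integer behaviour is imitated with SUM(score) / COUNT(*), which does integer division on integer operands. The rows are the three sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (id INTEGER PRIMARY KEY, playerID INTEGER, score INTEGER)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 155), (3, 5, 132)],
)

# truncated_avg: integer / integer, the fraction is dropped.
# exact_avg: multiplying by 1.0 first forces floating-point arithmetic.
query = """
SELECT playerID,
       SUM(score) / COUNT(*)       AS truncated_avg,
       SUM(1.0 * score) / COUNT(*) AS exact_avg
FROM scores
GROUP BY playerID
ORDER BY playerID
"""
averages = conn.execute(query).fetchall()
print(averages)
```

For player 1 the integer version gives 127 while the floating-point version gives 127.5, which is the same effect the 1.0 multiplier avoids in SQL Server.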

SQLite percentages with small values

So I have this table of subscribers and the country each of them is in.
UserID | Name | Country
-------+-------------------+------------
1 | Zaphod Beeblebrox | UK
2 | Arthur Dent | UK
3 | Gene Kelly | USA
4 | Nat King Cole | USA
I need to produce a list showing the percentage of users from each country. I also need all the smaller member countries (under 1%) to be collapsed into an "OTHERS" category.
I can accomplish a simple "top x" of members trivially with a
SELECT COUNTRY, COUNT(*) AS POPULATION FROM SUBSCRIBERS GROUP BY COUNTRY ORDER BY POPULATION DESC LIMIT 10
and can generate the percentages by PHP server side code, but I don't quite know how to:
Do all of it in SQL including percentage calculations directly in the result
Club all under 1% members into a single OTHERS category.
So I need something like this:
Country | Population
--------+-----------
USA | 25.4%
Brazil | 12%
UK | 5%
OTHERS | 65%
Appreciate the help!
Here is a query for this. I used a subquery to count the total number of rows and then used that to get the percentage value for each country. The 'OTHERS' category is generated in a separate query. Rows are sorted by descending population with the OTHERS row last.
SELECT * FROM
(SELECT country , ROUND((100.0*COUNT(*)/count_all),1) ||'%' AS population
FROM (SELECT count(*) count_all FROM subscribers) AS sq,
subscribers s
-- 100.0 avoids integer division; countries at or above 1% are listed by name
WHERE (SELECT 100.0*count(*)/count_all
FROM subscribers s2
WHERE s2.country = s.country) >= 1
GROUP BY country
-- sort on the numeric count, not on the '12.3%' text
ORDER BY COUNT(*) DESC)
UNION ALL
SELECT 'OTHERS', IFNULL(ROUND(100.0*COUNT(*)/count_all,1),0.0) ||'%' AS population
FROM (SELECT count(*) count_all FROM subscribers) AS sq,
subscribers s
WHERE (SELECT 100.0*count(*)/count_all
FROM subscribers s2
WHERE s2.country = s.country) < 1
Ok I think I might have found a way to do this that's a hell of a lot quicker on execution speed:
SELECT territory,
Round(Sum(percentage), 3) AS Population
FROM (SELECT
Round((Count(*)*100.0)/(SELECT Count(*) FROM subscribers),3) AS Percentage,
CASE
WHEN ((Count(*)*100.0)/(SELECT Count(*) FROM subscribers)) >= 1 THEN
country
ELSE 'OTHERS'
END AS Territory
FROM subscribers
GROUP BY country
ORDER BY percentage DESC)
GROUP BY territory
ORDER BY population DESC;
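For anyone who wants to experiment with the collapse-into-OTHERS idea, here is a minimal sketch run from Python against an in-memory SQLite database. The subscriber counts and user names are made up for the demo, the cutoff is the 1% from the question, and the column names pct, territory and population are just illustrative choices. It computes each country's percentage once, relabels the small ones, and sums per label.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscribers (UserID INTEGER PRIMARY KEY, Name TEXT, Country TEXT)"
)

# Made-up population: 200 subscribers, two countries below the 1% cutoff.
population = {"USA": 120, "UK": 78, "France": 1, "Germany": 1}
rows = [
    (f"user_{country}_{i}", country)
    for country, n in population.items()
    for i in range(n)
]
conn.executemany("INSERT INTO subscribers (Name, Country) VALUES (?, ?)", rows)

# Inner query: percentage per country.
# Middle query: relabel countries under 1% as OTHERS and sum per label.
# Outer query: sort by percentage, keeping OTHERS last.
query = """
SELECT territory, population
FROM (
    SELECT CASE WHEN pct >= 1 THEN country ELSE 'OTHERS' END AS territory,
           ROUND(SUM(pct), 1) AS population
    FROM (
        SELECT country,
               100.0 * COUNT(*) / (SELECT COUNT(*) FROM subscribers) AS pct
        FROM subscribers
        GROUP BY country
    ) AS per_country
    GROUP BY territory
) AS collapsed
ORDER BY territory = 'OTHERS', population DESC
"""
shares = conn.execute(query).fetchall()
print(shares)
```

The population column stays numeric here so that sorting works as expected; appending '%' is probably best left to the display layer.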

Select the highest record between two tables

I have two tables. One table contains graduation records and the second table contains post graduation records. A candidate must have a graduation record, but does not necessarily have a post graduation record.
I want to select the post graduation record if the candidate has one, and otherwise only the graduation record.
table 1 graduation_table
rollno | degree | division
--------------------------
001 | B.tech | 1st
002 | B.sc | 1st
003 | BA | 1st
table 2 postgraduation_table
rollno | degree | division
--------------------------
002 | M.sc | 1st
the result must be
rollno | degree | division
--------------------------
001 | B.tech | 1st
002 | M.sc | 1st
003 | BA | 1st
You want all rows from graduation_table which do not have a row in postgraduation_table plus those in postgraduation_table. This can be expressed with a not exists and union query:
select gt.rollno, gt.degree, gt.division
from graduation_table gt
where not exists (select *
from postgraduation_table pg
where pg.rollno = gt.rollno)
union all
select rollno, degree, division
from postgraduation_table
order by rollno;
Online example: http://rextester.com/IFCQR67320
select
rollno,
case when p.degree is null then g.degree else p.degree end as degree,
case when p.division is null then g.division else p.division end as division
from
graduation_table g
left join
postgraduation_table p using (rollno)
Or better as suggested in the comments:
select
rollno,
coalesce (p.degree, g.degree) as degree,
coalesce (p.division, g.division) as division
from
graduation_table g
left join
postgraduation_table p using (rollno)
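A quick way to check the coalesce version is to run it on the sample rows from the question. This is a sketch using Python with an in-memory SQLite database (an assumption for the demo; the question does not say which engine is used), with the table names from the question and rollno stored as text so the leading zeros survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE graduation_table (rollno TEXT, degree TEXT, division TEXT)")
conn.execute("CREATE TABLE postgraduation_table (rollno TEXT, degree TEXT, division TEXT)")
conn.executemany(
    "INSERT INTO graduation_table VALUES (?, ?, ?)",
    [("001", "B.tech", "1st"), ("002", "B.sc", "1st"), ("003", "BA", "1st")],
)
conn.execute("INSERT INTO postgraduation_table VALUES ('002', 'M.sc', '1st')")

# The left join keeps every graduate; COALESCE prefers the postgraduate
# values and falls back to the graduate ones when there is no match.
query = """
SELECT rollno,
       COALESCE(p.degree, g.degree)     AS degree,
       COALESCE(p.division, g.division) AS division
FROM graduation_table g
LEFT JOIN postgraduation_table p USING (rollno)
ORDER BY rollno
"""
highest = conn.execute(query).fetchall()
print(highest)
```

Note that this relies on the postgraduate columns never being NULL for an existing row; if they can be, the fallback would kick in column by column.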
Take a union of both tables, and introduce a position column, to rank the relative importance of the two tables. The postgraduate table has a pos value of 1, and the graduate table has a value of 2. Then, apply ROW_NUMBER() over this union query and assign a row number to each rollno group of records (presumed to be either one or at most two records). Finally, perform one more outer subquery to retain the most important record, postgraduate first, graduate second.
SELECT rollno, degree, division
FROM
(
SELECT
rollno, degree, division,
ROW_NUMBER() OVER (PARTITION BY rollno ORDER BY pos) rn
FROM
(
SELECT p.*, 1 AS pos FROM postgraduation_table p
UNION ALL
SELECT p.*, 2 FROM graduation_table p
) t
) t
WHERE t.rn = 1;
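The ranking approach can be sketched the same way. The example below assumes Python with an in-memory SQLite database that supports window functions (SQLite 3.25 or newer); the aliases u, ranked and g are only illustrative, and the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE graduation_table (rollno TEXT, degree TEXT, division TEXT)")
conn.execute("CREATE TABLE postgraduation_table (rollno TEXT, degree TEXT, division TEXT)")
conn.executemany(
    "INSERT INTO graduation_table VALUES (?, ?, ?)",
    [("001", "B.tech", "1st"), ("002", "B.sc", "1st"), ("003", "BA", "1st")],
)
conn.execute("INSERT INTO postgraduation_table VALUES ('002', 'M.sc', '1st')")

# pos ranks the source tables: 1 = postgraduate, 2 = graduate.
# ROW_NUMBER() then numbers each rollno's rows in that order,
# and the outer filter keeps only the first one.
query = """
SELECT rollno, degree, division
FROM (
    SELECT rollno, degree, division,
           ROW_NUMBER() OVER (PARTITION BY rollno ORDER BY pos) AS rn
    FROM (
        SELECT p.*, 1 AS pos FROM postgraduation_table p
        UNION ALL
        SELECT g.*, 2 FROM graduation_table g
    ) u
) ranked
WHERE rn = 1
ORDER BY rollno
"""
top_records = conn.execute(query).fetchall()
print(top_records)
```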
This should meet your needs:
SELECT dg.rollno,
CASE WHEN pg.rollno IS NOT NULL THEN pg.degree ELSE dg.degree END AS degree,
CASE WHEN pg.rollno IS NOT NULL THEN pg.division ELSE dg.division END AS division
FROM graduation_table AS dg
LEFT OUTER JOIN postgraduation_table AS pg USING (rollno);
Hope this helps.

SQL Server: how to divide the result of sum of total for every customer id

I have 4 tables like this (you can ignore table B because this problem did not use that table)
I want to show the sum of 'total' for each 'sales_id' from table 'sales_detail'
What I want (the result) is like this:
sales_id | total
S01 | 3
S02 | 2
S03 | 4
S04 | 1
S05 | 2
S05 | 3
I have tried with this query:
select
sum(total)
from
sales_detail
where
sales_id = any (select sales_id
from sales
where customer_id = any (select customer_id
from customer)
)
but the query returns a single value of 15, because it sums all of those rows together.
I have also tried putting "distinct" before the sum,
and the result is [ 1, 2, 3 ], because those are the distinct values in those rows (not the sum for each sales_id).
It's all about subquery
You are just so far off track that a simple comment won't help. Your query only concerns one table, sales_detail. It has nothing to do with the other two.
And, it is just an aggregation query:
select sd.sales_id, sum(sd.total)
from sales_detail sd
group by sd.sales_id;
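To see the difference between the single grand total and the per-sales_id totals, here is a small sketch run from Python against an in-memory SQLite database. The original table contents are not shown in the question, so the rows below are hypothetical and only the column names sales_id and total are taken from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_detail (sales_id TEXT, total INTEGER)")

# Hypothetical detail rows; the real data from the question is not available.
conn.executemany(
    "INSERT INTO sales_detail VALUES (?, ?)",
    [("S01", 1), ("S01", 2), ("S02", 2), ("S03", 3), ("S03", 1)],
)

# Without GROUP BY, SUM collapses everything into one number.
grand_total = conn.execute("SELECT SUM(total) FROM sales_detail").fetchone()[0]

# With GROUP BY, there is one sum per sales_id.
per_sale = conn.execute(
    """
    SELECT sd.sales_id, SUM(sd.total) AS total
    FROM sales_detail sd
    GROUP BY sd.sales_id
    ORDER BY sd.sales_id
    """
).fetchall()
print(grand_total, per_sale)
```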
This is actually pretty close to what the question itself is asking.

How do I create a frequency distribution?

I'm trying to create a frequency distribution to show how many customers have transacted 1x, 2x, 3x, etc.
I have a table transactions with a column user_id. Each row indicates a transaction, and if a user_id shows up in multiple rows, that user has done multiple transactions.
Now I'd like to get a list that looks something like this:
Tra. | Freq.
0 | 345
1 | 543
2 | 45
3 | 20
4 | 0
5 | 3
etc
Currently I have this, but it just shows a list of users and how many transactions they have had.
SELECT user_id, COUNT(user_id) as number_of_transactions
FROM transactions
GROUP BY user_id
ORDER BY number_of_transactions DESC;
I did some digging and it was suggested that generate_series might help, but I'm stuck and don't know how to move forward.
Use the first result as input to an outer query where you apply the count again, but this time grouping on number_of_transactions:
SELECT number_of_transactions, COUNT(*) AS freq
FROM (
SELECT user_id, COUNT(user_id) as number_of_transactions
FROM transactions
GROUP BY user_id
) A
GROUP BY number_of_transactions;
This would transform a result like:
user_id number_of_transactions
----------- ----------------------
1 2
2 1
3 2
4 4
to this:
number_of_transactions freq
---------------------- -----------
1 1
2 2
4 1
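The transformation above can be reproduced with a minimal sketch in Python and an in-memory SQLite database (the engine is an assumption for the demo). The transactions rows are chosen so that users 1 to 4 have 2, 1, 2 and 4 transactions, matching the example result. Counts that no user has, such as 3 here, simply do not appear; filling those gaps with zeros would need an extra step such as the generate_series idea mentioned in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (user_id INTEGER)")

# One row per transaction: user 1 x2, user 2 x1, user 3 x2, user 4 x4.
conn.executemany(
    "INSERT INTO transactions VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (4,), (4,), (4,), (4,)],
)

# Inner query: transactions per user.
# Outer query: how many users share each transaction count.
query = """
SELECT number_of_transactions, COUNT(*) AS freq
FROM (
    SELECT user_id, COUNT(user_id) AS number_of_transactions
    FROM transactions
    GROUP BY user_id
) A
GROUP BY number_of_transactions
ORDER BY number_of_transactions
"""
distribution = conn.execute(query).fetchall()
print(distribution)
```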