SQLite percentages with small values - sql

So I have a table of subscribers that lists each user and the country they are in.
UserID | Name | Country
-------+-------------------+------------
1 | Zaphod Beeblebrox | UK
2 | Arthur Dent | UK
3 | Gene Kelly | USA
4 | Nat King Cole | USA
I need to produce a list showing what percentage of all users comes from each country. I also need all the smaller countries (under 1% of members) to be collapsed into an "OTHERS" category.
I can accomplish a simple "top x" of members trivially with a
SELECT COUNTRY, COUNT(*) AS POPULATION FROM SUBSCRIBERS GROUP BY COUNTRY ORDER BY POPULATION DESC LIMIT 10
and can generate the percentages by PHP server side code, but I don't quite know how to:
Do all of it in SQL including percentage calculations directly in the result
Club all under 1% members into a single OTHERS category.
So I need something like this:
Country | Population
--------+-----------
USA | 25.4%
Brazil | 12%
UK | 5%
OTHERS | 65%
Appreciate the help!

Here is a query for this. I used a subquery to count the total number of rows and then used that count to compute the percentage for each country. The 'OTHERS' row is generated by a separate query and appended with UNION ALL, so the countries are sorted by descending population with the OTHERS row last.
SELECT * FROM
(SELECT country , ROUND((100.0*COUNT(*)/count_all),1) ||'%' AS population
FROM (SELECT count(*) count_all FROM subscribers) AS sq,
subscribers s
-- 100.0 forces floating-point division, so e.g. 1.5% is not truncated to 1
WHERE (SELECT 100.0*count(*)/count_all
FROM subscribers s2
WHERE s2.country = s.country) >= 1
GROUP BY country
-- sort on the numeric count; the population column is text such as '5.0%'
ORDER BY COUNT(*) DESC)
UNION ALL
SELECT 'OTHERS', IFNULL(ROUND(100.0*COUNT(*)/count_all,1),0.0) ||'%' AS population
FROM (SELECT count(*) count_all FROM subscribers) AS sq,
subscribers s
WHERE (SELECT 100.0*count(*)/count_all
FROM subscribers s2
WHERE s2.country = s.country) < 1

Ok I think I might have found a way to do this that's a hell of a lot quicker on execution speed:
SELECT territory,
Round(Sum(percentage), 3) AS Population
FROM (SELECT
Round((Count(*)*100.0)/(SELECT Count(*) FROM subscribers),3) AS Percentage,
CASE
WHEN ((Count(*)*100.0)/(SELECT Count(*) FROM subscribers)) > 2 THEN
country
ELSE 'Other'
END AS Territory
FROM subscribers
GROUP BY country
ORDER BY percentage DESC)
GROUP BY territory
ORDER BY population DESC;
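For anyone who wants to try the idea end to end, here is a minimal sketch that runs a CTE variant of the queries above through Python's built-in sqlite3 module. It is not either answer's exact query: the function name country_shares, the threshold parameter, and the sample row counts are assumptions made up for illustration, and the 1% cut-off is taken from the question.

```python
import sqlite3


def country_shares(conn, threshold=1.0):
    """Return (territory, percent) rows; countries below `threshold` percent
    are merged into a single 'OTHERS' row that always sorts last."""
    sql = """
        WITH per_country AS (
            SELECT country,
                   100.0 * COUNT(*) / (SELECT COUNT(*) FROM subscribers) AS pct
            FROM subscribers
            GROUP BY country
        ),
        bucketed AS (
            SELECT CASE WHEN pct >= :threshold THEN country ELSE 'OTHERS' END
                       AS territory,
                   SUM(pct) AS total
            FROM per_country
            GROUP BY 1
        )
        SELECT territory, ROUND(total, 1)
        FROM bucketed
        ORDER BY territory = 'OTHERS', total DESC, territory
    """
    return conn.execute(sql, {"threshold": threshold}).fetchall()


# Hypothetical sample data: 200 subscribers spread over five countries.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscribers (userid INTEGER PRIMARY KEY, name TEXT, country TEXT)"
)
sample = [("USA", 120), ("UK", 76), ("Fiji", 2), ("Malta", 1), ("Tonga", 1)]
for country, n in sample:
    conn.executemany(
        "INSERT INTO subscribers (name, country) VALUES (?, ?)",
        [(f"user-{country}-{i}", country) for i in range(n)],
    )

shares = country_shares(conn)
for territory, pct in shares:
    print(f"{territory:<8}{pct}%")
```

Keeping the percentage numeric until the very end and formatting it in the client avoids sorting on text such as '5.0%'.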

Related

Oracle SQL Developer (Select Count) Two columns

I have to write a query that returns the total number of customers by country and city.
Country and city are columns of the customer table.
On my own I have managed to get the total number of customers per city like this:
SELECT city, COUNT (*)
FROM employees
GROUP BY city
ORDER BY city
But how do I get it together with the country?
Looking for information, I think the result should look something like this, ordered from largest to smallest:
Country | City     | TOTAL_CUSTOMERS
--------+----------+----------------
USA     | Kirkland | 3
USA     | London   | 2
UK      | Redmond  | 2
UK      | Seattle  | 1
UK      | Tacoma   | 1
All we have been told is: total number of customers by country and city.
You simply add country to the column list and group by list:
SELECT country,city, COUNT(*)
FROM employees
GROUP BY country,city
ORDER BY COUNT(*) DESC
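A quick way to check the two-column grouping is to run it against a throwaway SQLite database from Python. This is only a sketch: the question targets Oracle, the employees table layout and the sample rows (taken from the expected output in the question) are assumptions, and the extra ORDER BY columns are added here only to make ties deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, country TEXT, city TEXT)")

# Hypothetical rows matching the counts shown in the question.
cities = [
    ("USA", "Kirkland", 3),
    ("USA", "London", 2),
    ("UK", "Redmond", 2),
    ("UK", "Seattle", 1),
    ("UK", "Tacoma", 1),
]
for country, city, n in cities:
    conn.executemany(
        "INSERT INTO employees (name, country, city) VALUES (?, ?, ?)",
        [(f"{city}-{i}", country, city) for i in range(n)],
    )

# One output row per distinct (country, city) pair, largest groups first.
totals = conn.execute(
    """
    SELECT country, city, COUNT(*) AS total_customers
    FROM employees
    GROUP BY country, city
    ORDER BY COUNT(*) DESC, country, city
    """
).fetchall()

for row in totals:
    print(row)
```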

Retrieving most frequent value for each group in SQL Server

This is what I have:
AirlineName        Departure_City  No_of_DepartureCity  Arrival_City  No_of_ArrivalCity
-----------------  --------------  -------------------  ------------  -----------------
Air Asia           MY              2                    JPN           2
Emirates Airlines  MY              2                    JPN           2
Malaysia Airlines  MY              2                    GER           2
Malaysia Airlines  MY              1                    JPN           1
Air Asia           MY              1                    KOR           1
This is what I want:
AirlineName        Departure_City  No_of_DepartureCity  Arrival_City  No_of_ArrivalCity
-----------------  --------------  -------------------  ------------  -----------------
Air Asia           MY              2                    JPN           2
Emirates Airlines  MY              2                    JPN           2
Malaysia Airlines  MY              2                    GER           2
I have already written a query to retrieve the counts for Departure_City and Arrival_City, but I can't get it to group the results so that only the most frequent combination is shown for each AirlineName.
This is my query so far:
SELECT Airline.AirlineName,
       Flight_Schedule.Departure_City,
       COUNT(Flight_Schedule.Departure_City) AS No_of_DepartureCity,
       Flight_Schedule.Arrival_City,
       COUNT(Flight_Schedule.Arrival_City) AS No_of_ArrivalCity
FROM Airline
LEFT JOIN Aircraft ON Airline.AirlineID = Aircraft.AirlineID
LEFT JOIN Flight_Schedule ON Aircraft.AircraftID = Flight_Schedule.AircraftID
GROUP BY Airline.AirlineName, Flight_Schedule.Departure_City, Flight_Schedule.Arrival_City
ORDER BY COUNT(Flight_Schedule.Departure_City) DESC, COUNT(Flight_Schedule.Arrival_City) DESC
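You can use the RANK (or DENSE_RANK) function. Both assign 1 to every row tied for the highest count within an airline, so filtering on rn = 1 keeps all of an airline's most frequent rows rather than an arbitrary single one.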
with CTE1 AS(
SELECT A.*,
RANK() OVER(PARTITION BY AirlineName ORDER BY No_of_ArrivalCity desc) as rn
FROM TABLE1 A)
SELECT * FROM CTE1 where rn = 1;
Because you are grouping by several columns rather than just AirlineName, you get one row for every distinct combination of values across those columns.
To return each AirlineName with its frequency, try this:
SELECT Airline.AirlineName, COUNT(*) AS [COUNT]
FROM Airline
GROUP BY Airline.AirlineName
ORDER BY COUNT(*) DESC
If you need the additional columns then your code is already correct, because of how you are grouping it and the individual values contained within the columns.
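The RANK() approach from the first answer can be tried out with a small script. This sketch uses SQLite through Python rather than SQL Server, so it assumes an SQLite build with window-function support (3.25 or later); the route_counts table and its column names are hypothetical stand-ins for the already aggregated result shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the aggregated rows from the question.
conn.execute(
    "CREATE TABLE route_counts "
    "(airline TEXT, departure TEXT, arrival TEXT, flights INTEGER)"
)
conn.executemany(
    "INSERT INTO route_counts VALUES (?, ?, ?, ?)",
    [
        ("Air Asia", "MY", "JPN", 2),
        ("Emirates Airlines", "MY", "JPN", 2),
        ("Malaysia Airlines", "MY", "GER", 2),
        ("Malaysia Airlines", "MY", "JPN", 1),
        ("Air Asia", "MY", "KOR", 1),
    ],
)

# Rank routes inside each airline by flight count and keep the top rank.
# Ties for first place all get rank 1, so they would all be returned.
top_routes = conn.execute(
    """
    WITH ranked AS (
        SELECT airline, departure, arrival, flights,
               RANK() OVER (PARTITION BY airline ORDER BY flights DESC) AS rn
        FROM route_counts
    )
    SELECT airline, departure, arrival, flights
    FROM ranked
    WHERE rn = 1
    ORDER BY airline
    """
).fetchall()

for row in top_routes:
    print(row)
```

Swapping RANK() for ROW_NUMBER() would return exactly one row per airline, at the cost of picking arbitrarily among ties.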

Is there a way to count group and subgroup using OVER?

I have a very large table with some information about countries including:
City Name | Province | Country | ...
Honolulu HI US
Hilo HI US
Kihei HI US
Annapolis MD US
Laurel MD US
Sidney LD AU
Camberra PP AU
Darwin PP AU
...
And I want my query to look like this (preferably using OVER function to spare performance):
Country | Count_C | Province | Count_P
US 5 MD 2
US 5 HI 3
AU 3 LD 1
AU 3 PP 2
...
I've already managed doing this, but not without losing performance with some subqueries (the query took very long to run in the large table)
Bad Code:
SELECT country_name AS Country
,Count(*) OVER (PARTITION BY country_name) AS Count_C
,province AS Province
,Count(*) OVER (PARTITION BY province) AS Count_P
FROM country_list
GROUP BY country_name
,province
ORDER BY 1 DESC
,4 DESC
I think you just want aggregation with one window function:
SELECT country_name as Country,
SUM(COUNT(*)) OVER (PARTITION BY country_name) as country_cnt,
province,
Count(*) as province_count
FROM country_list
GROUP BY country_name, province
ORDER BY Country DESC, Province DESC;
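To see why SUM(COUNT(*)) OVER works, note that GROUP BY first produces one row per (country, province) with its COUNT(*), and the window function then sums those per-province counts across each country. A minimal sketch on the sample rows from the question, run in SQLite via Python (this assumes SQLite 3.25 or later for window functions; the city_name column name is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE country_list (city_name TEXT, province TEXT, country_name TEXT)"
)
conn.executemany(
    "INSERT INTO country_list VALUES (?, ?, ?)",
    [
        ("Honolulu", "HI", "US"),
        ("Hilo", "HI", "US"),
        ("Kihei", "HI", "US"),
        ("Annapolis", "MD", "US"),
        ("Laurel", "MD", "US"),
        ("Sidney", "LD", "AU"),
        ("Camberra", "PP", "AU"),
        ("Darwin", "PP", "AU"),
    ],
)

# COUNT(*) is evaluated per group; the window SUM then adds up the group
# counts that share the same country_name.
counts = conn.execute(
    """
    SELECT country_name,
           SUM(COUNT(*)) OVER (PARTITION BY country_name) AS count_c,
           province,
           COUNT(*) AS count_p
    FROM country_list
    GROUP BY country_name, province
    ORDER BY country_name DESC, province DESC
    """
).fetchall()

for row in counts:
    print(row)
```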

Reconciliation Automation Query

I have one database, and from time to time I change part of a query as requirements change.
I want to record the results of a query both before and after the change in one table, and show only the queries whose results differ.
For Example,
Consider following table
emp_id country salary
---------------------
1 usa 1000
2 uk 2500
3 uk 1200
4 usa 3500
5 usa 4000
6 uk 1100
Before Query:
select count(emp_id) as count,country from table where salary>2000 group by country;
Before Result:
count country
2 usa
1 uk
After Query:
select count(emp_id) as count,country from table where salary<2000 group by country;
After Query Result:
count country
2 uk
1 usa
My Final Result or Table I want is:
column 1 | column 2 | column 3 | column 4
2        | usa      | 2        | uk
1        | uk       | 1        | usa
...... but if the query results are the same, then they shouldn't show up in this table.
Thanks in advance.
I believe that you can use the same approach as here.
select t1.*, t2.* -- if you need specific columns without rn, then you have to list them here
from
(
select t.*, row_number() over (order by count) rn
from
(
-- query #1 (no trailing semicolon inside a subquery)
select count(emp_id) as count,country from table where salary>2000 group by country
) t
) t1
full join
(
select t.*, row_number() over (order by count) rn
from
(
-- query #2 (no trailing semicolon inside a subquery)
select count(emp_id) as count,country from table where salary<2000 group by country
) t
) t2 on t1.rn = t2.rn
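If doing the comparison in client code is acceptable, the same side-by-side pairing can be sketched in Python. This is a swapped-in technique, not the full join above: rows are paired by position with itertools.zip_longest, and nothing is returned when both result sets are identical. The reconcile function, the employees table name (the question's table is simply called table), and the cnt alias are assumptions for illustration.

```python
import sqlite3
from itertools import zip_longest


def reconcile(conn, before_sql, after_sql):
    """Pair rows of two query results by position; return [] if they match."""
    cur_before = conn.execute(before_sql)
    before = cur_before.fetchall()
    cur_after = conn.execute(after_sql)
    after = cur_after.fetchall()
    if before == after:
        return []
    # Pad the shorter result with NULL-like tuples, as a full join would.
    pad_before = (None,) * len(cur_before.description)
    pad_after = (None,) * len(cur_after.description)
    return [
        (b if b is not None else pad_before) + (a if a is not None else pad_after)
        for b, a in zip_longest(before, after)
    ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_id INTEGER, country TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "usa", 1000), (2, "uk", 2500), (3, "uk", 1200),
     (4, "usa", 3500), (5, "usa", 4000), (6, "uk", 1100)],
)

BEFORE = ("SELECT COUNT(emp_id) AS cnt, country FROM employees "
          "WHERE salary > 2000 GROUP BY country ORDER BY cnt DESC, country")
AFTER = ("SELECT COUNT(emp_id) AS cnt, country FROM employees "
         "WHERE salary < 2000 GROUP BY country ORDER BY cnt DESC, country")

diff = reconcile(conn, BEFORE, AFTER)
for row in diff:
    print(row)
```

Giving both queries an explicit ORDER BY matters here, since pairing by position is only meaningful when the row order is deterministic.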

Counting occurrences in several columns

I'm making an app that shows people movies to rate, a-la hot-or-not. I'd like to write a query that gets me the number of times a movie has been rated. The table for ratings looks like this:
| id | winner | loser |
| 1 | 1 | 2 |
| 2 | 2 | 3 |
| 3 | 1 | 3 |
I can get the number of times a movie has "won" by running a query like this:
SELECT winner, count(winner) AS number_of_wins
FROM movie_results
GROUP BY winner
ORDER BY number_of_wins DESC;
But I'd like to get another query that shows the total number of times a movie was pitched against other movies, i.e. the number of times a movie has appeared to be rated, whether it was rated above or below the other movie. What is the easiest way to achieve this, using only SQL queries?
Here is one method, using union all:
select movie, count(*) as nummatches, sum(win) as numwins
from ((select winner as movie, 1 as win from movie_results) union all
(select loser, 0 from movie_results)
) wl
group by movie;
You can do a full join between two derived tables, one holding the number of wins and the other the number of losses for each movie (called player in the query below).
select
coalesce(winner,loser) player,
coalesce(number_of_wins,0) number_of_wins,
coalesce(number_of_losses,0) number_of_losses,
coalesce(number_of_wins,0) + coalesce(number_of_losses,0) number_of_matches
from (
select winner, count(*) number_of_wins
from movie_results
group by winner
) winners full join (
select loser, count(*) number_of_losses
from movie_results
group by loser
) losers on losers.loser = winners.winner
http://sqlfiddle.com/#!15/980d6/3
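The union all method from the first answer is easy to verify on the three sample rows from the question. A minimal sketch using SQLite through Python, assuming the table is named movie_results as in the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie_results (id INTEGER PRIMARY KEY, winner INTEGER, loser INTEGER)"
)
conn.executemany(
    "INSERT INTO movie_results VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 2, 3), (3, 1, 3)],
)

# Stack winners (win = 1) and losers (win = 0) into one column, then count
# appearances per movie and sum the win flags.
stats = conn.execute(
    """
    SELECT movie, COUNT(*) AS nummatches, SUM(win) AS numwins
    FROM (SELECT winner AS movie, 1 AS win FROM movie_results
          UNION ALL
          SELECT loser, 0 FROM movie_results) AS wl
    GROUP BY movie
    ORDER BY movie
    """
).fetchall()

for movie, nummatches, numwins in stats:
    print(movie, nummatches, numwins)
```

UNION ALL rather than UNION is essential: UNION would drop duplicate (movie, win) pairs and undercount movies that won or lost more than once.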