Retrieving most frequent value for each group in SQL Server - sql

This is what I have:
AirlineName Departure_City No_of_DepartureCity Arrival_City No_of_ArrivalCity
---------------------------------------------------------------------------------------------------- -------------- ------------------- ------------ -----------------
Air Asia MY 2 JPN 2
Emirates Airlines MY 2 JPN 2
Malaysia Airlines MY 2 GER 2
Malaysia Airlines MY 1 JPN 1
Air Asia MY 1 KOR 1
This is what I want:
AirlineName Departure_City No_of_DepartureCity Arrival_City No_of_ArrivalCity
---------------------------------------------------------------------------------------------------- -------------- ------------------- ------------ -----------------
Air Asia MY 2 JPN 2
Emirates Airlines MY 2 JPN 2
Malaysia Airlines MY 2 GER 2
I have already written a query to retrieve the most frequent data for Departure_City and Arrival_City, but I can't make it grouped together and only show the most frequent data for each AirlineName.
This is my query so far:
SELECT Airline.AirlineName, Flight_Schedule.Departure_City, COUNT(Flight_Schedule.Departure_City) AS No_of_DepartureCity, Flight_Schedule.Arrival_City, COUNT(Flight_Schedule.Arrival_City) AS No_of_ArrivalCity
FROM Airline
LEFT JOIN Aircraft ON Airline.AirlineID = Aircraft.AirlineID
LEFT JOIN Flight_Schedule ON Aircraft.AircraftID = Flight_Schedule.AircraftID
GROUP BY Airline.AirlineName, Flight_Schedule.Departure_City, Flight_Schedule.Arrival_City
ORDER BY COUNT(Flight_Schedule.Departure_City)DESC , COUNT(Flight_Schedule.Arrival_City) DESC

You can make use of Rank or Dense_rank (If you want to select more than two rows having same number of cities) function
Demo
with CTE1 AS(
SELECT A.*,
RANK() OVER(PARTITION BY AirlineName ORDER BY No_of_ArrivalCity desc) as rn
FROM TABLE1 A)
SELECT * FROM CTE1 where rn = 1;

As you're grouping by lots of columns, instead of just 'AirlineName' it's grouping by all of the different values across those number of columns.
To return the number of AirlineName's and their frequency try this:
SELECT Airline.AirlineName, COUNT(*) AS [COUNT]
FROM Airline
GROUP BY Airline.AirlineName
ORDER BY COUNT(*) DESC
If you need the additional columns then your code is already correct, because of how you are grouping it and the individual values contained within the columns.

Related

Display the resortid, name of the resort ,booking count and the total amount collected in each resort

This is the query written by me:
select resort.resortid,
resort.resortname,
(count(booking.bookingid)) as "TOTALBOOKING",
booking.totalcharge as "TOTALAMOUNT"
from resort
join booking on booking.resortid=resort.resortid
group by resort.resortid,resort.resortname,booking.totalcharge
order by resort.resortid;
and output:
RESORTID RESORTNAME TOTALBOOKING TOTALAMOUNT
---------- ------------------------------ ------------ -----------
1001 TREE OF LIFE RESORT 1 30000
1001 TREE OF LIFE RESORT 1 48000
1002 PALETTE RESORTS 1 30000
1003 GRAND GRT 1 32000
1004 GREEN COCONUT 1 30000
Expected Output:
RESORTID RESORTNAME TOTALBOOKING TOTALAMOUNT
---------- ------------------------------ ------------ -----------
1001 TREE OF LIFE RESORT 2 78000
1002 PALETTE RESORTS 1 30000
1003 GRAND GRT 1 32000
1004 GREEN COCONUT 1 30000
How to add the count of totalbookings having same resortid ?
You can use the windowed version of count().
SELECT resort.resortid,
resort.resortname,
count(booking.bookingid) AS "TOTALBOOKING",
booking.totalcharge AS "TOTALAMOUNT",
count(booking.bookingid) OVER (PARTITION BY resort.resortid) AS "totalbookings having same resortid"
FROM resort
INNER JOIN booking
ON booking.resortid = resort.resortid
GROUP BY resort.resortid,
resort.resortname,
booking.totalcharge
ORDER BY resort.resortid;
I think that you want to sum() the totalcharge per resortid and resortname: so you need just the two latter columns in the group by clause (not totalcharge).
select
r.resortid,
r.resortname,
count(*) totalbooking ,
sum(b.totalcharge) as totalamount
from resort r
inner join booking b on b.resortid = r.resortid
group by r.resortid, r.resortname
order by r.resortid;
Note that table aliases make the query easier to write and read.
This gives you one row per resort, with the total number of bookings and the sum of totalcharge.

SQLite percentages with small values

So I have this table of subscribers of users and the country they are in.
UserID | Name | Country
-------+-------------------+------------
1 | Zaphod Beeblebrox | UK
2 | Arthur Dent | UK
3 | Gene Kelly | USA
4 | Nat King Cole | USA
I need to produce a list of all the users by percentage from each of the countries. I also need all the smaller member countries (under 1%) to be collapsed into an "OTHERS" category.
I can accomplish a simple "top x" of members trivially with a
SELECT COUNTRY, COUNT(*) AS POPULATION FROM SUBSCRIBERS GROUP BY COUNTRY ORDER BY POPULATION DESC LIMIT 10
and can generate the percentages by PHP server side code, but I don't quite know how to:
Do all of it in SQL including percentage calculations directly in the result
Club all under 1% members into a single OTHERS category.
So I need something like this:
Country | Population
--------+-----------
USA | 25.4%
Brazil | 12%
UK | 5%
OTHERS | 65%
Appreciate the help!
Here is query for this, I used a subquery to count the total number of rows and then used that to get the percentage value for each. The 'Others' category was generated in a separate query. Rows are sorted by descending population with the Others row last.
SELECT * FROM
(SELECT country , ROUND((100.0*COUNT(*)/count_all),1) ||'%' AS population
FROM (SELECT count(*) count_all FROM subscribers) AS sq,
subscribers s
WHERE (SELECT 100*count(*)/count_all
FROM subscribers s2
WHERE s2.country = s.country) > 1
GROUP BY country
ORDER BY population DESC)
UNION ALL
SELECT 'OTHERS', IFNULL(ROUND(100.0*COUNT(*)/count_all,1),0.0) ||'%' AS population
FROM (SELECT count(*) count_all FROM subscribers) AS sq,
subscribers s
WHERE (SELECT 100*count(*)/count_all
FROM subscribers s2
WHERE s2.country = s.country) <= 1
Ok I think I might have found a way to do this that's a hell of a lot quicker on execution speed:
SELECT territory,
Round(Sum(percentage), 3) AS Population
FROM (SELECT
Round((Count(*)*100.0)/(SELECT Count(*) FROM subscribers),3) AS Percentage,
CASE
WHEN ((Count(*)*100.0)/(SELECT Count(*) FROM subscribers)) > 2 THEN
country
ELSE 'Other'
END AS Territory
FROM subscribers
GROUP BY country
ORDER BY percentage DESC)
GROUP BY territory
ORDER BY population DESC;

Reconciliation Automation Query

I have one database and time to time i change some part of query as per requirement.
i want to keep record of results of both before and after result of these queries in one table and want to show queries which generate difference.
For Example,
Consider following table
emp_id country salary
---------------------
1 usa 1000
2 uk 2500
3 uk 1200
4 usa 3500
5 usa 4000
6 uk 1100
Now, my before query is :
Before Query:
select count(emp_id) as count,country from table where salary>2000 group by country;
Before Result:
count country
2 usa
1 uk
After Query:
select count(emp_id) as count,country from table where salary<2000 group by country;
After Query Result:
count country
2 uk
1 usa
My Final Result or Table I want is:
column 1 | column 2 | column 3 | column 4 |
2 usa 2 uk
1 uk 1 usa
...... but if query results are same than it shouldn't show in this table.
Thanks in advance.
I believe that you can use the same approach as here.
select t1.*, t2.* -- if you need specific columns without rn than you have to list them here
from
(
select t.*, row_number() over (order by count) rn
from
(
-- query #1
select count(emp_id) as count,country from table where salary>2000 group by country;
) t
) t1
full join
(
select t.*, row_number() over (order by count) rn
from
(
-- query #2
select count(emp_id) as count,country from table where salary<2000 group by country;
) t
) t2 on t1.rn = t2.rn

Hive sql: count and avg

I'm recently trying to learn Hive and i have a problem with a sql consult.
I have a json file with some information. I want to get the average for each register. Better in example:
country times
USA 1
USA 1
USA 1
ES 1
ES 1
ENG 1
FR 1
then with next consult:
select country, count(*) from data;
I obtain:
country times
USA 3
ES 2
ENG 1
FR 1
then i should get next out:
country avg
USA 0,42 (3/7)
ES 0,28 (2/7)
ENG 0,14 (1/7)
FR 0,14 (1/7)
I don't know how i can obtain this out from the first table.
I tried:
select t1.country, avg(t1.tm),
from (
select country,count(*)as tm from data where not country is null group by country
) t1
group by t1.country;
but my out is wrong.
Thanks for help!! BR.
Divide the each group count by total count to get the result. Use Sub-Query to find the total number of records in your table
Try this
select t1.country, count(*)/IFNULL((select cast(count(*) as float) from data),0)
from data
group by t1.country;

Specific Ordering in SQL

I have a SQL Server 2008 database. In this database, I have a result set that looks like the following:
ID Name Department LastOrderDate
-- ---- ---------- -------------
1 Golf Balls Sports 01/01/2015
2 Compact Disc Electronics 02/01/2015
3 Tires Automotive 01/15/2015
4 T-Shirt Clothing 01/10/2015
5 DVD Electronics 01/07/2015
6 Tennis Balls Sports 01/09/2015
7 Sweatshirt Clothing 01/04/2015
...
For some reason, my users want to get the results ordered by department, then last order date. However, not by department name. Instead, the departments will be in a specific order. For example, they want to see the results ordered by Electronics, Automotive, Sports, then Clothing. To throw another kink in works, I cannot update the table schema.
Is there a way to do this with a SQL Query? If so, how? Currently, I'm stuck at
SELECT *
FROM
vOrders o
ORDER BY
o.LastOrderDate
Thank you!
You can use case expression ;
order by case when department = 'Electronics' then 1
when department = 'Automotive' then 2
when department = 'Sports' then 3
when department = 'Clothing' then 4
else 5 end
create a table for the departments that has the name (or better id) of the department and the display order. then join to that table and order by the display order column.
alternatively you can do a order by case:
ORDER BY CASE WHEN Department = 'Electronics' THEN 1
WHEN Department = 'Automotive' THEN 2
...
END
(that is not recommended for larger tables)
Here solution with CTE
with c (iOrder, dept)
as (
Select 1, 'Electronics'
Union
Select 2, 'Automotive'
Union
Select 3, 'Sports'
Union
Select 4, 'Clothing'
)
Select * from c
SELECT o.*
FROM
vOrders o join c
on c.dept = o.Department
ORDER BY
c.iOrder