SQL Rank and SUM - sql

I am having an issue with an SQL query.
I currently have two tables. The first table lists sales by vendor and country, for example (there are many more rows, but this is the gist):
Country id Sale
US 1 100
UK 2 1000
US 3 150
UK 2 200
In the second table I have ids that link to the vendor's name, e.g.
id name
1 john
2 david
3 tom
I need to get the top vendor in each country by sum of sales. The output should look something like this:
country id name sum_sales
Would you be able to help? Currently I am only able to group by and sum, and I am unable to obtain the top vendor in each country. Thank you!
I am running this on BigQuery SQL.

Use dense_rank() with aggregation:
select yr, Country, id, name, total_sales
from (select extract(year from s.date) as yr,
s.Country, s.id, v.name, sum(s.sales) as total_sales,
dense_rank() over (partition by s.date, s.country order by sum(s.sales) desc) as seq
from sales s inner join
vendors v
on v.id = s.id
group by s.date, s.Country, s.id, v.name
) t
where seq <= 2;
EDIT: For a specific year format, use FORMAT_DATETIME:
FORMAT_DATETIME("%Y", DATETIME "2020-03-19")
This way, you will get the vendors with the highest sales in each country.
Note: This will display two or more vendors that have the same total sales. If you want only one of them, use row_number() instead of dense_rank().
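To make the tie behaviour in the note concrete, here is a minimal sketch that runs both functions side by side. It uses Python's sqlite3 module rather than BigQuery (an assumption made only so the example is self-contained; window functions need SQLite 3.25 or later), and the totals table with its rows is hypothetical.

```python
import sqlite3

# Hypothetical pre-aggregated totals: vendors 1 and 2 tie for first place.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totals (country TEXT, id INTEGER, total_sales INTEGER)")
conn.executemany(
    "INSERT INTO totals VALUES (?, ?, ?)",
    [("US", 1, 250), ("US", 2, 250), ("US", 3, 100)],
)

query = """
SELECT id,
       DENSE_RANK() OVER (PARTITION BY country ORDER BY total_sales DESC) AS dr,
       -- id is added as a tie-breaker so the row numbers are deterministic
       ROW_NUMBER() OVER (PARTITION BY country ORDER BY total_sales DESC, id) AS rn
FROM totals
ORDER BY id
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
# (1, 1, 1)
# (2, 1, 2)
# (3, 2, 3)
```

Filtering on dr = 1 keeps both tied vendors, while filtering on rn = 1 keeps exactly one.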

In BigQuery, you can use window functions with aggregation:
select id, name, country, sum_sales
from (select s.id, v.name, s.country, sum(sales) as sum_sales,
row_number() over (partition by s.country order by sum(sales) desc) as seqnum
from sales s join
vendors v
on v.id = s.id
group by s.id, v.name, s.country
) sv
where seqnum = 1;
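As a quick check, the same pattern can be run against the sample rows from the question. This is a sketch using Python's sqlite3 module instead of BigQuery (an assumption, chosen so the example runs anywhere with SQLite 3.25 or later); the table names sales and vendors and the column name sale follow the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (country TEXT, id INTEGER, sale INTEGER);
CREATE TABLE vendors (id INTEGER, name TEXT);
INSERT INTO sales VALUES ('US', 1, 100), ('UK', 2, 1000), ('US', 3, 150), ('UK', 2, 200);
INSERT INTO vendors VALUES (1, 'john'), (2, 'david'), (3, 'tom');
""")

query = """
SELECT country, id, name, sum_sales
FROM (SELECT s.country, s.id, v.name, SUM(s.sale) AS sum_sales,
             ROW_NUMBER() OVER (PARTITION BY s.country
                                ORDER BY SUM(s.sale) DESC) AS seqnum
      FROM sales s
      JOIN vendors v ON v.id = s.id   -- join the two tables on the vendor id
      GROUP BY s.country, s.id, v.name) sv
WHERE seqnum = 1
ORDER BY country
"""
top_vendors = conn.execute(query).fetchall()
for row in top_vendors:
    print(row)
# ('UK', 2, 'david', 1200)
# ('US', 3, 'tom', 150)
```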

Below is for BigQuery Standard SQL
#standardSQL
SELECT AS VALUE ARRAY_AGG(t ORDER BY sum_sale DESC LIMIT 1)[OFFSET(0)]
FROM (
SELECT country, id, name, SUM(sale) sum_sale
FROM `project.dataset.vendors`
JOIN `project.dataset.sales`
USING(id)
GROUP BY id, name, country
) t
GROUP BY country

Related

how to make a request?

I have a table Tabl1 : id, name, country, year, medal.
How can I find the top 10 countries by number of medals for each year in one query?
Thanks :)
You haven't told us anything about your table schema or the data, so this is a guess!
Going to assume your medal column contains the qty of medals for each Id/name, so you just need to rank by the sum of medals. Something along the lines of:
select [year], country, [Rank] from (
select [year], country, Rank() over(partition by [year] order by Sum(medal) desc ) [Rank]
from Tabl1
group by [year],country
)x
where [Rank]<=10
order by [year], [Rank]
Here is how you can get the top 10 countries in each year:
select * from
(
select country, year, count(*) as medal_count, row_number() over (partition by year order by count(*) desc) as rn
from Tabl1
group by country, year
) tt
where tt.rn < 11
The subquery groups the data by country and year and computes count(*) for each group. At the same time, the row_number() window function numbers the groups within each year in descending order of count(*), so the country with the most medals in each year gets row number 1. Since you need the top 10, the outer query filters on tt.rn < 11.
If you want 10 countries per year:
with data as (
select country, "year" as yr,
rank() over (partition by "year" order by count(*) desc) as rnk
from T
group by country, "year"
)
select yr as "year", country from data
where rnk <= 10
order by yr, rnk;
Note that if ties are possible this could return more than ten rows for any given year.
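The effect of ties can be seen in a small sketch. It asks for the top 2 rather than the top 10 to keep the data short, runs on Python's sqlite3 module (SQLite 3.25 or later), and uses a hypothetical medal_rows table with one row per medal won.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medal_rows (country TEXT, yr INTEGER)")
# Hypothetical data for one year: A wins 3 medals, B and C win 2 each, D wins 1.
rows = [("A", 2000)] * 3 + [("B", 2000)] * 2 + [("C", 2000)] * 2 + [("D", 2000)]
conn.executemany("INSERT INTO medal_rows VALUES (?, ?)", rows)

query = """
WITH data AS (
    SELECT country, yr, COUNT(*) AS n,
           RANK() OVER (PARTITION BY yr ORDER BY COUNT(*) DESC) AS rnk
    FROM medal_rows
    GROUP BY country, yr
)
SELECT yr, country, n, rnk
FROM data
WHERE rnk <= 2
ORDER BY yr, rnk, country
"""
top_two = conn.execute(query).fetchall()
for row in top_two:
    print(row)
# (2000, 'A', 3, 1)
# (2000, 'B', 2, 2)
# (2000, 'C', 2, 2)
```

Three rows come back for a "top 2" because B and C share rank 2. Swapping RANK() for ROW_NUMBER() would cap the result at two rows, at the cost of picking one of the tied countries arbitrarily.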

How to retrieve the most frequent value of a column for a specific ID in a table

I'm trying to fetch the most frequent value from a SQLite 3 database table for each specific ID (which is the ID of a company). I have tried with GROUP BY and ORDER BY as well as with COUNT() function.
SELECT company_id, max(car)
FROM car_orders
GROUP by company_id
ORDER by max(car)
For a specific company_id (9) I am expecting 'Audi' in the result, but I get 'Volkswagen' instead (which is wrong).
Similar to your attempts, consider joining two aggregates: one that calculates the COUNT per car and company, and one that takes the MAX of that count per company. The query below uses CTEs, which were introduced in SQLite version 3.8.3, released in February 2014.
WITH cnt AS (
SELECT company_id, car, COUNT(*) AS car_count
FROM car_orders
GROUP by company_id, car
),
max_cnt AS (
SELECT cnt.company_id, MAX(cnt.car_count) as max_count
FROM cnt
GROUP BY cnt.company_id
)
SELECT cnt.company_id, cnt.car
FROM cnt
INNER JOIN max_cnt
ON cnt.company_id = max_cnt.company_id
AND cnt.car_count = max_cnt.max_count
In more recent versions of SQLite (3.25.0 and later), you can use window functions:
SELECT cc.*
FROM (SELECT company_id, car, COUNT(*) as cnt,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY COUNT(*) DESC) as seqnum
FROM car_orders
GROUP by company_id, car
) cc
WHERE seqnum = 1;
In earlier versions, it is a little more complicated:
WITH cc as (
SELECT company_id, car, COUNT(*) as cnt
FROM car_orders
GROUP by company_id, car
)
SELECT cc.*
FROM cc
WHERE cc.cnt = (SELECT MAX(cc2.cnt)
FROM cc cc2
WHERE cc2.company_id = cc.company_id
);
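The sketch below contrasts the query from the question with the correlated-subquery version, using Python's sqlite3 module. The rows are hypothetical (the question does not show its data). It illustrates why the original attempt returns 'Volkswagen': MAX on a text column picks the value that sorts last, not the most frequent one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_orders (company_id INTEGER, car TEXT)")
# Hypothetical orders for company 9: Audi is ordered most often.
conn.executemany(
    "INSERT INTO car_orders VALUES (?, ?)",
    [(9, "Audi"), (9, "Audi"), (9, "Audi"), (9, "BMW"), (9, "Volkswagen")],
)

# The attempt from the question: MAX(car) compares strings, it does not count them.
alphabetical = conn.execute(
    "SELECT company_id, MAX(car) FROM car_orders GROUP BY company_id"
).fetchall()

# Count per (company, car), then keep the rows whose count equals the company maximum.
most_frequent = conn.execute("""
WITH cc AS (
    SELECT company_id, car, COUNT(*) AS cnt
    FROM car_orders
    GROUP BY company_id, car
)
SELECT cc.company_id, cc.car, cc.cnt
FROM cc
WHERE cc.cnt = (SELECT MAX(cc2.cnt)
                FROM cc cc2
                WHERE cc2.company_id = cc.company_id)
""").fetchall()

print(alphabetical)   # [(9, 'Volkswagen')]
print(most_frequent)  # [(9, 'Audi', 3)]
```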

SQL RANK with multiple WHERE clause

I have a few sales offices, together with their sales. I am trying to set up a report that shows how each office is performing. Getting SUMs and COUNTs is easy enough, but I am struggling to get the rank of a single office.
I would like to have this query return the rank of single office, during the entire period and/or specified time (eg. BETWEEN '2015-01-01' AND '2015-01-15')
I need to also exclude some offices from the rank list (eg. OfficeName NOT IN ('GGG','QQQ')), so using the sample data, the rank of office 'XYZ' would be 5.
If OfficeName = 'XYZ' is included in the WHERE clause, the rank is obviously 1, because SQL filters out the rows that do not match the WHERE clause before evaluating the rest of the query.
Is there any way of doing this without using a temporary table?
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
--AND OfficeName = 'XYZ'
GROUP BY OfficeName
ORDER BY 2 DESC;
I am using MS SQL server 2008.
SQL Fiddle with some random data is here: http://sqlfiddle.com/#!3/fac7a/35
Many thanks for help!
If I understand you correctly, you want to do:
SELECT *
FROM (
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
) dat
WHERE OfficeName = 'XYZ';
You just need to wrap your code as derived table or use a CTE like this and then do the filter for OfficeName = 'XYZ'.
;WITH CTE AS
(
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
)
SELECT *
FROM CTE
WHERE OfficeName = 'XYZ';
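A small sketch shows the difference between filtering inside and outside the ranked query. It runs on Python's sqlite3 module rather than SQL Server 2008 (an assumption for portability; SQLite 3.25 or later is needed), the rows are hypothetical rather than taken from the SQL Fiddle, and the aliases Total and OfficeRank stand in for SUM and Rank to avoid keyword clashes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Office (ID INTEGER, OfficeName TEXT);
CREATE TABLE Transactions (TransID INTEGER, Value INTEGER);
INSERT INTO Office VALUES (1, 'AAA'), (2, 'GGG'), (3, 'XYZ'), (4, 'BBB');
INSERT INTO Transactions VALUES (1, 500), (2, 900), (3, 300), (3, 100), (4, 200);
""")

ranked = """
SELECT OfficeName, SUM(Value) AS Total,
       RANK() OVER (ORDER BY SUM(Value) DESC) AS OfficeRank
FROM Transactions t
JOIN Office o ON t.TransID = o.ID
WHERE OfficeName NOT IN ('GGG', 'QQQ') {extra}
GROUP BY OfficeName
"""

# Filter inside: only XYZ survives the WHERE clause, so it is ranked against itself.
inner = conn.execute(ranked.format(extra="AND OfficeName = 'XYZ'")).fetchall()

# Filter outside: all remaining offices are ranked first, then XYZ is picked out.
outer = conn.execute(
    "SELECT * FROM (" + ranked.format(extra="") + ") dat WHERE OfficeName = 'XYZ'"
).fetchall()

print(inner)  # [('XYZ', 400, 1)]
print(outer)  # [('XYZ', 400, 2)]
```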
Here is an amusing way to do this without a subquery:
SELECT TOP 1 OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t JOIN
Office o
ON t.TransID = o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
ORDER BY (CASE WHEN OfficeName = 'XYZ' THEN 1 ELSE 2 END);

Max count by customer and category

I have transactional data that looks like this
Account ProductCategory
1 a
1 a
1 b
2 c
2 d
2 d
I need to find the ProductCategory that appears most per customer. Results:
Account ProductCategory
1 a
2 d
My query ended up long, with many nested subqueries. Any good ideas?
Thank you in advance for the help.
Most databases support the ANSI-standard window functions, particularly row_number(). You can use this with aggregation to get what you want:
select Account, ProductCategory
from (select Account, ProductCategory, count(*) as cnt,
row_number() over (partition by Account order by count(*) desc) as seqnum
from table t
group by Account, ProductCategory
) apc
where seqnum = 1;
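Running this pattern against the sample data from the question gives the expected result. The sketch uses Python's sqlite3 module (SQLite 3.25 or later), and the table name purchases is a hypothetical stand-in, since the question does not name its table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (Account INTEGER, ProductCategory TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [(1, "a"), (1, "a"), (1, "b"), (2, "c"), (2, "d"), (2, "d")],
)

query = """
SELECT Account, ProductCategory
FROM (SELECT Account, ProductCategory, COUNT(*) AS cnt,
             ROW_NUMBER() OVER (PARTITION BY Account
                                ORDER BY COUNT(*) DESC) AS seqnum
      FROM purchases
      GROUP BY Account, ProductCategory) apc
WHERE seqnum = 1
ORDER BY Account
"""
top_categories = conn.execute(query).fetchall()
print(top_categories)  # [(1, 'a'), (2, 'd')]
```

If two categories tie within an account, ROW_NUMBER() keeps one of them arbitrarily; adding a second ORDER BY key inside the window makes that choice deterministic.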
This can be done using analytic SQL, or just using a count over a group. The syntax depends on the RDBMS, as Michael asked.
You can try the following SQL:
select * from
(select account, ProductCategory, ct , ROW_NUMBER() OVER (partition by account ORDER BY ct DESC ) As myRank
from (select account, ProductCategory, count(0) as ct
from <table>
group by account, ProductCategory ) t ) t2
where t2.myRank = 1
Code:
WITH A AS (SELECT [Account], ProductCategory, COUNT([ProductCategory]) OVER(PARTITION BY [Account], ProductCategory) AS [Count]
FROM tbl_all)
SELECT A.Account, ProductCategory
FROM A INNER JOIN (SELECT Account, MAX([Count]) AS Count FROM A GROUP BY A.Account) AS B ON A.Account=B.Account AND A.Count=B.Count
GROUP BY A.Account, ProductCategory

Selecting maximum value for a column with constraint

Table is as follows
Company, Vertical, Counts
For each company I want to get the SUM of counts, shown alongside the specific Vertical that has the highest count
Company Vertical Counts
IBM Finance 10
IBM R&D 5
IBM PR 2
I would like to get the following output
IBM Finance 17
A self-join should do it.
select company, vertical, total_count
from(
select sum(counts) as total_count
from table
)a
cross join table
where counts=(select max(counts) from table);
Depending on your RDBMS, you can also use a window function (e.g. sum(counts) over () as total_count) and not have to worry about the cross join.
It's a twist on the problem of "How to get the MAX row" (DBA.SE link)
get total and highest vertical per Company in a simple aggregate
use these to identify the row in the source table
Something like this, untested
SELECT
t.Company, t.Vertical, m.CompanyTotal
FROM
( --get total and highest count per Company
SELECT
SUM(Counts) AS CompanyTotal,
MAX(Counts) AS CompanyMaxCounts,
Company
FROM MyTable
GROUP BY Company
) m
JOIN --back to get the row for that company with the highest count
MyTable t ON m.Company = t.Company AND m.CompanyMaxCounts = t.Counts
Edit: this is closer to standard SQL than a ROW_NUMBER because we don't know the platform
select Company,
Vertical,
SumCounts
from (
select Company,
Vertical,
row_number() over(partition by Company order by Counts desc) as rn,
sum(Counts) over(partition by Company) as SumCounts
from YourTable
) as T
where rn = 1
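The window-function version can be checked against the IBM rows from the question. This sketch uses Python's sqlite3 module (SQLite 3.25 or later); the ACME rows are hypothetical and are included only to show that the result is computed per company.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (Company TEXT, Vertical TEXT, Counts INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [
        ("IBM", "Finance", 10), ("IBM", "R&D", 5), ("IBM", "PR", 2),
        ("ACME", "Sales", 4), ("ACME", "R&D", 6),  # hypothetical second company
    ],
)

query = """
SELECT Company, Vertical, SumCounts
FROM (SELECT Company, Vertical,
             -- rn = 1 marks the vertical with the highest count in each company
             ROW_NUMBER() OVER (PARTITION BY Company ORDER BY Counts DESC) AS rn,
             -- every row carries its company's total
             SUM(Counts) OVER (PARTITION BY Company) AS SumCounts
      FROM YourTable) AS T
WHERE rn = 1
ORDER BY Company
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
# ('ACME', 'R&D', 10)
# ('IBM', 'Finance', 17)
```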
SELECT company,
vertical,
total_sum
FROM (
SELECT Company,
Vertical,
sum(counts) over (partition by Company) as total_sum,
rank() over (partition by Company order by counts desc) as count_rank
FROM the_table
) t
WHERE count_rank = 1