PostgreSQL list companies and rank by sales - sql

So I have:
companies (id, name, tenant_id)
invoices (id, company_id, tenant_id, total)
What I want to do is return a result set like:
company | Feb Sales | Feb Rank | Lifetime Sales | Lifetime Rank
-----------------------------------------------------------------------
ABC Comp | 1,000 | 1 | 2,000 | 2
XYZ Corp | 500 | 2 | 5,000 | 1
I can do the sales totals using subselects, but when I do, the rank always returns 1. I'm assuming that's because each subselect returns only 1 row, so that row will always be the top row?
Here is a piece of the sql:
SELECT
"public".companies."name",
(
SELECT
rank() OVER (PARTITION BY i.tenant_id ORDER BY sum(grand_total) DESC) AS POSITION
FROM
invoices i
where
company_id = companies.id
group by
i.tenant_id, i.company_id
)
from companies

Below is an untested version that may contain typos, so please treat it as a description of the approach. For simplicity I assumed that invoices have a month column.
SELECT
c."name",
COALESCE(sales.month_total, 0) AS "Feb Sales",
rank() OVER (PARTITION BY c.tenant_id ORDER BY COALESCE(sales.month_total, 0) DESC) AS "Feb Rank",
COALESCE(sales.lifetime_total, 0) AS "Lifetime Sales",
rank() OVER (PARTITION BY c.tenant_id ORDER BY COALESCE(sales.lifetime_total, 0) DESC) AS "Lifetime Rank"
FROM "public".companies c LEFT JOIN
(
SELECT
i.company_id,
SUM(i.grand_total) AS lifetime_total,
SUM(CASE WHEN i.month = <the month of report> THEN i.grand_total ELSE 0 END) AS month_total
FROM
invoices i
GROUP BY i.company_id
) sales
ON c.id = sales.company_id
If you run into problems, add the actual code that you used and sample data to your post and I will attempt to create a live demo for you.
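A minimal, self-contained sketch of the approach above: aggregate the invoices once per company, then rank over the joined result. It uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or later). The sample rows, the month column, and the choice to partition the ranks by tenant_id are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, tenant_id INTEGER);
CREATE TABLE invoices (id INTEGER PRIMARY KEY, company_id INTEGER,
                       tenant_id INTEGER, month INTEGER, grand_total INTEGER);
-- Hypothetical sample data chosen to reproduce the table in the question.
INSERT INTO companies VALUES (1, 'ABC Comp', 1), (2, 'XYZ Corp', 1);
INSERT INTO invoices (company_id, tenant_id, month, grand_total) VALUES
    (1, 1, 2, 1000), (1, 1, 1, 1000),
    (2, 1, 2, 500),  (2, 1, 1, 4500);
""")

RANK_SQL = """
SELECT c.name,
       COALESCE(s.month_total, 0) AS month_sales,
       rank() OVER (PARTITION BY c.tenant_id
                    ORDER BY COALESCE(s.month_total, 0) DESC) AS month_rank,
       COALESCE(s.lifetime_total, 0) AS lifetime_sales,
       rank() OVER (PARTITION BY c.tenant_id
                    ORDER BY COALESCE(s.lifetime_total, 0) DESC) AS lifetime_rank
FROM companies c
LEFT JOIN (
    -- One row per company, so the outer rank() sees every company at once.
    SELECT company_id,
           SUM(grand_total) AS lifetime_total,
           SUM(CASE WHEN month = :month THEN grand_total ELSE 0 END) AS month_total
    FROM invoices
    GROUP BY company_id
) s ON s.company_id = c.id
ORDER BY month_rank
"""

def rank_companies(month):
    """Return (name, month sales, month rank, lifetime sales, lifetime rank)."""
    return conn.execute(RANK_SQL, {"month": month}).fetchall()

for row in rank_companies(2):
    print(row)
```

The rank is computed in the outer query, where all companies are visible at once. That is the difference from a correlated subselect, which only ever sees a single company and therefore always ranks it first.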

Related

SELECT only rows when count=1 - without additional SELECT or/ and having

I wonder if there is a way to build a query without joins and/or a HAVING clause that would return the same result as the query below. I already found a similar question (select and count rows) but didn't find the answer.
SELECT ID, CATEGORY, PRODUCT, DESC
FROM SALES s
JOIN (SELECT ID, COUNT(CATEGORY)
FROM SALES
GROUP by ID
HAVING count(CATEGORY)=1) S2 ON S.ID=S2.ID;
So the table looks like
ID | Country | Product | DESC
1 | USA | Cream | Super cream
1 | Canada | Toothpaste| Great Toothpaste
2 | Germany | Beer | Tasty Beer
and the result I would like to get is
ID | Country | Product | DESC
2 | Germany | Beer | Tasty Beer
because id=1 has 2 different countries assigned
I'm using SQL Server
In general I'm interested in the 'fastest' solution. The table is huge and I just wonder if there is a way to do it smarter.
You may want to consider this query:
select t2.id, t2.category, t2.product, t2.desc from (
select id, category, product,
(select count(1) from sales where id = t1.id) as ct
,desc
from sales t1) t2 where t2.ct = 1
You can try this Query:
SELECT ID, CATEGORY, PRODUCT, DESC
FROM SALES s
WHERE 1 = (
SELECT COUNT(*)
FROM SALES x
WHERE x.ID = s.ID
);
One method uses window functions:
SELECT ID, CATEGORY, PRODUCT, DESC
FROM (SELECT s.*, COUNT(*) OVER (PARTITION BY ID) as cnt
FROM SALES s
) s
WHERE cnt = 1;
However, the fastest solution would require a unique id and an index. That would be:
select s.*
from sales s
where not exists (select 1
from sales s2
where s2.id = s.id and
s2.<unique key> <> s.<unique key>
);
This can take advantage of an index on (id, <unique key>).
Note: This particular formulation assumes that category is never null.
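To compare the window-function and NOT EXISTS formulations on the question's sample rows, here is a small sketch using SQLite through Python's sqlite3 module as a stand-in for SQL Server. The sale_key column is a hypothetical unique key, and descr replaces the reserved word DESC; both are assumptions, not part of the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sale_key INTEGER PRIMARY KEY,  -- hypothetical unique key
                    id INTEGER, country TEXT, product TEXT, descr TEXT);
-- Index matching the (id, <unique key>) suggestion above.
CREATE INDEX sales_id_key ON sales (id, sale_key);
INSERT INTO sales (id, country, product, descr) VALUES
    (1, 'USA', 'Cream', 'Super cream'),
    (1, 'Canada', 'Toothpaste', 'Great Toothpaste'),
    (2, 'Germany', 'Beer', 'Tasty Beer');
""")

# Count the rows per id with a window function, then keep the singletons.
WINDOW_SQL = """
SELECT id, country, product, descr
FROM (SELECT s.*, COUNT(*) OVER (PARTITION BY id) AS cnt FROM sales s)
WHERE cnt = 1
"""

# Keep a row only if no other row (different key) shares its id.
NOT_EXISTS_SQL = """
SELECT s.id, s.country, s.product, s.descr
FROM sales s
WHERE NOT EXISTS (SELECT 1 FROM sales s2
                  WHERE s2.id = s.id AND s2.sale_key <> s.sale_key)
"""

window_rows = conn.execute(WINDOW_SQL).fetchall()
not_exists_rows = conn.execute(NOT_EXISTS_SQL).fetchall()
print(window_rows)
print(not_exists_rows)
```

Both queries return the same rows on this data. Which one is faster on a huge table depends on the engine and the available indexes, so it is worth checking the execution plans on the real data.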

Retrieve rows in SQL based on Maximum value in multiple columns

I have a SQL table with the following fields:
Company ID
Company Name
Fiscal Year
Fiscal Quarter
There are multiple records for various fiscal years and fiscal quarters for each company. I want to retrieve the rows for each company based on Maximum Fiscal Year and Maximum Fiscal Quarter. For example, if the table has the following:
Company ID | Company Name | Fiscal Year | Fiscal Quarter
1 | Test1 | 2017 | 1
1 | Test1 | 2017 | 2
1 | Test1 | 2018 | 1
1 | Test1 | 2018 | 2
2 | Test2 | 2018 | 3
2 | Test2 | 2018 | 4
The query should return the following (Only the record with the maximum fiscal year and maximum fiscal quarter for that year):
Company ID | Company Name | Fiscal Year | Fiscal Quarter
1 | Test1 | 2018 | 2
2 | Test2 | 2018 | 4
I am able to use the below query to get the records with the maximum fiscal year but not sure how to further select the maximum quarter within the year:
SELECT fp.companyId, fp.companyname, fp.fiscalyear,fp.fiscalquarter
FROM dbo.ciqFinPeriod fp
LEFT OUTER JOIN dbo.ciqFinPeriod fp2
ON (fp.companyId = fp2.companyId AND fp.fiscalyear < fp2.fiscalyear)
WHERE fp2.companyId IS NULL
Thank you so much for any assistance!
If you have a list of companies, I would simply do:
select fp.*
from Companies c outer apply
(select top (1) fp.*
from dbo.ciqFinPeriod fp
where fp.companyId = c.companyId
order by fp.fiscalyear desc, fp.fiscalquarter desc
) fp;
If not, then row_number() is probably the simplest method:
select fp.*
from (select fp.*,
row_number() over (partition by fp.companyId order by fp.fiscalyear desc, fp.fiscalquarter desc) as seqnum
from dbo.ciqFinPeriod fp
) fp
where seqnum = 1;
Or the somewhat more abstruse (clever ?):
select top (1) with ties fp.*
from dbo.ciqFinPeriod fp
order by row_number() over (partition by fp.companyId order by fp.fiscalyear desc, fp.fiscalquarter desc)
I've had some success with the following, same output as you.
create table #table
(
CompanyID int,
CompanyName varchar(200),
Year int,
Quarter int
)
insert into #table (CompanyID,CompanyName,Year,Quarter)
VALUES
(1,'Test1',2017,1),
(1,'Test1',2017,2),
(1,'Test1',2018,1),
(1,'Test1',2018,2),
(2,'Test2',2018,3),
(2,'Test2',2018,4)
SELECT CompanyID,CompanyName,Year,Quarter
FROM
(
Select CompanyID,CompanyName,Year,Quarter
, ROW_NUMBER() OVER(PARTITION BY CompanyID ORDER BY Year desc,Quarter DESC)
as RowNum
from #table
) X WHERE RowNum = 1
drop table #table
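Select companyId, companyname, Max(fiscalyear), Max(fiscalquarter) from dbo.ciqFinPeriod group by companyId, companyname
Note that this takes the two maxima independently, so it matches the desired output only when each company's highest quarter number also occurs in its latest fiscal year.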
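The difference between ordering by (year, quarter) together and taking two separate maxima shows up as soon as a company has a high quarter in an earlier year. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server. The fin_period table, its column names, and the added 2017 Q4 row are assumptions made to expose that case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fin_period (company_id INTEGER, company_name TEXT,
                         fiscal_year INTEGER, fiscal_quarter INTEGER);
-- Hypothetical data: company 1 has a Q4 in 2017 but only Q1 and Q2 in 2018.
INSERT INTO fin_period VALUES
    (1, 'Test1', 2017, 4), (1, 'Test1', 2018, 1), (1, 'Test1', 2018, 2),
    (2, 'Test2', 2018, 3), (2, 'Test2', 2018, 4);
""")

# row_number() orders by year first, then by quarter within that year.
latest = conn.execute("""
    SELECT company_id, company_name, fiscal_year, fiscal_quarter
    FROM (SELECT fp.*,
                 row_number() OVER (PARTITION BY company_id
                                    ORDER BY fiscal_year DESC,
                                             fiscal_quarter DESC) AS seqnum
          FROM fin_period fp)
    WHERE seqnum = 1
    ORDER BY company_id
""").fetchall()

# Two independent maxima can combine values from different rows.
independent_max = conn.execute("""
    SELECT company_id, company_name, MAX(fiscal_year), MAX(fiscal_quarter)
    FROM fin_period
    GROUP BY company_id, company_name
    ORDER BY company_id
""").fetchall()

print(latest)
print(independent_max)
```

For company 1 the second query reports 2018 Q4, a period that does not exist in the table, while the row_number() version returns the real latest row, 2018 Q2.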

Get the results of a subquery in SQL

How do you create a join to get the latest invoice for all customers?
Tables:
- Invoices
- Customers
Customers table has: id, last_invoice_sent_at, last_invoice_guid
Invoices table has: id, customer_id, sent_at, guid
I'd like to fetch the latest invoice for every customer and, with that data, update last_invoice_sent_at and last_invoice_guid in the Customers table.
You want to use distinct on. For a query sorting by customer_id and then by sent_at descending, it returns the first row for each distinct value of the expression given in distinct on. Those are the rows marked with * below:
customer_id | sent_at |
1 | 2014-07-12 | *
1 | 2014-07-10 |
1 | 2014-07-09 |
2 | 2014-07-11 | *
2 | 2014-07-10 |
So your update query could look like:
update customers
set last_invoice_sent_at = sent_at
from (
select distinct on (customer_id)
customer_id,
sent_at
from invoices
order by customer_id, sent_at desc
) sub
where sub.customer_id = customers.id
@Konrad provided a flawless SQL statement. But since we are only interested in a single column, GROUP BY will be more efficient than DISTINCT ON (which is great to retrieve multiple columns from the same row):
UPDATE customers c
SET last_invoice_sent_at = sub.last_sent
FROM (
SELECT customer_id, max(sent_at) AS last_sent
FROM invoices
GROUP BY 1
) sub
WHERE sub.customer_id = c.id;
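The question also asks for last_invoice_guid, which has to come from the same row as the latest sent_at. The sketch below shows one way to update both columns together. It runs on SQLite through Python's sqlite3 module, which has no DISTINCT ON, so it swaps in a correlated subquery with a row-value assignment instead; in PostgreSQL, adding guid to the DISTINCT ON select list should achieve the same. The sample rows and guid values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY,
                        last_invoice_sent_at TEXT, last_invoice_guid TEXT);
CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer_id INTEGER,
                       sent_at TEXT, guid TEXT);
-- Hypothetical data; customer 3 has no invoices at all.
INSERT INTO customers (id) VALUES (1), (2), (3);
INSERT INTO invoices (customer_id, sent_at, guid) VALUES
    (1, '2014-07-12', 'g-1c'), (1, '2014-07-10', 'g-1b'), (1, '2014-07-09', 'g-1a'),
    (2, '2014-07-11', 'g-2b'), (2, '2014-07-10', 'g-2a');
""")

# Take sent_at and guid from the same (latest) invoice row per customer.
# The WHERE EXISTS guard leaves customers without invoices untouched.
conn.execute("""
    UPDATE customers
    SET (last_invoice_sent_at, last_invoice_guid) = (
        SELECT i.sent_at, i.guid
        FROM invoices i
        WHERE i.customer_id = customers.id
        ORDER BY i.sent_at DESC
        LIMIT 1)
    WHERE EXISTS (SELECT 1 FROM invoices i WHERE i.customer_id = customers.id)
""")

updated = conn.execute(
    "SELECT id, last_invoice_sent_at, last_invoice_guid FROM customers ORDER BY id"
).fetchall()
print(updated)
```

Note that max(sent_at) with GROUP BY cannot be extended this way: a second aggregate such as max(guid) would be computed independently and could come from a different invoice.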

Aggregate highest prices per client of salesmen

I have a table like this:
SELECT * FROM orders;
client_id | order_id | salesman_id | price
-----------+----------+-------------+-------
1 | 167 | 1 | 65
1 | 367 | 1 | 27
2 | 401 | 1 | 29
2 | 490 | 2 | 48
3 | 199 | 1 | 68
3 | 336 | 2 | 22
3 | 443 | 1 | 84
3 | 460 | 2 | 92
I want to find an array of order_ids for the highest priced sale of each unique salesman and client pair. In this case I want the resulting table:
salesman_id | order_id
-------------+----------------
1 | {167, 401, 443}
2 | {490, 460}
So far I have an outline for a query:
SELECT salesman_id, max_client_salesman(order_id)
FROM orders
GROUP BY salesman_id;
However I'm having trouble writing the aggregate function max_client_salesman.
The documentation online for aggregate functions and arrays in postgres is very minimal. Any help is appreciated.
Standard SQL
I would combine the window function last_value() or first_value() with DISTINCT to get the orders with the highest price per (salesman_id, client_id) efficiently and then aggregate these into the array you are looking for with the simple aggregate function array_agg().
SELECT salesman_id
,array_agg(max_order_id) AS most_expensive_orders_per_client
FROM (
SELECT DISTINCT
salesman_id, client_id
,last_value(order_id) OVER (PARTITION BY salesman_id, client_id
ORDER BY price
ROWS BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING) AS max_order_id
FROM orders
) x
GROUP BY salesman_id
ORDER BY salesman_id;
Returns:
salesman_id | most_expensive_orders_per_client
-------------+------------------------------------
1 | {167, 401, 443}
2 | {490, 460}
SQL Fiddle.
If there are multiple highest prices per (salesman_id, client_id), this query picks only one order_id arbitrarily, for lack of definition.
For this solution it is essential to understand that window functions are applied before DISTINCT. How to combine DISTINCT with a window function:
PostgreSQL: running count of rows for a query 'by minute'
For an explanation on ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING refer to this closely related answer on dba.SE.
Simpler with non-standard DISTINCT ON
PostgreSQL implements DISTINCT ON as an extension to the SQL standard. With it you can very effectively select rows that are unique according to a defined set of columns.
It won't get simpler or faster than this:
SELECT salesman_id
,array_agg(order_id) AS most_expensive_orders_per_client
FROM (
SELECT DISTINCT ON (1, client_id)
salesman_id, order_id
FROM orders
ORDER BY salesman_id, client_id, price DESC
) x
GROUP BY 1
ORDER BY 1;
SQL Fiddle.
I also use positional references (ordinal numbers) for shorter syntax. Details:
Select first row in each GROUP BY group?
I think you want the Postgres function array_agg in combination with row_number(). However, your description of the query does not make sense to me.
The following gets clients and salesmen and the list of orders for the highest priced order by salesman:
select client_id, salesman_id, array_agg(order_id)
from (select o.*,
row_number() over (partition by salesman_id order by price desc) as sseqnum,
row_number() over (partition by client_id order by price desc) as cseqnum
from orders o
) o
where sseqnum = 1
group by salesman_id, client_id
I don't know what you mean by "highest priced sales for each salesman and client". Perhaps you want:
where sseqnum = 1 or cseqnum = 1
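For reference, here is a small sketch that reproduces the result requested in the question from its sample data. It reads "highest priced sale per salesman and client pair" as one row_number() partitioned by (salesman_id, client_id), which is an interpretation rather than something the answers above agree on. It uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL; since SQLite has no array_agg, the grouping into lists is done in Python.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (client_id INTEGER, order_id INTEGER,
                     salesman_id INTEGER, price INTEGER);
-- Sample rows from the question.
INSERT INTO orders VALUES
    (1, 167, 1, 65), (1, 367, 1, 27), (2, 401, 1, 29), (2, 490, 2, 48),
    (3, 199, 1, 68), (3, 336, 2, 22), (3, 443, 1, 84), (3, 460, 2, 92);
""")

# Number the orders within each (salesman, client) pair, most expensive first,
# and keep only the top order of every pair.
top_rows = conn.execute("""
    SELECT salesman_id, order_id
    FROM (SELECT o.*,
                 row_number() OVER (PARTITION BY salesman_id, client_id
                                    ORDER BY price DESC) AS rn
          FROM orders o)
    WHERE rn = 1
    ORDER BY salesman_id, client_id
""").fetchall()

# Stand-in for array_agg(): collect the order ids per salesman.
best_orders = defaultdict(list)
for salesman_id, order_id in top_rows:
    best_orders[salesman_id].append(order_id)

print(dict(best_orders))
```

As with the DISTINCT variants, ties on price within a pair are broken arbitrarily unless a further sort column is added to the window's ORDER BY.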

SQL Server - Most recent date and sale amount columns

I'm using SQL Server 2008 R2 to complete a query. I have a sale table that contains a unique sale id, a customer id, a sale date, and a sale amount. I'm trying to create a table that has the most recent sale for each customer and the amount for that sale.
| customer_id | most recent sale date | sale amount |
| 1 |2012-06-11 00:00:00.000| 150 |
| 2 |2012-01-07 00:00:00.000| 55 |
| 3 |2012-02-18 00:00:00.000| 117 |
| 4 |2012-09-02 00:00:00.000| 25 |
I have the first two columns with this query:
SELECT DISTINCT customer_id, MAX(sale_date)
FROM sale
GROUP BY customer_id
When I try to add the amount of the sale, everything I try includes every sale for that customer, not just the most recent one. Is there a way to do this? Keep in mind there is a unique sale id on this table that might be of some use. Thank you for your time.
You can use ROW_NUMBER with PARTITION BY in a CTE:
WITH CTE AS
(
SELECT sale_id,customer_id,sale_date, sale_amount
, RN = ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY sale_date DESC)
FROM sale
)
SELECT sale_id, customer_id, sale_date, sale_amount
FROM CTE
WHERE RN = 1
Here's a sample fiddle: http://sqlfiddle.com/#!3/513280/1/0
SELECT a.customer_id, a.sale_date, a.sale_amount
FROM sale a
INNER JOIN
(
SELECT customer_id, MAX(sale_date) maxSale
FROM sale
GROUP BY customer_id
) b ON a.customer_ID = b.customer_ID AND
a.sale_date = b.maxSale
Using the aggregate function MAX in a correlated subquery worked for me.
SELECT Customerid, Saleid, Sldate AS 'Most Recent Sale', Saleamount
FROM Sale S1
WHERE Sldate = (SELECT MAX(Sldate) FROM Sale S2 WHERE S2.Customerid = S1.Customerid);
Using aliases is mandatory here, so that the subquery can refer to the row of the outer query.
Hope this helps!
SELECT S1.CUSTOMER_ID, S1.SALE_AMOUNT, S1.SALEDATE FROM SALE S1 WHERE S1.SALEDATE = (SELECT MAX(S2.SALEDATE) FROM SALE S2 WHERE S2.CUSTOMER_ID = S1.CUSTOMER_ID);
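A quick way to check any MAX-in-a-subquery answer is to run it on data where customers have different latest dates. The sketch below does that with SQLite through Python's sqlite3 module as a stand-in for SQL Server. The column names follow the question's description and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, customer_id INTEGER,
                   sale_date TEXT, sale_amount INTEGER);
-- Hypothetical data: every customer has a different most recent sale date.
INSERT INTO sale (customer_id, sale_date, sale_amount) VALUES
    (1, '2012-03-01', 80), (1, '2012-06-11', 150),
    (2, '2012-01-07', 55),
    (3, '2011-12-30', 40), (3, '2012-02-18', 117);
""")

# Correlated subquery: the maximum is taken per customer of the outer row.
per_customer = conn.execute("""
    SELECT s1.customer_id, s1.sale_date, s1.sale_amount
    FROM sale s1
    WHERE s1.sale_date = (SELECT MAX(s2.sale_date) FROM sale s2
                          WHERE s2.customer_id = s1.customer_id)
    ORDER BY s1.customer_id
""").fetchall()

# Uncorrelated subquery: a single maximum over the whole table.
global_only = conn.execute("""
    SELECT customer_id, sale_date, sale_amount
    FROM sale
    WHERE sale_date = (SELECT MAX(sale_date) FROM sale)
""").fetchall()

print(per_customer)
print(global_only)
```

Without the correlation on customer_id, only the customers whose latest sale falls on the overall latest date are returned. Both forms return several rows for a customer with more than one sale on the same latest date, which the ROW_NUMBER version avoids.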