SQL: Sum / Group By Issue for Multiple Rows

I have looked elsewhere, but not managed to get an answer to this, so hoping someone with much more SQL experience can help me out on this!
I have the following portfolio table:
Ticker Company_ID Exposure
ABC 1 0.02
DEF 2 0.10
XYZ 3 0.01
GTS 3 0.01
And the following information table (where there are duplicates, with other information, and they cannot be deleted):
Company_ID Company_Name
1 Alpha
2 Defacto
2 Defacto
3 XeeWhy
3 XeeWhy
And I would like the result to be of the form
Company_ID Company_Name Sum(Exposure)
1 Alpha 0.02
2 Defacto 0.10
3 XeeWhy 0.02
I can run something to get a simple sum from the portfolio table, but this does not include the company name:
Select Distinct Company_ID, Sum(Exposure)
From Portfolio
Group By Company_ID
But whenever I join the tables to get the Company Name, I get the sum duplicated depending how many times they appear in the Information table.
Any help or pointers would be much appreciated!
Thanks!

Your simplest way would be to make the JOIN to your companies table DISTINCT, something like this:
Select p.Company_ID,
c.Company_name,
Sum(Exposure) as Exposure
From Portfolio p
INNER JOIN (
SELECT DISTINCT Company_id, Company_Name
FROM Companies) c
ON c.Company_id = p.Company_ID
Group By p.Company_ID,
c.Company_Name
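To see why the DISTINCT subquery matters, here is a minimal sketch that runs both the naive join and the de-duplicated join against the sample data from the question. It uses Python's built-in sqlite3 module purely as a convenient way to execute the SQL; the table name Companies follows this answer, and the ROUND call is only there to keep floating-point output tidy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Portfolio (Ticker TEXT, Company_ID INTEGER, Exposure REAL);
INSERT INTO Portfolio VALUES
  ('ABC', 1, 0.02), ('DEF', 2, 0.10), ('XYZ', 3, 0.01), ('GTS', 3, 0.01);
CREATE TABLE Companies (Company_ID INTEGER, Company_Name TEXT);
INSERT INTO Companies VALUES
  (1, 'Alpha'), (2, 'Defacto'), (2, 'Defacto'), (3, 'XeeWhy'), (3, 'XeeWhy');
""")

# Naive join: every duplicate company row repeats the portfolio rows,
# so the sums are inflated.
naive = conn.execute("""
SELECT p.Company_ID, c.Company_Name, ROUND(SUM(p.Exposure), 2)
FROM Portfolio p
JOIN Companies c ON c.Company_ID = p.Company_ID
GROUP BY p.Company_ID, c.Company_Name
ORDER BY p.Company_ID
""").fetchall()

# De-duplicated join: each company contributes exactly one row to the join.
deduped = conn.execute("""
SELECT p.Company_ID, c.Company_Name, ROUND(SUM(p.Exposure), 2)
FROM Portfolio p
JOIN (SELECT DISTINCT Company_ID, Company_Name FROM Companies) c
  ON c.Company_ID = p.Company_ID
GROUP BY p.Company_ID, c.Company_Name
ORDER BY p.Company_ID
""").fetchall()

print("naive:  ", naive)
print("deduped:", deduped)
```

With the naive join, company 3 is counted four times (two portfolio rows times two information rows), while the de-duplicated version gives the 0.02 the question asks for.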

Try joining a subquery that gets the distinct company information to a subquery that gets the grouped portfolio data.
SELECT x1.company_id,
x1.company_name,
x2.exposure
FROM (SELECT DISTINCT
company_id,
company_name
FROM information) x1
LEFT JOIN (SELECT company_id,
sum(exposure) exposure
FROM portfolio
GROUP BY company_id) x2
ON x2.company_id = x1.company_id;
I wasn't sure whether you want all companies in the result or only those that have portfolio data. If you want the latter, change the LEFT JOIN to an INNER JOIN.

Try this simple query:
SELECT (SELECT TOP 1 Company_Name FROM CompanyTable
WHERE Company_ID = P.Company_ID) Company_Name,
Sum(Exposure)
FROM Portfolio P
GROUP BY Company_ID

I used a CTE to get the Aggregation out of the way first:
Create table #portfolio (Ticker varchar(10), Company_ID int,Exposure decimal(10,2))
Insert into #portfolio values
('ABC', 1, 0.02),
('DEF', 2, 0.10),
('XYZ', 3, 0.01),
('GTS', 3, 0.01)
Create table #Information (Company_ID int,Company_Name varchar(10))
Insert into #Information values
(1,'Alpha'),
(2,'Defacto'),
(2,'Defacto'),
(3,'XeeWhy'),
(3,'XeeWhy')
;WITH CTE as(
SELECT Company_ID, SUM(Exposure) EXP from #portfolio GROUP BY Company_ID
)
SELECT t1.Company_ID,t2.Company_Name, t1.EXP
from CTE t1
INNER JOIN (SELECT DISTINCT Company_ID, Company_Name from #Information) t2 on
t1.Company_ID = t2.Company_ID

Related

Combine Customer and Purchase Date tables for latest purchase, but include nulls

I've got two tables, one where the customer ID is stored and another that stores each date they made a purchase. I am stuck on keeping all new customers that don't have a purchase date yet when querying for the max purchase date for all customers.
CustomerTable:
CustomerID Full_Name
1 John Doe
2 Jane Doe
PurchaseDates:
CustomerID Purchase_Date
1 11/21/2021
1 4/19/2003
I have set up a view in SQL that combines the two and queries for the MAX purchase date for each customer. The problem is that since I am using MAX, customers that have not purchased anything yet do not show up as they either do not have an entry in PurchaseDates table or their purchase_date field is blank.
My SQL View Code:
SELECT ct.CustomerID,
ct.Full_Name,
pd.Purchase_Date
FROM CustomerTable AS ct
LEFT OUTER JOIN PurchaseDates AS pd
ON ct.CustomerID = pd.CustomerID
WHERE EXISTS (SELECT 1
FROM PurchaseDates AS pd_latest
WHERE ( CustomerID= pd.CustomerID)
GROUP BY CustomerID
HAVING ( Max(Purchase_Date) = pd.Purchase_Date))
The result in my example above yields only customerID 1 with the purchase date of 11/21/2021, but I'd like to also display CustomerID 2 with a null date for their purchase_date. Not really sure how to proceed apart from seeing that some have opted to replace all nulls with arbitrary days.
The end result should be
CustomerID Full_Name Purchase_Date
1 John Doe 11/21/2021
2 Jane Doe
Appreciate the help
You only need a single value from the PurchaseDates table so a simple correlated subquery is all you require:
select ct.CustomerID, ct.Full_Name,
(
select Max(pd.Purchase_Date)
from PurchaseDates pd
where pd.CustomerId = ct.CustomerId
) as Purchase_Date
from CustomerTable ct;
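As a quick check of the correlated subquery above, here is a small sketch run through Python's sqlite3 module. One assumption that is not in the question: the dates are stored as ISO-8601 text (YYYY-MM-DD) so that MAX compares them chronologically; the ORDER BY is only added to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CustomerTable (CustomerID INTEGER, Full_Name TEXT);
INSERT INTO CustomerTable VALUES (1, 'John Doe'), (2, 'Jane Doe');
-- Assumption: dates kept as ISO-8601 text so MAX() orders them correctly.
CREATE TABLE PurchaseDates (CustomerID INTEGER, Purchase_Date TEXT);
INSERT INTO PurchaseDates VALUES (1, '2021-11-21'), (1, '2003-04-19');
""")

# MAX() over an empty set returns NULL, so customers without purchases
# still appear, with a NULL Purchase_Date.
latest = conn.execute("""
SELECT ct.CustomerID, ct.Full_Name,
       (SELECT MAX(pd.Purchase_Date)
        FROM PurchaseDates pd
        WHERE pd.CustomerID = ct.CustomerID) AS Purchase_Date
FROM CustomerTable ct
ORDER BY ct.CustomerID
""").fetchall()

print(latest)
```

Jane Doe comes back with None (NULL) as her purchase date, because the aggregate over zero matching rows is NULL rather than a missing row.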
Should more than a single column be required then you could apply the appropriate row:
select ct.CustomerID, ct.Full_Name, pd.*
from CustomerTable ct
outer apply (
select top(1) *
from PurchaseDates pd
where pd.CustomerId = ct.CustomerId
order by pd.Purchase_date desc
)pd;
Another version of the correlated subquery :
select *
from (
(select Full_name,
your_date,
(select max(your_date) from PurchaseDates c where c.id=A.id ) as max_date
from CustomerTable A
LEFT JOIN PurchaseDates B ON A.ID =B.ID)) x
where (x.max_date = your_date) or your_date is null or max_date is null

How to aggregate different CTEs in outer query SQL

I am trying to join two CTEs to get the difference in performance across countries, grouped by campaign ID. Here is my example.
Every campaign can run in several countries, so how can I group at the end to get one row per campaign ID?
CTE 1: (planned)
select
country
, campaign_id
, sum(sales) as planned_sales
from table x
group by 1,2
CTE 2: (Actual)
select
country
, campaign_id
, sum(sales) as actual_sales
from table y
group by 1,2
outer select
select
country,
planned_sales,
actual_sales
planned - actual as diff
from cte1
join cte2
on campaign_id = campaign_id
This should do it:
select
cte1.campaign_id,
sum(cte1.planned_sales),
sum(cte2.actual_sales),
sum(cte1.planned_sales) - sum(cte2.actual_sales) as diff
from cte1
join cte2
on cte1.campaign_id = cte2.campaign_id and cte1.country = cte2.country
group by 1
I would suggest using a full join, so that rows from both tables are included, not just rows that appear in both. Your query is basically correct, but it needs a group by.
select campaign_id,
sum(cte1.planned_sales) as planned_sales,
sum(cte2.actual_sales) as actual_sales,
(coalesce(sum(cte1.planned_sales), 0) -
coalesce(sum(cte2.actual_sales), 0)
) as diff
from cte1 full join
cte2
using (campaign_id, country)
group by campaign_id;
That said, there is no reason why the CTEs should aggregate by both campaign and country. They could just aggregate by campaign id -- simplifying the query and improving performance.
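A rough sketch of that simplification, run through Python's sqlite3 module: each CTE aggregates by campaign_id only, and the two sides are then combined. The table names planned and actual and the sample rows are made up for illustration. Because SQLite releases before 3.39 have no FULL JOIN, this sketch emulates it with UNION ALL plus an outer SUM, which also means a missing side shows up as 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for "table x" and "table y" in the question.
CREATE TABLE planned (country TEXT, campaign_id INTEGER, sales INTEGER);
INSERT INTO planned VALUES ('US', 1, 100), ('DE', 1, 50), ('US', 2, 70);
CREATE TABLE actual (country TEXT, campaign_id INTEGER, sales INTEGER);
INSERT INTO actual VALUES ('US', 1, 80), ('FR', 1, 30), ('US', 3, 40);
""")

per_campaign = conn.execute("""
WITH cte1 AS (
    SELECT campaign_id, SUM(sales) AS planned_sales
    FROM planned GROUP BY campaign_id
),
cte2 AS (
    SELECT campaign_id, SUM(sales) AS actual_sales
    FROM actual GROUP BY campaign_id
),
combined AS (
    -- UNION ALL stands in for a full join on campaign_id.
    SELECT campaign_id, planned_sales, 0 AS actual_sales FROM cte1
    UNION ALL
    SELECT campaign_id, 0, actual_sales FROM cte2
)
SELECT campaign_id,
       SUM(planned_sales) AS planned_sales,
       SUM(actual_sales)  AS actual_sales,
       SUM(planned_sales) - SUM(actual_sales) AS diff
FROM combined
GROUP BY campaign_id
ORDER BY campaign_id
""").fetchall()

print(per_campaign)
```

Campaign 2 exists only on the planned side and campaign 3 only on the actual side, yet both still get a row, which is the behaviour the full join is meant to provide.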

How to create an additional column with the percentages related to a count distinct statement

I'm trying to query each distinct medical speciality (e.g. oncologist, pediatrician, etc.) in a table and then count the number of times a claim (claim_id) is linked to it, which I've done using this:
select distinct specialization, count(distinct claim_id) AS Claim_Totals
from table1
group by specialization
order by Claim_Totals DESC
However, I also want to include an additional column which lists the % that each speciality makes up in the table (based on the number of claim_id related to it). So for instance, if there were 100 total claims and "cardiologist" had 25 claim_id records related to it, "oncologist" had 15, "general surgeon" had 10, and so forth, I want the output to look like this:
specialization | Claims_Totals | PERCENTAGE
___________________________________________
cardiologist 25 25%
oncologist 15 15%
general surgeon 10 10%
Could you do this? I'm not familiar with Barbaros's syntax. If that works, it's more concise and better.
select specialization, count(distinct claim_id) AS Claim_Totals, count(distinct claim_id)/total_claims
from table1
INNER JOIN ( SELECT COUNT(DISTINCT claim_id)*1.0000 AS total_claims
FROM table1 ) TMP
ON 1 = 1
group by specialization, total_claims
order by Claim_Totals DESC
select specialization,
count(distinct claim_id) AS claim_by_spec,
count(distinct claim_id)/
( SELECT COUNT(DISTINCT claim_id)*1.0000
FROM table1 ) AS percentage_calc
from table1
group by specialization
order by claim_by_spec DESC
You can use sum(count(distinct)) over() to get the overall claims and use it in the denominator to get the percentage.
select specialization
,count(distinct claim_id) AS Claim_Totals
,round(100.0*count(distinct claim_id)/sum(count(distinct claim_id)) over(),3) as percentage
from table1
group by specialization
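You can append either
,concat_ws('',count(distinct claim_id),'%') as percentage
or
,concat(count(distinct claim_id),'%') as percentage
to the end of the select list. Note that this just prints the count followed by '%', so it equals the true percentage only when the total number of claims is exactly 100, as in your example.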
Btw, distinct before specialization in the select list is redundant, since already included in the group by list.
Because you are using count(distinct), window functions are less useful. You can try:
select t1.specialization,
count(distinct t1.claim_id) AS Claim_Totals,
count(distinct t1.claim_id) / tt1.num_claims
from table1 t1 cross join
(select count(distinct claim_id) as num_claims
from table1
) tt1
group by t1.specialization
order by Claim_Totals DESC
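For reference, here is a small runnable sketch of the percentage calculation, executed through Python's sqlite3 module. The sample rows are invented, and the total is computed with a scalar subquery (one of the variants shown above) rather than a window function. Multiplying by 100.0 forces decimal division in engines that would otherwise divide integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (specialization TEXT, claim_id INTEGER);
-- Invented sample data; claim 1 appears twice to exercise DISTINCT.
INSERT INTO table1 VALUES
  ('cardiologist', 1), ('cardiologist', 1), ('cardiologist', 2),
  ('oncologist', 3), ('general surgeon', 4);
""")

shares = conn.execute("""
SELECT specialization,
       COUNT(DISTINCT claim_id) AS claim_totals,
       ROUND(100.0 * COUNT(DISTINCT claim_id)
             / (SELECT COUNT(DISTINCT claim_id) FROM table1), 1) AS percentage
FROM table1
GROUP BY specialization
ORDER BY claim_totals DESC, specialization
""").fetchall()

for row in shares:
    print(row)
```

With four distinct claims in total, the cardiologist's two claims come out as 50.0 and the other two specializations as 25.0 each.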

In T-SQL, how can I collate positive and negative actions in order that they happened?

I have a table like this:
;WITH CTE AS
( SELECT *
FROM (VALUES(1,'BlueCar',NULL),
(2,'RedCar',NULL),
(3,NULL,'BlueCar'),
(4,'GreenCar',NULL),
(5,NULL,'RedCar'),
(6,'BlueCar',NULL)
) AS ValuesTable(Time,Buy,Sell)
)
SELECT *
FROM CTE
Time Buy Sell
1 BlueCar NULL
2 RedCar NULL
3 NULL BlueCar
4 GreenCar NULL
5 NULL RedCar
6 BlueCar NULL
How can I query this table to get the total number of cars still in stock? The Time column is days since the shop opened. The time that the car was purchased must be preserved
Note: The input data is such that there will never be a situation where there are multiple cars in the inventory.
Expected Output
Time Buy
4 GreenCar
6 BlueCar
In the query below, I do two separate aggregations to obtain the buy and sell counts for each car. I left join buys to sells, which should not run the risk of losing data assuming that the dealer did not short sell any inventory which does not actually exist.
Then I join that result to a CTE which finds the latest time for each car. This would then correspond to the time when the most recent car came into inventory, for each car type.
I also include the inventory count, which you did not request, but it may be useful for you if you decide to expand the scope of your query later on.
WITH yourTable AS (
SELECT 1 AS Time, 'BlueCar' AS Buy, NULL AS Sell UNION ALL
SELECT 2,'RedCar',NULL UNION ALL
SELECT 3,NULL,'BlueCar' UNION ALL
SELECT 4,'GreenCar',NULL UNION ALL
SELECT 5,NULL,'RedCar' UNION ALL
SELECT 6,'BlueCar',NULL
),
cte AS (
SELECT Buy, Time
FROM
(
SELECT Buy, Time,
ROW_NUMBER() OVER (PARTITION BY Buy ORDER BY Time DESC) rn
FROM yourTable
) t
WHERE rn = 1
)
SELECT
t1.Buy,
t1.buy_cnt - COALESCE(t2.sell_cnt, 0) AS inventory,
t3.Time
FROM
(
SELECT Buy, COUNT(*) AS buy_cnt
FROM yourTable
GROUP BY Buy
) t1
LEFT JOIN
(
SELECT Sell, COUNT(*) AS sell_cnt
FROM yourTable
GROUP BY Sell
) t2
ON t1.Buy = t2.Sell
LEFT JOIN cte t3
ON t1.Buy = t3.Buy
WHERE
t1.Buy IS NOT NULL AND
t1.buy_cnt - COALESCE(t2.sell_cnt, 0) > 0
ORDER BY
t3.Time;
You can do this with a not exists:
;WITH CTE AS
( SELECT *
FROM (VALUES(1,'BlueCar',NULL),
(2,'RedCar',NULL),
(3,NULL,'BlueCar'),
(4,'GreenCar',NULL),
(5,NULL,'RedCar'),
(6,'BlueCar',NULL)
) AS ValuesTable(Time,Buy,Sell)
)
SELECT
[Time], Buy
FROM CTE as T1
WHERE
NOT EXISTS (SELECT 1 FROM CTE as T2 WHERE T2.TIME > T1.TIME AND T1.Buy = T2.Sell) AND
BUY IS NOT NULL
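The NOT EXISTS query above can be checked against the sample data with a short sketch using Python's sqlite3 module. The table name trades is made up for this example; the logic is unchanged: keep a purchase only if no later row sells the same car.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "trades" is a hypothetical name for the table in the question.
CREATE TABLE trades (Time INTEGER, Buy TEXT, Sell TEXT);
INSERT INTO trades VALUES
  (1, 'BlueCar', NULL), (2, 'RedCar', NULL), (3, NULL, 'BlueCar'),
  (4, 'GreenCar', NULL), (5, NULL, 'RedCar'), (6, 'BlueCar', NULL);
""")

in_stock = conn.execute("""
SELECT t1.Time, t1.Buy
FROM trades t1
WHERE t1.Buy IS NOT NULL
  AND NOT EXISTS (SELECT 1
                  FROM trades t2
                  WHERE t2.Time > t1.Time
                    AND t2.Sell = t1.Buy)
ORDER BY t1.Time
""").fetchall()

print(in_stock)
```

The BlueCar bought on day 1 is dropped because of the sale on day 3, while the BlueCar bought on day 6 has no later sale and stays in the result, matching the expected output in the question.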
Presumably, you want:
with cte as (
. . .
)
select count(buy) - count(sell)
from cte;
Note: This does not verify that what you sell is something that has already been bought. It just counts up the non-NULL values in each column and takes the difference.
To get the stock at a certain point in time you can do
SELECT car, SUM(Inc) total FROM
(SELECT ID, Buy car, 1 Inc FROM tbl WHERE Buy>''
UNION ALL
SELECT ID, Sell car, -1 Inc FROM tbl WHERE Sell>'') coll
WHERE ID < 20 -- some cut-off time
GROUP BY car
I combine the two columns Buy and Sell into one (= car) and add another column (inc) with the increment of each action (-1 or 1). The rest is simple: select with a group by [car] and summation over column inc.
Here is a little demo: http://rextester.com/LLQDW60692
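The same +1 / -1 idea can be tried locally with Python's sqlite3 module. In this sketch the table name trades and the helper stock_before are made up, the Time column plays the role of ID, and IS NOT NULL replaces the > '' test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "trades" is a hypothetical name for the table in the question.
CREATE TABLE trades (Time INTEGER, Buy TEXT, Sell TEXT);
INSERT INTO trades VALUES
  (1, 'BlueCar', NULL), (2, 'RedCar', NULL), (3, NULL, 'BlueCar'),
  (4, 'GreenCar', NULL), (5, NULL, 'RedCar'), (6, 'BlueCar', NULL);
""")

def stock_before(cutoff):
    """Return (car, net count) for all actions with Time < cutoff."""
    return conn.execute("""
    SELECT car, SUM(inc) AS total
    FROM (SELECT Time, Buy AS car, 1 AS inc FROM trades WHERE Buy IS NOT NULL
          UNION ALL
          SELECT Time, Sell, -1 FROM trades WHERE Sell IS NOT NULL) AS coll
    WHERE Time < ?
    GROUP BY car
    ORDER BY car
    """, (cutoff,)).fetchall()

print(stock_before(20))  # all six actions
print(stock_before(4))   # only the first three days
```

Cars whose buys and sells cancel out still appear with a total of 0; adding HAVING SUM(inc) > 0 would hide them if only the cars in stock are wanted.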
Good question. Your expected output has changed over time, which is fine.
The simple query below should solve your problem.
It uses joins and ROW_NUMBER().
;with CTE as
(
select a.time,a.buy,a.rid,COALESCE(b.rid,0)rid2 ,coalesce(b.sell,a.buy)sell from
( select time,buy,ROW_NUMBER()over( partition by buy order by (select 1)) rid
from #tableName where buy is not null)a left join
( select time,sell, ROW_NUMBER()over( partition by sell order by (select 1)) rid
from #TableName
where sell is not null )b on a.buy=b.sell
)
select Time,Buy from CTE
where rid!=rid2

Top and Bottom in the same query?

I need to make an SSRS report that shows the best and the worst customer based on how much they've spent. In the report I would like to show the gap between the top and bottom customer in one chart. My problem is that I can't get both values from the same dataset/query.
These are my results from two queries (see code below). I would like to get the same result from a single query, perhaps with UNION ALL. Or is there an easier way, e.g. in Visual Studio, to represent these values, such as Top N / Bottom N filters? If so, please show me how or point me to a best practice, because I haven't figured it out yet. Thanks.
Code:
SELECT DISTINCT TOP 1
dimcustomer.FirstName ,
SUM(FactInternetSales.OrderQuantity * UnitPrice)
FROM DimCustomer
INNER JOIN FactInternetSales ON FactInternetSales.CustomerKey = DimCustomer.CustomerKey
GROUP BY FirstName
ORDER BY SUM(FactInternetSales.OrderQuantity * UnitPrice) DESC
SELECT DISTINCT TOP 1
dimcustomer.FirstName ,
SUM(FactInternetSales.SalesAmount)
FROM DimCustomer
INNER JOIN FactInternetSales ON FactInternetSales.CustomerKey = DimCustomer.CustomerKey
GROUP BY FirstName
ORDER BY SUM(FactInternetSales.SalesAmount) DESC
Two result sets:
FirstName | SalesAmount
Morgan 145044,5816
------------------------
FirstName | SalesAmount
Dave 3.99
The union operator doesn't allow an order by clause on the individual queries, so you can restructure slightly by wrapping each one in a derived table:
with CustomersOrders as
(
select dimcustomer.FirstName, sum(FactInternetSales.OrderQuantity * UnitPrice) Total
from DimCustomer
inner join FactInternetSales on FactInternetSales.CustomerKey = DimCustomer.CustomerKey
group by FirstName
)
select *
from
(
select top 1 *
from CustomersOrders
order by Total desc
) a
union all
select *
from
(
select top 1 *
from CustomersOrders
order by Total
) b
You can UNION these queries and add another column, "CustomerType", with the values TopCustomer and BottomCustomer respectively, to distinguish the two rows.
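A minimal sketch of the labelled UNION, run through Python's sqlite3 module. The sales table and its rows are invented for illustration, and SQLite's LIMIT 1 stands in for T-SQL's TOP 1. Each ordered query is wrapped in a derived table so the ORDER BY applies to that branch only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, flattened stand-in for DimCustomer + FactInternetSales.
CREATE TABLE sales (first_name TEXT, amount REAL);
INSERT INTO sales VALUES
  ('Morgan', 100.0), ('Morgan', 50.0), ('Dave', 3.99), ('Ana', 20.0);
""")

top_and_bottom = conn.execute("""
WITH totals AS (
    SELECT first_name, SUM(amount) AS total
    FROM sales
    GROUP BY first_name
)
SELECT * FROM (SELECT 'TopCustomer' AS customer_type, first_name, total
               FROM totals ORDER BY total DESC LIMIT 1)
UNION ALL
SELECT * FROM (SELECT 'BottomCustomer', first_name, total
               FROM totals ORDER BY total ASC LIMIT 1)
""").fetchall()

# Index by the label so the result does not depend on row order.
by_type = {row[0]: row[1:] for row in top_and_bottom}
print(by_type)
```

The customer_type column gives the report a field to group or colour by, so both rows can come from one dataset.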