SQL Server Amount Split - sql

I have the following two tables in a SQL Server database.
Customer Main Expense Table
ReportID CustomerID TotalExpenseAmount
1000 1 200
1001 2 600
Attendee Table
ReportID AttendeeName
1000 Mark
1000 Sam
1000 Joe
There is no amount at the attendee level. I need to calculate each attendee's amount as shown below, i.e. split TotalExpenseAmount by the number of attendees so that each share is rounded to 2 decimals and the shares sum exactly to TotalExpenseAmount.
The final report should look like:
ReportID CustID AttendeeName TotalAmount AttendeeAmount
1000 1 Mark 200 66.66
1000 1 Sam 200 66.66
1000 1 Joe 200 66.68
The final report will have about 150,000 records. Notice that I adjusted the last attendee's amount so that the shares total exactly 200. What is the best way to write an efficient SQL query for this scenario?

You can do this using window functions:
select ReportID, CustID, AttendeeName, TotalAmount,
(case when seqnum = 1
then TotalAmount - perAttendee * (cnt - 1) -- one row absorbs the rounding difference
else perAttendee
end) as AttendeeAmount
from (select a.ReportID, e.CustomerID as CustID, a.AttendeeName,
e.TotalExpenseAmount as TotalAmount,
row_number() over (partition by a.ReportID order by a.AttendeeName) as seqnum,
count(*) over (partition by a.ReportID) as cnt,
cast(e.TotalExpenseAmount * 1.0 / count(*) over (partition by a.ReportID) as decimal(10, 2)) as perAttendee
from attendee a join
expense e
on a.ReportID = e.ReportID
) ae;
The perAttendee amount is calculated in the subquery. The cast() to decimal(10, 2) rounds the share to two decimal places (SQL Server rounds rather than truncates when reducing the scale of a decimal), so the rounded shares alone may not add up to the total. For one of the rows, the amount is therefore the total minus the sum of all the other attendees, which makes the rows sum exactly to the total.
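The arithmetic behind this can be checked outside the database. Below is a minimal Python sketch, not the query itself: the function name split_amount is made up for illustration, and it assumes half-up rounding to two decimals and that the first attendee by name takes the remainder, as the row_number() ordering does.

```python
from decimal import Decimal, ROUND_HALF_UP

def split_amount(total, attendees):
    """Split total across attendees so the two-decimal shares sum to total exactly."""
    total = Decimal(str(total))
    cnt = len(attendees)
    # Per-attendee share rounded to two decimals (assumption: half-up rounding).
    per_attendee = (total / cnt).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    ordered = sorted(attendees)
    # The first attendee by name receives the total minus everyone else's share.
    first_share = total - per_attendee * (cnt - 1)
    return [(name, first_share if i == 0 else per_attendee)
            for i, name in enumerate(ordered)]

shares = split_amount(200, ["Mark", "Sam", "Joe"])
for name, amount in shares:
    print(name, amount)
```

With rounding instead of truncation the individual figures differ slightly from the ones in the question (one share is below the others rather than above), but the shares still add up to 200 exactly.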

Doing something similar to @Gordon's answer but using a CTE instead.
with CTECount AS (
select a.ReportId, a.AttendeeName,
ROW_NUMBER() OVER (PARTITION BY A.ReportId ORDER BY A.AttendeeName) [RowNum],
COUNT(A.AttendeeName) OVER (PARTITION BY A.ReportId) [AttendeeCount],
CAST(c.TotalExpenseAmount / (COUNT(A.AttendeeName) OVER (PARTITION BY A.ReportId)) AS DECIMAL(10,2)) [PerAmount]
FROM #Customer C INNER JOIN #Attendee A ON A.ReportId = C.ReportID
)
SELECT CT.ReportID, CT.CustomerId, AT.AttendeeName,
CASE WHEN CC.RowNum = 1 THEN CT.TotalExpenseAmount - CC.PerAmount * (CC.AttendeeCount - 1)
ELSE CC.PerAmount END [AttendeeAmount]
FROM #Customer CT INNER JOIN #Attendee AT
ON CT.ReportID = AT.ReportId
INNER JOIN CTECount CC
ON CC.ReportId = CT.ReportID AND CC.AttendeeName = AT.AttendeeName
I like the CTE because it allows me to separate the different aspects of the query. The cool thing that @Gordon used was the CASE expression and the inner calculation that make the lines total correctly.

Related

SQL Select value from other table based on column value as threshold

I have a SQLite query which returns a user name and how much a user spent (done by SELECT SUM() from the different table).
Name Spent
Adam 700
Mike 400
Steve 100
I have another table which contains discount amounts with their corresponding thresholds:
Treshold Discount
200 5
400 10
600 15
I need to find what discount each user has (if any). So the results would look like this:
Name Spent Discount Total
Adam 700 15 595
Mike 400 10 360
Steve 100 0 100
You need a LEFT join of your query to the 2nd table and aggregation:
SELECT t1.name, t1.Spent,
COALESCE(MAX(t2.Discount), 0) Discount,
t1.Spent * (1 - 0.01 * COALESCE(MAX(t2.Discount), 0)) Total
FROM (SELECT name, SUM(Spent) Spent FROM table1 GROUP BY name) t1
LEFT JOIN table2 t2 ON t2.Treshold <= t1.Spent
GROUP BY t1.name;
See the demo.
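To try the query against the sample data without a separate database, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The individual rows in table1 are an assumption (the question only shows the summed amounts), and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, Spent INTEGER);
    CREATE TABLE table2 (Treshold INTEGER, Discount INTEGER);
    -- Assumed raw rows; only the per-user sums appear in the question.
    INSERT INTO table1 VALUES ('Adam', 300), ('Adam', 400), ('Mike', 400), ('Steve', 100);
    INSERT INTO table2 VALUES (200, 5), (400, 10), (600, 15);
""")

rows = conn.execute("""
    SELECT t1.name, t1.Spent,
           COALESCE(MAX(t2.Discount), 0) AS Discount,
           t1.Spent * (1 - 0.01 * COALESCE(MAX(t2.Discount), 0)) AS Total
    FROM (SELECT name, SUM(Spent) AS Spent FROM table1 GROUP BY name) t1
    LEFT JOIN table2 t2 ON t2.Treshold <= t1.Spent
    GROUP BY t1.name
    ORDER BY t1.name
""").fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN keeps Steve, who is below every threshold, and COALESCE turns his missing discount into 0.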
I am in a hurry. Sorry.
with a as (
select name, sum(spent) spe
from test1
group by name)
select a.name
, a.spe
, max(tres)
, max(disc)
, a.spe - a.spe * max(disc) / 100.0 total
from test2, a
where tres <= a.spe -- users below the lowest threshold are not returned
group by a.name, a.spe

Alternative: Sql - SELECT rows until the sum of a row is a certain value

My question is very similar to my previous one posted here:
Sql - SELECT rows until the sum of a row is a certain value
To sum it up, I need to return rows until a certain sum is reached. The difference this time is that I need to find the best fit for this sum, i.e. the rows do not have to be sequential. For example:
Let's say I have 5 unpaid receipts from customer 1:
Receipt_id: 1 | Amount: 110€
Receipt_id: 2 | Amount: 110€
Receipt_id: 3 | Amount: 130€
Receipt_id: 4 | Amount: 110€
Receipt_id: 5 | Amount: 190€
So, customer 1 ought to pay me 220€.
Now I need to select the receipts, until this 220€ sum is met and it might be in a straight order, like (receipt 1 + receipt 2) or not in a specific order, like (receipt 1 + receipt 4), any of these situations would be suitable.
I am using SQL Server 2016.
Any additional questions, feel free to ask.
Thanks in advance for all your help.
This query should solve it.
Be careful with it: it uses a recursive CTE that enumerates combinations of receipts, so the amount of work grows exponentially with the number of receipts per user.
You can find some documentation here: https://www.essentialsql.com/recursive-ctes-explained/
WITH the_data as (
SELECT *
FROM (
VALUES (1, 1, 110),(1, 2,110),(1, 3,130),(1, 4,110),(1, 5,190),
(2, 1, 10),(2, 2,20),(2, 3,200),(2, 4,190)
) t (user_id, receipt_id, amount)
), permutation /* recursive used here */ as (
SELECT
user_id,
amount as sum_amount,
CAST(receipt_id as varchar(max)) as visited_receipt_id,
receipt_id as max_receipt_id,
1 as i
FROM the_data
WHERE amount > 0 -- remove empty amount
UNION ALL
SELECT
the_data.user_id,
sum_amount + amount as sum_amount,
CAST(concat(visited_receipt_id, ',', CAST(receipt_id as varchar))as varchar(max)) as visited_receipt_id,
receipt_id as max_receipt_id ,
i + 1
FROM the_data
JOIN permutation
ON the_data.user_id = permutation.user_id
WHERE i < 1000 -- at most 1000 levels, i.e. combinations of fewer than 1000 receipts (SQL Server's default MAXRECURSION limit of 100 applies unless raised with OPTION (MAXRECURSION n))
and receipt_id > max_receipt_id -- addition is commutative, so each combination only needs to be checked once; taking receipts in increasing receipt_id order avoids duplicates
-- AND sum_amount + amount <= 220 -- optional: prune partial sums that already exceed the target (valid when all amounts are positive)
)
SELECT *
FROM permutation
WHERE sum_amount = 220
To select only one combination per user_id, replace the last three lines of the previous query with:
SELECT *
FROM (
SELECT *, row_number() OVER (partition by user_id order by newid()) as r -- newid() gives a random order in SQL Server, which has no random() function
FROM permutation
WHERE sum_amount = 220
) as t
WHERE r = 1
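The idea behind the recursive CTE (extend each partial combination only with receipts that have a higher receipt_id, and keep the combinations whose sum hits the target) can be sketched in Python for a single customer. The function name combos_summing_to is made up for illustration, and the sketch assumes all amounts are positive so that partial sums above the target can be pruned.

```python
def combos_summing_to(receipts, target):
    """Return every set of receipt ids whose amounts sum to target.

    receipts is a list of (receipt_id, amount) pairs for one customer.
    """
    receipts = sorted(receipts)
    results = []

    def extend(start, chosen, total):
        if chosen and total == target:
            results.append(tuple(chosen))
        # Only look at receipts after the last one chosen, so each
        # combination is generated once, in increasing receipt_id order.
        for idx in range(start, len(receipts)):
            receipt_id, amount = receipts[idx]
            # Prune: with positive amounts, an overshoot can never recover.
            if amount > 0 and total + amount <= target:
                extend(idx + 1, chosen + [receipt_id], total + amount)

    extend(0, [], 0)
    return results

customer_1 = [(1, 110), (2, 110), (3, 130), (4, 110), (5, 190)]
matches = combos_summing_to(customer_1, 220)
print(matches)
```

Any one of the returned combinations is a valid answer to the question; picking one of them corresponds to the row_number() filter above.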
If your target is to sum only 2 receipts in order to reach your value, this could be a solution (it pairs the first receipt with one other receipt):
DECLARE @TARGET INT = 220 -- set your target
, @FIRSTID INT
, @FIRSTVAL INT
SELECT TOP 1 @FIRSTID = RECEIPT_ID, @FIRSTVAL = AMOUNT
FROM myRECEIPTS
ORDER BY RECEIPT_ID ASC
SELECT TOP 1 *
FROM myRECEIPTS
WHERE AMOUNT = @TARGET - @FIRSTVAL
AND RECEIPT_ID <> @FIRSTID -- do not pair the first receipt with itself
ORDER BY RECEIPT_ID ASC
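This code will do it for the sequential case:
declare @sum1 int
declare @numrows int
set @numrows = 0
set @sum1 = 0
while (@sum1 < 10) -- 10 is the target sum
begin
set @numrows += 1
-- TOP must be applied before the aggregate, so sum over a subquery
select @sum1 = sum(sum1) from (select top (@numrows) sum1 from receipts order by receipt_id) t
end
select top (@numrows) * from receipts order by receipt_id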

Join tables based on dates with check

I have two tables in PostgreSQL:
Demans_for_parts:
demandid partid demanddate quantity
40 125 01.01.17 10
41 125 05.01.17 30
42 123 20.06.17 10
Orders_for_parts:
orderid partid orderdate quantity
1 125 07.01.17 15
54 125 10.06.17 25
14 122 05.01.17 30
Basically, Demans_for_parts says what to buy and Orders_for_parts says what we bought. We can buy parts that are not listed in Demans_for_parts.
I need a report that shows all parts in Demans_for_parts and how many weeks have passed since the most recent matching row in Orders_for_parts. Note that the quantity field is irrelevant here.
The expected result is (if there is more than one row per part, show the oldest):
partid demanddate weeks_since_recent_order
125 01.01.17 2 (last order is on 10.06.17)
123 20.06.17 Unhandled
I think the tricky part is getting one row per part from each table, but that is easy using distinct on. Then you need to calculate the months. You can use age() for this purpose:
select dp.partid, dp.demanddate,
(extract(year from age(dp.demanddate, op.orderdate))*12 +
extract(month from age(dp.demanddate, op.orderdate))
) as months
from (select distinct on (dp.partid) dp.*
from demans_for_parts dp
order by dp.partid, dp.demanddate -- oldest demand per part
) dp left join
(select distinct on (op.partid) op.*
from Orders_for_parts op
order by op.partid, op.orderdate desc -- most recent order per part
) op
on dp.partid = op.partid;
Something like this?
with o as (
select distinct partid, max(orderdate) over (partition by partid)
from Orders_for_parts
)
, p as (
select distinct partid, min(demanddate) over (partition by partid)
from Demans_for_parts
)
select p.partid, min as demanddate, date_part('day',o.max - p.min)/7
from p
left outer join o on (p.partid = o.partid)
;
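For reference, here is a minimal Python sketch of what the second query computes on the sample data: the oldest demand per part, the most recent order per part, and the difference in weeks, with None where a part has no order. It assumes the dates in the question are in dd.mm.yy format, and the function name is made up for illustration.

```python
from datetime import date

# (id, partid, date) rows from the question, read as dd.mm.yy (an assumption).
demands = [(40, 125, date(2017, 1, 1)), (41, 125, date(2017, 1, 5)), (42, 123, date(2017, 6, 20))]
orders = [(1, 125, date(2017, 1, 7)), (54, 125, date(2017, 6, 10)), (14, 122, date(2017, 1, 5))]

def weeks_from_oldest_demand_to_latest_order(demands, orders):
    oldest_demand = {}
    for _, partid, day in demands:
        oldest_demand[partid] = min(day, oldest_demand.get(partid, day))
    latest_order = {}
    for _, partid, day in orders:
        latest_order[partid] = max(day, latest_order.get(partid, day))
    report = {}
    # Start from the demands (like the LEFT JOIN), so parts without orders stay in.
    for partid, demanddate in oldest_demand.items():
        orderdate = latest_order.get(partid)
        weeks = None if orderdate is None else (orderdate - demanddate).days / 7
        report[partid] = (demanddate, weeks)
    return report

report = weeks_from_oldest_demand_to_latest_order(demands, orders)
for partid, (demanddate, weeks) in report.items():
    print(partid, demanddate, weeks)
```

Part 122 appears only in the orders, so it is left out, and part 123 has no order, which corresponds to the "Unhandled" row in the expected result.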

Calculate Count as Percentage

I have looked around but I just can't seem to understand the logic. I think a good response is here, but like I said, it doesn't make sense, so a more specific explanation would be greatly appreciated.
So I want to show how often customers of each ethnicity are using a credit card. There are different types of credit cards, but if the CardID = 1, they used cash (hence the not equal to 1 condition).
I want to Group By ethnicity and show the count of transactions, but as a percentage.
SELECT Ethnicity, COUNT(distinctCard.TransactionID) AS CardUseCount
FROM (SELECT DISTINCT TransactionID, CustomerID FROM TransactionT WHERE CardID <> 1)
AS distinctCard INNER JOIN CustomerT ON distinctCard.CustomerID = CustomerT.CustomerID
GROUP BY Ethnicity
ORDER BY COUNT(distinctCard.TransactionID) ASC
So for example, this is what it comes up with:
Ethnicity | CardUseCount
0 | 100
1 | 200
2 | 300
3 | 400
But I would like this:
Ethnicity | CardUsePer
0 | 0.1
1 | 0.2
2 | 0.3
3 | 0.4
If you need the percentage of card transactions per ethnicity, you have to divide the card transactions per ethnicity by the total transactions of the same ethnicity. You don't need a subquery for that:
SELECT Ethnicity, sum(IIF(CardID=1,0,1))/count(1) AS CardUsePercentage
FROM TransactionT
INNER JOIN CustomerT
ON TransactionT.CustomerID = CustomerT.CustomerID
GROUP BY Ethnicity
From your posted sample result, it looks to me like you just want to divide the count by 1000, like this:
SELECT Ethnicity,
COUNT(distinctCard.TransactionID) / 1000 AS CardUseCount
FROM <rest part of query>
SELECT Ethnicity, COUNT(distinctCard.TransactionID) / (SELECT COUNT(1) FROM TransactionT WHERE CardID <> 1) AS CardUsePer
FROM (SELECT DISTINCT TransactionID, CustomerID FROM TransactionT WHERE CardID <> 1)
AS distinctCard INNER JOIN CustomerT ON distinctCard.CustomerID = CustomerT.CustomerID
GROUP BY Ethnicity
ORDER BY COUNT(distinctCard.TransactionID) ASC
I think the answer you linked already solves this. As the comments said, you are only counting the transactions; you also need to divide that count by the total number of transactions. This would be done as follows:
SELECT Ethnicity, COUNT(distinctCard.TransactionID)/(SELECT COUNT(TransactionT.TransactionID)
FROM TransactionT WHERE CardID <> 1)
AS CardUsePercent
FROM (SELECT DISTINCT TransactionID, CustomerID FROM TransactionT WHERE CardID <> 1)
AS distinctCard INNER JOIN CustomerT ON distinctCard.CustomerID = CustomerT.CustomerID
GROUP BY Ethnicity
ORDER BY COUNT(distinctCard.TransactionID) ASC
This will give the result you want, provided / does not perform integer division in your database (SQL Server does; multiply the numerator by 1.0 there).
EDIT: This may be wrong, as I don't know the exact format of your tables; I was assuming that the TransactionID field is unique in the table. Otherwise use the DISTINCT keyword or the PK of your table, depending on your actual implementation.
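The calculation itself is just each group's count divided by the overall count. A minimal Python sketch using the sample counts from the question (the function name share_of_total is made up for illustration):

```python
def share_of_total(counts):
    """Convert per-group counts into each group's fraction of the overall total."""
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}

# Card-use counts per ethnicity, taken from the sample output in the question.
shares = share_of_total({0: 100, 1: 200, 2: 300, 3: 400})
print(shares)
```

Note that this is the share of all card transactions, which is what the sample output shows; the share of card use within each ethnicity needs a per-ethnicity denominator instead.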

SQL query to select percentage of total

I have an MSSQL table named stores with the following columns:
Storeid, NumEmployees
1 125
2 154
3 10
4 698
5 54
6 98
7 87
8 100
9 58
10 897
Can someone help me with the SQL query to produce the top stores (storeID) that have 30% of the total employees (NumEmployees)?
WITH cte
AS (SELECT storeid,
numemployees,
( numemployees * 100 ) / SUM(numemployees) OVER (PARTITION BY 1)
AS
percentofstores
FROM stores)
SELECT *
FROM cte
WHERE percentofstores >= 30
ORDER BY numemployees desc
Alternative that doesn't use SUM/OVER
SELECT s.storeid, s.numemployees
FROM (SELECT SUM(numemployees) AS [tots]
FROM stores) AS t,
stores s
WHERE CAST(numemployees AS DECIMAL(15, 5)) / tots >= .3
ORDER BY s.numemployees desc
Note that in the second version I decided not to multiply by 100 before dividing. This requires a cast to decimal; otherwise the division would be done in integer arithmetic and no records would be returned.
Also, I'm not completely clear that you want this, but you can add TOP 1 to both queries to limit the result to the single store with the most employees among those above 30%.
UPDATE
Based on your comments, to paraphrase Kevin:
You want the rows, starting at the store with the most employees and working down until you have at least 30%.
This is difficult because it requires a running percentage, and it's a bin packing problem; however, this does work. Note that I've included two other test cases (one where the percent exactly equals the top two stores combined, and one where it is just over that).
DECLARE @percent DECIMAL (20, 16)
SET @percent = 0.3
--Other test values
--SET @percent = 0.6992547128452433
--SET @percent = 0.6992547128452434
;WITH sums
AS (SELECT DISTINCT s.storeid,
s.numemployees,
s.numemployees + Coalesce(SUM(s2.numemployees) OVER (
PARTITION
BY
s.numemployees), 0)
runningsum
FROM stores s
LEFT JOIN stores s2
ON s.numemployees < s2.numemployees),
percents
AS (SELECT storeid,
numemployees,
runningsum,
CAST(runningsum AS DECIMAL(15, 5)) / tots.total
running_percent,
Row_number() OVER (ORDER BY runningsum, storeid ) rn
FROM sums,
(SELECT SUM(numemployees) total
FROM stores) AS tots)
SELECT p.storeID,
p.numemployees,
p.runningsum,
p.running_percent,
p.rn
FROM percents p
CROSS JOIN (SELECT MAX(rn) rn
FROM percents
WHERE running_percent = @percent) exactpercent
LEFT JOIN (SELECT MAX(rn) rn
FROM percents
WHERE running_percent <= @percent) underpercent
ON p.rn <= underpercent.rn
OR ( exactpercent.rn IS NULL
AND p.rn <= underpercent.rn + 1 )
WHERE
underpercent.rn is not null or p.rn = 1
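The running-percentage rule from the update (take stores from the largest down until their combined employees reach the requested share) can be sketched in Python to check the three test values. The function name top_stores is made up for illustration, and exact fractions are used instead of decimals so that the "exactly equal" case is not affected by rounding.

```python
from fractions import Fraction

# storeid -> NumEmployees, from the table in the question.
stores = {1: 125, 2: 154, 3: 10, 4: 698, 5: 54, 6: 98, 7: 87, 8: 100, 9: 58, 10: 897}

def top_stores(stores, share):
    """Pick stores by descending headcount until they cover at least share of the total."""
    total = sum(stores.values())
    picked, running = [], 0
    # Largest stores first; ties broken by store id for a stable order.
    for store_id, employees in sorted(stores.items(), key=lambda kv: (-kv[1], kv[0])):
        if Fraction(running, total) >= share:
            break
        picked.append(store_id)
        running += employees
    return picked

print(top_stores(stores, Fraction(3, 10)))        # 30 %
print(top_stores(stores, Fraction(1595, 2281)))   # exactly the top two stores combined
print(top_stores(stores, Fraction(7, 10)))        # just over the top two stores
```

Here 1595 / 2281 is the combined share of stores 10 and 4, which is the value the two commented-out test percentages in the query sit on either side of.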