Grouped Weighted Average in Access Query


I am trying to calculate a weighted average fee % of sales for each Client/Product/City combination from this data. I don't need the Sub Product level of detail.
My data looks like this:
+--------+---------+-------+--------------+-------+----------------+
| Client | Product | City | Sub Product | Sales | Fee % of Sales |
+--------+---------+-------+--------------+-------+----------------+
| a | b | b | c | 1000 | 1% |
| a | b | b | d | 2000 | 2% |
| c | c | b | c | 3000 | 3% |
| d | c | b | c | 4000 | 4% |
+--------+---------+-------+--------------+-------+----------------+
I want to calculate the weighted average Fee % charged for each Client/Product/City combination, weighted by Sales. For example, for Client 'a', Product 'b', City 'b', the fee % of sales would be (1,000/3,000 * 1%) + (2,000/3,000 * 2%), which is about 1.67%.
After that I will have another query that takes only the Client, Product, City, Sales and the new weighted average field from this query. I need a separate query because I will be using the results as part of a larger query.

This would be easier with window functions, but MS Access does not support them. Instead, you can compute the sales subtotal per client/product/city in a subquery and then JOIN it to the original table (fee below stands for your Fee % of Sales column):
SELECT
t.client, t.product, t.city, SUM(t.sales * t.fee / t1.total_sales) AS res
FROM
mytable AS t
INNER JOIN (
SELECT client, product, city, SUM(sales) AS total_sales
FROM mytable
GROUP BY client, product, city
) AS t1
ON t1.client = t.client
AND t1.product = t.product
AND t1.city = t.city
GROUP BY t.client, t.product, t.city
This demo on DB Fiddle with your sample data returns:
| client | product | city | res |
| ------ | ------- | ---- | ------------------------------- |
| a | b | b | 0.016666666294137638 |
| c | c | b | 0.029999999329447746 |
| d | c | b | 0.03999999910593033 |

You can calculate the total sales & fee values as part of a subquery, then perform the division with the resulting values, e.g.:
select
q.client,
q.product,
q.city,
q.fee/q.totalsales as weightedfee
from
(
select
t.client,
t.product,
t.city,
sum(t.sales) as totalsales,
sum(t.sales*t.[fee % of sales]) as fee
from yourtable t
group by t.client, t.product, t.city
) q
Change yourtable to suit your table name.
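To sanity-check the arithmetic, the single-pass aggregate can be run against the sample rows. The sketch below does that in Python with an in-memory SQLite database rather than Access. The table name sales_fees and the column names are made up for the example, and the fee is assumed to be stored as a fraction (0.01 for 1%).

```python
import sqlite3

# Hypothetical table and column names; fee_pct is a fraction (0.01 means 1%).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fees ("
    "client TEXT, product TEXT, city TEXT, sub_product TEXT, "
    "sales REAL, fee_pct REAL)"
)
conn.executemany(
    "INSERT INTO sales_fees VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("a", "b", "b", "c", 1000, 0.01),
        ("a", "b", "b", "d", 2000, 0.02),
        ("c", "c", "b", "c", 3000, 0.03),
        ("d", "c", "b", "c", 4000, 0.04),
    ],
)

# Weighted average = sum(sales * fee) / sum(sales) within each group.
weighted = conn.execute(
    """
    SELECT client, product, city,
           SUM(sales) AS total_sales,
           SUM(sales * fee_pct) / SUM(sales) AS weighted_fee
    FROM sales_fees
    GROUP BY client, product, city
    ORDER BY client, product, city
    """
).fetchall()

for row in weighted:
    print(row)
```

For client a / product b / city b this works out to 50 / 3000, roughly 0.0167, which matches the hand calculation in the question.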

Related

How to write SQL sub-query in SQL?

Here is my sample data. I am looking for the total value of buying trades and the total value of selling trades per country.
There are two tables, companies and trades.
Table [companies]:
+-------------+--------------------+
| name | country |
+-------------+--------------------+
| Alice s.p. | Wonderland |
| Y-zap | Wonderland |
| Absolute | Mathlands |
| Arcus t.g. | Mathlands |
| Lil Mermaid | Underwater Kingdom |
| None at all | Nothingland |
+-------------+--------------------+
Table [trades]:
+----------+-------------+------------+-------+
| id | seller | buyer | value |
+----------+-------------+------------+-------+
| 20121107 | Lil Mermaid | Alice s.p. | 10 |
| 20123112 | Arcus t.g. | Y-zap | 30 |
| 20120125 | Alice s.p. | Arcus t.g. | 100 |
| 20120216 | Lil Mermaid | Absolute | 30 |
| 20120217 | Lil Mermaid | Absolute | 50 |
+----------+-------------+------------+-------+
Expected Output:
+--------------------+--------+--------+
| country | buyer | seller |
+--------------------+--------+--------+
| Mathlands | 180 | 30 |
| Nothingland | 0 | 0 |
| Underwater Kingdom | 0 | 90 |
| Wonderland | 40 | 100 |
+--------------------+--------+--------+
This is what I am trying. It gives only one value column, and it does not show the countries with 0 trades, which I also want to show.
select country, sum(value), sum(value)
from
(select a.buyer as export, a.seller as import, value, b.country as country
from trades as a
join companies as b
on a.seller=b.name)
group by country
order by country

Join companies to a derived table that lists every trade twice, once with only the buyer filled in and once with only the seller, then aggregate conditionally:
SELECT c.country,
SUM(CASE WHEN buyer IS NOT NULL THEN value ELSE 0 END) buyer,
SUM(CASE WHEN seller IS NOT NULL THEN value ELSE 0 END) seller
FROM companies c
LEFT JOIN (
SELECT buyer, null seller, value FROM trades
UNION ALL
SELECT null, seller, value FROM trades
) t ON c.name IN (t.buyer, t.seller)
GROUP BY c.country
Or, with the SUM() window function:
SELECT DISTINCT c.country,
SUM(CASE WHEN c.name = t.buyer THEN value ELSE 0 END) OVER (PARTITION BY c.country) buyer,
SUM(CASE WHEN c.name = t.seller THEN value ELSE 0 END) OVER (PARTITION BY c.country) seller
FROM companies c LEFT JOIN trades t
ON c.name IN (t.buyer, t.seller)
See the demo.

Try CTE:
WITH sold AS (
SELECT sum(t.value) AS value, c.country FROM trades AS t INNER JOIN companies AS c ON (t.seller = c.name) GROUP BY c.country
), buyed AS (
SELECT sum(t.value) AS value, c.country FROM trades AS t INNER JOIN companies AS c ON (t.buyer = c.name) GROUP BY c.country
)
SELECT DISTINCT c.country, COALESCE(b.value, 0) AS buyer, COALESCE(s.value, 0) AS seller
FROM companies AS c
LEFT JOIN sold AS s ON (c.country = s.country)
LEFT JOIN buyed AS b ON (c.country = b.country)
https://www.db-fiddle.com/f/kgLezmhyiL9BKB2JUsaWYc/0
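As a quick check of the conditional-aggregation idea, the sketch below loads the sample rows into an in-memory SQLite database from Python and runs a GROUP BY variant of it. The LEFT JOIN keeps companies with no trades, and the ELSE 0 branch is what makes Nothingland show 0 rather than NULL. The SQL is a sketch under those assumptions, not a copy of any single answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT, country TEXT)")
conn.execute(
    "CREATE TABLE trades (id INTEGER, seller TEXT, buyer TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO companies VALUES (?, ?)",
    [
        ("Alice s.p.", "Wonderland"),
        ("Y-zap", "Wonderland"),
        ("Absolute", "Mathlands"),
        ("Arcus t.g.", "Mathlands"),
        ("Lil Mermaid", "Underwater Kingdom"),
        ("None at all", "Nothingland"),
    ],
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [
        (20121107, "Lil Mermaid", "Alice s.p.", 10),
        (20123112, "Arcus t.g.", "Y-zap", 30),
        (20120125, "Alice s.p.", "Arcus t.g.", 100),
        (20120216, "Lil Mermaid", "Absolute", 30),
        (20120217, "Lil Mermaid", "Absolute", 50),
    ],
)

# Each trade is matched to a company as buyer or as seller;
# the CASE expressions route its value into the right column.
totals = conn.execute(
    """
    SELECT c.country,
           SUM(CASE WHEN c.name = t.buyer  THEN t.value ELSE 0 END) AS buyer,
           SUM(CASE WHEN c.name = t.seller THEN t.value ELSE 0 END) AS seller
    FROM companies AS c
    LEFT JOIN trades AS t ON c.name IN (t.buyer, t.seller)
    GROUP BY c.country
    ORDER BY c.country
    """
).fetchall()

for row in totals:
    print(row)
```

The rows printed match the expected output table in the question, including the all-zero row for Nothingland.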

SQL how to calculate median not based on rows

I have a sample of cars in my table and I would like to calculate the median price for my sample with SQL. What is the best way to do it?
+-----+-------+----------+
| Car | Price | Quantity |
+-----+-------+----------+
| A | 100 | 2 |
| B | 150 | 4 |
| C | 200 | 8 |
+-----+-------+----------+
I know that I can use percentile_cont (or percentile_disc) if my table is like this:
+-----+-------+
| Car | Price |
+-----+-------+
| A | 100 |
| A | 100 |
| B | 150 |
| B | 150 |
| B | 150 |
| B | 150 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
| C | 200 |
+-----+-------+
But in the real world, my first table has about 100 million rows, so the second table would have about 3 billion rows (and moreover I don't know how to transform my first table into the second).

Here is a way to do this in SQL Server.
The first step is to calculate the positions of the lower and upper bounds of the median. If the total quantity is odd, both bounds are the same position; if it is even, they are the (x/2)th and (x/2 + 1)th values.
Then I take the cumulative sum of the quantity and use it to pick the prices at the lower and upper bounds, as follows:
with median_dt
as (
select case when sum(quantity)%2=0 then
sum(quantity)/2
else
sum(quantity)/2 + 1
end as lower_limit
,case when sum(quantity)%2=0 then
(sum(quantity)/2) + 1
else
sum(quantity)/2 + 1
end as upper_limit
from t
)
,data
as (
select *,sum(quantity) over(order by price asc) as cum_sum
from t
)
,rnk_val
as(select *
from (
select price,row_number() over(order by d.cum_sum asc) as rnk
from data d
join median_dt b
on b.lower_limit<=d.cum_sum
)x
where x.rnk=1
union all
select *
from (
select price,row_number() over(order by d.cum_sum asc) as rnk
from data d
join median_dt b
on b.upper_limit<=d.cum_sum
)x
where x.rnk=1
)
select avg(price) as median
from rnk_val
+--------+
| median |
+--------+
| 200 |
+--------+
db fiddle link
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=c5cfa645a22aa9c135032eb28f1749f6

This gives the right answer on the sample, but try it on a larger set to double-check.
First create a helper table with the columns the median needs (or use a CTE or subquery, your choice). I'm just creating a separate table here.
create table table2 as
(
select car,
quantity,
price
from table1
)
Then run this query, which walks the rows in price order and returns the first price whose running quantity reaches half of the total quantity. The running sum has to be over quantity, not price * quantity, otherwise expensive cars are over-weighted. When the middle falls exactly between two prices, this returns the lower one.
select min(price) as median
from (
select car, price,
sum(quantity) over (order by price) as rollsum,
sum(quantity) over () as total
from table2
) a
where rollsum >= total / 2.0
Returns a value of 200 for the sample data.
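One way to see why the cumulative-quantity idea works without expanding the table is to walk the rows in price order and stop when the running quantity reaches the middle rank or ranks. The Python sketch below is only an illustration of that logic, not a replacement for the SQL; the function name weighted_median is made up for the example.

```python
def weighted_median(rows):
    """Median of a sample given as (price, quantity) pairs.

    Equivalent to expanding each price `quantity` times and taking the
    usual median, but without building the expanded list.
    """
    ordered = sorted(rows)                  # ascending by price
    total = sum(qty for _, qty in ordered)
    if total <= 0:
        raise ValueError("total quantity must be positive")

    # 1-based ranks of the middle element(s); identical when total is odd.
    lower_rank = (total + 1) // 2
    upper_rank = total // 2 + 1

    lower = upper = None
    running = 0
    for price, qty in ordered:
        running += qty                      # cumulative quantity so far
        if lower is None and running >= lower_rank:
            lower = price
        if running >= upper_rank:
            upper = price
            break
    return (lower + upper) / 2


cars = [(100, 2), (150, 4), (200, 8)]       # sample data from the question
print(weighted_median(cars))                # prints 200.0
```

With 14 cars in total, the 7th and 8th cheapest both cost 200, so the median is 200, which agrees with the SQL results above.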

Retrieve the minimal create date with multiple rows

I have an issue with an SQL query that I am trying to write. I am trying to retrieve the row that has the minimal create_dt for each inst (see table), together with its amount (which isn't unique).
Unfortunately I can't simply use GROUP BY, because the amount column isn't unique.
+--------------+--------+------+-------------+
| Company_Name | Amount | inst | Create Date |
+--------------+--------+------+-------------+
| Company A | 1000 | 4545 | 01/10/2018 |
| Company A | 400 | 4545 | 01/11/2018 |
| Company A | 200 | 4545 | 31/10/2018 |
| Company B | 2000 | 4893 | 01/10/2016 |
| Company B | 212 | 4893 | 04/10/2016 |
| Company B | 100 | 4893 | 10/10/2017 |
| Company B | 20 | 4893 | 04/10/2018 |
+--------------+--------+------+-------------+
In the above example I expect to see:
+--------------+--------+------+-------------+
| Company_Name | Amount | inst | Create Date |
+--------------+--------+------+-------------+
| Company A | 1000 | 4545 | 01/10/2018 |
| Company B | 2000 | 4893 | 01/10/2016 |
+--------------+--------+------+-------------+
Code:
SELECT
bill_company, bill_name, account_no
FROM
dbo.customer_information;
SELECT
balance_id, balance_id2, minus_balance,new_balance,
create_date, account_no
FROM
dbo.btr
SELECT
balance_id, balance_id2, expired_Date, amount, balance_type, account_no
FROM
dbo.btr_balance
SELECT
balance_ist, expired_date, account_no, balance_type
FROM
dbo.BALANCE_inst
Retrieve the minimal create date for a balance instance with the lowest balance for a balance inst.
(SELECT
bill_company,
bill_name,
account_no,
balance_ist,
amount,
MIN(create_date)
FROM
dbo.mtr btr
LEFT JOIN
btr_balance btrb ON btr.balance_id = btrb.balance_id
AND btr.balance_id2 = btrb.balance_id2
LEFT JOIN
balance_inst bali ON btr.account_no = bali.account_no
AND btrb.expired_date = bali.expired_date
GROUP BY
bill_company, bill_name, account_no,amount, balance_ist)
I have seen some solutions that use a correlated query, but I can't seem to get my head around them.

A common table expression (CTE) with ROW_NUMBER() will help you.
;with cte as (
select *, row_number() over(partition by company_name order by create_date) rn
from dbo.myTable
)
select * from cte
where rn = 1;

Use row_number(). I assumed bill_company is your company name.
select * from
( SELECT bill_company,
bill_name,
account_no,
balance_ist,
amount,
create_date,
row_number() over(partition by bill_company order by create_date) rn
FROM dbo.mtr btr left join btr_balance btrb
on btr.balance_id = btrb.balance_id and btr.balance_id2 = btrb.balance_id2
left join balance_inst bali
on btr.account_no = bali.account_no and btrb.expired_date = bali.expired_date
) t where t.rn=1
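Since the question mentions correlated subqueries, here is a small sketch of that alternative: for each row, the inner query looks up the earliest create date for the same inst, and the outer query keeps only the rows that match it. It is run from Python against an in-memory SQLite database. The flat table balances and its column names are assumptions made for the example (the real data is spread over several tables), and the dates are stored as ISO strings so that text order equals date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table standing in for the joined result.
conn.execute(
    "CREATE TABLE balances ("
    "company_name TEXT, amount INTEGER, inst INTEGER, create_date TEXT)"
)
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?, ?)",
    [
        ("Company A", 1000, 4545, "2018-10-01"),
        ("Company A", 400, 4545, "2018-11-01"),
        ("Company A", 200, 4545, "2018-10-31"),
        ("Company B", 2000, 4893, "2016-10-01"),
        ("Company B", 212, 4893, "2016-10-04"),
        ("Company B", 100, 4893, "2017-10-10"),
        ("Company B", 20, 4893, "2018-10-04"),
    ],
)

# The subquery is "correlated" because it refers to b.inst from the outer row.
earliest = conn.execute(
    """
    SELECT b.company_name, b.amount, b.inst, b.create_date
    FROM balances AS b
    WHERE b.create_date = (
        SELECT MIN(b2.create_date)
        FROM balances AS b2
        WHERE b2.inst = b.inst
    )
    ORDER BY b.inst
    """
).fetchall()

for row in earliest:
    print(row)
```

Unlike ROW_NUMBER(), this form returns every tied row if two rows of the same inst share the earliest date, so it is worth deciding which behaviour you want.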

Oracle SQL Selecting Most Recent Data

Good morning,
This is a follow-up to SELECT most recent in Oracle SQL Query
I am hoping to take my Oracle skills to the next level after learning a lot from this site.
I work for a small construction company and thus, we buy a lot of smaller parts/materials from our vendors. Sometimes, in the same calendar year, we may switch who we buy the SAME part from. I want to only grab the most recent VENDOR for each individual PART NUMBER. Here is an example of what I mean:
The code for my starting query:
WITH
PartNums AS -- Grabs me all of the stuff we "bought", and its vendor, in the construction division since Jan 1 2017
(
SELECT
PO_ITEM AS "PART_NUM",
VEND_NUM,
VEND_NM,
PODiv AS "DIVISION_CD"
FROM
INNER JOIN
(
SELECT MAX(PODate) OVER(PARTITION BY PO_Number, VEND_NUM))
FROM tblPurchases
WHERE PODate > '01-Jan-2017'
) tblTemp INNER JOIN tblPurchases ON tblPurchases.VEND_NUM = tblTemp.VEND_NUM
INNER JOIN tblVendors ON tblPurchases.VEND_NUM = tblVendors.VEND_NUM
WHERE
PODate > '01-Jan-2017'
AND
PODiv = 'C'
),
Defects AS -- Grabs me the listed defects against their stuff
(
SELECT
PartNums.*,
DEFECT_NUM,
DEFECT_CAT
FROM
PartNums
INNER JOIN tblDefects ON PartNums.PART_NUM = tblDefects.DEFECTIVE_PART_NUM
WHERE
DEFECT_DATE > '01-Jan-2017'
),
Names AS
(
SELECT
Defects.*,
PART_NM
FROM
Defects
INNER JOIN tblParts ON Defects.PART_NUM = tblParts.PART_NUM
)
SELECT
VEND_NUM,
VEND_NM,
PART_NUM,
PART_NM,
DEFECT_NUM,
DEFECT_CAT,
DIVISION_CD
FROM Names
This produces the following results:
| Vendor Number | Vendor Name | Part Number | Part Name | Defect Number | Defect Category | Division | Purchase Order Date |
|---------------|------------------------------|-------------|----------------|---------------|-----------------|----------|---------------------|
| 200123 | Push-Button LLC | 54211EW | Faceplate | PROB333211 | WRPT | C | 11-Jan-2017 |
| 200587 | Entirely Concrete | 69474TR | 2in Screw | PROB587412 | WRPT | C | 03-Mar-2017 |
| 200444 | Maaco | 77489GF | Hammer NR | PROB369854 | WRPT | C | 08-Aug-2017 |
| 200100 | Fleischman Contractors | 21110LW | Service | PROB215007 | OPYM | C | 01-Jun-2017 |
| 200664 | Advanced Tool Repair LLC | 47219UZ | Service | PROB9874579 | UPYM | C | 14-Jan-2018 |
| 200999 | AllTech Electronic Equipment | 36654DD | Plastic Casing | PROB326598 | NA | C | 16-Jan-2018 |
| 200321 | ZyotoCard Electronics | 74200ZN | Service | PROB012547 | MISCT | C | 19-Apr-2017 |
| 200331 | Black&Decker | 41122UT | .11mm Drillbit | PROB147741 | BRKN | C | 03-Aug-2017 |
| 200333 | Sears | 41122UT | .11mm Drillbit | PROB147741 | BRKN | C | 11-Mar-2017 |
As you can see, there are 2 vendors for Part Number 41122UT. For this part number, I only want Black & Decker (whose PO Date is 5 months newer than Sears).
I would like for the data to look like this:
| Vendor Number | Vendor Name | Part Number | Part Name | Defect Number | Defect Category | Division | Purchase Order Date |
|---------------|------------------------------|-------------|----------------|---------------|-----------------|----------|---------------------|
| 200123 | Push-Button LLC | 54211EW | Faceplate | PROB333211 | WRPT | C | 11-Jan-2017 |
| 200587 | Entirely Concrete | 69474TR | 2in Screw | PROB587412 | WRPT | C | 03-Mar-2017 |
| 200444 | Maaco | 77489GF | Hammer NR | PROB369854 | WRPT | C | 08-Aug-2017 |
| 200100 | Fleischman Contractors | 21110LW | Service | PROB215007 | OPYM | C | 01-Jun-2017 |
| 200664 | Advanced Tool Repair LLC | 47219UZ | Service | PROB9874579 | UPYM | C | 14-Jan-2018 |
| 200999 | AllTech Electronic Equipment | 36654DD | Plastic Casing | PROB326598 | NA | C | 16-Jan-2018 |
| 200321 | ZyotoCard Electronics | 74200ZN | Service | PROB012547 | MISCT | C | 19-Apr-2017 |
| 200331 | Black&Decker | 41122UT | .11mm Drillbit | PROB147741 | BRKN | C | 03-Aug-2017 |
I have found that MAX() OVER (PARTITION BY ...) can be used to return the most recent row, so I tried the query below. It now runs, but it gives me the most recent date for each vendor for each part, not just for each part. I need the MOST RECENT VENDOR INFORMATION (found on the Purchase Order, so ultimately I need the most recent Purchase Order) for every PART. Could anyone advise?
WITH
PartNums AS -- Grabs me all of the stuff we "bought", and its vendor, in the construction division since Jan 1 2017
(
SELECT
PO_ITEM AS "PART_NUM",
VEND_NUM,
VEND_NM,
PODiv AS "DIVISION_CD"
FROM
INNER JOIN
(
SELECT PO_NUMBER, VEND_NUM, MAX(PODate) OVER(PARTITION BY PO_NUMBER, VEND_NUM))
FROM tblPurchases
WHERE PODate > '01-Jan-2017'
) tblTemp INNER JOIN tblPurchases ON tblPurchases.VEND_NUM = tblTemp.VEND_NUM
INNER JOIN tblVendors ON tblPurchases.VEND_NUM = tblVendors.VEND_NUM
WHERE
PODate > '01-Jan-2017'
AND
PODiv = 'C'
),
Defects AS -- Grabs me the listed defects against their stuff
(
SELECT
PartNums.*,
DEFECT_NUM,
DEFECT_CAT
FROM
PartNums
INNER JOIN tblDefects ON PartNums.PART_NUM = tblDefects.DEFECTIVE_PART_NUM
WHERE
DEFECT_DATE > '01-Jan-2017'
),
Names AS
(
SELECT
Defects.*,
PART_NM
FROM
Defects
INNER JOIN tblParts ON Defects.PART_NUM = tblParts.PART_NUM
)
SELECT
VEND_NUM,
VEND_NM,
PART_NUM,
PART_NM,
DEFECT_NUM,
DEFECT_CAT,
DIVISION_CD
FROM Names
Thank you very much for your time and help. Sorry if this creates any ambiguity.

Instead of MAX, use DENSE_RANK, RANK or ROW_NUMBER. Partition by the part only (PO_ITEM in your query), order by PODate DESC, and keep the rows where the rank is 1. Including VEND_NUM in the partition is what gives you one row per vendor per part.
Your query could look like the one below, where I used DENSE_RANK:
SELECT *
FROM (SELECT A.*, DENSE_RANK() OVER(PARTITION BY PO_ITEM ORDER BY PODate DESC) rank_value
FROM your_table A)
WHERE rank_value = 1;

Presumably you are looking for all parts reported defective since a particular date and want to find the corresponding order so that you can contact the supplier.
In Oracle 12c you can use CROSS APPLY to join only the latest order (which you get with ORDER BY date DESC FETCH FIRST ROW ONLY).
select
o.vend_num as vendor_number,
o.vend_nm as vendor_name,
d.defective_part_num as part_number,
p.part_nm as part_name,
d.defect_num as defect_number,
d.defect_cat as defect_category,
o.podiv as division,
o.podate as purchase_order_date
from tbldefects d
cross apply
(
select *
from tblpurchases pu
where pu.po_number = d.defective_part_num
and pu.podate <= d.defect_date
and pu.podiv = 'C'
order by pu.podate desc
fetch first row only
) o
join tblparts p on p.part_num = d.defective_part_num
where d.defect_date >= date '2017-01-01';
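The "newest row per part" idea can be tried on a handful of rows before touching the real query. The sketch below uses Python with an in-memory SQLite database instead of Oracle, and it assumes an SQLite build with window function support (3.25 or later). The table purchases and its columns are hypothetical stand-ins; the three rows are taken from the result table in the question, with dates rewritten as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified purchases table; names are assumptions.
conn.execute(
    "CREATE TABLE purchases ("
    "part_num TEXT, vend_num INTEGER, vend_nm TEXT, podate TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        ("54211EW", 200123, "Push-Button LLC", "2017-01-11"),
        ("41122UT", 200333, "Sears", "2017-03-11"),
        ("41122UT", 200331, "Black&Decker", "2017-08-03"),
    ],
)

# Number the purchases of each part from newest to oldest, then keep rank 1.
# Partitioning by part_num only (not by vendor) is what drops the older vendor.
latest = conn.execute(
    """
    SELECT part_num, vend_num, vend_nm, podate
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (
                   PARTITION BY part_num ORDER BY podate DESC
               ) AS rn
        FROM purchases AS p
    )
    WHERE rn = 1
    ORDER BY part_num
    """
).fetchall()

for row in latest:
    print(row)
```

For part 41122UT only the Black&Decker purchase survives, because it is newer than the Sears one.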

Issue with SQL involving JOINS

I have 2 tables with similar layout, involving INCOME and EXPENSES.
The id column is a customer ID.
I need a result of customer TOTAL AMOUNT, summing up income and expenses.
Table: Income
| id | amountIN|
+--------------+
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
Table: Expenses
| id | amountOUT|
+---------------+
| 1 | -x |
| 4 | -z |
My problem is that some customers only have expenses and others only have income, so I cannot know in advance if I need to do a LEFT or a RIGHT JOIN.
In the example above a RIGHT JOIN could do the trick, but if the situation is inverted (more customers in the Expenses table) it doesn't work.
Expected Result
| id | TotalAmount|
+--------------+
| 1 | a - x |
| 2 | b |
| 3 | c |
| 4 | d - z |
Any help?

select id, SUM(Amount)
from
(
select id, amountin as Amount
from Income
union all
select id, amountout as Amount
from Expenses
) a
group by id

I believe a full join will solve your problem.

I would approach this as a union. Do that in your subquery then sum on it.
For instance:
select id, sum(amt) from
(
select i.id, i.amountIN as amt from Income i
union all
select e.id, e.amountOUT as amt from Expenses e
) t
group by id

You should really have another table, such as Client:
Table: Client
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
Then you could do something like this (amountOUT is already negative, so the two values are added):
SELECT c.ID, COALESCE(i.AmountIN, 0) + COALESCE(e.AmountOUT, 0) AS TotalAmount
FROM Client c
LEFT JOIN Income i ON i.ID = c.ID
LEFT JOIN Expenses e ON e.ID = c.ID
This is less complicated, and I'm sure the table will come in handy another time :)
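The UNION ALL approach can be checked on concrete numbers. The sketch below runs it from Python against an in-memory SQLite database. The amounts are invented for the example (the question only uses placeholders such as a and -x), and customer 5 is added to cover the "expenses only" case that a one-sided outer join would miss.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Income (id INTEGER, amountIN INTEGER)")
conn.execute("CREATE TABLE Expenses (id INTEGER, amountOUT INTEGER)")

# Invented amounts; expenses are stored as negative numbers, as in the question.
conn.executemany(
    "INSERT INTO Income VALUES (?, ?)",
    [(1, 100), (2, 200), (3, 300), (4, 400)],
)
conn.executemany(
    "INSERT INTO Expenses VALUES (?, ?)",
    [(1, -30), (4, -50), (5, -20)],
)

# Stack both tables into one (id, amt) list, then sum per customer.
# No join is involved, so it does not matter which table has more ids.
totals = conn.execute(
    """
    SELECT id, SUM(amt) AS TotalAmount
    FROM (
        SELECT id, amountIN AS amt FROM Income
        UNION ALL
        SELECT id, amountOUT AS amt FROM Expenses
    ) AS t
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in totals:
    print(row)
```

UNION ALL rather than UNION matters here: UNION would silently drop duplicate (id, amount) rows and under-count customers with repeated identical amounts.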
