Running total of values from a table until it matches value from another table - sql

I have 2 tables.
Table 1 is a temp variable table:
declare #Temp as table ( proj_num varchar(10), sum_dom decimal(23,8))
My temp table is populated with a list of project numbers, and a month end accounting dollar amount.
For example:
proj_num | sum_dom
11522 | 2477.15
11524 | 26474.20
41865 | 9012.10
Table 2 is a Project Transactions table.
We're concerned with just the following columns:
proj_num
amount
cost_code
tran_date
Individual values will somemething like this:
proj_num | cost_code | amount | tran_date
11522 | LBR | 112.10 | 10/1/2018
11522 | LBR | 1765.90 | 10/2/2018
11522 | MAT | 599.15 | 10/3/2018
11522 | FRT | 57.50 | 10/4/2018
So for this project, since the grand total of $2477.15 is met on 10/3, example output would be:
proj_num | cost_code | amount
11522 | LBR | 1878.00
11522 | MAT | 599.15
I want to sum the amounts (grouped by cost_code, and ordered by tran_date) under the project transaction table until the total sum of values for that project value matches the value in the sum_dom column of the temp table, at which point I will output that data.
Can you help me figure out how to write the query to do that?
I know I should avoid cursors, but I havent had much luck with my attempts so far. I cant seem to get it to keep a running total.

Running sum is done using SUM(...) OVER (ORDER BY ...). You just need to tell where to stop:
SELECT sq.*
FROM projects
INNER JOIN (
SELECT
proj_num,
cost_code,
amount,
SUM(amount) OVER (PARTITION BY proj_num ORDER BY tran_date) AS running_sum
FROM project_transactions
) AS sq ON projects.proj_num = sq.proj_num
WHERE running_sum <= projects.sum_dom
DB Fiddle

Related

Sum Total of Distinct Items In Table

I have a table of transactions. One column is for the vendor ID, and one column is for the amount due. [There are other columns, but they aren't relevant]
id | amount | custid
23 | -31.32 | 904424
24 | -19.94 | 646744
25 | -4.77 | 904424
26 | -29.40 | 972979
I want to run a query that delivers the total for each distinct customer ID.
The goal is to determine how much each customer is owed.
That's basic aggregation:
select custid, sum(amount) total_amount
from mytable
group by custid

SQL to find max of sum of data in one table, with extra columns

Apologies if this has been asked elsewhere. I have been looking on Stackoverflow all day and haven't found an answer yet. I am struggling to write the query to find the highest month's sales for each state from this example data.
The data looks like this:
| order_id | month | cust_id | state | prod_id | order_total |
+-----------+--------+----------+--------+----------+--------------+
| 67212 | June | 10001 | ca | 909 | 13 |
| 69090 | June | 10011 | fl | 44 | 76 |
... etc ...
My query
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders GROUP BY `month`, `state`
ORDER BY sales;
| month | state | sales |
+------------+--------+--------+
| September | wy | 435 |
| January | wy | 631 |
... etc ...
returns a few hundred rows: the sum of sales for each month for each state. I want it to only return the month with the highest sum of sales, but for each state. It might be a different month for different states.
This query
SELECT `state`, MAX(order_sum) as topmonth
FROM (SELECT `state`, SUM(order_total) order_sum FROM orders GROUP BY `month`,`state`)
GROUP BY `state`;
| state | topmonth |
+--------+-----------+
| ca | 119586 |
| ga | 30140 |
returns the correct number of rows with the correct data. BUT I would also like the query to give me the month column. Whatever I try with GROUP BY, I cannot find a way to limit the results to one record per state. I have tried PartitionBy without success, and have also tried unsuccessfully to do a join.
TL;DR: one query gives me the correct columns but too many rows; the other query gives me the correct number of rows (and the correct data) but insufficient columns.
Any suggestions to make this work would be most gratefully received.
I am using Apache Drill, which is apparently ANSI-SQL compliant. Hopefully that doesn't make much difference - I am assuming that the solution would be similar across all SQL engines.
This one should do the trick
SELECT t1.`month`, t1.`state`, t1.`sales`
FROM (
/* this one selects month, state and sales*/
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders
GROUP BY `month`, `state`
) AS t1
JOIN (
/* this one selects the best value for each state */
SELECT `state`, MAX(sales) AS best_month
FROM (
SELECT `month`, `state`, SUM(order_total) AS sales
FROM orders
GROUP BY `month`, `state`
)
GROUP BY `state`
) AS t2
ON t1.`state` = t2.`state` AND
t1.`sales` = t2.`best_month`
It's basically the combination of the two queries you wrote.
Try this:
SELECT `month`, `state`, SUM(order_total) FROM orders WHERE `month` IN
( SELECT TOP 1 t.month FROM ( SELECT `month` AS month, SUM(order_total) order_sum FROM orders GROUP BY `month`
ORDER BY order_sum DESC) t)
GROUP BY `month`, state ;

Access query combine two tables with criteria

The below code references two tables. Each table are identical in structure, only difference being the "PRICE" and "PRICE_DATE" values. This is because it's the same table created one year ago. All I want to do is have a new table which takes the latest price in each table for each fund and inserts that into a new table. In addition to this, I also want another column which calculates the growth.
The code below works for this purpose.
SELECT [2015_11_Fund_Prices].FUND_CODE, [2015_11_Fund_Prices].PRICE AS
[PRICE_#_112015], [2016_11_Fund_Prices].PRICE AS [PRICE_#_112016]
([2016_11_Fund_Prices].[PRICE]/[2015_11_Fund_Prices].[PRICE]-1) AS Growth INTO 2016_11_Monthly_Fund_Prices
FROM 2016_11_Fund_Prices INNER JOIN 2015_11_Fund_Prices ON [2016_11_Fund_Prices].FUND_CODE = [2015_11_Fund_Prices].FUND_CODE
GROUP BY [2015_11_Fund_Prices].FUND_CODE, [2015_11_Fund_Prices].PRICE_DATE, [2015_11_Fund_Prices].PRICE, [2016_11_Fund_Prices].PRICE, [2016_11_Fund_Prices].PRICE_DATE, ([2016_11_Fund_Prices].[PRICE]/[2015_11_Fund_Prices].[PRICE]-1)
HAVING ((([2015_11_Fund_Prices].PRICE_DATE)=#24/11/2015#) AND (([2016_11_Fund_Prices].PRICE_DATE)=#24/11/2016#));
However, this code assumes that the latest price is 24/11 in both tables. I want to replace this with a max function that will result in the query referencing only the price in the row with the highest date value.
Can anyone help?
Tabels used are
+-----------+------------+-------+
| Fund_Code | PRICE_DATE | PRICE |
+-----------+------------+-------+
| 1 | 12/12/12 | 1 |
| 1 | 13/12/12 | 1.2 |
| 1 | 14/12/12 | 1.1 |
| 2 | 12/12/12 | 1.12 |
| 2 | 13/12/12 | 1.13 |
| 2 | 14/12/12 | 1.11 |
So the second table is exactly the same but dates corresponding to the following year.
All I want is a table with:
Fund_Code Price1 Price2 Growth
Thanks
You need a sub-query like this:
SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate FROM 2016_11_Fund_Prices GROUP BY FUND_CODE
If you add this sub-query to the above and link it to the 2016_11_Fund_Prices table on FUND_CODE and PRICE_DATE=MaxPriceDate it should do what you need.
SELECT 2016_11_Fund_Prices.FUND_CODE, PRICE, PRICE_DATE
FROM 2016_11_Fund_Prices
INNER JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate FROM 2016_11_Fund_Prices GROUP BY FUND_CODE) mp
ON 2016_11_Fund_Prices.FUND_CODE=mp.FUND_CODE AND 2016_11_Fund_Prices.PRICE_DATE=mp.MaxPriceDate

PostgreSQL return multiple rows with DISTINCT though only latest date per second column

Lets says I have the following database table (date truncated for example only, two 'id_' preix columns join with other tables)...
+-----------+---------+------+--------------------+-------+
| id_table1 | id_tab2 | date | description | price |
+-----------+---------+------+--------------------+-------+
| 1 | 11 | 2014 | man-eating-waffles | 1.46 |
+-----------+---------+------+--------------------+-------+
| 2 | 22 | 2014 | Flying Shoes | 8.99 |
+-----------+---------+------+--------------------+-------+
| 3 | 44 | 2015 | Flying Shoes | 12.99 |
+-----------+---------+------+--------------------+-------+
...and I have a query like the following...
SELECT id, date, description FROM inventory ORDER BY date ASC;
How do I SELECT all the descriptions, but only once each while simultaneously only the latest year for that description? So I need the database query to return the first and last row from the sample data above; the second it not returned because the last row has a later date.
Postgres has something called distinct on. This is usually more efficient than using window functions. So, an alternative method would be:
SELECT distinct on (description) id, date, description
FROM inventory
ORDER BY description, date desc;
The row_number window function should do the trick:
SELECT id, date, description
FROM (SELECT id, date, description,
ROW_NUMBER() OVER (PARTITION BY description
ORDER BY date DESC) AS rn
FROM inventory) t
WHERE rn = 1
ORDER BY date ASC;

SQL - Select unique rows from a group of results

I have wrecked my brain on this problem for quite some time. I've also reviewed other questions but was unsuccessful.
The problem I have is, I have a list of results/table that has multiple rows with columns
| REGISTRATION | ID | DATE | UNITTYPE
| 005DTHGP | 172 | 2007-09-11 | MBio
| 005DTHGP | 1966 | 2006-09-12 | Tracker
| 013DTHGP | 2281 | 2006-11-01 | Tracker
| 013DTHGP | 2712 | 2008-05-30 | MBio
| 017DTNGP | 2404 | 2006-10-20 | Tracker
| 017DTNGP | 508 | 2007-11-10 | MBio
I am trying to select rows with unique REGISTRATIONS and where the DATE is max (the latest). The IDs are not proportional to the DATE, meaning the ID could be a low value yet the DATE is higher than the other matching row and vise-versa. Therefore I can't use MAX() on both the DATE and ID and grouping just doesn't seem to work.
The results I want are as follows;
| REGISTRATION | ID | DATE | UNITTYPE
| 005DTHGP | 172 | 2007-09-11 | MBio
| 013DTHGP | 2712 | 2008-05-30 | MBio
| 017DTNGP | 508 | 2007-11-10 | MBio
PLEASE HELP!!!?!?!?!?!?!?
You want embedded queries, which not all SQLs support. In t-sql you'd have something like
select r.registration, r.recent, t.id, t.unittype
from (
select registration, max([date]) recent
from #tmp
group by
registration
) r
left outer join
#tmp t
on r.recent = t.[date]
and r.registration = t.registration
TSQL:
declare #R table
(
Registration varchar(16),
ID int,
Date datetime,
UnitType varchar(16)
)
insert into #R values ('A','1','20090824','A')
insert into #R values ('A','2','20090825','B')
select R.Registration,R.ID,R.UnitType,R.Date from #R R
inner join
(select Registration,Max(Date) as Date from #R group by Registration) M
on R.Registration = M.Registration and R.Date = M.Date
This can be inefficient if you have thousands of rows in your table depending upon how the query is executed (i.e. if it is a rowscan and then a select per row).
In PostgreSQL, and assuming your data is indexed so that a sort isn't needed (or there are so few rows you don't mind a sort):
select distinct on (registration), * from whatever order by registration,"date" desc;
Taking each row in registration and descending date order, you will get the latest date for each registration first. DISTINCT throws away the duplicate registrations that follow.
select registration,ID,date,unittype
from your_table
where (registration, date) IN (select registration,max(date)
from your_table
group by registration)
This should work in MySQL:
SELECT registration, id, date, unittype FROM
(SELECT registration AS temp_reg, MAX(date) as temp_date
FROM table_name GROUP BY registration) AS temp_table
WHERE registration=temp_reg and date=temp_date
The idea is to use a subquery in a FROM clause which throws up a single row containing the correct date and registration (the fields subjected to a group); then use the correct date and registration in a WHERE clause to fetch the other fields of the same row.