Access query combine two tables with criteria - sql

The code below references two tables. The tables are identical in structure; the only difference is in the "PRICE" and "PRICE_DATE" values, because one is the same table created a year earlier. All I want is a new table that takes the latest price for each fund from each table. In addition, I want another column that calculates the growth.
The code below works for this purpose.
SELECT [2015_11_Fund_Prices].FUND_CODE, [2015_11_Fund_Prices].PRICE AS
[PRICE_#_112015], [2016_11_Fund_Prices].PRICE AS [PRICE_#_112016],
([2016_11_Fund_Prices].[PRICE]/[2015_11_Fund_Prices].[PRICE]-1) AS Growth INTO 2016_11_Monthly_Fund_Prices
FROM 2016_11_Fund_Prices INNER JOIN 2015_11_Fund_Prices ON [2016_11_Fund_Prices].FUND_CODE = [2015_11_Fund_Prices].FUND_CODE
GROUP BY [2015_11_Fund_Prices].FUND_CODE, [2015_11_Fund_Prices].PRICE_DATE, [2015_11_Fund_Prices].PRICE, [2016_11_Fund_Prices].PRICE, [2016_11_Fund_Prices].PRICE_DATE, ([2016_11_Fund_Prices].[PRICE]/[2015_11_Fund_Prices].[PRICE]-1)
HAVING ((([2015_11_Fund_Prices].PRICE_DATE)=#24/11/2015#) AND (([2016_11_Fund_Prices].PRICE_DATE)=#24/11/2016#));
However, this code assumes that the latest price is 24/11 in both tables. I want to replace this with a max function that will result in the query referencing only the price in the row with the highest date value.
Can anyone help?
The tables used are:
+-----------+------------+-------+
| Fund_Code | PRICE_DATE | PRICE |
+-----------+------------+-------+
| 1         | 12/12/12   | 1     |
| 1         | 13/12/12   | 1.2   |
| 1         | 14/12/12   | 1.1   |
| 2         | 12/12/12   | 1.12  |
| 2         | 13/12/12   | 1.13  |
| 2         | 14/12/12   | 1.11  |
+-----------+------------+-------+
So the second table is exactly the same but dates corresponding to the following year.
All I want is a table with:
Fund_Code Price1 Price2 Growth
Thanks

You need a sub-query like this:
SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate FROM 2016_11_Fund_Prices GROUP BY FUND_CODE
If you add this sub-query to the above and link it to the 2016_11_Fund_Prices table on FUND_CODE and PRICE_DATE=MaxPriceDate it should do what you need.
SELECT 2016_11_Fund_Prices.FUND_CODE, PRICE, PRICE_DATE
FROM 2016_11_Fund_Prices
INNER JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate FROM 2016_11_Fund_Prices GROUP BY FUND_CODE) mp
ON 2016_11_Fund_Prices.FUND_CODE=mp.FUND_CODE AND 2016_11_Fund_Prices.PRICE_DATE=mp.MaxPriceDate
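To show how the sub-query slots into the full result (Fund_Code, Price1, Price2, Growth), here is a minimal sketch that runs the same pattern against both years. It uses Python's built-in sqlite3 rather than Access, so treat it as an illustration of the join logic only. The table names prices_2015 and prices_2016, the ISO-formatted date strings, and all sample rows are assumptions made for the example. Access would also want parentheses around the nested joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins for 2015_11_Fund_Prices and 2016_11_Fund_Prices
CREATE TABLE prices_2015 (FUND_CODE INTEGER, PRICE_DATE TEXT, PRICE REAL);
CREATE TABLE prices_2016 (FUND_CODE INTEGER, PRICE_DATE TEXT, PRICE REAL);
-- Made-up rows; note the latest date differs from fund to fund
INSERT INTO prices_2015 VALUES
    (1, '2015-11-23', 1.2), (1, '2015-11-24', 1.0),
    (2, '2015-11-20', 1.1), (2, '2015-11-23', 2.0);
INSERT INTO prices_2016 VALUES
    (1, '2016-11-23', 1.4), (1, '2016-11-24', 1.5),
    (2, '2016-11-24', 2.4), (2, '2016-11-25', 2.5);
""")

# Each price table is joined to its own "latest date per fund" sub-query,
# then the two years are joined on FUND_CODE.
rows = conn.execute("""
SELECT o.FUND_CODE,
       o.PRICE AS Price1,
       n.PRICE AS Price2,
       n.PRICE / o.PRICE - 1 AS Growth
FROM prices_2015 AS o
INNER JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate
            FROM prices_2015 GROUP BY FUND_CODE) AS mo
        ON o.FUND_CODE = mo.FUND_CODE AND o.PRICE_DATE = mo.MaxPriceDate
INNER JOIN prices_2016 AS n
        ON n.FUND_CODE = o.FUND_CODE
INNER JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate
            FROM prices_2016 GROUP BY FUND_CODE) AS mn
        ON n.FUND_CODE = mn.FUND_CODE AND n.PRICE_DATE = mn.MaxPriceDate
ORDER BY o.FUND_CODE
""").fetchall()

for row in rows:
    print(row)
```

Because each fund's latest date is looked up per table, the hard-coded 24/11 filter in the HAVING clause is no longer needed.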

SQL question, query is not updating account_id's fields: income, customerid, customergroup?

I am executing this query in a Databricks notebook to join data from a stage table to a target table on the shared join keys account_id and stmt_end_dt. The stage table has 2 billion rows of data and the target table has 3 billion rows of data.
Here is the main query:
"UPDATE TARGET_TBL SET INCOME = S.INCOME, CUSTOMERGROUPID = S.CUSTOMERGROUPID, CUSTOMERID = S.CUSTOMERID
FROM STAGE_TBL AS S
WHERE CAST(S.ACCT_ID AS NUMBER(18,0)) = TARGET_TBL.ACCT_ID
AND CAST(S.STMT_END_DT AS DATE) = TARGET_TBL.STMT_END_DT"
What I want to do is add the "income", "customerid", and "customergroup" data from the stage table to the rows of the target table that match on "account_id" and "stmt_end_dt". When I go into the target table I see that there are now fields for "income", "customerid", and "customergroup" (this is fine, because they were created by an earlier query). After my query has run and I click into the target table, I see that account_id is blank and that "income", "customerid" and "customergroup" are all filled with data. And when I run this query: SELECT * FROM TARGET_TBL WHERE INCOME IS NOT NULL; I get back 80000 rows (which seems low considering the stage table has 2 billion rows). After that query runs I again see that "income", "customerid" and "customergroup" are all populated, but account_id is full of NULLs. It is as if this data is just being appended or tacked on rather than updating each account_id's fields with the matching data. This is how I imagine it should look:
account_id | income | customerid | customergroupid
4321 | 60000 | 6345 | 3
5432 | 55000 | 4345 | 5
But instead it looks like this:
account_id | income | customerid | customergroupid
| 60000 | 6345 | 3
| 55000 | 4345 | 5
Or when I run SELECT * FROM TARGET_TBL WHERE INCOME IS NOT NULL:
account_id | income | customerid | customergroupid
NULL | 60000 | 6345 | 3
NULL | 55000 | 4345 | 5
And if I simply open the target table it looks like this:
account_id | income | customerid | customergroupid
4321 | | |
5432 | | |
After that query runs, it is also NULL for all other fields outside of the last 3 shown.
Perhaps the data types coming from the stage table aren't compatible with the target table?
What could be causing this strange behavior?
You can't compare values with NULL: if a field is NULL there is nothing to compare against, so an equality test on it never matches. I believe this is your problem.
If you have NULL fields and you want to compare them, you can usually use IS NULL or NVL. Look up the syntax for these; it is very helpful.
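The NULL comparison behaviour described above can be checked with a small sketch. It uses Python's built-in sqlite3 rather than Databricks, and the tiny stage/target tables and the -1 sentinel are made up for the illustration. COALESCE is used as the portable counterpart of NVL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL evaluates to NULL (unknown), not true; IS NULL is the real test.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
null_is_null = conn.execute("SELECT NULL IS NULL").fetchone()[0]

conn.executescript("""
-- Hypothetical miniature versions of the stage and target tables
CREATE TABLE stage (acct_id INTEGER, income INTEGER);
CREATE TABLE target (acct_id INTEGER);
INSERT INTO stage VALUES (4321, 60000), (NULL, 55000);
INSERT INTO target VALUES (4321), (NULL);
""")

# A plain equality join silently drops the rows whose key is NULL.
matched = conn.execute("""
SELECT t.acct_id, s.income
FROM target AS t
JOIN stage AS s ON s.acct_id = t.acct_id
""").fetchall()

# Mapping NULL to a sentinel (-1 is an arbitrary choice) makes them comparable.
coalesced = conn.execute("""
SELECT t.acct_id, s.income
FROM target AS t
JOIN stage AS s ON COALESCE(s.acct_id, -1) = COALESCE(t.acct_id, -1)
ORDER BY s.income
""").fetchall()

print(null_equals_null, null_is_null, matched, coalesced)
```

Whether matching NULL keys to each other is actually desirable depends on the data; the sketch only shows why they never match under a plain `=`.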

MS Access 2016 - Pull client name from separate table in complex query

I have three tables for vulnerability scanning jobs: customers, authorization forms, and scans. Relationships are one to many from left to right. I previously had scans directly related to clients, but implemented the forms table to add the ability to prevent scanning without authorization. I have the below query which pulls the dates of the most recent and next coming scans (huge thanks to @donPablo), but when I made the change in tables I'm no longer pulling the correct data from the customers table. I'm not exactly sure how to fix it.
SELECT u.Customer_Company, z.*
FROM (Select
NZ(a.Scan_Data.Customer_ID, b.Scan_Data.Customer_ID) as Customer,
aPast as Past,
aFuture as Future,
DATEDIFF("d", aPast, aFuture) as Difference
FROM
(Select Scan_Data.Customer_ID, Max(Scan_Date) as aPast from Scan_Data where Scan_Date <= DATE() Group By Scan_Data.Customer_ID) a
LEFT JOIN
(Select Scan_Data.Customer_ID, Min(Scan_Date) as aFuture from Scan_Data where Scan_Date > DATE() Group By Scan_Data.Customer_ID) b
ON a.Scan_Data.Customer_ID = B.Scan_Data.Customer_ID
UNION
Select
NZ(a.Scan_Data.Customer_ID, b.Scan_Data.Customer_ID) as Customer,
aPast as Past,
aFuture as Future,
DATEDIFF("d", aPast, aFuture) as Difference
FROM
(Select Scan_Data.Customer_ID, Max(Scan_Date) as aPast from Scan_Data where Scan_Date <= DATE() Group By Scan_Data.Customer_ID) a
RIGHT JOIN
(Select Scan_Data.Customer_ID, Min(Scan_Date) as aFuture from Scan_Data where Scan_Date > DATE() Group By Scan_Data.Customer_ID) b
ON a.Scan_Data.Customer_ID = B.Scan_Data.Customer_ID
) AS z LEFT JOIN Customer_Data AS u ON cint(z.Customer) = cint(u.Customer_ID);
In this query the Scan_Data.Customer_ID winds up being the FormID and it then pulls the customer's name based on the FormID. I fixed it in my other queries by doing a double inner join to pull the actual CustomerID based on the FormID, but I can't get that to work here because of the existing joins. Form_Data.Customer_ID is the way it's identified in the Form table. All IDs in their primary tables are autonumber generated PKs.
Customer_Data table:
.Customer_ID | .Customer_Name | etc.
1 | Microsoft |
2 | Reddit |
Form_Data table:
.Form_ID | .Signature_Date | .Expiration_Date | .Customer_ID
1 | 01-Jan-19 | 01-Jan-20 | 2/Reddit
2 | 15-May-18 | 15-May-21 | 1/Microsoft
Scan_Data table:
.Scan_ID | .Scan_Title | .Scan_Date | .Customer_ID
1 | First MS 19052018 | 19-May-18 | 1/2/Reddit
2 | First R 05012019 | 05-Jan-19 | 2/1/Microsoft
The above Scan_Data shows the problem I'm having. The numbers in the Scan_Data.Customer_ID field are the PKs from the other two tables. The .Customer_ID field is pulling the customer ID based upon the form ID and not the actual customer ID. It should show like this:
.Scan_ID | .Scan_Title | .Scan_Date | .Customer_ID
1 | First MS 19052018 | 19-May-18 | 2/1/Microsoft
2 | First R 05012019 | 05-Jan-19 | 1/2/Reddit
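The "double inner join" mentioned above can be sketched in isolation as follows. This is only an illustration in Python's built-in sqlite3, not Access, and it assumes that Scan_Data.Customer_ID really stores a Form_ID, as the desired output suggests. The reduced column lists and the rows are taken loosely from the sample tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer_Data (Customer_ID INTEGER PRIMARY KEY, Customer_Company TEXT);
CREATE TABLE Form_Data (Form_ID INTEGER PRIMARY KEY, Customer_ID INTEGER);
-- Assumption: the Customer_ID column here actually holds a Form_ID
CREATE TABLE Scan_Data (Scan_ID INTEGER PRIMARY KEY, Scan_Title TEXT,
                        Scan_Date TEXT, Customer_ID INTEGER);
INSERT INTO Customer_Data VALUES (1, 'Microsoft'), (2, 'Reddit');
INSERT INTO Form_Data VALUES (1, 2), (2, 1);
INSERT INTO Scan_Data VALUES
    (1, 'First MS 19052018', '2018-05-19', 2),
    (2, 'First R 05012019',  '2019-01-05', 1);
""")

# Scan -> Form (via the stored form id) -> Customer (via the form's customer id)
rows = conn.execute("""
SELECT s.Scan_ID, s.Scan_Title, c.Customer_ID, c.Customer_Company
FROM Scan_Data AS s
INNER JOIN Form_Data AS f ON f.Form_ID = s.Customer_ID
INNER JOIN Customer_Data AS c ON c.Customer_ID = f.Customer_ID
ORDER BY s.Scan_ID
""").fetchall()

for row in rows:
    print(row)
```

One possible way to reuse this in the larger query would be to resolve the real customer ID inside the two aggregated sub-queries, so the outer join to Customer_Data compares customer IDs with customer IDs.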

Running total of values from a table until it matches value from another table

I have 2 tables.
Table 1 is a temp variable table:
declare @Temp as table ( proj_num varchar(10), sum_dom decimal(23,8))
My temp table is populated with a list of project numbers, and a month end accounting dollar amount.
For example:
proj_num | sum_dom
11522 | 2477.15
11524 | 26474.20
41865 | 9012.10
Table 2 is a Project Transactions table.
We're concerned with just the following columns:
proj_num
amount
cost_code
tran_date
Individual values will look something like this:
proj_num | cost_code | amount | tran_date
11522 | LBR | 112.10 | 10/1/2018
11522 | LBR | 1765.90 | 10/2/2018
11522 | MAT | 599.15 | 10/3/2018
11522 | FRT | 57.50 | 10/4/2018
So for this project, since the grand total of $2477.15 is met on 10/3, example output would be:
proj_num | cost_code | amount
11522 | LBR | 1878.00
11522 | MAT | 599.15
I want to sum the amounts (grouped by cost_code, and ordered by tran_date) under the project transaction table until the total sum of values for that project value matches the value in the sum_dom column of the temp table, at which point I will output that data.
Can you help me figure out how to write the query to do that?
I know I should avoid cursors, but I haven't had much luck with my attempts so far. I can't seem to get it to keep a running total.
Running sum is done using SUM(...) OVER (ORDER BY ...). You just need to tell where to stop:
SELECT sq.*
FROM projects
INNER JOIN (
SELECT
proj_num,
cost_code,
amount,
SUM(amount) OVER (PARTITION BY proj_num ORDER BY tran_date) AS running_sum
FROM project_transactions
) AS sq ON projects.proj_num = sq.proj_num
WHERE running_sum <= projects.sum_dom
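As a runnable illustration of the running-sum approach, here is a sketch using Python's built-in sqlite3 (window functions need SQLite 3.25 or newer). Several things are assumptions made for the example: amounts are stored as integer cents so the comparison is exact in SQLite, dates are ISO strings so they sort correctly, and the table and column names projects, project_transactions, sum_dom_cents and amount_cents are stand-ins. The outer GROUP BY on cost_code is added here to reproduce the example output from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for the table variable of month-end totals (values in cents)
CREATE TABLE projects (proj_num TEXT, sum_dom_cents INTEGER);
CREATE TABLE project_transactions (
    proj_num TEXT, cost_code TEXT, amount_cents INTEGER, tran_date TEXT);
INSERT INTO projects VALUES ('11522', 247715);
INSERT INTO project_transactions VALUES
    ('11522', 'LBR',  11210, '2018-10-01'),
    ('11522', 'LBR', 176590, '2018-10-02'),
    ('11522', 'MAT',  59915, '2018-10-03'),
    ('11522', 'FRT',   5750, '2018-10-04');
""")

# Inner query: running total per project in date order.
# Outer query: keep rows up to the target total, then sum per cost code.
rows = conn.execute("""
SELECT sq.proj_num, sq.cost_code, SUM(sq.amount_cents) AS amount_cents
FROM projects
INNER JOIN (
    SELECT proj_num, cost_code, amount_cents, tran_date,
           SUM(amount_cents) OVER (PARTITION BY proj_num
                                   ORDER BY tran_date) AS running_sum
    FROM project_transactions
) AS sq ON projects.proj_num = sq.proj_num
WHERE sq.running_sum <= projects.sum_dom_cents
GROUP BY sq.proj_num, sq.cost_code
ORDER BY MIN(sq.tran_date)
""").fetchall()

for row in rows:
    print(row)
```

With the sample data this keeps the first three transactions (LBR 1878.00 and MAT 599.15 in cents) and drops the FRT row, which would push the running total past 2477.15.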

how to get daily profit from sql table

I'm stuck on the problem of finding daily profits from a db (MS Access) table. The difference from other tips I found online is that I don't have separate "Price" and "Cost" fields in the table, but a "Type" field which distinguishes whether a row is a revenue "S" or a cost "C".
This is the table "Record":
| Date | Price | Quantity | Type |
-----------------------------------
|01/02 | 20 | 2 | C |
|01/02 | 10 | 1 | S |
|01/02 | 3 | 10 | S |
|01/02 | 5 | 2 | C |
|03/04 | 12 | 3 | C |
|03/03 | 200 | 1 | S |
|03/03 | 120 | 2 | C |
So far I tried different solutions like:
SELECT
(SELECT SUM (RS.Price* RS.Quantity)
FROM Record RS WHERE RS.Type='S' GROUP BY RS.Data
) as totalSales,
(SELECT SUM (RC.Price*RC.Quantity)
FROM Record RC WHERE RC.Type='C' GROUP BY RC.Date
) as totalLosses,
ROUND(totalSales-totaleLosses,2) as NetTotal,
R.Date
FROM RECORD R";
in my mind it could work but obviously it doesn't
and
SELECT RC.Data, ROUND(SUM (RC.Price*RC.QuantitY),2) as DailyLoss
INTO #DailyLosses
FROM Record RC
WHERE RC.Type='C' GROUP BY RC.Date
SELECT RS.Date, ROUND(SUM (RS.Price*RS.Quantity),2) as DailyRevenue
INTO #DailyRevenues
FROM Record RS
WHERE RS.Type='S'GROUP BY RS.Date
SELECT Date, DailyRevenue - DailyLoss as DailyProfit
FROM #DailyLosses dlos, #DailyRevenues drev
WHERE dlos.Date = drev.Date";
My problem beyond the correct syntax is the approach to this kind of problem
You can use grouping and conditional summing. Try this:
SELECT data.Date, data.Income - data.Cost as Profit
FROM (
SELECT Record.Date as Date,
SUM(IIF(Record.Type = 'S', Record.Price * Record.Quantity, 0)) as Income,
SUM(IIF(Record.Type = 'C', Record.Price * Record.Quantity, 0)) as Cost
FROM Record
GROUP BY Record.Date
) data
In this case you first create a sub-query to get separate fields for Income and Cost, and then your outer query uses subtraction to get actual profit.
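The conditional-sum approach can be tried out with the sample "Record" rows from the question. This sketch uses Python's built-in sqlite3 rather than Access, and CASE WHEN stands in for IIF because older SQLite versions lack it. Storing the dates as the plain strings shown in the question is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Record (Date TEXT, Price INTEGER, Quantity INTEGER, Type TEXT);
INSERT INTO Record VALUES
    ('01/02',  20,  2, 'C'),
    ('01/02',  10,  1, 'S'),
    ('01/02',   3, 10, 'S'),
    ('01/02',   5,  2, 'C'),
    ('03/04',  12,  3, 'C'),
    ('03/03', 200,  1, 'S'),
    ('03/03', 120,  2, 'C');
""")

# One pass over the table: each row contributes to Income or to Cost
# depending on its Type, and the outer query subtracts the two sums.
rows = conn.execute("""
SELECT d.Date, d.Income - d.Cost AS Profit
FROM (
    SELECT Date,
           SUM(CASE WHEN Type = 'S' THEN Price * Quantity ELSE 0 END) AS Income,
           SUM(CASE WHEN Type = 'C' THEN Price * Quantity ELSE 0 END) AS Cost
    FROM Record
    GROUP BY Date
) AS d
ORDER BY d.Date
""").fetchall()

for row in rows:
    print(row)
```

A day with only costs (03/04 here) still appears in the result, which the two-temp-table attempt with an equality join on Date would have dropped.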

prevent from double/triple SUMing when JOINing

I am joining two tables: accn_demographics and accn_payments. The relationship between the two tables is one to many between accn_demographics.accn_id and accn_payments.accn_id.
My question is: when I sum PAID_AMT and COPAY_AMT, I get double/triple/quadruple the number that I should be getting.
Is there an obvious problem with my join condition?
select sum(p.paid_amt) as SumPaidAmount
, sum(p.copay_amt) as SumCoPay
, p.pmt_date
, d.load_Date
, p.ACCN_ID
from accn_payments p
join
(
select distinct load_date, accn_id
from accn_demographics
) d
on p.ACCN_ID=d.ACCN_ID
where p.POSTED='Y'
and p.pmt_date between '20120701' and '20120731'
group by p.pmt_date, d.load_Date,p.ACCN_ID
order by 3 desc
thanks so much for your guidance.
You need to do the summation in a subquery:
select sum(p.SumPaidAmount) as SumPaidAmount, sum(p.SumCoPay) as SumCoPay,
p.pmt_date, d.load_Date, p.ACCN_ID
from (select accn_id, p.pmt_date, sum(paid_amt) as SumPaidAmount,
sum(copay_amt) as SumCoPay
from accn_payments p
where p.POSTED='Y' and
p.pmt_date between '20120701' and '20120731'
group by accn_id, pmt_date
) p join
(select distinct load_date, accn_id from accn_demographics) d
on p.ACCN_ID=d.ACCN_ID
group by p.pmt_date, d.load_Date,p.ACCN_ID
order by 3 desc
Question: do you really intend for pmt_date to be in the final results? It looks like you want to remove it from both the outer SELECT and the subquery.
The only thing I can see is that (select distinct load_date, accn_id from accn_demographics) might return several matches. Look at your data and run a separate query
select distinct load_date, accn_id from accn_demographics WHERE accn_id=SomeID
where SomeID is one of the result accounts that is returning double/triple values. That should pinpoint your problem.
Yes, but it's not so obvious for beginners. What happens is that for every accn_payments record, you're matching on ONLY the accn_id, which means if there are multiple records in accn_demographics for that particular accn_id, then you will get duplicate accn_payment records due to the join. Is there another limiting field on accn_demographics to join back to the payments?
Ultimately, think of it this way:
accn_payments (p):
accn_id | paid_amt | copay_amt | ...
----------------------------------------------------
1 | 100.00 | 20.00 | ...
accn_demographics (d):
accn_id | load_date | ...
------------------------------------
1 | 2012/01/01 | ...
1 | 2012/03/05 | ...
1 | 2012/06/23 | ...
After joining, your results will look like this:
p.accn_id | p.paid_amt | p.copay_amt | p... | d.accn_id | d.load_date | d...
----------------------------------------------------------------------------
1 | 100.00 | 20.00 | .... | 1 | 2012/01/01 | ....
1 | 100.00 | 20.00 | .... | 1 | 2012/03/05 | ....
1 | 100.00 | 20.00 | .... | 1 | 2012/06/23 | ....
As you can see, the same row from accn_payments gets replicated for every matching accn_demographics record, since you specified only the accn_id column as the join criteria. It can't limit the results any further, so the DB engine says "Hey, look, this p record matches all these d records, this must be what he was asking for!" Obviously that is not what was intended: when you sum p.paid_amt and p.copay_amt, the sum runs over ALL ROWS (even though they are duplicated).
Ultimately, see if you can limit the join criteria for accn_demographics even further (by some date, perhaps), that way you limit the number of duplicate payment records during the join.
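The row multiplication walked through above can be reproduced with the three-row example. This sketch uses Python's built-in sqlite3 and only the columns shown in the illustration. Collapsing accn_demographics to one row per accn_id is just one possible way to limit the join, chosen here for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accn_payments (accn_id INTEGER, paid_amt REAL, copay_amt REAL);
CREATE TABLE accn_demographics (accn_id INTEGER, load_date TEXT);
INSERT INTO accn_payments VALUES (1, 100.00, 20.00);
INSERT INTO accn_demographics VALUES
    (1, '2012-01-01'), (1, '2012-03-05'), (1, '2012-06-23');
""")

# Joining on accn_id alone repeats the single payment once per
# demographics row, so the sums come out three times too large.
inflated = conn.execute("""
SELECT SUM(p.paid_amt), SUM(p.copay_amt)
FROM accn_payments AS p
JOIN accn_demographics AS d ON p.accn_id = d.accn_id
""").fetchone()

# Reducing the demographics side to one row per accn_id before the
# join leaves each payment counted exactly once.
correct = conn.execute("""
SELECT SUM(p.paid_amt), SUM(p.copay_amt)
FROM accn_payments AS p
JOIN (SELECT DISTINCT accn_id FROM accn_demographics) AS d
  ON p.accn_id = d.accn_id
""").fetchone()

print(inflated, correct)
```

If load_date is needed in the output, the sub-query would have to pick a single load_date per account (for example the latest one) instead of dropping the column.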