Percentage difference between numbers in two columns - sql

My SQL experience is fairly minimal, so please go easy on me here. I have a table tblForEx and I'm trying to create a query that looks at one particular column, LastSalesRateChangeDate, along with ForExRate.
Basically, I want the query to look at LastSalesRateChangeDate and pull the ForExRate on the same row (from the ForExRate column), then check whether there has been a +/- 5% change since the last time LastSalesRateChangeDate changed. I hope this makes sense; I tried to explain it as clearly as possible.
I believe I would need to create a 'subquery' to look at the LastSalesRateChangeDate and pull the ForEx rate from that date, but I just don't know how to go about this.
I should add this is being done in Access (SQL)
Sample data, here is what the table looks like:
| BaseCur | ForCur | ForExRate | LastSalesRateChangeDate |
| USD     | BRL    | 1.718     | 12/9/2008               |
| USD     | BRL    | 1.65      | 11/8/2008               |
So I would need a query to look at the LastSalesRateChangeDate column, check whether the date has changed, and if so take the ForExRate value and give the percentage difference from the ForExRate of the previous record.
So the final result would likely look like:
| BaseCur | ForCur | Percentage Change since Last Sales Rate Change |
| USD     | BRL    | X%                                             |

Gordon's answer pointed in the right direction:
SELECT t2.*,
       (SELECT TOP 1 t.ForExRate
        FROM tblForEx AS t
        WHERE t.BaseCur = t2.BaseCur
          AND t.ForCur = t2.ForCur
          AND t.LastSalesRateChangeDate < t2.LastSalesRateChangeDate
        ORDER BY t.LastSalesRateChangeDate DESC, t.ForExRate DESC
       ) AS PreviousRate,
       [ForExRate] / [PreviousRate] - 1 AS ChangeRatio
FROM tblForEx AS t2;
Access gives errors when the TOP 1 in the subquery produces "ties". We broke the ties, and so removed the error, by adding an extra item to the ORDER BY clause. To display the ratio as a percentage, switch to design view and change the properties of that column accordingly.
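Since the original goal was to flag changes of 5% or more, one possible follow-up (a sketch only, reusing the query above and assuming the same table and column names) is to wrap that query and filter on the absolute change:
SELECT q.BaseCur, q.ForCur, q.ForExRate, q.PreviousRate,
       (q.ForExRate / q.PreviousRate - 1) AS ChangeRatio
FROM (
    SELECT t2.*,
           (SELECT TOP 1 t.ForExRate
            FROM tblForEx AS t
            WHERE t.BaseCur = t2.BaseCur
              AND t.ForCur = t2.ForCur
              AND t.LastSalesRateChangeDate < t2.LastSalesRateChangeDate
            ORDER BY t.LastSalesRateChangeDate DESC, t.ForExRate DESC
           ) AS PreviousRate
    FROM tblForEx AS t2
) AS q
WHERE ABS(q.ForExRate / q.PreviousRate - 1) >= 0.05;
Rows with no earlier rate (PreviousRate is Null) simply drop out of the filter.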

If I understand correctly, you want the previous value. In MS Access, you can use a correlated subquery:
select t.*,
       (select top 1 t2.LastSalesRateChangeDate
        from tblForEx as t2
        where t2.BaseCur = t.BaseCur and
              t2.ForCur = t.ForCur and
              t2.LastSalesRateChangeDate < t.LastSalesRateChangeDate
        order by t2.LastSalesRateChangeDate desc
       ) as prev_LastSalesRateChangeDate
from tblForEx as t;
Now, with this as a subquery, you can get the previous exchange rate using a join:
select t.*, (t.ForExRate / tprev.ForExRate - 1) as change_ratio
from (select t.*,
             (select top 1 t2.LastSalesRateChangeDate
              from tblForEx as t2
              where t2.BaseCur = t.BaseCur and
                    t2.ForCur = t.ForCur and
                    t2.LastSalesRateChangeDate < t.LastSalesRateChangeDate
              order by t2.LastSalesRateChangeDate desc
             ) as prev_LastSalesRateChangeDate
      from tblForEx as t
     ) as t inner join
     tblForEx as tprev
     on tprev.BaseCur = t.BaseCur and
        tprev.ForCur = t.ForCur and
        tprev.LastSalesRateChangeDate = t.prev_LastSalesRateChangeDate;

As per my understanding, you can use the LEAD function to pull the rate from the last change date into a new column, using the query below:
WITH CTE AS (
    SELECT *,
           LEAD(ForExRate, 1) OVER (PARTITION BY BaseCur, ForCur
                                    ORDER BY LastChangeDate DESC) AS LastValue
    FROM #TT
)
SELECT BaseCur, ForCur, ForExRate, LastChangeDate,
       CAST(((ForExRate - ISNULL(LastValue, 0)) / LastValue) * 100 AS float)
FROM CTE
The problems here are:
For the last row in each BaseCur/ForCur group, the LastValue column calculated with the LEAD function will be NULL.
If there is only a single row for a particular BaseCur and ForCur, that row will also have NULL in the column.
Resolution:
If you are sure that there will be at least two rows for each BaseCur and ForCur, you can use a WHERE clause to remove the NULL values from the final result.
WITH CTE AS (
    SELECT *,
           LEAD(ForExRate, 1) OVER (PARTITION BY BaseCur, ForCur
                                    ORDER BY LastChangeDate DESC) AS LastValue
    FROM #TT
)
SELECT BaseCur, ForCur, ForExRate, LastChangeDate,
       CAST(((ForExRate - ISNULL(LastValue, 0)) / LastValue) * 100 AS float) AS Percentage
FROM CTE
WHERE LastValue IS NOT NULL
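One small optional tweak (an untested sketch, keeping the same #TT table and column names as above): if a previous rate could ever be zero, NULLIF avoids a divide-by-zero error, and the ISNULL in the numerator is then unnecessary:
WITH CTE AS (
    SELECT *,
           LEAD(ForExRate, 1) OVER (PARTITION BY BaseCur, ForCur
                                    ORDER BY LastChangeDate DESC) AS LastValue
    FROM #TT
)
SELECT BaseCur, ForCur, ForExRate, LastChangeDate,
       -- NULLIF turns a zero previous rate into NULL instead of raising an error
       CAST((ForExRate - LastValue) / NULLIF(LastValue, 0) * 100 AS float) AS Percentage
FROM CTE
WHERE LastValue IS NOT NULL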

SELECT basetbl.BaseCur, basetbl.ForCur, basetbl.NewDate, basetbl.OldDate,
       num2.ForExRate / num1.ForExRate * 100 AS PercentChange
FROM (((SELECT t.BaseCur, t.ForCur,
               MAX(t.LastSalesRateChangeDate) AS NewDate,
               summary.Last_Date AS OldDate
        FROM (tblForEx AS t
              LEFT JOIN (SELECT TOP 2 BaseCur, ForCur, MAX(LastSalesRateChangeDate) AS Last_Date
                         FROM tblForEx AS t1
                         WHERE LastSalesRateChangeDate <>
                               (SELECT MAX(LastSalesRateChangeDate)
                                FROM tblForEx t2
                                WHERE t2.BaseCur = t1.BaseCur AND t2.ForCur = t1.ForCur)
                         GROUP BY BaseCur, ForCur) AS summary
              ON summary.ForCur = t.ForCur AND summary.BaseCur = t.BaseCur)
        GROUP BY t.BaseCur, t.ForCur, summary.Last_Date) AS basetbl
      LEFT JOIN tblForEx AS num1
      ON num1.BaseCur = basetbl.BaseCur AND num1.ForCur = basetbl.ForCur
         AND num1.LastSalesRateChangeDate = basetbl.OldDate))
     LEFT JOIN tblForEx AS num2
     ON num2.BaseCur = basetbl.BaseCur AND num2.ForCur = basetbl.ForCur
        AND num2.LastSalesRateChangeDate = basetbl.NewDate;
This uses a series of subqueries. First, you are selecting the most recent date for the BaseCur and ForCur. Then, you are joining onto that the previous date. I do that by using another subquery to select the top two dates, and exclude the one that is equal to the previously established most recent date. This is the "summary" subquery.
Then, you get the BaseCur, ForCur, NewDate, and OldDate in the "basetbl" subquery. After that, it is two simple joins of the original table back onto those dates to get the rate that was applicable then.
Finally, you are selecting your BaseCur, ForCur, and whatever formula you want to use to calculate the rate change. I used a simple ratio here, but it is easy to change. You can remove the dates from the first line if you want; they are there solely as a reference point.
It doesn't look pretty, but complicated Access SQL queries never do.

Related

Modify my SQL Server query -- returns too many rows sometimes

I need to update the following query so that it only returns one child record (remittance) per parent (claim).
Table Remit_To_Activate contains exactly one date/timestamp per claim, which is what I wanted.
But when I join the full Remittance table to it, since some claims have multiple remittances with the same date/timestamps, the outermost query returns more than 1 row per claim for those claim IDs.
SELECT * FROM REMITTANCE
WHERE BILLED_AMOUNT>0 AND ACTIVE=0
AND REMITTANCE_UUID IN (
SELECT REMITTANCE_UUID FROM Claims_Group2 G2
INNER JOIN Remit_To_Activate t ON (
(t.ClaimID = G2.CLAIM_ID) AND
(t.DATE_OF_LATEST_REGULAR_REMIT = G2.CREATE_DATETIME)
)
where ACTIVE=0 and BILLED_AMOUNT>0
)
I believe the problem would be resolved if I included REMITTANCE_UUID as a column in Remit_To_Activate. That's the REAL issue. This is how I created the Remit_To_Activate table (trying to get the most recent remittance for a claim):
SELECT MAX(create_datetime) AS DATE_OF_LATEST_REMIT,
       MAX(claim_id) AS ClaimID
INTO Latest_Remit_To_Activate
FROM Claims_Group2
WHERE BILLED_AMOUNT > 0
GROUP BY Claim_ID
ORDER BY Claim_ID
Claims_Group2 contains these fields:
REMITTANCE_UUID,
CLAIM_ID,
BILLED_AMOUNT,
CREATE_DATETIME
Here are the 2 rows that are currently giving me the problem -- they're both remits for the SAME CLAIM, with the SAME TIMESTAMP. I only want one of them in the Remit_To_Activate table, so only ONE remittance will be "activated" per claim:
[screenshot of the two duplicate rows]
You can change your query like this:
SELECT
    p.*, latest_remit.DATE_OF_LATEST_REMIT
FROM
    Remittance AS p INNER JOIN
    (SELECT MAX(create_datetime) AS DATE_OF_LATEST_REMIT,
            claim_id
     FROM Claims_Group2
     WHERE BILLED_AMOUNT > 0
     GROUP BY Claim_ID) AS latest_remit
    ON latest_remit.claim_id = p.claim_id;
This will give you only one row. Untested (so please run and make changes).
Without having more information on the structure of your database, especially the structure of Claims_Group2 and REMITTANCE and the relationship between them, it's not really possible to advise you on how to introduce a remittance UUID into DATE_OF_LATEST_REMIT.
Since you are using SQL Server, however, it is possible to use a window function to introduce a synthetic means to choose among remittances having the same timestamp. For example, it looks like you could approach the problem something like this:
select *
from (
select
r.*,
row_number() over (partition by cg2.claim_id order by cg2.create_datetime desc) as rn
from
remittance r
join claims_group2 cg2
on r.remittance_uuid = cg2.remittance_uuid
where
r.active = 0
and r.billed_amount > 0
and cg2.active = 0
and cg2.billed_amount > 0
) t
where t.rn = 1
Note that this does not depend on your DATE_OF_LATEST_REMIT table at all, it having been subsumed into the inline view. Note also that this will introduce one extra column into your results, though you could avoid that by enumerating the columns of table remittance in the outer select clause.
It also seems odd to be filtering on two sets of active and billed_amount columns, but that appears to follow from what you were doing in your original queries. In that vein, I urge you to check the results carefully, as lifting the filter conditions on cg2 columns up to the level of the join to remittance yields a result that may return rows that the original query did not (but never more than one per claim_id).
A co-worker offered me this elegant demonstration of a solution. I'd never used "over" or "partition" before. Works great! Thank you John and Gaurasvsa for your input.
if OBJECT_ID('tempdb..#t') is not null
drop table #t
select *, ROW_NUMBER() over (partition by CLAIM_ID order by CLAIM_ID) as ROW_NUM
into #t
from
(
select '2018-08-15 13:07:50.933' as CREATE_DATE, 1 as CLAIM_ID, NEWID() as REMIT_UUID
union select '2018-08-15 13:07:50.933', 1, NEWID()
union select '2017-12-31 10:00:00.000', 2, NEWID()
) x
select *
from #t
order by CLAIM_ID, ROW_NUM
select CREATE_DATE, MAX(CLAIM_ID), MAX(REMIT_UUID)
from #t
where ROW_NUM = 1
group by CREATE_DATE

SSRS Table of locations per item type

I have a basic query which shows what the latest product to be put in each location (FVTank) is:
SELECT TOP 1
T0.[DateTime],
T0.[TankName],
T1.[Item]
FROM
t005_pci_data T0
INNER JOIN t001_fvbatch T1 ON T1.[FVBatch] = T0.[FVBatch]
WHERE
T0.[TankName] = 'FV101'
UNION
SELECT TOP 1
T0.[DateTime],
T0.[TankName],
T1.[Item]
FROM
t005_pci_data T0
INNER JOIN t001_fvbatch T1 ON T1.[FVBatch] = T0.[FVBatch]
WHERE
T0.[TankName] = 'FV102'
[...etc...]
ORDER BY
T0.[DateTime] DESC
Which gives a result like this:
What I'd like to do is create a summary page on SSRS which would display all the locations which currently hold each item. Ideally it would look something like this:
There are 50 locations and 7 main items so I need it to have 8 headers (one additional one for "other".)
Is there a way to do this in SSRS? Or is there a better solution by doing it in SQL?
Thank you.
Add an additional column to your dataset that calculates a row number for each Item, ordered by the DateTime field:
row_number() over (partition by Item order by DateTime desc) as rn
Judging by your source query in your question, this may be best included as a wrapping select around your final query:
select DateTime
,TankName
,Item
,row_number() over (partition by Item order by DateTime desc) as rn
from(
<Your original query here>
) a
You can then use this as your row group, as without one you will not get the top-aligned format you are after in each Item column. Remember to delete the rn column but keep the grouping.
When you run this report you will get the desired top-aligned format (I didn't bother typing out all your data into my dataset query, hence the missing values).
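On the other part of the question, whether there is a better way to do it in SQL: a hedged sketch (assuming SQL Server and the same tables and columns as the query in the question) that replaces the per-tank TOP 1 UNION blocks with a single ROW_NUMBER per tank:
SELECT [DateTime], [TankName], [Item]
FROM (
    SELECT T0.[DateTime], T0.[TankName], T1.[Item],
           -- number the rows within each tank, newest first
           ROW_NUMBER() OVER (PARTITION BY T0.[TankName]
                              ORDER BY T0.[DateTime] DESC) AS rn
    FROM t005_pci_data T0
    INNER JOIN t001_fvbatch T1 ON T1.[FVBatch] = T0.[FVBatch]
) a
WHERE rn = 1   -- keep only the most recent row recorded for each tank
The rn column partitioned by Item from the answer above can then be layered on top of this result in the same way.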

Datediff between two tables

I have those two tables
1-Add to queue table
TransID | Add Date
10      | 10/10/2012
11      | 14/10/2012
11      | 18/11/2012
11      | 25/12/2012
12      | 1/1/2013
2-Removed from queue table
TransID | Removed Date
10      | 15/1/2013
11      | 12/12/2012
11      | 13/1/2013
11      | 20/1/2013
The TransID is the key between the two tables, and I can't modify those tables. What I want is to query the amount of time each transaction spent in the queue.
It's easy when there is one item in each table, but when an item gets queued more than once, how do I calculate that?
Assuming the order TransIDs are entered into the Add table is the same order they are removed, you can use the following:
WITH OrderedAdds AS
( SELECT TransID,
AddDate,
[RowNumber] = ROW_NUMBER() OVER(PARTITION BY TransID ORDER BY AddDate)
FROM AddTable
), OrderedRemoves AS
( SELECT TransID,
RemovedDate,
[RowNumber] = ROW_NUMBER() OVER(PARTITION BY TransID ORDER BY RemovedDate)
FROM RemoveTable
)
SELECT OrderedAdds.TransID,
OrderedAdds.AddDate,
OrderedRemoves.RemovedDate,
[DaysInQueue] = DATEDIFF(DAY, OrderedAdds.AddDate, ISNULL(OrderedRemoves.RemovedDate, CURRENT_TIMESTAMP))
FROM OrderedAdds
LEFT JOIN OrderedRemoves
ON OrderedAdds.TransID = OrderedRemoves.TransID
AND OrderedAdds.RowNumber = OrderedRemoves.RowNumber;
The key part is that each record gets a rownumber based on the transaction id and the date it was entered, you can then join on both rownumber and transID to stop any cross joining.
Example on SQL Fiddle
DISCLAIMER: There is probably a problem with this, but I hope to send you in one possible direction. Make sure to expect problems.
You can try in the following direction (which might work in some way depending on your system, version, etc) :
SELECT transId, (SUM(remove_date_sum) - SUM(add_date_sum)) / (60*60*24)
FROM
(
    SELECT transId, SUM(UNIX_TIMESTAMP(add_date)) AS add_date_sum, 0 AS remove_date_sum
    FROM add_to_queue
    GROUP BY transId
    UNION ALL
    SELECT transId, 0 AS add_date_sum, SUM(UNIX_TIMESTAMP(remove_date)) AS remove_date_sum
    FROM remove_from_queue
    GROUP BY transId
) AS combined
GROUP BY transId;
A bit of explanation: as far as I know, you cannot sum dates, but you can convert them to some sort of timestamp. Check if UNIX_TIMESTAMP works for you, or figure out something else. Then you can sum within each table, create a union by conveniently leaving the other column as zero, and then aggregate the union query, subtracting one sum from the other.
As for the division at the end of the first SELECT: UNIX_TIMESTAMP returns seconds, so you divide to get days, or whatever unit it is that you want.
This all said - I would probably solve this using a stored procedure or some client script. SQL is not a weapon for every battle. Making two separate queries can be much simpler.
Answer 2, after your comments. (As a side note, some of your dates, such as 15/1/2013 and 13/1/2013, do not represent proper date formats.)
select transId, sum(numberOfDays) totalQueueTime
from (
    select a.transId,
           datediff(day, a.addDate, isnull(r.removeDate, a.addDate)) numberOfDays
    from AddTable a
    left join RemoveTable r on a.transId = r.transId
) X
group by transId
Answer 1, before your comments.
This assumes that there won't be a new record added unless the previous one has been removed. Also note that the following query will return numberOfDays as zero for unremoved records:
select a.transId, a.addDate, r.removeDate,
datediff(day,a.addDate,isnull(r.removeDate,a.addDate)) numberOfDays
from AddTable a left join RemoveTable r on a.transId = r.transId
order by a.transId, a.addDate, r.removeDate

Calculating information by using values from previous line

I have the current balance for each account and I need to subtract the netamount for transactions to create the previous month's end balance for the past 24 months. Below is a sample dataset;
create table txn_by_month (
memberid varchar(15)
,accountid varchar(15)
,effective_year varchar(4)
,effective_month varchar(2)
,balance money
,netamt money
,prev_mnthendbal money)
insert into txn_by_month values
(10001,111222333,2012,12,634.15,-500,1134.15)
,(10001,111222333,2012,11,NULL,-1436,NULL)
,(10001,111222333,2012,10,NULL,600,NULL)
,(10002,111333444,2012,12,1544.20,1650,-105.80)
,(10002,111333444,2012,11,NULL,1210,NULL)
,(10002,111333444,2012,10,NULL,-622,NULL)
,(10003,111456456,2012,01,125000,1200,123800)
,(10003,111456456,2011,12,NULL,1350,NULL)
,(10003,111456456,2011,11,NULL,-102,NULL)
As you can see I already have a table of all the transactions for each month totaled up. I just need to calculate the previous month end balance on the first line and bring it down to the second, third line etc. I have been trying to use CTEs, but am not overly familiar with them and seem to be stuck at the moment. This is what I have;
;
WITH CTEtest AS
(SELECT ROW_NUMBER() OVER (PARTITION BY memberid order by(accountid)) AS Sequence
,memberid
,accountid
,prev_mnthendbal
,netamt
FROM txn_by_month)
select c1.memberid
,c1.accountid
,c1.sequence
,c2.prev_mnthendbal as prev_mnthendbal
,c1.netamt,
COALESCE(c2.prev_mnthendbal, 0) - COALESCE(c1.netamt, 0) AS cur_mnthendbal
FROM CTEtest AS c1
LEFT OUTER JOIN CTEtest AS c2
ON c1.memberid = c2.memberid
and c1.accountid = c2.accountid
and c1.Sequence = c2.Sequence + 1
This is working only for the sequence = 2. I know that my issue is that I need to bring my cur_mnthendbal value down into the next line, but I can't seem to wrap my head around how. Do I need another CTE?
Any help would be greatly appreciated!
EDIT: Maybe I need to explain it better.... If I have this;
The balance for line 2 would be the prev_mnthendbal from line 1 ($1,134.15). Then the prev_mnthendbal from line 2 would be the balance - netamt ($1,134.15 - (-$1,436) = $2,570.15). I have been trying to use CTEs, but I can't seem to figure out how to populate the balance field with the prev_mnthendbal from the previous line (since it isn't calculated until the balance is available). Maybe I can't use CTE? Do I need to use cursor?
Turns out that I needed to combine a running total with the sequential CTE I was using to begin with.
;
with CTEtest AS
(SELECT ROW_NUMBER() OVER (PARTITION BY memberid ORDER BY effective_year DESC, effective_month DESC) AS Sequence, *
FROM txn_by_month)
,test
as (select * , balance - netamt as running_sum from CTEtest where sequence = 1
union all
select t.*, t1.running_sum - t.netamt from CTEtest t inner join test t1
on t.memberid = t1.memberid and t.sequence = t1.Sequence+1 where t.sequence > 1)
select * from test
order by memberid, Sequence
Hopefully this will help someone else in the future.
See LEAD/LAG analytic functions.
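For reference, a hedged sketch of the same running balance done with a windowed SUM instead of the recursive CTE (assuming SQL Server 2012 or later, which supports the ROWS frame, and the txn_by_month table from the question):
WITH ordered AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY memberid, accountid
                              ORDER BY CAST(effective_year AS int) DESC,
                                       CAST(effective_month AS int) DESC) AS seq
    FROM txn_by_month
)
SELECT memberid, accountid, effective_year, effective_month, netamt,
       -- start from the one known balance per account (MAX ignores the NULLs)
       -- and subtract every net amount from the newest month down to this row
       MAX(balance) OVER (PARTITION BY memberid, accountid)
         - SUM(netamt) OVER (PARTITION BY memberid, accountid
                             ORDER BY seq
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS prev_mnthendbal
FROM ordered
ORDER BY memberid, accountid, seq;
For member 10001 this yields 1134.15 for 2012-12 and 2570.15 for 2012-11, matching the prev_mnthendbal in the sample data and the recursive-CTE result.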

Sorting twice on same column

I'm having a bit of a weird question, given to me by a client.
He has a list of data, with a date between parentheses like so:
Foo (14/08/2012)
Bar (15/08/2012)
Bar (16/09/2012)
Xyz (20/10/2012)
However, he wants the list to be displayed as follows:
Foo (14/08/2012)
Bar (16/09/2012)
Bar (15/08/2012)
Xyz (20/10/2012)
(notice that the second Bar has moved up one position)
So, the logic behind it is, that the list has to be sorted by date ascending, EXCEPT when two rows have the same name ('Bar'). If they have the same name, it must be sorted with the LATEST date at the top, while staying in the other sorting order.
Is this even remotely possible? I've experimented with a lot of ORDER BY clauses, but couldn't find the right one. Does anyone have an idea?
I should have specified that this data comes from a table in a sql server database (the Name and the date are in two different columns). So I'm looking for a SQL-query that can do the sorting I want.
(I've dumbed this example down quite a bit, so if you need more context, don't hesitate to ask)
This works, I think
declare #t table (data varchar(50), date datetime)
insert #t
values
('Foo','2012-08-14'),
('Bar','2012-08-15'),
('Bar','2012-09-16'),
('Xyz','2012-10-20')
select t.*
from #t t
inner join (select data, COUNT(*) cg, MAX(date) as mg from #t group by data) tc
on t.data = tc.data
order by case when cg>1 then mg else date end, date desc
produces
data date
---------- -----------------------
Foo 2012-08-14 00:00:00.000
Bar 2012-09-16 00:00:00.000
Bar 2012-08-15 00:00:00.000
Xyz 2012-10-20 00:00:00.000
A way with better performance than any of the other posted answers is to do it entirely with an ORDER BY, rather than with a JOIN or a CTE:
DECLARE #t TABLE (myData varchar(50), myDate datetime)
INSERT INTO #t VALUES
('Foo','2012-08-14'),
('Bar','2012-08-15'),
('Bar','2012-09-16'),
('Xyz','2012-10-20')
SELECT *
FROM #t t1
ORDER BY (SELECT MIN(t2.myDate) FROM #t t2 WHERE t2.myData = t1.myData), T1.myDate DESC
This does exactly what you request and will work with any indexes and much better with larger amounts of data than any of the other answers.
Additionally it's much more clear what you're actually trying to do here, rather than masking the real logic with the complexity of a join and checking the count of joined items.
This one uses analytic functions to perform the sort; it only requires one SELECT from your table.
The inner query finds gaps, where the name changes. These gaps are used to identify groups in the next query, and the outer query does the final sorting by these groups.
I have tried it here (SQL Fiddle) with extended test-data.
SELECT name, dat
FROM (
SELECT name, dat, SUM(gap) over(ORDER BY dat, name) AS grp
FROM (
SELECT name, dat,
CASE WHEN LAG(name) OVER (ORDER BY dat, name) = name THEN 0 ELSE 1 END AS gap
FROM t
) x
) y
ORDER BY grp, dat DESC
Extended test-data
('Bar','2012-08-12'),
('Bar','2012-08-11'),
('Foo','2012-08-14'),
('Bar','2012-08-15'),
('Bar','2012-08-16'),
('Bar','2012-09-17'),
('Xyz','2012-10-20')
Result
Bar 2012-08-12
Bar 2012-08-11
Foo 2012-08-14
Bar 2012-09-17
Bar 2012-08-16
Bar 2012-08-15
Xyz 2012-10-20
I think that this works, including the case I asked about in the comments:
declare #t table (data varchar(50), [date] datetime)
insert #t
values
('Foo','20120814'),
('Bar','20120815'),
('Bar','20120916'),
('Xyz','20121020')
; With OuterSort as (
select *,ROW_NUMBER() OVER (ORDER BY [date] asc) as rn from #t
)
--Now we need to find contiguous ranges of the same data value, and the min and max row number for such a range
, Islands as (
select data,rn as rnMin,rn as rnMax from OuterSort os where not exists (select * from OuterSort os2 where os2.data = os.data and os2.rn = os.rn - 1)
union all
select i.data,rnMin,os.rn
from
Islands i
inner join
OuterSort os
on
i.data = os.data and
i.rnMax = os.rn-1
), FullIslands as (
select
data,rnMin,MAX(rnMax) as rnMax
from Islands
group by data,rnMin
)
select
*
from
OuterSort os
inner join
FullIslands fi
on
os.rn between fi.rnMin and fi.rnMax
order by
fi.rnMin asc,os.rn desc
It works by first computing the initial ordering in the OuterSort CTE. Then, using two CTEs (Islands and FullIslands), we compute the parts of that ordering in which the same data value appears in adjacent rows. Having done that, we can compute the final ordering by any value that all adjacent values will have (such as the lowest row number of the "island" that they belong to), and then within an "island", we use the reverse of the originally computed sort order.
Note that this may, though, not be too efficient for large data sets. On the sample data it shows up as requiring 4 table scans of the base table, as well as a spool.
Try something like...
ORDER BY CASE date
WHEN '14/08/2012' THEN 1
WHEN '16/09/2012' THEN 2
WHEN '15/08/2012' THEN 3
WHEN '20/10/2012' THEN 4
END
In MySQL, you can do:
ORDER BY FIELD(date, '14/08/2012', '16/09/2012', '15/08/2012', '20/10/2012')
In Postgres, you can create a function FIELD and do:
CREATE OR REPLACE FUNCTION field(anyelement, anyarray) RETURNS numeric AS $$
SELECT
COALESCE((SELECT i
FROM generate_series(1, array_upper($2, 1)) gs(i)
WHERE $2[i] = $1),
0);
$$ LANGUAGE SQL STABLE
If you do not want to use the CASE, you can try to find an implementation of the FIELD function for SQL Server.
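Alternatively, a rough sketch of getting the same effect in SQL Server without writing a function, by joining to a VALUES list that carries an explicit position (dates written as unambiguous yyyymmdd literals; table and column names follow the @t example above):
DECLARE @t TABLE (data varchar(50), [date] datetime)
INSERT INTO @t VALUES
    ('Foo','20120814'),
    ('Bar','20120815'),
    ('Bar','20120916'),
    ('Xyz','20121020')

SELECT t.data, t.[date]
FROM @t AS t
LEFT JOIN (VALUES
    (CAST('20120814' AS datetime), 1),
    (CAST('20120916' AS datetime), 2),
    (CAST('20120815' AS datetime), 3),
    (CAST('20121020' AS datetime), 4)
) AS f([date], pos)
    ON t.[date] = f.[date]
ORDER BY f.pos   -- rows not in the list get a NULL position and sort first, much like FIELD() returning 0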