I am using MS SQL and need some advice on how to construct a query. Essentially I have a file of fuel transactions (credit card swipes) in which the current odometer reading is captured. I am trying to construct a query using the vehicle number (unique ID for the vehicle), the transaction date, and the current odometer reading to calculate a new column that, for a given fuel transaction, finds the prior transaction (based on transaction date) for that vehicle and then calculates the miles driven between the two data points.
I am struggling with identifying the prior transaction. Any help would be appreciated to help me get started. I am not looking for the specific script, but just some pseudo code would help get me going.
If you want to get specific, here are the key columns: CompanyVehicleNumber, TransactionDate (format YYYYMMDD), TransactionTime (format HHMMSS), Odometer (e.g. 123456).
Thanks.
You can use APPLY to get the previous transaction:
SELECT
t.*, MilesDiff = t.odometer - x.odometer
FROM tbl t
OUTER APPLY(
SELECT TOP 1 odometer
FROM tbl
WHERE
CompanyVehicleNumber = t.CompanyVehicleNumber
AND (TransactionDate + TransactionTime) < (t.TransactionDate + t.TransactionTime)
ORDER BY (TransactionDate + TransactionTime) DESC
) x(odometer)
Note here that you need to convert the TransactionDate and TransactionTime to a DATETIME to be able to compare the transactions.
Here is one way to convert dates and times:
DECLARE @date VARCHAR(8) = '20130101'
DECLARE @time VARCHAR(6) = '053000'
SELECT
CAST(@date AS DATETIME) +
CAST(LEFT(@time, 2) + ':' + SUBSTRING(@time, 3, 2) + ':' + RIGHT(@time, 2) AS DATETIME)
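If you happen to be on SQL Server 2012 or later (an assumption; no version is stated above), the LAG window function can fetch the previous odometer reading directly, without a correlated subquery. A minimal sketch against the columns named in the question; since TransactionDate (YYYYMMDD) and TransactionTime (HHMMSS) are zero-padded strings, ordering by them sorts chronologically:
SELECT
    t.CompanyVehicleNumber,
    t.TransactionDate,
    t.TransactionTime,
    t.Odometer,
    -- previous reading for the same vehicle, in date/time order
    MilesDiff = t.Odometer - LAG(t.Odometer) OVER (
                    PARTITION BY t.CompanyVehicleNumber
                    ORDER BY t.TransactionDate, t.TransactionTime)
FROM tbl t;
As with the OUTER APPLY version, the first transaction for each vehicle has no prior reading, so MilesDiff is NULL there.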
I have a table that acts as a message log, with the two key tables being TIMESTAMP and TEXT. I'm working on a query that grabs all alerts (from TEXT) for the past 30 days (based on TIMESTAMP) and gives a daily average for those alerts.
Here is the query so far:
--goback 30 days start at midnight
declare @olderdate as datetime
set @olderdate = DATEADD(Day, -30, DATEDIFF(Day, 0, GetDate()))
--today at 11:59pm
declare @today as datetime
set @today = dateadd(ms, -3, (dateadd(day, +1, convert(varchar, GETDATE(), 101))))
print @today
--Grab average alerts per day over 30 days
select
avg(x.Alerts * 1.0 / 30)
from
(select count(*) as Alerts
from MESSAGE_LOG
where text like 'The process%'
and text like '%has alerted%'
and TIMESTAMP between @olderdate and @today) X
However, I want to add something that checks whether there were any alerts for a day and, if there are no alerts for that day, doesn't include it in the average. For example, if there are 90 alerts for a month but they're all in one day, I wouldn't want the average to be 3 alerts per day since that's clearly misleading.
Is there a way I can incorporate this into my query? I've searched for other solutions to this but haven't been able to get any to work.
This isn't written for your query, as I don't have any DDL or sample data, so I'm going to provide a very simple example of how you would do this instead.
USE Sandbox;
GO
CREATE TABLE dbo.AlertMessage (ID int IDENTITY(1,1),
AlertDate date);
INSERT INTO dbo.AlertMessage (AlertDate)
VALUES('20190101'),('20190101'),('20190105'),('20190110'),('20190115'),('20190115'),('20190115');
GO
--Use a CTE to count per day:
WITH Tots AS (
SELECT AlertDate,
COUNT(ID) AS Alerts
FROM dbo.AlertMessage
GROUP BY AlertDate)
--Now the average
SELECT AVG(Alerts*1.0) AS DayAverage
FROM Tots;
GO
--Clean up
DROP TABLE dbo.AlertMessage;
You're trying to compute a double-aggregate: The average of daily totals.
Without using a CTE, you can try this as well, which is generalized a bit more to work for multiple months.
--get a list of events per day
DECLARE @Event TABLE
(
ID INT NOT NULL IDENTITY(1, 1)
,DateLocalTz DATE NOT NULL--make sure to handle time zones
,YearLocalTz AS DATEPART(YEAR, DateLocalTz) PERSISTED
,MonthLocalTz AS DATEPART(MONTH, DateLocalTz) PERSISTED
)
/*
INSERT INTO @Event(DateLocalTz)
SELECT DISTINCT CONVERT(DATE, TIMESTAMP)--presumed to be in your local time zone because you did not specify
FROM dbo.MESSAGE_LOG
WHERE UPPER([TEXT]) LIKE 'THE PROCESS%' AND UPPER([TEXT]) LIKE '%HAS ALERTED%'--case insensitive
*/
INSERT INTO @Event(DateLocalTz)
VALUES ('2018-12-31'), ('2019-01-01'), ('2019-01-01'), ('2019-01-01'), ('2019-01-12'), ('2019-01-13')
--get average number of alerts per alerting day each month
-- (this will not return months with no alerts,
-- use a LEFT OUTER JOIN against a month list table if you need to include uneventful months)
SELECT
YearLocalTz
,MonthLocalTz
,AvgAlertsOfAlertingDays = AVG(CONVERT(REAL, NumDailyAlerts))
FROM
(
SELECT
YearLocalTz
,MonthLocalTz
,DateLocalTz
,NumDailyAlerts = COUNT(*)
FROM @Event
GROUP BY YearLocalTz, MonthLocalTz, DateLocalTz
) AS X
GROUP BY YearLocalTz, MonthLocalTz
ORDER BY YearLocalTz ASC, MonthLocalTz ASC
Some things to note in my code:
I use PERSISTED columns to get the month and year date parts (because I'm lazy when populating tables)
Use an explicit CONVERT to avoid integer math that truncates decimals. Multiplying by 1.0 is a less-readable hack.
Use CONVERT(DATE, ...) to round down to midnight instead of converting back and forth between strings
Do case-insensitive string searching by making everything uppercase (or lowercase, your preference)
Don't subtract 3 milliseconds to get the very last moment before midnight. Change your semantics to interpret the end of a time range as exclusive, instead of dealing with the precision of your datatypes. The only difference is using explicit comparators (i.e. use < instead of <=). Also, DATETIME resolution is 1/300th of a second, not 3 milliseconds.
Avoid using built-in keywords as column names (i.e. "TEXT"). If you do, wrap them in square brackets to avoid ambiguity.
Instead of dividing by 30 to get the average, divide by the count of distinct days in your results.
select
avg(x.Alerts * 1.0 / x.dd)
from
(select count(*) as Alerts, count(distinct CAST([TIMESTAMP] AS date)) AS dd
...
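For completeness, here is a sketch of how that fragment slots into the original query from the question; it also applies the exclusive end-of-range comparison suggested above, so the -3 ms trick is no longer needed:
declare @olderdate datetime
set @olderdate = DATEADD(Day, -30, DATEDIFF(Day, 0, GetDate()))

declare @tomorrow datetime
set @tomorrow = DATEADD(Day, 1, DATEDIFF(Day, 0, GetDate()))

select
    avg(x.Alerts * 1.0 / x.dd)
from
    (select count(*) as Alerts,
            count(distinct CAST([TIMESTAMP] AS date)) AS dd
     from MESSAGE_LOG
     where [text] like 'The process%'
       and [text] like '%has alerted%'
       and [TIMESTAMP] >= @olderdate
       and [TIMESTAMP] < @tomorrow) x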
I have a table dbo.X with DateTime column Y which may have hundreds of records.
My stored procedure has a parameter @CurrentDate; I want to find the date in column Y of the table dbo.X above which is less than and closest to @CurrentDate.
How to find it?
The WHERE clause will match all rows with a date less than @CurrentDate and, since they are ordered descending, the TOP 1 will be the closest date to the current date.
SELECT TOP 1 *
FROM x
WHERE x.date < @CurrentDate
ORDER BY x.date DESC
Use DateDiff and order your results by how many days or seconds lie between each date and the input.
Something like this
select top 1 rowId, dateCol, datediff(second, @CurrentDate, dateCol) as SecondsBetweenDates
from myTable
where dateCol < @CurrentDate
order by datediff(second, @CurrentDate, dateCol) desc --the differences are negative here, so order descending to put the closest (least negative) date first
I think I have a better solution for this problem.
I will show a few images to support and explain the final solution.
Background
In my solution I have a table of FX rates. These represent market rates for different currencies. However, our service provider has had a problem with the rate feed, and as such some rates have zero values. I want to fill the missing data with the rate for that same currency that is closest in time to the missing rate. Basically, I want to get the RateId for the nearest non-zero rate, which I will then substitute. (This is not shown here in my example.)
1) So to start off, let's identify the missing rates:
Query showing my missing rates, i.e. those with a rate value of zero
2) Next, let's identify the rates that are not missing.
Query showing rates that are not missing
3) This query is where the magic happens. I have made an assumption here which can be removed, but which was added to improve the efficiency/performance of the query: I expect to find a substitute transaction on the same day as that of the missing/zero transaction (the join on the cast dates below).
The magic is the ROW_NUMBER function: it assigns an auto number starting at 1 for the shortest time difference between the missing and non-missing transactions. The next closest transaction gets a RowNum of 2, and so on.
Please note that I must also join on the currencies so that I do not mismatch the currency types. That is, I don't want to substitute an AUD currency with CHF values. I want the closest matching currencies.
Combining the two data sets with a row_number to identify nearest transaction
4) Finally, let's get the data where RowNum is 1
The final query
The full query is as follows:
; with cte_zero_rates as
(
Select *
from fxrates
where (spot_exp = 0 or spot_imp = 0)
),
cte_non_zero_rates as
(
Select *
from fxrates
where (spot_exp > 0 and spot_imp > 0)
)
,cte_Nearest_Transaction as
(
select z.FXRatesID as Zero_FXRatesID
,z.importDate as Zero_importDate
,z.currency as Zero_Currency
,nz.currency as NonZero_Currency
,nz.FXRatesID as NonZero_FXRatesID
,nz.spot_imp
,nz.importDate as NonZero_importDate
,DATEDIFF(ss, z.importDate, nz.importDate) as TimeDifference
,ROW_NUMBER() Over(partition by z.FXRatesID order by abs(DATEDIFF(ss, z.importDate, nz.importDate)) asc) as RowNum
from cte_zero_rates z
left join cte_non_zero_rates nz on nz.currency = z.currency
and cast(nz.importDate as date) = cast(z.importDate as date)
--order by z.currency desc, z.importDate desc
)
select n.Zero_FXRatesID
,n.Zero_Currency
,n.Zero_importDate
,n.NonZero_importDate
,DATEDIFF(s, n.NonZero_importDate,n.Zero_importDate) as Delay_In_Seconds
,n.NonZero_Currency
,n.NonZero_FXRatesID
from cte_Nearest_Transaction n
where n.RowNum = 1
and n.NonZero_FXRatesID is not null
order by n.Zero_Currency, n.NonZero_importDate
I am trying to write a procedure that inserts rows into a temp table. The basis of the table is an insurance policy table listing the amount of the premium earned over the life of the policies. The original data consists of the trans_date (date sold) and the policy_start and policy_end dates, i.e. if the policy is 12 months long, we give each month 1/12 of the premium collected.
so something like
while trans_month < policy_end month
insert to tblUEPtmp
select dateadd(mm, 1, trans_date), earned_premium from tblpolicys
set trans_date = dateadd(mm, 1, trans_date)
(I know this is rubbish code but I'm completely baffled at the moment)
My problem is that I need to create the extra 11 rows of data and modify the transaction date to add 1 month each time until the modified transaction date equals the policy_end date.
I've researched using a CTE, but while loops aren't possible within a CTE.
Is this something a multistatement table function could do?
Many thanks.
You can definitely do this with a CTE; for example, this little snippet will demonstrate how to do recursion using dates:
declare @start DATETIME = '2012-02-01'
declare @end DATETIME = '2013-02-01'
;with cte (date)
AS
(
SELECT @start
UNION ALL
SELECT DATEADD(mm,1,cte.date)
FROM cte WHERE DATEADD(mm,1,cte.date)<@end
)
select * from cte
That will generate a list of dates between @start & @end with month gaps.
You can
Use your real tables in place of the dummy dates
Perform an insert into...select ... from cte to insert your required data
If you can provide more detail about your table schema, I can probably help out with a more concrete example.
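Building on that snippet, here is a rough sketch of the insert against the tables named in the question (policy_id is a hypothetical key column, and splitting the premium as earned_premium / 12.0 is an assumption about how it should be apportioned):
;with months (policy_id, earn_date, policy_end, earned_premium)
AS
(
    -- anchor: the month the policy was sold
    select policy_id, trans_date, policy_end, earned_premium
    from tblpolicys
    UNION ALL
    -- recurse: add one month until the policy end date is reached
    select policy_id, DATEADD(mm, 1, earn_date), policy_end, earned_premium
    from months
    where DATEADD(mm, 1, earn_date) < policy_end
)
insert into tblUEPtmp (policy_id, earn_date, earned_premium)
select policy_id, earn_date, earned_premium / 12.0
from months
option (maxrecursion 100);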
Something like this?
set @trans_date = ...
while @trans_date < @policy_end
begin
insert into tblUEPtmp
select trans_date, earned_premium
from tblpolicys
where {whatever}
set @trans_date = dateadd(mm, 1, @trans_date)
end
How do I create a query which breaks down a frequency of counts based on a list of weeks between 2 different dates in Access?
At the moment I have the following code in T-SQL, but would like to have it run in Access.
declare @fromdate smalldatetime
declare @todate smalldatetime
declare @toptr smalldatetime
declare @fromptr smalldatetime
set @fromdate = '1/11/2010'
set @todate = '27/12/2010'
set @fromptr = dateadd(dd,1 - datepart(weekday,@fromdate), @fromdate)
while @fromptr < @todate
begin
print 'from: ' + cast(@fromptr as nvarchar) + ' --> ' + cast(@toptr as nvarchar)
set @fromptr = dateadd(dd,7, @fromptr)
set @toptr = dateadd(dd,7, @fromptr)
insert into @weeks values (@fromptr, @toptr)
end
I want to somehow bind some rows with lots of dates in them and aggregate them per 'week-ending date' using the dates created in the table variable. Access doesn't seem to allow this kind of SQL query, so I was wondering if there was another way of doing this:
1) either by not using an intermediate table at all, and/or 2) by converting the above code into something Access-compatible.
This will group by week (starting with Sunday) and be faster than other date calculation methods like DateAdd, DateDiff, DatePart, and Format.
SELECT
CDate((([DateColumn] - 1) \ 7) * 7 + 1) AS WeekStartingDate,
Sum([OrderCount]) AS SumOfOrders
FROM
Orders
GROUP BY
CDate((([DateColumn] - 1) \ 7) * 7 + 1);
If you want to see week ending date, add 7 at the end instead of 1. The GROUP BY expression can probably be just ([DateColumn] - 1) \ 7 but I'm not sure.
The backslash performs integer division, dividing by 7 converts a week of dates to a single integer, and the -1 adjusts for the fact that the "zero date" is a Saturday rather than a Sunday. To use a different starting day of the week, adjust the -1 and +1 by the same amount. To use Monday, for example, it would be -2 and +2.
This is language and region independent by depending on VB's internal representation of dates as numbers.
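To illustrate with a concrete value (assuming the standard VB date serial numbers): 9-Jan-2019 is stored internally as day number 43474, so ((43474 - 1) \ 7) * 7 + 1 = 6210 * 7 + 1 = 43471, which is Sunday 6-Jan-2019, the start of that week.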
You can use Format in Access queries: http://msdn.microsoft.com/en-us/library/aa159657(v=office.10).aspx
SELECT Format(Date,"ww") FROM Table
GROUP BY Format(Date,"ww")
The plain-vanilla solution is to introduce a Calendar table which may look something like
Calendar
------------------------
FullDate date
CalendarYear integer
DayNumberInWeek integer
DayNumberInMonth integer
DayNumberInYear integer
DayNumberInEpoch integer
WeekNumberInYear integer
WeekNumberInEpoch integer
MonthNumberInYear integer
MonthNumberInEpoch integer
... and many more that you may need to group by
Then if you have table Counters
Counters
-----------
FullDate date
Value integer -- cumulative for one day
You can:
select
WeekNumberInYear
, sum(Value)
from Calendar as a
join Counters as b on b.FullDate = a.FullDate
where CalendarYear = 2010
group by WeekNumberInYear ;
The easiest way to populate the Calendar is to spend some time in Excel, create 10-20 years' worth of rows and simply import them into the DB (or generate the rows in SQL, as sketched below).
Nothing specific to Access here, but hope you get the idea.
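If the Calendar ends up living on the SQL Server side (where the T-SQL in the question runs), a recursive CTE can generate the rows instead of Excel. A sketch for a ten-year range, filling only a few of the columns listed above; extend the SELECT for the rest:
;with d AS
(
    select cast('20100101' as date) as FullDate
    UNION ALL
    select DATEADD(day, 1, FullDate)
    from d
    where FullDate < '20191231'
)
insert into Calendar (FullDate, CalendarYear, DayNumberInWeek, WeekNumberInYear)
select FullDate,
       DATEPART(year, FullDate),
       DATEPART(weekday, FullDate),
       DATEPART(week, FullDate)
from d
option (maxrecursion 0); -- ~3,650 rows, more than the default recursion limit of 100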
edit: it's worth reviewing the comments section of the first answer to get a clearer idea of the problem.
edit: I'm using SQL Server 2005
Something similar to this was posted before, but I don't think enough information was given by the poster to truly explain what max drawdown is. All my definitions of max drawdown come from (the first two pages of) this paper:
http://www.stat.columbia.edu/~vecer/maxdrawdown3.pdf
Effectively, you have a few terms defined mathematically:
Running maximum, M_t:
M_t = max_{u in [0, t]} S_u
where S_t is the price of a stock, S, at time t.
Drawdown, D_t:
D_t = M_t - S_t
Max drawdown, MDD_t:
MDD_t = max_{u in [0, t]} D_u
So, effectively, what needs to be determined is the set of local maximums and minimums from a set of hi and low prices for a given stock over a period of time.
I have a historical quote table with the following (relevant) columns:
stockid int
day date
hi int --this is in pennies
low int --also in pennies
so for a given date range, you'll see the same stockid every day for that date range.
EDIT:
hi and low are the high and low prices for each day.
Once the local maxes and mins are determined, you can pair every max with every min that comes after it and calculate the difference. From that set, the maximum difference is the "max drawdown".
The hard part, though, is finding those maxes and mins.
edit: it should be noted:
Max drawdown is defined as the value of the hypothetical loss if the stock is bought at its highest buy point and sold at its lowest subsequent sell point. A stock can't be sold at a min value that came before a max value, so if the global min comes before the global max, those two values do not provide enough information to determine the max drawdown.
A brutally inefficient but very simple version using a CTE is below:
WITH DDView
AS (SELECT pd_curr.StockID,
pd_curr.Date,
pd_curr.Low_Price AS CurrPrice,
pd_prev.High_Price AS PrevPrice,
pd_curr.Low_Price / pd_prev.High_Price - 1.0 AS DD
FROM PriceData pd_curr
INNER JOIN PriceData pd_prev
ON pd_curr.StockID = pd_prev.StockID
AND pd_curr.Date >= pd_prev.Date
AND pd_curr.Low_Price <= pd_prev.High_Price
AND pd_prev.Date >= '2001-12-31' -- #param: min_date of analyzed period
WHERE pd_curr.Date <= '2010-09-30' -- #param: max_date of analyzed period
)
SELECT dd.StockID,
MIN(COALESCE(dd.DD, 0)) AS MaxDrawDown
FROM DDView dd
GROUP BY dd.StockID
As you would usually perform the analysis on a specific time period, it would make sense to wrap the query in a stored procedure with the parameters @StartDate, @EndDate and possibly @StockID (a sketch follows). Again, this is quite inefficient by design, O(N^2), but if you have good indices and not a huge amount of data, SQL Server will handle it pretty well.
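A sketch of such a wrapper, reusing the same PriceData columns as above (the procedure name and the optional @StockID filter are illustrative):
CREATE PROCEDURE dbo.GetMaxDrawDown  -- hypothetical name
    @StartDate date,
    @EndDate   date,
    @StockID   int = NULL            -- NULL means all stocks
AS
BEGIN
    SET NOCOUNT ON;

    WITH DDView AS
    (SELECT pd_curr.StockID,
            pd_curr.Low_Price / pd_prev.High_Price - 1.0 AS DD
     FROM PriceData pd_curr
     INNER JOIN PriceData pd_prev
        ON pd_curr.StockID = pd_prev.StockID
        AND pd_curr.Date >= pd_prev.Date
        AND pd_curr.Low_Price <= pd_prev.High_Price
        AND pd_prev.Date >= @StartDate
     WHERE pd_curr.Date <= @EndDate
       AND (@StockID IS NULL OR pd_curr.StockID = @StockID)
    )
    SELECT dd.StockID,
           MIN(COALESCE(dd.DD, 0)) AS MaxDrawDown
    FROM DDView dd
    GROUP BY dd.StockID;
END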
Some things we need to consider in the problem domain:
Stocks have a range of prices every day, often viewed in candlestick charts
let's call the highest price of a day HI
let's call the lowest price of a day LOW
the problem is constrained by time, even if the time constraints are the IPO date and Delisting Dates
the maximum drawdown is the most you could possibly lose on a stock over that timeframe
assuming a LONG strategy: logically, if we are able to determine all local maxes (MAXES) and all local mins (MINS), we could define a set where we pair each MAX with each subsequent MIN and calculate the differences (DIFFS)
Sometimes the difference will result in a negative number; however, that is not a drawdown
therefore, we need to append 0 to the set of diffs and select the max
The problem lies in defining the MAXES and the MINS; with the function of the curve we could apply calculus, but unfortunately we can't. Obviously:
the maxes need to come from the HI and
the MINS need to come from the LOW
One way to solve this is to define a cursor and brute force it. Functional languages have nice toolsets for solving this as well.
For SQL Server and for one stock at a time, try this:
CREATE PROCEDURE dbo.MDDCalc (
@StartDate date,
@EndDate date,
@Stock int)
AS
DECLARE @MinVal Int
DECLARE @MaxVal Int
DECLARE @MaxDate date
SET @MaxVal = (
SELECT MAX(hi)
FROM Table
WHERE Stockid = @Stock
AND [Day] BETWEEN DATEADD(day, -1, @StartDate) AND DATEADD(day, 1, @EndDate))
SET @MaxDate = (
SELECT MIN([Day])
FROM Table
WHERE Stockid = @Stock
AND hi = @MaxVal)
SET @MinVal = (
SELECT MIN(low)
FROM Table
WHERE Stockid = @Stock
AND [Day] BETWEEN DATEADD(day, -1, @MaxDate) AND DATEADD(day, 1, @EndDate))
SELECT (@MaxVal - @MinVal) AS 'MDD'
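A call would then look something like this (the dates and stock id are placeholders):
EXEC dbo.MDDCalc @StartDate = '2010-01-01', @EndDate = '2010-12-31', @Stock = 42;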
I have encountered this problem recently. My solution is like this:
Let the data be: 3,5,7,3,-1,3,-8,-3,0,10
Add the values one by one to a running sum; if the sum is greater than 0, reset it to 0, otherwise keep the sum. The result would be like this:
0,0,0,0,-1,0,-8,-11,-11,-1
The maximum drawdown is the lowest value in that series, -11.
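A sketch of the same idea in T-SQL, interpreting the sample values as per-period price changes and assuming SQL Server 2012+ for the windowed running aggregates (the table and column names are made up for the example). Clipping the running sum at zero is equivalent to subtracting the running maximum of the cumulative sum, which is what the query does:
declare @Returns table (seq int identity(1,1), chg int);
insert into @Returns (chg)
values (3),(5),(7),(3),(-1),(3),(-8),(-3),(0),(10);

;with RunningTotals as
(
    select seq,
           sum(chg) over (order by seq rows unbounded preceding) as cum
    from @Returns
),
Clipped as
(
    select seq,
           cum,
           max(cum) over (order by seq rows unbounded preceding) as peak
    from RunningTotals
)
select min(cum - case when peak > 0 then peak else 0 end) as MaxDrawDown -- -11 for the sample data
from Clipped;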
Is this what you're after?
select StockID,max(drawdown) maxdrawdown
from (
select h.StockID,h.day highdate,l.day lowdate,h.hi - l.low drawdown
from mdd h
inner join mdd l on h.StockID = l.StockID
and h.day<l.day) x
group by StockID;
It's a SQL-based brute-force approach. It compares every low price after today's hi price within the same stock and finds the greatest difference between the two prices. This will be the maximum drawdown.
It doesn't consider the same day as a possible maximum drawdown, as we don't have enough info in the table to determine whether the hi price happened before the low price on that day.
Here is a SQL Server 2005 user-defined function that should return the correct answer for a single stockid very efficiently
CREATE FUNCTION dbo.StockMaxDD(@StockID int, @day datetime) RETURNS int AS
BEGIN
Declare @MaxVal int; Set @MaxVal = 0;
Declare @MaxDD int; Set @MaxDD = 0;
SELECT TOP(99999)
@MaxDD = CASE WHEN @MaxDD < (@MaxVal-low) THEN (@MaxVal-low) ELSE @MaxDD END,
@MaxVal = CASE WHEN hi > @MaxVal THEN hi ELSE @MaxVal END
FROM StockHiLo
WHERE stockid = @StockID
AND [day] <= @day
ORDER BY [day] ASC
RETURN @MaxDD;
END
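A call for a single stock might look like this (the stockid and date are placeholders):
SELECT dbo.StockMaxDD(42, '20101231') AS MaxDD;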
This would not, however, be very efficient for doing a number of stockids at the same time. If you need to do many/all of the stockids at once, then there is a similar, but substantially more difficult approach that can do that very efficiently.