Kotlin: how to check if 2 date ranges are overlapping?

My current solution is:
(A.startAt in B.startAt..B.endAt) ||
(A.endAt in B.startAt..B.endAt) ||
(A.startAt <= B.startAt && A.endAt >= B.endAt)
Is there a more idiomatic or more performant check?

How about maxOf(A.startAt, B.startAt) <= minOf(A.endAt, B.endAt)?

The most elegant (in my subjective opinion) is probably
val overlap = a.intersect(b).isNotEmpty()
Or you could do
a.any { b.contains(it) }
// or a.any(b::contains)
which should avoid creating a new collection (which intersect does) - I feel like it's less immediately obvious what it's doing though, the first one reads better to me.
You could manually compare the bounds of the two ranges, which is probably the "most efficient" way because nothing gets iterated. That can be a little tricky though, because you can have descending ranges, which aren't IntRanges, they're just IntProgressions (which IntRange is a subclass of). You also have to cover the case where one range lies entirely inside the other. So you'd have to do this kind of thing:
fun overlap(a: IntProgression, b: IntProgression): Boolean {
    // Normalize the bounds so descending progressions (e.g. 10 downTo 1) work too
    val aMin = minOf(a.first, a.last)
    val aMax = maxOf(a.first, a.last)
    val bMin = minOf(b.first, b.last)
    val bMax = maxOf(b.first, b.last)
    // Bounds only: with a step > 1 the bounds can overlap without a shared element
    return aMin <= bMax && bMin <= aMax
}
and I mean, is it really worth it? (Maybe it is! But you should benchmark it if so, because you're adding complexity and there should be a concrete benefit to that tradeoff)

B.endAt >= A.startAt && B.startAt <= A.endAt
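To see why that two-comparison check is enough, here is a minimal sketch in Java using java.time.LocalDate (the same comparisons carry over to Kotlin unchanged). The DateRange class and its of() factory are hypothetical helpers introduced for illustration; only startAt and endAt come from the question, and both ends are assumed to be inclusive. It also includes the max/min formulation so the two can be compared side by side.

```java
import java.time.LocalDate;

// Hypothetical inclusive date range; startAt/endAt mirror the names in the question.
final class DateRange {
    final LocalDate startAt;
    final LocalDate endAt;

    DateRange(LocalDate startAt, LocalDate endAt) {
        this.startAt = startAt;
        this.endAt = endAt;
    }

    // Convenience factory for ISO-8601 strings such as "2024-01-31".
    static DateRange of(String startAt, String endAt) {
        return new DateRange(LocalDate.parse(startAt), LocalDate.parse(endAt));
    }

    // Two inclusive ranges overlap when each one starts no later than the other ends.
    boolean overlaps(DateRange other) {
        return !other.endAt.isBefore(startAt) && !other.startAt.isAfter(endAt);
    }

    // Equivalent form: the later start must not come after the earlier end.
    boolean overlapsViaMaxMin(DateRange other) {
        LocalDate latestStart = startAt.isAfter(other.startAt) ? startAt : other.startAt;
        LocalDate earliestEnd = endAt.isBefore(other.endAt) ? endAt : other.endAt;
        return !latestStart.isAfter(earliestEnd);
    }
}
```

Partial overlap, full containment in either direction, and ranges that merely touch on one day all fall out of the same two comparisons, so the three-clause version from the question is not needed.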

Constraint to require a certain number of matches

I am trying to write a hard constraint that requires that a certain value has been chosen a certain number of times. I have a constraint written below, which (I think) filters to a set of results that match these criteria, and I want it to penalize if there are no such results. I cannot figure out how to work .ifNotExists() into this. I think I am missing some understanding.
fun cpMustUseN(constraintFactory: ConstraintFactory): Constraint {
    return constraintFactory.forEach(MealMenu::class.java)
        .join(CpMustUse::class.java, equal({ mm -> mm.slottedCp!!.id }, CpMustUse::cpId))
        .groupBy({ _, cpMustUse -> cpMustUse.numRequired }, countBi())
        .filter { numRequired, count -> count >= numRequired }
        .penalize(HardSoftScore.ONE_HARD)
        .asConstraint("cpMustUseN")
}
MealMenu is an entity:
@PlanningEntity
class MealMenu {
    @PlanningId
    var id = 0
    @PlanningVariable(valueRangeProviderRefs = ["cpRange"])
    var slottedCp: Cp? = null
}
CpMustUse is a @ProblemFactCollectionProperty on my solution class, and the class looks like this:
class CpMustUse {
    var cpId = 1
    var numRequired = 4
}
I want to, in this case, constrain the result such that cpId 1 is chosen at least 4 times.
There are two conceptual issues here:
groupBy() will only match if the join returns a non-zero number of matches. Therefore you will never get a countBi() of zero - in that case, groupBy() will simply never match. Therefore you can not use grouping to check that something does not exist.
ifNotExists() always applies to a fact from the working memory. You can not use it to check if a result of a previous calculation exists.
Combined together, this makes your approach infeasible. This particular requirement will be a bit trickier to implement.
Start by inverting the logic of the constraint you pasted. Penalize every time count < numRequired; this handles all cases where count >= 1.
Then introduce a second constraint that handles specifically the case where the count would be zero. For that you should be able to use forEach(CpMustUse::class.java).ifNotExists(MealMenu::class.java, ...), which matches every CpMustUse whose cpId no MealMenu has selected.
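The point about grouping never producing a zero count is not specific to constraint streams. As a rough analogy only, here is a plain Java sketch using java.util.stream; it does not use the OptaPlanner API, and the class and method names are made up for illustration. It shows why the "too few" case and the "never chosen" case need two separate checks.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Plain-Java analogy (not OptaPlanner code); all names here are hypothetical.
final class MustUseCheck {
    // Counts how often each cp id was chosen; ids that were never chosen get no entry at all.
    static Map<Integer, Long> countChosen(List<Integer> chosenCpIds) {
        return chosenCpIds.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Case 1: the id shows up in the grouping, but fewer than numRequired times.
    static boolean tooFew(Map<Integer, Long> counts, int cpId, int numRequired) {
        Long count = counts.get(cpId);
        return count != null && count < numRequired;
    }

    // Case 2: the id is missing from the grouping altogether (the count would be zero).
    static boolean neverChosen(Map<Integer, Long> counts, int cpId) {
        return !counts.containsKey(cpId);
    }
}
```

With chosen ids 1, 1, 2 and a requirement of four uses, id 1 is caught by tooFew, while an unused id 3 is only caught by neverChosen, because the grouping has no entry for it.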

Query for Excel or DB2

How can I write this as a formula in Excel, and how can I use the same logic in a DB2 query?
IF:
Month(max(A_dt)) <= month(B_dt)
that is being used
THEN:
day=01, month=month(A_dt), year=year(B_dt) + 1
ELSE:
day=01 Month=Month(A_dt) Year=Year(B_dt)
This question is quite vague and hard to answer without knowing the table structure and data set. Following the logic that you outline, year is the only output that changes based on the condition.
The query below contains three simple CASE expressions to handle your condition. The first two are redundant, because they return the same value whether the condition passes or fails (per your logic above), but I included them to illustrate the point in case the conditional is what is confusing you.
I would suggest reading the DB2 manual for your specific flavor or working through some introductory SQL tutorials. CASE is pretty fundamental and quite useful in query writing.
select
    _key
    ,case when month(max(a_dt)) <= month(b_dt)
        then 01
        else 01
     end as day
    ,case when month(max(a_dt)) <= month(b_dt)
        then month(a_dt)
        else month(a_dt)
     end as month
    ,case when month(max(a_dt)) <= month(b_dt)
        then year(b_dt) + 1
        else year(b_dt)
     end as year
from
    your_table
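Since only the year depends on the condition, the whole rule collapses to a single date computation. The following is a small Java sketch of that rule, outside of SQL or Excel; the class name, method name, and parameter names are hypothetical, and aDt is assumed to already hold the maximum A_dt.

```java
import java.time.LocalDate;

// Hypothetical helper; aDt stands in for max(A_dt) and bDt for B_dt.
final class RolloverDate {
    static LocalDate compute(LocalDate aDt, LocalDate bDt) {
        // The year moves forward by one only when A's month is not later than B's month.
        int year = aDt.getMonthValue() <= bDt.getMonthValue()
                ? bDt.getYear() + 1
                : bDt.getYear();
        // Day is always 1 and the month always comes from A, regardless of the condition.
        return LocalDate.of(year, aDt.getMonthValue(), 1);
    }
}
```

For example, with A_dt in March and B_dt in May 2021 the condition holds and the result lands in March 2022; with A_dt in August the result stays in August 2021.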

SPOTFIRE how do I shape by column in the map chart with a custom expression

I have a Spotfire question. I am attempting to create Shapes by Column Values in a Map Chart.
I would like to create these shapes with a custom expression. The custom expression is below; I have simplified it so it is easier to read.
All I am saying is:
If(((Current month's oil rate - 12MonthAgo Oilrate) / 12MonthAgo Oilrate) > 0, "UP", "DOWN")
When I run this calculation, though, it only gives me one value (there are both positives and negatives, so it should give two).
What have I done wrong? Any help is appreciated.
<If(((Sum(If(CurrentMonth,[OILRATECD],null))
-
Sum(If(12MonthsAgo,[OILRATECD],null)))
/
Sum(If(12MonthsAgo,[OILRATECD],null)))>0,"UP","DOWN")>
ORIGINAL EQUATION:
<If(((((Sum(If((Month([DATE])=Month(DateAdd("mm",-2,DateTimeNow()))) and (Year([DATE])=Year(DateAdd("mm",-2,DateTimeNow()))),[OILRATECD],null))
-
Sum(If((Month([DATE])=Month(DateAdd("mm",-${MonthInterval}-2,DateTimeNow()))) and (Year([DATE])=Year(DateAdd("mm",-${MonthInterval}-2,DateTimeNow()))),[OILRATECD],null)))))
/
Sum(If((Month([DATE])=Month(DateAdd("mm",-${MonthInterval}-2,DateTimeNow()))) and (Year([DATE])=Year(DateAdd("mm",-${MonthInterval}-2,DateTimeNow()))),[OILRATECD],null)))>0,"UP","DOWN")>
First, you have a lot of unnecessary parentheses but that shouldn't hurt anything.
If(
(
( --this open parenthesis is unneeded
( --this open parenthesis is unneeded
(
Sum(If((Month([DATE])=Month(DateAdd("mm",-2,DateTimeNow()))) and (Year([DATE])=Year(DateAdd("mm",-2,DateTimeNow()))),[OILRATECD],null))
-
Sum(If((Month([DATE])=Month(DateAdd("mm",-${MonthInterval}-2,DateTimeNow()))) and (Year([DATE])=Year(DateAdd("mm",-${MonthInterval}-2,DateTimeNow()))),[OILRATECD],null))
)
) --this closing parenthesis is unneeded
) --this closing parenthesis is unneeded
/
Sum(If((Month([DATE])=Month(DateAdd("mm",-${MonthInterval}-2,DateTimeNow()))) and (Year([DATE])=Year(DateAdd("mm",-${MonthInterval}-2,DateTimeNow()))),[OILRATECD],null))
)
>0,"UP","DOWN")
You are not getting both UP and DOWN because the condition isn't being met the way you expect. We would need a sample data set with expected output to verify this.
However, here's one reason you could be getting unexpected results: the NULL in your SUM(IF(...,[OILRATECD],NULL)) expression.
TL;DR
If the condition in your IF() statement never evaluates to true,
then NULL is returned to SUM(), and SUM(NULL,NULL,NULL) = NULL,
and NULL is neither > 0 nor < 0, so your outer IF()
statement returns NULL instead of "UP" or "DOWN".
LONG VERSION
Spotfire ignores NULL when evaluating SUM(), which is what you want. For example, Sum(4,3,NULL) = 7, and this is the behavior we want. However...
Spotfire doesn't ignore NULL for addition, subtraction, division, or comparison operators like >. So, 4 - NULL = NULL and NULL / 18 = NULL and so on. This means that if either of your two SUM() calls returns NULL, then your entire expression will be NULL because...
NULL isn't > 0, nor is it < 0, and certainly not = 0. NULL is the absence of a value, and thus can't be compared or equated to anything. For example, If(NULL > 1,"YES","NO") doesn't return YES or NO... it returns NULL, the lack of a value. Likewise, If(NULL=NULL,"YES","NO") will also return NULL.
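The behavior described above can be mimicked outside Spotfire. The following is a plain Java analogy, not Spotfire's expression language: nullable Double values stand in for NULL, and the class and method names are invented for illustration. The aggregate skips nulls, while the arithmetic and the comparison let a single null swallow the whole result.

```java
import java.util.List;

// Plain-Java analogy of the NULL rules described above; all names are hypothetical.
final class NullSemantics {
    // Like an aggregate that skips nulls, but yields null when every input is null.
    static Double sumIgnoringNulls(List<Double> values) {
        Double total = null;
        for (Double v : values) {
            if (v != null) {
                total = (total == null) ? v : total + v;
            }
        }
        return total;
    }

    // Like the outer If(): a null on either side propagates instead of picking a branch.
    static String trend(Double current, Double previous) {
        if (current == null || previous == null) {
            return null; // neither "UP" nor "DOWN"
        }
        return (current - previous) / previous > 0 ? "UP" : "DOWN";
    }
}
```

If no row matches one of the two month filters, the corresponding sum is null, and trend returns null for every marker, which would explain seeing fewer distinct values than expected.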
HOW TO GET AROUND THIS
Use IS NULL and IS NOT NULL in an IF() statement to set a default value, or use 0 in place of NULL in your current expression.
Sum(If((Month([DATE])=Month(DateAdd("mm",-2,DateTimeNow()))) and (Year([DATE])=Year(DateAdd("mm",-2,DateTimeNow()))),[OILRATECD],0))
BIG SIDE NOTE
You said your equation's pseudo code is:
If(((Current month's oil rate - 12MonthAgo Oilrate) / 12MonthAgo Oilrate) > 0, "UP", "DOWN")
This doesn't seem to be what you are evaluating. Instead, I read:
x = (-var) + (-2), thus x < -2 for any positive var (e.g. -3, -6, -55)
If(((Sum([2 months ago oil rate]) - Sum([x months ago oil rate])) / Sum([x months ago oil rate])) > 0, "UP", "DOWN")
So you aren't ever looking at the current month's oil rate, but instead at the current month - 2. Also, you are looking at the SUM() over that entire month. This may be what you want, but you may actually be looking for Max() or Min().
OVER FUNCTION
You can handle this a lot easier most likely with a few calculated columns to keep your data / expressions clean and legible
DatePart("yy",[Date]) as [Year]
DatePart("mm",[Date]) as [Month]
Max([OILRATECD]) OVER (Intersect([Year],[Month])) as [YearMonthRate] or use SUM() if that's what you really want.
Check to see the difference with Rank()

Number between a and b - non-inclusive on a, inclusive on b

(I'm a little new to SQL) I have a lot of queries I'm re-writing which have a where clause like this:
where some_number > A
and some_number <= B
I want to use a single where clause (fewer lines, it isn't faster/slower is it?) like this:
where some_number between A and B
The problem is the first clause is exclusive on A and inclusive on B. Is there any way I can specify "inclusivisity" on a single line like the second query? Thanks.
A couple of points...
Firstly, it's only "fewer lines" if you use fewer lines. I would format it like this:
where some_number > A and some_number <= B
because it's really one range condition with each end of the range coded separately.
Secondly, it's actually no faster or slower than the between version, because under the covers between A and B gets converted to:
where (some_number >= A) and (some_number <= B)
so the performance is identical.
Basically, don't worry about it.
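If some_number is an integer, you can offset A by 1 and write: where some_number between A + 1 and B. This does not work for decimal or floating-point columns.
Or just use your first syntax; it's easier to read.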
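To make the limits of the offset trick concrete, here is a tiny Java sketch comparing the original predicate with a BETWEEN-style predicate whose lower bound is shifted by one. The class and method names are hypothetical; doubles are used so that non-integer inputs can be tried as well.

```java
// Hypothetical helper contrasting the two predicates.
final class HalfOpenRange {
    // The original WHERE clause: exclusive on a, inclusive on b.
    static boolean inRange(double x, double a, double b) {
        return x > a && x <= b;
    }

    // BETWEEN-style check (inclusive on both ends) with the lower bound shifted by one.
    static boolean betweenShifted(double x, double a, double b) {
        return x >= a + 1 && x <= b;
    }
}
```

For whole numbers the two agree, but a value such as 4.5 with A = 4 passes the original check and fails the shifted one, which is why the explicit pair of comparisons is the safer choice.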

Maths! Approximating the mean, without storing the whole data set

Obvious (but expensive) solution:
I would like to store rating of a track (1-10) in a table like this:
TrackID
Vote
And then a simple
SELECT AVG(Vote) FROM `table` WHERE `TrackID` = some_val
to calculate the average.
However, I am worried about scalability on this, especially as it needs to be recalculated each time.
Proposed, but possibly stupid, solution:
TrackID
Rating
NumberOfVotes
Every time someone votes, the Rating is updated with
new_rating = ((old_rating * NumberOfVotes) + vote) / (NumberOfVotes + 1)
and stored as the TrackID's new Rating value. Now whenever the Rating is wanted, it's a simple lookup, not a calculation.
Clearly, this does not calculate the mean. I've tried a few small data sets, and it approximates the mean. I believe it might converge as the data set increases? But I'm worried that it might diverge!
What do you guys think? Thanks!
Assuming you had infinite numeric precision, that calculation does update the mean correctly. In practice, you're probably using integer types, so it won't be exact.
How about storing the cumulative vote total and the number of votes? (i.e. total=total+vote, numVotes=numVotes+1). That way, you can get the exact mean by dividing one by the other.
This approach will only break if you get so many votes that you overflow the range of the data type you're using. A signed 32-bit total tops out at 2,147,483,647 rating points, which at a maximum vote of 10 is still more than 200 million votes, so use a 64-bit type if you expect more than that!
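The reason the update rule is exact is that old_rating * NumberOfVotes is simply the old sum of all votes, so adding the new vote and dividing by the new count gives the new mean. Here is a minimal Java sketch of that rule with floating-point storage; the class name and methods are hypothetical, and the field names only loosely follow the question.

```java
// Hypothetical in-memory version of the per-track Rating / NumberOfVotes columns.
final class RunningMean {
    private double rating = 0.0;
    private long numberOfVotes = 0;

    // Applies new_rating = (old_rating * NumberOfVotes + vote) / (NumberOfVotes + 1).
    void addVote(int vote) {
        rating = (rating * numberOfVotes + vote) / (numberOfVotes + 1);
        numberOfVotes++;
    }

    double rating() {
        return rating;
    }

    long numberOfVotes() {
        return numberOfVotes;
    }
}
```

After votes of 7, 8 and 10 the stored rating equals 25/3 up to floating-point rounding, the same value a full average over all three votes would give.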
Store TrackId, RatingsSum, NumberOfVotes in your table.
Every time someone votes,
NumberOfVotes = NumberOfVotes + 1
RatingsSum = RatingsSum + [rating supplied by user]
Then when selecting
SELECT TrackId, RatingsSum / NumberOfVotes FROM ...
Your solution is completely legit, and differs only by roughly a few times the floating-point precision from a value calculated from the full source set.
You can certainly calculate a running mean and standard deviation without having all the points in hand. You merely need to accumulate the sum, sum of squares, and number of points.
It's not an approximation; the mean and standard deviation are exact.
Here's a Java class that demonstrates. You can adapt to your SQL solution as needed:
package statistics;

public class StatsUtils
{
    private double sum;
    private double sumOfSquares;
    private long numPoints;

    public StatsUtils()
    {
        this.init();
    }

    private void init()
    {
        this.sum = 0.0;
        this.sumOfSquares = 0.0;
        this.numPoints = 0L;
    }

    public void addValue(double value)
    {
        // Check for overflow in either number of points or sum of squares; reset if overflow is detected
        if ((this.numPoints == Long.MAX_VALUE) || (this.sumOfSquares > (Double.MAX_VALUE-value*value)))
        {
            this.init();
        }
        this.sum += value;
        this.sumOfSquares += value*value;
        ++this.numPoints;
    }

    public double getMean()
    {
        double mean = 0.0;
        if (this.numPoints > 0)
        {
            mean = this.sum/this.numPoints;
        }
        return mean;
    }

    public double getStandardDeviation()
    {
        double standardDeviation = 0.0;
        if (this.numPoints > 1)
        {
            standardDeviation = Math.sqrt((this.sumOfSquares - this.sum*this.sum/this.numPoints)/(this.numPoints-1L));
        }
        return standardDeviation;
    }

    public long getNumPoints() { return this.numPoints; }
}
Small improvement on your solution. You have the table:
TrackID
SumOfVotes
NumberOfVotes
When someone votes,
NumberOfVotes = NumberOfVotes + 1
SumOfVotes = SumOfVotes + ThisVote
and to see the average you only then do a division:
SELECT TrackID, (SumOfVotes/NumberOfVotes) AS Rating FROM `table`
I would add that the original (obvious and expensive) solution is only expensive compared to the provided solution when calculating the average.
It is cheaper when a vote is added, deleted or changed.
I guess that the original table
TrackID
Vote
VoterID
would still need to be used in the provided solution to keep track of the vote (rating) of every voter. So, two tables have to be updated for every change in this table (insert, delete or Vote update).
In other words, the original solution may be the best way to go.