dynamically varied number of conditions in the 'where' statement using LINQ - sql

I'm working on my first project using LINQ (in mvc), so there is probably something very simple that I missed. However, a day of searching and experimenting has not turned up anything that works, hence the post.
I'm trying to write a LINQ query (LINQ to SQL) whose where clause contains a variable number of conditions joined by OR or AND. We don't know how many conditions the query will have until runtime. This is for a search filter control, where the user can select multiple criteria to filter by.
select * from table
where table.col = 1
OR table.col = 2
OR table.col = 7
.... 'number of other conditions
Before I would just construct the SQL query as a string while looping over all conditions. However, it seems like there should be a nice way of doing this in LINQ.
I have tried looking using expression trees, but they seem a bit over my head for the moment. Another idea was to execute a lambda function inside the where statement, like so:
For Each value In values
    matchingRows = matchingRows.Where(Function(row) row.col = value)
Next
However, this only works for AND conditions. How do I do ORs?

I would use PredicateBuilder for this. It makes dynamic WHERE clauses very easy.

AND is easy - you can just call Where in a loop. OR is much trickier. You mention SQL, so I'm assuming this is something like LINQ-to-SQL, in which case one way I've found to do this involves building custom Expression trees at runtime - like so (the example is C#, but let me know if you need help translating it to VB; my VB isn't fantastic any more, so I'll let you try first... you can probably read C# better than I can write VB).
Unfortunately, this won't work with EF in 3.5SP1 (due to the Expression.Invoke), but I believe this is fixed in 4.0.

Something like this should work (forgive my VB):
' Requires: Imports System.Linq.Expressions
' SomeType is the row type; the values must have the same type as the "col" property.
Dim filter As Expression = Nothing
Dim rowParam As ParameterExpression = Expression.Parameter(GetType(SomeType), "row")
For Each value In values
    ' Builds: row.col = value
    Dim filterPart As Expression = Expression.Equal( _
        Expression.Property(rowParam, "col"), _
        Expression.Constant(value))
    If filter Is Nothing Then
        filter = filterPart
    Else
        ' Chains the parts together with OR
        filter = Expression.OrElse(filter, filterPart)
    End If
Next
If filter IsNot Nothing Then
    matchingRows = matchingRows.Where( _
        Expression.Lambda(Of Func(Of SomeType, Boolean))(filter, rowParam))
End If
No guarantees, however, my VB is a little rusty :-)
But PredicateBuilder might be a better solution if you want to do more complicated stuff than just Ands and Ors.
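For the specific case in the question (one column compared against a list of values), there may be a simpler route than building the tree by hand: call Contains on the list of values, which LINQ to SQL translates into an IN clause. This is a different technique from the expression-tree approach above, not a rewrite of it. A minimal sketch, assuming a hypothetical Row class with an Integer Col property, and running against an in-memory list via AsQueryable rather than a real DataContext:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical row type standing in for the mapped table class.
Public Class Row
    Public Property Col As Integer
    Public Property Name As String
End Class

Public Module ContainsFilterDemo
    ' Keeps the rows whose Col equals any of the given values,
    ' i.e. col = v1 OR col = v2 OR ... for however many values there are.
    Public Function FilterByValues(rows As IQueryable(Of Row),
                                   values As IEnumerable(Of Integer)) As IQueryable(Of Row)
        Dim wanted As List(Of Integer) = values.ToList()
        Return rows.Where(Function(r) wanted.Contains(r.Col))
    End Function

    Public Sub Main()
        Dim data As New List(Of Row) From {
            New Row With {.Col = 1, .Name = "one"},
            New Row With {.Col = 3, .Name = "three"},
            New Row With {.Col = 7, .Name = "seven"}
        }
        ' Prints the names of the rows whose Col is 1, 2 or 7.
        For Each r In FilterByValues(data.AsQueryable(), {1, 2, 7})
            Console.WriteLine(r.Name)
        Next
    End Sub
End Module
```

This only covers "one column equals any of these values"; ORs across different columns or operators still need the expression tree or PredicateBuilder.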


Use String for IF statement conditions

I'm hoping someone can help answer my question, perhaps with an idea of where to go or whether what I'm trying to do is not possible with the way I want to do it.
I've been asked to write a set of rules based on the data held by our ERP form components or variables.
Unfortunately, these components and variables cannot be accessed or used outside of the ERP, so I can't use SQL to query the values and then build some kind of SQL query.
They'd like the ability to put statements like these:
C(MyComponentName) = C(MyOtherComponentName)
V(MyVariableName) > 16
(C(MyComponentName) = "") AND (V(MyVariableName) <> "")
((C(MyComponentName) = "") OR (C(MyOtherComponentName) = "")) AND (V(MyVariableName) <> "")
This should be turned into some kind of query which gets the value of MyComponentName and MyOtherComponentName and (in this case) compares them for equality.
They don't necessarily want to just compare for equality, but to be able to determine whether a component / variable value is greater than, less than, etc.
Basically it's a free-form statement that gets converted into something similar to an IF statement.
I've tried this:
Sub TestCondition()
Dim Condition as string = String.Format("{0} = {1}", _
Component("MyComponent").Value, Component("MyOtherComponent").Value)
If (Condition) Then
' Do Something
Else
' Do Something Else
End If
End Sub
Obviously, this does not work and I honestly didn't think it would be so simple.
Ignoring the fact that I'd have to parse the line, extract the required operators, the values from components or variables (denoted by a C or V) - how can I do this?
I've looked at Expression Trees but these were confusing, especially as I'd never heard of them, let alone used them. (Is it possible to create an expression tree for dynamic if statements? - This link provided some detail on expression trees in C#)
I know an easier way to solve this might be to simply populate the form with a multitude of drop-down lists, so users pick what they want from lists or fill in a text box for a specific search criteria.
This wouldn't be a simple matter, as the ERP doesn't allow you to dynamically create controls on its forms. You have to drag each component on manually, which would be next to useless as we'd potentially want at least one rule for every form we have (100+).
I'm either looking for someone to say you cannot do this the way you want to do it (with a suitable reason or suggestion as to how I could do it) that I can take to my manager or some hints, perhaps a link or 2 pointing me in the right direction.
If (Condition) Then
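This cannot work as written. VB.NET has no built-in way to treat data stored in a string as code, so the statement won't and can't evaluate the comparison the way you want it to. Instead, Condition will be treated as what it is: a string. With Option Strict On the statement does not even compile; with Option Strict Off the string is converted to Boolean, so a numeric string that doesn't boil down to 0 is treated as True (see this question), and text such as "a = b" fails to convert at runtime.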
What you are attempting borders on allowing the user to type code dynamically to get a result. I won't say this is impossible per se in VB.Net, but it is incredibly ambitious.
Instead, I would suggest clearly defining what your application can and can't do. Enumerate the operators your code will allow and build code to support each directly. For example:
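' Relies on late-bound operators on Object, so Option Strict must be Off.
Public Function TestCondition(value1 As Object, value2 As Object, op As String) As Boolean
    Select Case op
        Case "="
            Return value1 = value2
        Case "<"
            Return value1 < value2
        Case ">"
            Return value1 > value2
        Case Else
            ' Error handling
            Throw New ArgumentException("Unsupported operator: " & op)
    End Select
End Function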
Obviously you would need to tailor the above to the types of variables you will be handling and your other specific needs, but this approach should give you a workable solution.
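As one possible way of tailoring it, the comparison can go through System.Collections.Comparer instead of late-bound operators, which lets the function compile under Option Strict On. This is only a sketch: it assumes both values already have the same comparable type (both Integer, both String, and so on), and the set of operators is an illustrative choice rather than anything the ERP dictates.

```vbnet
Option Strict On
Imports System
Imports System.Collections

Public Module ConditionDemo
    ' Compares two values using one of a fixed set of operators.
    ' Comparer.Default calls IComparable.CompareTo, so both values are assumed
    ' to have the same comparable type; mixing types (e.g. Integer vs String)
    ' throws an ArgumentException.
    Public Function TestCondition(value1 As Object, value2 As Object, op As String) As Boolean
        Dim result As Integer = Comparer.Default.Compare(value1, value2)
        Select Case op
            Case "="
                Return result = 0
            Case "<>"
                Return result <> 0
            Case "<"
                Return result < 0
            Case ">"
                Return result > 0
            Case Else
                Throw New ArgumentException("Unsupported operator: " & op)
        End Select
    End Function

    Public Sub Main()
        Console.WriteLine(TestCondition(17, 16, ">"))   ' True
        Console.WriteLine(TestCondition("", "", "<>"))  ' False
    End Sub
End Module
```

Parsing the free-form rule text into (value, operator, value) triples and combining them with AND/OR is still a separate job; this only covers evaluating a single comparison.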
For my particular requirements, using the NCalc library has enabled me to do most of what I was looking to do. Easy to work with and the documentation is quite extensive - lots of examples too.

Using Orderby on BatchedJoinBlock(Of T1, T2) - Dataflow (Task Parallel Library)

I'm just looking to be able to sort the results of a BatchedJoinBlock (http://msdn.microsoft.com/en-us/library/hh194683.aspx) so that the different results of the different targets stay together. I will explain! Example in some pseudo-code:
Dim batchedJoin = New BatchedJoinBlock(Of String, object)(4)
batchedJoin.Target1.Post("String1Target1")
batchedJoin.Target2.Post(CType(BuildIt, StringBuilder1))
batchedJoin.Target1.Post("String1Target2")
batchedJoin.Target2.Post(CType(BuildIt, StringBuilder2))
Dim results = batchedJoin.Receive()
'This sorts one result...
Dim SortByResult = results.Item1.OrderBy(Function(item) item.ToString, New NaturalStringComparer)
Basically I've got a string and an object, the SortByResult variable above sorts the strings exactly as I'd like them to sort. I'm looking for a way to get the objects that used to be at the same index number in target2 into the same order. e.g. if "String1Target1" changes order I'd like to somehow reliably refer to/pair it together with "StringBuilder1". The actual end result just needs to be that the objects (target2) are sorted in the order that is dictated by the strings being sorted (target1). Something like:
Dim EndResult = results.Item2.OrderBy(strings in target1)
but I'll gladly take an intermediate solution! I've also tried using a dictionary (results.Item2.ToDictionary) with the string as a key (which would also be a fine solution), but using lambda expressions in the proper context is a bit beyond my ken. I can realistically do this in several steps with a list or something, but I'm trying to get something more efficient/learn something, and it seems like there are a lot of default options on the results of the BatchedJoinBlock that I'm just not experienced enough to use. Thanks in advance for any help you can provide!
To me, it looks like you don't actually want BatchedJoinBlock, because the two pieces of data always come together. A better option for that would be a BatchBlock of Tuple<string, object>. When you have that, you can then use LINQ directly to sort each batch:
results.OrderBy(Function(tuple) tuple.Item1)
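Put together, that suggestion might look roughly like the sketch below. It assumes the System.Threading.Tasks.Dataflow package is referenced, uses a batch size of 2 purely for illustration, and substitutes StringComparer.Ordinal for the NaturalStringComparer from the question (which isn't shown). SortBatch is a hypothetical helper name.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text
Imports System.Threading.Tasks.Dataflow

Public Module BatchSortDemo
    ' Sorts a batch by its string part; each object travels with its string
    ' because both live in the same tuple.
    Public Function SortBatch(batch As IEnumerable(Of Tuple(Of String, Object))) As List(Of Tuple(Of String, Object))
        Return batch.OrderBy(Function(pair) pair.Item1, StringComparer.Ordinal).ToList()
    End Function

    Public Sub Main()
        ' One block carrying (string, object) pairs instead of two join targets.
        Dim block = New BatchBlock(Of Tuple(Of String, Object))(2)
        block.Post(Tuple.Create(Of String, Object)("b", New StringBuilder("second")))
        block.Post(Tuple.Create(Of String, Object)("a", New StringBuilder("first")))

        ' Receive returns the batch as an array once two items have been posted.
        Dim batch As Tuple(Of String, Object)() = block.Receive()
        For Each pair In SortBatch(batch)
            Console.WriteLine(pair.Item1 & " -> " & pair.Item2.ToString())
        Next
    End Sub
End Module
```

Passing the question's own comparer as the second argument to OrderBy should give the natural ordering it already produces for the strings alone.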

.NET .ToList function is WAY WAY too slow

We're having a lot of trouble here with the .ToList call; it's used in VB.NET in an ASP.NET MVC web project.
We have ~2000 entries in our database, and we use a LINQ query to SELECT and ORDER the 2000 entries. The result is turned into a list by the .ToList method for our pager and grid builder. The problem is that .ToList takes WAY too long (we're talking 40-60 seconds to execute), so our website looks slow as hell.
We tested the equivalent SQL command on the database and it responds quickly, so it's not a problem with the commands or a slow database server. We tried an IEnumerable, which was a lot faster, but we need it in .ToList format at the end for our grids. What's the deal with .ToList? Is there anything we can do?
Here's the code :
list = (From c In _entities.XXXXXXXXSet.Include("XXXXXX").Include("XXXXXX") _
Where Not (c.XXXXXX Is Nothing AndAlso c.XXXXXX = String.Empty) _
And c.XXXXXX = codeClient _
And c.XXXXXX > dateLimite _
Order By c.XXXXXX Descending _
Select c).ToList()
We split the code up to leave the .ToList call on its own, and that's really what takes up all the time. The LINQ command executes in no time.
Thanks a lot.
Tom
Of course the LINQ command "executes" in no time, because it just represents the query. The query is only executed once you iterate over it, which is exactly what the ToList method does.
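The effect is easy to reproduce without a database. The sketch below uses plain LINQ to Objects and a hypothetical call counter (FilterCalls) to show that defining the query runs nothing, and that the filter only runs when ToList iterates it:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Module DeferredExecutionDemo
    ' Counts how many times the filter has actually been evaluated.
    Public FilterCalls As Integer = 0

    Public Function IsEven(n As Integer) As Boolean
        FilterCalls += 1
        Return n Mod 2 = 0
    End Function

    Public Sub Main()
        Dim numbers = New Integer() {1, 2, 3, 4}

        ' Defining the query does not evaluate the filter.
        Dim query = numbers.Where(Function(n) IsEven(n))
        Console.WriteLine(FilterCalls)   ' 0

        ' ToList iterates the query, so the filter runs once per element here.
        Dim result As List(Of Integer) = query.ToList()
        Console.WriteLine(FilterCalls)   ' 4
    End Sub
End Module
```

With a database-backed query the same thing happens, except that the work deferred until ToList is the SQL round trip and the materialization of every row and included entity.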
I would advise you to use the Skip and Take operators in your pagers to narrow down the result queried from the database. Doing this, you only request the 10 or 20 elements or whatever you need, resulting in a much smoother experience.
I think it would be better to page in the query instead of fetching all data in one go, using Skip and Take.
list = (From c In _entities.XXXXXXXXSet.Include("XXXXXX").Include("XXXXXX") _
Where Not (c.XXXXXX Is Nothing AndAlso c.XXXXXX = String.Empty) _
And c.XXXXXX = codeClient _
And c.XXXXXX > dateLimite _
Order By c.XXXXXX Descending _
Select c).Skip(pageSize * pageIndex).Take(pageSize).ToList()
That, paired with some well targeted caching (if possible) should provide a snappier user experience.
When you say "the equivalent SQL command on the database and it responds quickly" - is that the actual SQL statements which the LINQ code is generating or handcoded SQL which is logically equivalent?
Because that LINQ-generated code might not be terribly efficient.
For stuff like this, it's often useful to run the code in the profiler. There could be any number of things slowing down... network, memory, object size, etc.
You could also create your own list and copy the IEnumerable values into it. If it's at all possible, I would recommend changing your grid to accept an IEnumerable.
To confirm the performance of ToList as opposed to query execution, add a statement and compare:
//this call iterates a query, causing a database roundtrip.
List<Row> result = query.ToList();
//this call generates a new List by iterating the old List.
result = result.ToList();
Looking over your query, I suspect you'll need an index on the codeClient column, and an index on each of the tables mentioned in the calls to .Include. Grab the generated SQL and check the execution plan to confirm.

Is it efficient to use LINQ to parse strings?

I've been using LINQ so much in the last couple of weeks that when I had to write a one line function to remove < and > from a string, I found that I had written it as a LINQ query:
Public Function StripLTGT(text As String) As String
Return String.Join("", (From i As Char In text.ToCharArray Where i <> "<"c Where i <> ">"c).ToArray)
End Function
My question is, is it better to do it with LINQ as above or with StringBuilder as I've always done, as below:
Public Function StripLTGT(text As String) As String
Dim a As New StringBuilder(text)
a = a.Replace("<", "")
a = a.Replace(">", "")
Return a.ToString
End Function
Both work. The second one is easier to read, but the first one uses LINQ, which is designed for executing queries against arrays and other enumerables.
Regex.Replace(text, "[<>]", "")
Is much more straightforward.
Or:
myString = myString.Replace("<", "").Replace(">", "")
Whether or not option A, B or C is faster than the others is hard to say because option A may be better on small strings while option B may be better on long strings, etc.
Either one should really be fine in terms of functionality. The first one is not efficient as is. The ToArray call is doing far more work than necessary (if you're on .NET 4.0, it is not needed anyway), and the ToCharArray call is not needed. Basically the characters in the input string are being iterated a lot more than they need to be, and extra arrays are allocated superfluously.
I wouldn't say this particularly matters; but you asked about efficiency, so that's why I mention it.
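For comparison, a trimmed-down LINQ version might look like the sketch below. It relies on String already being enumerable as characters, so ToCharArray is dropped and the filtered characters are copied once into the new string. It is meant to illustrate the point about redundant work, not to claim it beats the Replace-based versions.

```vbnet
Imports System
Imports System.Linq

Public Module StripDemo
    ' Removes every "<" and ">" character from the input.
    Public Function StripLTGT(text As String) As String
        ' A String can be queried directly as a sequence of Char;
        ' the single ToArray feeds the String constructor.
        Return New String(text.Where(Function(ch) ch <> "<"c AndAlso ch <> ">"c).ToArray())
    End Function

    Public Sub Main()
        Console.WriteLine(StripLTGT("<b>bold</b>"))   ' bbold/b
    End Sub
End Module
```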
The second one seems fine to me. Note that if you wanted to go the one-line route, you could still do so with a StringBuilder and I think still have something more concise than the LINQ version (though I haven't counted characters). Whether or not this even outperforms the more direct String.Replace option is kind of unclear to me, though:
' StringBuilder.Replace version:
Return New StringBuilder(text).Replace("<", "").Replace(">", "").ToString()
' String.Replace version:
Return text.Replace("<", "").Replace(">", "")

Do you choose Linq over Forloops?

Given a datatable containing two columns like this:
Private Function CreateDataTable() As DataTable
Dim customerTable As New DataTable("Customers")
customerTable.Columns.Add(New DataColumn("Id", GetType(System.Int32)))
customerTable.Columns.Add(New DataColumn("Name", GetType(System.String)))
Dim row1 = customerTable.NewRow()
row1.Item("Id") = 1
row1.Item("Name") = "Customer 1"
customerTable.Rows.Add(row1)
Dim row2 = customerTable.NewRow()
row2.Item("Id") = 2
row2.Item("Name") = "Customer 2"
customerTable.Rows.Add(row2)
Dim row3 = customerTable.NewRow()
row3.Item("Id") = 3
row3.Item("Name") = "Customer 3"
customerTable.Rows.Add(row3)
Return customerTable
End Function
Would you use this snippet to retrieve a List(Of Integer) containing all Id's:
Dim table = CreateDataTable()
Dim list1 As New List(Of Integer)
For i As Integer = 0 To table.Rows.Count - 1
list1.Add(CType(table.Rows(i)("Id"), Integer))
Next
Or rather this one:
Dim list2 = (From r In table.AsEnumerable _
Select r.Field(Of Integer)("Id")).ToList()
This is not a question about whether to type cast the Id column to Integer by using .Field(Of Integer), CType, CInt, DirectCast or whatever but generally about whether or not you choose Linq over forloops as the subject implies.
For those who are interested: I ran some iterations with both versions which resulted in the following performance graph:
graph http://dnlmpq.blu.livefilestore.com/y1pOeqhqQ5neNRMs8YpLRlb_l8IS_sQYswJkg17q8i1K3SjTjgsE4O97Re_idshf2BxhpGdgHTD2aWNKjyVKWrQmB0J1FffQoWh/analysis.png?psid=1
The vertical axis shows the milliseconds it took the code to convert the rows' ids into a generic list with the number of rows shown on the horizontal axis. The blue line resulted from the imperative approach (forloop), the red line from the declarative code (linq).
Whatever way you generally choose: Why do you go that way and not the other?
Whenever possible I favor the declarative way of programming instead of imperative. When you use a declarative approach the CLR can optimize the code based on the characteristics of the machine. For example if it has multiple cores it could parallelize the execution while if you use an imperative for loop you are basically locking this possibility. Today maybe there's no big difference but I think that in the future more and more extensions like PLINQ will appear allowing better optimization.
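It is worth noting that, as things stand, a plain LINQ to Objects query still runs sequentially; parallel execution is something you opt into with PLINQ's AsParallel. The declarative form does make that a one-call change, which the sketch below tries to show. SumOfSquares is a hypothetical example function, not something from the question.

```vbnet
Imports System
Imports System.Linq

Public Module ParallelDemo
    ' Sums the squares of 1..count. Without AsParallel the same query runs
    ' sequentially; adding it lets PLINQ spread the work across cores.
    Public Function SumOfSquares(count As Integer) As Long
        Return Enumerable.Range(1, count).
            AsParallel().
            Select(Function(n) CLng(n) * n).
            Sum()
    End Function

    Public Sub Main()
        Console.WriteLine(SumOfSquares(3))   ' 14
    End Sub
End Module
```

For small inputs like the three-row table in the question, the overhead of parallelizing would likely outweigh any gain.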
I avoid linq unless it helps readability a lot, because it completely destroys edit-and-continue.
When they fix that, I will probably start using it more, because I do like the syntax a lot for some things.
For almost everything I've done I've come to the conclusion that LINQ is optimized enough. If I handcrafted a for loop it would have better performance, but in the grand scheme of things we are usually talking milliseconds. Since I rarely have a situation where those milliseconds will make any kind of impact, I find it's much more important to have readable code with clear intentions. I would much rather have a call that is 50ms slower than have someone come along and break it altogether!
Resharper has a cool feature that will flag and convert loops into Linq expressions. I will flip it to the Linq version and see if that hurts or helps readability. If the Linq expression more clearly communicates the intent of the code, I will go with that. If the Linq expression is unreadable, I will flip back to the foreach version.
Most of the performance issues don't really compare with readability for me.
Clarity trumps cleverness.
In the above example, I would go with the Linq version since it clearly explains the intent and also stops people from accidentally adding side effects in the loop.
I recently found myself wondering whether I've been totally spoiled by LINQ. Yes, I now use it all the time to pick all sorts of things out of all sorts of collections.
I started to, but found out in some cases, I saved time by using this approach:
for (var i = 0, len = list.Count; i < len; i++) { .. }
Not necessarily in all cases, but some. Most extension methods use the foreach approach of querying.
I try to follow these rules:
Whenever I'm just querying (filtering, projecting, ...) collections, use LINQ.
As soon as I'm actually 'doing' something with the result (i.e, introduce side effects), I'll use a for loop.
So in this example, I'll use LINQ.
Also, I always try to split up the 'query definition' from the 'query evaluation':
Dim query = From r In table.AsEnumerable()
Select r.Field(Of Integer)("Id")
Dim result = query.ToList()
This makes it clear when that (in this case in-memory) query will be evaluated.