Extending WCF Data Service to synthesize missing data on request

I have got a WCF Data Service based on a LINQ to SQL data provider.
I am making a query "get me all the records between two dates".
The problem is that I want to synthesize two extra records so that the result always includes records falling exactly on the start and end dates, plus all the ones in between, which come from the database.
Is there a way to "intercept" the request so I can synthesize these records and return them to the client?
Thanks

I suspect the answer involves using "Interceptors".
Just stumbled across this...
http://msdn.microsoft.com/en-us/library/dd744842.aspx

The more I think about this, the more I would say "please don't do this". The problem is that in WCF Data Services (or OData for that matter), each entity you return (entity == record) needs to have its own unique URI. Clients also assume that an entity returned from the server can be accessed again later (unless it has been deleted).
In your case though, the boundary entities are defined by the query and they really only exist in the context of that query. Given a different query they are different. So all in all, they do not behave like entities; they behave more like some kind of query metadata.
Anyway, if you really think this is the right thing to do... it's rather hard. The only approach I can think of is to hook into the IQueryable returned from the entity set (layer your own IQueryable on top of the one from LINQ to SQL). Then, when a query gets executed, you parse the expression tree and find the conditions which define the range, and you return a custom implementation of IEnumerable which "synthesizes" the two special entities at the beginning and at the end and returns the rest from the underlying LINQ to SQL results. All of this is a lot of code and it's definitely not easy to do.
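Just to make the shape of that concrete, here is a minimal sketch of the synthesizing enumerable, assuming the hard part (extracting the range from the expression tree) is already done, and assuming a hypothetical Record entity with a Date property. The synthesized records would still need unique keys for the service to serialize them:

// using System; using System.Collections.Generic;
// Hypothetical sketch: wraps the real LINQ to SQL results and adds
// boundary records on the start and end dates of the parsed range.
static IEnumerable<Record> WithBoundaries(
    IEnumerable<Record> dbResults, DateTime start, DateTime end)
{
    // Synthesize a record sitting exactly on the start date.
    yield return new Record { Date = start /* plus synthesized values */ };

    foreach (var record in dbResults)
        yield return record;

    // Synthesize a record sitting exactly on the end date.
    yield return new Record { Date = end /* plus synthesized values */ };
}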
A second possible way would be to implement this as a service operation (it requires the client to know that there's a special operation on the server for this, though). It would also make a bit more sense, as the service operation would get the range as its parameters instead of as a filter, which makes it much easier for you to figure out the range (no expression tree parsing).
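A minimal sketch of that service operation, assuming a LINQ to SQL context named MyDataContext with a Records table and a Date column (all names hypothetical), and reusing the WithBoundaries iterator from above:

// using System.Data.Services; using System.ServiceModel.Web; using System.Linq;
public class RecordService : DataService<MyDataContext>
{
    public static void InitializeService(DataServiceConfiguration config)
    {
        // Expose the custom operation to clients.
        config.SetServiceOperationAccessRule("GetRecordsInRange",
            ServiceOperationRights.AllRead);
    }

    [WebGet]
    public IQueryable<Record> GetRecordsInRange(DateTime start, DateTime end)
    {
        var fromDb = CurrentDataSource.Records
            .Where(r => r.Date >= start && r.Date <= end)
            .AsEnumerable();

        // Add the two synthesized boundary records around the real results.
        return WithBoundaries(fromDb, start, end).AsQueryable();
    }
}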

How Can a Data Access Object (DAO) Allow Simultaneous Updates to a Subset of Columns?

Please forgive me if I misuse any OOP terminology as I'm still getting my feet wet on the subject.
I've been reading up on object oriented programming (OOP) - specifically for web applications. I have been going over the concept of a data access object (DAO). The DAO is responsible for CRUD (Create, Read, Update, and Delete) methods and connecting your application's service (business logic) layer to the database.
My question specifically pertains to the update() method within a DAO. In the examples I've read, developers typically pass a bean object into the DAO's update() method as its main argument, e.g. updateCustomer(customerBean). The method then executes some SQL which updates all of the columns based on the data in the bean.
The problem I see with this logic is that the update() method updates ALL columns in the database row based on the bean's data, which could cause it to overwrite columns that another user or system needs to update at the same time.
A simplified example might be:
User 1 updates field A in the bean
User 2 updates field B in the bean
User 2 passes bean to DAO, DAO updates all fields.
User 1 passes bean to DAO, DAO updates all fields.
User 2's changes have been lost!
I've read about Optimistic Locking and Pessimistic Locking as possible solutions for only allowing one update at a time but I can think of many cases where an application needs to allow for editing different parts of a record at the same time without locking or throwing an error.
For example, let's say an administrator is updating a customer's lastName at the same time the customer logs into the web site and the login system needs to update the dateLastLoggedIn column, while simultaneously a scheduled task needs to update a lastPaymentReminderDate. In this crazy example, if you were passing a bean object to the update() method and saving the entire record each time, it's possible that whichever process runs update() last would overwrite all of the other data.
Surely there must be a way to solve this. I've come up with a few possibilities based on my research but I would be curious to know the proper/best way to accomplish this.
Possible Solution 1: DAO update() Method Does Not Accept Bean as Argument
If, instead of a bean object, the update() method accepted a structure of data containing only the columns that need updating, you could make your SQL statement smart enough to update only the fields that were passed to the method. For example, the argument might look like this:
{
customerID: 1,
firstName: 'John'
}
This would basically tell the update() method to update only the firstName column for the customer with ID 1. This would make your DAO extremely flexible and would give the service layer the ability to dynamically interact with the database. I have a gut feeling that this violates some "golden rule" of OOP, but I'm not sure which. I've also never seen any examples online of a DAO behaving like this.
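For illustration, here is a rough sketch of what that dynamic update could look like (written in C#, though the idea is language-agnostic; the table, column, and connectionString names are assumptions). Note that the column names must be validated against a whitelist, never taken from raw user input, or you open yourself up to SQL injection:

// using System.Collections.Generic; using System.Data.SqlClient; using System.Linq;
// Hypothetical partial update: only the columns present in "changes"
// end up in the UPDATE statement.
public void Update(int customerId, IDictionary<string, object> changes)
{
    var setClause = string.Join(", ",
        changes.Keys.Select(col => col + " = @" + col));

    using (var conn = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand(
        "UPDATE customers SET " + setClause + " WHERE customerID = @id", conn))
    {
        cmd.Parameters.AddWithValue("@id", customerId);
        foreach (var change in changes)
            cmd.Parameters.AddWithValue("@" + change.Key, change.Value);

        conn.Open();
        cmd.ExecuteNonQuery();
    }
}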
Possible Solution 2: Add additional update() methods to your DAO.
You could also solve this by adding more specific update() methods to your DAO, as sketched below. For example, you might have one for dateLastLoggedIn and one for lastPaymentReminderDate. This way each service that needs to update the record could theoretically do so simultaneously. Any locking could be done for each specific update method if needed.
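A sketch of those narrow methods (Execute here is an assumed helper that runs a parameterized SQL statement; the table and column names come from the example above):

// Each method touches only its own column, so concurrent updates to
// different columns cannot clobber each other.
public void UpdateDateLastLoggedIn(int customerId, DateTime date)
{
    Execute("UPDATE customers SET dateLastLoggedIn = @date WHERE customerID = @id",
            new { date, id = customerId });
}

public void UpdateLastPaymentReminderDate(int customerId, DateTime date)
{
    Execute("UPDATE customers SET lastPaymentReminderDate = @date WHERE customerID = @id",
            new { date, id = customerId });
}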
The main downside of this approach is that your DAO will start to get pretty muddy with all kinds of update statements, and I've seen many blog posts about how messy DAOs can quickly become.
How would you solve this type of conundrum with DAO objects assuming you need to allow for updating subsets of record data simultaneously? Would you stick with passing a bean to the DAO or is there some other solution I haven't considered?
If you do a DAO.read() operation that returns a bean, then update the bean with the user's new values, then pass that bean to the DAO.update(bean) method, you shouldn't have a problem unless the two user operations happen within milliseconds of each other. Your question implies that the beans are being stored in the session scope or something like that before being passed to the update() method. If that's what you're doing, don't, for exactly the reasons you described: you don't want your bean getting out of sync with the db record. For even better safety, wrap a transaction around the read and update operations; then there'd be no way the two users could step on each other's toes, even if user 2 submits his changes at the exact same time as user 1.
Read(), set values, update() is the way to go, I think. Keep the beans fresh. Nobody wants stale beans.
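A minimal sketch of that read-modify-write cycle wrapped in a transaction, assuming a hypothetical customerDao and bean (TransactionScope is from System.Transactions):

// using System.Transactions;
using (var scope = new TransactionScope())
{
    var customer = customerDao.Read(customerId); // always start from a fresh bean
    customer.LastName = newLastName;             // change only what this user edited
    customerDao.Update(customer);                // write it straight back
    scope.Complete();                            // commit; disposing without this rolls back
}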

Best Practice For Updating Entity in Web Api

I'm researching the best practice for updating an entity from an action called by the client. There are several ways to do it, but none of them seems like best practice.
1- Get the data to be updated from the request model via reflection and update the entity with those properties. But using reflection isn't recommended in Web API.
2- Send all of the entity's data to the client and get its updated version back in the request. This seems to create unnecessary traffic.
3- Get the data to be updated and check with if/else conditions to see which values changed. It's so basic and not generic; it seems unprofessional.
The request model I'm talking about is a clone of the entity model.
First off, don't use Reflection. It's slow as hell and makes your code extra fragile.
When it comes to EF, there are usually 3 possible solutions (each sketched in code after the list):
1; The client sends the whole updated entity, and only the updated entity. In this case, you simply attach the entity to the corresponding entity set and mark the entity state as Modified.
2; The client sends both the original entity and the updated entity. You attach the original and set its CurrentValues to the updated entity.
3; The client only sends the modified properties, not the whole entity. In this case you have to query the original entity from the db and set the properties either one by one or, again, by overriding the CurrentValues.
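A minimal sketch of the three options against a hypothetical DbContext with a Customers set (variable names are illustrative):

// using System.Data.Entity;
// 1; whole updated entity sent by the client:
db.Customers.Attach(updated);
db.Entry(updated).State = EntityState.Modified;

// 2; client sends both the original and the updated entity:
db.Customers.Attach(original);
db.Entry(original).CurrentValues.SetValues(updated);

// 3; client sends only the changed properties (one extra query):
var entity = db.Customers.Find(id);
db.Entry(entity).CurrentValues.SetValues(changes); // matches properties by name

db.SaveChanges();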
The 3 approaches differ in their bandwidth requirements and the number of queries they make.
1; If we take this as the baseline, it has the bandwidth requirement of sending one entity from the client to the server, and then sending this one entity from the server to the db. This makes 1 db query altogether (attaching does not require querying, so only the saving-changes part initiates a query).
2; This has the bandwidth requirement of sending two entities from the client to the server. Here you have to send less data from the server to the db, because the changed properties are calculated when you set the CurrentValues. Again, just 1 query (attaching and setting CurrentValues don't initiate queries, so only the saving-changes part creates a query).
3; This has the least bandwidth requirement both from the client to the server and from the server to the db (both times only the changed properties are sent). However, it does need one more query besides saving, because you have to fetch the original values from the db before setting the changes.
I usually find that the first approach is a good trade-off between the other two. It does send more data than the third, but still less than the second, and it only initiates the one query for saving data. Also, I like to minimize the traffic between the client and the server even if it means there is more traffic between the server and the db. The clients (for me at least) are usually mobile, so no guaranteed bandwidth, no guaranteed battery lifetime. The server and the db are much "closer" and don't have these restrictions. But of course this can be different for your application.

web app OO concept confusion

This is a concept question, regarding "best practice" and "efficient use" of resources.
Specifically dealing with large data sets in a db and on-line web applications, and moving from a procedural processing approach to a more Object Oriented approach.
Take a "list" page, found in almost all CRUD aspects of the application. The list displays a company, address and contact. For the sake of argument, and "proper" RDBM, assume we've normalized the data such that a company can have multiple addresses, and contacts.
- for our scenario, lets say I have a list of 200 companies, each with 2-10 addresses, each address has a contact. i.e. any franchise where the 'store' is named 'McDonalds', but there may be multiple addresses by that 'name').
TABLES
companies
addresses
contacts
To this point, I'd make a single DB call and use joins to pull back ALL my data, loop over the data and output each line... Some grouping would be done at the application layer to display things in a friendly manner. This seems like the most efficient way, as the RDBMS did the heavy lifting and there was a minimum of network calls (one to the db, one from the db, one http request, one http response).
Another way of doing this, if you couldn't group at the application layer, is to query for the company list, loop over that, and inside the loop make separate DB call(s) for the address and contact. Less efficient, because you're making multiple DB calls.
Now - the question, or sticking point.... Conceptually...
If I have a company object, an address object and a contact object, it seems that in order to achieve the same result you would call a 'getCompanies' method that would return a list, then loop over the list and call 'getAddress' for each, and likewise 'getContact', passing in the company ID etc.
In a web app - this means A LOT more traffic from the application layer to the DB for the data, and a lot of smaller DB calls, etc. - it seems SERIOUSLY less effective.
If you then move a fair amount of this logic to the client side, for an AJAX application, you're incurring network traffic ON TOP of the increased internal network overhead.
Can someone please comment on the best ways to approach this. Maybe its a conceptual thing.
Someone suggested that a 'gateway' is what you use when accessing these large data-sets, as opposed to smaller, more granular object data - but this doesn't really help my understanding, and I'm not sure it's accurate.
Of course getting everything you need at once from the database is the most efficient. You don't need to give that up just because you want to write your code as an OO model. Basically, you get all the results from the database first, then translate the tabular data into a hierarchical form to fill objects with. "getCompanies" could make a single database call joining addresses and contacts, and return "company" objects that contain populated lists of "addresses" and "contacts". See Object-relational mapping.
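For example, a sketch of how 'getCompanies' might shape a single joined query into an object graph (the LINQ syntax and entity names here are illustrative, not tied to one particular ORM):

// One round trip: the ORM translates this into a joined query, and each
// Company comes back with its Addresses and their Contacts populated.
var companies =
    from c in db.Companies
    select new Company
    {
        Name = c.Name,
        Addresses = c.Addresses.Select(a => new Address
        {
            Street = a.Street,
            Contact = new Contact { Name = a.Contact.Name }
        }).ToList()
    };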
I've dealt with exactly this issue many times. The first and MOST important thing to remember is : don't optimize prematurely. Optimize your code for readability, the DRY principle, etc., then come back and fix things that are "slow".
However, specific to this case, rather than iteratively getting the addresses for each company one at a time, pass a list of all the company IDs to the fetcher, and get all the addresses for all those company ids, then cache that list of addresses in a map. When you need to fetch an address by addressID, fetch it from that local cache. This is called an IdentityMap. However, like I said, I don't recommend recoding the flow for this optimization until needed. Most often there are 10 things on a page, not 100 so you are saving only a few milliseconds by changing the "normal" flow for the optimized flow.
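A sketch of that batched fetch plus IdentityMap (GetByCompanyIds is an assumed DAO method; one query replaces N):

// using System.Linq;
// One query fetches the addresses for every company on the page, then
// the in-memory map serves all subsequent lookups by address ID.
var companyIds = companies.Select(c => c.Id).ToList();
var addressMap = addressDao.GetByCompanyIds(companyIds)  // single query
                           .ToDictionary(a => a.Id);     // addressID -> Address

var address = addressMap[someAddressId]; // no extra round trip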
Of course, once you've done this 20 times, writing code in the "optimized flow" becomes more natural, but you also have the experience of when to do it and when not to.

Where should we calculate fields?

I'm currently working in a Silverlight / MS SQL project where the Entity Framework has not been implemented and I would like to know what's the best practice to deal with calculated fields in this particular situation.
Considering that some external system might also consume my data directly in the DB or through a web service, here are the 3 options I can see right now.
1) Force any external system to consume data through a web service and create all the calculated fields in the objects only.
2) Create the calculated fields in a DB view and resync your object with the server each time a value needs to be calculated.
3) Replicate the calculation rules in the object and the database view.
Any other suggestions would also be welcomed.
I would recommend following two principles: data decoupling and minimum duplication of functionality. Both suggest putting your calculations in one place only, and serving them already calculated. So I would implement the calculations in the DB and serve them via a web service.
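For instance, a sketch of the serving side, assuming the calculation lives in a DB view and the project uses a LINQ to SQL DataContext since EF is not in play (vOrders, TotalWithTax, and OrderDto are all hypothetical names; OrderDto's properties must match the selected columns):

// The calculated field is defined once, in the view; the service simply
// returns it precomputed, so every consumer sees the same result.
[OperationContract]
public List<OrderDto> GetOrders()
{
    return db.ExecuteQuery<OrderDto>(
        "SELECT OrderID, Subtotal, TotalWithTax FROM vOrders").ToList();
}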
However, you have to consider your particular case. For example, if the calculations are VERY heavy, you could delegate them to the client to spare server resources. This could even be the reason you are using Silverlight. I am in a similar situation on a project, and I found that the best compromise is to push raw data to the client and have it do the heavy computations.
Having a single best practice or approach for this kind of problem is difficult: as circumstances change, what was formerly a good approach might start to seem less useful. That said, where possible I would do anything data-related at the DB level, including calculated fields. This way you know that no matter where you are looking at the data from, you will see the same results. So your web service, SQL reporting and anything else that needs to look at or receive data will see the same result.

WCF/RIA with one common set of CRUD methods

I am very new to WCF/RIA services. I am looking to build an application using PRISM/MEF where I can offer new plug-ins for the application from time to time. Now, my database structure is pretty much static. It will not see many changes during its life (but there still might be a few). The new plug-ins will use the entity classes exposed by the database.
My question: when I create new plug-in controls, these controls might need some special server-side methods to be run, which would mean updating my WCF/RIA service to account for the new methods. I really want to avoid that, and was wondering if it is possible to create a WCF service that has just 4 CRUD methods. I could pass any entity to these methods and, depending on its type, the entity gets saved, updated or deleted. It would also let me pass any kind of LINQ query to the Get method and return the appropriate results. The goal is to avoid making changes to the WCF service unless the underlying DB structure changes.
Whatever special methods I add to my plug-ins could then simply mean passing complex LINQ queries to the generic Get method and getting the results on the client side. Most of the entity management happens on the client. WCF becomes a simple (yet powerful) layer over my database that lets me access any entity and process any complex query composed on the client side.
Thanks,
M
Have these 4 CRUD operations in a separate Domain Service.
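As a starting point, here is a minimal sketch of the idea as a generic repository that such a domain service could delegate to, assuming a LINQ to SQL DataContext. One caveat: WCF RIA Services generates client proxies from concrete entity types, so a fully generic service has real limits in practice and usually still needs one thin query method per entity type:

// using System.Data.Linq; using System.Linq;
// Generic CRUD over any mapped LINQ to SQL entity type.
public class CrudRepository<T> where T : class
{
    private readonly DataContext db;

    public CrudRepository(DataContext db) { this.db = db; }

    // Read: callers can compose any LINQ query on top of this.
    public IQueryable<T> Get()
    {
        return db.GetTable<T>();
    }

    public void Create(T entity)
    {
        db.GetTable<T>().InsertOnSubmit(entity);
        db.SubmitChanges();
    }

    public void Update(T entity)
    {
        // Attach as modified; needs a version/timestamp column for conflict checking.
        db.GetTable<T>().Attach(entity, true);
        db.SubmitChanges();
    }

    public void Delete(T entity)
    {
        db.GetTable<T>().Attach(entity);
        db.GetTable<T>().DeleteOnSubmit(entity);
        db.SubmitChanges();
    }
}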