CMIS: cache data on server side - apache

I'm writing a CMIS interface (server) for my application. The server needs to load data from a database to process each request, and at the moment I'm loading the same data for every request.
Is there a common way to cache this data? Are cookies supported by every CMIS client? Is there another way to cache this data?
Thank you

You should not rely on cookies. Several clients and client libraries support them, but not all do. Cookies can help and you should make use of them, but be prepared for simple clients without cookie support.
Since your data is usually bound to a user, you can build a cache keyed by the username. What you can and should cache, however, depends on your repository and your use case. Repository infos and type definitions are good candidates, but you should be careful with everything else.
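As a rough sketch (assuming a Java-based server, for example one built on Apache Chemistry OpenCMIS; the class below is hypothetical and not part of any CMIS library), a per-user cache with a time-to-live can be as simple as a ConcurrentHashMap:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Minimal sketch of a per-user cache with a time-to-live.
 * V stands in for whatever you load from the database
 * (e.g. type definitions or repository info); adjust to your model.
 */
public class PerUserCache<V> {

    private static final long TTL_MILLIS = 10 * 60 * 1000; // 10 minutes

    private static class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value) {
            this.value = value;
            this.loadedAt = System.currentTimeMillis();
        }
    }

    private final Map<String, Entry<V>> cache = new ConcurrentHashMap<>();

    /** Returns the cached value for this user, or null if missing or expired. */
    public V get(String username) {
        Entry<V> entry = cache.get(username);
        if (entry == null || System.currentTimeMillis() - entry.loadedAt > TTL_MILLIS) {
            cache.remove(username);
            return null;
        }
        return entry.value;
    }

    /** Stores freshly loaded data for this user. */
    public void put(String username, V value) {
        cache.put(username, new Entry<>(value));
    }
}
```

On a cache miss you load the data from the database and call put(); the TTL keeps the cache from serving stale data indefinitely and from growing without bound.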

Related

Assigning variable to session nodejs

I have created an authentication system using Db2 for the database and Node.js for the backend code. Yes, I know I could have used Passport, but I wanted to try to build this auth system without it. I have everything working, and the way I make sure users are logged in before redirecting to the desired page is to check whether a global variable is true and, if it is, render the page. The problem is that the value of the variable is not unique to each session. I'm wondering whether the best way to go about this is to somehow make the variable unique to every session, or maybe to use Db2 somehow. Thanks in advance for the help :)
The standard way to handle such cases is to store a session cookie on the client side. Then, on the server side, build a cache map that maps a session ID to a data structure holding the information about that session.
To make this robust, support multiple instances of the service (for scale), and ensure sessions can be "forgotten" via a TTL so the cache does not grow forever, it is best to use a cache database such as Redis.
Here is an excellent tutorial on this: How to Manage Session using Node.js and Express
Db2 session global variable values are unique to a connection. See CREATE VARIABLE.

Is it possible to configure .Net Core to use the file system to cache responses?

Specifically, is there any way in .Net Core (3.0 or earlier) to use the local file system as a Response Cache instead of just in-memory?
After a fair amount of researching, the closest thing seems to be the Response Caching middleware [1], but this does not:
allow pages to be cached indefinitely,
preserve caches between application and server restarts,
allow invalidating the cache on a per-page basis (e.g. blog entry updated),
allow invalidating the entire cache when global changes are made (e.g. theme update, menu changes, etc.).
I'm guessing these features will require custom implementation of ResponseCaching that hits the local file system, but I don't want to reinvent it if it already exists.
Some background:
This will replace our use of a static site-generator, which is problematic for site-wide changes because of the sheer quantity of data (nearly 24 hours to generate and copy to all of the servers).
The scenario is very similar to an encyclopedia or news site -- the vast majority of the content changes infrequently, a few things are added per day, and there is no user-specific content (and if or when there is, it would be dynamically loaded via JS/Ajax). Additionally, the page loads happen to be processor/memory/database intensive.
We will be using a reverse proxy like Cloudflare or AWS CloudFront, but AWS automatically expires its edge caches daily, and edge-node cache misses are still frequent.
This is different than IDistributedCache [2] in that it should be response caching, not just caching data used by the MVC Model.
We will also use in-memory cache [3], but again, that solves a different caching scenario.
References
[1] https://learn.microsoft.com/en-us/aspnet/core/performance/caching/middleware
[2] https://learn.microsoft.com/en-us/aspnet/core/performance/caching/distributed
[3] https://learn.microsoft.com/en-us/aspnet/core/performance/caching/memory
I implemented this.
https://www.nuget.org/packages/AspNetCore.ResponseCaching.Extensions/
https://github.com/speige/AspNetCore.ResponseCaching.Extensions
Doesn't support .net core 3 yet though.
Currently (April 2019) the answer appears to be: no, there is nothing out-of-the-box for this.
There are three viable approaches to accomplish this using .Net Core:
Fork the built-in ResponseCaching middleware and create a flag for cache-to-disk:
https://github.com/aspnet/AspNetCore/tree/master/src/Middleware/ResponseCaching
This might be annoying to maintain because the namespaces and class names will collide with the core framework.
Implement this missing feature in EasyCaching, which apparently already has caching to disk on their radar:
https://github.com/dotnetcore/EasyCaching/blob/master/ToDoList.md
A pull request may be more likely accepted, since it's a planned feature.
There is apparently a port of Strathweb.CacheOutput to .Net Core, which would allow one to implement IApiOutputCache to save to disk:
https://github.com/Iamcerba/AspNetCore.CacheOutput#server-side-caching
Although this question is about caching within .Net Core using the local file system, this could also be accomplished using a local instance of Sqlite on each server node, and then configuring EasyCaching for response caching and to point it to the Sqlite instance on localhost.
I hope this helps someone else who finds themselves in this scenario!

How can we set up the DB and ORM in the absence of a data consistency requirement?

Imagine we have a website which sends write and read requests to some DB via Hibernate. I use Java, but it doesn't matter for this question.
Usually we want to read fresh data from the DB. But I want to allow some delay before written data becomes visible to reads, just to increase performance. I.e. I don't need to "publish" rows inserted into the DB immediately; it's OK for me to "publish" fresh data after some delay.
How can I achieve it?
As far as I understand this can be set up on several different tiers of my system.
I can cache some requests on the front end. Probably I should set up a proxy server for this, but that will only work when all the parameters of the query match.
I can cache the read requests in Hibernate. OK, but can I specify or estimate the average time for which a read query will return stale data after a fresh insert? In other words, how can I control the delay before fresh data becomes visible to users?
Or maybe I should use something like memcached instead of the Hibernate cache?
Probably I can set something up in the DB, but I don't know what. Perhaps I can relax the isolation level to boost the performance of my DB.
So, which way is the best one?
And the main question, of course: can the relaxation of requirements I'm introducing here REALLY help to increase the performance of my system?
If I am reading your architecture correctly, you have client -> server -> database server.
Answers to each point
This puts the burden of caching on the client. If you only use your own client, I would go with this method. It may have the side effect of improving client performance, and it puts less load on the server and database server, so they will scale better.
Caching on the server will improve the scalability of the database server, and possibly performance in the client, but it will put a memory burden on the server. This would be my second option.
Implement something in the database. At this point, what are you gaining? The database server still has to do the work of determining which rows to send back, and you get no scalability benefits.
So to sum up: cache at the client first if you can; if not, cache at the server. Leave the DB out of the loop.
To answer your main question - caching is one of the most effective ways of increasing both performance and scalability of web applications which are constrained by database performance - your application may or may not fall into this category.
In general, I'd recommend setting up a load testing rig, and measure the various parts of your app to identify the bottleneck before starting to optimize.
The most effective cache is one outside your system - a CDN or the user's browser. Read up on browser caching and see if there's anything you can cache locally. Browsers have caching built in as a standard feature - you control it via HTTP headers. These caches are very effective because they stop requests from ever reaching your infrastructure, and they are very efficient for static web assets like images, JavaScript files, or stylesheets. I'd consider a proxy server to be in the same category. The major drawback is that this cache is hard to manage - once you've told the browser "cache this for 2 weeks", refreshing it is hard.
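For example, in a Java servlet stack you can set those headers yourself with a filter (a minimal sketch; the class name, URL pattern, and max-age value are just placeholders to adapt):

```java
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

/**
 * Sketch of a filter that tells browsers and shared caches (proxies, CDNs)
 * to cache static assets for two weeks. Map it to something like /static/*
 * in web.xml or with @WebFilter.
 */
public class StaticAssetCacheFilter implements Filter {

    private static final int TWO_WEEKS_SECONDS = 14 * 24 * 60 * 60;

    @Override
    public void init(FilterConfig filterConfig) {
        // no configuration needed for this sketch
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletResponse httpResponse = (HttpServletResponse) response;
        // "public" allows shared caches to store the response as well.
        httpResponse.setHeader("Cache-Control", "public, max-age=" + TWO_WEEKS_SECONDS);
        chain.doFilter(request, response);
    }

    @Override
    public void destroy() {
        // nothing to clean up
    }
}
```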
The next most effective caching layer is to cache (parts of) web pages on your application server. If you can do this, you avoid both the cost of rendering the page, and the cost of retrieving data from the database. Different web frameworks have different solutions for this.
Next, you can cache at the ORM level. Hibernate has a pretty robust implementation, and it provides a lot of granularity in your cache strategies. This article shows a sample implementation, including how to control the expiration time. You get a lot of control over caching here - you can specify the behaviour at the table level, so you can cache "lookup" data for days, and "transaction" data for seconds.
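As a minimal sketch of the ORM-level option (assuming Hibernate's second-level cache is enabled with a provider such as Ehcache; the entity and region names here are made up), rarely changing "lookup" data can be marked cacheable at the entity level:

```java
import javax.persistence.Cacheable;
import javax.persistence.Entity;
import javax.persistence.Id;

import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

/**
 * "Lookup" data that changes rarely, so it is safe to cache for a long time.
 * The expiration (time-to-live) for the "lookupData" region is set in the
 * cache provider's configuration (e.g. ehcache.xml), not in the annotation.
 */
@Entity
@Cacheable
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE, region = "lookupData")
public class Country {

    @Id
    private Long id;

    private String name;

    // getters and setters omitted for brevity
}
```

Query results can be cached separately by enabling Hibernate's query cache and calling setCacheable(true) on the individual queries you want cached.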
The database already implements a cache "under the hood" - it will load frequently used data into memory, for instance. In some applications, you can further improve database performance by "de-normalizing" complex data - so the import routine might turn a complex data structure into a simple one. This does trade off data consistency and maintainability against performance.

Html5 local datastore, and sync across devices

I am building a full-featured web application. Naturally, you can save to the local datastore when you are in 'offline' mode. I want to be able to sync across devices, so people can work on one machine, save, then get on another machine and load their stuff.
The questions are:
1) Is it a bad idea to store JSON on the server? Why parse the JSON on the server into model objects when it is just going to be passed back to the (other) client(s) as JSON?
2) I'm not sure if I would want to try a NoSQL technology for this. I am not breaking the JSON down; for now the only relationships in the DB would be from a user account to their entries. Other than the user data, the domain model would be a String, which is the JSON. Advice welcome.
In theory, in the future I might want to do some processing on the server or set up more complicated relationships. In other words, right now I would just be saving the JSON, but in the future I might want a more traditional relational system. Would a NoSQL approach get in the way of this?
3) Are there any security concerns with this? JS injection for example? In theory, for this use case, the user doesn't get to enter anything, at least right now.
Thank you in advance.
EDIT - Thanks for the answers. I chose the answer I did because it went into the most detail on the advantages and disadvantages of NoSQL.
JSON on the SERVER
It's not a bad idea at all to store JSON on the server, especially if you go with a NoSQL solution like MongoDB or CouchDB. Both use JSON as their native format (MongoDB actually uses BSON, but it's quite similar).
noSQL Approach: Assuming CouchDB as the storage engine
Baked in replication and concurrency handling
Very simple REST API; you talk to the database over HTTP.
Store data as JSON natively and not in blobs or text fields
Powerful View/Query engine that will allow you to continue to grow the complexity of your documents
Offline mode. You can talk to CouchDB directly using JavaScript and have the entire app continue to run on the client if the internet isn't available.
Security
Make sure you're parsing the JSON documents with the browser's JSON.parse or a JavaScript library that is safe (json2.js).
Conclusion
I think the reason I'd suggest going with NoSQL here, CouchDB in particular, is that it's going to handle all of the hard stuff for you. Replication is going to be a snap to set up. You won't have to worry about concurrency, etc.
That said, I don't know what kind of App you're building. I don't know what your relationship is going to be to the clients and how easy it'll be to get them to put CouchDB on their machines.
Links
CouchDB # Apache
CouchOne
CouchDB the definitive guide
MongoDB
Update:
After looking at the app, I don't think CouchDB will be a good client-side option, as you're not going to require folks to install a database engine to play Sudoku. That said, I still think it'd be a great server-side option. If you wanted to sync the server CouchDB instance with the client, you could use something like BrowserCouch, which is a JavaScript implementation of CouchDB for local storage.
If most of your processing is going to be done on the client side using JavaScript, I don't see any problem in storing JSON directly on the server.
If you just want to play around with new technologies, you're most welcome to try something different, but for most applications, there isn't a real reason to depart from traditional databases, and SQL makes life simple.
You're safe as long as you use the standard JSON.parse function to parse JSON strings - some browsers (Firefox 3.5 and above, for example) already have a native version, while Crockford's json2.js can replicate this functionality in others.
Just read your post and I have to say I quite like your approach; it heralds the way many web applications will probably work in the future, with both an element of local storage (for the disconnected state) and online storage (the master database - to save all customers' records in one place and sync them to other client devices).
Here are my answers:
1) Storing JSON on the server: I'm not sure I would store the objects as JSON. It's possible to do so if your application is quite simple, but it will hamper efforts to use the data (running reports and emailing them in a batch job, for example). I would prefer to use JSON for TRANSFERRING the information and a SQL database for storing it.
2) NoSQL approach: I think you've answered your own question there. My preferred approach would be to set up a SQL database now (if the extra resource needed is not a problem); that way you'll save yourself the work of setting up a data access layer for NoSQL that you would probably have to remove in the future. SQLite is a good choice if you don't want a fully-featured RDBMS.
If writing a schema is too much hassle and you still want to save JSON on the server, then you can put together a JSON object management system with a single table and some parsing on the server side to return relevant records. Doing this will be easier and require less permissioning than saving/deleting files.
3) Security: You mentioned there is no user input at the moment:
"for this use case, the user doesn't
get to enter anything"
However at the begining of the question you also mentioned that the user can
"work on one machine, save, then get
on another machine and load their
stuff"
If this is the case, then your application will be storing user data. It doesn't matter that you haven't provided a nice GUI for them to enter it; you will have to worry about security from more than one standpoint, and JSON.parse or similar tools only solve half the problem (the client side).
Basically, you will also have to check the contents of the POST request on the server to determine whether the data being sent is valid and realistic. The integrity of the JSON object (or any data you are trying to save) needs to be validated on the server (using PHP or another similar language) BEFORE saving it to your data store. Someone can easily bypass your JavaScript-layer "security" and tamper with the POST request even if you didn't intend them to, and your application would then send that malicious input back out to other clients anyway.
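As a rough sketch of that server-side check in Java (the same idea applies in PHP or any other server-side language; the servlet name, the expected fields, and the size limit below are hypothetical, and Jackson is assumed as the JSON parser):

```java
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

/**
 * Sketch of validating a posted JSON document on the server before it is
 * persisted. The expected fields ("title") are made up; replace them with
 * whatever your application's documents actually contain.
 */
public class SaveEntryServlet extends HttpServlet {

    private static final int MAX_BODY_BYTES = 64 * 1024; // reject oversized payloads
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        if (req.getContentLength() > MAX_BODY_BYTES) {
            resp.sendError(HttpServletResponse.SC_REQUEST_ENTITY_TOO_LARGE);
            return;
        }

        JsonNode doc;
        try {
            doc = mapper.readTree(req.getInputStream()); // fails on malformed JSON
        } catch (IOException e) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "Malformed JSON");
            return;
        }

        // Structural checks: only accept documents that look like what we expect.
        if (!doc.isObject() || !doc.path("title").isTextual()) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "Unexpected document shape");
            return;
        }

        // At this point the document can be handed to the persistence layer,
        // e.g. saveForUser(req.getRemoteUser(), doc);  // hypothetical helper
        resp.setStatus(HttpServletResponse.SC_NO_CONTENT);
    }
}
```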
If you have the server side of things tidied up, then JSON.parse becomes somewhat redundant for preventing JS injection. Still, it's not bad to have the extra layer, especially if you are relying on remote website APIs for some of your data.
Hope this is useful to you.

How to upload a file in WCF along with identifying credentials?

I've got an issue with WCF, streaming, and security that isn't the biggest deal but I wanted to get people's thoughts on how I could get around it.
I need to allow clients to upload files to a server, and I'm allowing this by using the transferMode="StreamedRequest" feature of the BasicHttpBinding. When they upload a file, I'd like to transactionally place this file in the file system and update the database with the metadata for the file (I'm actually using Sql Server 2008's FILESTREAM data type, that natively supports this). I'm using WCF Windows Authentication and delegating the Kerberos credentials to SQL Server for all my database authentication.
The problem is that, as the exception I get helpfully notes, "HTTP request streaming cannot be used in conjunction with HTTP authentication." So, for my upload file service, I can't pass the Windows authentication token along with my message call. Even if I weren't using SQL Server logins, I wouldn't even be able to identify my calling client by their Windows credentials.
I've worked around this temporarily by leaving the upload method unsecured, and having it dump the file to a temporary store and return a locator GUID. The client then makes a second call to a secure, non-streaming service, passing the GUID, which uploads the file from the temporary store to the database using Windows authentication.
Obviously, this isn't ideal. From a performance point of view, I'm doing an extra read/write to the disk. From a scalability point of view, there's (in principle, with a load balancer) no guarantee that I hit the same server with the two subsequent calls, meaning that the temporary file store needs to be in a shared location - not a scalable design.
Can anybody think of a better way to deal with this situation? Like I said, it's not the biggest deal, since a) I really don't need to scale this thing out much, there aren't too many users, and b) it's not like these uploads/downloads are getting called a lot. But still, I'd like to know if I'm missing an obvious solution here.
Thanks,
Daniel