Optimizing for low bandwidth - optimization

I am charged with designing a web application that displays very large geographical data sets. One of the requirements is that it be optimized so that PCs still on dial-up connections, which are common in the suburbs of my country, can use it as well.
Now I am permitted to use Flash and/or Silverlight if that will help with the limited development time and user experience.
The heavy part of the geographical data is chunked into tiles and loaded like map tiles in Google Maps, but that means I need a lot of HTTP requests.
Should I go with just JavaScript + HTML? Or would I end up with a faster application using Flash/Silverlight, since I can run some complex algorithms on those two technologies (like DeepZoom)? Deploying a desktop app, though, is out of the question since we don't have that much maintenance funding.
It just needs to be fast... really fast..
p.s. faster is in the sense of "download faster"

I would suggest you look into Silverlight and DeepZoom

Is something like Gears acceptable? This will let you store data locally to limit re-requests.
I would also stay away from flash and Silverlight and go straight to javascript/AJAX. jQuery is a ton-O-fun.

I don't think you'll find Flash or Silverlight is going to help too much for this application. Either way you're going to be utilizing tiled images and the images are going to be the same size in both scenarios. Using Flash or Silverlight may allow you to add some neat animations to the application but anything you gain here will be additional overhead for your clients on dialup connections. I'd stick with plain Javascript/HTML.
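To put rough numbers on the "same size either way" point, here is a back-of-the-envelope sketch. The 15 KB tile size and the 9-tile view are assumptions picked for illustration, not measurements, and the calculation ignores latency, per-request overhead and modem compression.

```python
def download_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Ideal transfer time: payload bits divided by link speed, nothing else."""
    return size_bytes * 8 / link_bits_per_second

TILE_BYTES = 15 * 1024   # assumed average size of one compressed tile
DIALUP_BPS = 56_000      # nominal 56k modem; real throughput is usually lower
VISIBLE_TILES = 9        # assumed number of tiles in the first view

per_tile = download_seconds(TILE_BYTES, DIALUP_BPS)
first_view = per_tile * VISIBLE_TILES
print(f"one tile: {per_tile:.1f} s, first view: {first_view:.1f} s")
```

Whatever runtime renders the tiles, those bytes still have to cross the modem, so a plugin download on top of them only adds to the total.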

You may also want to look at asynchronously downloading your tiles via one of the Ajax libraries available. Let's say your user can view 9 tiles at a time and scroll/zoom. Download those 9 tiles they can see plus whatever is needed to handle the zoom for those tiles on the first load; then you'll need to play around with caching strategies for prefetching other information asynchronously.
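The load order described above can be sketched like this. It is only an illustration of the idea in Python (the real thing would live in your client-side script): tiles are addressed by hypothetical (x, y) grid coordinates, the visible block is 3x3, and nothing is clamped to the edges of the map.

```python
from typing import List, Tuple

Tile = Tuple[int, int]

def visible_tiles(center: Tile, radius: int = 1) -> List[Tile]:
    """Square block of tiles around the center; radius 1 gives the 3x3 view."""
    cx, cy = center
    return [(x, y)
            for y in range(cy - radius, cy + radius + 1)
            for x in range(cx - radius, cx + radius + 1)]

def prefetch_ring(center: Tile, radius: int = 1) -> List[Tile]:
    """Tiles just outside the visible block, closest to the center first."""
    seen = set(visible_tiles(center, radius))
    outer = [t for t in visible_tiles(center, radius + 1) if t not in seen]
    cx, cy = center
    return sorted(outer, key=lambda t: (abs(t[0] - cx) + abs(t[1] - cy), t))

first_load = visible_tiles((10, 10))   # request these immediately
background = prefetch_ring((10, 10))   # request these while the user is looking
print(len(first_load), len(background))
```

Requesting the ring in the background, nearest tiles first, means a short scroll in any direction is likely to hit tiles that are already cached.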
At one place I worked, a rules engine was taking a bit too long to return a result, so they opted to present the user with a "confirm this" screen. The few seconds it took the users to review and click next was more than enough time to return the results. It made the app look lightning fast to the user when in reality it took a bit longer. You have to remember, user perception of performance is just as important in some cases as the actual performance.

I believe Microsoft's Seadragon is your answer. However, I am not sure if that is available to developers.
It looks like some of it has found its way into Silverlight

Related

Cloudbees instance sizing

I am preparing to configure our production Cloudbees instance, and would like some advice on app-cell sizing. We will be using multiple instances with auto scaling enabled, but need to choose the app-cell size. This is obviously a compromise between horizontal and vertical scaling and the best choice will be different for each app.
Our app is a shopping service written in GWT, so there is one JSP and an initial download of the application HTML, JS, CSS, and Image files. After that, everything runs in the clients' browser with only search calls hitting the server and returning plain JSON with no serialization. Also, we are not using sessions, and do not need any type of state on the server, so the memory footprint should be low.
Given all of this, my gut is to go with a more horizontally scaled deployment with a larger number of smaller instances. Any suggestions would be welcome.
There is no general answer to this question. Application design and internal processing come into the equation and change the result. Only a load test can tell you the best sizing for your application.
Statelessness implies you can scale horizontally very easily - so this opens up both ways to scale.
My suggestion is to find which "container size" works well enough for a single instance (usually a memory restriction) and once you are happy with that, you can then scale horizontally based on usage.

Concurrent page request comparisons

I have been hoping to find out what different server setups equate to in theory for concurrent page requests, and the answer always seems to be soaked in voodoo and sorcery. What is the approximation of max concurrent page requests for the following setups?
apache+php+mysql(1 server)
apache+php+mysql+caching (like memcached or similar, still one server)
apache+php+mysql+caching+dedicated Database Server (2 servers)
apache+php+mysql+caching+dedicatedDB+loadbalancing(multi webserver/single dbserver)
apache+php+mysql+caching+dedicatedDB+loadbalancing(multi webserver/multi dbserver)
+distributed (amazon cloud elastic) -- I know this one is "as much as you can afford" but it would be nice to know when to move to it.
I appreciate any constructive criticism. I am just trying to figure out when it's time to move from one implementation to the next, because they each come with their own implementation challenges, either programming-wise or setup-wise.
In your question you talk about caching, and this is probably one of the most important factors in a web architecture with regard to performance and capacity.
Memcache is useful, but actually, before that, you should be ensuring proper HTTP cache directives on your server responses. This does 2 things; it reduces the number of requests and speeds up server response times (if you have Apache configured correctly). This can also be improved by using an HTTP accelerator like Varnish and a CDN.
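As a rough illustration of what "proper HTTP cache directives" buys you, here is a minimal sketch of a response helper. The function name and the one-day `max_age` are made up for the example; `Cache-Control`, `ETag` and `If-None-Match` are the standard HTTP headers involved.

```python
import hashlib
from typing import Dict, Optional, Tuple

def cacheable_response(body: bytes,
                       if_none_match: Optional[str] = None,
                       max_age: int = 86400) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a resource that rarely changes."""
    # The ETag is derived from the content, so it changes only when the body does.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"Cache-Control": f"public, max-age={max_age}", "ETag": etag}
    if if_none_match == etag:
        # The client's copy is still valid: send headers only, no body.
        return 304, headers, b""
    return 200, headers, body

status, headers, _ = cacheable_response(b"<html>catalog</html>")
revalidated, _, payload = cacheable_response(b"<html>catalog</html>", headers["ETag"])
print(status, revalidated, len(payload))
```

Within `max_age` the browser (or Varnish, or a CDN) does not contact the server at all. After that, a revalidation costs one small 304 instead of the full page.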
Another factor to consider is whether your system is stateless. Stateless usually means that it doesn't store sessions on the server and look them up on every request. A good systems architecture relies on state as little as possible: the less state, the more horizontally scalable the system. Most people introduce state when confronted with personalisation, i.e. serving up different content for different users. In such cases you should first investigate using HTML5 session storage (i.e. storing the complete user data in JavaScript on the client, obviously over HTTPS) or, if the data set is smaller, secure JavaScript cookies. That way you can still serve up cached resources and then personalise with JavaScript on the client.
Finally, your stack includes a database tier, another potential bottleneck for performance and capacity. If you are only reading data from the system then again it should be quite easy to scale horizontally. If there are both reads and writes, it's typically better to keep the data that is read and written in one database and the read-only data in another. You can then scale each with the methods that suit it.
These setups do not spit out a single answer that you can then compare to each other. The answer will vary on way more factors than you have listed.
Even if they did spit out a single answer, then it is just one metric out of dozens. What makes this the most important metric?
Even worse, none of these alternatives is free. There is engineering effort and maintenance overhead in each of them, which cannot be analysed without understanding your organisation, your app and your cost/revenue structures.
Options like AWS not only involve development effort but may "lock you in" to a solution so you also need to be aware of that.
I know this response is not complete, but I am pointing out that this question touches on a large complicated area that cannot be reduced to a single metric.
I suspect you are approaching this from exactly the wrong end. Do not go looking for technologies and then figure out how to use them. Instead profile your app (measure, measure, measure), figure out the actual problem you are having, and then solve that problem and that problem only.
If you understand the problem and you understand the technology options then you should have an answer.
If you have already done this and the problem is concurrent page requests then I apologise in advance, but I suspect not.

Bigger cookie-like files for local data storage (browser "caching" of complex structures)

I am developing a browser based game, and I have a big map there. The terrain of the map is static. Therefore, I have some thousands of tiles that will not change (whether they represent a forest, a desert, whatever), just the players above it can change.
Hence, I wanted to store all my map in the player's computer. I am working with Ruby on Rails, and those map information are passed from the server to the javascript that runs on the user browser, in order to render a pretty map. But it makes me pretty sad to have a 200kb .html file, containing all those map related information.
What would be the simplest way to solve this issue? Cookies! Well. That's what I thought. A complete map information can get to almost 200kb (they are pretty big). A cookie can have at most 4kb.. I don't feel that the right way to achieve my objective is to create tons of cookies, one for each row of the map, for instance. Is there any more elegant way to have this static information lie on the player's browser, without creating lots of cookies? A way to cache it on his browser? I mean.. I can cache a 400kb image, why can't I cache a 200kb map structure?
Thanks in advance!
Fernando.
Well, HTML Local Storage gives you 5 MB (though data is stored *as strings*, so the actual amount of data you can fit in the container is likely a lot less than 5 MB).
This limit is oddly fluid. For one thing, it's just a recommended limit; for another, WebKit-based browsers, for example, use UTF-16, which immediately cuts that in half (2.5 MB).
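A quick sanity check of that arithmetic for the 200 KB map from the question. This is only a sketch: it assumes the quota is counted in UTF-16 bytes and that the map serializes to plain ASCII, and the 5 MB figure is the recommended limit quoted above, not something every browser guarantees.

```python
QUOTA_BYTES = 5 * 1024 * 1024   # the 5 MB figure quoted above; actual limits vary

def utf16_size(text: str) -> int:
    """Bytes the string occupies when each code unit takes two bytes."""
    return len(text.encode("utf-16-le"))

def fits_in_quota(text: str, quota: int = QUOTA_BYTES) -> bool:
    return utf16_size(text) <= quota

map_data = "x" * (200 * 1024)   # stand-in for a 200 KB ASCII map
print(utf16_size(map_data), fits_in_quota(map_data))
```

So even with the doubling, a 200 KB map takes roughly 400 KB and fits with a lot of room to spare.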
Browser support for Local Storage is good: IE, Firefox, Safari 4.0+, Chrome 4.0+, and Opera 10.5+. Both iPhone and Android are supported above versions 3.0 and 2.0, respectively.
Using Local Storage to preserve game state appears to be a prototypical use case.
Finally, Paul Kinlan published an excellent step-by-step tutorial on HTML5Rocks, which I highly recommend (though it's a little more than a year old).
Have you considered storing it in a js file? Most browsers will cache linked js files, allowing you to only serve it every once in a while. It would be very simple to deploy.

Random-access data object in J2ME

I'm planning to develop a small J2ME utility for viewing local public transport schedules using a mobile phone. The data part for those is mostly a big bunch of numbers representing the times when the buses arrive or leave.
What I'm trying to figure out is what is the best way to store that data. The representation needs to
be considerably small (because of persistent storage limitations of a mobile phone)
fit into a single file (for ease of updating the schedule database afterwards over HTTP), or at least into a constant number of files, i.e. (routes.dat, times.dat, ..., agencies.dat), and not (schedule_111.dat, schedule_112.dat, ...)
have a random access ability (unserializing the whole data object into memory would be just too much for a mobile phone :))
if there's some library for accessing that data format, a Java implementation should be present
In other words, if you had to squeeze a big part of GTFS-like data into a mobile device, how would you do that?
Google Protocol Buffers seemed like a good candidate for defining data but it didn't have random access.
What would you suggest?
Persistent storage on J2ME is a tricky business; see this related question for more general background: Best practice for storing large amounts of data with J2ME
In my experience, J2ME persistent storage tends to work best/most reliably with many small records rather than a few monolithic ones. Think about how the program is going to want to access the data, then try to store it in those increments in the J2ME persistent store.
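To show what "store it in the increments you'll read it in" can look like, here is a sketch of a fixed-width binary layout with a small index up front. It is written in Python just to illustrate the byte layout; on the phone the same reads would go through `DataInputStream` or the record store. The layout itself (16-bit route ids, departure times as minutes since midnight) is made up for the example.

```python
import io
import struct
from typing import BinaryIO, Dict, List

INDEX_ENTRY = struct.Struct(">HIH")   # route id, byte offset, number of times

def pack_schedule(routes: Dict[int, List[int]]) -> bytes:
    """Layout: route count, one index entry per route, then the packed times."""
    header_size = 2 + INDEX_ENTRY.size * len(routes)
    index = bytearray(struct.pack(">H", len(routes)))
    data = bytearray()
    for route_id in sorted(routes):
        times = routes[route_id]
        index += INDEX_ENTRY.pack(route_id, header_size + len(data), len(times))
        data += struct.pack(f">{len(times)}H", *times)   # minutes since midnight
    return bytes(index + data)

def read_route(stream: BinaryIO, route_id: int) -> List[int]:
    """Scan the small index, then seek straight to one route's times."""
    stream.seek(0)
    (count,) = struct.unpack(">H", stream.read(2))
    for _ in range(count):
        rid, offset, n = INDEX_ENTRY.unpack(stream.read(INDEX_ENTRY.size))
        if rid == route_id:
            stream.seek(offset)
            return list(struct.unpack(f">{n}H", stream.read(2 * n)))
    raise KeyError(route_id)

blob = pack_schedule({111: [360, 375, 390], 112: [400, 430]})
print(read_route(io.BytesIO(blob), 112))
```

Only the index and one route's block are ever read, so memory use stays small no matter how many routes the file holds. The same index-plus-blocks split maps naturally onto one small record per route.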
I'd generally recommend decoupling your client-server protocol for downloading updates from the on-device storage format. You can change the latter with every code update, but you're pretty much stuck supporting a client-server protocol forever, unless you want to break older clients out in the field.
Finally, I know there are some people on the Transit Developers group who have built offline transit apps in J2ME, so it's worth asking for tips there.
I made an app like this and I used XML files generated with PHP. This enabled us to have a single provider for 3 presentation layers, which were:
j2me app
website for mobile phones
usual website
We used XSLT to convert XML to HTML on the websites and kXML, a very light pull parser, to do it in the J2ME app. This worked well even on very old phones with b/w screens and small amounts of memory.
Besides, on J2ME there is no concept of a file. You have the db in which you can store information.
This is a link to "mobile" website.
http://mobi.krakow.pl/rozklady/
and here to the app:
http://www.mobi.krakow.pl/rozklady/j2me/rjk.jar
This is in Polish, but I think it's not hard to figure out what's what.
If you want, I can provide you with more help and advice or if this is a commercial product then I think we can figure out something too ;)
I think your issue is requirement 2.
Updating 10MB of data just because 4 digits changed somewhere in the middle of the file seems highly inefficient.
Splitting the data into several files allows for a better update granularity that will be well worth the added code complexity.
Real-time public transport schedules are usually modified one bus/train/tram line at a time.

How important is size in an application?

When creating applications (Java, run on a normal computer), how important is program size for users? For example, would it be necessary to replace .png's with .jpg's, convert .wav's to .midi's, or strip down libraries to save space, or do users generally not care if my program is 5 MB when it could be 50 KB if stripped down?
Thanks.
That depends on the delivery mechanism.
Size is generally only relevant in terms of the bandwidth required to download it. If you download it often, then it matters a lot. If it's only once, it matters less and you have to weigh up the time involved in reducing it against how much space you save.
After that, nobody cares until you get into gigabytes. Well, mobile applications will probably start caring at about 10MB+.
Users definitely care (after all, space not only costs money but also affects program load time). However, the question becomes how much you optimize. I suggest the 80/20 rule: 80% of your benefit comes from the first 20% of the effort.
If you use a utility like TreePie you might be able to see what parts of a large application are consuming most of your resources. If you find it's just a few large images, or one big DLL with a bunch of embedded resources, it's probably worth taking a look at reducing the size, if it's easy.
But there's a cost/benefit tradeoff. I just saw a terabyte drive for $100 the other day. Saving the user 1 gig is about 10 cents in terms of storage space, plus some hard-to-quantify amount of time spent loading every time they start the program. If you have 100,000 users, it's probably worth your time to optimize a bit, but if you're writing custom software for one user it's probably not worth it unless they're complaining.
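The arithmetic behind those figures, as a small sketch. The drive price and user count are the ones quoted above; a terabyte is treated as 1000 GB, and load time is left out because it is hard to quantify.

```python
def storage_cost_usd(saved_gb: float,
                     drive_price_usd: float = 100.0,
                     drive_capacity_gb: float = 1000.0) -> float:
    """Value of the disk space saved, priced at the drive's cost per gigabyte."""
    return saved_gb * drive_price_usd / drive_capacity_gb

per_user = storage_cost_usd(1)    # 1 GB saved for one user
all_users = per_user * 100_000    # the same saving across 100,000 users
print(f"per user: ${per_user:.2f}, across all users: ${all_users:,.0f}")
```

Ten cents per user is noise for one customer, but the same gigabyte multiplied across a large user base is the point where a bit of optimization starts to pay for itself.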
As mentioned by Graham Lee, a great deal of this is very dependent on your users. If you are writing something that needs to be optimized to fit on the chip of a 68000 processor, then you'd better believe that program size matters. Assuming you're not programming 30 years ago, you probably won't run across that particular issue.
But in general, you should be making your application as small as possible while still achieving the quality you want. That is to say, if your application is likely to be viewed on a 640x480 screen, then you don't need hi-res 6 MB PNGs for all your images. On the other hand, if your application is designed to be blown up on a big screen at conferences, then you probably want to upsize your images.
Another option that is very common is creating installers with separate options ranging from full to minimal. That way you can allow your users to decide whether size matters to them. It allows you to create the pretty pretty version of your app, and a scaled back version that doesn't include tutorials or mp3 files of a soothing woman's voice telling you that you've pushed the wrong button.
Know your users. And if you don't, then let them decide for themselves.
Consider yourself, what would you use? Would you rather save space with 5KB programs or waste it with 5MB programs?
I think that smaller is better, especially if the program doesn't use/need much graphics and can be optimized.
I would say not important at all, unless it's obscenely large.
I would argue that startup time is far more important to users than application size.
However if you include a lot of media files with your system it is logical to optimise this data as much as possible. But don't compromise the quality - switching to jpeg might be okay for photos, but it sucks for technical diagrams. A .wav could be an .aac or .mp3, but not if you're writing a professional audio application.