Dropbox differential/incremental uploads using REST API

We know that the Dropbox desktop client uses a binary diff algorithm to break files into blocks, and only uploads blocks that aren't already in the cloud (https://serverfault.com/questions/52861/how-does-dropbox-version-upload-large-files).
Nevertheless, as far as I can see, the Dropbox API can only upload the whole file (/files_put, /files (POST)) when a sync is needed.
Is there any way to do differential/incremental syncing using the Dropbox API, i.e. upload only the changed portion of the file like the desktop clients do?
If this is not possible, then what are the best practices for periodically syncing large files that have small changes using the Dropbox API?

Unfortunately, this isn't possible, and I suspect it may never be.
After doing a bit of research, I found a feature request for delta-syncing to be integrated into the API[*]. Dropbox hasn't responded, nor has the community upvoted this request.
I would make an educated guess that the reason why Dropbox hasn't provided this functionality, and likely never will, is because this is a dangerous feature in the hands of unknown developers.
Consider the case where you write an application that uses such a delta-change update system for updating large files. You thoroughly test your app and publish it to an app store. A couple of weeks after your initial release, and numerous downloads later, you start receiving bad reviews and complaints because you managed to miss one very specific test case.
In this specific, buggy case you've miscalculated a differential offset by one byte. Oh no! You've now corrupted thousands of files for hundreds of users!
Considering such a possibility, I think I would personally request that Dropbox NEVER provide such a dev feature. If they integrated such a function into the API, they would be undermining their #1 purpose: to provide consistent, safe, and reliable cloud backups of your important files.
[*]: This was the original reference location, but it is now a dead link.
(https://www.dropbox.com/votebox/1515/delta-sync-api-for-mobile-applications)
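For what it's worth, if the underlying pain is periodically re-uploading big files, the newer v2 API does support resumable, chunked uploads via upload sessions. It still transfers the whole file, just in restartable pieces, so it's the closest thing to a best practice here. A minimal sketch with the official Python SDK (the token and paths are placeholders):

    import os
    import dropbox

    CHUNK = 4 * 1024 * 1024  # upload in 4 MiB pieces

    def upload_large(dbx, local_path, remote_path):
        """Upload a file in resumable chunks via an upload session.
        The whole file is still sent; only the transfer is chunked."""
        size = os.path.getsize(local_path)
        with open(local_path, "rb") as f:
            if size <= CHUNK:  # small files: one plain upload call
                return dbx.files_upload(f.read(), remote_path)
            session = dbx.files_upload_session_start(f.read(CHUNK))
            cursor = dropbox.files.UploadSessionCursor(
                session_id=session.session_id, offset=f.tell())
            commit = dropbox.files.CommitInfo(
                path=remote_path, mode=dropbox.files.WriteMode.overwrite)
            while f.tell() < size:
                if size - f.tell() <= CHUNK:  # last piece closes the session
                    return dbx.files_upload_session_finish(
                        f.read(CHUNK), cursor, commit)
                dbx.files_upload_session_append_v2(f.read(CHUNK), cursor)
                cursor.offset = f.tell()

    dbx = dropbox.Dropbox("YOUR_ACCESS_TOKEN")  # placeholder token
    upload_large(dbx, "big_file.bin", "/backups/big_file.bin")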

Storing files on the API or on the microservice filesystem

I work on an app that consists of:
- a frontend app
- an API, which I like to think of as a gateway
- microservices that handle the business logic and DB work
Upon implementing a file-store-like feature for uploading both small and large files, I just assumed that I'd store these files on the microservice's filesystem and save the paths, along with metadata, into the microservice's DB.
Because the microservices don't implement any HTTP API endpoints, I upload files over my API gateway. But after realizing how much work must go into transferring these files from the API to the microservice, as well as serving them back, I just went with storing them on the API's file system and saving the paths into the microservice's DB.
Is this approach ok?
Is it weird that my API gateway stores and serves files from its own file system?
If so, should I transfer the files from the API to the microservice upon upload, even considering the files can be large - or should the microservice implement a specific API itself?
I hope this question doesn't get interpreted as opinion-based. I'd like to know which approach would be best given the frontend-api-microservice pattern, whether there are any architecture standards that address this scenario, and whether any approach has its gotchas.
Based on comments above
API Gateway
The purpose of a gateway is to route requests and handle cross-cutting concerns like authentication, logging, etc. It shouldn't be doing more than that. The gateway has to be highly available, and any problem with the gateway means you can't access the associated services.
File Upload
The file upload should be handled by the microservice itself; your gateway should only be used to pass the stream through. Depending on the nature of your system, and if you are using a cloud store, you can use a pattern like the "valet key" (sketched below).
https://learn.microsoft.com/en-us/azure/architecture/patterns/valet-key
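To illustrate the valet-key idea, here's roughly what the handoff looks like with S3 pre-signed URLs via boto3 (the bucket and key names are made up; Azure Blob Storage does the same thing with SAS tokens):

    import boto3

    s3 = boto3.client("s3")

    def issue_upload_url(bucket, key, expires=900):
        """Issue a short-lived 'valet key': the client then PUTs the
        file straight to object storage, bypassing our services."""
        return s3.generate_presigned_url(
            "put_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=expires,
        )

    # Hypothetical usage: the API returns this URL to the frontend,
    # which uploads the file directly with an HTTP PUT.
    url = issue_upload_url("my-app-uploads", "user-123/report.pdf")
    print(url)

The gateway (or microservice) only issues the short-lived URL; the actual bytes never pass through your services, and the microservice just records the object key and metadata in its DB.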
After some time and some experience, the right answer to this question would be the API gateway. Microservices are complex enough on their own; storing any files there, small or large, would waste networking and bandwidth, introduce latency issues, and degrade performance as well as UX.
I'll leave this up here just so people can hear it, as neither approach would be wrong; the API gateway choice just provides more practical benefits and is thus more appropriate. If this question were targeting data or files stored within a DB, the microservice and its DB would be the obvious choice.
If you have the convenience to add a file server to your whole stack, then sure, that would be the correct approach, but that also introduces more complexity and the other trade-offs described above.

OneDrive client status/health check

I would like to programmatically validate that the OneDrive (for Business) client is successfully connected and syncing (SDK, file, event log, registry, etc.) on our Windows 10 desktops.
I have seen the OneDriveLib project, which claims to offer this through PowerShell, although it’s not working for me because of the known bug when Files On-Demand is enabled.
We’re looking to implement OneDrive as the default save location for our 5000+ users. When it works, it works great, but how can we know it’s working for all our users? There’s a good possibility that some of the OneDrive clients will break over time, so any locally saved data will not be synced. At best this means the data will not roam with the user, but the worst-case scenario is a machine going pop with months or years of unrecoverable, un-synced data.
There are some local data files here, but I've yet to decipher their meaning:
$env:LOCALAPPDATA\Microsoft\OneDrive\logs\Business1\DeviceHealth.json
and
$env:LOCALAPPDATA\Microsoft\OneDrive\logs\Business1\DeviceHealthSummaryConfiguration.ini
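In case it helps anyone poking at the same files, here's a trivial way to dump whatever DeviceHealth.json currently holds; the schema is undocumented, so this makes no assumptions about its fields:

    import json
    import os

    # Path from above; "Business1" may differ per tenant/account.
    path = os.path.expandvars(
        r"%LOCALAPPDATA%\Microsoft\OneDrive\logs\Business1\DeviceHealth.json")

    # The schema is undocumented, so just pretty-print whatever is in
    # there and eyeball it for sync-status fields.
    with open(path, encoding="utf-8-sig") as f:  # tolerate a UTF-8 BOM
        health = json.load(f)

    print(json.dumps(health, indent=2))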

How does the Dropbox Datastore API differ from Parse?

How does the Dropbox Datastore API differ from similar offerings like Parse? One difference that I see is that my users pay for server storage instead of me. Are there other differences?
Disclaimer: I'm a Dropbox engineer who worked on the Datastore API, and know about the Parse API only indirectly. Weigh my opinion appropriately. Major differences I know of (pro and con):
Dropbox Datastores are free to the developer, and free to the user for the first 5 MB per app (after which their Dropbox quota applies). Parse charges developers based on how many API requests they’re making.
Parse has minimal offline support, while Dropbox has full offline operation. With Dropbox, if the developer modifies data while offline, those modifications will be reflected in subsequent queries (with Parse, they are not). Dropbox provides on-device query logic (unlike Parse), so apps can continue to generate the views they need even when there’s no Internet available. In addition, Parse does not provide conflict resolution.
Parse provides the ability to share data between users, and global data for all users of the app. Dropbox Datastores only support per-user data (for each app) for now (sharing is on the roadmap).
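To make the offline difference concrete, here's a toy sketch of the local-first behaviour described above. The class and method names are purely hypothetical, not from either SDK:

    class LocalFirstStore:
        """Toy model: writes hit a local replica immediately and are
        queued for sync, so offline queries see them (Datastore-style)."""
        def __init__(self):
            self.records = {}   # local replica
            self.pending = []   # changes to push when back online

        def put(self, key, value):
            self.records[key] = value        # visible to queries right away
            self.pending.append((key, value))

        def query(self, predicate):
            return [v for v in self.records.values() if predicate(v)]

    store = LocalFirstStore()
    store.put("t1", {"task": "buy milk", "done": False})
    # Even with no network, the write is visible to local queries:
    print(store.query(lambda t: not t["done"]))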
I would also add that:
Parse is a full-featured backend-as-a-service. You can find a pretty complete list of the other players in this field: http://en.wikipedia.org/wiki/Backend_as_a_service. They provide features like:
- Data services
- User registration/auth
- Push notifications
- Social
The Dropbox Datastore API is more focused on data services. (You also get the user part for free too, I believe.) It also works fully offline.
The Parse framework can store data that can be read by any user of the application.
The Dropbox Datastore stores data for each user, and you can't access data from other users. That's the main difference.
It's easy to get lost in this, since you have to read between the lines. My take is that with the Datastore you are working with objects stored offline locally as JSON. I'm hoping they will soon release a Xamarin Android component - they released an iOS component last month. Since Xamarin targets Android, iOS, and Windows Phone, who knows why they made a dedicated iOS DLL for Xamarin, but I digress.
With Parse, it appears to me their intent is the always-connected device. Sure, you can save queries locally, and you can save ("save eventually") locally, where Parse will push to the server when it is connected. But saving "eventually" and saving queries for offline work is a different design from just saving and letting Parse do it all in the background for you - which it does not, unless I have missed something that would make this attractive to me. I cannot see Parse being usable for devices that you know will be sometimes-connected, without a lot of code to make this happen and sync.

SkyDrive sync REST API

I have read the docs for the SkyDrive REST APIs but didn't find any API I could use to sync with SkyDrive without recursively polling the folders to check for updates.
Is there any API to get only the updates for a user's drive?
A commonplace reality of epistemology is that...
It is typically much easier to prove that something exists than to prove that it does not exist
Nevertheless, I can say with a high level of confidence that the official REST API for SkyDrive doesn't include a way of getting a list of updated documents for synchronization purposes.
Furthermore, I didn't see any evidence of a non-supported/non-official API that would serve this purpose, and from observing the way the Windows client for SkyDrive interacts with the server (within the limits of fair-use reverse engineering), it appears that synchronization is done by reviewing the directory tree rather than getting a differential list.
I believe the closest you can get is: Get a list of the user's most recently used documents
To get a list of SkyDrive documents that the user has most recently used, use the wl.skydrive scope to make a GET request to /USER_ID/skydrive/recent_docs, where USER_ID is either me or the user ID of the consenting user. Here's an example.
GET http://apis.live.net/v5.0/me/skydrive/recent_docs?access_token=ACCESS_TOKEN
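Since there's no delta feed, a sync tool ends up walking the directory tree and diffing timestamps between polls, much as the desktop client appears to. A rough sketch against the Live Connect v5.0 endpoints (the field names are from the docs as I recall them, so treat this as illustrative only):

    import requests

    API = "https://apis.live.net/v5.0"

    def list_folder(folder_id, token):
        """Fetch the children of one folder (Live Connect v5.0)."""
        r = requests.get(f"{API}/{folder_id}/files",
                         params={"access_token": token})
        r.raise_for_status()
        return r.json()["data"]

    def snapshot(folder_id, token, acc=None):
        """Recursively record updated_time for every item, so two
        snapshots can be diffed to detect changes between polls."""
        acc = {} if acc is None else acc
        for item in list_folder(folder_id, token):
            acc[item["id"]] = item.get("updated_time")
            if item["type"] == "folder":
                snapshot(item["id"], token, acc)
        return acc

    # Hypothetical usage: take snapshots a few minutes apart and
    # compare them to find created/updated/deleted items.
    # old = snapshot("me/skydrive", TOKEN)
    # new = snapshot("me/skydrive", TOKEN)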

How does HTTP Live Streaming work?

I have created a sample application to demonstrate the workings of HTTP Live Streaming.
What I have done is this: I have one library that takes a video file (AVI, MPEG, MOV, TS) as input and generates the segment (.ts) and playlist (.m3u8) files for it. I store each playlist (as a string) in a linked list as I receive playlist data from the library.
I have written a basic web server which serves the requested segment and playlist files. I request the playlist.m3u8 file from the iPhone Safari browser, and it launches the QuickTime player, which requests the segment.ts files listed in the received playlist. After playing every segment listed in the current playlist, it requests the playlist again, and I respond with the next playlist file, which lists the next set of segment.ts files.
Is this what we call HTTP Live Streaming?
Is there anything else, other than this, that I need to do to implement HTTP Live Streaming?
Thanks.
Not much more. If you are taking input streams of media, encoding them, encapsulating them in a format suitable for delivery, and preparing the encapsulated media for distribution by placing it where it can be requested from an HTTP server, you are done. The idea behind HTTP Live Streaming is that it leverages existing Internet architecture that is already optimized for serving HTTP requests for reasonably sized resources.
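As an illustration of how little server-side machinery is required, a static file server with the right MIME types is enough to serve a prepared stream. A minimal sketch using Python's standard library (the port and directory layout are arbitrary):

    from http.server import HTTPServer, SimpleHTTPRequestHandler

    class HLSHandler(SimpleHTTPRequestHandler):
        # HLS needs these content types so players recognise the resources.
        extensions_map = {
            **SimpleHTTPRequestHandler.extensions_map,
            ".m3u8": "application/vnd.apple.mpegurl",
            ".ts": "video/mp2t",
        }

    # Run from the directory containing playlist.m3u8 and the .ts segments.
    HTTPServer(("", 8080), HLSHandler).serve_forever()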
HTTP streaming renders many existing CDN solutions obsolete with their custom streaming protocols, custom routing and custom content caching.
You can also use the mediastreamvalidator command-line application on Mac OS X to validate the streams generated by your HTTP web server.
More or less, but there's also adaptive bit-rate streaming to take care of if you want your server to push files to iOS devices. That means your scope expands from a single "index.m3u8" file that tracks all the TS files to a master index that tracks the index files for each bitrate you want to support, each of which in turn tracks the TS files encoded at its respective bit-rate.
It's a good amount of work, but mostly routine/repetitive once you've got the hang of the basics.
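To make the master/variant structure concrete, here's a small sketch that writes a master playlist referencing per-bitrate index files; the bitrates, resolutions, and paths are invented for illustration:

    # Hypothetical variant streams: (bits per second, resolution, index path).
    VARIANTS = [
        (800_000, "640x360", "low/index.m3u8"),
        (1_400_000, "842x480", "mid/index.m3u8"),
        (2_800_000, "1280x720", "high/index.m3u8"),
    ]

    def master_playlist(variants):
        """Build the top-level .m3u8 that lets the client pick a bitrate;
        each referenced index file then lists that variant's .ts segments."""
        lines = ["#EXTM3U"]
        for bandwidth, resolution, uri in variants:
            lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},"
                         f"RESOLUTION={resolution}")
            lines.append(uri)
        return "\n".join(lines) + "\n"

    with open("master.m3u8", "w") as f:
        f.write(master_playlist(VARIANTS))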
For more on streaming, your bible, from the iOS standpoint, should ALWAYS be TN2224. Adhering closely to the specs in the Technote is your best chance of getting through the App Store approval process vis-a-vis streaming.
Some people don't bother (I've been building a streaming app over the past couple of months and have looked at the HTTP logs of a whole bunch of video apps that don't quite seem to stick to the rules) - sometimes Apple notices, sometimes they don't, and sometimes the player is just too big for Apple to interfere.
So it's not very different there from every other aspect of the functionality of your app that undergoes Apple's scrutiny. It's just that there are ways you can be sure you're on the right track.
And of course, from a purely technical standpoint, as #psp1 mentioned, the mediastreamvalidator tool can help you figure out if your streams are - at their very core, even if not in terms of their overall capabilities - compatible with what's expected of HLS implementations.
Note: You can either roll your own encoding solution (with ffmpeg; the plus being you have more control, the minus being it takes time to configure and get working just RIGHT. Plus, once you start talking even the least amount of scale, you run into a whole host of other problems. And once you're done with all the technical hard work, you'd find that was the easy part: now you have to figure out which license you need for shipping a fancy H.264 encoder and jump through all the legal/procedural hoops to get one).
Or, for a developer without a legal/accounting team that could fill a football field, IMO it's easier to go third-party with services like Encoding.com or Zencoder, which provide their encoding services a-la-carte or for a monthly fee. The plus is that they've taken care of all the licensing BS and are providing you a simple pay-to-use service, which can also be extremely useful when you're building a project for a client. The minus is that you're now DEPENDENT on Zencoder/Encoding.com, the flip side of which you'll learn when your encoding jobs fail for a whole day because their servers are down, or when the API doesn't quite act as you expect or as documented!
But anyhow, that's about all the factors you've got to grok before pushing an HLS server into production!