Limiting simultaneous downloads using RxAlamofire - alamofire

My app downloads files from a server, and I only want one download to be in progress at a time. How could this be done with RxAlamofire? I might simply be missing an Rx operator.
Here's the rough code:
Observable
    .from(paths)
    .flatMapWithIndex({ (ip, idx) -> Observable<(Int, Video)> in
        let v = self.files![ip.row] as! Video
        return Observable.from([(idx, v)])
    })
    .flatMap { (item) -> Observable<Video> in
        let req = URLRequest(url: item.1.downloadURL())
        return Api.alamofireManager()
            .rx
            .download(req, to: { (url, response) -> (destinationURL: URL, options: DownloadRequest.DownloadOptions) in
                ...
            })
            .flatMap({ $0.rx.progress() })
            .flatMap { (progress) -> Observable<Float> in
                // Update a progress bar
                ...
            }
            // Only propagate finished items
            .filter { $0 >= 1.0 }
            // Return the item itself
            .flatMap { _ in Observable.from([item.1]) }
    }
    .subscribe(
        onNext: { (res) in
            ...
        },
        onError: { (error) in
            ...
        },
        onCompleted: {
            ...
        }
    )
My problem is that a) RxAlamofire downloads multiple items at the same time, and b) the (progress) block is called multiple times for those various items (with different progress values for each, which makes the UI behave a bit weirdly).
How to ensure the downloads are done one by one instead of simultaneously?

Does alamofireManager().rx.download() download concurrently or serially?
I'm not sure which it does, so test that first: isolate this code and see if it executes multiple downloads at once. If it does, then read up on the documentation for serial downloads instead of concurrent downloads.
If it does download one at a time, then the progress-bar issue has something to do with your Rx code. If it doesn't, then we just need to read Alamofire's documentation on how to download one at a time.
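If it does turn out to run the downloads concurrently, one option worth trying (a rough sketch against the question's own code, using plain RxSwift's concatMap rather than anything specific to RxAlamofire) is to replace the outer flatMap with concatMap, which only subscribes to the next inner download once the previous one has completed:
Observable
    .from(paths)
    .concatMap { ip -> Observable<Video> in      // concatMap instead of flatMap serialises the downloads
        let video = self.files![ip.row] as! Video
        let req = URLRequest(url: video.downloadURL())
        return Api.alamofireManager()
            .rx
            .download(req, to: { (url, response) -> (destinationURL: URL, options: DownloadRequest.DownloadOptions) in
                ...
            })
            .flatMap { $0.rx.progress() }
            .flatMap { progress -> Observable<Float> in
                // update the progress bar and forward the completion fraction, as in the question
                ...
            }
            .filter { $0 >= 1.0 }
            .map { _ in video }                  // emit the finished item
    }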
Complex transformations and side effects
Something to consider is that your data stream is becoming more complex and difficult to debug because so many things are happening in one stream. Because of the multiple flatMaps, there can be many more emissions coming out and affecting the progress bar. It is also possible that the numerous flatMap operations, each of which acquires an Observable, are what trigger the multiple progress-bar updates.
Complex data streams
In one data stream you (a) performed the network call, (b) updated the progress bar, (c) filtered finished videos, and (d) got back to the video you wanted by using flatMapWithIndex at the start to pair the index with the video model so that you can return to the model at the end. Kind of complicated... My guess is that the weird progress-bar updates might be caused by the hot observable created by the call to $0.rx.progress().
I made a github gist of my Rx Playground that tries to model what you're trying to do.
In functional reactive programming, it would be much more readable and easier to debug if you first define your data streams/observables. In my gist, I began with the observables and how I planned to model the download progress.
This code avoids the concurrency issues if the RxAlamofire request downloads one at a time, and it properly presents the progress value for a UIProgressBar.
Side note
Do you need to track individual download progress per item? Or do you want your progress bar to simply increment per finished download?
Also, be wary of the possible dangers of misusing a chain of multiple flatMaps, as explained here.

Related

How to convert Flux<Item> to List<Item> by blocking

Background
I have a legacy application where I need to return a List<Item>
There are many different Service classes each belonging to an ItemType.
Each service class calls a few different backend APIs and collects the responses to create a SubType of the Item.
So we can say each service class implementation returns an Item.
All backend API access code uses WebClient, which returns a Mono of some type, and I can zip all the Monos within the service to create an Item.
The user should be able to look up many different types of items in one call. This requires many backend calls.
So, for performance's sake, I wanted to make this all asynchronous using Reactor, so I introduced Spring reactive code.
Problem
If my endpoint had to return Flux<Item>, then this code would work fine.
But this is service code which is used by other legacy callers.
So eventually I want to return the List<Item>, but when I try to convert my Flux into the List I get an error:
"message": "block()/blockFirst()/blockLast() are blocking,
which is not supported in thread reactor-http-nio-3",
Here is the service, which is calling a few other service classes.
Flux<Item> itemFlux = Flux.fromIterable(searchRequestByItemType.entrySet())
        .flatMap(e ->
                getService(e.getKey()).searchItems(e.getValue()))
        .subscribeOn(Schedulers.boundedElastic());

List<Item> items = itemFlux
        .collectList()
        .block(); // This line throws the error
Here is what the above service is calling
default Flux<Item> searchItems(List<SingleItemSearchRequest> requests) {
    return Flux.fromIterable(requests)
            .flatMap(this::searchItem)
            .subscribeOn(Schedulers.boundedElastic());
}
Here is the single-item search which is used by the above:
public Mono<Item> searchItem(SingleItemSearchRequest sisr) {
    return Mono.zip(backendApi.getItemANameApi(sisr.getItemIdentifiers().getItemId()),
                    sisr.isAddXXXDetails()
                            ? backendApi.getItemAXXXApi(sisr.getItemIdentifiers().getItemId())
                            : Mono.empty(),
                    sisr.isAddYYYDetails()
                            ? backendApi.getItemAYYYApi(sisr.getItemIdentifiers().getItemId())
                            : Mono.empty())
            .map(tuple3 -> Item.builder()
                    .name(tuple3.getT1())
                    .xxxDetails(tuple3.getT2())
                    .yyyDetails(tuple3.getT3())
                    .build()
            );
}
Sample project to replicate the problem:
https://github.com/mps-learning/spring-reactive-example
I'm new to Spring Reactor; feel free to pinpoint ALL errors in the code.
UPDATE
As per Patrick Hooijer's bonus suggestion, updating the Mono.zip entries to always contain a default value.
@Override
public Mono<Item> searchItem(SingleItemSearchRequest sisr) {
    System.out.println("\t\tInside " + supportedItem() + " searchItem with thread " + Thread.currentThread().toString());
    // TODO: how to make these XXX/YYY calls conditional in a clean way?
    return Mono.zip(getNameDetails(sisr).defaultIfEmpty("Default Name"),
                    getXXXDetails(sisr).defaultIfEmpty("Default XXX Details"),
                    getYYYDetails(sisr).defaultIfEmpty("Default YYY Details"))
            .map(tuple3 -> Item.builder()
                    .name(tuple3.getT1())
                    .xxxDetails(tuple3.getT2())
                    .yyyDetails(tuple3.getT3())
                    .build()
            );
}

private Mono<String> getNameDetails(SingleItemSearchRequest sisr) {
    return mockBackendApi.getItemCNameApi(sisr.getItemIdentifiers().getItemId());
}

private Mono<String> getYYYDetails(SingleItemSearchRequest sisr) {
    return sisr.isAddYYYDetails()
            ? mockBackendApi.getItemCYYYApi(sisr.getItemIdentifiers().getItemId())
            : Mono.empty();
}

private Mono<String> getXXXDetails(SingleItemSearchRequest sisr) {
    return sisr.isAddXXXDetails()
            ? mockBackendApi.getItemCXXXApi(sisr.getItemIdentifiers().getItemId())
            : Mono.empty();
}
Edit: Below answer does not solve the issue, but it contains useful information about Thread switching. It does not work because .block() is no problem for non-blocking Schedulers if it's used to switch to synchronous code.
This is because the block operator inherited the reactor-http-nio-3 Thread from backendApi.getItemANameApi (or one of the other calls in Mono.zip), which is non-blocking.
Most operators continue working on the Thread on which the previous operator executed; this is because the Thread is linked to the emitted item. There are two groups of operators where the Thread of the output item differs from the input:
flatMap, concatMap, zip, etc.: operators that emit items from other Publishers will keep the Thread link they received from this inner Publisher, not from the input.
Time-based operators like delayElements, interval, buffer(Duration), etc. will schedule their tasks on the provided Scheduler, or Schedulers.parallel() if none is provided. The emitted items will then be linked to the Thread the task was scheduled on.
In your case, Mono.zip emits items from backendApi.getItemANameApi linked to reactor-http-nio-3, which gets propagated downstream, goes outside both the flatMap in searchItems and in itemFlux, until it reaches your block operator.
You can solve this by placing a .publishOn(Schedulers.boundedElastic()), either in searchItem, searchItems or itemFlux. This will cause the item to switch to a Thread in the provided Scheduler.
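To illustrate the placement described (a sketch built on the question's own itemFlux; note the edit above saying that this alone did not resolve the asker's error):
Flux<Item> itemFlux = Flux.fromIterable(searchRequestByItemType.entrySet())
        .flatMap(e -> getService(e.getKey()).searchItems(e.getValue()))
        .publishOn(Schedulers.boundedElastic()); // downstream signals now arrive on a boundedElastic thread

List<Item> items = itemFlux
        .collectList()
        .block();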
Bonus: Since you requested to pinpoint errors: Your Mono.zip will not work if sisr.isAddXXXDetails() is false, as Mono.zip discards any element it could not zip. Since you return a Mono.empty() in that case, no items can be zipped and it will return an empty Mono.
If we have only spring-boot-starter-webflux defined as an application dependency, then Spring Boot spins up a Netty server.
One is not expected to block() in a reactive application using a non-blocking server.
However, once we add the spring-boot-starter-web dependency, then even with spring-boot-starter-webflux present, Spring Boot spins up a Tomcat server, which uses a thread-per-request model and is expected to have blocking calls.
So to solve my problem, all I had to do was add the spring-boot-starter-web dependency in pom.xml. After that the application starts on Tomcat, and with Tomcat, .collectList().block() works in the Controller class to return the List<Item>.
Whereas with the Netty server I could only return Flux<Item>, not List<Item>, which is expected.
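For reference, the change described above is just one extra starter in pom.xml (a sketch; the version is inherited from the Spring Boot parent, and the existing spring-boot-starter-webflux dependency stays in place):
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>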

Spring data - webflux - Chaining requests

I use the reactive Mongo driver and WebFlux dependencies.
I have a code like below.
public Mono<Employee> editEmployee(EmployeeEditRequest employeeEditRequest) {
    return employeeRepository.findById(employeeEditRequest.getId())
            .map(employee -> {
                BeanUtils.copyProperties(employeeEditRequest, employee);
                return employeeRepository.save(employee);
            });
}
Employee Repository has the following code
Mono<Employee> findById(String employeeId)
Does the thread actually block when findById is called? I understand the portion within map actually blocks the thread.
If it blocks, how can I make this code completely reactive?
Also, in this reactive paradigm of writing code, how do I handle that given employee is not found?
Yes, map is a blocking, synchronous operation whose execution time is expected to be deterministic.
map should be used when you want to transform an object/data in fixed time, i.e. for operations which are done synchronously, e.g. your BeanUtils.copyProperties call.
flatMap should be used for non-blocking operations, in short anything which returns a Mono or Flux.
"how do I handle that given employee is not found?" -
findById returns an empty Mono when nothing is found, so we can use switchIfEmpty here.
Now let's come to what changes you can make to your code:
public Mono<Employee> editEmployee(EmployeeEditRequest employeeEditRequest) {
    return employeeRepository.findById(employeeEditRequest.getId())
            .switchIfEmpty(Mono.defer(() -> {
                //do something
            }))
            .map(employee -> {
                BeanUtils.copyProperties(employeeEditRequest, employee);
                return employee;
            })
            .flatMap(employee -> employeeRepository.save(employee));
}
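If a missing employee should surface as an error to the caller, one common way to fill in the switchIfEmpty block (a sketch; EmployeeNotFoundException is a hypothetical exception name, not part of the question's code) is:
return employeeRepository.findById(employeeEditRequest.getId())
        // Mono.defer so the error Mono is only created when the source is actually empty
        .switchIfEmpty(Mono.defer(() ->
                Mono.error(new EmployeeNotFoundException(employeeEditRequest.getId()))))
        .map(employee -> {
            BeanUtils.copyProperties(employeeEditRequest, employee);
            return employee;
        })
        .flatMap(employeeRepository::save);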

stream data into my vega view progressively

I am using Papaparse to parse the CSV, and on each chunk of data I run an insert into the view, like so:
Papa.parse(createReadStream('geo.csv'), {
  header: true,
  chunk(data) {
    console.log('chunk: ', data.data.length)
    // data.data.length > 0 && tally.push(...data.data)
    view.insert('test1', data.data)
  },
  complete() {
    view.data('test1').length // this will return 0
    console.log('memory:', process.memoryUsage().heapUsed / 1024 / 1024, ` == time: ${Date.now() - start}`)
  },
})
The only way to keep inserting new data is to either:
call run() after each insert, i.e. insert('test1', data.data).run(), to "commit" it, but I do not need it to run yet, not until I have all of the data (which is why I call run() in the complete() callback), or
parse everything at once in memory and then pass it using data('test1', allRows) (which, I think, will use a lot more memory).
How do I progressively stream data into my Vega view? Note that I am running this inside a web worker; as far as I know, the Vega loader does not support the browser's File instance (only URLs in a browser environment), thus I'm using Papaparse.
You need to run runAsync and await it before inserting more data into the view, or otherwise updates may get lost. See https://github.com/vega/vega/issues/2513 for more information on this.
If you don't care about intermediate updates while more data comes in, I would recommend collecting all the data you want to insert and then adding it at once. Memory won't be an issue: Vega keeps the full dataset in memory anyway, so you will need all the data in memory regardless.
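If you do want the view to update progressively, a minimal sketch of that approach (reusing the question's view, Papa and createReadStream, and Papaparse's pause()/resume() flow-control hooks) is to pause parsing until each runAsync has settled:
Papa.parse(createReadStream('geo.csv'), {
  header: true,
  chunk(results, parser) {
    parser.pause()
    view.insert('test1', results.data)
    // wait for Vega to process this chunk before parsing the next one
    view.runAsync().then(() => parser.resume())
  },
  complete() {
    console.log('rows in view:', view.data('test1').length)
  },
})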

Consumable channel

Use Case
Android fragment that consumes items of T from a ReceiveChannel<T>. Once consumed, the Ts should be removed from the ReceiveChannel<T>.
I need a ReceiveChannel<T> that supports consuming items from it. It should function as a FIFO queue.
I currently attach to the channel from my UI like:
launch(uiJob) { channel.consumeEach{ /** ... */ } }
I detach by calling uiJob.cancel().
Desired behavior:
val channel = Channel<Int>(UNLIMITED)
channel.send(1)
channel.send(2)
// ui attaches, receives `1` and `2`
channel.send(3) // ui immediately receives `3`
// ui detaches
channel.send(4)
channel.send(5)
// ui attaches, receiving `4` and `5`
Unfortunately, when I detach from the channel, the channel is closed. This causes .send(4) and .send(5) to throw exceptions because the channel is closed. I want to be able to detach from the channel and have it remain usable. How can I do this?
Channel<Int>(UNLIMITED) fits my use case perfectly, except that it closes the channel when it is unsubscribed from. I want the channel to remain open. Is this possible?
The Channel.consumeEach method calls the Channel.consume method, which has this line in its documentation:
Makes sure that the given block consumes all elements from the given channel by always invoking cancel after the execution of the block.
So the solution is to simply not use consume[Each]. For example, you can do:
launch(uiJob) { for (it in channel) { /** ... */ } }
You can use BroadcastChannel. However, you need to specify a limited size (such as 1), as UNLIMITED and 0 (for rendez-vous) are not supported by BroadcastChannel.
You can also use ConflatedBroadcastChannel, which always gives the latest value it had to new subscribers, like LiveData does.
BTW, is it a big deal if your new Fragment instance receives only the latest value? If not, then just go with ConflatedBroadcastChannel. Otherwise, neither kind of BroadcastChannel may suit your use case (try it and see if you get the behavior you're looking for).
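A rough sketch of the ConflatedBroadcastChannel variant, mirroring the attach/detach flow from the question (these are real kotlinx.coroutines APIs, though they have since been deprecated in favour of SharedFlow/StateFlow):
val channel = ConflatedBroadcastChannel<Int>()

channel.offer(1)
channel.offer(2)

// UI attaches: each attach opens its own subscription; a conflated channel
// replays only the most recent value (2 here), then every later send.
val uiJob = launch {
    channel.openSubscription().consumeEach { value ->
        // ... render value
    }
}

// UI detaches: cancelling the job cancels only this subscription;
// the BroadcastChannel itself stays open, so further sends do not throw.
uiJob.cancel()
channel.offer(3)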

App Folder files not visible after un-install / re-install

I noticed this in the debug environment where I have to do many re-installs in order to test persistent data storage, initial settings, etc... It may not be relevant in production, but I mention this anyway just to inform other developers.
Any files created by an app in its App Folder are not 'visible' to queries after manual un-install / re-install (from IDE, for instance). The same applies to the 'Encoded DriveID' - it is no longer valid.
It is probably 'by design' but it effectively creates 'orphans' in the app folder until they are manually cleaned via 'drive.google.com > Manage Apps > [yourapp] > Options > Delete hidden app data'. It also creates a problem if an app relies on finding files by metadata, title, ..., since these seem to be gone. As I said, not a production problem, but it can create some frustration during development.
Can any of friendly Googlers confirm this? Is there any other way to get to these files after re-install?
Try this approach:
Use requestSync() in onConnected() as:
@Override
public void onConnected(Bundle connectionHint) {
    super.onConnected(connectionHint);
    Drive.DriveApi.requestSync(getGoogleApiClient()).setResultCallback(syncCallback);
}
Then, in its callback, query the contents of the drive using:
final private ResultCallback<Status> syncCallback = new ResultCallback<Status>() {
    @Override
    public void onResult(@NonNull Status status) {
        if (!status.isSuccess()) {
            showMessage("Problem while retrieving results");
            return;
        }
        query = new Query.Builder()
                .addFilter(Filters.and(Filters.eq(SearchableField.TITLE, "title"),
                        Filters.eq(SearchableField.TRASHED, false)))
                .build();
        Drive.DriveApi.query(getGoogleApiClient(), query)
                .setResultCallback(metadataCallback);
    }
};
Then, in its callback, if found, retrieve the file using:
final private ResultCallback<DriveApi.MetadataBufferResult> metadataCallback =
        new ResultCallback<DriveApi.MetadataBufferResult>() {
            @SuppressLint("SetTextI18n")
            @Override
            public void onResult(@NonNull DriveApi.MetadataBufferResult result) {
                if (!result.getStatus().isSuccess()) {
                    showMessage("Problem while retrieving results");
                    return;
                }
                MetadataBuffer mdb = result.getMetadataBuffer();
                for (Metadata md : mdb) {
                    Date createdDate = md.getCreatedDate();
                    DriveId driveId = md.getDriveId();
                    readFromDrive(driveId); // read each matching file while driveId is in scope
                }
            }
        };
Job done!
Hope that helps!
It looks like Google Play services has a problem. (https://stackoverflow.com/a/26541831/2228408)
For testing, you can do it by clearing Google Play services data (Settings > Apps > Google Play services > Manage Space > Clear all data).
Or, at this time, you need to implement it by using Drive SDK v2.
I think you are correct that it is by design.
By inspection I have concluded that until an app places data in the AppFolder, Drive does not sync down to the device, however much you try to hassle it. Therefore it is impossible to check for the existence of AppFolder content placed by another device or by a prior installation. I'd assume this is to try to create a consistent clean install.
I can see that there are a couple of strategies to work around this:
1) Place dummy data on AppFolder and then sync and recheck.
2) Accept that in the first instance there is the possibility of duplicates: since you cannot access the existing file, by definition you will create a new copy, and then use custom metadata to come up with a scheme to differentiate like-named files and choose which one you want to keep (essentially, implement your conflict-merge strategy across the two different files; a rough sketch follows at the end).
I've done the second: I have an update number to compare data from different devices and decide which version I want, and hence whether to upload, download, or leave alone. As my data is an SQLite DB, I also have some code to only sync once updates have settled down. I deliberately consider people updating two devices at once foolish; the results are consistent, but undefined as to which will win.
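For the custom-metadata part of option 2, a rough sketch with the Google Play services Drive API (the "updateNumber" key and its value are invented for illustration, not something prescribed above):
// Hypothetical property used to decide which copy of a like-named file wins.
CustomPropertyKey updateKey = new CustomPropertyKey("updateNumber", CustomPropertyKey.PUBLIC);
MetadataChangeSet changeSet = new MetadataChangeSet.Builder()
        .setTitle("mydata.db")                // like-named files are told apart by the property below
        .setCustomProperty(updateKey, "42")   // bump per successful sync; highest number wins
        .build();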