How to sort all lists in a specific bin in Aerospike using aql?

I have a few questions about ordered lists in Aerospike:
How can I see in the DB, using aql, whether a list is ordered or not?
Does an ordered list mean it's sorted?
I want to scan the DB and change all lists (in a specific bin) to be ordered. I want to do it using set_type, but I can't seem to make it work. Is that possible? How can I do it?
Thanks

I'm posting my answer from your cross-posted question here https://discuss.aerospike.com/t/list-oprations/5282:
You could scan the namespace with a ScanPolicy.includeBinData=false and for each record digest you get back use operate() to wrap the following operations into a single transaction:
ListOperation.setOrder() to ListOrder.ORDERED
ListOperation.sort() with a ListSortFlags.DROP_DUPLICATES
You will only need to run this once to clean up your database.
The ordering type will stick for all future operations. You'd just continue to use the ListWriteFlags.ADD_UNIQUE list policy.
This is for the Java client, but all other clients have these operations and policies in them.

I don't think AQL is the right tool to exploit the full power of lists. It is built on top of the C client, and it may not yet have been updated to the full functionality of lists; at least AQL ver 3.15.2.1, which I checked, has not been. You might want to write a Java client application.

Related

Does Arangodb support stored query?

For example, I have Node Collection A and Links Collection B
For each element of A, I would like to add a new property that depends on the number of links it has from B.
This operation will run once every day.
Normally, in an RDBMS such as MySQL, I would use a stored query for it.
Can something equivalent be done in ArangoDB?
Currently ArangoDB doesn't offer prepared statements, but some users want it.
If you're interested too, subscribe to that GitHub issue by giving it a thumbs up.

Summing the scores of specific items in Sorted Set

The other answers given are outdated.
The only option I found in the docs is ZUNIONSTORE, which is great, but it forces me to save the results somewhere and then fetch them.
Is there a way to do the sum on the Redis side without saving the results? (Otherwise I need to expire them manually.)
Yes, you can use Lua scripting embedded in Redis.
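As a rough sketch of what such a script could look like: the Lua below loops over the requested members, adds up their ZSCORE values inside Redis, and returns the total as a string, so nothing is stored. The Python wrapper, the name sum_scores, and the key and member names are all illustrative assumptions; the wrapper only assumes a client object with an eval(script, numkeys, *keys_and_args) method, which is how redis-py exposes EVAL.

```python
# Lua executed inside Redis: sums the scores of the given members of ONE
# sorted set (KEYS[1]). Members that are missing from the set are skipped.
# The total is returned as a string because Redis truncates Lua numbers
# to integers when converting a script's return value.
SUM_SCORES_LUA = """
local total = 0
for _, member in ipairs(ARGV) do
  local score = redis.call('ZSCORE', KEYS[1], member)
  if score then
    total = total + tonumber(score)
  end
end
return tostring(total)
"""


def sum_scores(client, key, members):
    """Return the summed score of `members` in the sorted set `key`.

    `client` is assumed (not verified here) to expose
    eval(script, numkeys, *keys_and_args), as redis-py's Redis class does.
    """
    if not members:
        return 0.0
    raw = client.eval(SUM_SCORES_LUA, 1, key, *members)
    # Depending on client settings the reply may arrive as bytes or str.
    if isinstance(raw, bytes):
        raw = raw.decode()
    return float(raw)
```

With redis-py this might be called as sum_scores(redis.Redis(), "leaderboard", ["alice", "bob"]), where the key and member names are made up for the example.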

Deleting rows in datastore by time range

I have a CKAN datastore with a column named "recvTime" of type timestamp (i.e. using "timestamp" as type at datastore_create time, as shown in this link). Example value for this column is "2014-06-12T16:08:39.542000".
I have a large number of records in the datastore (thousands) and I would like to delete the rows before a given date in "recvTime". My first thought was doing it using the REST API with the datastore_delete operation using a range filter, but it is not possible as described in the following Q&A.
Is there any other way of solving the issue, please?
Given that I have access to the host where the CKAN server is running, I wonder if this could be achieved by executing a regular SQL statement on the PostgreSQL engine where the datastore is persisted. However, I haven't found information about manipulating the underlying CKAN data model in the CKAN documentation, so I don't know if this is a good idea or if it is risky...
Any workaround or information pointer is highly welcome. Thanks!
You could definitely do this directly on the underlying database if you were willing to dig in there (the structure is pretty simple with tables named after the corresponding resource id). You could even turn this into an API of your own using an extension (though you'd want to be careful about permissions).
You might also be interested in the new support (master only atm) for extending the DataStore API via a plugin in an extension - see https://github.com/ckan/ckan/pull/1725

How to access results of Sonar metrics for use with applications like PowerPivot

I'm trying to run a number of applications with known failure rates through Sonar, with hopes of deciding which metrics are most valuable in determining whether a particular application will fail. Ultimately I'll be making some sort of algorithm that will look at the outputs of whatever metrics I'm using and generate a score from 1 to 100. I've got about 21 applications put through Sonar, and the results have been stored in a MySQL database. I originally planned to use PowerPivot to find relationships in the data, but it seems like the formatting of the tables doesn't lend itself well to that. Other questions on Stack Overflow have told me that Sonar's tables are unformatted, and that I should instead use the Web Service API to get the information. I'm unfamiliar with the API and was unsuccessful in trying to do what I wanted by looking at Sonar's API documentation.
From an answer to another question:
http://nemo.sonarsource.org/api/timemachine?resource=org.apache.cxf:cxf&format=csv&metrics=ncloc,violations_density,comment_lines_density,public_documented_api_density,duplicated_lines_density,blocker_violations,critical_violations,major_violations,minor_violations
This looks very similar to what I'd like to have, except I'm only looking at each application once (I'm analyzing a sample of all the live applications on a grid), which means Timemachine isn't really what I'm looking for. Would it be possible to generate a similar table, except instead of the stats for a particular application per date, it showed the statistics for an application and all of its classes, etc?
If you're not familiar with the WS API, you can also create your own Sonar plugin to achieve whatever you want: it is written in Java and it will execute on every analysis you run. This way, in the code of this custom plugin, you can do whatever you want: flush the metrics you need to an output file, push them into a third-party system, etc.
Just take a look at how to write a plugin (most probably you will create a Decorator). There are also concrete examples to help you get started faster.

Ajax autocomplete extender populated from SQL

OK, first let me state that I have never used this control and this is also my first attempt at using a web service.
My dilemma is as follows. I need to query a database to get back a certain column and use that for my autocomplete. Obviously I don't want the query to run every time a user types another word in the textbox, so my best guess is to run the query once then use that dataset, array, list or whatever to then filter for the autocomplete extender...
I am kinda lost. Any suggestions?
Why not keep track of the query executed by the user in a session variable, then use that to filter any further results?
The trick to preventing the database from overloading I think is really to just limit how frequently the auto updater is allowed to update, something like once per 2 seconds seems reasonable to me.
What I would do is this: Store the current list returned by the query for word A server side and tie that to a session variable. This should be basically the entire list I would think. Then, for each new word typed, so long as the original word A exists, you can filter the session info and spit the filtered results out without having to query again. So basically, only query again when word A changes.
I'm using "session" in a PHP sense, you may be using a different language with different terminology, but the concept should be the same.
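The "only query again when word A changes" idea above can be sketched roughly as follows. The class name PrefixCache and the run_query callback are hypothetical stand-ins for whatever session storage and database call you actually use, and the sketch assumes the query returns the full list of matches for the prefix (not a truncated top-N), since otherwise in-memory filtering could miss rows.

```python
class PrefixCache:
    """Keeps the rows of the last real query and filters them in memory.

    One instance would live per user session. `run_query(prefix)` is a
    placeholder for the real database call and must return every row
    that starts with `prefix`.
    """

    def __init__(self, run_query):
        self._run_query = run_query
        self._base = None   # the prefix that was actually sent to the database
        self._rows = []     # rows returned for that prefix

    def suggest(self, text):
        text = text.lower()
        # Hit the database only when the typed text no longer extends
        # the prefix whose results are already cached.
        if self._base is None or not text.startswith(self._base):
            self._base = text
            self._rows = self._run_query(text)
        # Otherwise narrow the cached rows without querying again.
        return [row for row in self._rows if row.lower().startswith(text)]
```

Typing "al", then "ala", then "alab" would trigger a single query for "al"; clearing the box and typing "te" would trigger a second one.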
This question depends upon how transactional your data store is. Obviously, if you are looking for US states (a data collection that realistically would not change through the life of the application), then I would cache either a System.Collections.Generic.List<> or, if you prefer, a DataTable.
You could easily set up a cache of the data you wish to query that is dependent upon an XML file or database, so that your extender always queries the data object cast from the cache and the cache object is only updated when the data source changes.
RAM is cheap and SQL is harder to scale than IIS, so cache everything in memory:
your entire data source, if it is not too large to load in reasonable time,
precalculated data,
autocomplete webservice responses.
Depending on your desired autocomplete behavior and performance, you may want to precalculate data and create redundant structures optimized for reading. Make use of structures like SortedList (when you need something like 'select top x ... where z like #query+'%'), Hashtable,...
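To illustrate why a sorted structure suits the "top x where z like prefix%" case, here is a small sketch using a plain sorted Python list and binary search in place of .NET's SortedList. The function name top_matches and the limit default are illustrative choices, and matching is case-sensitive for simplicity.

```python
from bisect import bisect_left


def top_matches(sorted_words, prefix, limit=10):
    """Return up to `limit` entries of `sorted_words` starting with `prefix`.

    `sorted_words` must already be sorted. In a sorted list, every word
    that starts with `prefix` sits in one contiguous run beginning at the
    first position whose word is >= `prefix`.
    """
    start = bisect_left(sorted_words, prefix)
    matches = []
    for word in sorted_words[start:start + limit]:
        if not word.startswith(prefix):
            break  # left the contiguous run of matches
        matches.append(word)
    return matches
```

The lookup costs one binary search plus at most `limit` comparisons, regardless of how large the cached list is.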
While caching everything is certainly a good idea, your question about which data structure to use is an issue that wasn't fully answered here.
The best data structure for an autocomplete extender is a Trie.
You can find a good .NET article and code here.
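For readers who have not met the structure, a minimal trie sketch follows. It is not the code from the linked article; the class and method names are illustrative, and it stores each character as a dictionary key with an empty-string key marking the end of a word.

```python
_END = ""  # end-of-word marker; no single character equals the empty string


class Trie:
    """Prefix tree: each node is a dict mapping a character to a child node."""

    def __init__(self, words=()):
        self._root = {}
        for word in words:
            self.insert(word)

    def insert(self, word):
        node = self._root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = word  # store the full word at its final node

    def complete(self, prefix, limit=10):
        """Return up to `limit` stored words starting with `prefix`, sorted."""
        node = self._root
        for ch in prefix:
            node = node.get(ch)
            if node is None:
                return []  # no stored word has this prefix
        results = []
        stack = [node]
        while stack and len(results) < limit:
            current = stack.pop()
            # Push children in reverse order so the smallest character is
            # popped first; a word ending at this node is emitted right away.
            for key in sorted(current, reverse=True):
                if key == _END:
                    results.append(current[key])
                else:
                    stack.append(current[key])
        return results
```

Walking down to the prefix node takes one step per typed character, so lookup time depends on the length of the prefix and the number of results rather than on the total number of cached entries.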