Aerospike operations on list/map

Does the Aerospike Go client support operations such as add/remove on a list/map in a bin directly from the client, i.e. without doing a get and then a put?
aql> select * from test_ns.test_set where PK='12345678'
+---------------------------+--------------------------+
| map_bin                   | list_bin                 |
+---------------------------+--------------------------+
| MAP('{22370:1, 23471:1}') | LIST('[22370, 1234543]') |
+---------------------------+--------------------------+
In the above example, I want to add an entry to the list in list_bin or add an entry to the map in map_bin.
I know we can use UDFs for that, but can I do it directly from the Aerospike client without writing a UDF, since UDF operations are costly?
P.S. I am using the Aerospike Go client.

The Go client for Aerospike, like all other clients, supports list and map API operations such as list append (ListAppendOp). In most clients there are wrapper methods for these, but you can always use the Operate() method to execute multiple operations on a single record atomically, including list and map operations.
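For example, a minimal sketch with the Go client, assuming a local server and the bin names from the question (the appended values are made up):

package main

import (
	"log"

	as "github.com/aerospike/aerospike-client-go"
)

func main() {
	client, err := as.NewClient("127.0.0.1", 3000)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	key, err := as.NewKey("test_ns", "test_set", "12345678")
	if err != nil {
		log.Fatal(err)
	}

	// Append to list_bin and put a new entry into map_bin in a single
	// atomic Operate() call: no get/modify/put round trip, no UDF.
	rec, err := client.Operate(nil, key,
		as.ListAppendOp("list_bin", 99999),
		as.MapPutOp(as.DefaultMapPolicy(), "map_bin", 24571, 1),
	)
	if err != nil {
		log.Fatal(err)
	}
	log.Println(rec.Bins)
}

Both operations are applied under the same record lock on the server, so no other client can observe the record between the two updates.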

Related

Lock Redis key when two processes are accessing the same key

Application 1 sets a value in Redis.
We have two instances of application 2 running, and we would like only one instance to read this value from Redis (please note that application 2 takes around 30 seconds to 1 minute to process the data).
Can instance 1 of application 2 acquire a lock on the Redis key created by application 1, so that instance 2 of application 2 will not read it and do the same operation?
No, there's no concept of a record lock in Redis. If you need to achieve some sort of locking, you have to use another set of data structures to mimic that behavior. For example:
List: you can use a list and then POP the item from the list (see the sketch below), or...
Redis Stream: use a Redis Stream with a consumer group, so that each consumer in your group only sees a portion of the whole data that needs to be processed, and it is guaranteed that when an item is delivered to one consumer it will not be delivered to another one.
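A minimal sketch of the list approach, shown here with the Go go-redis client purely for illustration (the list key app1:work and the client setup are made up):

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// LPOP is atomic: if both instances of application 2 race on the same
	// list, only one receives the item; the other gets redis.Nil and knows
	// the work has already been claimed.
	val, err := rdb.LPop(ctx, "app1:work").Result()
	if err == redis.Nil {
		fmt.Println("another instance already claimed the item")
		return
	} else if err != nil {
		panic(err)
	}
	fmt.Println("processing:", val)
}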

Airflow: BigQueryOperator vs BigQuery Quotas and Limits

Is there any practical way to control quotas and limits in Airflow?
I'm especially interested in controlling BigQuery concurrency.
There are different levels of quotas on BigQuery, so according to the operator's inputs there should be a way to check whether the conditions are met, and otherwise wait for them to be fulfilled.
It seems to be a composition of Sensor-Operators, querying against a database like Redis, for example:
QuotaSensor(Project, Dataset, Table, Query) >> QuotaAddOperator(Project, Dataset, Table, Query)
QuotaAddOperator(Project, Dataset, Table, Query) >> BigQueryOperator(Project, Dataset, Table, Query)
BigQueryOperator(Project, Dataset, Table, Query) >> QuotaSubOperator(Project, Dataset, Table, Query)
The Sensor must check conditions like:
- Global running queries <= 300
- Project running queries <= 100
- .. etc
Is there any lib that already does that for me? A plugin perhaps?
Or any other easier solution?
Otherwise, following the Sensor-Operator approach: how can I encapsulate all of it under a single operator, to avoid repetition of code? E.g. a single QuotaBigQueryOperator.
Currently, it is only possible to get the Compute Engine quotas programmatically. However, there is an open feature request to get/set other project quotas via API. You can post there about the specific case you would like to have implemented, and follow it to track it and ask for updates.
Meanwhile, as a workaround, you can try to use the PythonOperator. With it you can define your own custom code, and you would be able to implement retries for the queries you send that get a quotaExceeded error (or whatever specific error you are getting). That way you wouldn't have to explicitly check the quota levels - you just run the queries and retry until they get executed. Here is simplified code for the strategy I have in mind:
for query in QUERIES_TO_RUN:
    while True:
        try:
            run(query)
        except quotaExceededException:
            continue  # Jumps to the next cycle of the nearest enclosing loop.
        break

Best practice to store List<T> in StackExchange.Redis

I am trying to find the best-practice (most efficient) way of storing a set of List objects against a ReportingDate key.
The list could be serialised as XML/DataContract or ProtoBuf...
And given that some of the data could be big (for that slice of key):
Is there any way of getting data from the Redis cache in an IEnumerable/streamed fashion? At the moment we use protobuf-net for a file-based cache, and we retrieve data into memory in a streamed fashion (we also have the option of selecting which props/fields we want in that T object, as ProtoBuf allows us to do that).
Is there any way to force (after some inactivity) certain parts of the data to be offloaded from memory back into the file when not being used, but loaded up again when called for?
Thanks
It sounds like you want a sorted set - see https://redis.io/topics/data-types#sorted-sets. You would use the date as the score, perhaps in epoch time (since it needs to be a number). SE.Redis supports all the operations you would expect for getting ranges of values (either positional ranges - the first 20 records, etc. - or absolute ranges based on the score - all items between two dates expressed in the same unit). Look at the methods starting with "SortedSet...".
The value can be binary, so protobuf-net is fine (you would serialize the value for each date separately). Just pass a byte[] as the value. You need to handle serialization separately from the redis library.
As for swapping data out: no. Redis has date-based expiration, but doesn't have hot and cold storage. It is either there, or it isn't. You could use scheduled tasks to purge or move data based on date ranges, again using any of the Z* (redis) or SortedSet* (SE.Redis) methods.
For the complete list of Z* operations, see: https://redis.io/commands#sorted_set. They should all be available in SE.Redis.
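A minimal sketch of the idea, shown here with the Go go-redis client purely for illustration (in SE.Redis the equivalent calls are SortedSetAdd and SortedSetRangeByScore; key names and payloads are made up):

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// Member = serialized payload (protobuf bytes are fine),
	// score = reporting date in epoch seconds.
	day := time.Date(2019, 3, 1, 0, 0, 0, 0, time.UTC)
	rdb.ZAdd(ctx, "reports", redis.Z{
		Score:  float64(day.Unix()),
		Member: "serialized-report-bytes", // hypothetical payload
	})

	// Absolute range: all items between two dates, same unit as the score.
	from := time.Date(2019, 1, 1, 0, 0, 0, 0, time.UTC).Unix()
	to := time.Date(2019, 12, 31, 0, 0, 0, 0, time.UTC).Unix()
	items, err := rdb.ZRangeByScore(ctx, "reports", &redis.ZRangeBy{
		Min: fmt.Sprint(from),
		Max: fmt.Sprint(to),
	}).Result()
	if err != nil {
		panic(err)
	}
	fmt.Println(items)
}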

Adding up values(aggregating) with Redis

Using the latest Redis, is there a way to aggregate values with Redis?
Eg.
set (or hset or sadd) a 3.55
set b 7.66
set c 13.32
etc.
How do I get a + b + c?
I have searched online everywhere and have no idea how to do this.
If I can't do it (easily), Redis is the wrong place for me to do what I am trying to do.
Many thanks.
No - this isn't a part of Redis at the moment.
For ad-hoc aggregates you should use a Lua script for that - see the EVAL command - or bring the data to the application and aggregate it there.
If you know what your aggregates need to be, an alternative is just to keep them updated with the rest of the data (i.e. with every write operation).
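A minimal sketch of the EVAL approach, shown here with the Go go-redis client (key names match the question; the script is illustrative and assumes all keys hold numeric strings):

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

// sumScript adds up the string values stored at the given keys on the
// server, so a + b + c is computed in a single round trip.
const sumScript = `
local total = 0
for _, v in ipairs(redis.call('MGET', unpack(KEYS))) do
  total = total + tonumber(v)
end
return tostring(total) -- numeric replies are truncated to integers, so return a string
`

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	rdb.Set(ctx, "a", 3.55, 0)
	rdb.Set(ctx, "b", 7.66, 0)
	rdb.Set(ctx, "c", 13.32, 0)

	sum, err := rdb.Eval(ctx, sumScript, []string{"a", "b", "c"}).Result()
	if err != nil {
		panic(err)
	}
	fmt.Println(sum) // "24.53"
}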

Datamodel design for an application using Redis

I am new to Redis and I am trying to figure out how Redis can be used.
So please let me know if this is the right way to build an application.
I am building an application which has only one data source. I am planning to run a nightly job to get the data into a file.
Now I have a front end application, that needs to render this data in different formats.
Example application use case
Download applications processed by a university on a nightly basis.
Display how many applications got approved or rejected.
Display number of applications by state.
Let user search for an application by application id.
Instead of using a relational database like Postgres/MySQL, I am thinking about using Redis. I am planning to store the data in the following ways:
Application id -> Application details
State -> List of application ids
Approved -> List of application ids (By date ?)
Declined -> List of application ids (By date ?)
Is this the correct way to store the data in Redis?
Also, if someone queries for all applications in California for a certain date,
I will be able to pull the application IDs in one call, but to get the details of each application, do I need to make another request?
Word of caution:
Instead of using postgres/mysql like relational database, I am thinking about using redis.
Why? Redis is an amazing database, but don't use the right hammer for the wrong nail. Use Redis if you need real-time performance at scale, but don't try to make it replace an RDBMS if that's what you need.
Answer:
Fetching data efficiently from Redis to answer your queries depends on how you store it. Therefore, to determine the "correct" data model, you first need to define your queries. The data model you proposed is just a description of the data - it doesn't really say how you're planning to store it in Redis. Without more details about the queries, I would store the data as follows (a write sketch follows the list):
Store the application details in a Hash (e.g. app:<id>)
Store the application IDs per state in a Set (e.g. apps:<state>)
Store the approved/rejected applications in two Sorted Sets, the id being the member and the date being the score
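A minimal sketch of those writes, shown here with the Go go-redis client (key names, ids, and field values are illustrative):

package main

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// Application details in a Hash keyed by id.
	rdb.HSet(ctx, "app:1001", "state", "CA", "status", "approved")

	// Per-state index as a Set.
	rdb.SAdd(ctx, "apps:CA", "1001")

	// Approved applications in a Sorted Set:
	// member = application id, score = approval date.
	approvedAt := time.Date(2019, 3, 1, 0, 0, 0, 0, time.UTC)
	rdb.ZAdd(ctx, "apps:approved", redis.Z{
		Score:  float64(approvedAt.Unix()),
		Member: "1001",
	})
}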
Also, if someone queries for all applications in California for a certain date, I will be able to pull the application IDs in one call, but to get the details of each application, do I need to make another request?
Again, that depends on the data model, but you can use Lua scripts to embed this logic and execute it in one call to the database.
First of all, you can use a Hash to store structured data. With Sorted Sets (ZSETs) and Sets you can create indexes for ordered or unordered access. (Depending on your requirements, of course - make a list of how you want to access your data.)
It is possible to get all the data of an index as JSON in one go with a simple Redis script (example using an unordered Set):
-- convert HGETALL's flat [key1, value1, key2, value2, ...] reply
-- into a proper Lua table
local bulkToTable = function(bulk)
    local retTable = {};
    for index = 1, #bulk, 2 do
        local key = bulk[index];
        local value = bulk[index+1];
        retTable[key] = value;
    end
    return retTable;
end

-- fetch every id in the index, then the Hash of details for each id
local functionSet = redis.call("SMEMBERS", "app:functions")
local returnObj = {};
for index = 1, #functionSet, 1 do
    returnObj[index] = bulkToTable(redis.call("HGETALL", "app:function:" .. functionSet[index]));
    returnObj[index]["functionId"] = functionSet[index];
end
return cjson.encode(returnObj);
For more information about Redis scripts, see: http://www.redisgreen.net/blog/intro-to-lua-for-redis-programmers/