Calculate number of matching properties in Redis - redis

I'd like to see if a migration of a dataset from PostgreSQL to Redis has a positive influence on a particular search query. Unfortunately, I don't really know how to organize the keys and values.
What I want is for users to be able to supply a list of properties and for the application to deliver a list of items in ascending order of the number of their properties that were not entered.
For example:
Item #1 { prop1, prop2, prop4, prop7 }
Item #2 { prop7, prop8 }
Item #3 { prop2, prop3, prop5 }
Query: "prop1 prop3 prop4 prop5"
Result: Item #3, Item #1, Item #2
What I have come up with so far:
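#!/usr/bin/python
import redis
r = redis.Redis()  # assumes a Redis server on localhost:6379
properties = (1, 3, 4, 5)
# One sorted set per property, holding the items that have it (score 1 each)
items = ["Properties:%s:items" % prop for prop in properties]
r.zunionstore("query:related_items", items)
# Items:all scores each item by its property count; weight -1 subtracts the matches
r.zinterstore("query:result", {"Items:all": 1, "query:related_items": -1})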
The union builds a sorted set of the Items that are connected with the user-entered Properties; every member of the per-property sets has a score of 1. Afterwards, an intersection with the sorted set of all Items (where each value's score is the number of its Properties) is calculated. The weights are set to create a score of 0 if all Properties of an Item are supplied in the query.
With about 600.000 Items, this query takes approximately 4-6 seconds. Is there a better way to accomplish this?
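To make the intended scoring concrete, here is a minimal in-memory sketch of what the two commands above compute, using the three example items. The dictionary and the function name `rank_by_unmatched` are hypothetical stand-ins for the sorted sets, not part of the original code; it assumes the union sums the per-property scores (the ZUNIONSTORE default) and that the intersection drops items that match no queried property.

```python
# Hypothetical in-memory stand-ins for the Redis sorted sets described above.
items = {
    "Item #1": {"prop1", "prop2", "prop4", "prop7"},
    "Item #2": {"prop7", "prop8"},
    "Item #3": {"prop2", "prop3", "prop5"},
}
query = {"prop1", "prop3", "prop4", "prop5"}


def rank_by_unmatched(items, query):
    """Return (item, unmatched_count) pairs, fewest unmatched properties first.

    Models a summing union (number of matched properties per item) followed by
    an intersection with weights 1 and -1 (total properties minus matches).
    Items that match nothing are skipped, as an intersection would skip them.
    """
    scored = []
    for name, props in items.items():
        matched = len(props & query)
        if matched == 0:
            continue  # not present in the union, so absent from the intersection
        scored.append((name, len(props) - matched))
    # Sort by unmatched count, then by name to keep ties deterministic.
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


result = rank_by_unmatched(items, query)
print(result)
```

Under these assumptions Item #2 does not appear in the result at all, because it shares no property with the query; if it should be listed last, it would have to be added back separately.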

I imagine you're looking for a Python solution, but the Ohm library for Ruby is my favorite of the Redis-based database analogues. Considering the similarities between Python and Ruby and Ohm's exceptional documentation, you might find some inspiration.

EDIT: Used real properties as stated in comments.
I think I did it (once more). I used PHPRedis.
I used sorted sets too, but I inverted your schema: each zset represents an ingredient, and each recipe id is a member of that zset. So every zset has the same number of members, i.e., every recipe in the application. A recipe either uses an ingredient or it does not, and that defines the score (1 or 0).
Loading is somewhat expensive, but the query finishes in under 3 s for a sample with 12 ingredients and 600.000 recipes (you've got a lot of them for sure!).
Loading
Pseudo-code:
For every ingredient i on the system
    For every recipe j on the system
        If recipe j uses the ingredient i Then
            score = 1
            INCR recipe:j:ing_count //Will help sorting
            RPUSH recipe:j:ing_list i //For listing all ingredients in recipe
        Else
            score = 0
        End If
        ZADD ing:i score j
    End For
End For
Code:
#!/usr/bin/php
<?php
### Total of ingredients
define('NUM_OF_ING', 12);
### Total of recipes
define('NUM_OF_RECIPES', 600000);
$redis = new \Redis();
$redis->connect('localhost');
for ($ing = 1; $ing <= NUM_OF_ING; $ing++) {
    for ($recipe = 1; $recipe <= NUM_OF_RECIPES; $recipe++) {
        // Random sample data: each recipe uses each ingredient with probability 1/2
        $score = rand() % 2;
        if ($score == 1) {
            $redis->incr("recipe:$recipe:ing_count");
            $redis->rpush("recipe:$recipe:ing_list", $ing);
        }
        $redis->zAdd("ing:$ing", $score, $recipe);
    }
}
echo "Done.\n";
?>
Querying
Before I paste the PHP code and the measured running time, let me make some observations:
Sorting is based on the number of queried ingredients a recipe uses (the sum of its scores across the zsets in the query). If two recipes use all the ingredients in the query, the tie is broken by the number of additional ingredients each recipe has: more ingredients, higher position.
The sum is handled by ZINTERSTORE. The zset with sums is stored in result.
Then a SORT command looks in the count key for each recipe, tailoring the order with this additional constraint.
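The ordering described in these observations can be sketched in plain Python. This is only a model of the description (sum of per-ingredient scores, then ingredient count as tie-break), not of how Redis executes SORT; the sample data and the names `build_ingredient_scores` and `rank` are hypothetical.

```python
# Hypothetical sample data: recipe id -> set of ingredient ids it uses.
recipes = {1: {2, 4, 5, 7}, 2: {2, 4, 5}, 3: {2, 9}, 4: {1, 3}}
ingredient_ids = range(1, 10)


def build_ingredient_scores(recipes, ingredient_ids):
    """Inverted schema: one mapping per ingredient, scoring every recipe 1 or 0."""
    return {
        ing: {rid: int(ing in used) for rid, used in recipes.items()}
        for ing in ingredient_ids
    }


def rank(recipes, scores, query):
    """Order recipes by summed score (descending), then by ingredient count."""
    totals = {rid: sum(scores[ing][rid] for ing in query) for rid in recipes}
    return sorted(
        recipes,
        key=lambda rid: (-totals[rid], -len(recipes[rid]), rid),
    )


scores = build_ingredient_scores(recipes, ingredient_ids)
ranked = rank(recipes, scores, query=[2, 4, 5])
print(ranked)
```

Recipes 1 and 2 both use all three queried ingredients; recipe 1 comes first because it has one more ingredient overall.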
Code:
#!/usr/bin/php
<?php
$redis = new \Redis();
$redis->connect('localhost');
//properties in query
$query = array('ing:2', 'ing:4', 'ing:5');
$weights = array(1, 1, 1);
//intersection
$redis->zInterStore('result', $query, $weights, 'sum');
//sorting
echo "Result:\n";
var_dump($redis->sort('result', array('by'=>'recipe:*:ing_count', 'sort'=>'desc', 'limit'=>array(0,10))));
echo "End.\n";
?>
Output and running time:
niloct@Impulse-Ubuntu:~$ time ./final2.php
Result:
array(10) {
[0]=>
string(4) "5230"
[1]=>
string(5) "79549"
[2]=>
string(4) "2871"
[3]=>
string(3) "336"
[4]=>
string(6) "109279"
[5]=>
string(4) "5352"
[6]=>
string(5) "16868"
[7]=>
string(3) "690"
[8]=>
string(4) "3174"
[9]=>
string(4) "8795"
}
End.
real 0m2.930s
user 0m0.016s
sys 0m0.004s
niloct@Impulse-Ubuntu:~$ redis-cli lrange recipe:5230:ing_list 0 -1
1) "12"
2) "11"
3) "10"
4) "9"
5) "8"
6) "7"
7) "6"
8) "5"
9) "4"
10) "3"
11) "2"
12) "1"
Hope that helps.
PS: Can you post your performance measures after trying this ?

Related

getProduct()->getTag() return null, when it should return tags associated to the Product

In my project, we have products that have a tag called serviceItem. When items with that tag are ordered, they should be split by quantity into individual orders.
The issue is that getTags() returns null, and getTagIds fails with "Call to a member function getTagIds() on null" when it gets to the next loop.
Is there a reason why getTags() returns null?
private function transformOrderLines(OrderEntity $order): array
{
/**
* TODO: If we need to send advanced prices,
* the price value of the lines array should be changed to calculate the advanced price
* with the built-in quantity calculator
*/
$lines = [];
foreach ($order->getLineItems() as $orderLine) {
$hasDsmServiceItemTag = $orderLine->getProduct()->getTags();
$lines[] = [
'name' => $orderLine->getLabel(),
'sku' => substr($orderLine->getProduct()->getProductNumber(), 0, 19),
'price' => (string) ($orderLine->getProduct()->getPrice()->first()->getNet()
* $order->getCurrencyFactor()), //gets original price, calculates factor
'quantity' => (string) $orderLine->getQuantity()
];
}
$shipping = $this->transformShipping($order);
if ($shipping) {
$lines = array_merge($lines, $shipping);
}
return $lines;
}
I also tried $orderLine->getProduct()->getTags()->getName(); it also returns "Call to a member function getTags() on null".
The problem is that wherever the $order is fetched from the DB, the orderLineItem.product.tag association is not included in the criteria.
For performance reasons, Shopware does not lazily load associations when you access them on entities; you have to define exactly which associations should be included when you fetch the entities from the database.
For the full explanation take a look at the docs.

Kotlin: Efficient way of sorting list using another list and alphabetical order

I want to sort the students list based on their popularity (this list is always sorted by their points) and then sort the ones that aren't on that list in alphabetical order
The two lists look like:
val students = listOf<Student>(
Student(id = 3, name ="mike"),
Student(id = 2,name ="mathew"),
Student(id = 1,name ="john"),
Student(id = 4,name ="alan")
)
val popularity = listOf<Ranking>(
Ranking(id= 2, points = 30),
Ranking(id= 3, points = 15)
)
The result I'm looking for would be:
[
Student(id=2,name="mathew"), <- first, because of popularity
Student(id=3,name="mike"),
Student(id=4,name="alan"), <- starting here by alphabetical order
Student(id=1,name="john")
]
If anyone knows about an efficient way of doing this I would kindly appreciate the help
Having the rankings as a list is suboptimal, because every lookup has to scan the list, which is slow if you have many rankings. If you do have a lot of them, I would suggest converting them to a map first. You can do that easily from the list using associate.
Then you can create a custom comparator to compare by popularity and then by name, and use it in sortedWith:
val studentIdToPopularity = popularity.associate { it.id to it.points }
val studentComparator = compareByDescending<Student> { studentIdToPopularity[it.id] ?: 0 }.thenBy { it.name }
val sortedStudents = students.sortedWith(studentComparator)
You can see the result here: https://pl.kotl.in/a1GZ736jZ
Student(id=2, name=mathew)
Student(id=3, name=mike)
Student(id=4, name=alan)
Student(id=1, name=john)

How to sort flatlist in React Native?

I have data stored in state that is shown in a FlatList, and I want to sort the data based on ratings. If I click the sort button, the items should be sorted in ascending order, and when I click again, they should be sorted in descending order.
I have an array of objects stored in state, below is just a piece of data that is important.
show_data_list = [{ ratings : { overall : 4, ...other stuff } } ]
Is there a way I could do it? I tried using the map function, which I thought sorts the array:
list.map((a,b) => a-b)
But how can I use this to sort my array of objects? I can't pass in two item.ratings.overall values as the parameters.
Thanks in advance for the help :)
You can use JavaScript's built-in sort function. You can give it a custom comparer function, which should return a negative value if the first item takes precedence, 0 if they are the same, and a positive value if the second item should take precedence. Note that sort reorders the array in place.
Ascending order:
show_data_list.sort((a, b) => { return a.ratings.overall - b.ratings.overall; })
Descending order:
show_data_list.sort((a, b) => { return b.ratings.overall - a.ratings.overall; })
This is how I solved it: I stored the data in a variable and then sorted it based on a condition.
let rating = this.state.show_data_list;
rating.sort(function(a,b) {
// return a number, not a boolean, so the comparer orders correctly
return a.ratings.Overall - b.ratings.Overall;
});
Try this.
Sort your list in ascending or descending order by name or another string value in the list:
const [productSortList, setProductSortList] = useState(productarray);
where productarray is your main array.
Ascending:
productarray.sort((a, b) => a.products_name.localeCompare(b.products_name))
Descending:
productarray.sort((a, b) => b.products_name.localeCompare(a.products_name))
You can then reset the state of the array you passed as data to the FlatList (copy it first, so that the state receives a new array reference).
Ascending:
setProductSortList(
[...productarray].sort((a, b) => a.products_name.localeCompare(b.products_name)),
);
Do the same for descending.

Raven query returns 0 results for collection contains

I have a basic schema
Post {
Labels: [
{ Text: "Mine" }
{ Text: "Incomplete" }
]
}
And I am querying raven, to ask for all posts with BOTH "Mine" and "Incomplete" labels.
queryable.Where(candidate => candidate.Labels.Any(label => label.Text == "Mine"))
.Where(candidate => candidate.Labels.Any(label => label.Text == "Incomplete"));
This results in a raven query (from Raven server console)
Query: (Labels,Text:Incomplete) AND (Labels,Text:Mine)
Time: 3 ms
Index: Temp/XWrlnFBeq8ENRd2SCCVqUQ==
Results: 0 returned out of 0 total.
Why is this? If I query for JUST containing "Incomplete", I get 1 result.
If I query for JUST containing "Mine", I get the same result. So WHY do I get 0 results when I query for them both?
EDIT:
Ok - so I got a little further. The 'automatically generated index' looks like this
from doc in docs.FeedAnnouncements
from docLabelsItem in ((IEnumerable<dynamic>)doc.Labels).DefaultIfEmpty()
select new { CreationDate = doc.CreationDate, Labels_Text = docLabelsItem.Text }
So, I THINK the query was basically testing the SAME label for 2 different values. Bad.
I changed it to this:
from doc in docs.FeedAnnouncements
from docLabelsItem1 in ((IEnumerable<dynamic>)doc.Labels).DefaultIfEmpty()
from docLabelsItem2 in ((IEnumerable<dynamic>)doc.Labels).DefaultIfEmpty()
select new { CreationDate = doc.CreationDate, Labels1_Text = docLabelsItem1.Text, Labels2_Text = docLabelsItem2.Text }
Now my query (in Raven Studio) Labels1_Text:Mine AND Labels2_Text:Incomplete WORKS!
But, how do I address these phantom fields (Labels1_Text and Labels2_Text) when querying from Linq?
Adam,
You got the reason right. The default index would generate 2 index entries, and your query is executing on a single index entry.
What you want is to either use intersection, or create your own index like this:
from doc in docs.FeedAnnouncements
select new { Labels_Text = doc.Labels.Select(x=>x.Text)}
That gives you the text of all the labels in a single index entry, which you can then query.
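The difference between the two index shapes can be illustrated with a small Python model. This is a simplified sketch of the idea (an AND query has to be satisfied by one index entry), not RavenDB's actual indexing machinery; the function names are hypothetical.

```python
# One document with two labels, as in the question.
doc = {"Labels": [{"Text": "Mine"}, {"Text": "Incomplete"}]}


def fanout_entries(doc):
    """Model of the auto-generated index: one index entry per label."""
    return [{"Labels_Text": [label["Text"]]} for label in doc["Labels"]]


def single_entry(doc):
    """Model of the custom index: one entry holding every label text."""
    return [{"Labels_Text": [label["Text"] for label in doc["Labels"]]}]


def matches(entries, required):
    """An AND query matches only if a single entry contains every term."""
    return any(
        all(term in entry["Labels_Text"] for term in required)
        for entry in entries
    )


both = ["Mine", "Incomplete"]
fanout_hit = matches(fanout_entries(doc), both)   # no single entry has both
single_hit = matches(single_entry(doc), both)     # the one entry has both
print(fanout_hit, single_hit)
```

In this model, querying for one label alone still matches the fanned-out entries, which is consistent with the single-label queries in the question returning a result.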

How to get count of all items in a criteria GORM query

I have this criteria query that gets 10 feature articles whose itemChannel objects are of type 4 and in the channel with id 1, i.e. get me the top 10 articles that are of type feature and in channel x.
def criteria = Feature.createCriteria()
list = criteria.list {
maxResults(params.max)
itemChannels {
eq ('itemType.id',(long)4)
eq ('channel.id',(long)1)
}
}
How do I get the total count efficiently? I have the articles for page 1, but I need the total number for pagination.
Thanks
I think I sorted this.
criteria = Feature.createCriteria()
count = criteria.get{
projections {
countDistinct('id')
}
itemChannels {
eq ('itemType.id',(long)4)
eq ('channel.id',(long)2)
}
}