Is Keyed Node meant to be used with Lazy? - elm

I am reading about optimization in the Elm Guide. It talks about keyed nodes, using US Presidents as an example:
import Html exposing (..)
import Html.Keyed as Keyed
import Html.Lazy exposing (lazy)

viewPresidents : List President -> Html msg
viewPresidents presidents =
    Keyed.node "ul" [] (List.map viewKeyedPresident presidents)

viewKeyedPresident : President -> (String, Html msg)
viewKeyedPresident president =
    ( president.name, lazy viewPresident president )

viewPresident : President -> Html msg
viewPresident president =
    li [] [ ... ]
The guide then gives this explanation:
Now the Virtual DOM implementation can recognize when the list is resorted. It first matches all the presidents up by key. Then it diffs those. We used lazy for each entry, so we can skip all that work. Nice! It then figures out how to shuffle the DOM nodes to show things in the order you want. So the keyed version does a lot less work in the end.
My confusion is this: If I don't use lazy inside the keyed nodes, the Virtual DOM still has to diff every entry of the list, even if it can match some keys. It seems keyed nodes' usefulness really depends on the lazy inside. Is my understanding correct?

Let's consider an example:
name: Apple, price: $3.2, pic: 🍏
name: Banana, price: $2, pic: 🍌
name: Orange, price: $2.8, pic: 🍊
Now let's imagine that the user sorts by price:
name: Banana, price: $2, pic: 🍌
name: Orange, price: $2.8, pic: 🍊
name: Apple, price: $3.2, pic: 🍏
Without keyed nodes, the diffing is going to look like this (old value → new value):
name: Apple → Banana, price: $3.2 → $2, pic: 🍏 → 🍌
name: Banana → Orange, price: $2 → $2.8, pic: 🍌 → 🍊
name: Orange → Apple, price: $2.8 → $3.2, pic: 🍊 → 🍏
which in this example is going to issue 9 replaceElement operations along with 9 createTextElement operations (the exact semantics might work slightly differently, but I think the point stands).
The keyed version will understand that the order changed and will issue a single removeChild and appendChild for the Apple node.
Hence all the performance savings are on the DOM side. And this is not just about performance: if those rows contained input elements and your cursor was in the Apple input, the keyed version would keep your cursor in the Apple input after the sort, whereas without keys it would end up in the Banana input.
You are correct that without lazy the diffing still happens, but the diffing is generally the cheap part; the more expensive part is actually patching the DOM, which is what keyed helps avoid.
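To make this concrete, here is a minimal sketch of the fruit list as a keyed view without lazy (the Fruit record is made up for illustration). Even without lazy, the keys let the renderer recognize the re-sort and move the existing DOM nodes instead of rewriting their text:
import Html exposing (Html, li, text)
import Html.Keyed as Keyed

type alias Fruit =
    { name : String, price : Float, pic : String }

viewFruits : List Fruit -> Html msg
viewFruits fruits =
    -- Keyed by name: on a re-sort, existing <li> nodes are moved, not rewritten
    Keyed.node "ul" [] (List.map viewKeyedFruit fruits)

viewKeyedFruit : Fruit -> ( String, Html msg )
viewKeyedFruit fruit =
    ( fruit.name, viewFruit fruit )

viewFruit : Fruit -> Html msg
viewFruit fruit =
    li [] [ text (fruit.pic ++ " " ++ fruit.name ++ ", $" ++ String.fromFloat fruit.price) ]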

Related

How to approach graph modeling using Cypher - should I use property or node?

I have books and authors. In SQL, I would have two tables and then I would create relations between them.
How does this work in the graph world? Should books and authors be separate nodes or are authors just additional node properties?
I've come up with the following code, but I'm not sure if it is redundant: I added the author both as a property on the Book nodes and as a relationship to separate Author nodes.
CREATE
(b1:Book {title: "The Catcher in the Rye", author: "J.D. Salinger"}),
(b2:Book {title: "The Great Gatsby", author: "F. Scott Fitzgerald"}),
(b3:Book {title: "The Old Man and the Sea", author: "Ernest Hemingway"}),
(b4:Book {title: "For Whom The Bell Tolls", author: "Ernest Hemingway"}),
(a1:Author {name: "J.D. Salinger"}),
(a2:Author {name: "F. Scott Fitzgerald"}),
(a3:Author {name: "Ernest Hemingway"}),
(a1)-[:WROTE]->(b1),
(a2)-[:WROTE]->(b2),
(a3)-[:WROTE]->(b3),
(a3)-[:WROTE]->(b4)
Is adding authors to Book nodes redundant?
The answer to this question in some ways is "it depends".
I think in general, you would start by not including the author names as properties, so you would just build a structure along the lines of:
(:Author {name: "x"})-[:WROTE]->(:Book {title: "y"})
I have seen cases where it makes sense to store the information also as a property to avoid additional dereferences, but in general I would start with the structure above and only resort to having the information in multiple places if some very good reason arises.
The reasons tend to be unique to particular implementations. In terms of general graph data modeling, I would start with the simple "Author wrote book" structure. That also has the advantage that you only have to maintain one copy of the information (in this case, the edge between the author and the book).
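With that structure the author is always one hop away, so queries never need the duplicated property. A minimal sketch of the lookup:
// Fetch every book together with its author via the relationship;
// no author property on Book is needed
MATCH (a:Author)-[:WROTE]->(b:Book)
RETURN a.name AS author, b.title AS title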

Product attributes db structure for e-commerce

Backstory:
I'm building an e-commerce web app (online store).
Now I've got to the point of choosing a database system and an appropriate design.
I'm stuck on a design for product attributes.
I've been considering NoSQL (MongoDB) and SQL database systems.
I need your advice and help.
The problem:
When you choose a product type (e.g. table), it should show you the corresponding filters for that type (e.g. height, material, etc.). When you choose another type, say car, it provides you with car-specific filter attributes (e.g. fuel, engine volume).
For example, on one popular online store, if you choose a data storage type you get a filter for this type's attributes, such as hard drive size or connection type.
Question
What approach is best for such a problem? I described some below, but maybe you have your own thoughts about it.
MongoDB
Possible solution:
You can implement such a product attribute structure pretty easily.
You can create one collection with a field attrs for each product and put whatever you want there, as they suggest here (the "details" field):
https://docs.mongodb.com/ecosystem/use-cases/product-catalog/#non-relational-data-model
The structure will be a single products collection where each document carries its own attrs subdocument.
Problem:
With such a solution you don't have product types at all, so you can't filter products by type. Each product contains its own arbitrary structure in the attrs field and doesn't follow any pattern.
Or maybe I can somehow still go with this approach?
SQL
There are solutions like a single table, where all the products are stored in one table and you end up with as many columns as the total number of attributes across all products taken together.
Or, for every product type you create a new table.
But I won't consider these: one is very bulky, and the other isn't very flexible and requires dynamic schema changes.
Possible solution
There is one pretty flexible solution called EAV https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
Our schema would be the classic EAV layout (entity, attribute, and value tables).
Such a design could also be done on a MongoDB system, but I'm not sure it's made for such a normalised structure.
Problem
The schema is going to get really huge and really hard to query and grasp
If you choose an SQL database, take a look at PostgreSQL, which supports JSON features; you don't necessarily need to follow strict database normalization.
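As a hedged sketch of that idea (the table and column names are illustrative, not from the question):
CREATE TABLE products (
    id    serial PRIMARY KEY,
    type  text   NOT NULL,              -- e.g. 'table', 'car'
    attrs jsonb  NOT NULL DEFAULT '{}'  -- flexible per-type attributes
);

-- GIN index keeps containment queries on attrs fast
CREATE INDEX products_attrs_idx ON products USING gin (attrs);

-- e.g. filter tables by material
SELECT * FROM products
WHERE type = 'table' AND attrs @> '{"material": "wood"}';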
If you choose MongoDB, you can store an attrs array of generic {key: "field", value: "value"} pairs:
{id: 1, attrs: [{key: "prime", value: true}, {key: "height", value: 2}, {key: "material", value: "wood"}, {key: "color", value: "brown"}]}
{id: 2, attrs: [{key: "prime", value: true}, {key: "fuel", value: "gas"}, {key: "volume", value: 3}]}
{id: 3, attrs: [{key: "prime", value: true}, {key: "fuel", value: "diesel"}, {key: "volume", value: 1.5}]}
Then you define a multikey index on it like this:
db.collection.createIndex({"attrs.key":1, "attrs.value":1})
If you want to apply step-by-step filters, use MongoDB aggregation with the $elemMatch operator:
☑ Prime
☑ Fuel
☐ Other
...
☑ Volume 3
☐ Volume 1.5
The query's representation:
db.collection.aggregate([
  {
    $match: {
      $and: [
        { attrs: { $elemMatch: { key: "prime", value: true } } },
        { attrs: { $elemMatch: { key: "fuel" } } },
        { attrs: { $elemMatch: { key: "volume", value: 3 } } }
      ]
    }
  }
])
MongoPlayground

Should the response body of GET all parent resource return a list of child resource?

Please bear with me if the title is a bit confusing, I will try my best to explain my question below.
Say I have the following two endpoints
api/companies (returns a list of all companies like below)
[{name: "company1", id: 1}, {name: "company2", id: 2}]
api/companies/{companyId}/employees (returns a list of all employees for a specific company like below)
[{name: "employee1", id: 1}, {name: "employee2", id: 2}]
What the client side needs is a list of companies, each of which has a list of employees. The result should look like this:
[
{
name: "company1",
id: 1,
employees: [ {name: "employee1", id: 1}, {name: "employee2", id: 2} ]
},
{
name: "company2",
id: 2,
employees: [ {name: "employee3", id: 3}, {name: "employee4", id: 4} ]
},
]
There are two ways I can think of to do this:
Get a list of companies first, then loop through the company list and make an API call for each company to get its list of employees. (I'm wondering if this is the better design because of the HATEOAS principle, if I understand it correctly: the smallest unit of resource of api/companies is a company, not employees, so the client is expected to discover companies as the available resource, but not employees.)
a REST client should then be able to use server-provided links dynamically to discover all the available actions and resources it needs
Return a list of employees inside each company object, and then return the list of companies through api/companies. Maybe add a query parameter to this endpoint called responseHasEmployees, a boolean defaulting to false, so that when the user makes a GET request to api/companies?responseHasEmployees=true, the response body will include a list of employees inside each company object.
So my question is: which is the better way to achieve the client side's goal? (It doesn't necessarily have to be one of the above two.)
Extra info that might be helpful: companies and employees are stored in different tables, and the employees table has a company_fk column.
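(For context, the nested result could be produced server-side with a join along these lines; this is just a sketch against the schema described above:)
SELECT c.id   AS company_id,
       c.name AS company_name,
       e.id   AS employee_id,
       e.name AS employee_name
FROM companies c
LEFT JOIN employees e ON e.company_fk = c.id
ORDER BY c.id;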
Start by asking yourself a couple of questions:
Is this a common scenario?
Is it logical to request data in this way?
If so, it might make sense to make data available in this way.
Next, do you already have API calls that pass variables implemented? Based on your HATEOAS principle, you probably shouldn't: clients shouldn't need to know, or understand, variable values in your URL.
If you don't, stay away from them. Make it as clean for the client side as possible. You could make a third distinct API, "api/companiesWithEmployees". This fits your HATEOAS principle: the client doesn't need to know anything about parameters or other workings of the API, only that they will get "companies with employees".
Also, the cost is minimal: an additional method in the code base. It's simpler for the client side at a low cost.
Next think about some of the developmental consequences:
Are you opening the door to more specific api requests?
Are you able to maintain a hard line on data you want accessible through the api?
Are you able to maintain your HATEOAS principle in that the clients know everything they need to know based on the api url?
Next incorporate scenarios like this into future api design:
Can you preemptively make similar API calls available? (e.g. for Customers and Orders, would you simply make a single API call available that gets the two related to each other?)
Ultimately, my answer to your question would be to go ahead and make this a new API call. The overhead for setting up, testing, and maintaining this particular change seems extremely small, and the likelihood of data being requested in this way appears high.
I assume that the client you are building is going to have an interface to view a list of companies, with an option to view the employees of a company. If so, it is best to pull data on demand rather than load everything at once.
If you can consider a property of your resource to be a sub-resource, do not add the whole sub-resource's data to the main resource API. You may instead include a referral link that the client can use to fetch the sub-resource data, as sketched below.
Here, in your case,
Main-Resource - Companies
Sub-Resource - Employees
Company name, contact number, and address are properties of the company object, not sub-resources, whereas employees can very well be considered a sub-resource.
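For illustration, a company item in the api/companies response might then carry a referral link like this (the links field name is just an example, not a prescribed format):
{
  "name": "company1",
  "id": 1,
  "links": {
    "employees": "/api/companies/1/employees"
  }
}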

Ravendb document design

In our Raven-based application we are starting to experience major performance issues as the master document grows in size, since it holds a lot of collections that keep growing. I am now planning a major data redesign that is likely to take months, and I want to be sure I'm on the right track before I start.
The current design looks like this:
Community
{
    id: ...,
    name: ...,
    // other properties
    members: [
        {
            id: ...,
            name: ...,
            dateOfBirth: ...
            // etc.
        },
        {
            // another member; this list could potentially grow to hundreds of thousands
        }
    ],
    league: [
        {
            id: ...,
            name: ...,
            seasons: [
                { ... },
                {
                    id: ...,
                    divisions: [
                        {
                            id: ...,
                            name: ...,
                            matches: [
                                {
                                    id: ...,
                                    // match details
                                },
                                {
                                    // another match; there could be hundreds here in a big league
                                },
                                { ... }
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}
As we started hitting performance issues, we began using transformers to load only what is needed, but that didn't solve the problem fully, as some of our leagues are a couple of MBs on their own. The other issue is that we always need to do member checks for admin/membership rights, so the members list is always needed.
I understand I could omit the members list completely using a transformer and use an index for membership checks, but the problem remains: when a member is added, that list will need to be loaded, and with an upcoming project there is the potential for it to grow to half a million people or more.
So my plan is to separate each entity into its own document. In the case of leagues, I will have a league document and a match document, with matches containing {leagueId, season number, division number, other match details}.
Each member will have their own document with a list of the community document ids they're a member of.
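Concretely, the separated documents would look roughly like this (ids and field names are illustrative):
// League document
{ "id": "leagues/1", "communityId": "communities/1", "name": "..." }

// Match document
{ "id": "matches/1", "leagueId": "leagues/1", "seasonNumber": 2, "divisionNumber": 1, ... }

// Member document
{ "id": "members/1", "name": "...", "communityIds": [ "communities/1", "communities/2" ] }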
I'm just a bit worried that this design misses the whole point of a document DB and that we may as well have used SQL. Or do you think I'm on the right track with this approach?

Improving rendering performance with Jbuilder and Rails 3

The app I'm working on responds to most requests with JSON objects or collections thereof. We're using Jbuilder to construct those responses. The amount of data rendered is fairly large (several thousand objects in various nested structures - once formatted and fully expanded, there are as many as 10,000 lines of JSON for a typical response). This rendering is taking a significant amount of time - about 1/3 of the total request time, according to NewRelic.
I'm looking for some kind of guide, set of tips, or other resource that will help me make sure I'm getting the best possible performance out of JBuilder. I'm also curious if there are performance comparisons available for Jbuilder vs. RABL or other similar tools.
Edit: I've found a GitHub Issue that complains about Jbuilder performance, but the only actual suggestion anyone's made is 'don't use Jbuilder'. Well, actually, they used slightly stronger language, but there's still no word on why Jbuilder is so slow, what, if anything, can be done to get around it, or how other tools for the same task compare.
jbuilder builds up a big hash containing your data and then uses ActiveSupport::JSON to turn it into JSON. There are faster JSON emitters, as the following micro-benchmark shows (make sure you have the multi_json and yajl-ruby gems installed):
require 'benchmark'
require 'active_support'
require 'multi_json'
sample = {
  menu: {
    header: "SVG Viewer",
    items: [
      {id: "Open"},
      {id: "OpenNew", label: "Open New"},
      nil,
      {id: "ZoomIn", label: "Zoom In"},
      {id: "ZoomOut", label: "Zoom Out"},
      {id: "OriginalView", label: "Original View"},
      nil,
      {id: "Quality"},
      {id: "Pause"},
      {id: "Mute"},
      nil,
      {id: "Find", label: "Find..."},
      {id: "FindAgain", label: "Find Again"},
      {id: "Copy"},
      {id: "CopyAgain", label: "Copy Again"},
      {id: "CopySVG", label: "Copy SVG"},
      {id: "ViewSVG", label: "View SVG"},
      {id: "ViewSource", label: "View Source"},
      {id: "SaveAs", label: "Save As"},
      nil,
      {id: "Help"},
      {id: "About", label: "About Adobe CVG Viewer..."}
    ]
  }
}

MultiJson.engine = :yajl

Benchmark.bmbm(5) do |x|
  x.report 'activesupport' do
    1000.times { ActiveSupport::JSON.encode(sample) }
  end
  x.report 'yajl' do
    1000.times { MultiJson.encode(sample) }
  end
end
On my machine this produces
                    user     system      total        real
activesupport   1.050000   0.010000   1.060000 (  1.068426)
yajl            0.020000   0.000000   0.020000 (  0.021169)
i.e. to encode the sample object 1000 times, ActiveSupport took a hair over 1 second, while MultiJson (using the yajl engine) took 21 ms.
JBuilder is hardcoded to use ActiveSupport::JSON, but MultiJson (a gem that lets you switch between JSON libraries) is a trivial drop-in and is already a dependency of ActiveSupport - see my fork of jbuilder. I've opened a pull request, but until then you could try using this fork (or create your own - it's a one-line change).
Consider switching to Rabl and adding some caching. Given you have thousands of objects in nested structures, some nodes of your resulting JSON can be rendered as partials and cached - the performance gain can be huge.
Apart from this, Rabl's performance is slightly better than JBuilder's, but I find Rabl's syntax sometimes confusing, and I'd switch back to JBuilder once it has fragment caching implemented.
As stated before, JBuilder builds a hash, then serializes that hash to JSON.
The same goes for caching: the cached hashes get merged into the main hash, which still needs to be converted to JSON.
My solution was TurboStreamer. TurboStreamer outputs directly to an IO/stream/string, thereby skipping the serialization step that JBuilder requires (and at first glance this also applies to Rabl, and to to_json depending on usage).
For us this has significantly reduced render time and GC time (since we no longer build up the intermediate hash as in jbuilder), and it allows us to start streaming JSON out to the client as we get our results. The downside is that TurboStreamer is a little more verbose and explicit.
Performance Test A (no caching involved):
source
results
Performance Test B (mostly all caching):
source
results