Product attributes db structure for e-commerce - sql

Backstory:
I'm building an e-commerce web app (online store)
Now I got to the point of choosing a database system and an appropriate design.
I got stuck with developing a design for product attributes
I've been considering of choosing NoSQL (MongoDB) or SQL database systems
I need you advice and help
The problem:
When you choose a product type (e.g. table) it should show you the corresponding filters for such a type (e.g. height, material etc.). When you choose another type, say "car", it provides you with the car specific filter attributes (e.g. fuel, engine volume)
For example, here on one popular online store if you choose a data storage type you get a filter fo this type attributes, such as hard drive size or connection type
Question
What approach is the best for such a problem? I described some below, but maybe you have your own thoughts in regard to it
MongoDB
Possible solution:
You can implement such product attrs structure pretty easy.
You can create one collection with a field attrs for each product and put there whatever you want, like they suggest here (field "details"):
https://docs.mongodb.com/ecosystem/use-cases/product-catalog/#non-relational-data-model
The structure will be
Problem:
With such a solution you don't have product types at all so you can't filter the products out by their types. Each product contains it's own arbitrary structure in attrs field and don't follow any pattern
Ir maybe I can somehow go with this approach?
SQL
There are solutions like single table where all the products store in one table and you end up with as many fields as an attribute number of all the products taken together.
Or for every product type you create a new table
But I won't consider these ones. One is very bulky and another one isn't much flexible and requires a dynamic scheme design
Possible solution
There is one pretty flexible solution called EAV https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model
Our schema would be:
EAV
Such a design may be done on MongoDB system, but I'm not sure it's been made for such a normalised structure
Problem
The schema is going to get really huge and really hard to query and grasp

If you choose SQL database, take a look PostgreSQL which supports JSON features. Not necessarily you need to follow Database normalization.
If you choose MongoDB, you need to store attrs array with generic {key:"field", value:"value"} pairs.
{id:1, attrs:[{key: "prime", value: true}, {key:"height", value:2}, {key:"material", value:"wood"},{key:"color", "value":"brown"}]}
{id:2, attrs:[{key: "prime", value: true}, {key:"fuel", value:"gas"}, {key:"volume", "value":3}]}
{id:3, attrs:[{key: "prime", value: true}, {key:"fuel", value:"diesel"}, {key:"volume", "value":1.5}]}
Then you define Multi-key index like this:
db.collection.createIndex({"attrs.key":1, "attrs.value":1})
If you want apply step-by-step filters, use MongoDB aggregation with $elemMatch operator
☑ Prime
☑ Fuel
☐ Other
...
☑ Volume 3
☐ Volume 1.5
Query's representation
db.collection.aggregate([
{
$match: {
$and: [
{
attrs: {
$elemMatch: {
key: "prime",
value: true
}
}
},
{
attrs: {
$elemMatch: {
key: "fuel"
}
}
},
{
attrs: {
$elemMatch: {
key: "volume",
"value": 3
}
}
}
]
}
}
])
MongoPlayground

Related

Error trying to reorder items within another list in Keystone 6

I'm using KeystoneJS v6. I'm trying to enable functionality which allow me to reorder the placement of images when used in another list. Currently i'm setting up the image list below, however I'm unable to set the defaultIsOrderable to true due to the error pasted.
KeystoneJS list:
Image: list({
fields: {
title: text({
validation: { isRequired: true },
isIndexed: 'unique',
isFilterable: true,
isOrderable: true,
}),
images: cloudinaryImage({
cloudinary: {
cloudName: process.env.CLOUDINARY_CLOUD_NAME,
apiKey: process.env.CLOUDINARY_API_KEY,
apiSecret: process.env.CLOUDINARY_API_SECRET,
folder: process.env.CLOUDINARY_API_FOLDER,
},
}),
},
defaultIsOrderable: true
}),
Error message:
The expected type comes from property 'defaultIsOrderable' which is declared here on type 'ListConfig<BaseListTypeInfo, BaseFields<BaseListTypeInfo>>'
Peeking at the definition of the field shows
defaultIsOrderable?: false | ((args: FilterOrderArgs<ListTypeInfo>) => MaybePromise<boolean>);
Looking at the schema API docs, the defaultIsOrderable lets you set:
[...] the default value to use for isOrderable for fields on this list.
You're trying to set this to true but, according to the relevant section of the field docs, the isOrderable field option already defaults to true.
I believe this is why the defaultIsOrderable type doesn't allow you to supply the true literal – doing so would be redundant.
So that explains the specific error your getting but I think you also may have misunderstood the purpose of the orderBy option.
The OrderBy Option
The field docs mention the two effects the field OrderBy option has:
If true (default), the GraphQL API and Admin UI will support ordering by this field.
Take, for example, your Image list above.
As the title field is "orderable", it is included in the list's orderBy GraphQL type (ImageOrderByInput).
When querying the list, you can order the results by the values in this field, like this:
query {
images (orderBy: [{ title: desc }]) {
id
title
images { publicUrl }
}
}
The GraphQL API docs have some details on this.
You can also use the field to order items when listing them in the Admin UI, either by clicking the column heading or selecting the field from the "sort" dropdown:
Note though, these features order items at runtime, by the values stored in orderable fields.
They don't allow an admin to "re-order" items in the Admin UI (unless you did so by changing the image titles in this case).
Specifying an Order
If you want to set the order of items within a list you'd need to store separate values in, for example, a displayOrder field like this:
Image: list({
fields: {
title: text({
validation: { isRequired: true },
isIndexed: 'unique',
isFilterable: true,
}),
displayOrder: integer(),
// ...
},
}),
Unfortunately Keystone doesn't yet give you a great way to manage this the Admin UI (ie. you can't "drag and drop" in the list view or anything like that). You need to edit each item individually to set the displayOrder values.
Ordering Within a Relationship
I notice your question says you're trying to "reorder the placement of images when used in another list" (emphasis mine).
In this case you're talking about relationships, which changes the problem somewhat. Some approaches are..
If the relationship is one-to-many, you can use the displayOrder: integer() solution shown above but the UX is worse again. You're still setting the order values against each item but not in the context of the relationship. However, querying based on these order values and setting them via the GraphQL API should be fairly straight forward.
If the relationship is many-to-many, it's similar but you can't store the "displayOrder" value in the Image list as any one image may be linked to multiple other items. You need to store the order info "with" the relationship itself. It's not trivial but my recent answer on storing additional values on a many-to-many relationship may point you in the right direction.
A third option is to not use the relationship field at all but to link items using the inline relationships functionality of the document field. This is a bit different to work with - easier to manage from the Admin UI but less powerful in GraphQL as you can't traverse the relationship as easily. However it does give you a way to manage a small, ordered set of related items in a many-to-many relationship.
You can save an ordered set of ids to a json field. This is similar to using a document field but a more manual.
Hopefully that clears up what's possible with the current "orderBy" functionality and relationship options. Which of these solutions is most appropriate depends heavily on the specifics of your project and use case.
Note too, there are plans to extend Keystone's functionality for sorting and reordering lists from both the DX and UX perspectives.
See "Sortable lists" on the Keystone roadmap.

How to achieve generic Audit.NET json data processing?

I am using Audit.Net library to log EntityFramework actions into a database (currently everything into one AuditEventLogs table, where the JsonData column stores the data in the following Json format:
{
"EventType":"MyDbContext:test_database",
"StartDate":"2021-06-24T12:11:59.4578873Z",
"EndDate":"2021-06-24T12:11:59.4862278Z",
"Duration":28,
"EntityFrameworkEvent":{
"Database":"test_database",
"Entries":[
{
"Table":"Offices",
"Name":"Office",
"Action":"Update",
"PrimaryKey":{
"Id":"40b5egc7-46ca-429b-86cb-3b0781d360c8"
},
"Changes":[
{
"ColumnName":"Address",
"OriginalValue":"test_address",
"NewValue":"test_address"
},
{
"ColumnName":"Contact",
"OriginalValue":"test_contact",
"NewValue":"test_contact"
},
{
"ColumnName":"Email",
"OriginalValue":"test_email",
"NewValue":"test_email2"
},
{
"ColumnName":"Name",
"OriginalValue":"test_name",
"NewValue":"test_name"
},
{
"ColumnName":"OfficeSector",
"OriginalValue":1,
"NewValue":1
},
{
"ColumnName":"PhoneNumber",
"OriginalValue":"test_phoneNumber",
"NewValue":"test_phoneNumber"
}
],
"ColumnValues":{
"Id":"40b5egc7-46ca-429b-86cb-3b0781d360c8",
"Address":"test_address",
"Contact":"test_contact",
"Email":"test_email2",
"Name":"test_name",
"OfficeSector":1,
"PhoneNumber":"test_phoneNumber"
},
"Valid":true
}
],
"Result":1,
"Success":true
}
}
Me and my team has a main aspect to achieve:
Being able to create a search page where administrators are able to tell
who changed
what did they change
when did the change happen
They can give a time period, to reduce the number of audit records, and the interesting part comes here:
There should be an input text field which should let them search in the values of the "ColumnValues" section.
The problems I encountered:
Even if I map the Json structure into relational rows, I am unable to search in every column, with keeping the genericity.
If I don't map, I could search in the Json string with LIKE mssql function but on the order of a few 100,000 records it takes an eternity for the query to finish so it is probably not the way.
Keeping the genericity would be important, so we don't need to modify the audit search page every time when we create or modify a new entity.
I only know MSSQL, but is it possible that storing the audit logs in a document oriented database like cosmosDB (or anything else, it was just an example) would solve my problem? Or can I reach the desired behaviour using relational database like MSSQL?
Looks like you're asking for an opinion, in that case I would strongly recommend a document oriented DB.
CosmosDB could be a great option since it supports SQL queries.
There is an extension to log to CosmosDB from Audit.NET: Audit.AzureCosmos
A sample query:
SELECT c.EventType, e.Table, e.Action, ch.ColumnName, ch.OriginalValue, ch.NewValue
FROM c
JOIN e IN c.EntityFrameworkEvent.Entries
JOIN ch IN e.Changes
WHERE ch.ColumnName = "Address" AND ch.OriginalValue = "test_address"
Here is a nice post with lot of examples of complex SQL queries on CosmosDB

Ravendb document design

In our raven based application we are starting to experience major performance issues when the master document starts increasing in size, as it holds a lot of collections that keep growing. As such I am now planning a major data redesign that is likely to take months and I want to be sure I'm on the right track before I do so.
The current design looks like this:
Community
{
id:,
name,
//other properties
members[
{
id:,
name:,
date of birth:
//etc
},
{
//another member, this list could potentially grow to hundreds of thousands
}
],
league
[
{
id,
name,
seasons[
{...},
{
id:,
divisions
[
{
id:,
name:
matches[
{
id:,
//match details
},
{
//another match. there could be hundreds here in a big league
},
{}
As we started hitting performance issues, we started using transformers to only load what is needed, but that didn't solve the problem fully as some of our leagues are a couple mb's just on their own. The other issue is we always need to be doing member checks to check for admin/membership rights so the members list is always needed.
I understand I could omit the member list completely using a transformer and use an index for membership checks, but the problem remains about what to do, when a member is added , that list will need to be loaded and with an upcoming project there is a potential for it to grow to half a million people or more.
So my plan is separate each entity into it's own document, so in the case of leagues, I will have a league document, and a match document, with matches containing {leagueId, season number, division number, other match details}.
Each member will have their own document with a list of community document Id's they're a member of.
I'm just a bit worried, that using this design, is missing the whole point of a document db and we may as well have used sql, or do you think I'm on the right track with approach?

How to construct intersection in REST Hypermedia API?

This question is language independent. Let's not worry about frameworks or implementation, let's just say everything can be implemented and let's look at REST API in an abstract way. In other words: I'm building a framework right now and I didn't see any solution to this problem anywhere.
Question
How one can construct REST URL endpoint for intersection of two independent REST paths which return collections? Short example: How to intersect /users/1/comments and /companies/6/comments?
Constraint
All endpoints should return single data model entity or collection of entities.
Imho this is a very reasonable constraint and all examples of Hypermedia APIs look like this, even in draft-kelly-json-hal-07.
If you think this is an invalid constraint or you know a better way please let me know.
Example
So let's say we have an application which has three data types: products, categories and companies. Each company can add some products to their profile page. While adding the product they must attach a category to the product. For example we can access this kind of data like this:
GET /categories will return collection of all categories
GET /categories/9 will return category of id 9
GET /categories/9/products will return all products inside category of id 9
GET /companies/7/products will return all products added to profile page of company of id 7
I've omitted _links hypermedia part on purpose because it is straightforward, for example / gives _links to /categories and /companies etc. We just need to remember that by using hypermedia we are traversing relations graph.
How to write URL that will return: all products that are from company(7) and are of category(9)? In otherwords how to intersect /categories/9/products and /companies/7/products?
Assuming that all endpoints should represent data model resource or collection of them I believe this is a fundamental problem of REST Hypermedia API, because in traversing hypermedia api we are traversing relational graph going down one path so it is impossible to describe such intersection because it is a cross-section of two independent graph paths.
In other words I think we cannot represent two independent paths with only one path. Normally we traverse one path like A->B->C, but if we have X->Y and Z->Y and we want all Ys that come from X and Z then we have a problem.
So far my proposition is to use query strings: /categories/9/products?intersect=/companies/9 but can we do better?
Why do I want this?
Because I'm building a framework which will auto-generate REST Hypermedia API based on SQL database relations. You could think of it as a trans compiler of URLs to SELECT ... JOIN ... WHERE queries, but the client of the API only sees Hypermedia and the client would like to have a nice way of doing intersections, like in the example.
I don't think you should always look at REST as database representation, this case looks more of a kind of specific functionality to me. I think I'd go with something like this:
/intersection/comments?company=9&product=5
I've been digging after I wrote it and this is what I've found (http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api):
Sometimes you really have no way to map the action to a sensible RESTful structure. For example, a multi-resource search doesn't really make sense to be applied to a specific resource's endpoint. In this case, /search would make the most sense even though it isn't a resource. This is OK - just do what's right from the perspective of the API consumer and make sure it's documented clearly to avoid confusion.
What You want to do is to filter products in one of the categories ... so following Your example if we have:
GET /categories/9/products
Above will return all products in category 9, so to filter out products for company 7 I would use something like this
GET /categories/9/products?company=7
You should treat URI as link to fetch all data (just like simple select query in SQL) and query parameters as where, limit, desc etc.
Using this approach You can build complex and readable queries fe.
GET /categories/9/products?company=7&order=name,asc&offset=10&limit=20
All endpoints should return single data model entity or collection of
entities.
This is NOT a REST constraint. If you want to read about REST constraints, then read the Fielding dissertation.
Because I'm building a framework which will auto-generate REST
Hypermedia API based on SQL database relations.
This is a wrong approach and has nothing to do with REST.
By REST you describe possible resource state transitions (or operation call templates) by sending hyperlinks in the response. These hyperlinks consist of a HTTP methods and URIs (and other data which is not relevant now) if you build the uniform interface using the HTTP and URI standards, and we usually do so. The URIs are not (necessarily) database entity and collection identifiers and if you apply such a constraint you will end up with a CRUD API, not with a REST API.
If you cannot describe an operation with the combination of HTTP methods and already existing resources, then you need a new resource.
In your case you want to aggregate the GET /users/1/comments and GET /companies/6/comments responses, so you need to define a link with GET and a third resource:
GET /comments/?users=1&companies=6
GET /intersection/users:1/companies:6/comments
GET /intersection/users/1/companies/6/comments
etc...
RESTful architecture is about returning resources that contain hypermedia controls that offer state transitions. What i see here is a multistep process of state transitions. Let's assume you have a root resource and somehow navigate over to /categories/9/products using the available hypermedia controls. I'd bet the results would look something like this in hal:
{
_links : {
self : { href : "/categories/9/products"}
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
If you want your client to be able to intersect this with another collection you need to provide to them the mechanism to perform this. You have to give them a hypermedia control. HAL only has links, templated links, and embedded as control types. let's go with links..change the response to:
{
_links : {
self : { href : "/categories/9/products"},
x:intersect-with : [
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 1",
title : "Company 6 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 2",
title : "Company 5 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 3",
title : "Company 7 products"
}
]
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
Now the client just picks the right hypermedia control (aka link) based on the title field of the link.
That's the simplest solution. But you'll probably say there's 1000's of companies i don't want 1000's of links...well ok if that;s REALLY the case...you just offer a state transition in the middle of the two we have:
{
_links : {
self : { href : "/categories/9/products"},
x:intersect-options : { href : "URL to a Paged collection of all intersect options"},
x:intersect-with : [
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 1",
title : "Company 6 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 2",
title : "Company 5 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 3",
title : "Company 7 products"
}
]
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
See what i did there? an extra control for an extra state transition. JUST LIKE YOU WOULD DO IF YOU HAD A WEBPAGE. You'd probably put it in a pop up, well that's what the client of your app can do too with the result of that control.
It's really that simple...just think how you'd do it in HTML and do the same.
The big benefit here is that the client NEVER EVER needed to know a company or category id or ever plug that in to some template. The id's are implementation details, the client never knows they exist, they just executed Hypermedia controls..and that is RESTful.

ElasticSearch mapping for nested enumerable objects (i18n)

I'm at a loss as to how to map a document for search with the following structure:
{
"_id": "007ff234cb2248",
"ids": {
"source1": "123",
"source2": "456",
"source3": "789"
}
"names": [
{"en":"Example"},
{"fr":"exemple"},
{"es":"ejemplo"},
{"de":"Beispiel"}
],
"children" : [
{
"ids": {
"source1": "CXXIII",
"source2": "CDLVI",
"source3": "DCCLXXXIX",
}
names: [
{"en":"Example Child"},
{"fr":"exemple enfant"},
{"es":"Ejemplo niño"},
{"de":"Beispiel Kindes"}
]
}
],
"relatives": {
// Typically no "ids" at this level.
"relation": 'uncle',
"children": [
{
"ids": {
"source1": "0x7B",
"source2": "0x1C8",
"source3": "0x315"
},
"names": [
{"en":"Example Cousin"},
{"fr":"exemple cousine"},
{"es":"Ejemplo primo"},
{"de":"Beispiel Cousin"}
]
}
]
}
}
The child object may appear in the children section directly, or further nested in my document as uncle.children (cousins, in this case). The IDs field is common to levels one (the root), level two (the children and the uncle), and to level three (the cousins), the naming structure is also common to levels one and three.
My use-case is to be able to search for IDs (nested objects) by prefix, and by the whole ID. And also to be able to search for child names, following an (as yet undefined) set of analyzer rules.
I haven't been able to find a way to map these in any useful way. I don't believe I'll have much success using the same technique for ids and names, as there's an extra level of mapping between names and the document root.
I'm not even certain that it is even mappable. I believe at least in principle that the ids should be mappable as terms, and perhaps that if I index the names as terms in some way, too.
I'm simply at a loss, and the documentation doesn't seem to cover anything like this level of complex mapping.
I have limited (read: no) control of the document as it's coming from the CouchDB river, and the upstream application already relies on this format, so I can't really change it.
I'm looking for being able to search by the following pseudo conditions, all of which should match:
ID: "123"
ID by source (I don't know how best to mark this up in pseudo language)
ID prefix: "CDL"
Name: "Example", "Example Child"
Localized name (I don't even know how best to pseudo-mark this up!
The specifics of tokenising and analysis I can figure out for myself, when I at least know how to map
Objects when both the key and the value of the object properties are important
Enumerable objects when the key and value are important.
If the mapping from an ID to its children is 1-to-many, then you could store the children's names in a child field, as a field can have multiple values. Each document would then have an ID field, possibly a relation field, and zero or more child fields.