composite entities in story version of wit.ai

I am trying to familiarize myself with the wit.ai story version using a pizza ordering example. As suggested, I used an all-inclusive example as the starting point:
"Hi I would like to order a large pan crust pepperoni pizza with medium fries and a small tomato juice and a bundt cake"
In the above example, I can see the need for the composite entities below:
pizza:{type:pepperoni,size:large,crust:pan}
sides:{type:fries,size:medium}
drink:{type:juice,subType:tomato,size:small}
dessert:{type:cake, subType: bundt}
How do I create a composite entity in the "Understanding" tab?
thanks
venu

Composite Entities are disabled in Stories for now. We are in the middle of an infrastructure update that should increase robustness considerably. This is a high-priority item, so as soon as that lands, we will release composite entities in Stories.


Single OR Grouped/Combined Items for Entries in a Database

I have two similar types of things that I want to point to from a certain field in a database. One of them is a combination of one or more of the other.
How should I design my database in this kind of situation?
In my current example I have (simple) food ingredients and (combined) food dishes, and I want either one of these things to be entries in a Meals/Eating table.
So a user can eat either a simple food like an apple or a complex food like an apple pie, which consists of 200g of apples, 100g of flour, 30g of sugar etc., at one point in time in a meal. I'm thinking of something like this:
Ingredients |IID| |Name| |Calories|
Dishes |DID| |Name| (|Calories|???)
Food Data |DID| |IID| |Amount|
.
Users |UID| |FirstName| |LastName| etc.
Meals |UID| |DID| |Date/Time| |Amount|
I find this really annoying though, because every single ingredient would need two (basically identical) entries to start with: one in the Ingredients table and one in the Dishes table, so that it could be paired up in a meal. Am I missing something here? Is there a way around this?
Also, I don't know if a dish should have its calories listed in the database. Storing the calories for a dish is rather redundant, because they could be calculated when making a query (by summing over its respective ingredients). But this seems quite inefficient, since the calculation would have to be done for every single query of a dish (and it would get worse after adding things like macros, nutritional values and price, which I left out here for simplicity).
Also, if I do store calories (and other things relating to food in general) for a dish, I could just have one single table in this scenario, like:
Food |FID| |Name| |Calories| (|Simple[bool]|?)
Food Data |FID| |FID| |Amount|
This would seem better in general. The Simple field would distinguish between simple ingredients and dishes, which I think is worth including so you don't have to search Food Data for every item.
But if I want to introduce dish-only data, then I would have to make some other table like:
DISH DATA |FID| |TimetoCook| |Presentation| etc. (which seems pretty weird/unintuitive to me)
.
So the question is: what is the best general practice in this kind of scenario?
Is it generally better to do extra calculations when querying rather than have redundant data in these kinds of situations?
Is there something I'm missing that would make this simpler/better in general?
I'm not sure this can be answered as generally as you would like, because the semantics and the use of the database should be taken into account. Even in the simple/complex food context of your example, either of the approaches you describe (ingredients/dishes/food_data or food/food_data/dish_data) can be right, depending on the specifics.
Let me get this out of the way first: I wouldn't look for a third approach. Any other thing I can think of would be semantically obscure, hell to maintain or a nightmare to query.
So your first concern is the semantics of the database. Your first approach seems more natural; most people will easily see the semantic distinction between ingredients and dishes. It is also the only option if the "ingredient" entity has another reason of existence besides being part of a dish, e.g. for managing orders of raw ingredients. If you choose to go with the second approach you will have to make sure that a) it fits your data and b) you choose your table names very very carefully.
For the second approach to "fit your data" semantically, simple dishes must fully fit the description: "dishes that don't have the extra dish_data". The [Simple] flag is also acceptable as a property of dish, though a real need for it can be a hint that you're off base with this approach. But if ingredients and dishes only partially overlap, i.e. if you have ingredients that cannot be dishes, or if they have different properties in general, then you are definitely off base. If you find yourself in need of enforcing business rules that would prevent a customer from ordering a serving of "flour"; if you raise questions like what to put under "calories" for the "pickles" (would it be the calories per 100 g for pickles-as-an-ingredient, or the calories per serving for pickles-as-a-side-dish?); if you find you have fields like "measuring unit" that are meaningless for dishes, then you're dealing with two separate entities (ingredients and dishes), not one entity (dish) with two subcategories (simple and complex). In that case, if you are only going to duplicate a tiny bit of information between the two tables and save yourself a lot of trouble and ambiguity, by all means do that.
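As a rough illustration of the first approach (separate ingredients and dishes, with dish calories computed at query time rather than stored), here is a minimal sketch using Python's built-in sqlite3 module. The column names, the "kcal per 100 g" unit and the calorie figures are assumptions made up for the example; only the apple pie amounts (200g, 100g, 30g) come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ingredients (
        iid      INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE,
        calories REAL NOT NULL            -- assumed unit: kcal per 100 g
    );
    CREATE TABLE dishes (
        did  INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE food_data (              -- which ingredient goes into which dish
        did    INTEGER NOT NULL REFERENCES dishes(did),
        iid    INTEGER NOT NULL REFERENCES ingredients(iid),
        amount REAL NOT NULL,             -- grams of the ingredient in the dish
        PRIMARY KEY (did, iid)
    );
""")

with conn:
    # Calorie values below are illustrative, not real nutritional data.
    conn.executemany(
        "INSERT INTO ingredients (iid, name, calories) VALUES (?, ?, ?)",
        [(1, "apple", 50.0), (2, "flour", 350.0), (3, "sugar", 400.0)],
    )
    conn.execute("INSERT INTO dishes (did, name) VALUES (1, 'apple pie')")
    conn.executemany(
        "INSERT INTO food_data (did, iid, amount) VALUES (?, ?, ?)",
        [(1, 1, 200.0), (1, 2, 100.0), (1, 3, 30.0)],
    )


def dish_calories(conn, dish_name):
    """Sum the calories of a dish from its ingredients at query time."""
    row = conn.execute(
        """
        SELECT SUM(i.calories * fd.amount / 100.0)
        FROM dishes d
        JOIN food_data fd ON fd.did = d.did
        JOIN ingredients i ON i.iid = fd.iid
        WHERE d.name = ?
        """,
        (dish_name,),
    ).fetchone()
    return row[0]


apple_pie_kcal = dish_calories(conn, "apple pie")
print(apple_pie_kcal)
```

Whether this join is cheap enough, or whether a stored (redundant) calories column on dishes is worth it, depends on the usage questions discussed in this answer.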
Your second concern is how the data will be used. Try to answer questions like: Are you going to be querying calories of dishes millions of times per second? Are the ingredients - and therefore the calories - of dishes going to stay the same forever? Will your customer or cook ever need to query what a dish is made of?
"Don't duplicate" and "don't store calculable values" are two rules that are as hard as design rules come. Even such rules, though, should sometimes be, if not really bent, then at least "critically adjusted", if that makes sense.
This is a question of understanding the context of your data.
I imagine meals can be simple (unprocessed) or be complex and consist of other meals. If I were to generate a database for meals and their calorific value I would not separate them.
meal | calorific value per 100g | glycemic index
apple | 12345 | 34234
apple-pie | 3233 | 32334
The other table you would join it with could be a meal composition for a specific person.
2020-02-27|Johny Doe | Breakfast |apple | 300 g
2020-02-27|Johny Doe | Breakfast |sausage| 150 g
2020-02-27|Johny Doe | Breakfast |apple-juice | 500 g
By joining the two tables you would learn how many calories Johny Doe ate and perhaps what the glycemic index was...
Then... it is not yet an SQL question, but a question of first understanding the process one would like to describe with SQL.
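The join described above could look roughly like the following sketch, again using Python's sqlite3 module. The table and column names (meal, eaten, kcal_per_100g and so on) and the calorie figures are assumptions for illustration; the log rows mirror the Johny Doe breakfast above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE meal (
        name          TEXT PRIMARY KEY,
        kcal_per_100g REAL NOT NULL       -- illustrative values only
    );
    CREATE TABLE eaten (                  -- what a person ate, and how much
        eaten_on TEXT NOT NULL,
        person   TEXT NOT NULL,
        occasion TEXT NOT NULL,
        meal     TEXT NOT NULL REFERENCES meal(name),
        grams    REAL NOT NULL
    );
""")

with conn:
    conn.executemany(
        "INSERT INTO meal (name, kcal_per_100g) VALUES (?, ?)",
        [("apple", 52.0), ("sausage", 300.0), ("apple-juice", 46.0)],
    )
    conn.executemany(
        "INSERT INTO eaten (eaten_on, person, occasion, meal, grams) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            ("2020-02-27", "Johny Doe", "Breakfast", "apple", 300.0),
            ("2020-02-27", "Johny Doe", "Breakfast", "sausage", 150.0),
            ("2020-02-27", "Johny Doe", "Breakfast", "apple-juice", 500.0),
        ],
    )


def calories_eaten(conn, person, day):
    """Total calories a person ate on one day, via a join of the two tables."""
    row = conn.execute(
        """
        SELECT SUM(m.kcal_per_100g * e.grams / 100.0)
        FROM eaten e
        JOIN meal m ON m.name = e.meal
        WHERE e.person = ? AND e.eaten_on = ?
        """,
        (person, day),
    ).fetchone()
    return row[0]


breakfast_kcal = calories_eaten(conn, "Johny Doe", "2020-02-27")
print(breakfast_kcal)
```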

Table design for random versus

I am developing an application that creates random "battles / versus" between two things of the same type. Let's say it's about cars and their features, for example:
There would be many groups of features: things related to safety, to comfort, etc.
Car A would have one safety feature, airbags; Car B would have ABS and air conditioning; and Car C would have heated seats.
Now I have to store a list of versus: airbags vs. ABS, heated seats vs. air conditioning. Note that I can't do airbags vs. heated seats.
I've come up with two ideas to make this work.
users
id | username
cars
id | name
groups
id | name
features
id | car_id | group_id | value
versus
First version:
id | user_id | group_id | car_a_id | car_b_id | winner_id
Second version:
id | user_id | feature_a_id | feature_b_id | winner_id
Now with the first version, I have to use car_a_id, car_b_id and group_id to fetch features, but that ensures I am not comparing features that are not in the same group. The thing is, if any feature gets deleted I will have an invalid versus and I won't know that until I actually fetch the features.
The second version solves that, since I can just add an ON DELETE CASCADE to my foreign keys. But now I have to make sure both features of a row are in the same group when fetching them (I can't rely on the list of versus actually being valid).
Now I don't like either of these solutions, I feel like I'm doing something completely wrong but I can't find out anything better.
Is there a better / simpler way to do that?
I commented above, but want to post a few solutions.
If a "versus" is a direct feature to feature comparison, you need to directly reference the features in the "versus." This is shown in your "Second version."
It sounds like the main concern is ensuring that both features in a "versus" are of the same group. You can accomplish this in a few different ways.
1. Eliminate the option for users to compare features in different groups via the UI or other code. For example, have drop-down boxes that only show the features in a single group when the user is selecting features to compare.
2. You could also try to use subqueries or functions in a PostgreSQL table constraint. I've never done something like this (nor would I recommend it), but it may be suitable for your specific application requirements. http://grokbase.com/t/postgresql/pgsql-general/052h6ybahr/checking-of-constraints-via-subqueries
3. You could store the group_id of both features in the "versus" table. This definitely violates the rules of normalization, but if you have no control over the calling code and need to ensure the groups do not conflict, you can create a simple constraint such that "feature1_group_id" == "feature2_group_id". It is not a robust method, and I wouldn't recommend it, but it is another option.
In summary, I think you need to coordinate with the UI to ensure that users cannot violate group membership constraints when comparing features (solution 1).
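For completeness, the option of storing both group ids in the "versus" table can be sketched as follows. This is only an illustration, run against SQLite through Python's sqlite3 module rather than PostgreSQL. The column names feature_a_group_id and feature_b_group_id are hypothetical, and the composite foreign keys are an addition not mentioned above: they keep the stored group ids from drifting away from the features table. The users, cars and groups tables are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE features (
        id       INTEGER PRIMARY KEY,
        group_id INTEGER NOT NULL,        -- 1 = safety, 2 = comfort (assumed)
        value    TEXT NOT NULL,
        UNIQUE (id, group_id)             -- target of the composite foreign keys
    );
    CREATE TABLE versus (
        id                 INTEGER PRIMARY KEY,
        feature_a_id       INTEGER NOT NULL,
        feature_a_group_id INTEGER NOT NULL,
        feature_b_id       INTEGER NOT NULL,
        feature_b_group_id INTEGER NOT NULL,
        FOREIGN KEY (feature_a_id, feature_a_group_id)
            REFERENCES features (id, group_id) ON DELETE CASCADE,
        FOREIGN KEY (feature_b_id, feature_b_group_id)
            REFERENCES features (id, group_id) ON DELETE CASCADE,
        CHECK (feature_a_group_id = feature_b_group_id)
    );
""")

with conn:
    conn.executemany(
        "INSERT INTO features (id, group_id, value) VALUES (?, ?, ?)",
        [(1, 1, "airbags"), (2, 1, "ABS"),
         (3, 2, "air conditioning"), (4, 2, "heated seats")],
    )


def add_versus(conn, feature_a_id, feature_b_id):
    """Try to store a versus; return False if the database rejects it."""
    groups = dict(conn.execute(
        "SELECT id, group_id FROM features WHERE id IN (?, ?)",
        (feature_a_id, feature_b_id),
    ))
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO versus (feature_a_id, feature_a_group_id, "
                "feature_b_id, feature_b_group_id) VALUES (?, ?, ?, ?)",
                (feature_a_id, groups.get(feature_a_id),
                 feature_b_id, groups.get(feature_b_id)),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def versus_count(conn):
    return conn.execute("SELECT COUNT(*) FROM versus").fetchone()[0]


same_group_ok = add_versus(conn, 1, 2)    # airbags vs. ABS
cross_group_ok = add_versus(conn, 1, 4)   # airbags vs. heated seats
count_before_delete = versus_count(conn)
with conn:
    conn.execute("DELETE FROM features WHERE id = 2")  # cascades to versus
count_after_delete = versus_count(conn)
print(same_group_ok, cross_group_ok, count_before_delete, count_after_delete)
```

The CHECK rejects the cross-group pair, and ON DELETE CASCADE removes a versus when one of its features is deleted, at the cost of the duplicated group columns.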

Structuring Categories in SQL Server 2008 R2

Hello, I looked at a few similar posts, but none match what I need to accomplish. I am trying to come up with a structure for categories using SQL Server 2008 R2.
I want to make categories for, let's say, Clothing, Electronics, Furniture, Tools, and so on.
To start with, I am looking at a three-field category table (categoryID (PK), categoryname, parentID), which from what I am finding is standard practice and can go several layers deep without having to restructure.
This works fine for, let's say, (electronics-cd players-cd changer), (electronics-lighting-studio lighting) or (clothing-womens-skirts), (clothing-womens-pants), and perhaps one level deeper. The problems start beyond that.
What do I do for brands? I was planning to have a brand table (brandID(PK),Brand)
then Category_Brand table (categoryID, BrandID) to link brands to categories when I want to use a cascading dropdown list that populates from the database.
What do I do for deeper attributes, where the attributes apply to the item itself but depend on the category? Color, pattern, material and size can apply to clothing, but not to electronics or tools; also, men's clothing has different sizing than women's clothing.
Or furniture, where I want to store dresser dimensions and color, or beds, where I want to store the bed size (king, queen, twin) and the type (spring, air, foam, water)?
What I need is to connect the item-specific attributes to each item based on the category the item belongs to. On another forum it was suggested that I just add all the miscellaneous attributes to the item table and leave the ones I don't use null. That doesn't make sense to me; it seems there should be different sub-attribute tables with fields related to the categories they represent. I am thinking that clothing size, for example, would have a lookup table where each size has a (sizeid), plus a link table for a many-to-many relationship connecting the size with the (itemid). There would need to be a few different size tables, because men's and women's sizes are different, or they could all go in one table with the (categoryid) as a sort of parent foreign key. Would dimensions for another item, like (length, width, height), be stored in their own table with the (itemid) as the foreign key?
Or is it a good idea to store the (sizeid) or (dimensionid) right into the item table?
This seemed simple to me when I started, but the more I look at it, the more confused I get about the correct way to structure this. I want it to perform well, as this may become a high-volume application. But doesn't everyone wish that?
Try to understand normalization first. Here is a good article for you.
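As a starting point, the three-field category table from the question can be sketched like this. It is a minimal illustration using SQLite through Python's sqlite3 module, not SQL Server; the snake_case column names and sample rows are assumptions. SQL Server also supports recursive common table expressions, written without the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (
        category_id   INTEGER PRIMARY KEY,
        category_name TEXT NOT NULL,
        parent_id     INTEGER REFERENCES category(category_id)  -- NULL for a root
    );
    INSERT INTO category (category_id, category_name, parent_id) VALUES
        (1, 'electronics', NULL),
        (2, 'cd players', 1),
        (3, 'cd changer', 2),
        (4, 'clothing', NULL),
        (5, 'womens', 4),
        (6, 'skirts', 5);
""")


def category_path(conn, category_id):
    """Walk from a category up to its root and return the path as text."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(category_id, category_name, parent_id, depth) AS (
            SELECT category_id, category_name, parent_id, 0
            FROM category
            WHERE category_id = ?
            UNION ALL
            SELECT c.category_id, c.category_name, c.parent_id, chain.depth + 1
            FROM category c
            JOIN chain ON c.category_id = chain.parent_id
        )
        SELECT category_name FROM chain ORDER BY depth DESC
        """,
        (category_id,),
    ).fetchall()
    return "-".join(name for (name,) in rows)


changer_path = category_path(conn, 3)
skirts_path = category_path(conn, 6)
print(changer_path)
print(skirts_path)
```

This only covers the category hierarchy; how to attach category-specific attributes such as sizes or dimensions is the normalization question the answer points to.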

Should I Merge This Database data into one table?

I want to store some product data in my database. At first I thought of having a product table and a product info table, but I am not sure if I should just merge it all into one table.
Example
Coke - 355 ml
Product.Name = Coke
ProductInfo.Size = 355
ProductInfo.UnitType = ml
Coke - 1 Liter
Product.Name = Coke (would not be duplicated...just for illustration purposes)
ProductInfo.Size = 1
ProductInfo.UnitType = L
If I did this, then of course I would not be duplicating the "Name" twice. My plan was that I could then find all sizes of the same product very easily, as all I would have to do is look at the many side of the relationship for any given item.
Here is the problem though: all the data will be user driven and entered. Someone might write "Coke a Cola" instead of "Coke", and that would then be treated as two different products, because when I go to check whether a product called "Coke a Cola" has been entered, it won't know to check for "Coke" as well.
This leads me to having to do partial matches to try to find it, but what happens if someone has some generic brand that would be "Cola"? That would get matched as well.
This gets me thinking that maybe there is no point in keeping the data separate, as it seems to me there is a good chance everything will end up being its own product anyway.
There's merit in both approaches. Keeping them separate, the table you're calling "Product", I'd call "Brand" instead, and "ProductInfo" is your actual "Product" table, containing the information about the actual sellable item of that brand (a 12oz can or liter bottle of Coke).
Alternately, you could further normalize it into Brand, Product (here being Coke Classic as maybe opposed to Diet Coke or Caffeine Free Coke) and UnitSize (can or bottle; these would apply not only to Coke Classic, but Diet Coke, Pepsi or Dr Pepper).
If you denormalize this data, you aren't duplicating much on the naming side of things, but you are duplicating quite a bit of unit of measure data. The question is whether it's more useful to ensure consistent branding of your product records (denormalizing means you'll need some other means to ensure your products have the same brand), or to avoid the joins between the two tables (there is a cost to joining, though it's typically small if you can join between indexed fields).
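The two-table variant (Brand plus Product) might look like the sketch below, using Python's sqlite3 module. The table and column names are assumptions based on the renaming suggested above; the rows are the two Coke sizes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE brand (
        brand_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL UNIQUE     -- one row per brand, e.g. 'Coke'
    );
    CREATE TABLE product (                -- the actual sellable item
        product_id INTEGER PRIMARY KEY,
        brand_id   INTEGER NOT NULL REFERENCES brand(brand_id),
        size       REAL NOT NULL,
        unit_type  TEXT NOT NULL,
        UNIQUE (brand_id, size, unit_type)
    );
    INSERT INTO brand (brand_id, name) VALUES (1, 'Coke');
    INSERT INTO product (brand_id, size, unit_type) VALUES
        (1, 355, 'ml'),
        (1, 1, 'L');
""")


def sizes_of(conn, brand_name):
    """All sizes recorded for one brand: the 'many' side of the relationship."""
    return conn.execute(
        """
        SELECT p.size, p.unit_type
        FROM product p
        JOIN brand b ON b.brand_id = p.brand_id
        WHERE b.name = ?
        ORDER BY p.product_id
        """,
        (brand_name,),
    ).fetchall()


coke_sizes = sizes_of(conn, "Coke")
print(coke_sizes)
```

The join is on the indexed primary key of brand, which is the cheap case mentioned above. The sketch does nothing about near-duplicate names such as "Coke a Cola"; that still has to be handled at data entry.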
The only compelling reason to make a Header-Detail arrangement, with two tables, would be if Coke has attributes that are the same no matter the packaging. Right now, I don't see any attributes like that; so one table covers it. You might say, "But I might think of something in the future like that." That may be a reason to make two tables; but (unlike many kinds of change to a database schema) this may not be too difficult to break into two tables later, when you know there is a need.
I see the point about mistakes that result in nearly-matching records. I think that's not a consideration at this table level and you should address it as a part of record editing.
The best way to do this would be to have your product or item table in its own table with fields like ID, SKU number, short description, active, and so on… Then you have your “many” table hold the other item attributes which can be joined on ID; a one to many relationship. And to solve the user input issue, you have a combo box which is tied to inventory choices or item choices. This way you enforce data integrity. Well, that is how I have done it.
This post has some helpful links on DB design

SQL is-a / has-a / is-a structure and validation

Consider the following example. When I write "is-a", I mean a column that is both a primary key for its table and a foreign key to another table to form a one-to-one relationship. When I write "has-a", I mean a regular foreign key column, to form a one-to-many relationship.
A fruitbasket table.
An applebasket table, "is-a" fruitbasket.
An orangebasket table, "is-a" fruitbasket.
A fruit table, "has-a" fruitbasket.
An apple table, "is-a" fruit.
An orange table, "is-a" fruit.
Assume, for the time being, that there are context-specific columns in applebasket, orangebasket, apple and orange sufficient to warrant the existence of that table instead of cluttering the parent table with nullable columns or a type enumeration.
Questions:
Is it better practice to relate between fruit and fruitbasket, or to relate apple and applebasket + orange and orangebasket? The former seems less redundant, but could potentially have invalid relations (apple -> fruit -> fruitbasket -> orangebasket, for example). The latter forces the relations to be valid, but is more redundant, and requires that any other inheriting fruit table declare its own basket foreign key.
Specifically for PostgreSQL, given the first choice (relating fruit to fruitbasket), what is the simplest way for me to check relational validity? It would have to perform three joins.
Any other suggestions to implement this cleanly?
Thanks...
I think you are looking at this somewhat wrong. Relational data modelling is about data, while object modelling is about behavior. These are different disciplines, and as much as I like to do object-relational data modelling, has-a and is-a are not things that belong in the database. Instead, look at functional dependencies and model them as such. Otherwise you can end up with problems if you ever have multiple apps trying to access the same data in different ways.
For example, suppose we have two applications. One pulls data out and manipulates it, and models behavior. The second pulls data out, treats it as static, and derives information. Ask yourself if the Liskov substitution principle (LSP) allows you to say "a square is-a rectangle." In the first case, no. In the second case, yes. In the first case you might want to use a has-a "rectangular_area" and in the second case "is-a rectangle" is perfectly valid.
So this brings me to my second point. If you are looking at this sort of complex relationship, how you do your mapping is likely to depend on what you are doing with your data. In general it is better to constrain your data based on definitional elements rather than behavioral elements. So in this case you have mappings wherever you need them. I would then suggest the following:
fruit (stores apple, pear, orange, etc).
any supplemental tables you need for this.
fruit_basket
many-many mapping table showing what kinds of fruit are in the fruit basket.
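The layout listed above could be sketched as follows, using SQLite through Python's sqlite3 module. The name fruit_basket_contents for the mapping table, the column names and the sample rows are assumptions for illustration; supplemental tables are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE fruit (                  -- kinds of fruit: apple, pear, orange...
        fruit_id INTEGER PRIMARY KEY,
        kind     TEXT NOT NULL UNIQUE
    );
    CREATE TABLE fruit_basket (
        basket_id INTEGER PRIMARY KEY,
        label     TEXT NOT NULL
    );
    CREATE TABLE fruit_basket_contents (  -- many-many mapping table
        basket_id INTEGER NOT NULL REFERENCES fruit_basket(basket_id),
        fruit_id  INTEGER NOT NULL REFERENCES fruit(fruit_id),
        PRIMARY KEY (basket_id, fruit_id)
    );
    INSERT INTO fruit (fruit_id, kind) VALUES
        (1, 'apple'), (2, 'orange'), (3, 'pear');
    INSERT INTO fruit_basket (basket_id, label) VALUES
        (1, 'mixed basket'), (2, 'apple basket');
    INSERT INTO fruit_basket_contents (basket_id, fruit_id) VALUES
        (1, 1), (1, 2), (2, 1);
""")


def kinds_in_basket(conn, basket_id):
    """Kinds of fruit allowed in one basket, via the mapping table."""
    rows = conn.execute(
        """
        SELECT f.kind
        FROM fruit_basket_contents c
        JOIN fruit f ON f.fruit_id = c.fruit_id
        WHERE c.basket_id = ?
        ORDER BY f.kind
        """,
        (basket_id,),
    ).fetchall()
    return [kind for (kind,) in rows]


mixed_kinds = kinds_in_basket(conn, 1)
apple_basket_kinds = kinds_in_basket(conn, 2)
print(mixed_kinds, apple_basket_kinds)
```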
This brings me specifically to your questions:
Is it better practice to relate between fruit and fruitbasket, or to relate apple and applebasket + orange and orangebasket?
Both. At the same time. See above.
Specifically for PostgreSQL, given the first choice (relating fruit to fruitbasket), what is the simplest way for me to check relational validity?
Declarative referential integrity will take you all the way there. Don't be afraid to use bidirectional foreign keys with one side set to DEFERRABLE INITIALLY DEFERRED.
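A bidirectional pair of foreign keys with one side deferred can be sketched like this. The example runs on SQLite through Python's sqlite3 module, which accepts the same DEFERRABLE INITIALLY DEFERRED clause as PostgreSQL; the first_fruit_id column and the rule "every basket names one of its fruits" are assumptions invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE fruit_basket (
        basket_id      INTEGER PRIMARY KEY,
        -- Checked at COMMIT, so basket and fruit can be inserted in either order.
        first_fruit_id INTEGER NOT NULL
            REFERENCES fruit(fruit_id) DEFERRABLE INITIALLY DEFERRED
    );
    CREATE TABLE fruit (
        fruit_id  INTEGER PRIMARY KEY,
        -- Checked immediately: the basket must already exist.
        basket_id INTEGER NOT NULL REFERENCES fruit_basket(basket_id)
    );
""")

# One transaction: the basket points at a fruit that is inserted right after it.
with conn:
    conn.execute(
        "INSERT INTO fruit_basket (basket_id, first_fruit_id) VALUES (1, 10)")
    conn.execute("INSERT INTO fruit (fruit_id, basket_id) VALUES (10, 1)")

# A basket whose fruit never appears is rejected when the transaction commits.
try:
    with conn:
        conn.execute(
            "INSERT INTO fruit_basket (basket_id, first_fruit_id) VALUES (2, 99)")
    dangling_accepted = True
except sqlite3.IntegrityError:
    conn.rollback()  # make sure the failed transaction is discarded
    dangling_accepted = False

basket_count = conn.execute("SELECT COUNT(*) FROM fruit_basket").fetchone()[0]
print(dangling_accepted, basket_count)
```

Deferring one side is what makes the circular reference insertable at all: without it, neither row could be inserted first.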