I have a cube with an n-to-n (many-to-many) relationship between products and categories.
Each product can belong to many categories. In this scenario my product table looks like this:
There is one product in 3 different categories:
Fashion
    Men
    Women
Electronic
The Bridge_ProductCategories table has 3 rows, mapping each of these categories to the product. So I have 3 records in the FactSales table showing the sales amount of this product.
In cube browsing when I filter the "Electronic" category, everything works fine.
But when I filter on the "Fashion" category, the sales amount is duplicated because the product belongs to 2 of its subcategories.
Does anyone have any solution for this situation?
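For reference, here is the relational shape of the problem, sketched with assumed table and column names and a single 100.0 sale: summing through the bridge join counts the fact row once per matching subcategory, while an EXISTS filter counts it at most once. (In SSAS itself, the usual fix is to model the bridge as a many-to-many measure-group relationship so the engine deduplicates the same way.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE Categories (CategoryId INTEGER, Name TEXT, ParentName TEXT);
CREATE TABLE Bridge_ProductCategories (ProductId INTEGER, CategoryId INTEGER);
CREATE TABLE FactSales (ProductId INTEGER, Amount REAL);

INSERT INTO Categories VALUES (1, 'Men', 'Fashion'), (2, 'Women', 'Fashion'),
                              (3, 'Electronic', NULL);
INSERT INTO Bridge_ProductCategories VALUES (10, 1), (10, 2), (10, 3);
INSERT INTO FactSales VALUES (10, 100.0);
""")

# Plain join: the single 100.0 sale is counted once per matching subcategory.
dup = cur.execute("""
    SELECT SUM(f.Amount)
    FROM FactSales f
    JOIN Bridge_ProductCategories b ON b.ProductId = f.ProductId
    JOIN Categories c ON c.CategoryId = b.CategoryId
    WHERE c.ParentName = 'Fashion'
""").fetchone()[0]
print(dup)  # 200.0 -- duplicated

# EXISTS filter: each fact row is counted at most once.
ok = cur.execute("""
    SELECT SUM(f.Amount)
    FROM FactSales f
    WHERE EXISTS (
        SELECT 1
        FROM Bridge_ProductCategories b
        JOIN Categories c ON c.CategoryId = b.CategoryId
        WHERE b.ProductId = f.ProductId AND c.ParentName = 'Fashion')
""").fetchone()[0]
print(ok)  # 100.0
```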
If I had a table as such:
How do I structure a SQL query (MySQL or Redshift) to show a progression from what customers first purchased into their subsequent purchases of a new category? Like this:
Here I use the first_time_customer flag to define whether a parent_category is NEW (i.e. the customer is a first-time buyer of that category), and progression_category defines the next categories being purchased (a progression being the sequence of categories a customer buys from their first purchase until their last). Notice that customer A bought 2 different categories, and that drives the subsequent 4 rows at the end. The actual table is 90 million rows.
I have searched and the only answers I found were for cross joining.
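The table definition isn't shown above, so this is only a minimal sketch under assumed column names (customer_id, category, purchase_date): compute each customer's first purchase per category, then self-join those first purchases on date order, which avoids a full cross join and, on 90 million rows, keeps the join input down to one row per customer/category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE purchases (customer_id TEXT, category TEXT, purchase_date TEXT);
INSERT INTO purchases VALUES
  ('A', 'shoes',  '2020-01-01'),
  ('A', 'shirts', '2020-02-01'),
  ('A', 'shoes',  '2020-03-01'),   -- a repeat purchase, not first-time
  ('B', 'shirts', '2020-01-15');
""")

rows = cur.execute("""
WITH firsts AS (
    -- one row per customer/category: the first time they bought it
    SELECT customer_id, category, MIN(purchase_date) AS first_date
    FROM purchases
    GROUP BY customer_id, category
)
SELECT a.customer_id,
       a.category AS parent_category,
       b.category AS progression_category
FROM firsts a
JOIN firsts b
  ON b.customer_id = a.customer_id
 AND b.first_date  > a.first_date
ORDER BY a.customer_id, a.first_date, b.first_date
""").fetchall()
print(rows)  # [('A', 'shoes', 'shirts')]
```

Customer A's first 'shoes' purchase leads to the later first-time 'shirts' purchase; the repeat 'shoes' row is ignored because only the MIN date per category survives the CTE.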
I have 3 tables that are related by 1 field only. I'm trying to pull data from the 2 tables that are each linked to the third.
The first table contains salesman data: IDnumber, name, address, phone number, hire date, wage, etc.
There is a sales table that contains salesmanIDnumber, date of sale, object sold, and price.
There is a purchases table that contains salesmanIDnumber, date of purchase, object purchased, and price.
The date fields in sales and purchases are unrelated. I know the easiest solution would be to combine the sales and purchases tables with a column for buy/sell, but I didn't create the database and I'm working with what I've got. Basically, I want to pull all purchases or sales by salesmanID in one report.
I have linked the salesman table to the sales table and the purchases table with left outer joins on the salesman ID. What I'm getting is a cross join, with each result from the purchases table displayed once for each result in the sales table, which multiplies the results instead of adding them. For example, 4 sales and 6 purchases should be 10 entries, but I'm getting 24 results.
How can I get it to show data from both tables independently?
I do have access to create views in the database if that's the best solution, but I'm not proficient at it.
Create 2 views (one for sales, the other for purchases), each Grouped By SalesMan.
Since each SalesMan would have only one row in each view, you can join them without record inflation.
Or use a UNION to append purchase records to sales records, taking care to include a 'Type' column ('Sales' AS Type or 'Purchases' AS Type) and/or reverse the sign on quantities so that things can be summarized in a logical way.
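The UNION approach can be sketched like this (column names are assumptions based on the description; SQLite via Python is used just to demonstrate the row counts from the 4-sales/6-purchases example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE sales     (salesmanIDnumber INTEGER, tx_date TEXT, object TEXT, price REAL);
CREATE TABLE purchases (salesmanIDnumber INTEGER, tx_date TEXT, object TEXT, price REAL);
""")
# 4 sales and 6 purchases for salesman 1, as in the example above.
cur.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)",
                [(1, f"2020-01-0{i}", "widget", 10.0) for i in range(1, 5)])
cur.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)",
                [(1, f"2020-02-0{i}", "gadget", 5.0) for i in range(1, 7)])

# Joining both tables in one query fans out: 4 x 6 = 24 rows.
fanout = cur.execute("""
    SELECT COUNT(*) FROM sales s
    JOIN purchases p ON p.salesmanIDnumber = s.salesmanIDnumber
""").fetchone()[0]

# UNION ALL keeps the two row sets independent: 4 + 6 = 10 rows.
combined = cur.execute("""
    SELECT salesmanIDnumber, tx_date, object, price,  'Sales'     AS Type FROM sales
    UNION ALL
    SELECT salesmanIDnumber, tx_date, object, -price, 'Purchases' AS Type FROM purchases
    ORDER BY tx_date
""").fetchall()
print(fanout, len(combined))  # 24 10
```

Negating price on the purchases side means a plain SUM over the combined result gives net revenue per salesman.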
I have multiple models (events, chores, bills, and lists), which each have their own table. I want to be able to group any of these instances together, for example group an event with a list of items to buy for it, and a bill for the cost.
I was thinking each table could have a group id, and I could get other items in a group by merging records from each table where the group_id equals the items group_id.
group = Events.where(group_id: self.group_id) + Bills.where(group_id: self.group_id) + ...
But that seems like a bad way to do it.
Another way I thought to do it was to use a polymorphic relation between pairs of items:
tag
item_1_id | item_1_type | item_2_id | item_2_type
----------+-------------+-----------+------------
But the example above (a group of three different items) would require six records, two between each pair, for every item to know about all the other items in the group.
Is there a way to do this with joins, should I redesign some of the tables?
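One alternative worth sketching combines the group_id idea with the polymorphic types: a single membership table with one row per item, so the three-item group above needs three records instead of six. Table and column names here are assumptions, shown in plain SQL via SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE group_memberships (
    group_id  INTEGER,
    item_id   INTEGER,
    item_type TEXT          -- 'Event', 'List', 'Bill', ...
);
INSERT INTO group_memberships VALUES
  (1, 10, 'Event'),
  (1, 20, 'List'),
  (1, 30, 'Bill');
""")

# All the other members of the group an item belongs to, in one query:
others = cur.execute("""
    SELECT m2.item_id, m2.item_type
    FROM group_memberships m1
    JOIN group_memberships m2
      ON m2.group_id = m1.group_id
    WHERE m1.item_id = 10 AND m1.item_type = 'Event'
      AND NOT (m2.item_id = m1.item_id AND m2.item_type = m1.item_type)
""").fetchall()
print(sorted(others))  # [(20, 'List'), (30, 'Bill')]
```

Each model still needs one query (or a self-join per type, as here) to fetch the actual rows, but group membership itself stays at one record per item.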
How do I represent the following data model in SQL tables?
I have 3 entities: company, productcategory and product.
The business model is that a company can have 1-N product categories, and each category can have many products.
The trick is that products are shared across companies under different categories. Product categories are not shared. Each company has its own categories.
For example:
product1 belongs to category1 under company1
product1 belongs to category2 under company2
I'm thinking of having the following tables. Only relevant Id fields are shown below.
Company
CompanyId
ProductCategory
ProductCategoryId
CompanyId
ParentCategoryId (To support levels)
Product
ProductId
ProductCategoryXProduct
ProductCategoryId
ProductId
This way I can query for all product categories for a product and filter by company to get the specific category structure for its products. This may be different for another company even if the product is the same.
Will this cover it? Is there a better approach?
Looks like a fine 3NF design that fits what you have described.
Note that as your data set will grow, this design will start slowing down (mostly due to the required joins), so when the time comes you may want to denormalize some of these tables for faster reads.
Assuming you have the need for products to belong to multiple categories I think that this structure is fine.
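As a sanity check, the proposed tables can be exercised end to end. This is a minimal sketch (category1/company1 etc. come from the example above; everything else is illustrative), showing the same shared product resolving to different categories per company:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE Company         (CompanyId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ProductCategory (ProductCategoryId INTEGER PRIMARY KEY,
                              CompanyId INTEGER,
                              ParentCategoryId INTEGER,   -- to support levels
                              Name TEXT);
CREATE TABLE Product         (ProductId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE ProductCategoryXProduct (ProductCategoryId INTEGER, ProductId INTEGER);

INSERT INTO Company VALUES (1, 'company1'), (2, 'company2');
INSERT INTO ProductCategory VALUES (1, 1, NULL, 'category1'),
                                   (2, 2, NULL, 'category2');
INSERT INTO Product VALUES (1, 'product1');
-- product1 is shared, but categorized differently per company:
INSERT INTO ProductCategoryXProduct VALUES (1, 1), (2, 1);
""")

def categories_for(product_id, company_id):
    """Categories of a product, filtered to one company's structure."""
    return [r[0] for r in cur.execute("""
        SELECT pc.Name
        FROM ProductCategoryXProduct x
        JOIN ProductCategory pc ON pc.ProductCategoryId = x.ProductCategoryId
        WHERE x.ProductId = ? AND pc.CompanyId = ?
    """, (product_id, company_id))]

print(categories_for(1, 1))  # ['category1']
print(categories_for(1, 2))  # ['category2']
```

Because CompanyId lives on ProductCategory rather than on the link table, the company filter composes naturally with the junction join.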
Let's say I have a bunch of products. Each product has an id, a price, and a long description made up of multiple paragraphs. Each product would also have multiple SKU numbers representing different sizes and colors.
To clarify: product_id 1 has 3 skus, product_id 2 has 5 skus. All of the skus in product 1 share the same price and description. Product 2 has a different price and description than product 1, and all of product 2's skus share them.
I could have a large table with different records for each sku. The records would have redundant fields like the long description and price.
Or I could have two tables. One named "products" with product_id, price, and description. And one named "skus" with product_id, sku, color, and size. I would then join the tables on the product_id column.
$query = "SELECT * FROM skus LEFT OUTER JOIN products ON skus.product_id=products.product_id WHERE color='green'";
or
$query = "SELECT * FROM master_table WHERE color='green'";
This is a dumbed down version of my setup. In the end there will be a lot more columns and a lot of products. Which method would have better performance?
So to be more specific: let's say I want to run a LIKE search on the long_description column across all of the skus. I am trying to compare having one table with 5000 long_description values and 5000 skus versus OUTER JOINing two tables, one with 1000 long_description records and the other with 5000 skus.
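Either layout returns the same rows, so the comparison really is about performance, not correctness. A minimal sketch of both designs (the table and column names come from the question; the sample data is invented), using SQLite via Python:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
-- Normalized: description/price stored once per product.
CREATE TABLE products (product_id INTEGER PRIMARY KEY, price REAL, long_description TEXT);
CREATE TABLE skus     (product_id INTEGER, sku TEXT, color TEXT, size TEXT);

-- Denormalized: one wide row per SKU, description repeated.
CREATE TABLE master_table (product_id INTEGER, sku TEXT, color TEXT, size TEXT,
                           price REAL, long_description TEXT);

INSERT INTO products VALUES (1, 9.99, 'soft cotton tee'), (2, 19.99, 'wool sweater');
INSERT INTO skus VALUES (1, 'T-GR-S', 'green', 'S'),
                        (1, 'T-GR-M', 'green', 'M'),
                        (2, 'W-BL-M', 'blue',  'M');
INSERT INTO master_table
    SELECT s.product_id, s.sku, s.color, s.size, p.price, p.long_description
    FROM skus s JOIN products p ON p.product_id = s.product_id;
""")

joined = cur.execute("""
    SELECT s.sku FROM skus s
    LEFT OUTER JOIN products p ON s.product_id = p.product_id
    WHERE s.color = 'green' AND p.long_description LIKE '%cotton%'
    ORDER BY s.sku
""").fetchall()

flat = cur.execute("""
    SELECT sku FROM master_table
    WHERE color = 'green' AND long_description LIKE '%cotton%'
    ORDER BY sku
""").fetchall()
print(joined == flat)  # True -- same result either way
```

In the joined version the LIKE scans each long_description once per product; in the flat version it scans one copy per SKU, which is the redundancy the question is weighing.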
It depends on the usage of those tables - in order to get a definitive answer you should do both and compare using representative data sets / system usage.
The normal approach is to only denormalise data in order to combat specific performance problems that you are having, so in this case my advice would be to default to joining across two tables and only denormalise to using a single table if you have a performance problem and find that denormalisation fixes it.
For OLTP, normalized tables are better: join them at query time; data manipulation is easier and response is good for short queries.
For OLAP, denormalized tables are better: the tables mostly don't change, and they work well for long queries.