Looking up parent item based on a bill of materials - sql

I'm trying to figure out how to put together a SQL statement that will let me find an end-item in our database based on its bill of materials. I guess you could say this is like a reverse BOM lookup question.
My table structure is pretty simple.
-End-item table
-Component table
-Linking table to tie together multiple components to an end item record.
The data I have is just the component list, and I want to find the end item. Since every bill of materials is unique, the match has to be exact, i.e. the same number of components and exactly the same component SKU numbers. In some cases two end-items might use all the same components, but one of them uses an extra part or two that makes its end-item SKU number different, so the query has to account for that.
If not an outright answer, could someone at least steer me on the correct path to finding one?
------ UPDATE ----------
Table structure would be something like this.
ManufacturedPart
,--------------------,
| ID | PART_NUM |
|--------------------|
| 1 | V3175-01 |
| 2 | V3367-01 |
| 3 | V3988-01 |
| 4 | V3175-CV |
`--------------------`
Component
,--------------------,
| ID | COMP_NUM |
|--------------------|
| 1 | V3175 |
| 2 | V3367 |
| 3 | V3369 |
| 4 | V3114 |
| 5 | V3370 |
| 6 | V4060 |
| 7 | V3550 |
| 8 | V3988 |
`--------------------`
ManufacturedComponent
,-------------------------------------------------,
| ID | MANUFACTURED_PART_ID | COMPONENT_ID |
|-------------------------------------------------|
| 1 | 1 | 1 |
| 2 | 1 | 4 |
| 3 | 1 | 6 |
| 4 | 2 | 2 |
| 5 | 2 | 3 |
| 6 | 2 | 5 |
| 7 | 2 | 7 |
| 8 | 3 | 1 |
| 9 | 3 | 8 |
| 10 | 4 | 1 |
| 11 | 4 | 4 |
`-------------------------------------------------`
Assuming I have only the COMP_NUMs (component numbers) to search with, I want to match back to the ManufacturedPart that contains exactly that list of components.
So some examples: If I have components V3175, V3114, and V4060, it should match back to V3175-01 manufactured part. But, if I only have components V3175 and V3114 it should match back to V3175-CV manufactured part. If I have components V3367, V3369, V3370, and V3550 it should match back to manufactured part V3367-01.
I have no SQL written yet, as I'm unsure how to break the problem down.
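One common way to break this down is "relational division": group the linking table by manufactured part and keep only the groups whose component count equals the length of the search list and whose components all appear in that list. Below is a minimal sketch using Python's sqlite3 module and the sample data above. The helper name find_parts is hypothetical, and the query assumes each component is linked to a given part at most once (no quantities or duplicate rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ManufacturedPart (ID INTEGER PRIMARY KEY, PART_NUM TEXT);
CREATE TABLE Component (ID INTEGER PRIMARY KEY, COMP_NUM TEXT);
CREATE TABLE ManufacturedComponent (
    ID INTEGER PRIMARY KEY,
    MANUFACTURED_PART_ID INTEGER REFERENCES ManufacturedPart(ID),
    COMPONENT_ID INTEGER REFERENCES Component(ID)
);
INSERT INTO ManufacturedPart VALUES
    (1,'V3175-01'),(2,'V3367-01'),(3,'V3988-01'),(4,'V3175-CV');
INSERT INTO Component VALUES
    (1,'V3175'),(2,'V3367'),(3,'V3369'),(4,'V3114'),
    (5,'V3370'),(6,'V4060'),(7,'V3550'),(8,'V3988');
INSERT INTO ManufacturedComponent VALUES
    (1,1,1),(2,1,4),(3,1,6),(4,2,2),(5,2,3),(6,2,5),
    (7,2,7),(8,3,1),(9,3,8),(10,4,1),(11,4,4);
""")

def find_parts(comp_nums):
    """Return PART_NUMs whose BOM is exactly the given component list.

    Hypothetical helper; assumes a component appears at most once per part.
    """
    wanted = sorted(set(comp_nums))
    if not wanted:
        return []
    placeholders = ",".join("?" * len(wanted))
    sql = f"""
        SELECT mp.PART_NUM
        FROM ManufacturedPart mp
        JOIN ManufacturedComponent mc ON mc.MANUFACTURED_PART_ID = mp.ID
        JOIN Component c ON c.ID = mc.COMPONENT_ID
        GROUP BY mp.ID, mp.PART_NUM
        -- same number of components, and every one of them is in the list
        HAVING COUNT(*) = ?
           AND SUM(CASE WHEN c.COMP_NUM IN ({placeholders})
                        THEN 1 ELSE 0 END) = ?
        ORDER BY mp.PART_NUM
    """
    params = [len(wanted), *wanted, len(wanted)]
    return [row[0] for row in conn.execute(sql, params)]

print(find_parts(["V3175", "V3114", "V4060"]))
print(find_parts(["V3175", "V3114"]))
```

The COUNT(*) check is what separates V3175-CV from V3175-01: both contain V3175 and V3114, but only one of them has exactly two components.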

Related

Designing a database for a workout tracker

I'm designing a database for a workout tracker app. Each user should be able to track multiple workouts (routines). A workout can have multiple exercises, and an exercise can be used in many workouts. Each exercise will have a specific track type (weight and reps, distance and time, only reps).
My tables so far:
| User | |
|------|-------|
| id | name |
| 1 | Ilka |
| 2 | James |
| Exercise | | |
|----------|---------------------|---------------|
| id | name | track_type_id |
| 1 | Barbell Bench Press | 1 |
| 2 | Squats | 1 |
| 3 | Deadlifts | 1 |
| 4 | Rowing Machine | 3 |
| Workout | | |
|---------|---------|-----------------|
| id | user_id | name |
| 1 | 1 | Chest & Triceps |
| 2 | 1 | Legs |
| Workout_Exerice (Junction table) | |
|-----------------|------------------|------------|
| id | exersice_id | workout_id |
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 4 | 1 |
| Workout_Sets | | | |
|--------------|---------------------|------|--------|
| id | workout_exersice_id | reps | weight |
| 1 | 1 | 12 | 120 |
| 2 | 1 | 10 | 120 |
| 3 | 1 | 8 | 120 |
| 4 | 2 | 10 | 220 |
| 5 | 3 | null | null |
| TrackType | |
|-----------|-----------------|
| id | name |
| 1 | Weight and Reps |
| 2 | Reps Only |
| 3 | Distance Time |
My issue is how to incorporate the TrackType table for each workout set. My first option was to create columns in the Workout_Sets table for each tracking type (weight and reps, distance and time, only reps), but that means many rows will have many nulls. Another option I thought of was an EAV-type table, but I'm not sure. Also, do you think my design is efficient (or over-normalized)?
I would say that the most efficient way is to have nulls in your table. The alternative would require you to split many of the categories into separate tables. I would also recommend that you start factoring a User ID table into your database.
Your description states that “Each exercise will have a specific track type”, suggesting that each Exercise maps to exactly one TrackType (a many-to-one relationship) and that the relationship is unchanging. As such, the exercise table should carry the TrackType reference, as your track_type_id column already does.
I suspect, however, that your problem description may be lacking specificity, making it difficult to give you sound advice. For instance, if the TrackType can vary for any given exercise, your TrackType column may belong on the Workout_Sets table. If the relationship between TrackType and Exercise/Workout_Sets is many-to-many, then you will need another junction table.
Your question regarding “over-normalization” depends upon many factors that are specific to your solution. In general, I would say no - the degree of normalization appears to be appropriate.
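To make the "nullable columns" option concrete, here is a minimal sketch in Python with sqlite3. The lower-case table names, the distance_m and duration_s columns, and the rowing values are all assumptions added for illustration; only the ids and names come from the question. The track type is not stored per set: it is resolved through the exercise, so the application can decide which columns to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE track_type (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE exercise (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    track_type_id INTEGER NOT NULL REFERENCES track_type(id)
);
CREATE TABLE workout_exercise (
    id INTEGER PRIMARY KEY,
    exercise_id INTEGER NOT NULL REFERENCES exercise(id),
    workout_id INTEGER NOT NULL
);
-- One nullable column per measurement; unused ones stay NULL.
-- distance_m and duration_s are hypothetical, not in the question.
CREATE TABLE workout_set (
    id INTEGER PRIMARY KEY,
    workout_exercise_id INTEGER NOT NULL REFERENCES workout_exercise(id),
    reps INTEGER, weight REAL, distance_m REAL, duration_s INTEGER
);
INSERT INTO track_type VALUES
    (1,'Weight and Reps'),(2,'Reps Only'),(3,'Distance Time');
INSERT INTO exercise VALUES
    (1,'Barbell Bench Press',1),(2,'Squats',1),
    (3,'Deadlifts',1),(4,'Rowing Machine',3);
INSERT INTO workout_exercise VALUES (1,1,1),(2,2,1),(3,4,1);
-- The rowing distance and duration below are made-up sample values.
INSERT INTO workout_set VALUES
    (1,1,12,120,NULL,NULL),
    (4,2,10,220,NULL,NULL),
    (5,3,NULL,NULL,2000,480);
""")

def sets_with_track_type():
    """Return (set id, exercise name, track type name) for every set."""
    return conn.execute("""
        SELECT s.id, e.name, t.name
        FROM workout_set s
        JOIN workout_exercise we ON we.id = s.workout_exercise_id
        JOIN exercise e ON e.id = we.exercise_id
        JOIN track_type t ON t.id = e.track_type_id
        ORDER BY s.id
    """).fetchall()

for row in sets_with_track_type():
    print(row)
```

If the track type could vary per set rather than per exercise, the track_type_id column would move to workout_set instead, as described above.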

Is it a good idea to have SQL table entries refer to other ids in the same table?

I'm designing a table for product categories for a kinda-e-commerce site. The table currently looks a bit like this:
| id | name | level | value | parent_id |
+----+-------------+-------+-------------+-----------+
| 1 | Food | 0 | food | NULL |
| 2 | Phone | 0 | phone | NULL |
| 3 | Thing | 0 | thing | NULL |
| 4 | Pasta | 1 | pasta | 1 |
| 5 | Apple | 1 | apple | 2 |
| 6 | SubThing | 1 | subthing | 3 |
| 7 | Tagliatelle | 2 | tagliatelle | 4 |
| 8 | iPhone 11 | 2 | iphone_11 | 5 |
| 9 | SubSubThing | 2 | subsubthing | 6 |
Basically, I don't want to create a whole new table and map the relationships every time people want to add a new sub-level to the category structure. Instead, I rely on the level and parent_id columns to let my code know what to do with a category and what its parent is. I'm completely new to model design and this is the best I could come up with. Is there any downside to this self-referencing structure that I'm just too noob to realize?
If you are certain the sub level (child) will only ever be referenced by that single row or parent then the design should suffice. You may run into issues if multiple child elements need to roll up into that parent entity.
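One practical consequence of the self-referencing design is that reading a full branch takes a recursive query (or a loop in application code) rather than a plain join. A minimal sketch with Python's sqlite3 and the sample rows from the question follows; the table name category and the helper path_to_root are assumptions, since the question does not name the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (
    id INTEGER PRIMARY KEY,
    name TEXT,
    level INTEGER,
    value TEXT,
    parent_id INTEGER REFERENCES category(id)
);
INSERT INTO category VALUES
    (1,'Food',0,'food',NULL),
    (2,'Phone',0,'phone',NULL),
    (3,'Thing',0,'thing',NULL),
    (4,'Pasta',1,'pasta',1),
    (5,'Apple',1,'apple',2),
    (6,'SubThing',1,'subthing',3),
    (7,'Tagliatelle',2,'tagliatelle',4),
    (8,'iPhone 11',2,'iphone_11',5),
    (9,'SubSubThing',2,'subsubthing',6);
""")

def path_to_root(category_id):
    """Return category names from the top-level ancestor down to the row."""
    rows = conn.execute("""
        WITH RECURSIVE chain(id, name, parent_id, depth) AS (
            -- start at the requested category
            SELECT id, name, parent_id, 0 FROM category WHERE id = ?
            UNION ALL
            -- then climb one parent at a time
            SELECT p.id, p.name, p.parent_id, chain.depth + 1
            FROM category p
            JOIN chain ON p.id = chain.parent_id
        )
        SELECT name FROM chain ORDER BY depth DESC
    """, (category_id,)).fetchall()
    return [r[0] for r in rows]

print(path_to_root(7))
```

Because parent_id holds a single value, each category can only have one parent here, which is the limitation the answer above points at.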

How to name child elements of a parent element by value? SQL server

Please show me an example of how to mark all the child nodes with the id of their parent. Only the branches whose parent has the value "need" should be marked (see example image). I could not get a recursive query to mark all the children of a particular parent...
Initial data:
+-----+----------+----------+
| id | parentid | selector |
+-----+----------+----------+
| 1 | | |
| 2 | 1 | |
| 3 | 1 | need |
| 4 | 2 | |
| 5 | 2 | need |
| 6 | 3 | |
| 7 | 5 | |
| 8 | 5 | |
| 9 | 6 | |
+-----+----------+----------+
Need data:
+-----+----------+----------+----------------+
| id | parentid | selector | parentSelector |
+-----+----------+----------+----------------+
| 1 | null | | null |
| 2 | 1 | | null |
| 3 | 1 | need | 3 |
| 4 | 2 | | null |
| 5 | 2 | need | 5 |
| 6 | 3 | | 3 |
| 7 | 5 | | 5 |
| 8 | 5 | | 5 |
| 9 | 6 | | 3 |
+-----+----------+----------+----------------+
The task is to make the grouping by those elements whose parent has the value "need". I think, I should create a column with a mark, as in the example in the table above, or are there any other options?
I use SQL Server 2012
I don't know if it works on SQL Server 2012, but I found this Microsoft page, which I think is what you want. To build the parentSelector column conditionally, I use CASE (Transact-SQL).
This is another example: stackoverflow question
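For what it's worth, the CASE idea can be combined with a recursive CTE that walks from the root downwards and carries the marker along. The sketch below runs the query through Python's sqlite3 rather than SQL Server, so treat it as an illustration of the logic only; the table name nodes is an assumption, and blank selectors are stored as NULL. SQL Server has supported recursive CTEs since 2005, but it spells them with plain WITH (no RECURSIVE keyword). If "need" nodes were nested, this version would mark children with the nearest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (id INTEGER PRIMARY KEY, parentid INTEGER, selector TEXT);
INSERT INTO nodes VALUES
    (1,NULL,NULL),(2,1,NULL),(3,1,'need'),
    (4,2,NULL),(5,2,'need'),(6,3,NULL),
    (7,5,NULL),(8,5,NULL),(9,6,NULL);
""")

def parent_selectors():
    """Map each node id to the id of its nearest 'need' ancestor (or itself)."""
    rows = conn.execute("""
        WITH RECURSIVE marked(id, parentid, selector, parentSelector) AS (
            -- roots: marked only if they are 'need' themselves
            SELECT id, parentid, selector,
                   CASE WHEN selector = 'need' THEN id END
            FROM nodes
            WHERE parentid IS NULL
            UNION ALL
            -- children: start a new mark on 'need', else inherit the parent's
            SELECT n.id, n.parentid, n.selector,
                   CASE WHEN n.selector = 'need' THEN n.id
                        ELSE m.parentSelector END
            FROM nodes n
            JOIN marked m ON n.parentid = m.id
        )
        SELECT id, parentSelector FROM marked ORDER BY id
    """).fetchall()
    return dict(rows)

print(parent_selectors())
```

Grouping by parentSelector afterwards gives one group per "need" branch, with the unmarked nodes falling into the NULL group.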

Keep newest duplicate row depending on multiple Columns

I seem to have a workflow problem with Open Refine (Google Refine 2.5 [r2407]) to do sophisticated duplicate row cleaning. All I have found so far is how to delete duplicate rows based on a single column.
My aim is to delete duplicate rows based on multiple columns, at best, in a specific hierarchy.
Example
Given the following dummy data in Refine
+----+---------+---------+--------+------------+------+-----------------------------------+
| id | timeAgo | title | author | date | val1 | [After Refine, keep Record] |
+----+---------+---------+--------+------------+------+-----------------------------------+
| 1 | 10 | Faust | Mr. A | 2014-01-15 | 10 | ->B, older entry |
| 2 | 11 | Faust | Mr. A | 2014-01-21 | 10 | A (because of Date) |
| 3 | 8 | Faust | Mr. A | 2014-01-15 | 10 | B |
| 4 | 8 | RedHead | Mr. B | 2014-01-21 | 34 | ->D, older entry |
| 5 | 7 | RedHead | Mr. B | 2014-01-21 | 34 | ->D, same time Ago, but lower ID |
| 6 | 7 | RedHead | Mr. A | 2014-01-01 | 13 | C (because of author, date, val1) |
| 7 | 7 | RedHead | Mr. B | 2014-01-21 | 34 | D |
+----+---------+---------+--------+------------+------+-----------------------------------+
I want to kill the duplicate rows based on following logic. If
title && author && date && val1 are the same, then
keep the newest (least timeAgo) row; if there are multiple, then
keep the one with the highest id
The Result would be:
+---------+----+---------+---------+--------+------------+------+
| Refined | id | timeAgo | title | author | date | val1 |
+---------+----+---------+---------+--------+------------+------+
| A | 2 | 10 | Faust | Mr. A | 2014-01-21 | 10 |
| B | 3 | 8 | Faust | Mr. A | 2014-01-15 | 10 |
| C | 6 | 7 | RedHead | Mr. A | 2014-01-01 | 13 |
| D | 7 | 7 | RedHead | Mr. B | 2014-01-21 | 34 |
+---------+----+---------+---------+--------+------------+------+
Easy Approach?
If there is no other solution, I thankfully take a scripting/GREL one.
But could it be done with Refine's famous workflow "recording" to achieve the above logic, so it could be extracted and applied to other datasets in the same format?
My motivation behind this is to enable employees to work more thoughtfully with data (beyond Excel) but without confronting them right away with a full-blown scripting language.
That sounds like a straightforward sorting problem.
1. Sort the records by title, author, time ago, and ID
2. Re-order rows permanently (IMPORTANT - it won't work if you forget this step)
3. Blank down on Title & Author
4. Move those two columns to the two left most positions
5. Join multivalued cells on remaining columns
6. Transform all columns from step 5 using value.split(',')[0] to extract the first value (which should be the value for the record you want if you sorted them in the right order)
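If a scripted fallback is acceptable, the rule from the question (group on title, author, date and val1, keep the smallest timeAgo, break ties by highest id) can be sketched outside Refine in a few lines of plain Python. This is not a Refine workflow and not part of the steps above; the function name keep_newest is made up, and the rows are the dummy data from the question.

```python
rows = [
    {"id": 1, "timeAgo": 10, "title": "Faust", "author": "Mr. A", "date": "2014-01-15", "val1": 10},
    {"id": 2, "timeAgo": 11, "title": "Faust", "author": "Mr. A", "date": "2014-01-21", "val1": 10},
    {"id": 3, "timeAgo": 8, "title": "Faust", "author": "Mr. A", "date": "2014-01-15", "val1": 10},
    {"id": 4, "timeAgo": 8, "title": "RedHead", "author": "Mr. B", "date": "2014-01-21", "val1": 34},
    {"id": 5, "timeAgo": 7, "title": "RedHead", "author": "Mr. B", "date": "2014-01-21", "val1": 34},
    {"id": 6, "timeAgo": 7, "title": "RedHead", "author": "Mr. A", "date": "2014-01-01", "val1": 13},
    {"id": 7, "timeAgo": 7, "title": "RedHead", "author": "Mr. B", "date": "2014-01-21", "val1": 34},
]

def keep_newest(records):
    """Keep one row per (title, author, date, val1) group.

    Within a group the row with the smallest timeAgo wins;
    ties are broken by the highest id.
    """
    best = {}
    for row in records:
        key = (row["title"], row["author"], row["date"], row["val1"])
        # Larger rank is better: newer first, then higher id.
        rank = (-row["timeAgo"], row["id"])
        if key not in best or rank > best[key][0]:
            best[key] = (rank, row)
    return sorted((row for _, row in best.values()), key=lambda r: r["id"])

for row in keep_newest(rows):
    print(row["id"], row["title"], row["author"], row["date"])
```

The kept ids match the A, B, C and D rows marked in the example table.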

Rails 3 - complex query with joins and counts, possible subqueries?

Ok, so I have a bit of a complex query I am trying to come up with in my Rails application. I have four tables: Clients, Projects, Invoices, Invoice_Line_Items. I am trying to get certain bits of data from all of those tables and display it in a "reports" type view in my application. This is what the structures look like for the four tables:
Clients
| id | name | archive |
----------------------------------------
| 1 | Client 1 | 0 |
| 2 | Client 2 | 0 |
Projects
| id | client_id | name | archive |
------------------------------------------------------
| 1 | 1 | Project 1 | 0 |
| 2 | 1 | Project 2 | 1 |
| 3 | 2 | Project 3 | 0 |
| 4 | 2 | Project 4 | 1 |
Invoices
| id | client_id | project_id | name | archive |
----------------------------------------------------------------------
| 1 | 1 | 1 | Invoice 1 | 0 |
| 2 | 1 | 1 | Invoice 2 | 0 |
| 3 | 1 | 2 | Invoice 3 | 1 |
| 4 | 1 | 2 | Invoice 4 | 1 |
| 5 | 2 | 3 | Invoice 5 | 0 |
| 6 | 2 | 3 | Invoice 6 | 0 |
| 7 | 2 | 4 | Invoice 7 | 1 |
| 8 | 2 | 4 | Invoice 8 | 1 |
Invoice_Line_Items
| id | invoice_id | name | amount_due |
---------------------------------------------------------
| 1 | 1 | Item 1 | 500 |
| 2 | 1 | Item 2 | 500 |
| 3 | 2 | Item 3 | 500 |
| 4 | 2 | Item 4 | 500 |
| 5 | 3 | Item 5 | 500 |
| 6 | 3 | Item 6 | 500 |
| 7 | 4 | Item 7 | 500 |
| 8 | 4 | Item 8 | 500 |
| 9 | 5 | Item 9 | 500 |
| 10 | 5 | Item 10 | 500 |
| 11 | 6 | Item 11 | 500 |
| 12 | 6 | Item 12 | 500 |
| 13 | 7 | Item 13 | 500 |
| 14 | 7 | Item 14 | 500 |
| 15 | 8 | Item 15 | 500 |
| 16 | 8 | Item 16 | 500 |
Ok, I hope those diagrams make enough sense. What I am looking for as a result set is this (taken from the example data above):
| clients.name | current_projects | archived_projects | total_amount_due | total_amount_paid |
-----------------------------------------------------------------------------------------------------------
| Client 1 | 1 | 1 | 2000 | 2000 |
| Client 2 | 1 | 1 | 2000 | 2000 |
Ok, so here's what's going on there:
Getting all non-archived clients
Getting a count of all non-archived projects
Getting a count of all archived projects
Getting a total_amount_due from the invoice_line_items table that is a sum of all of the non-archived invoices
Getting a total_amount_paid from the invoice_line_items table that is a sum of all of the archived invoices
I am relatively new to Rails and this is a fairly complex query (at least in my head). Please let me know if there is a simpler solution that I am overlooking or if I am just overcomplicating it. If I need to do multiple queries in my controller that's fine; I just wanted to see if I could get away with one SQL call. I'm pretty sure I can do this fairly easily with some subqueries, but I'm not sure how to write those in the controller in Rails.
Thanks for any help or direction you can provide. If this question is just outrageous or whatever, let me know and I'll delete it and go search the Googles more (I have tried already to no avail).
Ok, well, I ended up figuring out a solution myself. I'm not quite sure it's the best solution; it feels heavy and messy, but I just created quite a few objects in the controller to get the SQL statements I needed to pull the data from the database. I basically have one object for each column (each column, not each row). Let me know if anyone can figure out a better solution.
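For comparison, the whole report can probably be produced by one SQL statement that joins clients to two pre-aggregated subqueries, one for project counts and one for invoice totals. Aggregating before the join avoids double counting when projects and line items fan out against each other. The sketch below runs that statement with Python's sqlite3 against the sample data; it is only meant to show the query shape, not a Rails implementation, and the function name client_report is made up. In Rails, a raw SQL string like this could presumably be handed to something like find_by_sql.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, archive INTEGER);
CREATE TABLE projects (id INTEGER PRIMARY KEY, client_id INTEGER,
                       name TEXT, archive INTEGER);
CREATE TABLE invoices (id INTEGER PRIMARY KEY, client_id INTEGER,
                       project_id INTEGER, name TEXT, archive INTEGER);
CREATE TABLE invoice_line_items (id INTEGER PRIMARY KEY, invoice_id INTEGER,
                                 name TEXT, amount_due INTEGER);
INSERT INTO clients VALUES (1,'Client 1',0),(2,'Client 2',0);
INSERT INTO projects VALUES
    (1,1,'Project 1',0),(2,1,'Project 2',1),
    (3,2,'Project 3',0),(4,2,'Project 4',1);
INSERT INTO invoices VALUES
    (1,1,1,'Invoice 1',0),(2,1,1,'Invoice 2',0),
    (3,1,2,'Invoice 3',1),(4,1,2,'Invoice 4',1),
    (5,2,3,'Invoice 5',0),(6,2,3,'Invoice 6',0),
    (7,2,4,'Invoice 7',1),(8,2,4,'Invoice 8',1);
""")
# Two line items of 500 per invoice, as in the sample table.
conn.executemany(
    "INSERT INTO invoice_line_items VALUES (?, ?, ?, 500)",
    [(i, (i + 1) // 2, f"Item {i}") for i in range(1, 17)],
)

REPORT_SQL = """
SELECT c.name,
       COALESCE(p.current_projects, 0),
       COALESCE(p.archived_projects, 0),
       COALESCE(i.total_amount_due, 0),
       COALESCE(i.total_amount_paid, 0)
FROM clients c
LEFT JOIN (
    -- project counts per client, split by archive flag
    SELECT client_id,
           SUM(CASE WHEN archive = 0 THEN 1 ELSE 0 END) AS current_projects,
           SUM(CASE WHEN archive = 1 THEN 1 ELSE 0 END) AS archived_projects
    FROM projects
    GROUP BY client_id
) p ON p.client_id = c.id
LEFT JOIN (
    -- line item totals per client, split by the invoice's archive flag
    SELECT inv.client_id,
           SUM(CASE WHEN inv.archive = 0 THEN li.amount_due ELSE 0 END)
               AS total_amount_due,
           SUM(CASE WHEN inv.archive = 1 THEN li.amount_due ELSE 0 END)
               AS total_amount_paid
    FROM invoices inv
    JOIN invoice_line_items li ON li.invoice_id = inv.id
    GROUP BY inv.client_id
) i ON i.client_id = c.id
WHERE c.archive = 0
ORDER BY c.id
"""

def client_report():
    """Return one report row per non-archived client."""
    return conn.execute(REPORT_SQL).fetchall()

for row in client_report():
    print(row)
```

The LEFT JOINs plus COALESCE keep clients that have no projects or no invoices in the report with zeros instead of dropping them.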