I am trying to split a record in a table into 2 records based on a column value. The input table shows 3 types of products and their prices. For a given product (row), only its corresponding column has a value; the other columns are Null.
My requirement is: whenever the Product column value in a row is composite (i.e. has more than one product, e.g. Bolt + Brush), the record must be split into two rows, one for each product in the composite.
So, in this example, notice how the 2nd row of the input gets split into 2 rows: one for "Bolt" and another for "Brush", with each price taken from the corresponding column (in this case, "Bolt" = $3.99 and "Brush" = $4.99).
Note: A composite product value contains at most 2 products, as shown in this example (e.g. Bolt + Brush).
CustId | Product | Hammer | Bolt | Brush
--------------------------------
12345 | Hammer | $5.99 | Null | Null
53762 | **Bolt+Brush** | Null | $3.99 | $4.99
43883 | Brush | Null | Null | $4.99
I have tried creating 2 predetermined records via UNION ALL in a CTE and then left outer joining main_table to the CTE, so that the join yields 2 records instead.
Expected output:
CustId | Product | Price
12345 | Hammer | $5.99
**53762** | **Bolt** | $3.99
**53762** | **Brush** | $4.99
43883 | Brush | $4.99
This has to be solved by Spark-SQL only.
I think this will work:
select CustId, 'Hammer' as product, Hammer as Price
from t
where Product like '%Hammer%'
union all
select CustId, 'Bolt' as product, Bolt
from t
where Product like '%Bolt%'
union all
select CustId, 'Brush' as product, Brush
from t
where Product like '%Brush%';
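As a rough check of the UNION ALL approach above, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. SQLite is only a stand-in for Spark SQL here, and storing the prices as plain numbers (without the "$") is an assumption made for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Spark table `t`.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (CustId INTEGER, Product TEXT, Hammer REAL, Bolt REAL, Brush REAL)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [
        (12345, "Hammer", 5.99, None, None),
        (53762, "Bolt+Brush", None, 3.99, 4.99),
        (43883, "Brush", None, None, 4.99),
    ],
)

# One SELECT per product column; a composite row matches two of the branches,
# so it comes out as two rows.
split_rows = conn.execute(
    """
    SELECT CustId, 'Hammer' AS Product, Hammer AS Price FROM t WHERE Product LIKE '%Hammer%'
    UNION ALL
    SELECT CustId, 'Bolt', Bolt FROM t WHERE Product LIKE '%Bolt%'
    UNION ALL
    SELECT CustId, 'Brush', Brush FROM t WHERE Product LIKE '%Brush%'
    ORDER BY CustId, Product
    """
).fetchall()

for row in split_rows:
    print(row)
```

The ORDER BY is only there to make the printed order predictable; the query in the answer does not need it.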
This would also work:
select custid, product,
case when product like '%Hammer%' then hammer
when product like '%Bolt%' then bolt
else brush end as Price from
(select custid, explode(split(product,'\\+')) as product, hammer, bolt, brush
from t) x;
I'm trying to develop a database for my inventory. But I have no idea how to keep track of multilevel packaging.
For example:
I currently have a products and positions table
products
Id | Name
================
1013 | Metal
1014 | Wood
positions
id | Name
================
1 | 1-1-1-1
2 | 1-1-1-2
And my inventory table I was thinking of doing something like this:
Let's say I stored 1 box with 1000 Metal and 1 box with 500 Wood at position 1-1-1-1
ItemId | ProductId | Quantity | PositionId
==========================================
1 | 1013 | 1000 | 1
2 | 1014 | 500 | 1
So I'll label those two boxes with a barcode 1 and 2 respectively, so if I scan them, I can check this table to see the product and quantity inside them.
But I can also put these 2 boxes (1 and 2) inside another box (let's call it box 3), which would generate a new barcode for it that, if scanned, will show both previous boxes and its items. And store this box 3 in another position
And I can also put this box 3 inside a pallet, generating a new code and so on. So basically I can multilevel package N times.
What is the best table structure to keep track of all of this? Thanks in advance for any help!
I would add another column to the products table, make it a BIT and maybe call it BOM, BillOfMaterials, or whatever makes sense to you
So your products Table would look like this
Then you could create another table called BillOfMaterials
Quantity is how many of your products are needed to make up your new product. So for this example 2 metal and 1 wood make a pencil.
I was able to make a good structure:
My products and positions are the same but I created a stock table like:
id | product_id | amount | parent_id | position_id
=====================================================
1 | 1013 | 1000 | 4 | 1
2 | 1013 | 1000 | 4 | 1
3 | 1014 | 500 | 4 | 1
4 | 1234 | NULL | NULL | 1
The 1234 (random id) is a box that contains 2000 metal and 500 wood. I don't save this box in the product table.
When I scan the box with id 4, I perform a recursive CTE query:
with recursive bom as (
select *, 1 as level
from testing.stock
where id = '4' -- scanned id
union all
select c.*, p.level + 1
from testing.stock c
join bom p on c.parent_id = p.id
)
select product_id as product, sum(amount), position_id
from bom b
left join testing.product pd on b.product_id = pd.id
where pd.id is not null
group by product_id, position_id
which returns:
sum | product | position
2000 | 1013 | 3
500 | 1014 | 3
To get the contents by position I just run a variation of the above query. To perform an update I get the ids inside that box and run:
update testing.stock set position_id = 2 where id in (/* ids from a variation of the above query */)
I hope this helps someone. This works for N packaging levels.
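The recursive lookup and the move described above can be sketched end to end as follows. This is a minimal illustration using Python's sqlite3 module with the sample rows from the stock table; the table layout without the `testing` schema prefix and the use of SQLite instead of the original database are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stock (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        amount INTEGER,
        parent_id INTEGER,
        position_id INTEGER
    );
    INSERT INTO product VALUES (1013, 'Metal'), (1014, 'Wood');
    INSERT INTO stock VALUES
        (1, 1013, 1000, 4, 1),
        (2, 1013, 1000, 4, 1),
        (3, 1014, 500, 4, 1),
        (4, 1234, NULL, NULL, 1);
    """
)

scanned_id = 4  # the outer box

# Walk down from the scanned box to everything nested inside it, then total
# the amounts per real product (the box itself is not in `product`).
contents = conn.execute(
    """
    WITH RECURSIVE bom AS (
        SELECT id, product_id, amount, position_id FROM stock WHERE id = ?
        UNION ALL
        SELECT c.id, c.product_id, c.amount, c.position_id
        FROM stock c JOIN bom p ON c.parent_id = p.id
    )
    SELECT b.product_id, SUM(b.amount), b.position_id
    FROM bom b JOIN product pd ON pd.id = b.product_id
    GROUP BY b.product_id, b.position_id
    ORDER BY b.product_id
    """,
    (scanned_id,),
).fetchall()
print(contents)

# Move the box and everything inside it to position 2.
conn.execute(
    """
    WITH RECURSIVE bom(id) AS (
        SELECT id FROM stock WHERE id = ?
        UNION ALL
        SELECT c.id FROM stock c JOIN bom p ON c.parent_id = p.id
    )
    UPDATE stock SET position_id = ? WHERE id IN (SELECT id FROM bom)
    """,
    (scanned_id, 2),
)
positions = conn.execute("SELECT DISTINCT position_id FROM stock").fetchall()
print(positions)
```

Because the recursion follows parent_id, the same two statements should keep working if box 4 is later placed inside a pallet row of its own.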
One of my tables in my database contains rows with requisition numbers and other related info. I am trying to create a second table (populated with an INSERT INTO statement) that duplicates these rows and adds a series value based on the value in the QuantityOrdered column.
For example, the first table is shown below:
+-------------+----------+
| Requisition | Quantity |
+-------------+----------+
| 10001_01_AD | 4 |
+-------------+----------+
and I would like the output to be as follows:
+-------------+----------+----------+
| Requisition | Quantity | Series |
+-------------+----------+----------+
| 10001_01_AD | 4 | 1 |
| 10001_01_AD | 4 | 2 |
| 10001_01_AD | 4 | 3 |
| 10001_01_AD | 4 | 4 |
+-------------+----------+----------+
I've been attempting to use Row_Number() to sequence the values but it's numbering rows based on instances of Requisition values, not based on the Quantity value.
Non-recursive way:
SELECT *
FROM tab t
CROSS APPLY (SELECT n
FROM (SELECT ROW_NUMBER() OVER(ORDER BY 1/0) AS n
FROM master..spt_values s1) AS sub
WHERE sub.n <= t.Quantity) AS s2(Series);
You need a recursive approach:
with t as (
select Requisition, 1 as start, Quantity
from tab
union all
select Requisition, start + 1, Quantity
from t
where start < Quantity
)
select Requisition, Quantity, start as Series
from t;
However, by default recursion is limited to 100 levels, so if you have larger quantities you need to add the query hint option (maxrecursion 0).
A simple method uses recursive CTEs:
with cte as (
select requisition, quantity, 1 as series
from t
union all
select requisition, quantity, 1 + series
from cte
where series < quantity
)
select requisition, quantity, series
from cte;
With default setting, this works up to a quantity of 100. For larger quantities, you can add option (maxrecursion 0) to the query.
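To see the recursive expansion produce one row per unit of Quantity, here is a small sketch run through Python's sqlite3 module. SQLite is a stand-in for SQL Server (it needs the RECURSIVE keyword and has no maxrecursion option), and the table name `tab` is taken from the non-recursive answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (Requisition TEXT, Quantity INTEGER)")
conn.execute("INSERT INTO tab VALUES ('10001_01_AD', 4)")

# Anchor member: every requisition starts at Series = 1.
# Recursive member: keep adding 1 until Series reaches Quantity.
series_rows = conn.execute(
    """
    WITH RECURSIVE cte AS (
        SELECT Requisition, Quantity, 1 AS Series FROM tab
        UNION ALL
        SELECT Requisition, Quantity, Series + 1 FROM cte WHERE Series < Quantity
    )
    SELECT Requisition, Quantity, Series FROM cte
    ORDER BY Requisition, Series
    """
).fetchall()

for row in series_rows:
    print(row)
```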
The code below references two tables. The tables are identical in structure; the only difference is the "PRICE" and "PRICE_DATE" values, because the second is the same table as it stood one year earlier. All I want is a new table that takes the latest price for each fund from each table. In addition, I want another column that calculates the growth.
The code below works for this purpose.
SELECT [2015_11_Fund_Prices].FUND_CODE, [2015_11_Fund_Prices].PRICE AS
[PRICE_#_112015], [2016_11_Fund_Prices].PRICE AS [PRICE_#_112016],
([2016_11_Fund_Prices].[PRICE]/[2015_11_Fund_Prices].[PRICE]-1) AS Growth INTO [2016_11_Monthly_Fund_Prices]
FROM [2016_11_Fund_Prices] INNER JOIN [2015_11_Fund_Prices] ON [2016_11_Fund_Prices].FUND_CODE = [2015_11_Fund_Prices].FUND_CODE
GROUP BY [2015_11_Fund_Prices].FUND_CODE, [2015_11_Fund_Prices].PRICE_DATE, [2015_11_Fund_Prices].PRICE, [2016_11_Fund_Prices].PRICE, [2016_11_Fund_Prices].PRICE_DATE, ([2016_11_Fund_Prices].[PRICE]/[2015_11_Fund_Prices].[PRICE]-1)
HAVING ((([2015_11_Fund_Prices].PRICE_DATE)=#24/11/2015#) AND (([2016_11_Fund_Prices].PRICE_DATE)=#24/11/2016#));
However, this code assumes that the latest price is 24/11 in both tables. I want to replace this with a max function that will result in the query referencing only the price in the row with the highest date value.
Can anyone help?
The tables used are:
+-----------+------------+-------+
| Fund_Code | PRICE_DATE | PRICE |
+-----------+------------+-------+
| 1 | 12/12/12 | 1 |
| 1 | 13/12/12 | 1.2 |
| 1 | 14/12/12 | 1.1 |
| 2 | 12/12/12 | 1.12 |
| 2 | 13/12/12 | 1.13 |
| 2 | 14/12/12 | 1.11 |
+-----------+------------+-------+
So the second table is exactly the same but dates corresponding to the following year.
All I want is a table with:
Fund_Code Price1 Price2 Growth
Thanks
You need a sub-query like this:
SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate FROM [2016_11_Fund_Prices] GROUP BY FUND_CODE
If you add this sub-query to the above and join it to the 2016_11_Fund_Prices table on FUND_CODE and PRICE_DATE = MaxPriceDate, it should do what you need.
SELECT [2016_11_Fund_Prices].FUND_CODE, PRICE, PRICE_DATE
FROM [2016_11_Fund_Prices]
INNER JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate FROM [2016_11_Fund_Prices] GROUP BY FUND_CODE) mp
ON [2016_11_Fund_Prices].FUND_CODE=mp.FUND_CODE AND [2016_11_Fund_Prices].PRICE_DATE=mp.MaxPriceDate
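Applying the same max-date sub-query to both years and joining the results gives the Fund_Code / Price1 / Price2 / Growth table asked for. The sketch below does this in SQLite via Python rather than in Access. The sample prices and dates are invented for illustration, and the dates are stored as ISO text (YYYY-MM-DD) so that MAX picks the latest one; both are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE [2015_11_Fund_Prices] (FUND_CODE INTEGER, PRICE_DATE TEXT, PRICE REAL);
    CREATE TABLE [2016_11_Fund_Prices] (FUND_CODE INTEGER, PRICE_DATE TEXT, PRICE REAL);
    -- Invented sample data; the latest date differs per fund on purpose.
    INSERT INTO [2015_11_Fund_Prices] VALUES
        (1, '2015-11-23', 0.9), (1, '2015-11-24', 1.0),
        (2, '2015-11-20', 1.8), (2, '2015-11-23', 2.0);
    INSERT INTO [2016_11_Fund_Prices] VALUES
        (1, '2016-11-24', 1.5),
        (2, '2016-11-23', 2.2), (2, '2016-11-25', 2.5);
    """
)

growth_rows = conn.execute(
    """
    WITH latest_2015 AS (
        SELECT p.FUND_CODE, p.PRICE
        FROM [2015_11_Fund_Prices] p
        JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate
              FROM [2015_11_Fund_Prices] GROUP BY FUND_CODE) mp
          ON p.FUND_CODE = mp.FUND_CODE AND p.PRICE_DATE = mp.MaxPriceDate
    ),
    latest_2016 AS (
        SELECT p.FUND_CODE, p.PRICE
        FROM [2016_11_Fund_Prices] p
        JOIN (SELECT FUND_CODE, MAX(PRICE_DATE) AS MaxPriceDate
              FROM [2016_11_Fund_Prices] GROUP BY FUND_CODE) mp
          ON p.FUND_CODE = mp.FUND_CODE AND p.PRICE_DATE = mp.MaxPriceDate
    )
    SELECT a.FUND_CODE, a.PRICE AS Price1, b.PRICE AS Price2,
           b.PRICE / a.PRICE - 1 AS Growth
    FROM latest_2015 a
    JOIN latest_2016 b ON a.FUND_CODE = b.FUND_CODE
    ORDER BY a.FUND_CODE
    """
).fetchall()

for row in growth_rows:
    print(row)
```

The join on the per-fund maximum date replaces the hard-coded 24/11 filters, so each fund can have a different latest date in each table.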
I'm working on an SQL database and I need a way to return a list of all products that have a tag attached to them. Here's a quick example of my xml
<ArrayOfString>
<string>fee</string>
<string>activation</string>
</ArrayOfString>
I also have a column for ProductType and ProductID. What I need to do is pull out all entries of a certain product type, and list off the different strings in the xml column. So what I need to see for example is something like...
------------------------------------------------
PRODUCTID | TAG | PRODUCT TYPE
------------------------------------------------
1 | fee | 1
1 | activation | 1
2 | fee | 1
3 | fee | 1
So basically if the xml in that column has more than 1 'string' node, I need to know EACH node, and the productID that it goes with.
I've tried using Tags.value() and Tags.query() to get my values, and I can get close, in that it returns values like...
------------------------------------------------
PRODUCTID | TAG | PRODUCT TYPE
------------------------------------------------
1 | <string>fee</st| 1
2 | fee | 1
3 | fee | 1
but this lumps all of the strings together in one view...
Also, I was trying to sort that out since I could make that work, and was trying to say
SELECT ProductID, Tags.query('/ArrayOfString/string') AS AttachedTags
FROM dbo.products
WHERE ProductType = 1
ORDER BY AttachedTags
but it says that AttachedTags is not a valid column name... Any ideas on how I can sort the results by certain tags first?
In order to break down multiple nodes with the same name you need to use CROSS APPLY:
SELECT ProductID, ProductTags.Tag.value('.', 'nvarchar(MAX)') AS AttachedTag
FROM dbo.products
CROSS APPLY Tags.nodes('/ArrayOfString/string/text()') as ProductTags(Tag)
WHERE ProductType = 1
ORDER BY AttachedTag
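The effect of CROSS APPLY with nodes() is one output row per `<string>` node, paired with the row it came from. As a language-neutral illustration of that flattening (not SQL Server code), here is a sketch using Python's standard xml.etree.ElementTree. The three product rows are made up to match the expected output in the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows: (ProductID, ProductType, Tags XML)
products = [
    (1, 1, "<ArrayOfString><string>fee</string><string>activation</string></ArrayOfString>"),
    (2, 1, "<ArrayOfString><string>fee</string></ArrayOfString>"),
    (3, 1, "<ArrayOfString><string>fee</string></ArrayOfString>"),
]

# For every product of type 1, emit one (ProductID, Tag, ProductType) row
# per <string> child of the <ArrayOfString> root.
tag_rows = [
    (product_id, node.text, product_type)
    for product_id, product_type, tags_xml in products
    if product_type == 1
    for node in ET.fromstring(tags_xml).findall("string")
]

for row in tag_rows:
    print(row)
```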
You can also use value() to read a single node:
Select YOUR_COLUMN.value('(YOUR_NODES)[1]','VARCHAR(100)') From YOUR_TABLE
like
xml.value('(/form/NotesSection/Nt-Status)[1]','VARCHAR(100)')
I'm trying to convert a product table that contains all the details of each product into separate tables in SQL. I've got everything done except for the duplicated descriptor details.
The problem is that all the products have size/color/style/other values that many other products share. I want to have only one size or color descriptor for all the items and reuse its "ID" for every product that has it. I believe that makes the descriptor ID a parent key to the Product ID, which is a foreign key. The only problem is that every descriptor would have multiple foreign keys assigned to it. So I was thinking of skipping the parent-key bookkeeping for each descriptor and just checking whether the descriptor already exists and, if it does, using its key.
Data Table
PI | Colo | Sz | OTHER
1 | Blue | 5 | Vintage
2 | Blue | 6 | Vintage
3 | Blac | 5 | Simple
4 | Blac | 6 | Simple
===================================
Its destination table is this
===================================
DI | Description
1 | Blue
2 | Blac
3 | 5
4 | 6
6 | Vintage
7 | Simple
=============================
Select Data.Table
Unique.Data.Table.Colo
Unique.Data.Table.Sz
Unique.Data.Table.Other
=======================================
The second part of the question: after we create all the descriptors, how do I write a new query that assigns the product IDs to the descriptors?
PI | DI
1 | 1
1 | 3
1 | 6
2 | 1
2 | 4
2 | 6
By figuring out how to do this I should be able to duplicate this pattern for all 300 + columns in the product. Some of these fields are 60+ characters large so its going to save a ton of space.
Do I use an array?
Okay, if I understand you correctly, you want all unique attributes converted from columns into rows in a single table (detailstable) that has an id and a description field:
Assuming the schema:
datatable
------------------
PI [PK]
Colo
Sz
OTHER
detailstable
------------------
DI [PK]
Description
You can first get all of the unique attributes into its own table with:
INSERT INTO detailstable (Description)
SELECT
a.description
FROM
(
SELECT DISTINCT Colo AS description
FROM datatable
UNION
SELECT DISTINCT Sz AS description
FROM datatable
UNION
SELECT DISTINCT OTHER AS description
FROM datatable
) a
Then to link up the datatable to the detailstable, I'm assuming you have a cross-reference table defined like:
datadetails
------------------
PI [PK]
DI [PK]
You can then do:
INSERT INTO datadetails (PI, DI)
SELECT
a.PI,
b.DI
FROM
datatable a
INNER JOIN
detailstable b ON b.Description IN (a.Colo, a.Sz, a.OTHER)
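Both INSERT statements can be tried together on the sample data from the question. The sketch below uses SQLite through Python; the auto-incrementing DI key and storing Sz as text (so all descriptions share one type) are assumptions. The DI values that get assigned may differ from the ones in the question, so the check joins back to the descriptions instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE datatable (PI INTEGER PRIMARY KEY, Colo TEXT, Sz TEXT, OTHER TEXT);
    CREATE TABLE detailstable (DI INTEGER PRIMARY KEY AUTOINCREMENT, Description TEXT UNIQUE);
    CREATE TABLE datadetails (PI INTEGER, DI INTEGER, PRIMARY KEY (PI, DI));

    INSERT INTO datatable VALUES
        (1, 'Blue', '5', 'Vintage'),
        (2, 'Blue', '6', 'Vintage'),
        (3, 'Blac', '5', 'Simple'),
        (4, 'Blac', '6', 'Simple');

    -- Step 1: one descriptor row per distinct value (UNION removes duplicates).
    INSERT INTO detailstable (Description)
    SELECT Colo FROM datatable
    UNION
    SELECT Sz FROM datatable
    UNION
    SELECT OTHER FROM datatable;

    -- Step 2: link each product to every descriptor it uses.
    INSERT INTO datadetails (PI, DI)
    SELECT a.PI, b.DI
    FROM datatable a
    INNER JOIN detailstable b ON b.Description IN (a.Colo, a.Sz, a.OTHER);
    """
)

descriptor_count = conn.execute("SELECT COUNT(*) FROM detailstable").fetchone()[0]
links = conn.execute(
    """
    SELECT dd.PI, d.Description
    FROM datadetails dd JOIN detailstable d ON d.DI = dd.DI
    ORDER BY dd.PI, d.Description
    """
).fetchall()

print(descriptor_count)
for row in links:
    print(row)
```

One caveat of matching on the description alone: a value that appears in two different columns (say a size "5" and an "OTHER" value "5") would collapse into a single descriptor.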
I reckon you want to split the description table into separate tables per category, e.g. colorDescription, sizeDescription, etc.
If that is not practical, then I would recommend adding an extra column holding a category attribute:
DI | Description | Category
1 | Blue | Color
2 | Blac | Color
3 | 5 | Size
4 | 6 | Size
6 | Vintage | Other
7 | Simple | Other
Then make the primary key of this table the combination of the ID and Category columns.
This reduces the chance of introducing data errors, and makes any errors easier to track down.