Loop over one table, subselect another table and update values of first table with SQL/VBA - sql

I have a source table that has a few different prices for each product (depending on the order quantity). Those prices are listed vertically, so each product could have more than one row to display its prices.
Example:
ID | Quantity | Price
--------------------------
001 | 5 | 100
001 | 15 | 90
001 | 50 | 80
002 | 10 | 20
002 | 20 | 15
002 | 30 | 10
002 | 40 | 5
The other table I have is the result table in which there is only one row for each product, but there are five columns that each could contain the quantity and price for each row of the source table.
Example:
ID | Quantity_1 | Price_1 | Quantity_2 | Price_2 | Quantity_3 | Price_3 | Quantity_4 | Price_4 | Quantity_5 | Price_5
---------------------------------------------------------------------------------------------------------------------------
001 | | | | | | | | | |
002 | | | | | | | | | |
Result:
ID | Quantity_1 | Price_1 | Quantity_2 | Price_2 | Quantity_3 | Price_3 | Quantity_4 | Price_4 | Quantity_5 | Price_5
---------------------------------------------------------------------------------------------------------------------------
001 | 5 | 100 | 15 | 90 | 50 | 80 | | | |
002 | 10 | 20 | 20 | 15 | 30 | 10 | 40 | 5 | |
Here is my Python/SQL solution for this (I'm fully aware that this could not work in any way, but this was the only way for me to show you my interpretation of a solution to this problem):
For Each result_ID In result_table.ID:
    Subselect = (SELECT * FROM source_table WHERE source_table.ID = result_ID ORDER BY source_table.Quantity) # the Subselect should only contain rows where the IDs are the same
    For n in Range(0, len(Subselect)): # n (index) should start from 0 to last row - 1
        price_column_name = 'Price_' & (n + 1)
        quantity_column_name = 'Quantity_' & (n + 1)
        (UPDATE result_table
         SET result_table.price_column_name = Subselect[n].Price, # this should be the price of the n-th row in Subselect
             result_table.quantity_column_name = Subselect[n].Quantity # this should be the quantity of the n-th row in Subselect
         WHERE result_table.ID = Subselect[n].ID)
I honestly have no idea how to do this with only SQL or VBA, which are the only languages available to me in MS Access.

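This is a pain in MS Access. If you can enumerate the rows within each ID, you can pivot them.
If we assume that price is unique within each ID (or quantity, or both), you can generate a sequence number with a correlated subquery and pivot on it with conditional aggregation: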
select id,
       max(iif(seqnum = 1, quantity, null)) as quantity_1,
       max(iif(seqnum = 1, price, null)) as price_1,
       . . .
from (select st.*,
             (select count(*)
              from source_table st2
              where st2.id = st.id and st2.price >= st.price
             ) as seqnum
      from source_table st
     ) st
group by id;
I should note that another solution would use data frames in Python. If you want to take that route, ask another question and tag it with the appropriate Python tags. This question is clearly a SQL question.
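To try the pivot outside Access, here is a minimal sketch that runs the same conditional-aggregation query against an in-memory SQLite database, using Python only as a harness. SQLite stands in for Access purely for illustration, so `CASE WHEN` replaces `iif`; the `MAX_TIERS` constant and the `numbered` alias are names made up for this example.

```python
import sqlite3

# Assumption: SQLite stands in for MS Access; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_table (id TEXT, quantity INTEGER, price INTEGER);
    INSERT INTO source_table VALUES
        ('001', 5, 100), ('001', 15, 90), ('001', 50, 80),
        ('002', 10, 20), ('002', 20, 15), ('002', 30, 10), ('002', 40, 5);
""")

MAX_TIERS = 5  # hypothetical constant: number of Quantity_n/Price_n column pairs

# Build one pair of conditional aggregates per sequence number.
pivot_columns = ",\n       ".join(
    f"MAX(CASE WHEN seqnum = {n} THEN quantity END) AS quantity_{n}, "
    f"MAX(CASE WHEN seqnum = {n} THEN price END) AS price_{n}"
    for n in range(1, MAX_TIERS + 1)
)

query = f"""
SELECT id,
       {pivot_columns}
FROM (SELECT st.*,
             -- number the rows within an id: the highest price gets seqnum 1
             (SELECT COUNT(*)
              FROM source_table st2
              WHERE st2.id = st.id AND st2.price >= st.price) AS seqnum
      FROM source_table st) AS numbered
GROUP BY id
ORDER BY id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Unused tiers come back as NULL (`None` in Python), which matches the empty cells in the expected result table. The numbering relies on prices being unique within each ID, as the answer assumes.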

Related

SQL to Get Latest Field Value

I'm trying to write an SQL query (SQL Server) that returns the latest value of a field from a history table.
The table structure is basically as below:
ISSUE TABLE:
issueid
10
20
30
CHANGEGROUP TABLE:
changegroupid | issueid | updated |
1 | 10 | 01/01/2020 |
2 | 10 | 02/01/2020 |
3 | 10 | 03/01/2020 |
4 | 20 | 05/01/2020 |
5 | 20 | 06/01/2020 |
6 | 20 | 07/01/2020 |
7 | 30 | 04/01/2020 |
8 | 30 | 05/01/2020 |
9 | 30 | 06/01/2020 |
CHANGEITEM TABLE:
changegroupid | field | newvalue |
1 | ONE | 1 |
1 | TWO | A |
1 | THREE | Z |
2 | ONE | J |
2 | ONE | K |
2 | ONE | L |
3 | THREE | K |
3 | ONE | 2 |
3 | ONE | 1 | <--
4 | ONE | 1A |
5 | ONE | 1B |
6 | ONE | 1C | <--
7 | ONE | 1D |
8 | ONE | 1E |
9 | ONE | 1F | <--
EXPECTED RESULT:
issueid | updated | newvalue
10 | 03/01/2020 | 1
20 | 07/01/2020 | 1C
30 | 06/01/2020 | 1F
So each change to an issue item creates 1 change group record with the date the change was made, which can then contain 1 or more change item records.
Each change item shows the field name that was changed and the new value.
I then need to link those tables together to get each issue, the latest value of the field name called 'ONE', and ideally the date of the latest change.
These tables are from Jira, for those familiar with that table structure.
I've been trying to get this to work for a while now, so far I've got this query:
SELECT issuenum, MIN(created) AS updated FROM
(
SELECT ISSUE.IssueId, UpdGrp.Created as Created, UpdItm.NEWVALUE
FROM ISSUE
JOIN ChangeGroup UpdGrp ON (UpdGrp.IssueID = CR.ID)
JOIN CHANGEITEM UpdItm ON (UpdGrp.ID = UpdItm.groupid)
WHERE UPPER(UpdItm.FIELD) = UPPER('ONE')
) AS dummy
GROUP BY issuenum
ORDER BY issuenum
This returns the first two columns I'm looking for, but I can't work out how to return the final column. When I include it in the first line, I get the error "Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
I've done a search on here and can't find anything that exactly matches my requirements.
Use window functions. ROW_NUMBER() numbers each issue's changes from newest to oldest, so the row with seqnum = 1 is the latest one:
SELECT i.*
FROM (SELECT i.IssueId, cg.Created as Created, ci.NEWVALUE,
             ROW_NUMBER() OVER (PARTITION BY i.IssueId ORDER BY cg.Created DESC) as seqnum
      FROM ISSUE i JOIN
           ChangeGroup cg
           ON cg.IssueID = i.IssueId JOIN
           CHANGEITEM ci
           ON cg.ID = ci.groupid
      WHERE UPPER(ci.FIELD) = UPPER('ONE')
     ) i
WHERE seqnum = 1
ORDER BY issueid;
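As a rough check of the ROW_NUMBER() approach against the sample data, here is a minimal sketch using an in-memory SQLite database from Python. It makes several assumptions that are not in the question: SQLite stands in for SQL Server, the column names follow the sample tables rather than the real Jira schema, the dates are read as dd/mm/yyyy and stored as ISO strings, and SQLite's implicit `rowid` is used as a tiebreaker when one change group contains several 'ONE' items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE issue (issueid INTEGER);
    CREATE TABLE changegroup (changegroupid INTEGER, issueid INTEGER, updated TEXT);
    CREATE TABLE changeitem (changegroupid INTEGER, field TEXT, newvalue TEXT);
    INSERT INTO issue VALUES (10), (20), (30);
    INSERT INTO changegroup VALUES
        (1, 10, '2020-01-01'), (2, 10, '2020-01-02'), (3, 10, '2020-01-03'),
        (4, 20, '2020-01-05'), (5, 20, '2020-01-06'), (6, 20, '2020-01-07'),
        (7, 30, '2020-01-04'), (8, 30, '2020-01-05'), (9, 30, '2020-01-06');
    INSERT INTO changeitem VALUES
        (1, 'ONE', '1'), (1, 'TWO', 'A'), (1, 'THREE', 'Z'),
        (2, 'ONE', 'J'), (2, 'ONE', 'K'), (2, 'ONE', 'L'),
        (3, 'THREE', 'K'), (3, 'ONE', '2'), (3, 'ONE', '1'),
        (4, 'ONE', '1A'), (5, 'ONE', '1B'), (6, 'ONE', '1C'),
        (7, 'ONE', '1D'), (8, 'ONE', '1E'), (9, 'ONE', '1F');
""")

query = """
SELECT issueid, updated, newvalue
FROM (SELECT i.issueid, cg.updated, ci.newvalue,
             -- newest change first; rowid (insertion order) breaks ties
             -- between items of the same change group (an assumption)
             ROW_NUMBER() OVER (PARTITION BY i.issueid
                                ORDER BY cg.updated DESC, ci.rowid DESC) AS seqnum
      FROM issue i
      JOIN changegroup cg ON cg.issueid = i.issueid
      JOIN changeitem ci ON ci.changegroupid = cg.changegroupid
      WHERE UPPER(ci.field) = 'ONE') AS latest
WHERE seqnum = 1
ORDER BY issueid
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Without a tiebreaker, ROW_NUMBER() picks arbitrarily between the two 'ONE' items in change group 3, so a real query would need some ordering column on the change item table to decide which of them counts as the latest.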

SQL group column where other column is equal

I'm trying to select some information from a database.
I have a table with these columns:
Ident,Name,Length,Width,Quantity,Planned
The table data is as follows:
+-----------+-----------+---------+---------+------------+---------+
| Ident | Name | Length | Width | Quantity | Planned |
+-----------+-----------+---------+---------+------------+---------+
| 12345 | Name1 | 1500 | 1000 | 20 | 5 |
| 23456 | Name1 | 1500 | 1000 | 30 | 13 |
| 34567 | Name1 | 2500 | 1000 | 10 | 2 |
| 45678 | Name1 | 2500 | 1000 | 10 | 4 |
| 56789 | Name1 | 1500 | 1200 | 20 | 3 |
+-----------+-----------+---------+---------+------------+---------+
My desired result is to group the rows where Name, Length and Width are equal, sum the Quantity, and subtract the sum of Planned.
e.g.:
- Name1,1500,1000,32 --- (32 because (20+30)-(5+13))
- Name1,2500,1000,14 --- (14 because (10+10)-(2+4))
- Name1,1500,1200,17
I'm having trouble working out how to group or join this information to get the desired result. Maybe some of you can help me. If further information is required, please ask in a comment.
You can achieve this by grouping the table and subtracting the sum of Planned from the sum of Quantity.
select
Name
,Length
,Width
,sum(Quantity) - sum(Planned)
from yourTable
group by Name,Length,Width
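To check the grouping approach against the sample rows, here is a minimal sketch using an in-memory SQLite database from Python. The table name `parts` and the `Remaining` alias are hypothetical, chosen only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'parts' is a hypothetical table name; the columns follow the question
    CREATE TABLE parts (Ident INTEGER, Name TEXT, Length INTEGER,
                        Width INTEGER, Quantity INTEGER, Planned INTEGER);
    INSERT INTO parts VALUES
        (12345, 'Name1', 1500, 1000, 20, 5),
        (23456, 'Name1', 1500, 1000, 30, 13),
        (34567, 'Name1', 2500, 1000, 10, 2),
        (45678, 'Name1', 2500, 1000, 10, 4),
        (56789, 'Name1', 1500, 1200, 20, 3);
""")

query = """
SELECT Name, Length, Width,
       SUM(Quantity) - SUM(Planned) AS Remaining  -- total quantity minus total planned
FROM parts
GROUP BY Name, Length, Width
ORDER BY Length, Width
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each output row corresponds to one distinct combination of Name, Length and Width, so no self-join is needed.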
select
A1.Name,A1.Length,A1.Width,((A1.Quantity + A2.Quantity) -(A1.Planned+A2.Planned))
from `Table` AS A1, `Table` AS A2
where A1.Name = A2.Name and A1.Length = A2.Length and A1.Width = A2.Width
group by (whatever)
So you are comparing these columns from the same table?

How do I do multiple selection based on a flowchart of criteria?

Table name: Copies
+----------+-------+----------+---------+--------------+-------------+
| group_id | my_id | previous | in_this | higher_value | most_recent |
+----------+-------+----------+---------+--------------+-------------+
| 900      | 1     | null     | Y       | 7            | May 16      |
| 900      | 2     | null     | Y       | 3            | Oct 16      |
| 900      | 3     | null     | N       | 9            | Oct 16      |
| 901      | 4     | 378      | Y       | 3            | Oct 16      |
| 901      | 5     | null     | N       | 2            | Oct 16      |
| 902      | 6     | null     | N       | 5            | May 16      |
| 902      | 7     | null     | N       | 9            | Oct 16      |
| 903      | 8     | null     | Y       | 3            | Oct 16      |
| 903      | 9     | null     | Y       | 3            | May 16      |
| 904      | 10    | null     | N       | 0            | May 16      |
| 904      | 11    | null     | N       | 0            | May 16      |
+----------+-------+----------+---------+--------------+-------------+
Output table
+----------+-------+----------+---------+--------------+-------------+
| group_id | my_id | previous | in_this | higher_value | most_recent |
+----------+-------+----------+---------+--------------+-------------+
| 900      | 1     | null     | Y       | 7            | May 16      |
| 902      | 7     | null     | N       | 9            | Oct 16      |
| 903      | 8     | null     | Y       | 3            | Oct 16      |
+----------+-------+----------+---------+--------------+-------------+
Hi all, I need help with a query that returns one record within a group based on the importance of the field. The importance is ranked as follows:
previous- if any record within the group_id is not null, then no record within that group_id is returned (because according to our rules, all records within a group should have the same previous value)
in_this- If one record is Y, and the other is N within a group_id, then we keep the Y; If all records are Y or all are N, then we move to the next attribute
Higher_value- If all records in the ‘in_this’ field are equal, then we need to select the record with the greater value from this field. If both records have an equal value, we move to the next attribute
Most_recent- If all records were of equal value in the ‘higher_value’ field, then we consider the newest record. If these are equal, then nothing is returned.
This is a simplified version of the table I am looking at, but I just would like to get the gist of how something like this would work. Basically, my table has multiple copies of records that have been grouped through some algorithm. I have been tasked with selecting which of these records within a group is the ‘good’ one, and we are basing this on these fields.
I’d like the output to actually show all fields, because I will likely attempt to refine the query to include other fields (there are over 40 to consider), but the most important is the group_id and my_id fields. It would be neat if we could also somehow flag why each record got picked, but that isn’t necessary.
It seems like something like this should be easy, but I have a hard time wrapping my head around how to pick from within a group_id. Thanks for your help.
You can use analytic functions for this. The trick is establishing the right variables for each condition:
select t.*
from (select t.*,
             -- non-null if any record in the group has a previous value
             max(previous) over (partition by group_id) as max_previous,
             min(higher_value) over (partition by group_id) as min_higher_value,
             max(higher_value) over (partition by group_id) as max_higher_value,
             -- 1 = record with the greatest higher_value in the group
             row_number() over (partition by group_id order by higher_value desc) as seqnum_ghv,
             min(most_recent) over (partition by group_id) as min_most_recent,
             max(most_recent) over (partition by group_id) as max_most_recent,
             -- 1 = newest record in the group (assumes most_recent sorts as a date)
             row_number() over (partition by group_id order by most_recent desc) as seqnum_mr
      from Copies t
     ) t
-- note: the in_this rule is not handled here
where max_previous is null and
      ( (min_higher_value <> max_higher_value and seqnum_ghv = 1) or
        (min_higher_value = max_higher_value and min_most_recent <> max_most_recent and seqnum_mr = 1
        )
      );
The third condition as stated makes no sense, but you should get the idea for how to implement this.
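One way to read the four rules is as a lexicographic ranking: within each group, sort by in_this (Y first), then higher_value, then most_recent, and keep the top record only if it is the sole top record and no record in the group has a previous value. The following is a minimal sketch of that reading, not the answer's own query, run against an in-memory SQLite database from Python. It assumes that "May 16" and "Oct 16" mean May and October 2016 and stores them as sortable strings; the CTE names `ranked` and `best` are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Copies (group_id INTEGER, my_id INTEGER, previous INTEGER,
                         in_this TEXT, higher_value INTEGER, most_recent TEXT);
    -- Assumption: most_recent is stored as 'YYYY-MM' so that it sorts chronologically
    INSERT INTO Copies VALUES
        (900, 1, NULL, 'Y', 7, '2016-05'), (900, 2, NULL, 'Y', 3, '2016-10'),
        (900, 3, NULL, 'N', 9, '2016-10'),
        (901, 4, 378,  'Y', 3, '2016-10'), (901, 5, NULL, 'N', 2, '2016-10'),
        (902, 6, NULL, 'N', 5, '2016-05'), (902, 7, NULL, 'N', 9, '2016-10'),
        (903, 8, NULL, 'Y', 3, '2016-10'), (903, 9, NULL, 'Y', 3, '2016-05'),
        (904, 10, NULL, 'N', 0, '2016-05'), (904, 11, NULL, 'N', 0, '2016-05');
""")

query = """
WITH ranked AS (
    SELECT c.*,
           -- number of records in the group with a non-null previous value
           COUNT(previous) OVER (PARTITION BY group_id) AS n_previous,
           -- RANK gives tied records the same rank, so full ties stay visible
           RANK() OVER (PARTITION BY group_id
                        ORDER BY in_this DESC, higher_value DESC, most_recent DESC) AS rnk
    FROM Copies c
),
best AS (
    SELECT r.*, COUNT(*) OVER (PARTITION BY group_id) AS n_best
    FROM ranked r
    WHERE rnk = 1
)
SELECT group_id, my_id, previous, in_this, higher_value, most_recent
FROM best
WHERE n_previous = 0   -- rule 1: drop groups where any previous is set
  AND n_best = 1       -- last rule: a complete tie returns nothing
ORDER BY group_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this keeps my_id 1, 7 and 8, matching the output table. Sorting in_this descending works only because 'Y' happens to sort after 'N'; with other flag values an explicit CASE expression would be safer.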

Create a sub query for sum data as a new column in SQL Server

Suppose I have a table named tblTemp with the following data:
| ID | AMOUNT |
----------------
| 1 | 10 |
| 1-1 | 20 |
| 1-2 | 30 |
| 1-3 | 40 |
| 2 | 50 |
| 3 | 60 |
| 4 | 70 |
| 4-1 | 80 |
| 5 | 90 |
| 6 | 100 |
An ID has the format X (no dash) if it stands alone, or X-Y if the ID (Y) is a child of (X).
I want to add a new column (Total Amount) to the output, as below:
| ID | AMOUNT | Total Amount |
---------------------------------
| 1 | 10 | 100 |
| 1-1 | 20 | 100 |
| 1-2 | 30 | 100 |
| 1-3 | 40 | 100 |
| 2 | 50 | 50 |
| 3 | 60 | 60 |
| 4 | 70 | 150 |
| 4-1 | 80 | 150 |
| 5 | 90 | 90 |
| 6 | 100 | 100 |
The "Total Amount" column is a calculated column that sums the Amount values of all rows sharing the same parent (X) in the ID column.
In order to get parent ID (X), I use the following SQL:
SELECT
ID, SUBSTRING (ID, 1,
IIF (CHARINDEX('-', ID) = 0,
len(ID),
CHARINDEX('-', ID) - 1)
), Amount
FROM
tblTemp
How can I write a query like this in SQL Server 2012?
You can use sqlfiddle here to test it.
Thank You
Pengan
You have already done most of the work. To get the final result you can use your existing query and make it a subquery or use a CTE, then use sum() over() to get the result:
;with cte as
(
    SELECT
        ID,
        SUBSTRING (ID, 1,
            IIF (CHARINDEX('-', ID) = 0,
                len(ID),
                CHARINDEX('-', ID) - 1)
        ) id_val, Amount
    FROM tblTemp
)
select id, amount, sum(amount) over(partition by id_val) total
from cte
See SQL Fiddle with Demo
You can do this using the sum() window function:
select id, amount,
       SUM(amount) over (partition by (case when id like '%-%'
                                            then left(id, charindex('-', id) - 1)
                                            else id
                                       end)
                        ) as TotalAmount
from tblTemp t
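To see the sum() over (partition by ...) idea produce the expected totals, here is a minimal sketch using an in-memory SQLite database from Python. SQLite stands in for SQL Server only for illustration, so `INSTR` and `SUBSTR` replace `CHARINDEX` and `LEFT`; the `TotalAmount` alias follows the second answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblTemp (ID TEXT, Amount INTEGER);
    INSERT INTO tblTemp VALUES
        ('1', 10), ('1-1', 20), ('1-2', 30), ('1-3', 40), ('2', 50),
        ('3', 60), ('4', 70), ('4-1', 80), ('5', 90), ('6', 100);
""")

query = """
SELECT ID, Amount,
       -- partition key: the part of ID before the dash, or the whole ID if there is none
       SUM(Amount) OVER (PARTITION BY CASE WHEN INSTR(ID, '-') = 0
                                           THEN ID
                                           ELSE SUBSTR(ID, 1, INSTR(ID, '-') - 1)
                                      END) AS TotalAmount
FROM tblTemp
ORDER BY ID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the window function does not collapse rows, every child row keeps its own Amount while carrying the total of its parent group.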

Distinct over several field but show all field SQL Server

I have a query with a result like this
No | date       | No.PO | product | type | div | price
1. | 01-10-2012 | AAA1  | X1      | 1    | SBS | 100
2. | 09-10-2012 | ABA1  | X1      | 2    | SBS | 150
3. | 11-10-2012 | ACC1  | X1      | 1    | SBS | 110
4. | 15-10-2012 | ACD1  | X1      | 1    | DBS | 115
5. | 20-10-2012 | ADA1  | X1      | 1    | SBS | 112
6. | 23-10-2012 | AFA1  | X1      | 2    | SBS | 160
7. | 27-10-2012 | AHA1  | X1      | 1    | SBS | 120
and a few thousand more records.
The result should look like this:
No | date       | No.PO | product | type | div | price
1. | 27-10-2012 | AHA1  | X1      | 1    | SBS | 120
2. | 23-10-2012 | AFA1  | X1      | 2    | SBS | 160
3. | 15-10-2012 | ACD1  | X1      | 1    | DBS | 115
Here are the rules:
Rows must be distinct on product, type and div.
Only the last transaction should be shown (that is, the row with the latest date among those sharing the same product, type and div).
All fields (date, No.PO, product, type, div and price) must be shown.
I hope my description is clear.
Can anyone help me with the right query?
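Please try this query and check the result; it is a guess based on the description. It numbers the rows within each Product, Type and Div combination from newest to oldest and keeps only the newest one:
SELECT * FROM(
    SELECT
        ROW_NUMBER() OVER (Partition by [Product], [Type], [Div] ORDER BY [Date] DESC) RowNum,
        [Date],
        [No.PO],
        [Product],
        [Type],
        [Div],
        [Price]
    FROM TABLE_Name
)x WHERE RowNum=1
ORDER BY [Date] DESC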
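To check the "latest row per group" pattern against the sample rows, here is a minimal sketch using an in-memory SQLite database from Python. The table name `transactions` and the column names `trans_date` and `po_no` are hypothetical, and the dates are stored as ISO strings so that they sort chronologically (the dd-mm-yyyy strings in the question would not sort correctly as text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical table and column names; dates converted to ISO format
    CREATE TABLE transactions (trans_date TEXT, po_no TEXT, product TEXT,
                               type INTEGER, div TEXT, price INTEGER);
    INSERT INTO transactions VALUES
        ('2012-10-01', 'AAA1', 'X1', 1, 'SBS', 100),
        ('2012-10-09', 'ABA1', 'X1', 2, 'SBS', 150),
        ('2012-10-11', 'ACC1', 'X1', 1, 'SBS', 110),
        ('2012-10-15', 'ACD1', 'X1', 1, 'DBS', 115),
        ('2012-10-20', 'ADA1', 'X1', 1, 'SBS', 112),
        ('2012-10-23', 'AFA1', 'X1', 2, 'SBS', 160),
        ('2012-10-27', 'AHA1', 'X1', 1, 'SBS', 120);
""")

query = """
SELECT trans_date, po_no, product, type, div, price
FROM (SELECT t.*,
             -- 1 = newest row within each product/type/div combination
             ROW_NUMBER() OVER (PARTITION BY product, type, div
                                ORDER BY trans_date DESC) AS row_num
      FROM transactions t) AS numbered
WHERE row_num = 1
ORDER BY trans_date DESC
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two rows in the same combination share the latest date, ROW_NUMBER() keeps one of them arbitrarily; adding a second ORDER BY column would make the choice deterministic.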