Re-numbering sub-sets of a table in SQL - sql

I have a table which contains columns for: ComponentID, PartID, Position, Country and City. The primary key consists of ComponentID, PartID, Country and City.
I am using MS SQL Azure.
Potentially a "part" may be deleted. If this happens, I need to be able to reset all of the positions for components and parts, BUT partitioned by Country and city.
I have written the code below, but it throws a primary key error. Can anyone help?
-- Update the positions
;WITH CTE AS
(
SELECT Comp.Position,
RowNum = row_number() OVER (partition by Comp.Country, Comp.City order by Comp.Position ASC)
FROM Comp
WHERE Comp.CompID = #selectedComponent
)
UPDATE CTE
SET Position = RowNum

Your CTE is generating the new row positions for your data so you then need to Join that to your existing table to update the position values.
To do that, you will need to include all the primary key columns in the CTE, and then do an update over a join:
UPDATE dbo.Comp
SET Position = CTE.RowNum
FROM CTE
INNER JOIN DBO.Comp ON CTE.ComponentId = dbo.Comp.ComponentId
AND CTE.City = dbo.Comp.City AND CTE.Country = dbo.Comp.Country AND CTE.PartID = dbo.Comp.PartID

Related

Inner join of table a and table b with selecting only one row of multiple in table b ( row with colum n = max value )

i am a total beginner in SQL. So i have two tables ( table a and table b )
table a holds people with unique IDs, table b holds multiple rows for each person of table a and also the persons ID ( for a possible join ) . the rows in table b are sorted by the columnn row_number.
How can i select all people but only the row of table b with the highest row_number ?
i hope you could somewhat understand me.
Cheers
If i got you right:
SELECT a.persons_ID
,b.rn
FROM A
INNER JOIN
(
SELECT MAX(row_number_column) AS rn
,persons_ID
FROM B
GROUP BY persons_ID
) sub
ON sub.persons_ID = A.persons_ID
The subselect in the inner joins groups your data of table B. So there will be just one row for each persons_ID - the row with the highest row_number_column.
Finally just a simple join on persons_ID.
If you don't need any other information than the person ID and the last row_number per person, then it is quite trivial. Let's call the first table person and the second visit:
select person_id,
max(row_number) max_row_number
from visit
group by person_id
If you need some other information from the first table, like person.name, then perform the join:
select person.person_id,
person.name,
max(visit.row_number) max_row_number
from person
inner join visit on visit.person_id = person.person_id
group by person.person_id,
person.name
If you need some other information from the second table, like visit.present, then modern databases support the row_number() window function (not to be confused with the column that you have):
select name,
base.row_number,
present
from (
select person.name,
row_number() over (partition by visit.person_id
order by visit.row_number desc) rn,
visit.row_number,
visit.present
from person
inner join visit on visit.person_id = person.person_id
) base
where rn = 1
NB: I would strongly advise to rename the column row_number to some other name, as row_number is an analytic function in many databases.

SQL Server 2016 Sub Query Guidance

I am currently working on an assignment for my SQL class and I am stuck. I'm not looking for full code to answer the question, just a little nudge in the right direction. If you do provide full code would you mind a small explanation as to why you did it that way (so I can actually learn something.)
Here is the question:
Write a SELECT statement that returns three columns: EmailAddress, ShipmentId, and the order total for each Client. To do this, you can group the result set by the EmailAddress and ShipmentId columns. In addition, you must calculate the order total from the columns in the ShipItems table.
Write a second SELECT statement that uses the first SELECT statement in its FROM clause. The main query should return two columns: the Client’s email address and the largest order for that Client. To do this, you can group the result set by the EmailAddress column.
I am confused on how to pull in the EmailAddress column from the Clients table, as in order to join it I have to bring in other tables that aren't being used. I am assuming there is an easier way to do this using sub Queries as that is what we are working on at the time.
Think of SQL as working with sets of data as opposed to just tables. Tables are merely a set of data. So when you view data this way you immediately see that the query below returns a set of data consisting of the entirety of another set, being a table:
SELECT * FROM MyTable1
Now, if you were to only get the first two columns from MyTable1 you would return a different set that consisted only of columns 1 and 2:
SELECT col1, col2 FROM MyTable1
Now you can treat this second set, a subset of data as a "table" as well and query it like this:
SELECT
*
FROM (
SELECT
col1,
col2
FROM
MyTable1
)
This will return all the columns from the two columns provided in the inner set.
So, your inner query, which I won't write for you since you appear to be a student, and that wouldn't be right for me to give you the entire answer, would be a query consisting of a GROUP BY clause and a SUM of the order value field. But the key thing you need to understand is this set thinking: you can just wrap the ENTIRE query inside brackets and treat it as a table the way I have done above. Hopefully this helps.
You need a subquery, like this:
select emailaddress, max(OrderTotal) as MaxOrder
from
( -- Open the subquery
select Cl.emailaddress,
Sh.ShipmentID,
sum(SI.Value) as OrderTotal -- Use the line item value column in here
from Client Cl -- First table
inner join Shipments Sh -- Join the shipments
on Sh.ClientID = Cl.ClientID
inner join ShipItem SI -- Now the items
on SI.ShipmentID = Sh.ShipmentID
group by C1.emailaddress, Sh.ShipmentID -- here's your grouping for the sum() aggregation
) -- Close subquery
group by emailaddress -- group for the max()
For the first query you can join the Clients to Shipments (on ClientId).
And Shipments to the ShipItems table (on ShipmentId).
Then group the results, and count or sum the total you need.
Using aliases for the tables is usefull, certainly when you select fields from the joined tables that have the same column name.
select
c.EmailAddress,
i.ShipmentId,
SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) as TotalPriceDiscounted
from ShipItems i
join Shipments s on (s.ShipmentId = i.ShipmentId)
left join Clients c on (c.ClientId = s.ClientId)
group by i.ShipmentId, c.EmailAddress
order by i.ShipmentId, c.EmailAddress;
Using that grouped query in a subquery, you can get the Maximum total per EmailAddress.
select EmailAddress,
-- max(TotalShipItems) as MaxTotalShipItems,
max(TotalPriceDiscounted) as MaxTotalPriceDiscounted
from (
select
c.EmailAddress,
-- i.ShipmentId,
-- count(*) as TotalShipItems,
SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) as TotalPriceDiscounted
from ShipItems i
join Shipments s on (s.ShipmentId = i.ShipmentId)
left join Clients c on (c.ClientId = s.ClientId)
group by i.ShipmentId, c.EmailAddress
) q
group by EmailAddress
order by EmailAddress
Note that an ORDER BY is mostly meaningless inside a subquery if you don't use TOP.

Oracle Sql Duplicate rows when joining new table

I am using oracle sql to join tables. I use the following code:
SELECT
T.TRANSACTION_KEY,
PR.ACCOUNT_KEY,
T.ACCT_CURR_AMOUNT,
T.EXECUTION_LOCAL_DATE_TIME,
TC.DESCRIPTION,
T.OPP_ACCOUNT_NAME,
T.OPP_COUNTRY,
PT.PARTY_TYPE_DESC,
P.PARTY_NAME,
P.CUSTOM_SMALL_STRING_02,
CO.COUNTRY_NAME,
LE.LIST_CD
FROM TRANSACTIONS T
LEFT JOIN TRANSACTION_CODE TC
ON T.TRANSACTION_CODE = TC.ENTITY
LEFT JOIN PARTY_ACCOUNT_RELATION PR
ON T.ACCOUNT = PR.ACCOUNT
LEFT JOIN PARTY P
ON PR.PARTY_KEY = P.PARTY_KEY
LEFT JOIN PARTY_TYPE PT
ON P.PARTY_TYPE = PT.ENTITY
LEFT JOIN COUNTRY CO
ON T.OPP_COUNTRY = CO.ENTITY
LEFT JOIN LISTED_ENTITY LE
ON CO.COUNTRY = LE.ENTITY_KEY
WHERE
PR.PARTY_KEY = '111111111' and T.EXECUTION_LOCAL_DATE_TIME>'2017-01-01';
It works fine until now but I want to join another table which has a column in common(ENTITY_KEY) with PARTY_ACCOUNT_RELATION table (ACCOUNT_KEY) and I want to include some of the new table's columns but when I do that, it becomes dublicated. I am adding the following lines before "where" statment:
LEFT JOIN EVALUATE_RULE ER
ON PR.ACCOUNT_KEY = ER.ENTITY_KEY
Does anyone know where the problem is?
If joining another table into an existing query causes the existing rows to be duplicated, it is because the table being joined in has duplicate values in the columns that are being used as keys for the join
In your case, if you do
SELECT ENTITY_KEY FROM EVALUATE_RULE GROUP BY ENTITY_KEY HAVING COUNT(*) > 1
You'll see which entity_keys are duplicated. When these duplicates are joined to the existing data, the existing data has to be doubled up to permit both rows from EVALUATE_RULE with the same ENTITY_KEY to exist in the result set
You must either de-dupe the table, or put other clauses into your ON condition to further restrict the rows coming from EVALUATE_RULE.
For example, after adding EVALUATE_RULE and putting ER.* in your SELECT list, imagine that you can see that the rows from ER are status = 'old' and status = 'current' but you know you only want the current ones.. So put AND er.status = 'current' in your ON clause
Your comment indicates that multiple records differ by some column you don't care about, so this technique will just select only one row:
LEFT JOIN
(SELECT e.*, ROW_NUMBER() OVER(PARTITION BY e.entity_key ORDER BY e.name) as rown FROM evaluate_rule e) er
ON
er.entity_key = pr.account_key and
er.rown = 1
If you want info on why this works, run that sql in isolation:
SELECT e.*, ROW_NUMBER() OVER(PARTITION BY e.entity_key ORDER BY e.name) as rown FROM evaluate_rule e
ORDER BY e.entity_key -- i added this to make it more clear what is going on. You don't need it in your main query
It just assigns a number to each row in the table, the number restarts at 1 every time entity_key changes, so we can then select all those with rown = 1
If it turns out you DO want something specific like "the latest row from evaluate_rule", you can use something like this:
SELECT e.*, ROW_NUMBER() OVER(PARTITION BY e.entity_key ORDER BY e.created_date DESC) as rown FROM evaluate_rule e
Now the latest created_date row will always have rown = 1
So far as I can understain from your description, table EVALUATE_RULE has moro records with ACCOUNT_KEY=ENTITY_KEY.
You can change your query section:
LEFT JOIN EVALUATE_RULE ER ON PR.ACCOUNT_KEY = ER.ENTITY_KEY
to
LEFT JOIN (SELECT DISTINCT ENTITY_KEY FROM EVALUATE_RULE) ER ON PR.ACCOUNT_KEY = ER.ENTITY_KEY
If you post structure of EVALUATE_RULE (indicating PK columns) I can change my answer to let you includ EVALUATE_RULE columns in final query.

SQL Server - Multiple FROM keywords?

The search term is to ambiguous for google aparently. I am looking at a SQL call and it has 2 FROM keywords? I've never seen this before, can someone explain?
SELECT TOP(5) SUM(column) AS column, column
FROM ( SELECT DISTINCT column, column, column
FROM ((((((table table
INNER JOIN table table ON (column = column
AND column = 2
AND column != '' ))
INNER JOIN table table ON (column = column
AND (column = 144 OR column = 159 OR column = 162 OR column = 164 OR column = 163 OR column = 1 OR column = 2 OR column = 122 OR column = 155 OR column = 156 )))
inner join table table ON (column = column
AND column = 0 ))
INNER JOIN table ON (column = column ))
INNER JOIN table table ON ( column = column
AND (column = 102 OR column = 103 )))
INNER JOIN table table ON (column = column ))) TempTable
GROUP BY column ORDER BY column desc
You will note the multiple FROM keywords. It runs just fine. Just curious to what the purpose is.
This is called as subquery. You can use subquery within your main query
So subquery made the multiple FORM clause.
There's a reason why SQL is called a Structured Query Language: it lets you formulate queries that use other queries as their source, thus creating a hierarchical query structure.
This is a common practice: each FROM keyword is actually paired with its own SELECT, making the inner query a source for the outer one.
Proper formatting would help you understand what is going on: indenting inner SELECTs helps you see the structure of your query, making it easier to understand which part is used as the source of what other parts:
SELECT TOP(5) SUM(price) AS total_price, item_id
FROM ( -- The output of this query serves as input for the outer query
SELECT price, item
FROM order -- This may have its own selects, joins, etc.
GROUP BY order_id
)
GROUP BY item_id
SQL supports SELECTing from the results of another, nested SELECT. As already mentioned, the nested SELECT is called a subquery.
More details about subqueries and examples of their use in MSSQL Server can be found at http://technet.microsoft.com/en-us/library/ms189575(v=sql.105).aspx
Subquery used to select into an aliased column:
USE AdventureWorks2008R2;
GO
SELECT Ord.SalesOrderID, Ord.OrderDate,
(SELECT MAX(OrdDet.UnitPrice)
FROM AdventureWorks.Sales.SalesOrderDetail AS OrdDet
WHERE Ord.SalesOrderID = OrdDet.SalesOrderID) AS MaxUnitPrice
FROM AdventureWorks2008R2.Sales.SalesOrderHeader AS Ord
Using a subquery in the WHERE clause (from http://www.codeproject.com/Articles/200127/SQL-Joins-and-Subqueries)
-- Use a Subquery
SELECT * FROM AdventureWorks.Person.Address
WHERE StateProvinceID IN
(
SELECT StateProvinceID
FROM AdventureWorks.Person.StateProvince
WHERE StateProvinceCode = 'CA'
)
-- Use a Join
SELECT addr.*
FROM AdventureWorks.Person.Address addr
INNER JOIN AdventureWorks.Person.StateProvince state
ON addr.StateProvinceID = state.StateProvinceID
WHERE state.StateProvinceCode = 'CA'
You're seeing FROM clauses in subqueries. If you tabify the query it may be more obvious
SELECT TOP(5) SUM(column) AS column, column
FROM (
SELECT DISTINCT column, column, column
FROM ((((((table table
...
INNER JOIN table table ON (column = column ))) TempTable
GROUP BY column
ORDER BY column desc

Adding rows to SQL query for each element in WHERE sequence

I have an SQL query like the following:
SELECT store_id, SUM(quantity_sold) AS count
FROM sales_table
WHERE store_id IN ('Store1', 'Store2', 'Store3')
GROUP BY store_id;
This returns a row for each store that has rows in sales_table, but does not return a row for those that do not. What I want is one row per store, with a 0 for count if it has no records.
How can I do this, assuming that I do not have access to a stores table?
with stores (store_id) as (
values ('Store1'), ('Store2'), ('Store3')
)
select st.store_id,
sum(sal.quantity_sold) as cnt
from stores st
left join sales_table sal on sal.store_id = st.store_id
group by st.store_id;
If you do have a stores table, then simply do an outer join to that one instead of "making one up" using the common table expression (with ..).
This can also be written without the CTE (common table expression):
select st.store_id,
sum(sal.quantity_sold) as cnt
from (
values ('Store1'), ('Store2'), ('Store3')
) st
left join sales_table sal on sal.store_id = st.store_id
group by st.store_id;
(But I find the CTE version easier to understand)
You can use unnest() to generate rows from array elements.
SELECT store, sum(sales_table.quantity_sold) AS count
FROM unnest(ARRAY['Store1', 'Store2', 'Store3']) AS store
LEFT JOIN sales_table ON (sales_table.store_id = store)
GROUP BY store;