Constructing a complex SQL statement

I need to select from and update a table, looking up values from other tables. The idea is to select some details from Order, but show the names behind the foreign key values instead of the ids.
For the select I tried:
SELECT [Order].order_date,
[Order].total_price,
[Order].order_details,
vehicle.reg_no,
Staff.name,
stock.name
FROM
[Order],
vehicle,
staff,
stock
WHERE
[Order].id = @order_id
AND vehicle.id = [Order].vehicle_id
AND staff.id = [Order].staff_id
AND stock.id = [Order].product_id
For the update, I tried:
UPDATE [Order]
SET
total_price = @total_price,
order_detail = @order_detail,
product_id = (select id from stock where name = @product_name),
customer_id = (select id from customer where name = @customer_name),
vehicle_regno = (select reg_no from vehicle where name = @name)
WHERE (id = @id)
Neither returns anything. I hope this is clear enough to get some help; if not, I will provide more info.
Thanks for helping.

You might want to try converting your INNER JOINs to LEFT JOINs for the SELECT statement.
SELECT [Order].order_date,
[Order].total_price,
[Order].order_details,
vehicle.reg_no,
Staff.name,
stock.name
FROM
[Order]
LEFT JOIN vehicle
ON vehicle.id = [Order].vehicle_id
LEFT JOIN staff
ON staff.id = [Order].staff_id
LEFT JOIN stock
ON stock.id = [Order].product_id
WHERE
[Order].id = @order_id
This would make a difference if either a) you allow NULLs in the FK columns, or b) you don't have FK constraints, which may point to a problem with your design.
Your UPDATE statement should update some rows provided that the value of @id exists in the Order table, but as @Danny Varod already commented, you won't get rows back, only the number of rows affected.
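To make the INNER JOIN versus LEFT JOIN point concrete, here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for SQL Server, and the table name orders and the sample rows are made up for illustration. One order has a NULL vehicle_id, which is case (a) above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle (id INTEGER PRIMARY KEY, reg_no TEXT);
    -- hypothetical orders table; vehicle_id is allowed to be NULL
    CREATE TABLE orders (id INTEGER PRIMARY KEY, vehicle_id INTEGER);
    INSERT INTO vehicle VALUES (1, 'ABC-123');
    INSERT INTO orders VALUES (10, 1), (11, NULL);
""")

# INNER JOIN drops order 11 because NULL never equals any vehicle.id
inner_rows = conn.execute("""
    SELECT o.id, v.reg_no
    FROM orders o
    INNER JOIN vehicle v ON v.id = o.vehicle_id
    ORDER BY o.id
""").fetchall()

# LEFT JOIN keeps order 11 and fills the vehicle columns with NULL
left_rows = conn.execute("""
    SELECT o.id, v.reg_no
    FROM orders o
    LEFT JOIN vehicle v ON v.id = o.vehicle_id
    ORDER BY o.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

If the order you are looking up has even one unmatched foreign key, the all-INNER version returns nothing for it, which would match the symptom in the question.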

You are comparing stock.id with Order.product_id; could that be the problem?
Otherwise, to tell why it returns no rows we would need to know the content of the tables. Perhaps you need a LEFT JOIN instead of an INNER JOIN to some of them?

Related

Deleting data from one table if the reference doesn't exist in two other tables

I managed to import too much data into one of my database tables. I want to delete most of this data, but I need to ensure that the reference doesn't exist in either of two other tables before I delete it.
I figured this query would be the solution. It gives me the right result on a test database, but in the production environment it returns no hits.
select product
from products
where 1=1
and product not in (select product from location)
and product not in (select product from lines)
Getting no results means that the product column in location and/or lines contains NULL values. NOT IN returns no rows when the subquery yields a NULL.
Try the query below; I just added an IS NOT NULL filter to the subqueries of your query.
select product from products
where 1=1
and product not in ( select product from location where product is not null)
and product not in ( select product from lines where product is not null)
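The NULL behaviour of NOT IN can be reproduced with a tiny example. This is a sketch using Python's sqlite3 module rather than SQL Server, with invented sample rows; the three-valued logic involved is standard SQL, so the same effect should apply to the production query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product TEXT);
    CREATE TABLE location (product TEXT);
    INSERT INTO products VALUES ('A'), ('B');
    -- one NULL in the subquery column is enough to trigger the problem
    INSERT INTO location VALUES ('A'), (NULL);
""")

# 'B' NOT IN ('A', NULL) evaluates to UNKNOWN, so 'B' is filtered out too
unfiltered = conn.execute(
    "SELECT product FROM products "
    "WHERE product NOT IN (SELECT product FROM location)"
).fetchall()

# excluding NULLs from the subquery restores the expected result
filtered = conn.execute(
    "SELECT product FROM products "
    "WHERE product NOT IN "
    "(SELECT product FROM location WHERE product IS NOT NULL)"
).fetchall()

print(unfiltered)
print(filtered)
```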
Use NOT EXISTS instead of NOT IN; it is unaffected by NULLs and is often more efficient.
DELETE FROM products WHERE
NOT EXISTS
(
SELECT
1
FROM [Location]
WHERE Product = Products.Product
) AND
NOT EXISTS
(
SELECT
1
FROM lines
WHERE Product = Products.Product
)
Try this:
DELETE FROM Products where not exists
(select 1 from Location
where Location.Product = Products.Product
union all
select 1 from lines
where lines.Product = Products.Product
);
It's difficult to tell from your post why the query would return results in the test database but not production other than there is different data or different structures. You might try including the DDL for the participating tables in your post so that we know what the table structures are. For example, is the "product" column a PK or a text name?
One thing that does jump out is that your query will probably perform poorly. Try something like this instead: (Assuming the "product" column is a PK in Products and FK in the other tables.)
Select p.product
From Products As p
Left Outer Join Location As l
On p.product = l.product
Left Outer Join Lines As li
On p.product = li.product
Where l.product Is Null
And li.product Is Null;
This simple set based approach may help ...
DELETE p
FROM products p
LEFT JOIN location lo ON p.product = lo.product
LEFT JOIN lines li ON p.product = li.product
WHERE lo.product IS NULL AND li.product IS NULL
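As a sanity check of the anti-join idea, the sketch below runs it on a few invented rows using Python's sqlite3 module. SQLite has no DELETE ... FROM ... JOIN form, so the delete is written with NOT EXISTS instead, which is a substitution on my part; the LEFT JOIN ... IS NULL select is only used to preview which rows would go.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product TEXT);
    CREATE TABLE location (product TEXT);
    CREATE TABLE lines (product TEXT);
    INSERT INTO products VALUES ('A'), ('B'), ('C'), ('D');
    INSERT INTO location VALUES ('A'), (NULL);
    INSERT INTO lines VALUES ('B'), ('B');
""")

# preview: products referenced by neither table (anti-join)
orphans = conn.execute("""
    SELECT p.product
    FROM products p
    LEFT JOIN location lo ON p.product = lo.product
    LEFT JOIN lines li ON p.product = li.product
    WHERE lo.product IS NULL AND li.product IS NULL
    ORDER BY p.product
""").fetchall()

# delete the same rows; NOT EXISTS is not thrown off by the NULL in location
conn.execute("""
    DELETE FROM products
    WHERE NOT EXISTS (SELECT 1 FROM location lo WHERE lo.product = products.product)
      AND NOT EXISTS (SELECT 1 FROM lines li WHERE li.product = products.product)
""")
remaining = conn.execute(
    "SELECT product FROM products ORDER BY product"
).fetchall()

print(orphans)
print(remaining)
```

Running the preview select first and comparing its row count with what you expect is a cheap safeguard before deleting in production.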

Where am I going wrong with this SQL query?

I am attempting to do the following:
1. Check to see if the table does not exist and if so, create the TABLE 'tmpTriangleTransfer'.
2. Check to see if the table exists and if so, DROP the TABLE 'tmpTriangleTransfer'.
3. Insert the data being pulled from the other tables into the 2nd - 5th columns of the TABLE 'tmpTriangleTransfer'.
4. Loop and for each row that exists in the TABLE 'tmpTriangleTransfer' update the 1st column with the declared information.
5. Return all of the information from that table (to be formatted into a report).
Can someone please help me figure out what I am doing wrong? I'm getting no results even though I know for a fact there are records (when I run just the SELECT statement on the last line, it shows records and when I run the SELECT DISTINCT statement in the middle, it shows the same records).
IF OBJECT_ID('tmpTriangleTransfer') IS NOT NULL
DROP TABLE tmpTriangleTransfer;
IF OBJECT_ID('tmpTriangleTransfer') IS NULL
CREATE TABLE tmpTriangleTransfer
(
CompanyName varchar(max),
OrderID decimal(19,2) NULL,
DriverID int NULL,
VehicleID int NULL,
Phone varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
BOL varchar(50) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
);
INSERT INTO tmpTriangleTransfer (OrderID, BOL, DriverID, VehicleID, Phone)
SELECT DISTINCT tblOrder.OrderID AS OrderID, tblOrder.BOL AS BOL, tblOrderDrivers.DriverID AS DriverID, tblDrivers.VehicleID AS VehicleID, tblWorker.Phone AS Phone
FROM tblOrder WITH (NOLOCK)
INNER JOIN tblActiveOrders
ON tblOrder.OrderID = tblActiveOrders.OrderID
INNER JOIN tblOrderDrivers
ON tblOrder.OrderID = tblOrderDrivers.OrderID
INNER JOIN tblDrivers
ON tblOrderDrivers.DriverID = tblDrivers.DriverID
INNER JOIN tblWorker
ON tblDrivers.WorkerID = tblWorker.WorkerID
WHERE tblOrder.CustID = 7317
ORDER BY tblOrder.OrderID
DECLARE @MaxRownum INT
SET @MaxRownum = (SELECT MAX(OrderID) FROM tmpTriangleTransfer)
DECLARE @Iter INT
SET @Iter = (SELECT MIN(OrderID) FROM tmpTriangleTransfer)
WHILE @Iter <= @MaxRownum
BEGIN
UPDATE tmpTriangleTransfer
SET tmpTriangleTransfer.CompanyName = 'Triangle'
WHERE tmpTriangleTransfer.CompanyName IS NULL;
SET @Iter = @Iter + 1
END
SELECT * from tmpTriangleTransfer WITH (NOLOCK)
Your existing query is far too complicated. In fact, you don't need a temporary table, the WHILE loop, or anything - just a single SELECT is all you need:
SELECT
'Triangle' AS CompanyName,
tblOrder.OrderId,
tblOrder.BOL,
tblOrderDrivers.DriverID,
tblDrivers.VehicleID,
tblWorker.Phone
FROM
tblOrder
LEFT OUTER JOIN tblActiveOrders ON tblOrder.OrderID = tblActiveOrders.OrderID
LEFT OUTER JOIN tblOrderDrivers ON tblOrder.OrderID = tblOrderDrivers.OrderID
LEFT OUTER JOIN tblDrivers ON tblOrderDrivers.DriverID = tblDrivers.DriverID
LEFT OUTER JOIN tblWorker ON tblDrivers.WorkerID = tblWorker.WorkerID
WHERE
tblOrder.CustID = 7317
ORDER BY
tblOrder.OrderID
I've changed your query to use OUTER JOIN instead of INNER JOIN because I suspect this is the main reason for no data being returned. INNER JOIN requires rows to exist in both tables (relations) and I suspect that you have Orders without Drivers or that not every Order is in ActiveOrders. Change the joins to INNER JOIN if you know that related rows will always be present.
You can return literals in queries directly, like I'm doing in the SELECT 'Triangle' AS CompanyName part, whereas you were seemingly manually adding it to the output temporary-table.
Your code didn't seem to be doing anything that would require the WITH (NOLOCK) modifier - the fact it was repeated everywhere makes it look like a case of Cargo-Cult Programming.
Tip: In SQL, a SELECT statement, as written, is not representative of its logical execution order. It should instead be read in this order: FROM > WHERE > [GROUP BY >] SELECT > ORDER BY.
This is why in .NET Linq the .Select() call is often at the end, not the beginning, because previous Linq expressions define the data sources.
This query can be parameterised by converting it to a table-valued function that accepts CustID as a parameter. I also assume you have the company name "Triangle" stored in a table somewhere; embedding it as a literal value for a single query is a code smell. What's so special about 7317 / "Triangle"?
Related note: Generally speaking, queries that only SELECT data (and don't perform any INSERT/UPDATE/DELETE/ALTER/CREATE statements) should be Table-valued UDFs or Views and not Stored Procedures - so that they can benefit from function-composition, query-composition and runtime execution plan optimizations that you cannot get with Stored Procedures.
If you're able to, see if you can remove the tbl prefix from the table names. Using "tbl" as a prefix has its defenders, but my own personal opinion is that it's an obsolete developer aid, as today's database tooling shows type information, and it makes database refactoring harder (e.g. converting a table to a view).
Taken from a combination of the suggestion from Dai and the requirements of my employer:
SELECT 'Triangle' AS CompanyName, tblOrder.OrderId AS OrderID, tblOrder.BOL AS BOL, tblOrderDrivers.DriverID AS DriverID, tblDrivers.VehicleID AS VehicleID, tblWorker.Phone AS Phone
FROM tblOrder WITH (NOLOCK)
INNER JOIN tblActiveOrders WITH (NOLOCK)
ON tblOrder.OrderID = tblActiveOrders.OrderID
INNER JOIN tblOrderDrivers WITH (NOLOCK)
ON tblOrder.OrderID = tblOrderDrivers.OrderID
INNER JOIN tblDrivers WITH (NOLOCK)
ON tblOrderDrivers.DriverID = tblDrivers.DriverID
INNER JOIN tblWorker WITH (NOLOCK)
ON tblDrivers.WorkerID = tblWorker.WorkerID
WHERE
tblOrder.CustID = 7317
ORDER BY
tblOrder.OrderID desc

Define sort order when updating

I have a script that updates an ID field on one table where that record matches to another table based on criteria.
Below is the general structure of my query.
update p set p.saleId = s.saleId
from products p inner join sales s on s.crit1 = p.crit1
where p.someDate between s.startDate and s.endDate
This is working fine. My issue is that in some situations there is more than one match on the 'sales' table with this query which is generally ok. I'd however like to sort these results based on another field to make sure the saleId I get is the one with the highest cost.
Is that possible?
As it is the saleID you want to set and the sales table you are looking up, you can probably just update all products records. Then you can write a simple update statement on the table and don't have to join. This makes this much easier to write:
update p
set saleId =
(
select top(1) s.saleId
from sales s
where s.crit1 = p.crit1
and p.someDate between s.startDate and s.endDate
order by s.cost desc
)
from products p;
The main difference from your statement is that mine sets saleId = NULL where there is no match in the sales table, while yours leaves those rows untouched. But I guess that doesn't make a difference here.
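Here is a small runnable sketch of the correlated-subquery update, including the NULL side effect just mentioned. It uses Python's sqlite3 module, so TOP(1) becomes LIMIT 1, and the sample rows and the id column on products are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (saleId INTEGER, crit1 TEXT,
                        startDate TEXT, endDate TEXT, cost REAL);
    CREATE TABLE products (id INTEGER, crit1 TEXT,
                           someDate TEXT, saleId INTEGER);
    INSERT INTO sales VALUES
        (1, 'x', '2020-01-01', '2020-12-31', 5.0),
        (2, 'x', '2020-06-01', '2020-06-30', 9.0),
        (3, 'y', '2020-01-01', '2020-01-31', 1.0);
    INSERT INTO products VALUES
        (100, 'x', '2020-06-15', NULL),  -- matches sales 1 and 2
        (101, 'x', '2020-02-01', NULL),  -- matches sale 1 only
        (102, 'z', '2020-06-15', 77);    -- matches no sale at all
""")

# for each product, pick the matching sale with the highest cost
conn.execute("""
    UPDATE products
    SET saleId = (
        SELECT s.saleId
        FROM sales s
        WHERE s.crit1 = products.crit1
          AND products.someDate BETWEEN s.startDate AND s.endDate
        ORDER BY s.cost DESC
        LIMIT 1
    )
""")
result = conn.execute("SELECT id, saleId FROM products ORDER BY id").fetchall()
print(result)
```

Product 102 had saleId 77 before the update and ends up with NULL. If that matters, restrict the update with a WHERE EXISTS over the same conditions.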
I hope the query below helps. It is a very high-level draft based on your question; please take only the concept, not the syntax.
with maxSales as (select s1.saleId, s1.crit1, s1.startDate, s1.endDate from sales s1
where s1.cost = (select max(s2.cost) from
sales s2 where s1.crit1 = s2.crit1))
update p set p.saleId =
(select s.saleId from
maxSales s
where s.crit1 = p.crit1
and p.someDate between s.startDate and s.endDate)
from products p
UPDATE p
set p.saleId = e.rowNumber
FROM products p
INNER JOIN
(SELECT saleId, row_number() OVER (ORDER BY saleId DESC) as rowNumber
FROM sales)
e ON e.saleId = p.saleId
Try this:
UPDATE p
SET p.saleid = s.saleid
FROM products p
INNER JOIN
(SELECT s.crit1,
s.saleid
FROM sales s
WHERE cost IN
(SELECT max(cost) cost
FROM sales
GROUP BY crit1)) s ON s.crit1 = p.crit1
None of the answers worked, but I managed to do it by using an OUTER APPLY as my join and specifying the sort order in that.
Cheers everyone for the input.

Refactoring a tsql view which uses row_number() to return rows with a unique column value

I have a SQL view which I'm using to retrieve data. Let's say it's a large list of products, which are linked to the customers who have bought them. The view should return only one row per product, no matter how many customers it is linked to. I'm using the row_number function to achieve this. (This example is simplified; the generic situation would be a query where there should be only one row returned for each unique value of some column X. Which row is returned is not important.)
CREATE VIEW productView AS
SELECT * FROM
(SELECT
Row_number() OVER(PARTITION BY products.Id ORDER BY products.Id) AS product_numbering,
customer.Id
-- various other columns
FROM products
LEFT OUTER JOIN customer ON customer.productId = products.Id
-- various other joins
) as temp
WHERE temp.product_numbering = 1
Now let's say that the total number of rows in this view is ~1 million, and running select * from productView takes 10 seconds. Performing a query such as select * from productView where productID = 10 takes the same amount of time. I believe this is because the query gets evaluated to this:
SELECT * FROM
(SELECT
Row_number() OVER(PARTITION BY products.Id ORDER BY products.Id) AS product_numbering,
customer.Id
-- various other columns
FROM products
LEFT OUTER JOIN customer ON customer.productId = products.Id
-- various other joins
) as temp
WHERE product_numbering = 1 and products.Id = 10
I think this is causing the inner subquery to be evaluated in full each time. Ideally I'd like to use something along the following lines
SELECT
Row_number() OVER(PARTITION BY products.productID ORDER BY products.productID) AS product_numbering,
customer.id
//various other columns
FROM products
LEFT OUTER JOIN customer ON customer.productId = products.Id
-- various other joins
WHERE product_numbering = 1
But this doesn't seem to be allowed. Is there any way to do something similar?
EDIT -
After much experimentation, the actual problem I believe I am having is how to force a join to return exactly 1 row. I tried to use outer apply, as suggested below. Some sample code.
CREATE TABLE Products (id int not null PRIMARY KEY)
CREATE TABLE Customers (
id int not null PRIMARY KEY,
productId int not null,
value varchar(20) NOT NULL)
declare @count int = 1
while @count <= 150000
begin
insert into Customers (id, productID, value)
values (@count,@count/2, 'Value ' + cast(@count/2 as varchar))
insert into Products (id)
values (@count)
SET @count = @count + 1
end
CREATE NONCLUSTERED INDEX productId ON Customers (productID ASC)
With the above sample set, the 'get everything' query below
select * from Products
outer apply (select top 1 *
from Customers
where Products.id = Customers.productID) Customers
takes ~1000ms to run. Adding an explicit condition:
select * from Products
outer apply (select top 1 *
from Customers
where Products.id = Customers.productID) Customers
where Customers.value = 'Value 45872'
takes about the same amount of time. This 1000ms for a fairly simple query is already too much, and it scales the wrong way (upwards) when adding additional similar joins.
Try the following approach, using a Common Table Expression (CTE). With the test data you provided, it returns specific ProductIds in less than a second.
create view ProductTest as
with cte as (
select
row_number() over (partition by p.id order by p.id) as RN,
c.*
from
Products p
inner join Customers c
on p.id = c.productid
)
select *
from cte
where RN = 1
go
select * from ProductTest where ProductId = 25
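For readers who want to see the one-row-per-product behaviour of the ROW_NUMBER CTE on a small data set, here is a sketch in Python's sqlite3 module (window functions need SQLite 3.25 or later). The three products and three customers are invented, and I order by c.id inside the window, which differs from the answer above, so that the chosen row is deterministic. It says nothing about performance on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (id INTEGER PRIMARY KEY);
    CREATE TABLE Customers (id INTEGER PRIMARY KEY,
                            productId INTEGER NOT NULL,
                            value TEXT NOT NULL);
    INSERT INTO Products VALUES (1), (2), (3);
    INSERT INTO Customers VALUES (10, 1, 'a'), (11, 1, 'b'), (12, 2, 'c');
""")

# number the customers within each product, then keep only the first one
rows = conn.execute("""
    WITH cte AS (
        SELECT ROW_NUMBER() OVER (PARTITION BY p.id ORDER BY c.id) AS rn,
               p.id AS productId,
               c.id AS customerId
        FROM Products p
        INNER JOIN Customers c ON c.productId = p.id
    )
    SELECT productId, customerId
    FROM cte
    WHERE rn = 1
    ORDER BY productId
""").fetchall()
print(rows)
```

Note that product 3 has no customers and disappears because of the INNER JOIN; the original view used a LEFT OUTER JOIN and would keep it.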
What if you did something like:
SELECT ...
FROM products
OUTER APPLY (SELECT TOP 1 * from customer where customerid = products.buyerid) as customer
...
Then the filter on productId should help. It might be worse without filtering, though.
The problem is that your data model is flawed. You should have three tables:
Customers (customerId, ...)
Products (productId,...)
ProductSales (customerId, productId)
Furthermore, the sale table should probably be split into 1-to-many (Sales and SalesDetails). Unless you fix your data model you're just going to run in circles chasing red-herring problems. If the system is not your design, fix it. If the boss doesn't let you fix it, then fix it. If you cannot fix it, then fix it. There isn't an easy way out for the bad data model you're proposing.
This will probably be fast enough if you really don't care which customer you bring back:
select p1.*, c1.*
FROM products p1
Left Join (
select p2.id, max(c2.id) as max_customer_id
From products p2
Join customer c2 on
c2.productID = p2.id
group by p2.id
) product_max_customer on
product_max_customer.id = p1.id
Left join customer c1 on
c1.id = product_max_customer.max_customer_id
;

SQL Server Update with Inner Join

I have 3 tables (simplified):
tblOrder(OrderId INT)
tblVariety(VarietyId INT,Stock INT)
tblOrderItem(OrderId,VarietyId,Quantity INT)
If I place an order, I drop the stock level using this:
UPDATE tblVariety
SET tblVariety.Stock = tblVariety.Stock - tblOrderItem.Quantity
FROM tblVariety
INNER JOIN tblOrderItem ON tblVariety.VarietyId = tblOrderItem.VarietyId
INNER JOIN tblOrder ON tblOrderItem.OrderId = tblOrder.OrderId
WHERE tblOrder.OrderId = 1
All fine, until there are two rows in tblOrderItem with the same VarietyId for the same OrderId. In this case, only one of the rows is used for the stock update. It seems to be doing a GROUP BY VarietyId in there somehow.
Can anyone shed some light? Many thanks.
My guess is that because you have shown us a simplified schema, some info is missing that would explain why you have repeated VarietyID values for a given OrderID.
When you have multiple rows, SQL Server will arbitrarily pick one of them for the update.
If this is the case, you need to group first:
UPDATE V
SET
Stock = Stock - foo.SumQuantity
FROM
tblVariety V
JOIN
(SELECT SUM(Quantity) AS SumQuantity, VarietyID
FROM tblOrderItem
JOIN tblOrder ON tblOrderItem.OrderId = tblOrder.OrderId
WHERE tblOrder.OrderId = 1
GROUP BY VarietyID
) foo ON V.VarietyId = foo.VarietyId
If not, then the OrderItems table PK is wrong because it allows duplicate OrderID/VarietyID combinations. (The PK should be OrderID/VarietyID, or these should be constrained unique.)
From the documentation for UPDATE:
The results of an UPDATE statement are undefined if the statement includes a FROM clause that is not specified in such a way that only one value is available for each column occurrence that is updated (in other words, if the UPDATE statement is not deterministic). For example, given the UPDATE statement in the following script, both rows in table s meet the qualifications of the FROM clause in the UPDATE statement, but it is undefined which row from s is used to update the row in table t.
CREATE TABLE s (ColA INT, ColB DECIMAL(10,3))
GO
CREATE TABLE t (ColA INT PRIMARY KEY, ColB DECIMAL(10,3))
GO
INSERT INTO s VALUES(1, 10.0)
INSERT INTO s VALUES(1, 20.0)
INSERT INTO t VALUES(1, 0.0)
GO
UPDATE t
SET t.ColB = t.ColB + s.ColB
FROM t INNER JOIN s ON (t.ColA = s.ColA)
GO
You are doing an update. It will update each row once.
Edit: to solve, you can add in a subquery that groups your order items by OrderId and VarietyId, with a sum on the quantity.
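A minimal sketch of the "sum first, then update" idea, run in Python's sqlite3 module with invented stock and order rows. To stay portable it uses a correlated SUM subquery rather than T-SQL's UPDATE ... FROM with a derived table; the aggregation step is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblVariety (VarietyId INTEGER PRIMARY KEY, Stock INTEGER);
    CREATE TABLE tblOrderItem (OrderId INTEGER, VarietyId INTEGER, Quantity INTEGER);
    INSERT INTO tblVariety VALUES (1, 100), (2, 50);
    -- order 1 contains variety 1 twice (3 + 4) and variety 2 once (5)
    INSERT INTO tblOrderItem VALUES (1, 1, 3), (1, 1, 4), (1, 2, 5), (2, 2, 10);
""")

# subtract the summed quantity per variety, so duplicate rows all count
conn.execute("""
    UPDATE tblVariety
    SET Stock = Stock - (
        SELECT SUM(oi.Quantity)
        FROM tblOrderItem oi
        WHERE oi.OrderId = 1
          AND oi.VarietyId = tblVariety.VarietyId
    )
    WHERE VarietyId IN (SELECT VarietyId FROM tblOrderItem WHERE OrderId = 1)
""")
stock = conn.execute(
    "SELECT VarietyId, Stock FROM tblVariety ORDER BY VarietyId"
).fetchall()
print(stock)
```

The outer WHERE keeps varieties that are not on the order from being set to NULL by an empty SUM.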