I am having trouble writing a script that deletes all the rows which match on the first three columns and whose Quantities sum to zero.
I think the query needs to find all Products that match, then within that group all the Names that match, then within that subset all the Currencies that match, and finally the groups whose quantities net to zero.
In the example below, the rows to delete would be rows 1 & 2 and rows 4 & 6.
Product, Name, Currency, Quantity
1) Product A, Name A, GBP, 10
2) Product A, Name A, GBP, -10
3) Product A, Name B, GBP, 10
4) Product A, Name B, USD, 10
5) Product A, Name B, EUR, 10
6) Product A, Name B, USD, -10
7) Product A, Name C, EUR, 10
Hope this makes sense and appreciate any help.
Try this:
DELETE
FROM [Product]
WHERE Id IN
(
SELECT Id
FROM
(
SELECT Id, SUM(Quantity) OVER(PARTITION BY a.Product, a.Name, a.Currency) AS Sm
FROM [Product] a
) a
WHERE Sm = 0
)
You may want to break this problem into parts.
First create a view that lists those combinations which sum to zero
CREATE VIEW vw_foo AS
SELECT product,name, currency, sum(quantity) as net
FROM foo
GROUP BY product, name, currency
HAVING sum(quantity)=0;
At this point, you need to make sure this view has the data you expect to delete. In your example, the view should have only 2 records: ProductA/NameA/GBP and ProductA/NameB/USD.
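To check step 1 on the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name foo follows this answer; running it on SQLite rather than the asker's server is an assumption made only to keep the example self-contained.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (product TEXT, name TEXT, currency TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?, ?)",
    [
        ("Product A", "Name A", "GBP", 10),
        ("Product A", "Name A", "GBP", -10),
        ("Product A", "Name B", "GBP", 10),
        ("Product A", "Name B", "USD", 10),
        ("Product A", "Name B", "EUR", 10),
        ("Product A", "Name B", "USD", -10),
        ("Product A", "Name C", "EUR", 10),
    ],
)

# Same grouping as the view: one row per combination that nets to zero.
zero_groups = conn.execute(
    """
    SELECT product, name, currency
    FROM foo
    GROUP BY product, name, currency
    HAVING SUM(quantity) = 0
    ORDER BY product, name, currency
    """
).fetchall()

for group in zero_groups:
    print(group)
```

If the printed combinations are not the ones you expect to lose, stop before running the delete.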
Step 2. Delete the data where the fields match:
DELETE FROM foo
WHERE EXISTS
(SELECT *
FROM vw_foo
WHERE vw_foo.product = foo.product
AND vw_foo.name = foo.name
AND vw_foo.currency = foo.currency);
One way to simplify the SQL is to just concatenate the 3 columns into one and apply some grouping:
delete from product
where product + name + currency in (
select product + name + currency
from product
group by product + name + currency
having sum(quantity) = 0)
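One thing to watch with concatenated keys: two different column combinations can produce the same string. The sketch below is a hypothetical illustration (the rows 'AB'/'C' and 'A'/'BC' are made up for the purpose), run on SQLite through Python's sqlite3 module, where `||` plays the role of `+` in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (product TEXT, name TEXT, currency TEXT, quantity INTEGER)"
)
# Hypothetical rows: different (product, name) pairs, same concatenated key.
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        ("AB", "C", "GBP", 10),
        ("A", "BC", "GBP", -10),
    ],
)

# Grouping on the concatenated string merges the two unrelated rows.
concat_groups = conn.execute(
    """
    SELECT product || name || currency AS k, SUM(quantity)
    FROM product
    GROUP BY product || name || currency
    HAVING SUM(quantity) = 0
    """
).fetchall()

# Grouping on the three columns keeps them apart, so nothing nets to zero.
column_groups = conn.execute(
    """
    SELECT product, name, currency
    FROM product
    GROUP BY product, name, currency
    HAVING SUM(quantity) = 0
    """
).fetchall()

print("concatenated key:", concat_groups)
print("separate columns:", column_groups)
```

With the concatenated key both rows would be deleted even though they belong to different groups. Grouping on the columns themselves, or adding a separator that cannot occur in the data, avoids this.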
I am assuming this is an accounting problem with offsetting pairs of entries in the ledger.
If there are, for instance, three entries for the combination (A, A, GBP), this code and some of the examples above will not work.
I created a temporary test table, loaded it with your data, used a CTE (common table expression) to find the duplicate pattern, and joined it to the table to select the rows.
To delete the rows instead, just change the 'select *' to 'delete p'.
Again, this only works for equal offsetting pairs. It will cause havoc with an odd number of entries.
Do you have only an even number of entries?
Sincerely
John
-- create sample table
create table #products
(
product_id int identity(1,1),
product_txt varchar(16),
name_txt varchar(16),
currency_cd varchar(16),
quantity_num int
);
go
-- add data to table
insert into #products
(product_txt, name_txt, currency_cd, quantity_num)
values
('A', 'A', 'GBP', 10),
('A', 'A', 'GBP', -10),
('A', 'B', 'GBP', 10),
('A', 'B', 'USD', 10),
('A', 'B', 'EUR', 10),
('A', 'B', 'USD', -10),
('A', 'C', 'EUR', 10);
go
-- show the data
select * from #products;
go
-- use cte to find combinations
with cte_Ledger_Offsets (product_txt, name_txt, currency_cd)
as
(
select product_txt, name_txt, currency_cd
from #products
group by product_txt, name_txt, currency_cd
having sum(quantity_num) = 0
)
select * from #products p inner join cte_Ledger_Offsets c
on p.product_txt = c.product_txt and
p.name_txt = c.name_txt and
p.currency_cd = c.currency_cd;
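What happens with an odd number of entries depends on the data. The sketch below uses hypothetical rows (not from the question) and SQLite via Python's sqlite3 module: three entries that net to zero are removed as a group, while three entries that contain one offsetting pair but net to 10 are all kept. The zero-sum combinations are materialized in a temporary table first, mirroring the CTE above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        product_id INTEGER PRIMARY KEY,
        product_txt TEXT, name_txt TEXT, currency_cd TEXT, quantity_num INTEGER
    )
    """
)
# Hypothetical data: two groups of three entries each.
conn.executemany(
    "INSERT INTO products (product_txt, name_txt, currency_cd, quantity_num)"
    " VALUES (?, ?, ?, ?)",
    [
        ("A", "A", "GBP", 10),
        ("A", "A", "GBP", -5),
        ("A", "A", "GBP", -5),   # group nets to 0
        ("A", "B", "USD", 10),
        ("A", "B", "USD", 10),
        ("A", "B", "USD", -10),  # group nets to 10
    ],
)

# Combinations whose quantities sum to zero (same logic as the CTE).
conn.execute(
    """
    CREATE TEMP TABLE ledger_offsets AS
    SELECT product_txt, name_txt, currency_cd
    FROM products
    GROUP BY product_txt, name_txt, currency_cd
    HAVING SUM(quantity_num) = 0
    """
)

# Delete every row that belongs to one of those combinations.
conn.execute(
    """
    DELETE FROM products
    WHERE EXISTS (
        SELECT 1 FROM ledger_offsets c
        WHERE c.product_txt = products.product_txt
          AND c.name_txt = products.name_txt
          AND c.currency_cd = products.currency_cd
    )
    """
)

remaining = conn.execute(
    "SELECT name_txt, currency_cd, quantity_num FROM products ORDER BY product_id"
).fetchall()
print(remaining)
```

So the sum-to-zero test removes whole groups or nothing. If the requirement is to cancel individual offsetting pairs inside a group that does not net to zero, a different approach is needed.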
I am trying to join several tables. To simplify the situation, there is a table called Boxes which has a foreign key column for another table, Requests. This means that with a simple join I can get all the boxes that can be used to fulfill a request. But the Requests table also has a column called BoxCount which limits the number of boxes that is needed.
Is there a way to structure the query in such a way that when I join the two tables, I will only get the number of rows from Boxes that is specified in the BoxCount column of the given Request, rather than all of the rows from Boxes that have a matching foreign key?
Script to initialize sample data:
CREATE TABLE Requests (
Id int NOT NULL PRIMARY KEY,
BoxCount Int NOT NULL);
CREATE TABLE Boxes (
Id int NOT NULL PRIMARY KEY,
Label varchar,
RequestId INT FOREIGN KEY REFERENCES Requests(Id));
INSERT INTO Requests (Id, BoxCount)
VALUES
(1, 2),
(2, 3);
INSERT INTO Boxes (Id, Label, RequestId)
VALUES
(1, 'A', 1),
(2, 'B', 1),
(3, 'C', 1),
(4, 'D', 2),
(5, 'E', 2),
(6, 'F', 2),
(7, 'G', 2);
So, for example, when the hypothetical query is run, it should return boxes A and B (because the first Request only needs 2 boxes), but not C. Similarly, it should include boxes D, E and F, but not box G, because the second request only requires 3 boxes.
Here is another approach using ROW_NUMBER, a common and useful technique that every SQL writer should master. The idea is to create a sequential number for all boxes within a request and compare it to the box count for filtering.
with boxord as (select *,
ROW_NUMBER() OVER (PARTITION BY RequestId ORDER BY Id) as rno
from dbo.Boxes
)
select req.*, boxord.Label, boxord.rno
from dbo.Requests as req inner join boxord on req.Id = boxord.RequestId
where req.BoxCount >= boxord.rno
order by req.Id, boxord.rno
;
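Where window functions are not available, the same filter can be written with a correlated count instead of ROW_NUMBER. This is a swapped-in technique, not the query above: each box counts how many boxes of the same request have an Id up to its own, and that position is compared to BoxCount. The sketch runs on SQLite through Python's sqlite3 module, which is an assumption made for the sake of a self-contained example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Requests (Id INTEGER PRIMARY KEY, BoxCount INTEGER NOT NULL);
    CREATE TABLE Boxes (
        Id INTEGER PRIMARY KEY,
        Label TEXT,
        RequestId INTEGER REFERENCES Requests(Id)
    );
    INSERT INTO Requests (Id, BoxCount) VALUES (1, 2), (2, 3);
    INSERT INTO Boxes (Id, Label, RequestId) VALUES
        (1, 'A', 1), (2, 'B', 1), (3, 'C', 1),
        (4, 'D', 2), (5, 'E', 2), (6, 'F', 2), (7, 'G', 2);
    """
)

# Position of a box within its request = number of boxes with Id <= its own.
labels = [
    row[0]
    for row in conn.execute(
        """
        SELECT b.Label
        FROM Boxes AS b
        JOIN Requests AS r ON r.Id = b.RequestId
        WHERE (
            SELECT COUNT(*)
            FROM Boxes AS b2
            WHERE b2.RequestId = b.RequestId AND b2.Id <= b.Id
        ) <= r.BoxCount
        ORDER BY b.Id
        """
    )
]
print(labels)
```

The correlated count is usually slower than ROW_NUMBER on large tables, so it is mainly useful as a fallback.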
The INNER JOIN keyword selects records that have matching values in both tables:
SELECT b.Id, b.Label FROM Boxes b
INNER JOIN Requests r ON b.RequestId = r.Id
WHERE r.BoxCount = (value)
select r.id,
r.boxcount,
b.id,
b.label
from requests r
cross apply (
select top (r.BoxCount)
id, label
from boxes
where requestid = r.id
order by id
) b;
I'm looking to assign unique person IDs to a marketing program, but need to optimize based on each person's Probability Score (some people can be sent to multiple programs, some only one) and have two constraints such as budgeted mail quantity for each program.
I'm using SQL Server and am able to put IDs into their highest scoring program using the row_number() over(partition by person_ID order by Prob_Score), but I need to return a table where each ID is assigned to a program, but I'm not sure how to add the max mail quantity constraint specific to each individual program. I've looked into the Check() constraint functionality, but I'm not sure if that's applicable.
create table test_marketing_table(
PersonID int,
MarketingProgram varchar(255),
ProbabilityScore real
);
insert into test_marketing_table (PersonID, MarketingProgram, ProbabilityScore)
values (1, 'A', 0.07)
,(1, 'B', 0.06)
,(1, 'C', 0.02)
,(2, 'A', 0.02)
,(3, 'B', 0.08)
,(3, 'C', 0.13)
,(4, 'C', 0.02)
,(5, 'A', 0.04)
,(6, 'B', 0.045)
,(6, 'C', 0.09);
--this section assigns everyone to their highest scoring program,
--but this isn't necessarily what I need
with x
as
(
select *, row_number()over(partition by PersonID order by ProbabilityScore desc) as PersonScoreRank
from test_marketing_table
)
select *
from x
where PersonScoreRank = 1;
I also need to specify some constraints: two max C packages, one max A & one max B package can be sent. How can I reassign the IDs to a program while also using the highest probability score left available?
The final result should look like:
PersonID MarketingProgram ProbabilityScore PersonScoreRank
3 C 0.13 1
6 C 0.09 1
1 A 0.07 1
6 B 0.045 2
You need to rethink your ROW_NUMBER() formula based on your actual need, and you should also have a table of Marketing Programs to make this work efficiently. This covers the basic ideas you need to incorporate to efficiently perform the filtering you need.
MarketingPrograms Table
CREATE TABLE MarketingPrograms (
ProgramID varchar(10),
PeopleDesired int
)
Populate the MarketingPrograms Table
INSERT INTO MarketingPrograms (ProgramID, PeopleDesired) Values
('A', 1),
('B', 1),
('C', 2)
Use the MarketingPrograms Table
with x as (
select *,
row_number()over(partition by MarketingProgram order by ProbabilityScore desc) as ProgramScoreRank
from test_marketing_table
)
select *
from x
INNER JOIN MarketingPrograms m
ON x.MarketingProgram = m.ProgramID
WHERE x.ProgramScoreRank <= m.PeopleDesired
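As a rough check of what the per-program ranking returns on the question's sample data, here is a sketch on SQLite through Python's sqlite3 module (an assumption; window functions need SQLite 3.25 or later). It partitions by the MarketingProgram column of test_marketing_table and joins to the quota table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE test_marketing_table (
        PersonID INTEGER, MarketingProgram TEXT, ProbabilityScore REAL
    );
    INSERT INTO test_marketing_table VALUES
        (1, 'A', 0.07), (1, 'B', 0.06), (1, 'C', 0.02),
        (2, 'A', 0.02),
        (3, 'B', 0.08), (3, 'C', 0.13),
        (4, 'C', 0.02),
        (5, 'A', 0.04),
        (6, 'B', 0.045), (6, 'C', 0.09);
    CREATE TABLE MarketingPrograms (ProgramID TEXT, PeopleDesired INTEGER);
    INSERT INTO MarketingPrograms VALUES ('A', 1), ('B', 1), ('C', 2);
    """
)

# Rank people within each program, then keep as many as the program wants.
assigned = conn.execute(
    """
    WITH x AS (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY MarketingProgram
                   ORDER BY ProbabilityScore DESC
               ) AS ProgramScoreRank
        FROM test_marketing_table
    )
    SELECT x.PersonID, x.MarketingProgram
    FROM x
    JOIN MarketingPrograms AS m ON x.MarketingProgram = m.ProgramID
    WHERE x.ProgramScoreRank <= m.PeopleDesired
    ORDER BY x.MarketingProgram, x.ProgramScoreRank
    """
).fetchall()
print(assigned)
```

Note that person 3 is picked for both B and C here, whereas the desired output in the question gives B to person 6. The quota per program is respected, but nothing in this query limits a person to one program, so that constraint would still have to be added.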
This is the data I have
I need Unique ID(1 row) with max(Price). So, the output would be:
I have tried the following
select * from table a
join (select b.id,max(b.price) from table b
group by b.id) c on c.id=a.id;
gives the Question as output, because there is no key. I did try the other where condition as well, which gives the original table as output.
You could try something like this in SQL Server:
Table
create table ex1 (
id int,
item char(1),
price int,
qty int,
usr char(2)
);
Data
insert into ex1 values
(1, 'a', 7, 1, 'ab'),
(1, 'a', 7, 2, 'ac'),
(2, 'b', 6, 1, 'ab'),
(2, 'b', 6, 1, 'av'),
(2, 'b', 5, 1, 'ab'),
(3, 'c', 5, 2, 'ab'),
(4, 'd', 4, 2, 'ac'),
(4, 'd', 3, 1, 'av');
Query
select a.* from ex1 a
join (
select id, max(price) as maxprice, min(usr) as minuser
from ex1
group by id
) c
on c.id = a.id
and a.price = c.maxprice
and a.usr = c.minuser
order by a.id, a.usr;
Result
id item price qty usr
1 a 7 1 ab
2 b 6 1 ab
3 c 5 2 ab
4 d 4 2 ac
Explanation
In your dataset, ID 1 has 2 records with the same price. You have to make a decision which one you want. So, in the above example, I am showing a single record for the user whose name is lowest alphabetically.
Alternate method
SQL Server has ranking function row_number over() that can be used as well:
select * from (
select row_number() over( partition by id order by id, price desc, usr) as sr, *
from ex1
) c where sr = 1;
The subquery says - give me all records from the table and give each row a serial number starting with 1 unique to each ID. The rows should be sorted by ID first, then price descending and then usr. The outer query picks out records with sr number 1.
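The two methods are not always interchangeable. In the join version, min(usr) is taken over all rows of an id, not only the rows at the maximum price, so an id can drop out when its alphabetically lowest user is not on the highest-priced row. The sketch below shows this on a hypothetical variation of the data (the users for id 4 are swapped), using SQLite through Python's sqlite3 module (an assumption; window functions need SQLite 3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ex1 (id INTEGER, item TEXT, price INTEGER, qty INTEGER, usr TEXT)"
)
conn.executemany(
    "INSERT INTO ex1 VALUES (?, ?, ?, ?, ?)",
    [
        (1, "a", 7, 1, "ab"),
        (1, "a", 7, 2, "ac"),
        (2, "b", 6, 1, "ab"),
        (2, "b", 6, 1, "av"),
        (2, "b", 5, 1, "ab"),
        (3, "c", 5, 2, "ab"),
        (4, "d", 4, 2, "av"),  # hypothetical change: users swapped for id 4
        (4, "d", 3, 1, "ac"),
    ],
)

# Join on max(price) and min(usr), both computed per id.
join_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT a.id
        FROM ex1 AS a
        JOIN (
            SELECT id, MAX(price) AS maxprice, MIN(usr) AS minuser
            FROM ex1
            GROUP BY id
        ) AS c
          ON c.id = a.id AND a.price = c.maxprice AND a.usr = c.minuser
        ORDER BY a.id
        """
    )
]

# ROW_NUMBER: highest price first, ties broken by usr, keep the first row.
ranked = conn.execute(
    """
    SELECT id, item, price, qty, usr
    FROM (
        SELECT ROW_NUMBER() OVER (
                   PARTITION BY id ORDER BY price DESC, usr
               ) AS sr, *
        FROM ex1
    )
    WHERE sr = 1
    ORDER BY id
    """
).fetchall()

print("join version ids:", join_ids)
print("row_number rows:", ranked)
```

On this variation the join version loses id 4, while the ROW_NUMBER version still returns one row per id.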
Example here: https://rextester.com/KZCZ25396
I have a number of tables that follow this rather common pattern: A <-->> B. I would like to find the pairs of matching rows in table A where certain columns have equal values and also have referencing rows in B where certain columns have equal values. In other words, a pair of rows (R, S) in A matches, iff for given sets of columns {a1, a2, …, an} in A and {b1, b2, …, bn} in B:
We have R.a1 = S.a1, R.a2 = S.a2, …, R.an = S.an.
For every R's referencing row T in B exists S's referencing row U in B s.t. T.b1 = U.b1, T.b2 = U.b2, …, T.bn = U.bn.
(R, S) matches iff (S, R) matches.
(I'm not very familiar with relational algebra, so my definition above might not follow any convention.)
The approach that I came up with was:
Find pairs (R, S) that have matching columns.
See if there's an equal number of (any) R's and S's referencing rows in B.
For each row in B find a matching row, group by the referencing row in A and count. Check that there are as many matching rows as referencing rows.
However, the query that I wrote (below) for steps 2 and 3, to find matching rows in B, is quite complex. Is there a better solution?
-- Tables similar to those that I have.
CREATE TABLE a (
id INTEGER PRIMARY KEY,
data TEXT
);
CREATE TABLE b (
id INTEGER PRIMARY KEY,
a_id INTEGER REFERENCES a (id),
data TEXT
);
SELECT DISTINCT dup.lhs_parent_id, dup.rhs_parent_id
FROM (
SELECT DISTINCT
MIN(lhs.a_id, rhs.a_id) AS lhs_parent_id, -- Normalize.
MAX(lhs.a_id, rhs.a_id) AS rhs_parent_id,
COUNT(*) AS count
FROM b lhs
INNER JOIN b rhs USING (data)
WHERE NOT (lhs.id = rhs.id OR lhs.a_id = rhs.a_id) -- Remove self-matching rows and duplicate values with the same parent.
GROUP BY lhs.a_id, rhs.a_id
) dup
INNER JOIN ( -- Check that lhs has the same number of rows.
SELECT
a_id AS parent_id,
COUNT(*) AS count
FROM b
GROUP BY a_id
) lhs_ct ON (
dup.lhs_parent_id = lhs_ct.parent_id AND
dup.count = lhs_ct.count
)
INNER JOIN ( -- Check that rhs has the same number of rows.
SELECT
a_id AS parent_id,
COUNT(*) AS count
FROM b
GROUP BY a_id
) rhs_ct ON (
dup.rhs_parent_id = rhs_ct.parent_id AND
dup.count = rhs_ct.count
);
-- Test data.
-- Expected query result is three rows with values (1, 2), (1, 3) and (2, 3) for a_id,
-- since the first three rows (with values 'row 1', 'row 2' and 'row 3')
-- have referencing rows, each of which has a matching pair. The fourth row
-- ('row 4') only has one referencing row with the value 'foo', so it doesn't have a
-- pair for the referenced rows with the value 'bar'.
INSERT INTO a (id, data) VALUES
(1, 'row 1'),
(2, 'row 2'),
(3, 'row 3'),
(4, 'row 4');
INSERT INTO b (id, a_id, data) VALUES
(1, 1, 'foo'),
(2, 1, 'bar'),
(3, 2, 'foo'),
(4, 2, 'bar'),
(5, 3, 'foo'),
(6, 3, 'bar'),
(7, 4, 'foo');
I'm using SQLite.
To find matching and different rows, it is easier to use INTERSECT and MINUS operations than joins...
But when only one field is actually used in the comparison, the JOIN solution looks better:
Select B1.A_Id, B2.A_Id
From (
Select Data, A_Id, Count(Id) A_Count
From B
Group By Data, A_Id
) b1
inner join (
Select Data, A_Id, Count(Id) a_count
From B Group By Data, A_Id
) b2 on b1.data = b2.data and b1.a_count = b2.a_count and b1.a_id <> b2.a_id
As I understand it, you need to find the pairs of different a_id values which have the same data and the same count of data.
My script returns the possible couples in both directions, which leaves room for optimization with SQLite-specific syntax.
Result example:
{1,2}, {1,3}, {2,1}, {2,3}, {3,2}, {3,1}
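For comparison, here is a sketch of a different formulation that returns each pair once and also requires every child row on one side to have a partner on the other. It is a swapped-in technique (double NOT EXISTS plus a row-count check), not the query above, and it only compares the data column of b, as in the question's test data. It treats the children as sets with equal counts, so parents with repeated data values may need extra care.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE b (
        id INTEGER PRIMARY KEY,
        a_id INTEGER REFERENCES a (id),
        data TEXT
    );
    INSERT INTO a (id, data) VALUES
        (1, 'row 1'), (2, 'row 2'), (3, 'row 3'), (4, 'row 4');
    INSERT INTO b (id, a_id, data) VALUES
        (1, 1, 'foo'), (2, 1, 'bar'),
        (3, 2, 'foo'), (4, 2, 'bar'),
        (5, 3, 'foo'), (6, 3, 'bar'),
        (7, 4, 'foo');
    """
)

pairs = conn.execute(
    """
    SELECT x.id, y.id
    FROM a AS x
    JOIN a AS y ON x.id < y.id          -- each pair once, smaller id first
    WHERE (SELECT COUNT(*) FROM b WHERE a_id = x.id)
        = (SELECT COUNT(*) FROM b WHERE a_id = y.id)
      AND NOT EXISTS (                  -- no child of x without a partner in y
            SELECT 1 FROM b AS bl
            WHERE bl.a_id = x.id
              AND NOT EXISTS (
                    SELECT 1 FROM b AS br
                    WHERE br.a_id = y.id AND br.data = bl.data))
      AND NOT EXISTS (                  -- no child of y without a partner in x
            SELECT 1 FROM b AS br
            WHERE br.a_id = y.id
              AND NOT EXISTS (
                    SELECT 1 FROM b AS bl
                    WHERE bl.a_id = x.id AND bl.data = br.data))
    ORDER BY x.id, y.id
    """
).fetchall()
print(pairs)
```

On the test data this excludes parent 4, which has a 'foo' row but no 'bar' row.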
I have a problem that I would like have solved via a SQL query. This is going to
be used as a PoC (proof of concept).
The problem:
Product offerings are made up of one or many product instances, a product
instance can belong to many product offerings.
This can be realised like this in a table:
PO | PI
-----
A | 10
A | 11
A | 12
B | 10
B | 11
C | 13
Now I would like to get back the product offer from a set of product instances.
E.g. if we send in 10,11,13 the expected result back is B & C, and if we send in
only 10 then the result should be NULL since no product offering is made up of
only 10. Sending in 10,11,12 would result in A (not A & B, since 12 is not a valid product offering in itself).
Prerequisites:
The combination of product instances sent in can only result in one specific
combination of product offerings, so there is only one solution to each query.
Okay, I think I have it. This meets the constraints you provided. There might be a way to simplify this further, but it ate my brain a little:
select distinct PO
from POPI x
where
PO not in (
select PO
from POPI
where PI not in (10,11,12)
)
and PI not in (
select PI
from POPI
where PO != x.PO
and PO not in (
select PO
from POPI
where PI not in (10,11,12)
)
);
This yields only results who fill the given set which are disjoint with all other results, which I think is what you were asking for. For the test examples given:
Providing 10,11,12 yields A
Providing 10,11,13 yields B,C
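To try the query against the table from the question, here is a small sketch on SQLite through Python's sqlite3 module. The helper name offerings and the parameterized IN lists are illustrative additions; the SQL itself is the query above with placeholders in place of the literal values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE POPI (PO TEXT, PI INTEGER)")
conn.executemany(
    "INSERT INTO POPI VALUES (?, ?)",
    [("A", 10), ("A", 11), ("A", 12), ("B", 10), ("B", 11), ("C", 13)],
)


def offerings(instances):
    """Run the query for a list of product instances (hypothetical helper)."""
    marks = ",".join("?" for _ in instances)
    sql = f"""
        SELECT DISTINCT PO
        FROM POPI AS x
        WHERE PO NOT IN (
                SELECT PO FROM POPI WHERE PI NOT IN ({marks})
              )
          AND PI NOT IN (
                SELECT PI
                FROM POPI
                WHERE PO != x.PO
                  AND PO NOT IN (
                        SELECT PO FROM POPI WHERE PI NOT IN ({marks})
                      )
              )
        ORDER BY PO
    """
    # The instance list appears twice in the SQL, so bind it twice.
    return [row[0] for row in conn.execute(sql, list(instances) * 2)]


print(offerings([10, 11, 12]))
print(offerings([10, 11, 13]))
print(offerings([10]))
```

The third call covers the "only 10" case from the question, which should come back empty.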
Edit: Whilst I think mine works fine, Adam's answer is without a doubt more elegant and more efficient - I'll just leave mine here for posterity!
Apologies since I know this has been tagged as an Oracle issue since I started playing. This is some SQL2008 code which I think works for all the stated cases....
declare @test table
(
[PI] int
)
insert @test values (10), (11), (13)
declare @testCount int
select @testCount = COUNT(*) from @test
;with PO_WITH_COUNTS as
(
select PO_FULL.PO, COUNT(PO_FULL.[PI]) PI_Count
from ProductOffering PO_FULL
left
join (
select PO_QUALIFYING.PO, PO_QUALIFYING.[PI]
from ProductOffering PO_QUALIFYING
where PO_QUALIFYING.[PI] in (select [PI] from @test)
) AS QUALIFYING
on QUALIFYING.PO = PO_FULL.PO
and QUALIFYING.[PI] = PO_FULL.[PI]
group by
PO_FULL.PO
having COUNT(PO_FULL.[PI]) = COUNT(QUALIFYING.[PI])
)
select PO_OUTER.PO
from PO_WITH_COUNTS PO_OUTER
cross
join PO_WITH_COUNTS PO_INNER
where PO_OUTER.PI_Count = @testCount
or PO_OUTER.PO <> PO_INNER.PO
group by
PO_OUTER.PO, PO_OUTER.PI_Count
having PO_OUTER.PI_Count = @testCount
or PO_OUTER.PI_Count + SUM(PO_INNER.PI_Count) = @testCount
Not sure if Oracle has CTEs, but you could just state the inner query as two derived tables. The cross join in the outer query lets us find combinations of offerings that have all the valid items. I know that this will only work based on the statement in the question that the data is such that there is only 1 valid combination for each requested set. Without that it's even more complicated, as counts are not enough to remove combinations that have duplicate products in them.
I don't have a db in front of me, but off the top of my head you want the list of POs that don't have any PIs not in your input list, ie
select distinct po
from tbl
where po not in ( select po from tbl where pi not in (10,11,13) )
Edit: Here are the other example cases:
When input PI = 10,11,13 the inner select returns A so the outer select returns B, C
When input PI = 10 the inner select returns A,B,C so the outer select returns no rows
When input PI = 10,11,12 the inner select returns C so the outer select returns A,B
Edit: Adam has pointed out that this last case doesn't meet the requirement of only returning A (that'll teach me for rushing), so this isn't yet working code.
Select Distinct PO
From Table T
-- Next eliminates POs that contain other PIs
Where Not Exists
(Select * From Table
Where PO = T.PO
And PI Not In (10, 11, 12))
-- And this eliminates POs that do not contain all the PIs
And Not Exists
(Select Distinct PI From Table
Where PI In (10, 11, 12)
Except
Select Distinct PI From Table
Where PO = T.PO)
or, if your database does not implement EXCEPT...
Select Distinct PO
From Table T
-- Next predicate eliminates POs that contain other PIs
Where Not Exists
(Select * From Table
Where PO = T.PO
And PI Not In (10, 11, 12))
-- And this eliminates POs that do not contain ALL the PIs
And Not Exists
(Select Distinct PI From Table A
Where PI In (10, 11, 12)
And Not Exists
(Select Distinct PI From Table
Where PO = T.PO
And PI = A.PI))
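Here is a sketch of the NOT EXISTS variant run against the question's table on SQLite through Python's sqlite3 module. The table name POPI and the helper exact_offerings are assumptions for illustration (Table in the query above is a placeholder). As written, the two predicates together select offerings whose instance set equals the input list exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE POPI (PO TEXT, PI INTEGER)")
conn.executemany(
    "INSERT INTO POPI VALUES (?, ?)",
    [("A", 10), ("A", 11), ("A", 12), ("B", 10), ("B", 11), ("C", 13)],
)


def exact_offerings(instances):
    """Offerings made up of exactly the given instances (hypothetical helper)."""
    marks = ",".join("?" for _ in instances)
    sql = f"""
        SELECT DISTINCT PO
        FROM POPI AS T
        -- eliminate offerings that contain instances outside the list
        WHERE NOT EXISTS (
                SELECT 1 FROM POPI
                WHERE PO = T.PO AND PI NOT IN ({marks}))
        -- eliminate offerings that lack one of the listed instances
          AND NOT EXISTS (
                SELECT 1 FROM POPI AS A
                WHERE A.PI IN ({marks})
                  AND NOT EXISTS (
                        SELECT 1 FROM POPI
                        WHERE PO = T.PO AND PI = A.PI))
        ORDER BY PO
    """
    # The instance list appears twice in the SQL, so bind it twice.
    return [row[0] for row in conn.execute(sql, list(instances) * 2)]


print(exact_offerings([10, 11, 12]))
print(exact_offerings([10, 11, 13]))
print(exact_offerings([10, 11]))
```

Note that 10,11,13 returns nothing in this form, because no single offering consists of exactly those three instances. Covering the B & C case from the question needs the extra step of combining offerings.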
Is it possible that a customer asks for a product more than once?
For example, he/she asks for an offering for 10,10,11,11,12?
If this is possible, then solutions like
select ...
from ...
where pi in (10,10,11,11,12)
will not work.
Because 'pi in (10,10,11,11,12)' is the same as 'pi in (10,11,12)'.
A solution for 10,10,11,11,12 is A&B.
well some pseudo code from the top of my head here:
select from table where PI = 10 or pi =11, etc
store the result in a temp table
select distinct PO and count(PI) from temp table.
Now for each PO you can get the total available PI offerings. If the number of PIs available matches the count in the temp table, it means that you have all the PIs for that PO. Add all the POs and you have your result set.
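The counting idea above can be sketched in a single query, without the temp table: for each PO, compare the total number of PIs with the number that appear in the input list. This is a compact variant under that assumption, run on SQLite through Python's sqlite3 module; the table name POPI and the helper covered_offerings are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE POPI (PO TEXT, PI INTEGER)")
conn.executemany(
    "INSERT INTO POPI VALUES (?, ?)",
    [("A", 10), ("A", 11), ("A", 12), ("B", 10), ("B", 11), ("C", 13)],
)


def covered_offerings(instances):
    """Offerings whose every instance is in the input list (hypothetical helper)."""
    marks = ",".join("?" for _ in instances)
    sql = f"""
        SELECT PO
        FROM POPI
        GROUP BY PO
        -- total instances of the PO == instances of the PO found in the list
        HAVING COUNT(*) = SUM(CASE WHEN PI IN ({marks}) THEN 1 ELSE 0 END)
        ORDER BY PO
    """
    return [row[0] for row in conn.execute(sql, list(instances))]


print(covered_offerings([10, 11, 12]))
print(covered_offerings([10, 11, 13]))
print(covered_offerings([10]))
```

For 10,11,12 this returns both A and B, since B is fully covered too. Picking A alone, as the question expects, still requires removing offerings that are contained in another selected offering.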
You will need a count of the items in your list, i.e. @list_count. Figure out which Offerings have Instances that aren't in the list. Select all Offerings that aren't in that list and do have Instances in the list:
select PO, count(*) c from table where PO not in (
select PO from table where PI not in (@list)
) and PI in (@list) group by PO
I would store that in a temp table and select the records where c = @list_count
If we redefine the problem a bit:
Let's have a customer table with product instances:
create table cust_pi (
pi varchar(5),
customer varchar(5));
And a "product_catalogue" table:
CREATE TABLE PI_PO_TEST
("PO" VARCHAR2(5 CHAR),
"PI" VARCHAR2(5 CHAR)
);
Let's fill it with some sample data:
insert into CUST_PI (PI, CUSTOMER)
values ('11', '1');
insert into CUST_PI (PI, CUSTOMER)
values ('10', '1');
insert into CUST_PI (PI, CUSTOMER)
values ('12', '1');
insert into CUST_PI (PI, CUSTOMER)
values ('13', '1');
insert into CUST_PI (PI, CUSTOMER)
values ('14', '1');
insert into PI_PO_TEST (PO, PI)
values ('A', '10');
insert into PI_PO_TEST (PO, PI)
values ('A', '11');
insert into PI_PO_TEST (PO, PI)
values ('A', '12');
insert into PI_PO_TEST (PO, PI)
values ('A', '13');
insert into PI_PO_TEST (PO, PI)
values ('B', '14');
insert into PI_PO_TEST (PO, PI)
values ('C', '11');
insert into PI_PO_TEST (PO, PI)
values ('C', '12');
insert into PI_PO_TEST (PO, PI)
values ('D', '15');
insert into PI_PO_TEST (PO, PI)
values ('D', '14');
Then my first-shot solution looks like this:
select po1 po /* select all product offerings that match the product definition
(i.e. have the same number of product instances per offering as
in product catalogue */
from (select po po1, count(c.pi) k1
from cust_pi c, pi_po_test t
where c.pi = t.pi
and customer = 1
group by po) t1,
(select po po2, count(*) k2 from pi_po_test group by po) t2
where k1 = k2
and po1 = po2
minus /* remove those that are contained within others */
select slave
from (select po2 master, po1 slave
/* this query returns, that if you have po "master" slave should be removed from result,
as it is contained within*/
from (select t1.po po1, t2.po po2, count(t1.po) k1
from pi_po_test t1, pi_po_test t2
where t1.pi = t2.pi
group by t1.po, t2.po) t1,
(select po, count(po) k2 from pi_po_test group by po) t2
where t1.po2 = t2.po
and k1 < k2)
where master in
/* repeated query from beginning. This could be done better :-) */
(select po1 po
from (select po po1, count(c.pi) k1
from cust_pi c, pi_po_test t
where c.pi = t.pi
and customer = 1
group by po) t1,
(select po po2, count(*) k2 from pi_po_test group by po) t2
where k1 = k2
and po1 = po2)
All of that was done on Oracle, so your mileage may vary
I tested this under 4 sets of values and they all returned a correct result. This uses a function that I use in SQL to generate a table from a string of parameters separated by semicolons.
DECLARE @tbl TABLE (
po varchar(10),
pii int)
INSERT INTO @tbl
SELECT 'A', 10
UNION ALL
SELECT 'A', 11
UNION ALL
SELECT 'A', 12
UNION ALL
SELECT 'B', 10
UNION ALL
SELECT 'B', 11
UNION ALL
SELECT 'C', 13
DECLARE @value varchar(100)
SET @value = '10;11;12;'
--SET @value = '11;10;'
--SET @value = '13;'
--SET @value = '10;'
SELECT DISTINCT po
FROM @tbl a
INNER JOIN fMultiValParam (@value) p ON
a.pii = p.paramid
WHERE a.po NOT IN (
SELECT t.po
FROM @tbl t
LEFT OUTER JOIN (SELECT *
FROM @tbl tt
INNER JOIN fMultiValParam (@value) p ON
tt.pii = p.paramid) tt ON
t.pii = tt.pii
AND t.po = tt.po
WHERE tt.po IS NULL)
here's the function
CREATE FUNCTION [dbo].[fMultiValParam]
(@Param varchar(5000))
RETURNS @tblParam TABLE (ParamID varchar(40))
AS
BEGIN
IF (@Param IS NULL OR LEN(@Param) < 2)
BEGIN
RETURN
END
DECLARE @len INT
DECLARE @index INT
DECLARE @nextindex INT
SET @len = DATALENGTH(@Param)
SET @index = 0
SET @nextindex = 0
WHILE (@index < @len)
BEGIN
SET @Nextindex = CHARINDEX(';', @Param, @index)
INSERT INTO @tblParam
SELECT SUBSTRING(@Param, @index, @nextindex - @index)
SET @index = @nextindex + 1
END
RETURN
END
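For readers without that function, the same splitting step can be sketched with a recursive common table expression. The example below is an illustration on SQLite through Python's sqlite3 module, not a port of the T-SQL function; like the function, it assumes every value is followed by a semicolon. The helper name split_params is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def split_params(value):
    """Split 'a;b;c;' into its items using a recursive CTE (hypothetical helper)."""
    sql = """
        WITH RECURSIVE split(item, rest) AS (
            -- start with no item and the whole string still to process
            SELECT '', ?
            UNION ALL
            -- peel off the text before the next ';' and keep the remainder
            SELECT substr(rest, 1, instr(rest, ';') - 1),
                   substr(rest, instr(rest, ';') + 1)
            FROM split
            WHERE instr(rest, ';') > 0
        )
        SELECT item FROM split WHERE item <> ''
    """
    return [row[0] for row in conn.execute(sql, (value,))]


items = split_params("10;11;12;")
print(sorted(items))
```

The items come back as text, so they would need a cast before being compared with an integer column.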
Try this:
SELECT DISTINCT COALESCE ( offer, NULL )
FROM products
WHERE instance IN ( @instancelist )
IMHO this is impossible in pure SQL without some stored-procedure code. But... I'm not sure.
Added: On the other hand, I'm getting an idea about a recursive query (MSSQL 2005 has such a thing, which allows you to join a query with its own results until no more rows are returned) which might "gather" the correct answers by cross-joining the results of the previous step with all products and then filtering out invalid combinations. You would, however, get all permutations of valid combinations, and it would hardly be efficient. And the idea is pretty vague, so I can't guarantee that it can actually be implemented.