Add missing rows within a table - sql

I need a hint, please. In my table it can happen that a position of an order is not carried over to the next ID.
Let's look at the table:
Pos 2 is missing in ID 3:
ID DOC POSI TOTAL
1 123 1 100
1 123 2 600
1 123 3 200
2 123 1 100
2 123 2 600
2 123 3 200
3 123 1 100
3 123 3 200
Is it possible to create a view using SQL that compares the individual ID partitions with each other and appends the missing value from ID 2 to ID 3 as a row?
Maybe you have some keywords for me, if something like this is possible.

The hint would be: use a join.
One way of approaching this is to select the key pairs that you expect and then left join the original table to them. Be careful about how missing values are handled, since your question does not specify what should happen to the newly created entries.
Test Data
CREATE TABLE test (id INTEGER, doc INTEGER, posi INTEGER, total INTEGER);
INSERT INTO test VALUES (1, 123, 1, 100);
INSERT INTO test VALUES (1, 123, 2, 600);
INSERT INTO test VALUES (1, 123, 3, 200);
INSERT INTO test VALUES (2, 123, 1, 100);
INSERT INTO test VALUES (2, 123, 2, 600);
INSERT INTO test VALUES (2, 123, 3, 200);
INSERT INTO test VALUES (3, 123, 1, 100);
INSERT INTO test VALUES (3, 123, 3, 200);
The possible key combinations can be generated with a cross join:
SELECT DISTINCT a.id, b.posi
FROM test a, test b
And now join the original table:
WITH expected_lines AS (
SELECT DISTINCT a.id, b.posi
FROM test a, test b
)
SELECT el.id, el.posi, t.doc, t.total
FROM expected_lines el
LEFT JOIN test t ON el.id = t.id AND el.posi = t.posi
You did not describe what should happen with the columns that are now empty. As you may note, DOC and TOTAL are null for the added row.
My educated guess would be that you want to make DOC part of the key and assume a TOTAL of 0. If that's the case, you can go with the following:
WITH expected_lines AS (
SELECT DISTINCT a.id, b.posi, c.doc
FROM test a, test b, test c
)
SELECT el.id, el.posi, el.doc, ifnull(t.total, 0) total
FROM expected_lines el
LEFT JOIN test t ON el.id = t.id AND el.posi = t.posi AND el.doc = t.doc
Result
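To check what the last query returns, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite and Python are assumptions made for this sketch (the question does not name a database; ifnull is available in SQLite and MySQL), and the ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, doc INTEGER, posi INTEGER, total INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?, ?)",
    [
        (1, 123, 1, 100), (1, 123, 2, 600), (1, 123, 3, 200),
        (2, 123, 1, 100), (2, 123, 2, 600), (2, 123, 3, 200),
        (3, 123, 1, 100), (3, 123, 3, 200),  # position 2 is missing for ID 3
    ],
)

# Every (id, posi, doc) combination is expected; missing totals default to 0.
rows = conn.execute("""
    WITH expected_lines AS (
        SELECT DISTINCT a.id, b.posi, c.doc
        FROM test a, test b, test c
    )
    SELECT el.id, el.doc, el.posi, ifnull(t.total, 0) AS total
    FROM expected_lines el
    LEFT JOIN test t
      ON el.id = t.id AND el.posi = t.posi AND el.doc = t.doc
    ORDER BY el.id, el.posi
""").fetchall()

for row in rows:
    print(row)
```

The row for ID 3, position 2 comes back with a TOTAL of 0. If the value from the previous ID should be carried over instead, the ifnull default would have to be replaced by a lookup, which this sketch does not attempt.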


Getting SQL rows where >= 1 row have a certain value in another column

I've seen similarly worded questions, and I may be phrasing it wrong, but take the following example table:
a b
1 5
2 6
3 7
1 8
2 8
2 9
1 10
2 10
3 10
Say I know beforehand that I have the a values [1,2]. How can I get all values of b that those a values share? In the above, the result would be [8, 10]. If I had [1,3] for a, I would get [10]. If I had [2] for a, I would get [6,8,9,10].
I imagine it would start something like SELECT b FROM tablename WHERE ...
You can use intersect
Schema and insert statements:
create table test(a int, b int);
insert into test values(1, 5);
insert into test values(2, 6);
insert into test values(3, 7);
insert into test values(1, 8);
insert into test values(2, 8);
insert into test values(2, 9);
insert into test values(1, 10);
insert into test values(2, 10);
insert into test values(3, 10);
Query1:
select b from test where a=1
intersect
select b from test where a=2
Output:
b
8
10
Query2:
select b from test where a=1
intersect
select b from test where a=3
Output:
b
10
Query3:
select b from test where a=2
Output:
b
6
8
9
10
Create a CTE that returns the values of a that you want and filter the table for these values only.
Then group by b and, in the HAVING clause, filter the result set so that only values of b that are associated with all of the values of a that you want are returned. This assumes each (a, b) pair appears only once in the table:
WITH cte(a) AS (VALUES (1), (2))
SELECT b
FROM tablename
WHERE a IN cte
GROUP BY b
HAVING COUNT(*) = (SELECT COUNT(*) FROM cte);
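The grouping approach can be wrapped in a small helper so the list of a values becomes a parameter. This is a sketch under a few assumptions: it uses Python's sqlite3 module with an in-memory database, the table is called test as in the other answer, the helper name shared_b is made up for illustration, and COUNT(DISTINCT a) is swapped in for COUNT(*) so that duplicate (a, b) rows would not distort the result.

```python
import sqlite3

def shared_b(conn, a_values):
    """Return the b values that occur with every one of the given a values."""
    wanted = sorted(set(a_values))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT b
        FROM test
        WHERE a IN ({placeholders})
        GROUP BY b
        HAVING COUNT(DISTINCT a) = ?
        ORDER BY b
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, 5), (2, 6), (3, 7), (1, 8), (2, 8), (2, 9), (1, 10), (2, 10), (3, 10)],
)

print(shared_b(conn, [1, 2]))
print(shared_b(conn, [1, 3]))
print(shared_b(conn, [2]))
```

Unlike INTERSECT, the query text has the same shape for any number of a values; only the number of placeholders changes.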

Check duplicates in sql table and replace the duplicates ID in another table

I have a table with duplicate entries (I forgot to make the NAME column unique).
So I now have this table with duplicate entries, called 'table 1':
ID NAME
1 John F Smith
2 Sam G Davies
3 Tom W Mack
4 Bob W E Jone
5 Tom W Mack
I.e., IDs 3 and 5 are duplicates.
Table 2
ID NAMEID ORDERS
1 2 item4
2 1 item5
3 4 item6
4 3 item23
5 5 item34
The NAMEID values are IDs from table 1. In table 2, I want rows 4 and 5 to both have a NAMEID of 3 (Tom W Mack's orders), like so:
Table 2 (correct version)
ID NAMEID ORDERS
1 2 item4
2 1 item5
3 4 item6
4 3 item23
5 3 item34
Is there an easy way to find and update the duplicate NAMEIDs in table 2 and then remove the duplicates from table 1?
In this case, what you can do is first find out how many duplicate records you have.
To find the duplicate records you can group by NAME only (ID is unique, so grouping by it would never show a count above 1):
SELECT NAME, COUNT(1) as CNT FROM TABLE1 GROUP BY NAME HAVING COUNT(1) > 1
This will give you the count for every duplicated name, so you can find all the duplicate records
and delete them manually.
Don't forget to alter your table (make NAME unique) after removing all the duplicate records.
Here's how you can do it:
-- set up the environment
create table #t (ID int, NAME varchar(50))
insert #t values
(1, 'John F Smith'),
(2, 'Sam G Davies'),
(3, 'Tom W Mack'),
(4, 'Bob W E Jone'),
(5, 'Tom W Mack')
create table #t2 (ID int, NAMEID int, ORDERS varchar(10))
insert #t2 values
(1, 2, 'item4'),
(2, 1, 'item5'),
(3, 4, 'item6'),
(4, 3, 'item23'),
(5, 5, 'item34')
go
-- update the referencing table first
;with x as (
select id,
first_value(id) over(partition by name order by id) replace_with
from #t
),
y as (
select #t2.nameid, x.replace_with
FROM #t2
join x on #t2.nameid = x.id
where #t2.nameid <> x.replace_with
)
update y set nameid = replace_with
-- delete duplicates from referenced table
;with x as (
select *, row_number() over(partition by name order by id) rn
from #t
)
delete x where rn > 1
select * from #t
select * from #t2
Please test this first for performance and validity.
Let's use the following example data:
INSERT INTO TableA
(`ID`, `NAME`)
VALUES
(1, 'NameA'),
(2, 'NameB'),
(3, 'NameA'),
(4, 'NameC'),
(5, 'NameB'),
(6, 'NameD')
and
INSERT INTO TableB
(`ID`, `NAMEID`, `ORDERS`)
VALUES
(1, 2, 'itemB1'),
(2, 1, 'itemA1'),
(3, 4, 'itemC1'),
(4, 3, 'itemA2'),
(5, 5, 'itemB2'),
(5, 6, 'itemD1')
(makes it a bit easier to spot the duplicates and check the result)
Let's start with a simple query to get the smallest ID for a given NAME
SELECT
NAME, min(ID)
FROM
tableA
GROUP BY
NAME
And the result is [NameA,1], [NameB,2], [NameC,4], [NameD,6]
Now if you use that as an uncorrelated subquery for a JOIN with the base table like
SELECT
keep.kid, dup.id
FROM
tableA as dup
JOIN
(
SELECT
NAME, min(ID) as kid
FROM
tableA
GROUP BY
NAME
) as keep
ON
keep.NAME=dup.NAME
AND keep.kid<dup.id
This finds all duplicates that have the same name as a row in the subquery result but a larger id, and it also gives you the id of the "original", i.e. the smallest id for that name.
For the example it's [1,3], [2,5].
Now you can use that in an UPDATE query like
UPDATE
TableB as b
JOIN
tableA as dup
JOIN
(
SELECT
NAME, min(ID) as kid
FROM
tableA
GROUP BY
NAME
) as keep
ON
keep.NAME=dup.NAME
AND keep.kid<dup.id
SET
b.NAMEID=keep.kid
WHERE
b.NAMEID=dup.id
And the result is
ID,NAMEID,ORDERS
1, 2, itemB1
2, 1, itemA1
3, 4, itemC1
4, 1, itemA2 <- now has NAMEID=1
5, 2, itemB2 <- now has NAMEID=2
5, 6, itemD1
To eliminate the duplicates from tableA you can use the first query again: keep only the rows whose ID it returns.
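The two steps (repoint the references, then delete the duplicates) can be tried end to end with the example data. The sketch below is an assumption-laden variant, not the MySQL statements above: it uses Python's sqlite3 module, and because the UPDATE ... JOIN form is MySQL-specific it swaps in a correlated subquery for the update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER, NAME TEXT);
    CREATE TABLE TableB (ID INTEGER, NAMEID INTEGER, ORDERS TEXT);
    INSERT INTO TableA VALUES
        (1, 'NameA'), (2, 'NameB'), (3, 'NameA'),
        (4, 'NameC'), (5, 'NameB'), (6, 'NameD');
    INSERT INTO TableB VALUES
        (1, 2, 'itemB1'), (2, 1, 'itemA1'), (3, 4, 'itemC1'),
        (4, 3, 'itemA2'), (5, 5, 'itemB2'), (5, 6, 'itemD1');
""")

with conn:  # one transaction: repoint the references first, then delete
    # Point every NAMEID at the smallest ID that carries the same NAME.
    conn.execute("""
        UPDATE TableB
        SET NAMEID = (
            SELECT MIN(keep.ID)
            FROM TableA AS keep
            WHERE keep.NAME = (SELECT dup.NAME FROM TableA AS dup
                               WHERE dup.ID = TableB.NAMEID)
        )
        WHERE NAMEID IN (SELECT ID FROM TableA)
    """)
    # Keep only the smallest ID per NAME.
    conn.execute("""
        DELETE FROM TableA
        WHERE ID NOT IN (SELECT MIN(ID) FROM TableA GROUP BY NAME)
    """)

name_ids = [r[0] for r in conn.execute(
    "SELECT NAMEID FROM TableB ORDER BY ID, NAMEID")]
remaining_ids = [r[0] for r in conn.execute("SELECT ID FROM TableA ORDER BY ID")]
print(name_ids, remaining_ids)
```

The order matters: deleting from TableA first would lose the NAME needed to find the replacement ID for rows in TableB that still point at a duplicate.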

Oracle SQL - How can I write an insert statement that is conditional and looped?

Context:
I have two tables: markettypewagerlimitgroups (mtwlg) and stakedistributionindicators (sdi). When a mtwlg is created, 2 rows are created in the sdi table which are linked to the mtwlg - each row with the same values bar 2, the id and another field (let's call it column X) which must contain a 0 for one row and 1 for the other.
There was a bug present in our codebase which prevented this happening automatically, so any mtwlg's created during the time that bug was present do not have the related sdi's, causing NPE's in various places.
To fix this, a patch needs to be written to loop through the mtwlg table and, for each ID, search the sdi table for the 2 related rows. If both rows are present, do nothing; if there is only 1 row, check whether column X is a 0 or a 1, and insert a row with the other value; if neither row is present, insert them both. This needs to be done for every mtwlg, and a unique ID needs to be inserted too.
Pseudocode:
For each market type wager limit group ID
    Check if there are 2 rows with that id in the stake distributions table, 1 where column X = 0 and one where column X = 1
    if none
        create 2 rows in the stake distributions table with unique id's; 1 for each X value
    if one
        create the missing row in the stake distributions table with a unique id
    if 2
        do nothing
If it helps at all - the patch will be applied using liquibase.
Anyone with any advice or thoughts as to if and how this will be possible to write in SQL/a liquibase patch?
Thanks in advance, let me know of any other information you need.
EDIT:
I've actually just been advised to do this using PL/SQL, do you have any thoughts/suggestions in regards to this?
Thanks again.
Oooooh, an excellent job for MERGE.
Here's your pseudo code again:
For each market type wager limit group ID
    Check if there are 2 rows with that id in the stake distributions table,
    1 where column X = 0 and one where column X = 1
    if none
        create 2 rows in the stake distributions table with unique id's;
        1 for each X value
    if one
        create the missing row in the stake distributions table with a unique id
    if 2
        do nothing
Here's the MERGE variant (still pseudo-code'ish as I don't know how your data really looks):
MERGE INTO stake_distributions d
USING (
SELECT limit_group_id, 0 AS x
FROM market_type_wagers
UNION ALL
SELECT limit_group_id, 1 AS x
FROM market_type_wagers
) t
ON (
d.limit_group_id = t.limit_group_id AND d.x = t.x
)
WHEN NOT MATCHED THEN INSERT (d.limit_group_id, d.x)
VALUES (t.limit_group_id, t.x);
No loops, no PL/SQL, no conditional statements, just plain beautiful SQL.
Nice alternative suggested by Boneist in the comments uses a CROSS JOIN rather than UNION ALL in the USING clause, which is likely to perform better (unverified):
MERGE INTO stake_distributions d
USING (
SELECT w.limit_group_id, x.x
FROM market_type_wagers w
CROSS JOIN (
SELECT 0 AS x FROM DUAL
UNION ALL
SELECT 1 AS x FROM DUAL
) x
) t
ON (
d.limit_group_id = t.limit_group_id AND d.x = t.x
)
WHEN NOT MATCHED THEN INSERT (d.limit_group_id, d.x)
VALUES (t.limit_group_id, t.x);
Answer: you don't. There is absolutely no need to loop through anything - you can do it in a single insert. All you need to do is identify the rows that are missing, and then you just need to add them in.
Here is an example:
drop table t1;
drop table t2;
drop sequence t2_seq;
create table t1 (cola number,
colb number,
colc number);
create table t2 (id number,
cola number,
colb number,
colc number,
colx number);
create sequence t2_seq
START WITH 1
INCREMENT BY 1
MAXVALUE 99999999
MINVALUE 1
NOCYCLE
CACHE 20
NOORDER;
insert into t1 values (1, 10, 100);
insert into t2 values (t2_seq.nextval, 1, 10, 100, 0);
insert into t2 values (t2_seq.nextval, 1, 10, 100, 1);
insert into t1 values (2, 20, 200);
insert into t2 values (t2_seq.nextval, 2, 20, 200, 0);
insert into t1 values (3, 30, 300);
insert into t2 values (t2_seq.nextval, 3, 30, 300, 1);
insert into t1 values (4, 40, 400);
commit;
insert into t2 (id, cola, colb, colc, colx)
with dummy as (select 1 id from dual union all
select 0 id from dual)
select t2_seq.nextval,
t1.cola,
t1.colb,
t1.colc,
d.id
from t1
cross join dummy d
left outer join t2 on (t2.cola = t1.cola and d.id = t2.colx)
where t2.id is null;
commit;
select * from t2
order by t2.cola;
ID COLA COLB COLC COLX
---------- ---------- ---------- ---------- ----------
1 1 10 100 0
2 1 10 100 1
3 2 20 200 0
5 2 20 200 1
7 3 30 300 0
4 3 30 300 1
6 4 40 400 0
8 4 40 400 1
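The same "cross join the expected values, keep only what is missing" idea can be tried without an Oracle instance. The sketch below is a port under assumptions: it uses Python's sqlite3 module, replaces the sequence with an INTEGER PRIMARY KEY that SQLite fills in automatically, uses NOT EXISTS instead of the outer join plus IS NULL filter, and the helper name fill_missing is made up for illustration.

```python
import sqlite3

def fill_missing(conn):
    """Insert the (cola, colx) rows that t2 lacks; return how many were added."""
    cur = conn.execute("""
        INSERT INTO t2 (cola, colb, colc, colx)
        SELECT t1.cola, t1.colb, t1.colc, d.x
        FROM t1
        CROSS JOIN (SELECT 0 AS x UNION ALL SELECT 1) AS d
        WHERE NOT EXISTS (
            SELECT 1 FROM t2
            WHERE t2.cola = t1.cola AND t2.colx = d.x
        )
    """)
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (cola INTEGER, colb INTEGER, colc INTEGER);
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, cola INTEGER, colb INTEGER,
                     colc INTEGER, colx INTEGER);
    INSERT INTO t1 VALUES (1, 10, 100), (2, 20, 200), (3, 30, 300), (4, 40, 400);
    INSERT INTO t2 (cola, colb, colc, colx) VALUES
        (1, 10, 100, 0), (1, 10, 100, 1), (2, 20, 200, 0), (3, 30, 300, 1);
""")

first_run = fill_missing(conn)   # adds the four missing rows
second_run = fill_missing(conn)  # nothing left to add
pairs = conn.execute("SELECT cola, colx FROM t2 ORDER BY cola, colx").fetchall()
print(first_run, second_run, pairs)
```

Because the statement only inserts what is absent, running it a second time changes nothing, which is a useful property for a patch that may be applied to databases in different states.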
If the processing logic is too gnarly to be encapsulated in a single SQL statement, you may need to resort to cursor for loops and row types - basically allows you to do things like the following:
DECLARE
r_mtwlg markettypewagerlimitgroups%ROWTYPE;
BEGIN
FOR r_mtwlg IN (
SELECT mtwlg.*
FROM markettypewagerlimitgroups mtwlg
)
LOOP
-- do stuff here
-- refer to elements of the current row like this
DBMS_OUTPUT.PUT_LINE(r_mtwlg.id);
END LOOP;
END;
/
You can obviously nest another loop inside this one that hits the stakedistributionindicators table, but I'll leave that as an exercise for you. You could also left join to stakedistributionindicators a couple of times in this first cursor so that you only return rows that don't already have an x=1 and x=0, again you can probably work that bit out for yourself.
If you would rather write your logic in Java than in PL/SQL, Liquibase allows you to create custom changes. The custom change points to a Java class you write that can do whatever logic you need.

How to Get Sum of One Column Based On Other Table in Sql Server

I have 2 tables in my database (like this):
tblCustomers:
id CustomerName
1 aaa
2 bbb
3 ccc
4 ddd
5 eee
6 fff
tblPurchases:
id CustomerID Price
1 1 300
2 2 100
3 3 500
4 1 150
5 4 50
6 3 250
7 6 700
8 2 30
9 1 310
10 4 25
Now I want a stored procedure that gives me a new table with the sum of prices for each customer, exactly as shown below.
How can I do that?
Procedure result:
id CustomerName SumPrice
1 aaa 760
2 bbb 130
3 ccc 750
4 ddd 75
5 eee 0
6 fff 700
select c.id, c.customername, sum(isnull(p.price, 0)) as sumprice
from tblcustomers c
left join tblpurchases p
on c.id = p.customerid
group by c.id, c.customername
SQL Fiddle test: http://sqlfiddle.com/#!3/9b573/1/0
Note the need for an outer join because your desired result includes customers with no purchases.
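The difference between the outer and the inner join is easy to see by running both against the sample data. This is a sketch in Python's sqlite3 module, which is an assumption made so the example is self-contained; it uses COALESCE around the SUM where the SQL Server query uses isnull.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblcustomers (id INTEGER, customername TEXT)")
conn.execute("CREATE TABLE tblpurchases (id INTEGER, customerid INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO tblcustomers VALUES (?, ?)",
    [(1, "aaa"), (2, "bbb"), (3, "ccc"), (4, "ddd"), (5, "eee"), (6, "fff")],
)
conn.executemany(
    "INSERT INTO tblpurchases VALUES (?, ?, ?)",
    [(1, 1, 300), (2, 2, 100), (3, 3, 500), (4, 1, 150), (5, 4, 50),
     (6, 3, 250), (7, 6, 700), (8, 2, 30), (9, 1, 310), (10, 4, 25)],
)

QUERY = """
    SELECT c.id, c.customername, COALESCE(SUM(p.price), 0) AS sumprice
    FROM tblcustomers c
    {join} tblpurchases p ON c.id = p.customerid
    GROUP BY c.id, c.customername
    ORDER BY c.id
"""

left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()

print(left_rows)   # includes customer 5 ('eee') with a sum of 0
print(inner_rows)  # customer 5 is absent because it has no purchases
```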
You can use the query below to get the result (a left join is used so that customers without purchases are kept, with a total of 0):
select id,CustomerName,isnull(sum(price), 0) as TotalPrice
from
(
select tc.id,tc.CustomerName,tp.price
from tblCustomers tc
left join
tblPurchases tp on tc.id = tp.CustomerID
) tab
group by id,CustomerName
Although the other answers here do work, they don't appear to be what I would consider standard practice, or optimal.
The simplest solution (standard, but not always optimal) requires no sub-query of any variety.
SELECT
cust.id,
cust.CustomerName,
ISNULL(SUM(prch.price), 0) AS SumPrice
FROM
tblCustomers AS cust
LEFT JOIN
tblPurchases AS prch
ON cust.id = prch.CustomerID
GROUP BY
cust.id,
cust.CustomerName
The only reason that this is not necessarily optimal is that it involves grouping by two fields, one of which is a string. This involves creating 'counters' in memory that are identified by this composite of an id and a string, which can be inefficient because you only really need the id to uniquely identify the counter. (The identifier is then a single small item, probably only 4 bytes, rather than multiple items, one of which is potentially many bytes long.)
This means that you can do the following as a possible optimisation. Though depending on your data this may be a premature optimisation, it has no performance downside and is always good to know about...
SELECT
cust.id,
cust.CustomerName,
ISNULL(prch.SumPrice, 0) AS SumPrice
FROM
tblCustomers AS cust
LEFT JOIN
(
SELECT
CustomerID,
SUM(price) AS SumPrice
FROM
tblPurchases
GROUP BY
CustomerID
) AS prch
ON cust.id = prch.CustomerID
This makes the in-memory aggregation as simple as possible, and so as quick as possible.
In both cases you should get the best possible efficiency from the query by ensuring that you have indexes on tblCustomers(id) and on tblPurchases(CustomerID).
DECLARE @tblcustomers table (id int, customername varchar(10));
insert into @tblcustomers values (1, 'aaa');
insert into @tblcustomers values (2, 'bbb');
insert into @tblcustomers values (3, 'ccc');
insert into @tblcustomers values (4, 'ddd');
insert into @tblcustomers values (5, 'eee');
insert into @tblcustomers values (6, 'fff');
DECLARE @tblpurchases table (id int, customerid int, price int);
insert into @tblpurchases values (1, 1, 300);
insert into @tblpurchases values (2, 2, 100);
insert into @tblpurchases values (3, 3, 500);
insert into @tblpurchases values (4, 1, 150);
insert into @tblpurchases values (5, 4, 50);
insert into @tblpurchases values (6, 3, 250);
insert into @tblpurchases values (7, 6, 700);
insert into @tblpurchases values (8, 2, 30);
insert into @tblpurchases values (9, 1, 310);
insert into @tblpurchases values (10, 4, 25);
WITH CTE AS(
select c.id,c.customername from @tblcustomers c
)
-- ISNULL wraps the whole subquery so customers without purchases get 0 instead of NULL
Select c.id,c.customername,ISNULL((Select SUM(P.price) from @tblpurchases P
WHERE P.customerid = C.id),0) AS Price from CTE c

How to get value without subqueries (on SQL-Server)?

I have the following table on SQL Server:
ID Num
1 A
2 B
2 B
3 C
3 C
4 C
(Num is a numeric column - A, B, and C are stand-ins for numeric values, for the purpose of this question)
How can I get the value of A+B+C+C without using subqueries or a CTE?
A - for 1, B - for 2, C - for 3, C - for 4.
The answer seems to be sum(distinct Num), but the distinct has to be by the ID field!
Demo table:
create table test (ID int, Num int);
insert into test values (1, 10);
insert into test values (2, 100);
insert into test values (2, 100);
insert into test values (3, 1000);
insert into test values (3, 1000);
insert into test values (4, 1000);
The correct answer is 10+100+1000+1000 = 2110.
A random guess, using CTE to avoid the pointless subquery restriction:
With X as (Select Distinct Id, Num From Test)
Select
Sum(Num)
From X
Or using a derived table (which works in SQL 2000):
Select
Sum(Num)
From (
Select Distinct
Id,
Num
From
Test
) a;
http://sqlfiddle.com/#!3/77a6e/6
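The solution: add a tiny ID-dependent fraction to each Num so that DISTINCT separates equal values that belong to different IDs, then round the fraction away again:
select cast(sum(distinct Num + cast(0.00001 as numeric(38,19))/ID) as numeric(18,2)) from test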
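To compare the subquery-free trick with the derived-table version, here is a minimal sketch that runs both against the demo table. It rests on assumptions: Python's sqlite3 module stands in for SQL Server, and since SQLite has no fixed-precision numeric type the fraction is removed with ROUND on a floating-point sum.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (ID INTEGER, Num INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, 10), (2, 100), (2, 100), (3, 1000), (3, 1000), (4, 1000)],
)

# Reference result: de-duplicate (ID, Num) pairs in a derived table, then sum.
derived = conn.execute(
    "SELECT SUM(Num) FROM (SELECT DISTINCT ID, Num FROM test)"
).fetchone()[0]

# Trick: Num + 0.00001 / ID differs per ID, so DISTINCT only collapses
# repeated rows of the same ID; ROUND then removes the added fractions.
trick = conn.execute(
    "SELECT CAST(ROUND(SUM(DISTINCT Num + 0.00001 / ID)) AS INTEGER) FROM test"
).fetchone()[0]

print(derived, trick)
```

The trick relies on the added fractions staying far below the rounding threshold and on Num being the same for all rows of one ID, so it is worth checking against the derived-table result on real data.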