change "many-to-many" to "one-to-many" - sql

I have the following table and data:
create table Foo
(
id int not null,
hid int not null,
value int not null
)
insert into Foo(id, hid, value) values(1,1,1) -- use this as 1 < 3
insert into Foo(id, hid, value) values(1,2,3)
insert into Foo(id, hid, value) values(2,3,3) -- use this as 3 < 5
insert into Foo(id, hid, value) values(2,4,5)
insert into Foo(id, hid, value) values(3,2,2) -- use this or next one as value are the same
insert into Foo(id, hid, value) values(3,3,2)
Currently "id" and "hid" have a many-to-many association. What I want is to make "hid" the "one" side instead of "many": for each "id", keep only the row with the minimum "value" in the table (see the comments in the SQL code above).
Is it possible to achieve this with a query instead of a cursor?
Thanks!

SQL 2005:
WITH X AS ( SELECT id, min(value) as minval from Foo group by id )
SELECT * FROM
(
SELECT Foo.*, RANK() OVER ( PARTITION by Foo.id order by Foo.hid, Foo.value ) as Rank
FROM Foo JOIN X on Foo.id = X.id and Foo.value = X.minval
) tmp
WHERE Rank = 1
id hid value Rank
----------- ----------- ----------- --------------------
1 1 1 1
2 3 3 1
3 2 2 1
The first line (WITH clause) gets a set of ids with the min value (my arbitrary choice).
The RANK is used to eliminate duplicates - there may be a better way.
With MySql or SQL 2000 I guess you could do this with a complicated set of subqueries.
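For engines without window functions, one possible shape for the subquery version is sketched below. It is only a sketch: it runs against an in-memory SQLite database from Python as a stand-in (not SQL Server 2000 or MySQL), and breaking ties by the lowest hid is an assumption carried over from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Foo (id INT NOT NULL, hid INT NOT NULL, value INT NOT NULL);
INSERT INTO Foo VALUES (1,1,1),(1,2,3),(2,3,3),(2,4,5),(3,2,2),(3,3,2);
""")

# Step 1 (derived table m): minimum value per id.
# Step 2: join back to Foo to keep only rows holding that minimum,
#         then collapse ties by taking the lowest hid.
query = """
SELECT f.id, MIN(f.hid) AS hid, f.value
FROM Foo f
JOIN (SELECT id, MIN(value) AS minval FROM Foo GROUP BY id) m
  ON m.id = f.id AND m.minval = f.value
GROUP BY f.id, f.value
ORDER BY f.id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 1, 1), (2, 3, 3), (3, 2, 2)]
```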

Not sure if you are looking for a query, or instructions on how to modify your schema, but here is a query:
select id, min(hid) as hid, min(value) as value
from Foo
group by id
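One thing to watch with this query: the two MIN() calls are evaluated independently, so the hid and value it returns need not come from the same row. It happens to match the sample data in the question. The sketch below uses made-up rows (not from the question) and an in-memory SQLite database to show the case where it does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Foo (id INT NOT NULL, hid INT NOT NULL, value INT NOT NULL);
-- Hypothetical rows: the lowest hid (1) does not carry the lowest value (3).
INSERT INTO Foo VALUES (1, 1, 5), (1, 2, 3);
""")

# Each MIN() is computed over the whole group on its own.
independent = conn.execute(
    "SELECT id, MIN(hid), MIN(value) FROM Foo GROUP BY id"
).fetchall()
actual_rows = conn.execute("SELECT id, hid, value FROM Foo").fetchall()

print(independent)                    # [(1, 1, 3)]
print(independent[0] in actual_rows)  # False: no such row exists in Foo
```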

SQL - group by occurrence and return id

I have a table of IDs and value:
ID Value
X 1
X 1
X 2
Y 5
Y 5
Y 5
Z 3
Z 6
I want to see which IDs contain more than one distinct value. In this case, return IDs X and Z, because X contains [1,2] and Z contains [3,6]:
ID
X
Z
I have tried this:
select ID from
(
SELECT ID
,count(*) over (partition by [Value]) as c
FROM mytable
) a
where c>1
But this does not return the desired result.
I prefer aggregating this way:
SELECT ID
FROM mytable
GROUP BY ID
HAVING MIN(Value) <> MAX(Value);
On many databases, the above HAVING clause will be sargable, meaning that an index on (ID, Value) can be used. The version which checks COUNT(DISTINCT Value) may not be able to use such an index.
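As a quick sanity check, here is a small sketch that runs both the MIN/MAX form and the COUNT(DISTINCT ...) form against the sample data. It uses an in-memory SQLite database from Python purely as a stand-in, so it says nothing about index usage on your actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (ID TEXT NOT NULL, Value INT NOT NULL);
INSERT INTO mytable VALUES
  ('X',1),('X',1),('X',2),('Y',5),('Y',5),('Y',5),('Z',3),('Z',6);
""")

# An ID has more than one distinct value exactly when its min and max differ.
min_max = [r[0] for r in conn.execute(
    "SELECT ID FROM mytable GROUP BY ID "
    "HAVING MIN(Value) <> MAX(Value) ORDER BY ID")]

# Equivalent formulation that counts the distinct values directly.
count_distinct = [r[0] for r in conn.execute(
    "SELECT ID FROM mytable GROUP BY ID "
    "HAVING COUNT(DISTINCT Value) > 1 ORDER BY ID")]

print(min_max)         # ['X', 'Z']
print(count_distinct)  # ['X', 'Z']
```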
Try this,
SELECT ID
FROM mytable
GROUP BY ID
HAVING COUNT(DISTINCT Value) > 1;
Just group them by ID and check whether each group has more than one distinct value in the Value field. Something like this:
SELECT ID
FROM mytable
GROUP BY ID
HAVING COUNT(DISTINCT Value) > 1
CREATE TABLE yourtable(
ID VARCHAR(30) NOT NULL
,Value int NOT NULL
);
INSERT INTO yourtable
(ID,Value) VALUES
('X',1),
('X',1),
('X',2),
('Y',5),
('Y',5),
('Y',5),
('Z',3),
('Z',6);
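Other approaches are far better, but I used RANK and a subquery to pick out IDs with more than one distinct value. DISTINCT keeps an ID from being listed more than once when it has three or more distinct values.
SELECT DISTINCT ID
FROM   (SELECT *,
               Rank()
                 OVER(
                   partition BY ID
                   ORDER BY Value) ID2
        FROM   yourtable) a
WHERE ID2 > 1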

Oracle SQL: using LAG function with user-defined-type returns "inconsistent datatypes"

I have a type MyType defined as follows:
create or replace type MyType as varray(20000) of number(18);
And a table MyTable defined as follows:
create table MyTable (
id number(18) primary key
,widgets MyType
)
I am trying to select the widgets for each row and its logically previous row in MyTable using the following SQL:
select t.id
,lag(t.widgets,1) over (order by t.id) as widgets_previous
from MyTable t
order by t.id;
and I get the response:
ORA-00932: inconsistent datatypes: expected - got MYSCHEMA.MYTYPE
If I run the exact same query using a column of type varchar or number instead of MyType it works fine.
The type of the column in the current row and its previous row must be the same so I can only assume it is something related to the user defined type.
Do I need to do something special to use LAG with a user defined type, or does LAG not support user defined types? If the latter, are there any other utility functions that would provide the same functionality or do I need to do a traditional self join in order to achieve the same?
After reading all the above I've opted for the following as the most effective method for achieving what I need:
select curr.id
,curr.widgets as widgets
,prev.widgets as previous_widgets
from (select a.id
,a.widgets
,lag(a.id,1) over (order by a.id) as previous_id
from mytable a
) curr
left join mytable prev on (prev.id = curr.previous_id)
order by curr.id
i.e. a LAG/self-join hybrid: LAG is applied to a number column, which it doesn't complain about, to identify the join condition. It's fairly tidy, I think, and I get my collections as desired. Thanks to everyone for the extremely useful input.
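For anyone who wants to see the shape of that hybrid without an Oracle instance, here is a rough sketch. It is an assumption-heavy stand-in: it uses SQLite from Python (window functions need SQLite 3.25 or later), and a plain TEXT column takes the place of the varray, since SQLite has no collection types. The point is only the join pattern: LAG over the numeric id, then a self join to fetch the previous row's column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- TEXT stands in for the varray column; this is not an Oracle collection.
CREATE TABLE mytable (id INTEGER PRIMARY KEY, widgets TEXT);
INSERT INTO mytable VALUES (1, '1,2,3,4'), (2, '5,6,7'), (3, '');
""")

query = """
SELECT curr.id, curr.widgets, prev.widgets AS previous_widgets
FROM (SELECT a.id, a.widgets,
             LAG(a.id, 1) OVER (ORDER BY a.id) AS previous_id
      FROM mytable a) curr
LEFT JOIN mytable prev ON prev.id = curr.previous_id
ORDER BY curr.id
"""
lagged = conn.execute(query).fetchall()
for row in lagged:
    print(row)
```

The first row has no predecessor, so its previous_widgets comes back as NULL (None in Python).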
You can use LAG with a UDT. The problem is the varray.
Does this give you a result?
select t.id
,lag(
(select listagg(column_value, ',') within group (order by column_value)
from table(t.widgets))
,1) over (order by t.id) as widgets_previous
from MyTable t
order by t.id;
You could try something like:
SQL> create or replace type TestType as varray(20000) of number(18);
Type created.
SQL> create table TestTable (
id number(18) primary key
,widgets TestType
)
Table created.
SQL> delete from testtable
0 rows deleted.
SQL> insert into TestTable values (1, TestType(1,2,3,4))
1 row created.
SQL> insert into TestTable values (2, TestType(5,6,7))
1 row created.
SQL> insert into TestTable values (3, TestType())
1 row created.
SQL> insert into TestTable values (4,null)
1 row created.
SQL> commit
Commit complete.
SQL> -- show all data with widgets
SQL> select t.id, w.column_value as widget_ids
from testtable t, table(t.widgets) w
ID WIDGET_IDS
---------- ----------
1 1
1 2
1 3
1 4
2 5
2 6
2 7
7 rows selected.
SQL> -- show with lag function
SQL> select t.id, lag(w.column_value, 1) over (order by t.id) as widgets_previous
from testtable t, table(t.widgets) w
ID WIDGETS_PREVIOUS
---------- ----------------
1
1 1
1 2
1 3
2 4
2 5
2 6
7 rows selected.

Percentage by group - oracle

I have this sample.
What I need is an average per key, not per key and value. However, the syntax I used appears to give me the average per key and value.
select avg(value2),KEY,VALUE from testavg
GROUP BY key,value
order by key, value
Doing otherwise yields a syntax error. The results I need are as follows:
10 A 0.96
10 B 0.04
12 C 1
But the statement I used yields the incorrect results above.
Could this be achieved with a single Oracle SELECT statement? I have included the statements to create the entire table.
CREATE TABLE "TESTAVG"
( "KEY" NUMBER,
"VALUE" VARCHAR2(20 BYTE),
"VALUE2" NUMBER
)
Insert into TESTAVG (KEY,VALUE,VALUE2) values (10,'A',12);
Insert into TESTAVG (KEY,VALUE,VALUE2) values (10,'A',13);
Insert into TESTAVG (KEY,VALUE,VALUE2) values (10,'B',1);
Insert into TESTAVG (KEY,VALUE,VALUE2) values (12,'C',20);
This query might run faster on larger data - only reads the table once:
select distinct key, value,
sum(value2) over (partition by key, value) / sum(value2) over (partition by key) r
from testavg
/
KEY VALUE R
---------- -------------------- ----------
10 A .961538462
10 B .038461538
12 C 1
select avg(value2),KEY from testavg
GROUP BY key
order by key;
8.66666666666666666666666666666666666667 10
20 12
EDIT: Specs are still not clear but this might be what you need...
with gr1 as (select key,sum(value2) sumvalue
from testavg
group by key)
, gr2 as (select key,value,sum(value2) sumvalue
from testavg
GROUP BY key,value)
select gr1.key,gr2.value,gr2.sumvalue/gr1.sumvalue
from gr1
, gr2
where gr1.key = gr2.key;
10 B 0.0384615384615384615384615384615384615385
12 C 1
10 A 0.9615384615384615384615384615384615384615
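The arithmetic behind both answers is the same: the sum of VALUE2 for each (key, value) pair divided by the sum of VALUE2 for the key. A small sketch of that calculation follows, run against an in-memory SQLite database from Python as a stand-in for Oracle. The `1.0 *` factor is there because SQLite would otherwise do integer division; the column aliases group_sum, key_sum and share are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE testavg ("KEY" INT, "VALUE" TEXT, "VALUE2" INT);
INSERT INTO testavg VALUES (10,'A',12),(10,'A',13),(10,'B',1),(12,'C',20);
""")

query = """
SELECT g."KEY", g."VALUE", 1.0 * g.group_sum / k.key_sum AS share
FROM (SELECT "KEY", "VALUE", SUM("VALUE2") AS group_sum
      FROM testavg GROUP BY "KEY", "VALUE") g
JOIN (SELECT "KEY", SUM("VALUE2") AS key_sum
      FROM testavg GROUP BY "KEY") k
  ON k."KEY" = g."KEY"
ORDER BY g."KEY", g."VALUE"
"""
shares = conn.execute(query).fetchall()
for key, value, share in shares:
    # Key 10: A = (12 + 13) / 26, B = 1 / 26.  Key 12: C = 20 / 20.
    print(key, value, f"{share:.2f}")
```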

Fixing duplicate rows in a table

I have a table like the one below
DECLARE @ProductTotals TABLE
(
id int,
value nvarchar(50)
)
which has the following values
1, 'abc'
2, 'abc'
1, 'abc'
3, 'abc'
I want to update this table so that it has the following values
1, 'abc'
2, 'abc_1'
1, 'abc'
3, 'abc_2'
Could someone help me out with this?
Use a cursor to move over the table and try to insert every row in a second temporary table. If you get a collision (technically with a select), you can run a second query to get the maximum number (if any) that's appended to your item.
Once you know what maximum number is used (use isnull to cover the case of the first duplicate) just run an update over your original table and keep going with your scan.
Are you looking to remove duplicates, or just change the values so they aren't duplicates?
To change the values, use
update producttotals
set value = 'abc_1'
where id =2;
update producttotals
set value = 'abc_2'
where id =3;
To find duplicate rows, run
select id, value
from producttotals
group by id, value
having count(*) > 1;
Assuming SQL Server 2005 or greater (the multi-row VALUES insert below needs 2008 or later; on 2005, use one INSERT per row)
DECLARE @ProductTotals TABLE
(
id int,
value nvarchar(50)
)
INSERT INTO @ProductTotals
VALUES (1, 'abc'),
(2, 'abc'),
(1, 'abc'),
(3, 'abc')
;WITH CTE as
(SELECT
ROW_NUMBER() OVER (Partition by value order by id) rn,
id,
value
FROM
@ProductTotals),
new_values as (
SELECT
pt.id,
pt.value,
pt.value + '_' + CAST( ROW_NUMBER() OVER (partition by pt.value order by pt.id) as varchar) new_value
FROM
@ProductTotals pt
INNER JOIN CTE
ON pt.id = CTE.id
and pt.value = CTE.value
WHERE
pt.id NOT IN (SELECT id FROM CTE WHERE rn = 1)) --remove any with the lowest ID for the value
UPDATE
@ProductTotals
SET
pt.value = nv.new_value
FROM
@ProductTotals pt
inner join new_values nv
ON pt.id = nv.id and pt.value = nv.value
SELECT * FROM @ProductTotals
Will produce the following
id value
----------- --------------------------------------------------
1 abc
2 abc_1
1 abc
3 abc_2
Explanation of the SQL
The first CTE assigns a row number within each value, so the numbering restarts whenever it sees a new value.
rn id value
-------------------- ----------- --------
1 1 abc
2 1 abc
3 2 abc
4 3 abc
The second CTE, called new_values, ignores any IDs that are associated with an rn of 1. So rn 1 and rn 2 both get removed, because they share the same ID. It also uses ROW_NUMBER() again to determine the number for the new_value.
id value new_value
----------- ------ -------------
2 abc abc_1
3 abc abc_2
The final statement just updates the Old value with the new value
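If it helps to see the rule outside of SQL, here is a minimal sketch of the same idea in plain Python: for each value, the lowest id keeps the original value and every other id gets a numbered suffix in id order. The function name rename_duplicates is made up for the illustration, and numbering one suffix per distinct id is my assumption about the intended behaviour when a non-lowest id repeats.

```python
from collections import defaultdict

def rename_duplicates(rows):
    """rows is a list of (id, value); returns a new list with suffixed values."""
    ids_by_value = defaultdict(set)
    for row_id, value in rows:
        ids_by_value[value].add(row_id)

    # The lowest id per value keeps the value; the rest are numbered 1, 2, ...
    suffix = {}
    for value, ids in ids_by_value.items():
        for n, row_id in enumerate(sorted(ids)[1:], start=1):
            suffix[(row_id, value)] = n

    return [
        (row_id, f"{value}_{suffix[(row_id, value)]}")
        if (row_id, value) in suffix else (row_id, value)
        for row_id, value in rows
    ]

renamed = rename_duplicates([(1, "abc"), (2, "abc"), (1, "abc"), (3, "abc")])
print(renamed)  # [(1, 'abc'), (2, 'abc_1'), (1, 'abc'), (3, 'abc_2')]
```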

How do I get first unused ID in the table?

I have to write a query that allocates an ID (unique key) for a particular record, where that ID is not being used / has not been generated / does not exist in the database.
In short, I need to generate an ID for a particular record and show it on the print screen.
E.g.:
ID Name
1 abc
2 def
5 ghi
So it should return ID=3 as the next ID that has not been generated yet. After generating the ID, I will store this data back to the database table.
It's not homework: I am doing a project and have a requirement to write this query, so I need some help to achieve this.
So please guide me on how to write this query, or how to achieve this.
Thanks.
I am not able to add comments, so that's why I am writing my comments here.
I am using MySQL as the database.
My steps would be like this:
1) Retrieve an ID from the database table that is not being used.
2) There are a number of users (it is a website-based project), so I want no concurrency problems: if one ID is generated for one user, the database should be locked until that user receives the ID and stores the record for it. After that, another user can retrieve whichever ID does not exist. (Major requirement.)
How can I achieve all of this in MySQL? Also, I suppose Quassnoi's answer would do the job, but it is not working in MySQL. Please explain the query a bit, as it is new to me, and tell me whether it will work in MySQL.
I named your table unused.
SELECT id
FROM (
SELECT 1 AS id
) q1
WHERE NOT EXISTS
(
SELECT 1
FROM unused
WHERE id = 1
)
UNION ALL
SELECT *
FROM (
SELECT id + 1
FROM unused t
WHERE NOT EXISTS
(
SELECT 1
FROM unused ti
WHERE ti.id = t.id + 1
)
ORDER BY
id
LIMIT 1
) q2
ORDER BY
id
LIMIT 1
This query consists of two parts.
The first part:
SELECT *
FROM (
SELECT 1 AS id
) q
WHERE NOT EXISTS
(
SELECT 1
FROM unused
WHERE id = 1
)
selects 1 if there is no entry in the table with this id.
The second part:
SELECT *
FROM (
SELECT id + 1
FROM unused t
WHERE NOT EXISTS
(
SELECT 1
FROM unused ti
WHERE ti.id = t.id + 1
)
ORDER BY
id
LIMIT 1
) q2
selects id + 1 for the first id in the table that has no next id.
The resulting query selects the least of these two values.
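To make the two-part idea easy to try out, here is a minimal sketch that runs a condensed variant of it against an in-memory SQLite database from Python. It is not the MySQL query above verbatim: the two parts are wrapped in a derived table and MIN() picks the smaller candidate, and the helper name first_unused_id is made up. It also does nothing about concurrency, so two callers could still receive the same id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE unused (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO unused VALUES (1, 'abc'), (2, 'def'), (5, 'ghi');
""")

def first_unused_id(connection):
    # Part 1 offers 1 when id 1 is missing.
    # Part 2 offers id + 1 for every id that has no successor.
    # The smallest candidate is the first unused id.
    query = """
    SELECT MIN(candidate) FROM (
        SELECT 1 AS candidate
        WHERE NOT EXISTS (SELECT 1 FROM unused WHERE id = 1)
        UNION ALL
        SELECT t.id + 1
        FROM unused t
        WHERE NOT EXISTS (SELECT 1 FROM unused ti WHERE ti.id = t.id + 1)
    )
    """
    return connection.execute(query).fetchone()[0]

gap_id = first_unused_id(conn)
print(gap_id)  # 3

conn.execute("DELETE FROM unused WHERE id = 1")
start_id = first_unused_id(conn)
print(start_id)  # 1
```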
Depends on what you mean by "next id" and how it's generated.
If you're using a sequence or identity in the database to generate the id, it's possible that the "next id" is not 3 or 4 but 6 in the case you've presented. You have no way of knowing whether or not there were values with id of 3 or 4 that were subsequently deleted. Sequences and identities don't necessarily try to reclaim gaps; once they're gone you don't reuse them.
So the right thing to do is to create a sequence or identity column in your database that's automatically incremented when you do an INSERT, then SELECT the generated value.
The correct way is to use an identity column for the primary key. Don't try to look at the rows already inserted, and pick an unused value. The Id column should hold a number large enough that your application will never run out of valid new (higher) values.
In your description, if you are skipping values that you are trying to use later, then you are probably giving some meaning to the values. Please reconsider. You probably should only use this field as a lookup (a reference) value from another table.
Let the database engine assign the next higher value for your ID. If you have more than one process running concurrently, you will need to use LAST_INSERT_ID() function to determine the ID that the database generated for your row. You can use LAST_INSERT_ID() function within the same transaction before you commit.
Second best (but not good!) is to use the max value of the index field plus one. You would have to do a table lock to manage the concurrency issues.
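To illustrate the "let the database assign it" route, here is a small sketch using SQLite from Python as a stand-in for MySQL. The assumptions: AUTOINCREMENT plays the role of MySQL's AUTO_INCREMENT, cursor.lastrowid plays the role of LAST_INSERT_ID(), and the table name items and helper add_item are invented for the example. It also shows that a deleted id is not handed out again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)

def add_item(name):
    # The database picks the id; we read back what it generated.
    cur = conn.execute("INSERT INTO items (name) VALUES (?)", (name,))
    conn.commit()
    return cur.lastrowid

first = add_item("abc")
second = add_item("def")

# Remove the latest row, leaving a gap at the end of the sequence.
conn.execute("DELETE FROM items WHERE id = ?", (second,))
conn.commit()

third = add_item("ghi")
print(first, second, third)  # 1 2 3
```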
/*
This is a query script I wrote to illustrate my method, and it was created to solve a Real World problem where we have multiple machines at multiple stores creating transfer transactions in their own databases,
that are then synced to other databases on the store (this happens often, so getting the Nth free entry for the Nth machine should work) where the transferid is the PK and then those are synced daily to a MainFrame where the maximum size of the key (which is the TransactionID and StoreID) is limited.
*/
--- table variable declarations
/* list of used transaction ids (this is just for testing, it will be the view or table you are reading the transaction ids from when implemented)*/
DECLARE @SampleTransferIDSourceTable TABLE(TransferID INT)
/* Here we insert the used transaction numbers*/
DECLARE @WorkTable TABLE (WorkTableID INT IDENTITY (1,1), TransferID INT)
/*this is the same table as above with an extra column to help us identify the blocks of unused row numbers (modifying a table variable is not a good idea)*/
DECLARE @WorkTable2 TABLE (WorkTableID INT , TransferID INT, diff int)
--- Machine ID declared
DECLARE @MachineID INT
-- MachineID set
SET @MachineID = 5
-- put in some rows with different sized blocks of missing rows.
-- comment out the inserts after two to the bottom to see how it handles no gaps or make
-- the @MachineID very large to do the same.
-- comment out early rows to test how it handles starting gaps.
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 1 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 2 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 4 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 5 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 6 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 9 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 10 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 20 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 21 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 24 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 25 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 30 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 31 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 33 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 39 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 40 )
INSERT @SampleTransferIDSourceTable ( TransferID ) VALUES ( 50 )
-- copy the transaction ids into a table with an identity item.
-- When implemented add where clause before the order by to limit to the local StoreID
-- Zero row added so that it will find gaps before the lowest used row.
INSERT @WorkTable (TransferID)
SELECT 0
INSERT @WorkTable (TransferID)
SELECT TransferID FROM @SampleTransferIDSourceTable ORDER BY TransferID
-- copy that table to the new table with the diff column
INSERT @WorkTable2
SELECT WorkTableID,TransferID,TransferID - WorkTableID
FROM @WorkTable
--- gives us the (MachineID)th unused ID or the (MachineID)th id beyond the highest id used.
IF EXISTS (
SELECT Top 1
GapStart.TransferID + @MachineID - (GapStart.diff + 1)
FROM @WorkTable2 GapStart
INNER JOIN @WorkTable2 GapEnd
ON GapStart.WorkTableID = GapEnd.WorkTableID - 1
AND GapStart.diff < GapEnd.diff
AND gapEnd.diff >= (@MachineID - 1)
ORDER BY GapStart.TransferID
)
SELECT Top 1
GapStart.TransferID + @MachineID - (GapStart.diff + 1)
FROM @WorkTable2 GapStart
INNER JOIN @WorkTable2 GapEnd
ON GapStart.WorkTableID = GapEnd.WorkTableID - 1
AND GapStart.diff < GapEnd.diff
AND gapEnd.diff >= (@MachineID - 1)
ORDER BY GapStart.TransferID
ELSE
SELECT MAX(TransferID) + @MachineID FROM @SampleTransferIDSourceTable
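As a cross-check of what the script above is meant to compute, here is a minimal sketch of the same rule in plain Python: return the Nth unused id below the highest used id, or the highest id plus N when there are fewer than N gaps. The function name nth_free_id is made up, and the linear scan is chosen for clarity, not speed.

```python
def nth_free_id(used_ids, n):
    """Return the n-th unused positive id, or max(used) + n if gaps run out."""
    used = set(used_ids)
    highest = max(used, default=0)
    free_seen = 0
    # Walk the ids below the highest used one and count the gaps.
    for candidate in range(1, highest):
        if candidate not in used:
            free_seen += 1
            if free_seen == n:
                return candidate
    # Fewer than n gaps: fall back to the n-th id beyond the highest used id.
    return highest + n

sample = [1, 2, 4, 5, 6, 9, 10, 20, 21, 24, 25, 30, 31, 33, 39, 40, 50]
print(nth_free_id(sample, 5))     # 12 (free ids so far: 3, 7, 8, 11, 12)
print(nth_free_id(sample, 1000))  # 1050
```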
Should work under MySQL (using LIMIT rather than TOP, and joining the table to itself):
SELECT
T1.ID + 1 AS FREE_ID
FROM TABLE1 T1
LEFT JOIN TABLE1 T2 ON T2.ID = T1.ID + 1
WHERE T2.ID IS NULL
ORDER BY T1.ID
LIMIT 100
Are you allowed to have a utility table? If so, I would create a table like so:
CREATE TABLE number_helper (
n INT NOT NULL
,PRIMARY KEY(n)
);
Fill it with all positive 32-bit integers (assuming the id you need to generate is a positive 32-bit integer).
Then you can select like so (the helper table goes on the left of the join, so that numbers with no matching row survive):
SELECT MIN(h.n) as nextID
FROM number_helper h
LEFT JOIN my_table t ON t.ID = h.n
WHERE t.ID IS NULL
Haven't actually tested this but it should work.
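Here is a small sketch of the helper-table idea, run against an in-memory SQLite database from Python as a stand-in for MySQL. To keep it quick, the helper table only holds 1 to 100 rather than every 32-bit integer, and it is placed on the left side of the LEFT JOIN so that numbers with no matching row in my_table are the ones that come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE number_helper (n INT NOT NULL, PRIMARY KEY (n));
CREATE TABLE my_table (ID INT NOT NULL PRIMARY KEY, Name TEXT);
INSERT INTO my_table VALUES (1, 'abc'), (2, 'def'), (5, 'ghi');
""")

# Small helper range for the demo; a real table would cover the full id range.
conn.executemany(
    "INSERT INTO number_helper (n) VALUES (?)",
    [(n,) for n in range(1, 101)],
)

# Helper numbers with no matching row in my_table are the free ids.
next_id = conn.execute("""
SELECT MIN(h.n)
FROM number_helper h
LEFT JOIN my_table t ON t.ID = h.n
WHERE t.ID IS NULL
""").fetchone()[0]
print(next_id)  # 3
```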