Create a new column of value based on another column - sql

I am new to SQL and have been asked to create two new columns whose values are derived from another column, in Oracle SQL.
Here is what the data looks like: under each ID there is an IDseq identifying a sub-segment of that ID, each with a Start and an End place.
I need SQL to find the smallest IDseq under each ID and return the corresponding Start place, and likewise to find the largest IDseq under each ID and return the corresponding End place. Each unique ID has only one origin and one destination. I'd like to show these in two new columns, Origin and Dest, for each ID.
Really appreciate your help.

You can use a CASE statement, such as:
select
a.idseq, a.id, a.start, a.end,
case
when a.id = 'ABC' then 'X'
when a.id = 'BCD' then 'Q'
end as origin,
case
when a.id = 'ABC' then 'G'
when a.id = 'BCD' then 'Z'
end as dest
from
yourtablename a

I wrote this before seeing the Oracle tag. MySQL has issues with derived temp tables, which is why this uses so many of them; maybe you can avoid the extras in Oracle.
CREATE TEMPORARY TABLE tmp_sequence (
IDSeq INT NOT NULL AUTO_INCREMENT, range_id VARCHAR(3), range_start CHAR(1), range_end CHAR(1), origin CHAR(1), destination CHAR(1), PRIMARY KEY (IDSeq)
);
INSERT INTO tmp_sequence (range_id, range_start, range_end)
VALUES ('ABC', 'X', 'Y'), ('ABC', 'Y', 'H'), ('ABC', 'H','L'), ('ABC','L', 'G'),
('BCD','Q','D'), ('BCD','D','H'),('BCD','H','Z');
CREATE TEMPORARY TABLE tmp_min AS
SELECT MIN(IDSeq) min_id, range_id
FROM tmp_sequence
GROUP BY range_id;
CREATE TEMPORARY TABLE tmp_start AS
SELECT s.min_id, s.range_id, t.range_start
FROM tmp_sequence t
JOIN tmp_min s ON t.IDSeq = s.min_id
AND t.range_id = s.range_id;
UPDATE tmp_sequence t
JOIN tmp_start s ON t.range_id = s.range_id
SET origin = s.range_start;
CREATE TEMPORARY TABLE tmp_max AS
SELECT MAX(IDSeq) max_id, range_id
FROM tmp_sequence
GROUP BY range_id;
CREATE TEMPORARY TABLE tmp_end AS
SELECT s.max_id, s.range_id, t.range_end
FROM tmp_sequence t
JOIN tmp_max s ON t.IDSeq = s.max_id
AND t.range_id = s.range_id;
UPDATE tmp_sequence t
JOIN tmp_end s ON t.range_id = s.range_id
SET destination = s.range_end;
-- Inspect the result before cleaning up
SELECT * FROM tmp_sequence;
DROP TEMPORARY TABLE tmp_sequence;
DROP TEMPORARY TABLE tmp_min;
DROP TEMPORARY TABLE tmp_start;
DROP TEMPORARY TABLE tmp_max;
DROP TEMPORARY TABLE tmp_end;
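For comparison, the same Origin/Dest result can be produced in a single query by joining each row to the MIN and MAX IDSeq of its ID. The sketch below is only an illustration: it runs the SQL through Python's built-in sqlite3 module rather than Oracle, the table name segments and the columns start_place and end_place are made up, and the rows are the sample values from the MySQL script above. The SELECT uses only joins and GROUP BY, so it should carry over to Oracle with little change, where analytic functions such as FIRST_VALUE would be another route.

```python
import sqlite3

# In-memory database; "segments", "start_place" and "end_place" are
# hypothetical names, and the rows come from the MySQL script above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE segments (idseq INTEGER, id TEXT, start_place TEXT, end_place TEXT);
INSERT INTO segments VALUES
  (1,'ABC','X','Y'), (2,'ABC','Y','H'), (3,'ABC','H','L'), (4,'ABC','L','G'),
  (5,'BCD','Q','D'), (6,'BCD','D','H'), (7,'BCD','H','Z');
""")

query = """
SELECT s.idseq, s.id, s.start_place, s.end_place,
       f.start_place AS origin,   -- start of the lowest idseq for this id
       l.end_place   AS dest      -- end of the highest idseq for this id
FROM segments s
JOIN (SELECT id, MIN(idseq) AS lo, MAX(idseq) AS hi
      FROM segments
      GROUP BY id) b ON b.id = s.id
JOIN segments f ON f.id = b.id AND f.idseq = b.lo
JOIN segments l ON l.id = b.id AND l.idseq = b.hi
ORDER BY s.idseq
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every ABC row gets origin X and dest G, and every BCD row gets origin Q and dest Z, without any temporary tables.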

Related

Needing system defined function to select updated or unmatched new records from two tables

I have a live data table that holds the old values, and I am moving data from that live table into a new table. How can I find the records that were inserted or updated in the new table without using EXCEPT, CHECKSUM (BINARY_CHECKSUM) or a join? I am looking for a solution that uses a system-defined function.
The requirement is interesting, as the best solutions are to use EXCEPT or a FULL JOIN. What you are trying to do is referred to as a left anti semi join. Here's a good article about the topic.
Note this sample data and the solutions (note that my solution that does not use EXCEPT or a join is the last solution):
-- sample data
if object_id('dbo.orig') is not null drop table dbo.orig;
if object_id('dbo.new') is not null drop table dbo.new;
create table dbo.orig (someid int, col1 int, constraint uq_cl_orig unique (someid, col1));
create table dbo.new (someid int, col1 int, constraint uq_cl_new unique (someid, col1));
insert dbo.orig values (1,100),(2,110),(3,120),(4,2000);
insert dbo.new values (1,100),(2,110),(3,122),(5,999);
Here's the EXCEPT version
select someid
from
(
select * from dbo.new except
select * from dbo.orig
) n
union -- union "distinct"
select someid
from
(
select * from dbo.orig except
select * from dbo.new
) o;
Here's a FULL JOIN Solution which will also tell you if the record was removed, changed or added:
select
someid = isnull(n.someid, o.someid),
[status] =
case
when count(isnull(n.someid, o.someid)) > 1 then 'changed'
when max(n.col1) is null then 'removed' else 'added'
end
from dbo.new n
full join dbo.orig o
on n.col1=o.col1 and n.someid = o.someid
where n.col1 is null or o.col1 is null
group by isnull(n.someid, o.someid);
But because those efficient solutions are not an option, you will need to go with a NOT IN or NOT EXISTS subquery. And because it has to be a function, I am encapsulating the logic in one.
create function dbo.newOrChangedOrRemoved()
returns table as return
-- get the new records
select someid, [status] = 'new'
from dbo.new n
where n.someid not in (select someid from dbo.orig)
union all
-- get the removed records
select someid, 'removed'
from dbo.orig o
where o.someid not in (select someid from dbo.new)
union all
-- get the changed records
select someid, 'changed'
from dbo.orig o
where exists
(
select *
from dbo.new n
where o.someid = n.someid and o.col1 <> n.col1
);
Results:
someid      status
----------- -------
5           new
4           removed
3           changed
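One thing to watch with the NOT IN branches: if the subquery ever returns a NULL someid, NOT IN matches no rows at all, whereas NOT EXISTS is unaffected. The sketch below rewrites the three branches with NOT EXISTS and EXISTS only. It is an illustration rather than the function itself: it runs through Python's sqlite3 module instead of SQL Server, the tables are renamed orig_rows and new_rows (hypothetical names), and the data is the sample data above.

```python
import sqlite3

# orig_rows / new_rows are hypothetical stand-ins for dbo.orig / dbo.new.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orig_rows (someid INTEGER, col1 INTEGER, UNIQUE (someid, col1));
CREATE TABLE new_rows  (someid INTEGER, col1 INTEGER, UNIQUE (someid, col1));
INSERT INTO orig_rows VALUES (1,100),(2,110),(3,120),(4,2000);
INSERT INTO new_rows  VALUES (1,100),(2,110),(3,122),(5,999);
""")

query = """
-- new: present in new_rows, absent from orig_rows
SELECT n.someid, 'new' AS status
FROM new_rows n
WHERE NOT EXISTS (SELECT 1 FROM orig_rows o WHERE o.someid = n.someid)
UNION ALL
-- removed: present in orig_rows, absent from new_rows
SELECT o.someid, 'removed'
FROM orig_rows o
WHERE NOT EXISTS (SELECT 1 FROM new_rows n WHERE n.someid = o.someid)
UNION ALL
-- changed: same someid on both sides, different col1
SELECT o.someid, 'changed'
FROM orig_rows o
WHERE EXISTS (SELECT 1 FROM new_rows n
              WHERE n.someid = o.someid AND n.col1 <> o.col1)
"""
# Sort in Python so the output order does not depend on the engine.
status_rows = sorted(conn.execute(query).fetchall())
print(status_rows)
```

The three statuses match the results listed above (5 new, 4 removed, 3 changed), only in sorted order.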

SQL 'GROUP BY' to filter an array of 'text' data type

I am new to SQL and I am trying to understand the GROUP BY statement.
I have inserted the following data in SQL:
CREATE TABLE table( id integer, type text);
INSERT INTO table VALUES (1,'start');
INSERT INTO table VALUES (2,'start');
INSERT INTO table VALUES (2,'complete');
INSERT INTO table VALUES (3,'complete');
INSERT INTO table VALUES (3,'start');
INSERT INTO table VALUES (4,'start');
I want to select those IDs that do not have a type 'complete'. For this example I should get IDs 1, 4.
I have tried multiple GROUP BY - HAVING combinations. My best approach is:
SELECT id from customers group by type having type!='complete';
but the resulting IDs are 4,3,2.
Could anyone give me a hint about what I am doing wrong?
You are close. The having clause needs an aggregation function and you need to aggregate by id:
select id
from table t
group by id
having sum(case when type = 'complete' then 1 else 0 end) = 0;
Normally, if you have something called an id, you would also have a table with that as primary key. If so, you can also do:
select it.id
from idtable it
where not exists (select 1
from table t
where t.type = 'complete' and it.id = t.id
);
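To see why the conditional SUM works, it helps to look at the per-id count of 'complete' rows before filtering on it. The sketch below runs the question's data through Python's sqlite3 module. The table is called events here, a made-up name chosen because table is a reserved word; everything else is taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER, type TEXT);
INSERT INTO events VALUES
  (1,'start'), (2,'start'), (2,'complete'),
  (3,'complete'), (3,'start'), (4,'start');
""")

# Step 1: one row per id, with the number of 'complete' rows it has.
per_id = conn.execute("""
    SELECT id, SUM(CASE WHEN type = 'complete' THEN 1 ELSE 0 END) AS n_complete
    FROM events
    GROUP BY id
    ORDER BY id
""").fetchall()
print(per_id)

# Step 2: keep only the ids whose count is zero.
ids_without_complete = [row[0] for row in conn.execute("""
    SELECT id
    FROM events
    GROUP BY id
    HAVING SUM(CASE WHEN type = 'complete' THEN 1 ELSE 0 END) = 0
    ORDER BY id
""")]
print(ids_without_complete)
```

The first query shows ids 2 and 3 with one 'complete' row each and ids 1 and 4 with none, which is exactly the condition the HAVING clause filters on.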

Quick comparison of two columns in other TABLE

How can I quickly find the values in a column that do not appear in a column of another table?
The problem is the speed of the query, which is built dynamically and run with EXECUTE IMMEDIATE. The test tables are large: test_table has about 40 million rows and test_table2 about 1 million.
Unfortunately I haven't been able to find similar topics, and I will be grateful for any help.
My queries:
select pole2 from test_table tt
where exists( select 1 from test_table2 tt2
where tt2.pole1 = 'ABC'
and tt.pole2 != tt.pole2)
select pole2 from test_table tt
where pole2 not in ( select pole2 from test_table2 tt2
where tt2.pole1 = 'ABC')
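Note that the first query compares tt.pole2 with itself, so its inner condition can never be true. A NOT EXISTS anti-join is one way to express the intended lookup. The sketch below is only an illustration under assumptions: it runs on Python's sqlite3 module rather than Oracle, the sample rows are invented, and the composite index on (pole1, pole2) is a guess at what might help the lookup, not a measured result for tables of the sizes mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_table  (pole2 TEXT);
CREATE TABLE test_table2 (pole1 TEXT, pole2 TEXT);
-- Invented sample rows, just enough to show the behaviour.
INSERT INTO test_table  VALUES ('a'), ('b'), ('c'), ('d');
INSERT INTO test_table2 VALUES ('ABC','a'), ('ABC','c'), ('XYZ','b');
-- Assumption: an index covering both filter columns of the subquery.
CREATE INDEX ix_test_table2_pole1_pole2 ON test_table2 (pole1, pole2);
""")

# Rows of test_table whose pole2 has no 'ABC' match in test_table2.
missing = [row[0] for row in conn.execute("""
    SELECT tt.pole2
    FROM test_table tt
    WHERE NOT EXISTS (SELECT 1
                      FROM test_table2 tt2
                      WHERE tt2.pole1 = 'ABC'
                        AND tt2.pole2 = tt.pole2)
    ORDER BY tt.pole2
""")]
print(missing)
```

Here 'b' is returned even though it exists in test_table2, because its pole1 is 'XYZ' rather than 'ABC'.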

Insert values into table from the same table

Using SQL server (2012)
I have a table - TABLE_A with columns
(id, name, category, type, reference)
id is the primary key and is controlled by a separate table (table_ID) that holds the next available primary-key id. Insertions are usually made from the application side (Java), which takes care of updating this id to the next one after every insert (through EJBs, manually, etc.).
However,
I would like to write a stored procedure (called from the Java application) that
- finds records in this table where (for example) reference = 'AAA' (passed as a parameter)
- once multiple records are found (all with the same reference 'AAA'), INSERTs new records with new IDs and reference = 'BBB', with the other columns (name, category, type) the same as in the found records.
I am thinking of a query similar to this
INSERT INTO table_A
(ID
,NAME
,CATEGORY
,TYPE
,Reference)
VALUES
(
/* current_nextID */,
(select NAME
from TABLE_A
where REFERENCE in (/*query returning value 'AAA' */)),
(select CATEGORY
from TABLE_A
where REFERENCE in (/*query returning value 'AAA' */)),
(select TYPE
from TABLE_A
where REFERENCE in (/*query returning value 'AAA' */)),
'BBB - NEW REFERENCE VALUE BE USED'
)
Since I don't know how many records I will be inserting (that is, how many rows the criteria query
select /*field */
from TABLE_A
where REFERENCE in (/*query returning value 'AAA' */)
returns), I don't know how to come up with the ID value for each record. Can anyone suggest anything, please?
It's not clear from your question how sequencing is handled but you can do something like this
CREATE PROCEDURE copybyref(@ref VARCHAR(32)) AS
BEGIN
-- BEGIN TRANSACTION
-- copy every row matching @ref, numbering the copies from the stored counter
INSERT INTO tablea (id, name, category, type, reference)
SELECT value + rnum, name, category, type, 'BBB'
FROM
(
SELECT t.*, ROW_NUMBER() OVER (ORDER BY id) rnum
FROM tablea t
WHERE reference = @ref
) a CROSS JOIN
(
SELECT value
FROM sequence
WHERE table_id = 'tablea'
) s;
-- advance the counter past the ids just used
UPDATE sequence
SET value = value + @@ROWCOUNT + 1
WHERE table_id = 'tablea';
-- COMMIT TRANSACTION
END
Sample usage:
EXEC copybyref 'AAA';
Here is SQLFiddle demo
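The same INSERT ... SELECT idea can be tried end to end in a small script. The sketch below is an illustration, not a port of the procedure: it uses Python's sqlite3 module instead of SQL Server, the sample rows are invented, and it assumes its own convention that sequence.value holds the next free id, so the copies get value, value + 1, and so on, and the counter then advances by the number of rows copied. A correlated COUNT stands in for ROW_NUMBER() purely to keep the sketch independent of window-function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT, category TEXT,
                      type TEXT, reference TEXT);
CREATE TABLE sequence (table_id TEXT PRIMARY KEY, value INTEGER);
-- Invented sample rows; value = 4 is assumed to be the next free id.
INSERT INTO table_a VALUES (1,'n1','c1','t1','AAA'),
                           (2,'n2','c2','t2','AAA'),
                           (3,'n3','c3','t3','ZZZ');
INSERT INTO sequence VALUES ('table_a', 4);
""")

def copy_by_ref(conn, old_ref, new_ref):
    """Copy all rows with reference old_ref, giving the copies new ids."""
    with conn:  # both statements commit or roll back together
        cur = conn.execute("""
            INSERT INTO table_a (id, name, category, type, reference)
            SELECT s.value
                   + (SELECT COUNT(*) FROM table_a t2
                      WHERE t2.reference = :old AND t2.id < t.id),  -- 0-based rank
                   t.name, t.category, t.type, :new
            FROM table_a t
            CROSS JOIN sequence s
            WHERE t.reference = :old AND s.table_id = 'table_a'
        """, {"old": old_ref, "new": new_ref})
        copied = cur.rowcount
        # Advance the counter by the number of ids just consumed.
        conn.execute(
            "UPDATE sequence SET value = value + ? WHERE table_id = 'table_a'",
            (copied,))
    return copied

copied = copy_by_ref(conn, "AAA", "BBB")
new_rows = conn.execute(
    "SELECT * FROM table_a WHERE reference = 'BBB' ORDER BY id").fetchall()
next_id = conn.execute(
    "SELECT value FROM sequence WHERE table_id = 'table_a'").fetchone()[0]
print(copied, new_rows, next_id)
```

With concurrent callers, reading the counter and updating it would need to be serialized (for example with locking on the sequence row), which this single-connection sketch does not attempt.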

Tricky MS Access SQL query to remove surplus duplicate records

I have an Access table of the form (I'm simplifying it a bit)
ID AutoNumber Primary Key
SchemeName Text (50)
SchemeNumber Text (15)
This contains some data eg...
ID    SchemeName         SchemeNumber
--------------------------------------
714   Malcolm            ABC123
80    Malcolm            ABC123
96    Malcolms Scheme    ABC123
101   Malcolms Scheme    ABC123
98    Malcolms Scheme    DEF888
654   Another Scheme     BAR876
543   Whatever Scheme    KJL111
etc...
Now. I want to remove duplicate names under the same SchemeNumber. But I want to leave the record which has the longest SchemeName for that scheme number. If there are duplicate records with the same longest length then I just want to leave only one, say, the lowest ID (but any one will do really). From the above example I would want to delete IDs 714, 80 and 101 (to leave only 96).
I thought this would be relatively easy to achieve but it's turning into a bit of a nightmare! Thanks for any suggestions. I know I could loop it programmatically but I'd rather have a single DELETE query.
See if this query returns the rows you want to keep:
SELECT r.SchemeNumber, r.SchemeName, Min(r.ID) AS MinOfID
FROM
(SELECT
SchemeNumber,
SchemeName,
Len(SchemeName) AS name_length,
ID
FROM tblSchemes
) AS r
INNER JOIN
(SELECT
SchemeNumber,
Max(Len(SchemeName)) AS name_length
FROM tblSchemes
GROUP BY SchemeNumber
) AS w
ON
(r.SchemeNumber = w.SchemeNumber)
AND (r.name_length = w.name_length)
GROUP BY r.SchemeNumber, r.SchemeName
ORDER BY r.SchemeName;
If so, save it as qrySchemes2Keep. Then create a DELETE query to discard rows from tblSchemes whose ID value is not found in qrySchemes2Keep.
DELETE
FROM tblSchemes AS s
WHERE Not Exists (SELECT * FROM qrySchemes2Keep WHERE MinOfID = s.ID);
Just beware, if you later use Access' query designer to make changes to that DELETE query, it may "helpfully" convert the SQL to something like this:
DELETE s.*, Exists (SELECT * FROM qrySchemes2Keep WHERE MinOfID = s.ID)
FROM tblSchemes AS s
WHERE (((Exists (SELECT * FROM qrySchemes2Keep WHERE MinOfID = s.ID))=False));
DELETE FROM Table t1
WHERE EXISTS (SELECT 1 from Table t2
WHERE t1.SchemeNumber = t2.SchemeNumber
AND Length(t2.SchemeName) > Length(t1.SchemeName)
)
Depending on your RDBMS you may need a function other than Length (Oracle: LENGTH, MySQL: LENGTH, SQL Server: LEN). Note that this removes only the shorter names; rows that tie on the longest name are all kept.
delete ShortScheme
from Scheme ShortScheme
join Scheme LongScheme
on ShortScheme.SchemeNumber = LongScheme.SchemeNumber
and (len(ShortScheme.SchemeName) < len(LongScheme.SchemeName) or (len(ShortScheme.SchemeName) = len(LongScheme.SchemeName) and ShortScheme.ID > LongScheme.ID))
(SQL Server flavored)
Now updated to include the specified tie resolution. Although, you may get better performance doing it in two queries: first deleting the schemes with shorter names as in my original query and then going back and deleting the higher ID where there was a tie in name length.
I'd do this in multiple steps. Large delete operations done in a single step make me too nervous -- what if you make a mistake? There's no sql 'undo' statement.
-- Setup the data
DROP TABLE IF EXISTS foo;
DROP TABLE IF EXISTS bar;
DROP TABLE IF EXISTS bat;
DROP TABLE IF EXISTS baz;
CREATE TABLE foo (
id int(11) NOT NULL,
SchemeName varchar(50),
SchemeNumber varchar(15),
PRIMARY KEY (id)
);
insert into foo values (714, 'Malcolm', 'ABC123' );
insert into foo values (80, 'Malcolm', 'ABC123' );
insert into foo values (96, 'Malcolms Scheme', 'ABC123' );
insert into foo values (101, 'Malcolms Scheme', 'ABC123' );
insert into foo values (98, 'Malcolms Scheme', 'DEF888' );
insert into foo values (654, 'Another Scheme ', 'BAR876' );
insert into foo values (543, 'Whatever Scheme ', 'KJL111' );
-- Find all the records that have dups, find the longest one
create table bar as
select max(length(SchemeName)) as max_length, SchemeNumber
from foo
group by SchemeNumber
having count(*) > 1;
-- Find the one we want to keep
create table bat as
select min(a.id) as id, a.SchemeNumber
from foo a join bar b on a.SchemeNumber = b.SchemeNumber
and length(a.SchemeName) = b.max_length
group by a.SchemeNumber;
-- Select into this table all the rows to delete
create table baz as
select a.id from foo a join bat b on a.SchemeNumber = b.SchemeNumber
and a.id != b.id;
This will give you a new table, baz, containing only the IDs of the rows that you want to remove.
Now check it and make sure it contains only the rows you want deleted. That way, when you do the delete, you know exactly what to expect. It should also be pretty fast.
Then, when you're ready, use this command to delete the rows:
delete from foo where id in (select id from baz);
This seems like more work because of the extra tables, but it's safer and probably just as fast as the other ways. Plus you can stop at any step and make sure the data is what you want before you do any actual deletes.
If your platform supports ranking functions and common table expressions:
with cte as (
select row_number()
over (partition by SchemeNumber order by len(SchemeName) desc, ID) as rn -- ties on length keep the lowest ID
from Table)
delete from cte where rn > 1;
Try this:
Select * From Table t
Where Len(SchemeName) <
(Select Max(Len(Schemename))
From Table
Where SchemeNumber = t.SchemeNumber )
And Id >
(Select Min (Id)
From Table
Where SchemeNumber = t.SchemeNumber
And SchemeName = t.SchemeName)
Or this:
Select * From Table t
Where Id >
(Select Min(Id) From Table
Where SchemeNumber = t.SchemeNumber
And Len(SchemeName) <
(Select Max(Len(Schemename))
From Table
Where SchemeNumber = t.SchemeNumber))
If either of these selects the records that should be deleted, just change it to a delete:
Delete
From Table t
Where Len(SchemeName) <
(Select Max(Len(Schemename))
From Table
Where SchemeNumber = t.SchemeNumber )
And Id >
(Select Min (Id)
From Table
Where SchemeNumber = t.SchemeNumber
And SchemeName = t.SchemeName)
or using the second construction:
Delete From Table t Where Id >
(Select Min(Id) From Table
Where SchemeNumber = t.SchemeNumber
And Len(SchemeName) <
(Select Max(Len(Schemename))
From Table
Where SchemeNumber = t.SchemeNumber))
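Whichever DELETE you choose, it is worth running the matching SELECT first and comparing it with the IDs the question expects to lose (714, 80 and 101). The sketch below does that for one possible rule: a row is surplus if another row with the same SchemeNumber has a longer name, or an equally long name and a lower ID. It is an illustration under assumptions: it runs on Python's sqlite3 module rather than Access, the table name schemes is made up, and the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schemes (id INTEGER PRIMARY KEY, SchemeName TEXT, SchemeNumber TEXT);
INSERT INTO schemes VALUES
  (714,'Malcolm','ABC123'), (80,'Malcolm','ABC123'),
  (96,'Malcolms Scheme','ABC123'), (101,'Malcolms Scheme','ABC123'),
  (98,'Malcolms Scheme','DEF888'), (654,'Another Scheme','BAR876'),
  (543,'Whatever Scheme','KJL111');
""")

# Preview: a row is surplus when a "better" row exists for the same
# SchemeNumber (longer name, or same length and lower id).
surplus_sql = """
SELECT s.id
FROM schemes s
WHERE EXISTS (
    SELECT 1
    FROM schemes k
    WHERE k.SchemeNumber = s.SchemeNumber
      AND (LENGTH(k.SchemeName) > LENGTH(s.SchemeName)
           OR (LENGTH(k.SchemeName) = LENGTH(s.SchemeName) AND k.id < s.id))
)
ORDER BY s.id
"""
surplus = [row[0] for row in conn.execute(surplus_sql)]
print("to delete:", surplus)

# Delete only the ids that were just reviewed, in one transaction.
with conn:
    conn.executemany("DELETE FROM schemes WHERE id = ?", [(i,) for i in surplus])

kept = [row[0] for row in conn.execute("SELECT id FROM schemes ORDER BY id")]
print("kept:", kept)
```

Deleting by the reviewed list of IDs, rather than re-running the condition inside the DELETE, keeps the preview and the delete guaranteed to agree.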