Using Recursive CTE on ORACLE to create a multicolumn filtering - sql

I am currently using a VBA function to loop over an Oracle result set to eliminate duplicates on multiple columns/fields (i.e. only distinct values in each column). For example:
My result set is ordered by RECORD_ID, and I want to eliminate FIELD_1 and FIELD_2 duplication:
RECORD_ID FIELD_1 FIELD_2
1 A i
2 A j
3 B i
4 B k
5 C j
6 C k
7 D k
So my program creates a new table (say FINAL_TABLE) and evaluates every line in the original sql resultset (say TABLE_1):
IF the current value of TABLE_1.FIELD_1 IS NOT in FINAL_TABLE.FIELD_1 AND the current value of TABLE_1.FIELD_2 IS NOT in FINAL_TABLE.FIELD_2 THEN insert record/row into FINAL_TABLE
This results in
Column 1 Column 2 Column 3
1 A i
4 B k
5 C j
where there are only unique values in both columns 2 and 3.
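For reference, the loop described above can be sketched in a few lines of Python (not the original VBA). The function name `filter_unique` and the in-memory lists are made up for illustration; the only assumption is that the rows arrive ordered by RECORD_ID.

```python
def filter_unique(rows):
    """Keep a row only if neither its FIELD_1 nor its FIELD_2 value
    already appears in a previously kept row."""
    seen_field_1 = set()
    seen_field_2 = set()
    kept = []
    for record_id, field_1, field_2 in rows:  # rows are assumed ordered by RECORD_ID
        if field_1 not in seen_field_1 and field_2 not in seen_field_2:
            kept.append((record_id, field_1, field_2))
            seen_field_1.add(field_1)
            seen_field_2.add(field_2)
    return kept


# Sample data from the question (TABLE_1)
TABLE_1 = [
    (1, "A", "i"),
    (2, "A", "j"),
    (3, "B", "i"),
    (4, "B", "k"),
    (5, "C", "j"),
    (6, "C", "k"),
    (7, "D", "k"),
]

FINAL_TABLE = filter_unique(TABLE_1)
print(FINAL_TABLE)
```

Note that each decision depends on which earlier rows were kept, not on which earlier rows exist, which is what makes this hard to express as a plain set-based query.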
I have tried looking into a way of moving away from loops into SQL with the LAG and PATTERN MATCHING functions but can't figure it out (and I can't think of a way to use DISTINCT).
I have also looked at methods that create a table of possible combinations and then select from there but this is unfeasible since only a couple of thousand rows of data would make the number of combinations too large for most computers to handle.
Bottom line: Can this logic be implemented through a recursive SQL query?

If you store the result set in a temporary table, then I think you can do this with delete:
delete from temp
    where exists (select 1
                  from temp t2
                  where t2.id < temp.id and (t2.col2 = temp.col2 or t2.col3 = temp.col3)
                 );

Related

SQL Combining two different tables

(P.S. I am still learning SQL and you can consider me a newbie)
I have 2 sample tables as follows:
Table 1
|Profile_ID| |Img_Path|
Table 2
|Profile_ID| |UName| |Default_Title|
My scenario is: from the 2nd table, I need to fetch all the records that contain a certain word, for which I have the following query:
Select Profile_Id,UName from
Table2 Where
Contains(Default_Title, 'Test')
ORDER BY Profile_Id
OFFSET 5 ROWS
FETCH NEXT 20 ROWS ONLY
(Note that I am setting the OFFSET due to requirements.)
Now, as soon as I retrieve a record from the 2nd table, I need to fetch the matching record from the 1st table based on the Profile_Id.
So, I need to return the following 2 results in one single statement:
|Profile_Id| |Img_Path|
|Profile_Id| |UName|
And I need to return the results in side-by-side columns, like:
|Profile_Id| |Img_Path| |UName|
(Note that I had to merge the 2 Profile_Id columns into one, as they both contain the same data.)
I am still learning SQL, including Union, Join etc., but I am a bit confused as to which way to go.
You can use join:
select t1.*, t2.UName
from table1 t1 join
     (select Profile_Id, UName
      from Table2
      where Contains(Default_Title, 'Test')
      order by Profile_Id
      offset 5 rows fetch next 20 rows only
     ) t2
     on t2.profile_id = t1.profile_id
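To try the shape of that join without a SQL Server instance, here is a rough sketch using Python's built-in sqlite3 module. Several things are swapped in and are assumptions, not part of the answer: SQLite has no Contains(), so LIKE '%Test%' stands in for it; OFFSET ... FETCH is written as LIMIT ... OFFSET; the page size is shrunk to 2 rows after skipping 1; and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Profile_Id INTEGER PRIMARY KEY, Img_Path TEXT);
    CREATE TABLE Table2 (Profile_Id INTEGER PRIMARY KEY, UName TEXT, Default_Title TEXT);
""")

# Invented sample data: even-numbered profiles have 'Test' in their title.
for pid in range(1, 11):
    conn.execute("INSERT INTO Table1 VALUES (?, ?)", (pid, f"/img/{pid}.png"))
    title = "Test title" if pid % 2 == 0 else "Other"
    conn.execute("INSERT INTO Table2 VALUES (?, ?, ?)", (pid, f"user{pid}", title))

# The subquery filters and pages Table2 first; the join then adds Img_Path.
rows = conn.execute("""
    SELECT t1.Profile_Id, t1.Img_Path, t2.UName
    FROM Table1 AS t1
    JOIN (SELECT Profile_Id, UName
          FROM Table2
          WHERE Default_Title LIKE '%Test%'
          ORDER BY Profile_Id
          LIMIT 2 OFFSET 1) AS t2
      ON t2.Profile_Id = t1.Profile_Id
    ORDER BY t1.Profile_Id
""").fetchall()

print(rows)
```

Paging inside the subquery means the offset applies to the filtered Table2 rows, and each surviving row picks up exactly one Img_Path from Table1.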
SELECT a.Profile_Id, a.Img_Path, b.UName
FROM table1 a INNER JOIN table2 b ON a.Profile_Id=b.Profile_Id
WHERE b.Default_Title = 'Test'

How do you do an IN query that has multiple columns in sqlite

I have a need to get columns for specific rows in the database, identified by more than one column. I'd like to do this in batches using an IN query.
In the single column case, it's easy:
SELECT id FROM foo WHERE a IN (1,2,3,4)
But I'm getting a syntax error when I try multi columns
SELECT id FROM foo WHERE (a,b) IN ((1,2), (3,4), (5,6))
Is there any way to do this? I can't just do two IN clauses because it potentially returns extra rows and also doesn't use the multi-column index as well.
This is Declan_K's idea:
To be able to use multi-column indexes without having to maintain a separate column, you have to put the lookup values into a table which you can join:
CREATE TEMPORARY TABLE lookup(a, b);
INSERT INTO lookup VALUES ...;
CREATE INDEX lookup_a_b ON lookup(a, b);
SELECT id FROM foo JOIN lookup ON foo.a = lookup.a AND foo.b = lookup.b;
(The index is not really necessary; if you omit it, the SQLite query optimizer will be forced to look up the lookup records in the foo table, instead of doing it the other way around.)
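As a minimal sketch of the temporary-table approach from a client program, the snippet below uses Python's sqlite3 module and fills the lookup table with executemany. The sample rows in foo and the index name foo_a_b are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
conn.execute("CREATE INDEX foo_a_b ON foo(a, b)")
conn.executemany(
    "INSERT INTO foo (id, a, b) VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 4), (3, 3, 4), (4, 3, 2), (5, 5, 6)],
)

# The (a, b) pairs we want to look up in one batch.
pairs = [(1, 2), (3, 4), (5, 6)]

conn.execute("CREATE TEMPORARY TABLE lookup(a, b)")
conn.executemany("INSERT INTO lookup VALUES (?, ?)", pairs)

ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM foo "
        "JOIN lookup ON foo.a = lookup.a AND foo.b = lookup.b "
        "ORDER BY id"
    )
]
print(ids)
```

Rows 2 and 4 (a=1, b=4 and a=3, b=2) are the "extra rows" that two independent IN clauses would have returned; the join on both columns leaves them out.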
If you want to avoid the temporary table, you could construct it dynamically using a subquery:
SELECT id
FROM foo
JOIN (SELECT 1 AS a, 2 AS b UNION ALL
SELECT 3 , 4 UNION ALL
SELECT 5 , 6 ) AS lookup
ON foo.a = lookup.a
AND foo.b = lookup.b
This will still be able to use an index on foo(a,b).
This command should use an index on (a, b):
SELECT id FROM foo WHERE a = 1 and b = 2
UNION ALL
SELECT id FROM foo WHERE a = 3 and b = 4
UNION ALL
SELECT id FROM foo WHERE a = 5 and b = 6
It's tedious to write if you have a lengthy list of pairs, but that can be automated.
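One way the automation could look, sketched in Python with sqlite3: the helper name `build_union_query` is hypothetical, and the sample rows are invented. It assumes a non-empty list of pairs and binds the values as parameters instead of pasting them into the SQL text.

```python
import sqlite3


def build_union_query(pairs):
    """Return (sql, params) with one UNION ALL branch per (a, b) pair.
    Assumes pairs is non-empty."""
    branch = "SELECT id FROM foo WHERE a = ? AND b = ?"
    sql = "\nUNION ALL\n".join([branch] * len(pairs))
    params = [value for pair in pairs for value in pair]  # flatten to a, b, a, b, ...
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
conn.execute("CREATE INDEX foo_a_b ON foo(a, b)")
conn.executemany(
    "INSERT INTO foo (id, a, b) VALUES (?, ?, ?)",
    [(1, 1, 2), (2, 1, 4), (3, 3, 4), (4, 3, 2), (5, 5, 6)],
)

sql, params = build_union_query([(1, 2), (3, 4), (5, 6)])
ids = [row[0] for row in conn.execute(sql, params)]
print(sql)
print(sorted(ids))
```

As far as I know, SQLite limits the number of terms in a compound SELECT (500 by default) and the number of bound parameters, so a very long list would need to be split into batches.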
What I think you really want is another (string) column that contains the Key/Value pairs, so that you can query the combination specifically. Something like:
SELECT id FROM foo WHERE c IN ('1,2', '3,4', '5,6')
You can populate the new column with something like:
UPDATE foo SET c = a||','||b
You may need to convert a and b to strings but when that is done, concatenation will be the answer.
SELECT id FROM foo WHERE a||'_'||b IN ('1_2', '3_4', '5_6')

oracle transpose based on column values

I know transpose (or pivot) w/ sql is a common ask, but I haven't been able to get to exactly what I'm trying to do on stack/google.
In short, I want case when/then without hardcoding all possible values of a column because these values may be numerous and/or change over time. For example,
id col val
1 a 65
1 b 34
1 c 25
2 a 67
2 c 22
...
The goal is to wind up with a single row for each distinct id, with a column for each distinct value of col.
Easy enough when the values of col are static and small, but when there are dozens of such values hardcoding every possible clause in a case statement seems arduous.
In pseudo code, what I want to do is
select
for each attr in (select distinct col from table)
sum(case when col = attr then val end) as transposed_attr,
end for
from table
group by id
But I'm inexperienced with PL/SQL, so I don't know how to achieve this in Oracle.
Advice?
What version of Oracle? 11g introduces the pivot command...
In fact, just look here for examples both with and without the PIVOT command:
http://orafaq.com/wiki/PIVOT
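If hardcoding the values is the main objection, the "for each attr" loop from the question's pseudo code can also be done by generating the statement in client code. The sketch below is only an illustration of that idea in Python against SQLite, not Oracle: the table name t, the helper `quote_ident`, and the transposed_ column prefix are assumptions, and the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, col TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "a", 65), (1, "b", 34), (1, "c", 25), (2, "a", 67), (2, "c", 22)],
)


def quote_ident(name):
    """Quote a generated column alias so odd values cannot break the SQL text."""
    return '"' + name.replace('"', '""') + '"'


# Step 1: discover the distinct values that should become columns.
attrs = [row[0] for row in conn.execute("SELECT DISTINCT col FROM t ORDER BY col")]

# Step 2: emit one conditional aggregate per value; the value itself is bound
# as a parameter, only the alias is spliced into the statement.
select_list = ", ".join(
    f"SUM(CASE WHEN col = ? THEN val END) AS {quote_ident('transposed_' + attr)}"
    for attr in attrs
)
sql = f"SELECT id, {select_list} FROM t GROUP BY id ORDER BY id"

rows = conn.execute(sql, attrs).fetchall()
print(sql)
print(rows)
```

The column list is fixed at the moment the statement is generated, so the query has to be rebuilt whenever new values of col appear.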

MS SQL - Joining on two tables with a substringed key in one column

I have 2 tables I need to join; however, on one of the tables I need to extract a key from a varchar field in each row.
Table 1 Description (numeric 18,varchar 4000)
descriptionid description
1 Blah Blah: Queue 1Blah Blah
2 foobar:Queue 2
3 rem:Queue 2 -This is a note
4 Anotherrow: Queue 3
5 Something else
Table 2 Queue - (numeric 18, varchar 100)
queueid queue
123 Queue 1
124 Queue 2
127 Queue 3
129 Queue 4
So I need to produce the output like so
View 3 Queue-Description (numeric 18, numeric 18)
descriptionid queueid
1 123
2 124
3 124
4 127
5 null
So in table 1 row 1, I need to strip out the value Queue 1 from the description, verify it is in the queue table, and look up the queueid.
I am unable to change the structure of tables 1 and 2.
What ways can this be achieved in MSSQL?
What is the most efficient way to do this in SQL - using MSSQL 2005 here.
"most efficient way"
Well... don't know about that but it is a way.
select T1.descriptionid,
T2.queueid
from Table1 as T1
left outer join Table2 as T2
on T1.description like '%'+T2.queue+'%'
Another way
select T1.descriptionid,
T2.queueid
from Table1 as T1
left outer join Table2 as T2
on charindex(T2.queue, T1.description, 1) > 0
If there is more than one match (see comment by Ed Harper) you can use this to pick the one with the longest match.
select T1.descriptionid,
T2.queueid
from Table1 as T1
outer apply (
select top 1 T3.queueid
from Table2 as T3
where charindex(T3.queue, T1.description, 1) > 0
order by len(T3.queue) desc
) as T2(queueid)
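The longest-match query can be tried out without SQL Server using Python's sqlite3 module. This is a translation under assumptions, not the answer's exact code: SQLite's instr() and length() stand in for charindex() and len(), and a correlated scalar subquery with LIMIT 1 stands in for OUTER APPLY with TOP 1. The data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (descriptionid INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE Table2 (queueid INTEGER PRIMARY KEY, queue TEXT);
""")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [
        (1, "Blah Blah: Queue 1Blah Blah"),
        (2, "foobar:Queue 2"),
        (3, "rem:Queue 2 -This is a note"),
        (4, "Anotherrow: Queue 3"),
        (5, "Something else"),
    ],
)
conn.executemany(
    "INSERT INTO Table2 VALUES (?, ?)",
    [(123, "Queue 1"), (124, "Queue 2"), (127, "Queue 3"), (129, "Queue 4")],
)

# For each description, pick the queue whose name occurs in it,
# preferring the longest name; no match yields NULL (None in Python).
rows = conn.execute("""
    SELECT T1.descriptionid,
           (SELECT T3.queueid
            FROM Table2 AS T3
            WHERE instr(T1.description, T3.queue) > 0
            ORDER BY length(T3.queue) DESC
            LIMIT 1) AS queueid
    FROM Table1 AS T1
    ORDER BY T1.descriptionid
""").fetchall()

print(rows)
```

Ordering by name length matters once names overlap, for example a hypothetical "Queue 12" next to "Queue 1": the longer name wins when both occur in the description.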
The most efficient way to do this is to add an extra column to your table and store the extracted ID from the string in it. You can do this when rows are added, and you can process the existing ones fairly easily. But trying to left join like this will be very slow.
In Sql Server 2005 you can extract your queue string using regex. The Data Extraction section on this page contains an example.
In a stored procedure you can then build an indexed temp table that contains a new column (this allows you to do this without changing the table metadata).
If you can change the table metadata you can:
Trigger the content into another column (on insert).
Or if the information is not needed immediately a daily sql job could extract the information.

comparing 2 consecutive rows in a recordset

Currently, I have this objective to meet. I need to query the database for certain results. After doing so, I need to compare the records:
For example: the query returns 10 rows, and I then need to compare row 1 with 2, row 2 with 3, row 3 with 4 ... row 9 with 10.
The final result that I wish to have is 10 rows or fewer.
I have one approach currently. I do this within a function, and have variables called "previous" and "current". In a loop I always compare previous and current, which I populate from the record set using a cursor.
After I get each filtered row, I insert it into a physical temporary table.
After all the results are in this temporary table, I query it, put the result into a cursor, and return the cursor.
The problem is: how can I avoid using a temporary table? I've searched online about using nested tables, but somehow I just could not get it working.
How can I replace the temp table with something else? Or is there another approach that I can use to compare a row's columns with other rows?
EDIT
So sorry, maybe I am not clear with my question. Here is a sample of the result that I am trying to achieve.
TABLE X
Column A B C D
100 300 99 T1
100 300 98 T2
100 300 97 T3
100 100 97 T4
100 300 97 T5
101 11 11 T6
ColumnA is the primary key of the table. ColumnA has duplicates because table X is an audit table that keeps track of all changes. Column D acts as the timestamp for that record.
For my query, I am only interested in changes in columns A, B and D. After the query I would like to get the result as below:
Column A B D
100 300 T1
100 100 T4
100 300 T5
101 11 T6
I think Analytics might do what you want :
select col1, col2, lag(col1) over (order by col1, col2) LASTROWVALUE
from table1
This way, LASTROWVALUE will contain the value of col1 from the previous row, which you can compare directly to col1 of the current row.
See this URL for more info: http://www.orafaq.com/node/55
SELECT ROW_NUMBER() OVER(ORDER BY <Some column name>) rn,
Column1, <Some column name>, CompareColumn,
LAG(CompareColumn) OVER(ORDER BY <Some column name>) PreviousValue,
LEAD(CompareColumn) OVER(ORDER BY <Some column name>) NextValue,
case
when CompareColumn != LEAD(CompareColumn) OVER(ORDER BY <Some column name>) then CompareColumn||'-->'||LEAD(CompareColumn) OVER(ORDER BY <Some column name>)
when CompareColumn = LAG(CompareColumn) OVER(ORDER BY <Some column name>) then 'NO CHANGE'
else 'false'
end
FROM <table name>
You can use this logic in a loop to change behaviour.
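Applied to the sample audit table from the question, the LAG idea can be tested with Python's sqlite3 module. This is a sketch under assumptions: it runs on SQLite (window functions need SQLite 3.25 or newer) rather than Oracle, the table is named x, "changed" is taken to mean that column B differs from the previous row with the same A when ordered by D, and B is assumed to be non-null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (a INTEGER, b INTEGER, c INTEGER, d TEXT)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?, ?)",
    [
        (100, 300, 99, "T1"),
        (100, 300, 98, "T2"),
        (100, 300, 97, "T3"),
        (100, 100, 97, "T4"),
        (100, 300, 97, "T5"),
        (101, 11, 11, "T6"),
    ],
)

# prev_b is B from the previous audit row of the same A; a row is kept when
# it is the first row for that A or when B changed since the previous row.
rows = conn.execute("""
    SELECT a, b, d
    FROM (SELECT a, b, d,
                 LAG(b) OVER (PARTITION BY a ORDER BY d) AS prev_b
          FROM x)
    WHERE prev_b IS NULL OR prev_b <> b
    ORDER BY d
""").fetchall()

print(rows)
```

The comparison happens inside a single query, so neither a loop nor a temporary table is involved.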
Hi. It's not very clear what exactly you want to accomplish, but maybe you can fetch the results of the original query into a PL/SQL collection and use that to do your comparison.
What exactly are you doing the row comparison for? Are you looking to eliminate duplicates, or are you transforming the data into another form and then returning that?
To eliminate duplicates, look to use GROUP BY or DISTINCT functionality in your SELECT.
If you are iterating over the initial data and transforming it in some way then it is hard to do it without using a temporary table - but what exactly is your problem with the temp table? If you are concerned about the performance of a cursor then maybe you could do one outer SELECT that compares the results of two inner SELECTs - but the trick is that the second SELECT is offset by one row, so you achieve the requirement of comparing row 1 against row2, etc.
I think you are complicating things with the temp table.
It can be made using a cursor and 2 temporary variables.
Here is the pseudo code:
declare
  -- the cursor must be declared before the variables that refer to its row type
  cursor my_cursor is select xyz from xyz;
  v_temp_a my_cursor%rowtype;  -- current row
  v_temp_b my_cursor%rowtype;  -- previous row
  i number;
begin
  i := 1;
  for my_row in my_cursor loop
    if i = 1 then
      v_temp_a := my_row;
    else
      v_temp_b := v_temp_a;
      v_temp_a := my_row;
      /* at this point v_temp_b has the previous row and v_temp_a has the current row;
         compare them and put whatever logic you want */
    end if;
    i := i + 1;
  end loop;
end;