Comparing 2 consecutive rows in a recordset - SQL

Currently, I have this objective to meet. I need to query the database for certain results and, after doing so, compare the records:
For example: the query returns 10 rows of records; I then need to compare row 1 with row 2, row 2 with row 3, row 3 with row 4 ... row 9 with row 10.
The final result I wish to have is 10 or fewer rows of records.
My current approach is to do this within a function, with two variables called "previous" and "current". In a loop I always compare previous and current, which I populate from the record set using a cursor.
After I get each row of filtered results, I insert it into a physical temporary table.
Once all the results are in this temporary table, I query that table, put the result into a cursor, and return the cursor.
The problem is: how can I avoid using a temporary table? I've searched online about using nested tables, but somehow I just could not get it working.
How can I replace the temp table with something else? Or is there another approach I can use to compare the columns of one row with those of other rows?
EDIT
So sorry, maybe I was not clear in my question. Here is a sample of the result that I am trying to achieve.
TABLE X
A    B    C   D
100  300  99  T1
100  300  98  T2
100  300  97  T3
100  100  97  T4
100  300  97  T5
101  11   11  T6
Column A is the primary key of the table being audited; it has duplicates here because table X is an audit table that keeps track of all changes. Column D acts as the timestamp for each record.
For my query, I am only interested in changes in columns A, B and D. After the query I would like to get the result below:
A    B    D
100  300  T1
100  100  T4
100  300  T5
101  11   T6

I think analytic functions might do what you want:
select col1, col2,
       lag(col1) over (order by col1, col2) LASTROWVALUE
from table1
This way, LASTROWVALUE will contain the value of col1 from the previous row, which you can compare directly to the col1 of the current row.
Look at this URL for more info: http://www.orafaq.com/node/55
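Applied to the TABLE X example above, a minimal sketch of that idea (assuming the audit table is simply called X and that column D gives the chronological order) which keeps only the rows where A or B changed compared to the previous row:
select a, b, d
from (select a, b, d,
             lag(a) over (order by d) prev_a,
             lag(b) over (order by d) prev_b
      from x)
where prev_a is null   -- the first row is always kept
   or a != prev_a      -- A changed
   or b != prev_b;     -- B changed
This should return the T1, T4, T5 and T6 rows from the expected result, with no temporary table and no cursor.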

SELECT ROW_NUMBER() OVER(ORDER BY <Some column name>) rn,
       Column1, <Some column name>, CompareColumn,
       LAG(CompareColumn) OVER(ORDER BY <Some column name>) PreviousValue,
       LEAD(CompareColumn) OVER(ORDER BY <Some column name>) NextValue,
       CASE
         WHEN CompareColumn != LEAD(CompareColumn) OVER(ORDER BY <Some column name>)
           THEN CompareColumn || '-->' || LEAD(CompareColumn) OVER(ORDER BY <Some column name>)
         WHEN CompareColumn = LAG(CompareColumn) OVER(ORDER BY <Some column name>)
           THEN 'NO CHANGE'
         ELSE 'false'
       END
FROM <table name>
You can adapt this logic inside a loop if you need to change the behaviour.

Hi, it's not very clear what exactly you want to accomplish, but maybe you can fetch the results of the original query into a PL/SQL collection and use that to do your comparison.
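A minimal sketch of that idea, assuming the data comes from a table called x with columns a, b and d (placeholder names only):
declare
  type t_rows is table of x%rowtype;
  l_rows t_rows;
begin
  select * bulk collect into l_rows
  from x
  order by d;
  for i in 2 .. l_rows.count loop
    -- l_rows(i - 1) is the previous row, l_rows(i) is the current row
    if l_rows(i).a != l_rows(i - 1).a or l_rows(i).b != l_rows(i - 1).b then
      null;  -- a change was detected: apply the comparison logic here
    end if;
  end loop;
end;
/
The collection lives only in memory for the duration of the block, so no physical temporary table is needed.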

What exactly are you doing the row comparison for? Are you looking to eliminate duplicates, or are you transforming the data into another form and then returning that?
To eliminate duplicates, look to use GROUP BY or DISTINCT functionality in your SELECT.
If you are iterating over the initial data and transforming it in some way, then it is hard to do without a temporary table. But what exactly is your problem with the temp table? If you are concerned about the performance of a cursor, then maybe you could do one outer SELECT that compares the results of two inner SELECTs; the trick is that the second SELECT is offset by one row, so you achieve the requirement of comparing row 1 against row 2, and so on (see the sketch below).
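A rough sketch of that offset-by-one idea, assuming the rows are ordered by a column d and that rn is just a generated row number (not a real column):
select cur.a, cur.b, cur.d,
       prev.a as prev_a, prev.b as prev_b
from (select x.*, row_number() over (order by d) rn from x) cur
left join (select x.*, row_number() over (order by d) rn from x) prev
  on prev.rn = cur.rn - 1;
Each row is joined to the one just before it, so the outer query can compare the two side by side.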

I think you are complicating things with the temp table.
This can be done with a cursor and two temporary variables.
Here is the pseudo code:
declare
  cursor my_cursor is select xyz from xyz;
  v_temp_a my_cursor%rowtype;
  v_temp_b my_cursor%rowtype;
  i number;
begin
  i := 1;
  for my_row in my_cursor loop
    if i = 1 then
      v_temp_a := my_row;
    else
      v_temp_b := v_temp_a;
      v_temp_a := my_row;
      /* at this point v_temp_b has the previous row and v_temp_a has the current row:
         compare them and put whatever logic you want here */
    end if;
    i := i + 1;
  end loop;
end;
/

Related

SQL Combining two different tables

(P.S. I am still learning SQL and you can consider me a newbie)
I have 2 sample tables as follows:
Table 1
|Profile_ID| |Img_Path|
Table 2
|Profile_ID| |UName| |Default_Title|
My scenario is: from the 2nd table, I need to fetch all the records that contain a certain word, for which I have the following query:
SELECT Profile_Id, UName
FROM Table2
WHERE CONTAINS(Default_Title, 'Test')
ORDER BY Profile_Id
OFFSET 5 ROWS
FETCH NEXT 20 ROWS ONLY
(Note that I am setting the OFFSET due to requirements.)
Now, the scenario is: as soon as I retrieve a record from the 2nd table, I need to fetch the record from the 1st table based on the Profile_Id.
So, I need to return the following 2 results in one single statement:
|Profile_Id| |Img_Path|
|Profile_Id| |UName|
And I need to return the results in side-by-side columns, like:
|Profile_Id| |Img_Path| |UName|
(Note I had to merge the 2 Profile_Id columns into one as they both contain the same data.)
I am still learning SQL and I am learning about UNION, JOIN etc., but I am a bit confused as to which way to go.
You can use join:
select t1.*, t2.UName
from table1 t1 join
     (select Profile_Id, UName
      from Table2
      where Contains(Default_Title, 'Test')
      order by Profile_Id
      offset 5 rows fetch next 20 rows only
     ) t2
     on t2.profile_id = t1.profile_id

SELECT a.Profile_Id, a.Img_Path, b.UName
FROM table1 a
INNER JOIN table2 b ON a.Profile_Id = b.Profile_Id
WHERE b.Default_Title = 'Test'

Updating one table from another table

I have created a new table for my use, let's say t1, which has 8 columns. I have populated 3 of the columns through a procedure. Column 1 is name. Now I want to populate the 4th column for each corresponding name. This would be an update with a where clause.
The scenario is: I have created a query whose result, call it t2, has name and total_amount. Now I want to populate total_amount into the 4th column of t1.
The approach I'm following right now is looping through each name in t1, finding its counterpart total_amount in t2 (a WITH clause), and updating the value in t1. But it is taking an extremely long time. First because of the looping over t1, and secondly because t2 is itself a query which is executed again and again.
Now, the actual task is much more complicated and I have just provided the crux of it. Please suggest an approach which is fast.
create or replace procedure proc
is
  temp_value number(18,2);

  CURSOR total IS
    select name, age, sex
    from data_table
    where {conditions};
  /* Gives me name and age in the 1st and 2nd columns and likewise data in the 3rd column */
begin
  FOR temp IN total LOOP
    with aa as (SELECT b.name,
                       NVL(SUM(c.amount), 0) as total_amount
                FROM data_table2 b, data_table3 c
                WHERE {joins and group by}
               )
    /* This gives me the total amount for the corresponding name. There is no repetition of names */
    select nvl(sum(total_amount), 0) into temp_value from aa where name = temp.name;

    update t1 set amount = temp_value where name = temp.name;
  END LOOP;
END;
/
Can't add a comment to the question, hence putting it here.
Per your example:
with aa as (SELECT b.name,
                   NVL(SUM(c.amount), 0) as total_amount
            FROM data_table2 b, data_table3 c
            WHERE {joins and group by}
           )
/* This gives me the total amount for the corresponding name. There is no repetition of names */
select nvl(sum(total_amount), 0) into temp_value from aa where name = temp.name;
update t1 set amount = temp_value where name = temp.name;
In your WITH clause you take a sum, and then you sum those sums again for every name in your cursor. Why can't you directly do:
SELECT SUM(NVL(total_amount, 0)) INTO temp_value FROM
data_tabl1, data_tabl2, data_tabl3
WHERE
--JOIN CONDITIONS
AND data_tabl1.name = --data_tabl2/3.name
GROUP BY --clause;
Why I say this is: a WITH clause is not always a good idea. If your WITH subquery produces a huge amount of data, it will run for a very long time. WITH is meant to factor out a repeatedly used subquery over a small data set that is joined again and again.
Also, for tuning purposes, try some hints.
Also, why NVL(SUM(...)) and not SUM(NVL(total_amount, 0))?
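If a single statement is acceptable, another option (not part of the original answer) is to compute every total once and update t1 in one pass with a MERGE. This is only a sketch; the {joins} placeholder stands for the real join conditions from the WITH clause:
merge into t1 t
using (select b.name,
              nvl(sum(c.amount), 0) as total_amount
       from data_table2 b, data_table3 c
       where {joins}          -- placeholder for the actual join conditions
       group by b.name) src
on (t.name = src.name)
when matched then
  update set t.amount = src.total_amount;
That way the aggregation runs once instead of once per row of the cursor.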

Updating one column with the value from another, based on another common column

I have a large (3 million rows) table of transactional data, which can be simplified thus:
ID File DOB
--------------------------
1 File1 01/01/1900
2 File1 03/10/1978
3 File1 03/10/1978
4 File2 15/07/1997
5 File2 01/01/1900
6 File2 15/07/1997
In some cases there is no real date (01/01/1900 acts as a placeholder). I would like to update the date field so it matches the other records for the same file which do have a date. So record 1's DOB would become 03/10/1978, because records 2 and 3 for that file have that date. Likewise, record 5 would become 15/07/1997.
What is the most efficient way to achieve this?
Thanks.
Supposing your table is called "Files", then this will work:
UPDATE f1 SET f1.DOB = f2.MaxDOB
FROM files f1
JOIN (SELECT [File], MAX(DOB) AS MaxDOB FROM files GROUP BY [File]) f2
  ON f2.[File] = f1.[File];
As far as performance is concerned, it probably won't get much more efficient than this, but you do need to ensure there is an index on the (File, DOB) column set. 3 million records is a lot, and this query will also update records that do not need it; filtering those out would require a more complex join. Anyway... you had better check the query plan.
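A sketch of that index, with an assumed name:
CREATE INDEX IX_files_File_DOB ON files ([File], DOB);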
I don't know about the most efficient way, but I can think of one solution: create a temp table with the following query. I am not sure about the exact keywords in SQL Server 2008, so this might work as-is or you may need to change keywords like to_date and its format.
create table new_table as (
  select file, min(DOB) as default_date, max(DOB) as fixed_date
  from three_million_table
  group by file
  having min(dob) = to_Date('01/01/1900','dd/mm/yyyy')
)
so your new table will have
column headers: file, default_date, fixed_date
values: File1, 01/01/1900, 03/10/1978
Now it may not be wise to run an update on three_million_table, but if you think it is OK then:
update T1
SET T1.DOB = T2.fixed_date
FROM three_million_table T1
INNER JOIN new_table T2
ON T1.file = T2.file
Hope this helps... having 3 million records will surely take its toll when updating the table by scanning each record.
;WITH testCTE ([name], dobir, number)
AS (SELECT [File], DOB,
           ROW_NUMBER() OVER (PARTITION BY [FILE], DOB ORDER BY (SELECT 0)) RowNumber
    FROM test)
UPDATE TEST
SET DOB = tcte.dobir
FROM testCTE AS tcte
LEFT JOIN TEST t ON tcte.name = t.[FILE]
WHERE tcte.number > 1 AND [FILE] = tcte.[name]
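Another variant (not taken from the answers above) is an updatable CTE with a window function, assuming the table is called files as in the first answer and that 01/01/1900 stands for a missing date:
WITH cte AS (
  SELECT DOB,
         MAX(DOB) OVER (PARTITION BY [File]) AS MaxDOB
  FROM files
)
UPDATE cte
SET DOB = MaxDOB
WHERE DOB < MaxDOB;
Because of the WHERE clause, only the rows whose DOB is below the file's maximum are touched.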

Using Recursive CTE on ORACLE to create a multicolumn filtering

I am currently using a VBA function to loop over an Oracle result set to eliminate duplicates on multiple columns/fields (i.e. only distinct values in each column). For example:
My result set is ordered by RECORD_ID, and I want to eliminate duplication in FIELD_1 and FIELD_2:
RECORD_ID FIELD_1 FIELD_2
1 A i
2 A j
3 B i
4 B k
5 C j
6 C k
7 D k
So my program creates a new table (say FINAL_TABLE) and evaluates every line of the original SQL result set (say TABLE_1):
IF the current value of TABLE_1.FIELD_1 is NOT in FINAL_TABLE.FIELD_1 AND the current value of TABLE_1.FIELD_2 is NOT in FINAL_TABLE.FIELD_2 THEN insert the record/row into FINAL_TABLE.
This results in
Column 1 Column 2 Column 3
1 A i
4 B k
5 C j
where there are only unique values in both columns 2 and 3.
I have tried looking into ways of moving away from loops into SQL with the LAG and pattern-matching functions but can't figure it out (I can't think of a way to use DISTINCT).
I have also looked at methods that create a table of possible combinations and then select from there, but this is unfeasible since only a couple of thousand rows of data would make the number of combinations too large for most computers to handle.
Bottom line: Can this logic be implemented through a recursive SQL query?
If you store the result set in a temporary table, then I think you can do this with delete:
delete from temp
where exists (select 1
from temp t2
where t2.id < temp.id and (t2.col2 = temp.col2 or t2.col3 = temp.col3)
);
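If the comparison must be made against the rows already kept (the greedy logic described in the question) rather than against all earlier rows of the result set, one way to move the loop out of VBA and into the database is a PL/SQL block with two associative arrays. A minimal sketch, assuming TABLE_1 and FINAL_TABLE both exist with the columns shown above:
declare
  type t_seen is table of boolean index by varchar2(100);
  l_seen_1 t_seen;
  l_seen_2 t_seen;
begin
  for r in (select record_id, field_1, field_2
            from table_1
            order by record_id) loop
    -- keep the row only if neither value has been kept before
    if not l_seen_1.exists(r.field_1) and not l_seen_2.exists(r.field_2) then
      insert into final_table (record_id, field_1, field_2)
      values (r.record_id, r.field_1, r.field_2);
      l_seen_1(r.field_1) := true;
      l_seen_2(r.field_2) := true;
    end if;
  end loop;
end;
/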

Check whether a table contains rows or not sql server 2005

How do I check whether a table contains rows or not in SQL Server 2005?
For what purpose?
Quickest for an IF would be IF EXISTS (SELECT * FROM Table) ...
For a result set, SELECT TOP 1 1 FROM Table returns either zero rows or one row.
For exactly one row with a count (0 or non-zero), use SELECT COUNT(*) FROM Table.
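Quick sketches of those three options, against a hypothetical table called MyTable:
IF EXISTS (SELECT * FROM MyTable)
    PRINT 'table has rows';                     -- quickest inside an IF
SELECT TOP 1 1 AS HasRows FROM MyTable;         -- returns zero rows or one row
SELECT COUNT(*) AS NumberOfRows FROM MyTable;   -- always returns exactly one row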
Also, you can use exists:
select case when exists (select 1 from table)
            then 'contains rows'
            else 'does not contain rows'
       end
or to check if there are child rows for a particular record:
select *
from Table t1
where exists (select 1
              from ChildTable t2
              where t1.id = t2.parentid)
or in a procedure:
if exists (select 1 from table)
begin
  -- do stuff
end
Like others said, you can use something like this:
IF NOT EXISTS (SELECT 1 FROM Table)
BEGIN
  -- Do something
END
ELSE
BEGIN
  -- Do another thing
END
For the best performance, use a specific column name instead of * - for example:
SELECT TOP 1 <columnName>
FROM <tableName>
This is optimal because, instead of returning the whole list of columns, it returns just one, which can save some time.
Also, returning just the first row, if there are any, makes it even faster: you get exactly one value if there are any rows, and no value if there are none.
If you use the table in a distributed manner, which is most probably the case, then transporting just one value from the server to the client is much faster.
You should also choose wisely among the columns and pick one that takes as few resources as possible.
Can't you just count the rows using select count(*) from table (or an indexed column instead of * if speed is important)?
If not then maybe this article can point you in the right direction.
Fast:
SELECT TOP (1) CASE
WHEN **NOT_NULL_COLUMN** IS NULL
THEN 'empty table'
ELSE 'not empty table'
END AS info
FROM **TABLE_NAME**