How to make row data into source and target (zigzag) using Hive or Pig - SQL

Input
id,name,time
1,home,10:20
1,product,10:21
1,mobile,10:22
2,id,10:24
2,bag,10:30
2,home,10:21
3,keyboard,10:32
3,home,10:33
3,welcome,10:36
I want to make the name column into source and target pairs, as shown in the output below.
Earlier I tried this with Pig. The steps were:
a = load --> b = sort asc --> c = sort desc --> then join the data
I got output like this:
(1,home,10:20,1,product,10:21)
(2,bag,10:30,2,id,10:24)
(3,home,10:32,3,welcome,10:36)
output
1,home,product
1,product,mobile
2,id,bag
2,bag,home
3,keyboard,home
3,home,welcome

In Hive (and in Spark SQL), you can use the window function LEAD:
with t as (
    select id, name, lead(name) over (partition by id order by time) as zigzag
    from table
)
select * from t where t.zigzag is not null
This should give you the output:
+---+--------+-------+
| id| name| zigzag|
+---+--------+-------+
| 1| home|product|
| 1| product| mobile|
| 2| bag| home|
| 2| home| id|
| 3|keyboard| home|
| 3| home|welcome|
+---+--------+-------+
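If you prefer to stay closer to the join-based approach from your Pig attempt, the same pairing can be built with a self-join on row numbers. A minimal sketch, assuming the data lives in a table named clicks(id, name, time):
-- number the rows per id by time, then join each row to the next one
WITH ranked AS (
    SELECT id, name,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY time) AS rn
    FROM clicks
)
SELECT a.id, a.name AS source, b.name AS target
FROM ranked a
JOIN ranked b
    ON a.id = b.id
   AND b.rn = a.rn + 1;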

Related

Running total between two dates SQL

I have a problem building an efficient query to get a running total of sales between two dates.
This is the query I have now:
select SalesId,
sum(Sales) as number_of_sales,
Sales_DATE as SalesDate,
ADD_MONTHS(Sales_DATE , -12) as SalesDatePrevYear
from DWH.L_SALES
group by SalesId, Sales_DATE
With the result:
| SalesId| number_of_sales| SalesDate|SalesDatePrevYear|
|:---- |:------:| :-----:|-----:|
| 1000| 1| 20200101|20190101|
| 1001| 1| 20220101|20210101|
| 1002| 1| 20220201|20210201|
| 1003| 1| 20220301|20210301|
The preferred result is the following:
| SalesId| number_of_sales| running total of sales | SalesDate|SalesDatePrevYear|
|:---- |:------:| :-----:| :-----:|-----:|
| 1000| 1| 1 | 20200101|20190101|
| 1001| 1| 1 | 20220101|20210101|
| 1002| 1| 2| 20220201|20210201|
| 1003| 1| 3|20220301|20210301|
As you can see, I want the total of Sales between the two dates, but because I also need the lower level (SalesId), it always stays at 1.
How can I get this efficiently?
You have already produced a result that gives you the start and end dates you care about, so you just need to join it back to the original data with an inequality join and then sum the results. I suggest using CTEs (Common Table Expressions), which are helpful for learning and debugging.
For example,
WITH CTE_BASE_RESULT AS
(
    your query goes here
)
SELECT
    CTE_BASE_RESULT.SalesId,
    CTE_BASE_RESULT.SalesDate,
    SUM(Sales) AS Total_Sales_Prior_Year
FROM CTE_BASE_RESULT
INNER JOIN DWH.L_Sales
    ON  CTE_BASE_RESULT.SalesId = L_Sales.SalesId
    AND CTE_BASE_RESULT.SalesDate >= L_Sales.SalesDATE
    AND CTE_BASE_RESULT.SalesDatePrevYear <= L_Sales.SalesDATE
GROUP BY CTE_BASE_RESULT.SalesId, CTE_BASE_RESULT.SalesDate
I also recommend a website like SQL Generator that can help write complex operations; for example, this pattern is called a Timeseries Aggregate.
This syntax works for Snowflake; I didn't see which system you're on.
Alternatively,
WITH BASIC_OFFSET_1YEAR AS (
    SELECT
        A.Sales_Id,
        A.SalesDate,
        SUM(B.Sales) as SUM_SALES_PAST1YEAR
    FROM
        L_Sales A
        INNER JOIN L_Sales B ON A.Sales_Id = B.Sales_Id
    WHERE
        B.SalesDate >= DATEADD(YEAR, -1, A.SalesDate)
        AND B.SalesDate <= A.SalesDate
    GROUP BY
        A.Sales_Id,
        A.SalesDate
)
SELECT
    src.*, BASIC_OFFSET_1YEAR.SUM_SALES_PAST1YEAR
FROM
    L_Sales src
    LEFT OUTER JOIN BASIC_OFFSET_1YEAR
        ON BASIC_OFFSET_1YEAR.SalesDate = src.SalesDate
        AND BASIC_OFFSET_1YEAR.Sales_Id = src.Sales_Id
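If your engine supports window frames with interval bounds (PostgreSQL 11+ does; Snowflake currently does not allow RANGE with an interval offset), the trailing one-year total can also be written without a self-join. A sketch under that assumption, which also assumes Sales_DATE is a real date column:
-- rolling sum of sales over the trailing year, one row per sale
SELECT SalesId,
       Sales_DATE AS SalesDate,
       SUM(Sales) OVER (
           ORDER BY Sales_DATE
           RANGE BETWEEN INTERVAL '1 year' PRECEDING AND CURRENT ROW
       ) AS running_total_of_sales
FROM DWH.L_SALES;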

Updating a different unique value for each group

I have a table where every row that has the same classification_id and application_id has the same group_id.
id |classification_id |application_id |authorisation_id |group_id |
------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+
54f614f3-7582-4ae9-a07e-5ff6d29e7a3b|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
a01571a1-4f04-4ff9-9a7b-3a720736b9ec|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
3e18f2d0-4d5f-41b3-baf5-ba0feac8f43e|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
b2ebe2ee-ffed-4e32-8abe-cd8b7d400646|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
ef01e6f7-f6ad-4d4d-b129-9c756734bef5|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
7d340811-b679-49fd-bdd6-32a1bb9bbfed|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
c45d7bb6-2146-48d0-a804-929cc42484cd|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
ddec5929-a08f-4f48-97f8-ccc2b85531ac|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
ae9edbb2-def3-4c4e-9a27-72454a09e146|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
3a3fd904-1988-4f8c-bf27-8cdf349b8431|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
27c669b9-763c-49cf-887a-b9b1f85dc1ab|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
03820732-32c4-4cd4-910b-4e27fdd44bdf|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056|
I've managed to sort out subgroups of this group by authorisation_id, and I've created a group_helper column which basically shows my end goal: from this data set I want to get three different groups:
id |classification_id |application_id |authorisation_id |group_id |group_helper|
------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------+
54f614f3-7582-4ae9-a07e-5ff6d29e7a3b|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 2|
a01571a1-4f04-4ff9-9a7b-3a720736b9ec|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 2|
3e18f2d0-4d5f-41b3-baf5-ba0feac8f43e|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 2|
b2ebe2ee-ffed-4e32-8abe-cd8b7d400646|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 2|
ef01e6f7-f6ad-4d4d-b129-9c756734bef5|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 3|
7d340811-b679-49fd-bdd6-32a1bb9bbfed|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 3|
c45d7bb6-2146-48d0-a804-929cc42484cd|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 3|
ddec5929-a08f-4f48-97f8-ccc2b85531ac|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056| 3|
ae9edbb2-def3-4c4e-9a27-72454a09e146|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
3a3fd904-1988-4f8c-bf27-8cdf349b8431|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
27c669b9-763c-49cf-887a-b9b1f85dc1ab|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
03820732-32c4-4cd4-910b-4e27fdd44bdf|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
Now I want each of those groups to have a different group_id. I don't have to update the group whose group_helper is NULL, since its group_id is already unique. I want to give every row that has group_helper = 2 the same UUID (but different from the group where group_helper is NULL), every row that has group_helper = 3 the same UUID (but different from the NULL and 2 groups), and so on. This has to work for any number of group_helper values, because there can be many more than two.
So my end goal would look like this:
id |classification_id |application_id |authorisation_id |group_id |group_helper|
------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------+
54f614f3-7582-4ae9-a07e-5ff6d29e7a3b|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|fd3e63d1-d59c-477f-b58b-3ae3726c7992| 2|
a01571a1-4f04-4ff9-9a7b-3a720736b9ec|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|fd3e63d1-d59c-477f-b58b-3ae3726c7992| 2|
3e18f2d0-4d5f-41b3-baf5-ba0feac8f43e|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|fd3e63d1-d59c-477f-b58b-3ae3726c7992| 2|
b2ebe2ee-ffed-4e32-8abe-cd8b7d400646|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|fd3e63d1-d59c-477f-b58b-3ae3726c7992| 2|
ef01e6f7-f6ad-4d4d-b129-9c756734bef5|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|ed3ff96c-2f93-4182-8e4f-4594cb20cbb6| 3|
7d340811-b679-49fd-bdd6-32a1bb9bbfed|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|ed3ff96c-2f93-4182-8e4f-4594cb20cbb6| 3|
c45d7bb6-2146-48d0-a804-929cc42484cd|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|ed3ff96c-2f93-4182-8e4f-4594cb20cbb6| 3|
ddec5929-a08f-4f48-97f8-ccc2b85531ac|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|ed3ff96c-2f93-4182-8e4f-4594cb20cbb6| 3|
ae9edbb2-def3-4c4e-9a27-72454a09e146|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|a4edd12d-c19e-4e0d-badd-d3cf5e6d6d82|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
3a3fd904-1988-4f8c-bf27-8cdf349b8431|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|25a7e1f7-4d8c-4e12-a10f-3654d7ef5ee9|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
27c669b9-763c-49cf-887a-b9b1f85dc1ab|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|302b23f1-ce57-4219-bcae-7bdbc3b86cb4|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
03820732-32c4-4cd4-910b-4e27fdd44bdf|63a7b151-2b8d-4b6a-b9a1-108a80ae4cdf|cd3d597b-25d1-4b4b-92f0-2ad8fcb4698c|5e3bce60-b0d8-436c-9d33-b3a1d4c9a308|8e563f95-ff0c-41e7-b211-d5ac6f78d056| |
You can create a CTE that generates a new group_id, selecting a single value for each distinct group_helper, then use update ... from ... (see demo).
with grouper(helper, gid) as
(select distinct on (group_helper)
group_helper
, gen_random_uuid()
from sometable
where group_helper is not null
order by group_helper
) --select * from grouper
update sometable
set group_id = gid
from grouper
where helper = group_helper;
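After the update, a quick check can confirm that every non-null group_helper value received exactly one new group_id while the NULL helpers kept the original one. A sketch, assuming the same table name as above:
-- each non-null group_helper should map to a single group_id
SELECT group_helper, group_id, count(*) AS rows_in_group
FROM sometable
GROUP BY group_helper, group_id
ORDER BY group_helper NULLS LAST;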

How to validate a particular column in a DataFrame without touching the other columns using Spark SQL?

set.createOrReplaceTempView("input1");
String look = "select case when length(date)>0 then 'Y' else 'N' end as date from input1";
Dataset<Row> Dataset_op = spark.sql(look);
Dataset_op.show();
In the above code the dataframe 'set' has 10 columns and I've done the validation for one of them, 'date'. It returns the date column alone.
My question is: how do I return all the columns, along with the validated date column, in a single dataframe?
Is there any way to get all the columns in the dataframe without manually listing them in the select statement? Please share your suggestions. TIA
Data
df= spark.createDataFrame([
(1,'2022-03-01'),
(2,'2022-04-17'),
(3,None)
],('id','date'))
df.show()
+---+----------+
| id| date|
+---+----------+
| 1|2022-03-01|
| 2|2022-04-17|
| 3| null|
+---+----------+
You have two options.
Option 1: select without projecting a new column with Y and N:
df.createOrReplaceTempView("input1");
String_look = "select id, date from input1 where length(date)>0";
Dataset_op = spark.sql(String_look).show()
+---+----------+
| id| date|
+---+----------+
| 1|2022-03-01|
| 2|2022-04-17|
+---+----------+
Option 2: project Y and N into a new column. Remember that the where clause is applied before column projection, so you can't use the newly created column in the where clause:
String_look = "select id, date, case when length(date)>0 then 'Y' else 'N' end as status from input1 where length(date)>0";
+---+----------+------+
| id| date|status|
+---+----------+------+
| 1|2022-03-01| Y|
| 2|2022-04-17| Y|
+---+----------+------+
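If the goal is to keep all ten columns without listing them, SELECT * plus the computed flag does that, and a subquery lets you filter on the new flag afterwards. A sketch (the column name date_valid is just an example, not from the original code):
-- keep every existing column and add the validation flag alongside them
SELECT *,
       CASE WHEN length(date) > 0 THEN 'Y' ELSE 'N' END AS date_valid
FROM input1;
-- to filter on the computed flag, wrap it in a subquery so the flag exists before the WHERE is applied
SELECT *
FROM (
    SELECT *,
           CASE WHEN length(date) > 0 THEN 'Y' ELSE 'N' END AS date_valid
    FROM input1
) t
WHERE t.date_valid = 'Y';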

Spark: how to perform a loop function on dataframes

I have two dataframes as below. I'm trying to look up the second df using the foreign key and then generate a new dataframe. I was thinking of doing spark.sql("""select history.value as previous_year_1 from df1, history where df1.key = history.key and history.date = add_months($currentdate, -1*12)"""), but then I would need to do it multiple times, say for 10 previous years, and join the results back together. How can I create a function for this? Many thanks. Quite new here.
dataframe one:
+---+---+-----------+
|key|val| date |
+---+---+-----------+
| 1|100| 2018-04-16|
| 2|200| 2018-04-16|
+---+---+-----------+
dataframe two : historical data
+---+---+-----------+
|key|val| date |
+---+---+-----------+
| 1|10 | 2017-04-16|
| 1|20 | 2016-04-16|
+---+---+-----------+
The result I want to generate is
+---+----------+-----------------+-----------------+
|key|date | previous_year_1 | previous_year_2 |
+---+----------+-----------------+-----------------+
| 1|2018-04-16| 10 | 20 |
| 2|null | null | null |
+---+----------+-----------------+-----------------+
To solve this, the following approach can be applied:
1) Join the two dataframes by key.
2) Filter out all the rows where the historical date is not an exact number of years before the reference date.
3) Calculate the years difference for the row and put the value in a dedicated column.
4) Pivot the DataFrame around the column calculated in the previous step and aggregate on the value of the respective year.
private def generateWhereForPreviousYears(nbYears: Int): Column =
  (-1 to -nbYears by -1) // loop on each backwards year value
    .map(yearsBack =>
      /*
       * Each year-back count is transformed into an expression
       * to be included in the WHERE clause.
       * This is equivalent to "history.date=add_months($currentdate,-1*12)"
       * in your comment in the question.
       */
      add_months($"df1.date", 12 * yearsBack) === $"df2.date"
    )
    /*
     * The previous .map call produces a sequence of Column expressions;
     * we need to combine them with "or" in order to obtain a single
     * Spark Column reference. The .reduce() function is most appropriate here.
     */
    .reduce(_ or _) or $"df2.date".isNull // the last "or" is added to include empty lines in the result

val nbYearsBack = 3

val result = sourceDf1.as("df1")
  .join(sourceDf2.as("df2"), $"df1.key" === $"df2.key", "left")
  .where(generateWhereForPreviousYears(nbYearsBack))
  .withColumn("diff_years", concat(lit("previous_year_"), year($"df1.date") - year($"df2.date")))
  .groupBy($"df1.key", $"df1.date")
  .pivot("diff_years")
  .agg(first($"df2.value"))
  .drop("null") // drop the unwanted extra column with null values
The output is:
+---+----------+---------------+---------------+
|key|date |previous_year_1|previous_year_2|
+---+----------+---------------+---------------+
|1 |2018-04-16|10 |20 |
|2 |2018-04-16|null |null |
+---+----------+---------------+---------------+
Let me "read through the lines" and give you a "similar" solution to what you are asking:
val df1Pivot = df1.groupBy("key").pivot("date").agg(max("val"))
val df2Pivot = df2.groupBy("key").pivot("date").agg(max("val"))
val result = df1Pivot.join(df2Pivot, Seq("key"), "left")
result.show
+---+----------+----------+----------+
|key|2018-04-16|2016-04-16|2017-04-16|
+---+----------+----------+----------+
| 1| 100| 20| 10|
| 2| 200| null| null|
+---+----------+----------+----------+
Feel free to manipulate the data a bit if you really need to change the column names.
Or even better:
df1.union(df2).groupBy("key").pivot("date").agg(max("val")).show
+---+----------+----------+----------+
|key|2016-04-16|2017-04-16|2018-04-16|
+---+----------+----------+----------+
| 1| 20| 10| 100|
| 2| null| null| 200|
+---+----------+----------+----------+
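The same union-and-pivot idea can also be written in Spark SQL (the PIVOT clause is available from Spark 2.4). A sketch, assuming the union of the two frames is registered as a temp view named combined and that the listed dates are the ones present in the data:
-- pivot the combined history so each date becomes a column, keeping the max val per key and date
SELECT *
FROM combined
PIVOT (
    max(val) FOR date IN ('2016-04-16', '2017-04-16', '2018-04-16')
);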

What SQL query should I perform to get the result set expected?

What SQL query should I perform to get the expected result set, given the first element of the chain (2) as input, or any other element of it?
table name: changes
+----+---------------+---------------+
| id | new_record_id | old_record_id |
+----+---------------+---------------+
| 1| 4| 2|
| -- non relevant data -- |
| 6| 7| 4|
| -- non relevant data -- |
| 11| 13| 7|
| 12| 14| 13|
| -- non relevant data -- |
| 31| 20| 14|
+----+---------------+---------------+
Result set expected:
+--+
| 2|
| 4|
| 7|
|13|
|14|
|20|
+--+
I know I should consider changing my data model, but what if I can't?
Thank you in advance!
The following code should work to get your result:
CREATE TABLE #return (val int)

DECLARE @VAL int
SET @VAL = 2

WHILE (SELECT count(*) FROM [changes] WHERE old_record_id = @VAL) > 0
BEGIN
    INSERT INTO #return VALUES (@VAL)

    SELECT @VAL = new_record_id
    FROM [changes]
    WHERE old_record_id = @VAL
END

INSERT INTO #return VALUES (@VAL)

SELECT * FROM #return
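On databases that support recursive CTEs (SQL Server, PostgreSQL, MySQL 8+), the same chain walk can be expressed without a loop. A minimal sketch, assuming the table is named changes and the starting element is 2:
-- walk the chain from the starting element; on SQL Server drop the RECURSIVE keyword
WITH RECURSIVE chain (record_id) AS (
    SELECT 2                        -- starting element of the chain
    UNION ALL
    SELECT c.new_record_id
    FROM changes c
    JOIN chain ON c.old_record_id = chain.record_id
)
SELECT record_id FROM chain;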
Since you're vague on the database, here is some good literature:
MySQL
Look around section 7; this goes into hierarchies and recursive functions.
http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html
MSSQL
A good article on the same thing (you're going to find functions to be the common approach, though MSSQL can use WITH).
http://www.sqlservercurry.com/2009/06/simple-family-tree-query-using.html
PostgreSQL
Same type of article. All these have the same premise, working your way up a database tree from child to parent.
http://www.paragoncorporation.com/ArticleDetail.aspx?ArticleID=24
If you have data in the middle that you do not want to update, then the only solution would be:
update table set id=2 where id=4;
update table set id=4 where id=7;
update table set id=7 where id=13;
update table set id=13 where id=14;
update table set id=14 where id=20;
But this update might work if it does the updates in the order they appear, which will generally happen if you have the pk of the table set to id.
update table set id=(select min(id) from table b where b.id>table.id)
You can also force this by adding an ORDER BY id at the end, and see whether your database allows that.