Adding a column to a table with the value from the next row - SQL

I have a table in PostgreSQL with a timestamp column, and I want to modify the table to have a second timestamp column and seed it with the value of the immediately successive timestamp. Is there a way to do this? The tables are fairly large, so a correlated subquery might kill the machine.
More concretely, I want to go from this to that:
+----+------+      +----+------+------+
| ts | data |      | ts | te   | data |
+----+------+      +----+------+------+
| T  | ...  |  --> | T  | U    | ...  |
| U  | ...  |      | U  | V    | ...  |
| V  | ...  |      | V  | null | ...  |
+----+------+      +----+------+------+
Basically, I want to be able to handle point-in-time queries much better (i.e., give me the data for time X).

You could just compute the next timestamp at query time rather than storing it in the table, but if you've considered that and still want the stored column, then:
First, add the column to your table:
ALTER TABLE tablename ADD COLUMN te timestamp;
Then perform an update that fills it using the LEAD window function:
UPDATE tablename t
SET te = x.te
FROM (
    -- pair each ts with the immediately following ts
    SELECT ts, lead(ts, 1) OVER (ORDER BY ts) AS te
    FROM tablename
) x
WHERE t.ts = x.ts;
Here's an example of how it works using sample integer data: SQL Fiddle.
It works exactly the same for timestamp values.
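Once te is populated, the point-in-time lookup from the question becomes a simple range predicate. A minimal sketch, assuming the column names above and with X standing in for the timestamp being probed:
-- return the row whose [ts, te) interval contains time X;
-- the newest row has te IS NULL, so treat it as open-ended
SELECT data
FROM tablename
WHERE ts <= X
  AND (te > X OR te IS NULL);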

SELECT ts, LEAD(ts) OVER (ORDER BY ts) AS te, data
FROM table_name;

Related

How to do a batch insert version of auto increment?

I have a temporary table that has columns id and value:
| id | value |
And I have a table that aggregates data with columns group_id and value:
| group_id | value |
The rows in each temporary table are ordered locally by id; here are two examples:
temp_table 1
+----+--------+
| id | value  |
+----+--------+
| 1  | first  |
| 2  | second |
| 3  | third  |
+----+--------+
temp_table 2
+----+-------+
| id | value |
+----+-------+
| 2  | alpha |
| 3  | beta  |
| 4  | gamma |
+----+-------+
I want to insert all the rows from the temp tables into the global table while keeping group_id globally ordered. If I insert temp table 1 before temp table 2, the result should look like this:
global_table
+----------+--------+
| group_id | value  |
+----------+--------+
| 1        | first  |
| 2        | second |
| 3        | third  |
| 4        | alpha  |
| 5        | beta   |
| 6        | gamma  |
+----------+--------+
For my purposes it is OK if there are jumps between consecutive numbers, and if 2 inserts are done at the same time, their rows can interleave. The requirement is basically that the id columns are always increasing.
e.g.
Let's say I have 2 sets of queries running at the same time, and group_id in global_table is auto-incremented:
Query 1:
INSERT INTO global_table (value) VALUES ('first');
INSERT INTO global_table (value) VALUES ('second');
INSERT INTO global_table (value) VALUES ('third');
Query 2:
INSERT INTO global_table (value) VALUES ('alpha');
INSERT INTO global_table (value) VALUES ('beta');
INSERT INTO global_table (value) VALUES ('gamma');
I can get something like this:
global_table
+----------+--------+
| group_id | value  |
+----------+--------+
| 1        | first  |
| 3        | alpha  |
| 4        | second |
| 5        | third  |
| 6        | beta   |
| 7        | gamma  |
+----------+--------+
How do I achieve something like this when inserting from a table? Something like:
INSERT INTO global_table (value)
SELECT t.value
FROM temp_table t
Unfortunately, this may not result in an incrementing id all the time.
The requirement is basically that the id columns are always increasing.
If I understood you correctly, you should use ORDER BY in your INSERT statement.
INSERT INTO global_table (value)
SELECT t.value
FROM temp_table t
ORDER BY t.id;
In SQL Server, if you include ORDER BY t.id, it guarantees that the new IDs generated in global_table will be assigned in the specified order.
I don't know about other databases, but SQL Server's IDENTITY provides this guarantee.
Using your sample data, the group_id generated for value second is guaranteed to be greater than the one generated for value first, and the group_id for value third will be greater than the one for value second.
They may not be consecutive, but if you specify ORDER BY, their relative order will be preserved.
The same holds for the second table, even if you run the two INSERT statements at the same time: the generated group_ids may interleave between the two tables, but the relative order within each statement is guaranteed.
See my answer for a similar question with more technical details and references.
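As a concrete sketch of that setup in T-SQL (the IDENTITY declaration is an assumption about how global_table is defined; it is not given in the question):
-- auto-incrementing surrogate key
CREATE TABLE global_table (
    group_id INT IDENTITY(1,1) PRIMARY KEY,
    value    VARCHAR(100) NOT NULL
);

-- ORDER BY makes SQL Server assign the IDENTITY values in id order
INSERT INTO global_table (value)
SELECT t.value
FROM temp_table t
ORDER BY t.id;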

How can you assign all the different values in a table variable to fields in existing rows of a table?

I have a table variable (#tableVar) with one column (tableCol) of unique values.
I have a target table with many existing rows that also has a column that is filled entirely with the NULL value.
What type of statement can I use to iterate through #tableVar and assign a different value from #tableVar.tableCol to the null field for each of the rows in my target table?
Edit (to provide info):
My target table has this structure:
+-------+------------+
| Name  | CallNumber |
+-------+------------+
| James | NULL       |
| Byron | NULL       |
| Steve | NULL       |
+-------+------------+
My table variable has this structure:
+------------+
| CallNumber |
+------------+
| 6348       |
| 2675       |
| 9898       |
+------------+
I need to assign a different call number to each row in the target table, to achieve this kind of result.
+-------+------------+
| Name  | CallNumber |
+-------+------------+
| James | 6348       |
| Byron | 2675       |
| Steve | 9898       |
+-------+------------+
Note: Each row does not need a specific CallNumber. The only requirement is that each row have a unique CallNumber. For example, "James" does not specifically need 6348; He can have any number as long as it's unique, and the unique number must come from the table variable. You can assume that the table variable will have enough CallNumbers to meet this requirement.
What type of query can I use for this result?
You can use an update with a sequence number:
with toupdate as (
      -- number the target rows in arbitrary order
      select t.*, row_number() over (order by (select null)) as seqnum
      from target t
     )
update toupdate
    set CallNumber = tv.CallNumber
    from (-- number the table variable's rows the same way
          select tv.*, row_number() over (order by (select null)) as seqnum
          from #tableVar tv
         ) tv
    where tv.seqnum = toupdate.seqnum;
This assumes that #tablevar has a sufficient number of rows to assign in target. If not, I would suggest that you ask a new question with sample data and desired results.
Here is a db<>fiddle.
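If you want to fail fast when that row-count assumption does not hold, a small guard could run before the update (a sketch; target stands in for your actual table name):
-- abort the batch if the table variable cannot cover every NULL CallNumber
IF (SELECT COUNT(*) FROM #tableVar) <
   (SELECT COUNT(*) FROM target WHERE CallNumber IS NULL)
BEGIN
    RAISERROR('Not enough CallNumbers in #tableVar', 16, 1);
    RETURN;
END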

Geometry matching and update

I am trying to match geometries from two tables and update one table based on the match, but this is taking a huge amount of time.
Table1
+-------------+----------+-------------+
| Column | Type | Modifiers |
|-------------+----------+-------------|
| id | bigint | |
| jid | integer | |
| geom | geometry | |
+-------------+----------+-------------+
Indexes:
"points_geom_gix" gist (geom)
"points_jid_idx" btree (jid)
Table2
+----------+----------+------------+
| Column | Type | Modifiers |
|----------+----------+------------|
| id | integer | |
| geom | geometry | |
+----------+----------+------------+
Indexes:
"jxn_geom_idx" gist (geom)
I tried the queries below:
UPDATE table1 SET jid = a.id from table2 a WHERE st_equals(geom,a.geom);
and
UPDATE table1 SET jid = b.id from table1 as a JOIN table2 b on st_equals(a.geom, b.geom);
But both queries take a huge amount of time (hours).
If I just count the matching geometries across the two tables, the count comes back within seconds.
UPDATE: I am using PostgreSQL 9.5.7 and PostGIS 2.2.1.
If you only require bounding-box-level comparison, use "=" on the geometry columns instead of ST_Equals; it's fast. Like a.geom = b.geom.
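Applied to the question's tables, a minimal sketch of that approach (assuming bounding-box equality is acceptable for your data; in PostGIS 2.2 the "=" operator compares bounding boxes, not exact geometry):
-- match rows by bounding-box equality instead of exact geometry comparison
UPDATE table1
SET jid = a.id
FROM table2 a
WHERE table1.geom = a.geom;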

SQL select statement with except

I'm trying to generate a SQL statement where I need to get rid of all 'users' that have a certain trait attributed to them. Here is an example:
+------+-------+
| User | Trait |
+------+-------+
| A    | Fire  |
| A    | Water |
| A    | Air   |
| B    | Water |
| B    | Air   |
| C    | Water |
| C    | Fire  |
+------+-------+
With SQL I'd like to remove all users who have the trait 'Fire' associated with them.
So basically, afterwards, we'd be left with:
+------+-------+
| User | Trait |
+------+-------+
| B    | Water |
| B    | Air   |
+------+-------+
If I were able to filter this out in Excel instead of through SQL, that would work as well. I've looked at various approaches, but everything I've tried only removes the single row with the trait, not the user's other rows along with it.
I need SQL that translates to something along the lines of:
for (i = 0; i < table.length; i++)
    if (Trait(i) == 'Fire')
        DeleteRows(User(i))
I've looked into SQL EXCEPT, but the table I'm using is quite a bit more complex, so some help with a basic example would be nice to point me in the right direction.
Thanks
You can use a sub-select and discard user ids with NOT IN:
SELECT *
FROM mytable
WHERE userid NOT IN (SELECT userid FROM mytable WHERE Trait = 'Fire')
The simplest way of doing this is using NOT EXISTS:
select *
from tablename t
where not exists (select 1 from tablename where userid = t.userid and Trait = 'Fire')
Try:
select a.user, a.trait
from the_table a
where not exists
( select null
from the_table b
where b.user = a.user
and b.trait = 'Fire'
);
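If the goal is to actually delete the offending rows rather than just filter them in a SELECT, the same anti-join idea carries over. A sketch using the first answer's mytable/userid placeholder names (note that MySQL does not allow the target table to be referenced directly in the subquery, so there you would wrap it in a derived table):
-- remove every row belonging to any user that has the 'Fire' trait
DELETE FROM mytable
WHERE userid IN (SELECT userid
                 FROM mytable
                 WHERE Trait = 'Fire');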

Aggregate ENTIRE rows based on single field without querying source twice or using CTEs?

Assume I have the following table:
+--------+--------+--------+
| field1 | field2 | field3 |
+--------+--------+--------+
| a      | a      | 1      |
| a      | b      | 2      |
| a      | c      | 3      |
| b      | a      | 1      |
| b      | b      | 2      |
| c      | b      | 2      |
| c      | b      | 3      |
+--------+--------+--------+
I want to select, for each field1, only the rows where field3 is the minimum value, so only these rows:
+--------+--------+--------+
| field1 | field2 | field3 |
+--------+--------+--------+
| a      | a      | 1      |
| b      | a      | 1      |
| c      | b      | 2      |
+--------+--------+--------+
The most popular solution is to query the source twice: once directly and once in a joined subquery where the source is aggregated (that double-query pattern is sketched just below the question). However, since my data source is actually a derived table/subquery itself, I'd have to duplicate that subquery in my SQL, which is ugly. The other option is to use a CTE (WITH clause) and reuse the subquery, which would be nice, but Teradata, the database I'm using, doesn't support CTEs in views, though it does in macros, which is not an option for me right now.
So is it possible in standard SQL to group multiple records into a single record by using only a single field in the aggregation without querying the source twice or using a CTE?
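For reference, this is the "query the source twice" pattern being avoided, sketched against the sample table above:
-- the source appears once directly and once inside the aggregating subquery
select t.*
from the_table t
join (select field1, min(field3) as min_field3
      from the_table
      group by field1
     ) m
  on m.field1 = t.field1
 and t.field3 = m.min_field3;
With a derived table as the source, the_table would have to be written out twice, which is exactly the duplication described above.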
This is possible using a window function:
select *
from (
    select field1, field2, field3,
           min(field3) over (partition by field1) as min_field3
    from the_table
) t
where field3 = min_field3;
The above is standard SQL and I believe Teradata also supports window functions.
The derived table is necessary because you can't refer to a column alias in the where clause - at least not in standard SQL.
I think Teradata actually allows skipping the derived table using the QUALIFY clause, but as I have never used it, I am not sure:
select *
from the_table
qualify min(field3) over (partition by field1) = field3;
Use NOT EXISTS to return a row only if there is no other row with the same field1 value but a lower field3 value:
select *
from the_table t1
where not exists (select 1
                  from the_table t2
                  where t2.field1 = t1.field1
                    and t2.field3 < t1.field3);