Hive built-in function uuid() not working? - hive

In my hive, I can see the following built in function:
describe FUNCTION extended uuid
uuid() - Returns a universally unique identifier (UUID) string.
The value is returned as a canonical UUID 36-character string.
Example:
> SELECT uuid();
'0baf1f52-53df-487f-8292-99a03716b688'
> SELECT uuid();
'36718a53-84f5-45d6-8796-4f79983ad49d'
I am trying to generate a uuid for every row in a table:
from (select *, uuid() as id from table1) t
insert into table table2
select a,b,id
insert into table table3
select c,id;
Every single row in each table ends up with an identical uuid value. However, if I replace uuid() function with rand() function, every row ends up with a different random id.
Why is uuid() only generating one value?
I can't use reflect('java.util.UUID','randomUUID') because reflect is blocked by sentry.

The uuid() function was added in 2.2.0. Unfortunately, it had a bug: it was not tagged as non-deterministic, so Hive could evaluate it once and reuse the result for every row. If you update to 2.2.3 it will work as expected. If that is not an option, you can create your own UUID generator UDF in its place:
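import java.util.UUID;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.io.Text;

// Marked non-deterministic so that Hive calls evaluate() for every row.
@UDFType(deterministic = false)
public class UUIDShim extends UDF {
    private final Text result = new Text();

    public Text evaluate() {
        result.set(UUID.randomUUID().toString());
        return result;
    }
}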
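To see why the missing non-deterministic tag produces one value for the whole table, here is a toy model in Python. It is not Hive's implementation; run_query and its deterministic flag are made-up names that only imitate an optimizer folding a "deterministic" call into a constant.

```python
import uuid

def run_query(rows, func, deterministic):
    """Toy projection: pair every row with the result of func()."""
    if deterministic:
        # A call flagged deterministic can be evaluated once and reused.
        constant = func()
        return [(row, constant) for row in rows]
    # A non-deterministic call has to be evaluated for each row.
    return [(row, func()) for row in rows]

rows = ["a", "b", "c"]
folded = run_query(rows, lambda: str(uuid.uuid4()), deterministic=True)
per_row = run_query(rows, lambda: str(uuid.uuid4()), deterministic=False)

print(len({value for _, value in folded}))   # 1 distinct id
print(len({value for _, value in per_row}))  # 3 distinct ids
```

The first result mirrors what the question describes: every row shares a single id because the call was only made once.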

Related

goqu select for update

I have the following table in Postgres, which I access through the goqu library. There will only be one value in the column at any point, and I want to use SELECT ... FOR UPDATE to lock the row before overwriting the value, to prevent concurrent writes.
table:
user_id
1234
This is what I tried:
type data struct {
ID int `db:"user_id"`
}
var new_id data
new_id = data{ID: 4321}
DB.Select(data{}).From(tablename).ForUpdate(exp.SkipLocked).Insert().Rows(new_id).Executor().Exec()
But this command seems to create new rows instead of overwriting the value. Any help is appreciated!

UPDATE in PostgreSQL generates same UUID

I have added a new uuid column to one of my tables and installed the uuid-ossp extension.
Now I want to update all existing records and set a value for this new uuid column.
I do not want to use a DEFAULT fallback for ADD COLUMN; I would rather do it in an UPDATE statement, so I have something like this:
UPDATE table_name SET uuid = (SELECT uuid_generate_v4());
The issue is that the same UUID is generated for all records.
Is there a way to pass a seed value to the generate function, or another way to force the generated UUIDs to be unique?
You could try calling the function directly instead of through a subquery, so that Postgres generates a new UUID for each record:
UPDATE table_name
SET uuid = uuid_generate_v4()
WHERE uuid IS NULL;
The WHERE clause is optional; here it limits the update to rows that do not yet have a UUID, which for a freshly added column is every row.
If you call the uuid_generate_v4() function in a subquery, PostgreSQL will assume that the subquery needs to be evaluated only once, since it does not contain any reference to the surrounding query. Consequently, all rows will be updated to the same UUID, which is not what you want.
If you remove the subquery and leave only the function call, the function is called for each row, since it is VOLATILE.

postgreSQL hstore if contains value

Is there a way to check if a value already exists in the hstore in the query itself?
I have to store various values per row (each row is an "item").
I need to be able to check if the id already exists in the database in one of the hstore rows without selecting everything first and doing loops etc. in PHP.
hstore seems to be the only data type that offers something like that and also allows you to select the column for that row into an array.
hstore may not be the best data type to store data like that, but there isn't anything better available.
The whole project uses 9.2 and I cannot change that; json is in 9.3.
The exist() function tests for the existence of a key. To determine whether the key '42' exists anywhere in the hstore . . .
select *
from (select test_id, exist(test_hs, '42') key_exists
from test) x
where key_exists = true;
 test_id | key_exists
---------+------------
       2 | t
The svals() function returns values as a set. You can query the result to determine whether a particular value exists.
select *
from (select test_id, svals(test_hs) vals
from test) x
where vals = 'Wibble';
hstore Operators and Functions
create table test (
test_id serial primary key,
test_hs hstore not null
);
insert into test (test_hs) values (hstore('a', 'b'));
insert into test (test_hs) values (hstore('42', 'Wibble'));

Sqlite - SELECT or INSERT and SELECT in one statement

I'm trying to avoid writing separate SQL queries to achieve the following scenario:
I have a Table called Values:
Values:
id INT (PK)
data TEXT
I would like to check if certain data exists in the table, and if it does, return its id, otherwise insert it and return its id.
The (very) naive way would be:
select id from Values where data = "SOME_DATA";
if id is not null, take it.
if id is null then:
insert into Values(data) values("SOME_DATA");
and then select it again to see its id or use the returned id.
I am trying to make the above functionality in one line.
I think I'm getting close, but I couldn't make it yet:
So far I got this:
select id from Values where data=(COALESCE((select data from Values where data="SOME_DATA"), (insert into Values(data) values("SOME_DATA"));
I'm trying to take advantage of the fact that the second select will return null and then the second argument to COALESCE will be returned. No success so far. What am I missing?
Your command does not work because in SQL, INSERT does not return a value.
If you have a unique constraint/index on the data column, you can use that to prevent duplicates if you blindly insert the value; this uses SQLite's INSERT OR IGNORE extension:
INSERT OR IGNORE INTO "Values"(data) VALUES('SOME_DATA');
SELECT id FROM "Values" WHERE data = 'SOME_DATA';
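As a rough sketch of how those two statements can be combined from application code, here is a version using Python's standard sqlite3 module with an in-memory database. The helper name get_or_create_id and the exact table definition (a UNIQUE constraint on data) are assumptions made for illustration.

```python
import sqlite3

def get_or_create_id(conn, data):
    """Return the id stored for `data`, inserting the row first if needed."""
    with conn:  # commits on success, rolls back on error
        # Does nothing when the UNIQUE constraint on data already holds the value.
        conn.execute('INSERT OR IGNORE INTO "Values"(data) VALUES (?)', (data,))
        row = conn.execute(
            'SELECT id FROM "Values" WHERE data = ?', (data,)
        ).fetchone()
    return row[0]

conn = sqlite3.connect(":memory:")
# Assumed schema: the UNIQUE constraint is what makes OR IGNORE skip duplicates.
conn.execute('CREATE TABLE "Values" (id INTEGER PRIMARY KEY, data TEXT UNIQUE)')

first = get_or_create_id(conn, "SOME_DATA")
second = get_or_create_id(conn, "SOME_DATA")
print(first == second)  # True
```

This is still two statements, but the caller only sees one function call, and the parameter placeholders avoid quoting problems with the data value.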

Finding Auto Incremented values from an INSERT OR IGNORE statement in SQLite

I have a table called "images":
CREATE TABLE images (
id INTEGER PRIMARY KEY AUTOINCREMENT,
url TEXT NOT NULL UNIQUE,
caption TEXT
);
On inserting a row, I'd like the URL column to be unique. At the same time, I'd like to find out the id of the row with that URL.
INSERT OR IGNORE INTO images (url, caption) VALUES ('http://stackoverflow.com', 'A logo');
SELECT last_insert_rowid(); -- returns the id iff there was an insert.
Of course, I'd like to do it as quickly as possible, though my first thought was along the lines of the following pseudocode:
int oldID = execSQL("SELECT last_insert_rowid()");
execSQL("INSERT OR IGNORE INTO images (url, caption) VALUES ('http://stackoverflow.com', 'A logo')");
int newID = execSQL("SELECT last_insert_rowid()");
if (newID != oldID) {
// there was an insert.
return newID;
} else {
// we'd already seen this URL before
return execSQL("SELECT id FROM images WHERE url = 'http://stackoverflow.com'");
}
But this seems hopelessly inefficient.
What is the most performant way of getting the auto-incremented row id from an INSERT OR IGNORE statement?
In Android 2.2 you can use SQLiteDatabase.insertWithOnConflict. According to its documentation, it:
"Returns the row ID of the newly inserted row OR the primary key of the existing row if the input param 'conflictAlgorithm' = CONFLICT_IGNORE OR -1 if any error"
In older versions of the SDK you can:
query whether the entry exists
if an entry is found, return the id of the selected item
if nothing is found, insert the entry using SQLiteDatabase.insert (it will return the primary key of the new item).
Also consider using transactions if needed.
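The query-then-insert steps above can be sketched outside Android as well. The following uses Python's standard sqlite3 module rather than the Android API, so it only illustrates the control flow; get_or_insert_image is a hypothetical helper name.

```python
import sqlite3

def get_or_insert_image(conn, url, caption):
    """Return (id, inserted) for `url`, inserting a row only if none exists."""
    with conn:  # commits the insert on success, rolls back on error
        row = conn.execute(
            "SELECT id FROM images WHERE url = ?", (url,)
        ).fetchone()
        if row is not None:
            # The URL has been seen before: reuse its id.
            return row[0], False
        cur = conn.execute(
            "INSERT INTO images (url, caption) VALUES (?, ?)", (url, caption)
        )
        # lastrowid is the primary key of the row that was just inserted.
        return cur.lastrowid, True

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "url TEXT NOT NULL UNIQUE, "
    "caption TEXT)"
)

first = get_or_insert_image(conn, "http://stackoverflow.com", "A logo")
second = get_or_insert_image(conn, "http://stackoverflow.com", "A logo")
print(first, second)  # (1, True) (1, False)
```

If several writers share the database, the UNIQUE constraint on url is still what ultimately prevents duplicates; the lookup just avoids a wasted insert in the common case.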