update table with 4 columns specified, but only 2 columns are available - sql

I have one table called test, which has 4 columns:
id INT
v_out INT
v_in INT
label CHARACTER
I'm trying to update the table with the following query:
String sql = "
update
test
set
v_out = temp.outV
, v_in = temp.inV
, label = temp.label
from (
values(
(1,234,235,'abc')
,(2,234,5585,'def')
)
) as temp (e_id, outV, inV, label)
where
id = temp.e_id
";
When I execute it, I get the error:
org.postgresql.util.PSQLException: ERROR:
table "temp" has 2 columns available but 4 columns specified
What's the problem, and how can I solve it?

The rows of the values clause must not be wrapped in an extra pair of parentheses:
values (
(1,234,235,'abc'), (2,234,5585,'def')
)
creates a single row with two columns, each column being an anonymous "record" with 4 fields.
What you want is:
from (
values
(1,234,235,'abc'),
(2,234,5585,'def')
) as temp (e_id, outV, inV, label)
SQLFiddle showing the difference: http://sqlfiddle.com/#!15/d41d8/2763
This behavior is documented, but that is quite hard to find:
http://www.postgresql.org/docs/current/static/rowtypes.html#AEN7362
It's essentially the same thing as select (col1, col2) from some_table vs. select col1, col2 from some_table. The first one returns one column with an anonymous composite type that has two fields. The second one returns two columns from the table.

Related

How to return ids of rows with conflicting values?

I am looking to insert or update values in an SQLite database (version > 3.35) while avoiding multiple queries. UPSERT combined with RETURNING seems promising:
CREATE TABLE phonebook2(
name TEXT PRIMARY KEY,
phonenumber TEXT,
validDate DATE
);
INSERT INTO phonebook2(name,phonenumber,validDate)
VALUES('Alice','704-555-1212','2018-05-08')
ON CONFLICT(name) DO UPDATE SET
phonenumber=excluded.phonenumber,
validDate=excluded.validDate
WHERE excluded.validDate>phonebook2.validDate RETURNING name;
This helps me track the names corresponding to inserted/modified rows. How can I find the rows where phonebook2 values conflict with the values upserted in the above statement, but no insert or update happened because of the WHERE clause?
The RETURNING clause can't be used to get non-affected rows.
What you can do is execute a SELECT statement before the UPSERT:
WITH cte(name, phonenumber, validDate) AS (VALUES
('Alice', '704-555-1212', '2018-05-08'),
('Bob','804-555-1212', '2018-05-09')
)
SELECT *
FROM phonebook2 p
WHERE EXISTS (
SELECT *
FROM cte c
WHERE c.name = p.name AND c.validDate <= p.validDate
);
In the CTE you may include as many tuples as you want.
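As a rough illustration of the SELECT-before-UPSERT idea, here is a minimal sketch using Python's built-in sqlite3 module. The pre-existing rows for Alice and Bob are invented sample data (an assumption, not from the question), chosen so that one incoming tuple is older than the stored row and one is newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phonebook2("
    "name TEXT PRIMARY KEY, phonenumber TEXT, validDate DATE)"
)
# Hypothetical existing rows: Alice's stored date is newer than the
# incoming tuple, Bob's stored date is older.
conn.executemany(
    "INSERT INTO phonebook2 VALUES (?, ?, ?)",
    [
        ("Alice", "704-555-0000", "2019-01-01"),
        ("Bob", "804-555-0000", "2018-01-01"),
    ],
)

# Rows that already exist and that the upsert would leave untouched,
# because the incoming validDate is not newer than the stored one.
skipped = conn.execute(
    """
    WITH cte(name, phonenumber, validDate) AS (VALUES
        ('Alice', '704-555-1212', '2018-05-08'),
        ('Bob',   '804-555-1212', '2018-05-09')
    )
    SELECT p.name
    FROM phonebook2 p
    WHERE EXISTS (
        SELECT 1 FROM cte c
        WHERE c.name = p.name AND c.validDate <= p.validDate
    )
    """
).fetchall()
skipped_names = [row[0] for row in skipped]
print(skipped_names)
```

With this sample data only Alice is reported, since Bob's incoming date is newer and his row would be updated by the upsert. Running the SELECT and the UPSERT inside the same transaction keeps the two results consistent with each other.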

Dynamic column alias from another column value in SELECT

I was wondering if there is a way, in a SELECT statement on Postgres, to alias a column with the value of another column in the same data set.
Given this table:
id  key  value
1   a    d
2   a    e
3   b    f
This would be the result:
id  a     b
1   d     NULL
2   e     NULL
3   NULL  f
Here the name of each output column is taken from the values of key, and each cell is filled with the corresponding value of the column value. The values that key can contain are not known in advance.
This is a possible (not working) query:
SELECT "id", "value" AS "t"."key" FROM testTable as t;
One way to pivot in Postgres is to use CASE:
select id,
max(case when (key='a') then value else NULL end) as a,
max(case when (key='b') then value else NULL end) as b
FROM TestTable
group by id
order by id;
It seems that there is no way to create the column aliases dynamically without knowing the values in advance. As many commented, the only way to achieve this kind of "table re-mapping" is to use the crosstab function.
Crosstab function summary
This function takes 2 arguments:
The first one is a SQL statement that must return 3 columns:
The first column contains the values identifying each instance and that must be grouped in order to get the final result.
The second column contains the values that are used as categories in the final pivot table: each value will create a separate column.
The third column contains the values used to compile the new columns formed: for each category this column has the value of the instance that had the category value in the original table.
The second argument is not mandatory and is a SQL statement that returns the distinct values the function should use as categories.
Example
In the example above we must pass a query to crosstab that:
Returns as the first column the identifier of each final instance (in this case id)
As second column the values used as categories (all values in key)
As third column the values used to fill the categories (all values in value)
So the final query should be:
select * from crosstab(
'select "id", "key", "value" from testTable order by 1, 2;',
'select distinct "key" from testTable order by 1;'
) as result ("id" int8, "a" text, "b" text);
Since the crosstab function requires a column definition for the final pivot table, there is no way to determine the column alias dynamically.
Dynamically infer column names with client
A possible way to do that, with a PostgreSQL client, is to launch the second query we passed as argument to crosstab in order to retrieve the final columns and then infer the final crosstab query.
As an example, with pseudo-javascript:
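// Assumes `client` is an already-connected node-postgres (pg) Client
// and that this code runs inside an async function.
const { rows } = await client.query(`select distinct "key" from testTable order by 1;`);
// One column definition per category; the type text is an assumption.
// Keys containing a double quote would need extra escaping.
const columns = rows.map(r => `"${r.key}" text`).join(', ');
const finalTable = await client.query(`select * from crosstab(
'select "id", "key", "value" from testTable order by 1, 2;',
'select distinct "key" from testTable order by 1;'
) as result ("id" int8, ${columns});`);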
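The same two-step idea (read the distinct keys first, then generate the pivot query from them) can also be combined with the CASE approach, which needs no extension. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for a PostgreSQL client, and quote_ident is a hypothetical helper introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE testTable (id INTEGER, "key" TEXT, "value" TEXT)')
conn.executemany(
    "INSERT INTO testTable VALUES (?, ?, ?)",
    [(1, "a", "d"), (2, "a", "e"), (3, "b", "f")],
)


def quote_ident(name):
    # Hypothetical helper: double any embedded double quote so the key
    # can be used safely as a quoted SQL identifier.
    return '"' + name.replace('"', '""') + '"'


# Step 1: discover the categories that will become columns.
keys = [r[0] for r in conn.execute('SELECT DISTINCT "key" FROM testTable ORDER BY 1')]

# Step 2: build one CASE column per category. The key values are bound
# as parameters; only the quoted aliases are interpolated into the SQL.
columns = ", ".join(
    f'MAX(CASE WHEN "key" = ? THEN "value" END) AS {quote_ident(k)}' for k in keys
)
sql = f"SELECT id, {columns} FROM testTable GROUP BY id ORDER BY id"

cur = conn.execute(sql, keys)
header = [d[0] for d in cur.description]
rows = cur.fetchall()
print(header)
print(rows)
```

The result has the columns id, a and b, with NULL (None in Python) wherever an id has no value for a key. The column list is still fixed at the moment the second query is built, so a key inserted between the two queries would be missed.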
Useful articles
https://learnsql.com/blog/creating-pivot-tables-in-postgresql-using-the-crosstab-function/

Clone a record, then use its auto increment id for further operations

Update:
After narrowing down the code it seems that the line
INSERT INTO table1 TABLE table1_temp RETURNING id
is causing the issue. Any tips what is wrong with this?
Original question:
table1 has many columns (I don't care which) and it has an auto increment primary key (id). This is what I need to do and how I'm trying:
First, I'd like to duplicate a record in table1.
BEGIN;
CREATE TEMP TABLE table1_temp ON COMMIT DROP AS
SELECT * FROM table1 WHERE id = <some integer>;
ALTER TABLE table1_temp DROP COLUMN id;
WITH generated_id AS (
INSERT INTO table1 TABLE table1_temp RETURNING id
)
Then, perform an insert to some_table where I need to use the generated id of the copy that was created in table1.
INSERT INTO some_table (something, the_id_into_this)
VALUES ('some value', (SELECT id FROM generated_id));
Then get some data from yet_another_table (columns: somestuff, id_here) and use this and the id for an insert into that same table.
INSERT INTO yet_another_table
(SELECT somestuff,
(SELECT id FROM generated_id) AS id_here
FROM yet_another_table
WHERE id_here = <some integer>)
Finally, I need to return the id so I can use it in my app...
RETURNING id_here AS id;
COMMIT;
Am I on the right path implementing this? When running the query, I get the following error:
column "id" is of type integer but expression is of type character
varying HINT: You will need to rewrite or cast the expression.
It doesn't tell me the line number where it occurs, and I have no idea what might cause this.
INSERT INTO table1 TABLE table1_temp
You cannot do that because table1_temp has a different set of columns (you dropped the id column).
You need to specify columns explicitly (all but id column):
INSERT INTO table1(col1, col2, ...) TABLE table1_temp
I found a simple solution for cloning a record with an auto increment id that doesn't require you to specify any other columns of the table:
BEGIN;
CREATE TEMP TABLE table1_temp ON COMMIT DROP AS
SELECT * FROM table1 WHERE id = #;
UPDATE table1_temp SET id = nextval('table1_seq');
INSERT INTO table1 TABLE table1_temp;
COMMIT;
And for the CTE part of the question, here is how you can reuse a returned value in multiple subsequent queries by chaining several CTEs in one WITH statement:
WITH generated_id AS (
INSERT INTO ... RETURNING id
), _ AS (
QUERY1 ... SELECT id FROM generated_id ...
), __ AS (
QUERY2 ... SELECT id FROM generated_id ...
...
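For comparison, the clone-then-reuse flow can also be driven from application code. This is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than PostgreSQL, the columns col1 and col2 and the sample row are invented, and cursor.lastrowid plays the role that RETURNING id plays in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (id INTEGER PRIMARY KEY AUTOINCREMENT, col1 TEXT, col2 TEXT);
    CREATE TABLE some_table (something TEXT, the_id_into_this INTEGER);
    INSERT INTO table1 (col1, col2) VALUES ('x', 'y');
    """
)

source_id = 1  # id of the record to clone (sample value)

# Collect every column name except the auto-increment key, so the
# column list does not have to be written out by hand.
cols = [r[1] for r in conn.execute("PRAGMA table_info(table1)") if r[1] != "id"]
col_list = ", ".join(f'"{c}"' for c in cols)

with conn:  # one transaction: commit on success, roll back on error
    cur = conn.execute(
        f"INSERT INTO table1 ({col_list}) SELECT {col_list} FROM table1 WHERE id = ?",
        (source_id,),
    )
    new_id = cur.lastrowid  # id generated for the copy
    conn.execute(
        "INSERT INTO some_table (something, the_id_into_this) VALUES (?, ?)",
        ("some value", new_id),
    )

print(new_id)
```

Listing the non-key columns explicitly is the same fix as in the first answer; reading them from the table's metadata just avoids hard-coding them.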

Hive - getting the column names count of a table

How can I get the number of columns of a Hive table using HQL? I know we can use describe tablename to get the names of the columns. How do we get the count?
create table mytable(i int,str string,dt date, ai array<int>,strct struct<k:int,j:int>);
select count(*)
from (select transform ('')
using 'hive -e "desc mytable"'
as col_name,data_type,comment
) t
;
5
Some additional playing around:
create table mytable (id int,first_name string,last_name string);
insert into mytable values (1,'Dudu',null);
select size(array(*)) from mytable limit 1;
This is not bulletproof, since not all combinations of column types can be combined into an array.
It also requires that the table contain at least 1 row.
Here is a more complex solution that is more robust with respect to column types, but it also requires that the table contain at least 1 row:
select size(str_to_map(val)) from (select transform (struct(*)) using 'sed -r "s/.(.*)./\1/"' as val from mytable) t;

SQL With... Update

Is there any way to do some kind of "WITH...UPDATE" action on SQL?
For example:
WITH changes AS
(...)
UPDATE table
SET id = changes.target
FROM table INNER JOIN changes ON table.id = changes.base
WHERE table.id = changes.base;
Some context information: What I'm trying to do is to generate a base/target list from a table and then use it to change values in another table (changing values equal to base into target)
Thanks!
You can use merge, with the equivalent of your with clause as the using clause, but because you're updating the field you're joining on you need to do a bit more work; this:
merge into t42
using (
select 1 as base, 10 as target
from dual
) changes
on (t42.id = changes.base)
when matched then
update set t42.id = changes.target;
.. gives error:
ORA-38104: Columns referenced in the ON Clause cannot be updated: "T42"."ID"
Of course, it depends a bit on what you're doing in the CTE, but as long as you can join to your table within that to get the rowid, you can use that for the on clause instead:
merge into t42
using (
select t42.id as base, t42.id * 10 as target, t42.rowid as r_id
from t42
where id in (1, 2)
) changes
on (t42.rowid = changes.r_id)
when matched then
update set t42.id = changes.target;
If I create my t42 table with an id column and have rows with values 1, 2 and 3, this will update the first two to 10 and 20, and leave the third one alone.
It doesn't have to be rowid; it can be a real column if it uniquely identifies the row. Normally that would be an id, which would normally never change (as a primary key). You just can't use a column in the on clause and update it at the same time.