Merging two tables into one with the same column names - sql

I use this command to merge 2 tables into one:
CREATE TABLE table1 AS
SELECT name, sum(cnt)
FROM (SELECT * FROM table2 UNION ALL SELECT * FROM table3) X
GROUP BY name
ORDER BY 1;
table2 and table3 are tables with columns named name and cnt, but the result table (table1) has the columns name and sum.
The question is how to change the command so that the result table will have the columns name and cnt?

Have you tried this (note the AS cnt)?
CREATE TABLE table1 AS SELECT name,sum(cnt) AS cnt
FROM ...
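To try the alias without a Postgres server at hand, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (an assumption of this sketch), and the sample rows are made up for illustration; the CREATE TABLE ... AS statement is the one from the question with AS cnt added.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (name TEXT, cnt INTEGER);
    CREATE TABLE table3 (name TEXT, cnt INTEGER);
    -- Made-up sample rows.
    INSERT INTO table2 VALUES ('a', 1), ('b', 2);
    INSERT INTO table3 VALUES ('a', 10), ('c', 3);

    -- The alias "AS cnt" names the aggregated column in the new table.
    CREATE TABLE table1 AS
    SELECT name, sum(cnt) AS cnt
    FROM (SELECT * FROM table2 UNION ALL SELECT * FROM table3) x
    GROUP BY name
    ORDER BY 1;
""")

# Column names of the new table, then its contents.
columns = [row[1] for row in conn.execute("PRAGMA table_info(table1)")]
rows = conn.execute("SELECT name, cnt FROM table1 ORDER BY name").fetchall()
print(columns)
print(rows)
```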

In the absence of an explicit name, the output of a function inherits the basic function name in Postgres. You can use a column alias in the SELECT list to fix this, as @hennes already showed.
If you need to inherit all original columns with name and type (and possibly more) you can also create the table with a separate command:
To copy columns with names and data types only, still use CREATE TABLE AS, but add LIMIT 0:
CREATE TABLE table1 AS
TABLE table2 LIMIT 0; -- "TABLE" is just shorthand for "SELECT * FROM"
To copy (per documentation):
all column names, their data types, and their not-null constraints:
CREATE TABLE table1 (LIKE table2);
... and optionally also defaults, constraints, indexes, comments and storage settings:
CREATE TABLE table1 (LIKE table2 INCLUDING ALL);
... or, for instance, just defaults and constraints:
CREATE TABLE table1 (LIKE table2 INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
Then INSERT:
INSERT INTO table1 (name, cnt)
SELECT ... -- column names are ignored
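The "column names are ignored" remark can be checked with a small sketch. It uses Python's sqlite3 module as a stand-in for Postgres (an assumption), and since SQLite has no CREATE TABLE ... (LIKE ...), the target table is declared by hand. The aliases label and total are deliberately different from the target column names, to show that INSERT ... SELECT matches columns by position.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (name TEXT NOT NULL, cnt INTEGER);
    CREATE TABLE table3 (name TEXT NOT NULL, cnt INTEGER);
    -- Made-up sample rows.
    INSERT INTO table2 VALUES ('a', 1), ('b', 2);
    INSERT INTO table3 VALUES ('a', 10);

    -- Declared by hand because SQLite lacks the LIKE clause.
    CREATE TABLE table1 (name TEXT NOT NULL, cnt INTEGER);

    -- Output columns are called "label" and "total" on purpose:
    -- they still land in name and cnt, by position.
    INSERT INTO table1 (name, cnt)
    SELECT name AS label, sum(cnt) AS total
    FROM (SELECT * FROM table2 UNION ALL SELECT * FROM table3) x
    GROUP BY name;
""")

rows = conn.execute("SELECT name, cnt FROM table1 ORDER BY name").fetchall()
print(rows)
```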

How to update a nested bigquery column with data from another bigquery table

I have 2 BigQuery tables with nested columns. I need to update all the columns in table1 whenever table1.value1=table2.value. Both tables hold a huge amount of data.
I could update a single nested column with a static value, like below:
#standardSQL
UPDATE `ck.table1`
SET promotion_id = ARRAY(
SELECT AS STRUCT * REPLACE (100 AS PromotionId ) FROM UNNEST(promotion_id)
)
But when I try to reuse the same approach to update multiple columns based on table2 data, I get exceptions.
I am trying to update table1, including all its nested columns, with table2 data whenever table1.value1=table2.value.
As of now, both tables have a similar schema.
I need to update all the columns in table1 whenever table1.value1=table2.value
... both tables are having a similar schema
I assume that by similar you mean the same.
The following is for BigQuery Standard SQL.
You can use the query below to get the combined result and save it back to table1, either by setting a destination table or with CREATE OR REPLACE TABLE syntax:
#standardSQL
SELECT AS VALUE IF(value IS NULL, t1, t2)
FROM `project.dataset.table1` t1
LEFT JOIN `project.dataset.table2` t2
ON value1 = value
I have not tried this approach with UPDATE syntax - but you can try and let us know :o)

Postgres SQL: Find exceptions when using an in clause

I am running the following (Postgres) SQL against a table containing a list of ids. The SQL below will return all the ids found in the list.
select id from table
where id in (1,2,3,5,8,11,13,22,34,55);
How can I return ids which are contained in the list but not in the table? I realise I can do this using a temp table (with the list in it) and a left outer join but is there a quicker/cleverer way?
To check if arbitrary ids exist in your table, use a CTE and exists
WITH ids (id) AS ( VALUES (1),(2),(3),(5),(8),(11),(13),(22),(34),(55)
)
SELECT id
FROM ids
WHERE NOT EXISTS(SELECT TRUE FROM table WHERE table.id = ids.id)
note1: alternatively use a left join instead of WHERE NOT EXISTS
note2: it may be necessary to add the appropriate type casts
Or you can use EXCEPT
WITH ids (id) AS ( VALUES (1),(2),(3),(5),(8),(11),(13),(22),(34),(55)
)
SELECT id
FROM ids
EXCEPT ALL
SELECT id FROM table
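Both variants can be tried with a minimal sketch using Python's sqlite3 module. SQLite stands in for Postgres here (an assumption), the table is called items because table is a reserved word, and its contents are made up. SQLite has no EXCEPT ALL, so plain EXCEPT is used; for a list without duplicates the result is the same.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY);
    -- Made-up table contents.
    INSERT INTO items VALUES (1), (2), (3), (8), (13), (55);
""")

wanted = [1, 2, 3, 5, 8, 11, 13, 22, 34, 55]
# One "(?)" placeholder per id, so the list is bound rather than pasted in.
values_clause = ", ".join("(?)" for _ in wanted)

not_exists_sql = f"""
    WITH ids (id) AS (VALUES {values_clause})
    SELECT id FROM ids
    WHERE NOT EXISTS (SELECT 1 FROM items WHERE items.id = ids.id)
"""
except_sql = f"""
    WITH ids (id) AS (VALUES {values_clause})
    SELECT id FROM ids
    EXCEPT
    SELECT id FROM items
"""

# Sort in Python so the comparison does not depend on engine ordering.
missing_not_exists = sorted(r[0] for r in conn.execute(not_exists_sql, wanted))
missing_except = sorted(r[0] for r in conn.execute(except_sql, wanted))
print(missing_not_exists)
print(missing_except)
```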

Use inserted value as a parameter for other inserts

There is a Db2 database with two tables. The first one, table1, has an autoincrement column ID, which table2 references as a foreign key.
I am writing an HTML generator for SQL queries: given some input parameters, it generates one or more queries. It is not connected to the database.
What I need is to get that autoincrement value and use it in the following queries.
So basically, the scenario is:
insert into table1;
select autogenerated field ID;
insert into table2 using that ID;
insert into table2 using that ID;
...some more similar inserts...
insert into table2 using that ID;
And all that SQL query should be generated and then used as a single SQL script.
I was thinking about something like this:
SELECT ID FROM FINAL TABLE (INSERT INTO Table1 (t1column1, t1column2, etc.)
VALUES (t1value1, t1value2, etc.))
But I don't know how to store the result in a variable so that I can use it in the following queries, like this:
INSERT INTO Table2 (foreignKeyCol, t2column1, t2column2, etc.)
VALUES ($ID, t2value1, t2value2, etc.)
I could just paste that select instead of $ID, but the second query can be used several times with the same $ID and different values.
EDIT: DB2 10.5 on Linux.
You can chain several inserts together using CTEs, like so:
WITH idcte (id) as (
SELECT ID FROM FINAL TABLE (
INSERT INTO Table1 (t1column1, t1column2, etc.)
VALUES (t1value1, t1value2, etc.)
)
),
ins1 (id) as (
SELECT foreignKeyCol FROM FINAL TABLE (
INSERT INTO Table2 (foreignKeyCol, t2column1, t2column2, etc.)
SELECT id, t2value1, t2value2, etc.
FROM idcte
)
),
-- more CTEs
SELECT foreignKeyCol FROM FINAL TABLE (
-- your last INSERT ... SELECT FROM
)
Essentially you will have to wrap each INSERT into a SELECT FROM FINAL TABLE for this to work.
Alternatively, you can use a global variable to keep the ID value:
CREATE VARIABLE myNewId INT;
SET myNewId = (SELECT ID FROM FINAL TABLE (
INSERT INTO Table1 (t1column1, t1column2, etc.)
VALUES (t1value1, t1value2, etc.)
));
INSERT INTO Table2 (foreignKeyCol, t2column1, t2column2, etc.)
VALUES (myNewId, t2value1, t2value2, etc.);
DROP VARIABLE myNewId;
This assumes a recent version of Db2 for LUW.
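Since the generator is not connected to the database, it only has to emit text. Below is a rough sketch of how the global-variable variant could be produced as a single script. The helper names sql_literal and build_script, the sample column names, and the values are all hypothetical. The quoting is deliberately simplistic: it handles only numbers, strings, and None, and column names are not validated.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (numbers, strings, and None only)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Double any single quote inside a string literal.
    return "'" + str(value).replace("'", "''") + "'"


def build_script(parent_row, child_rows):
    """Build one script: insert the parent, keep its ID, insert the children."""
    p_cols = ", ".join(parent_row)
    p_vals = ", ".join(sql_literal(v) for v in parent_row.values())
    lines = [
        "CREATE VARIABLE myNewId INT;",
        "SET myNewId = (SELECT ID FROM FINAL TABLE (",
        f"    INSERT INTO Table1 ({p_cols})",
        f"    VALUES ({p_vals})",
        "));",
    ]
    # Every child insert reuses the same variable as its foreign key.
    for child in child_rows:
        c_cols = ", ".join(["foreignKeyCol", *child])
        c_vals = ", ".join(["myNewId", *(sql_literal(v) for v in child.values())])
        lines.append(f"INSERT INTO Table2 ({c_cols})")
        lines.append(f"VALUES ({c_vals});")
    lines.append("DROP VARIABLE myNewId;")
    return "\n".join(lines)


# Hypothetical sample input.
script = build_script(
    {"t1column1": "abc", "t1column2": 42},
    [{"t2column1": "x"}, {"t2column1": "it's"}],
)
print(script)
```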

Hive - getting the column names count of a table

How can I get the number of columns of a Hive table using HQL? I know we can use describe tablename to get the names of the columns. How do we get the count?
create table mytable(i int,str string,dt date, ai array<int>,strct struct<k:int,j:int>);
select count(*)
from (select transform ('')
using 'hive -e "desc mytable"'
as col_name,data_type,comment
) t
;
5
Some additional playing around:
create table mytable (id int,first_name string,last_name string);
insert into mytable values (1,'Dudu',null);
select size(array(*)) from mytable limit 1;
This is not bulletproof, since not every combination of column types can be combined into an array.
It also requires the table to contain at least 1 row.
Here is a more complex but more robust solution (type-wise); it too requires the table to contain at least 1 row:
select size(str_to_map(val)) from (select transform (struct(*)) using 'sed -r "s/.(.*)./\1/"' as val from mytable) t;

Oracle Compare data between two different table

I have two tables. In one, every field is VARCHAR2; the other uses different types for different data.
For example:
Table One
==========================
Col 1 VARCHAR2 UNIQUE KEY
Col 2 VARCHAR2
Col 3 VARCHAR2
===========================
Table Two
==========================
Col One VARCHAR2 UNIQUE KEY
Col Two TIMESTAMP
Col Three NUMBER
==========================
We have one mapping table. It denotes which column of Table One has to be compared with which column of Table Two.
For example:
Mapping Table
==============================
Table One      Table Two
==============================
Col 1          Col One
Col 2          Col Three
Col 3          Col Two
==============================
Now, using the UNIQUE KEY of Table One, we have to find the same row in Table Two, compare the rows column by column, and get the changes in the data.
Currently we use a Java program that compares the data row by row and column by column, and reports the changes between rows with the same UNIQUE KEY. It works fine but takes too much time, as we have 100000 records in the DB.
Now my question is: is there any way I can compare the data at SQL level and get the changes?
You can do it 'manually' with a query like the one below. It's a lot of work, but there are only three different types of checks you need to do, so it's not very complex:
select
*
from
Table1 t1
full outer join Table2 t2 on t2.ID = t1.ID
where
-- Check ID: the record is missing from one of the tables.
t1.ID is null or
t2.ID is null or
-- Not nullable field can be easily compared.
t1.NotNullableField1 <> t2.NotNullableField1 or
-- Nullable field is slightly more work.
t1.NullableField1 <> t2.NullableField1 or
(t1.NullableField1 is null and t2.NullableField1 is not null) or
(t1.NullableField1 is not null and t2.NullableField1 is null)
Another solution is to use MINUS, which is a bit like UNION, only it returns a dataset minus the records in a second dataset:
select * from Table1 t1
MINUS
select * from Table2 t2
This works only one way (which might be fine for your purpose), but you can also combine it with UNION to make it bidirectional.
select
*
from
( select * from Table1
MINUS
select * from Table2)
UNION ALL
( select * from Table2
MINUS
select * from Table1)
The output of both solutions is a bit different.
In the FULL OUTER JOIN query, the IDs will be joined and the values of the matching rows will be displayed next to each other as a single row.
In the MINUS query, the result is presented as a single dataset. If a record exists in only one of the tables, it is displayed once. If a record (ID) exists in both tables but other fields differ, you get both rows, so the differences are a bit harder to compare.
See: http://www.techonthenet.com/oracle/minus.php
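The bidirectional MINUS idea can be tried with a minimal sketch using Python's sqlite3 module. SQLite stands in for Oracle (an assumption), so the operator is spelled EXCEPT rather than MINUS. The two-column tables and their rows are made up, and the extra src column, which is not part of the query above, is added only to show which side each row came from.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, val TEXT);
    CREATE TABLE Table2 (id INTEGER, val TEXT);
    -- Made-up sample rows; row 3 has a NULL on both sides.
    INSERT INTO Table1 VALUES (1, 'same'), (2, 'old'), (3, NULL), (4, 'only in 1');
    INSERT INTO Table2 VALUES (1, 'same'), (2, 'new'), (3, NULL), (5, 'only in 2');
""")

# EXCEPT is SQLite's spelling of Oracle's MINUS.
diff_sql = """
    SELECT 'Table1' AS src, d1.* FROM
        (SELECT * FROM Table1 EXCEPT SELECT * FROM Table2) AS d1
    UNION ALL
    SELECT 'Table2' AS src, d2.* FROM
        (SELECT * FROM Table2 EXCEPT SELECT * FROM Table1) AS d2
"""

# Sort by id, then source, so both versions of a changed row sit together.
diff = sorted(conn.execute(diff_sql).fetchall(), key=lambda r: (r[1], r[0]))
for row in diff:
    print(row)
```

Set operators treat two NULLs as equal, so row 3 does not show up as a difference. That is why this approach needs no special handling for nullable fields, unlike the join-based query.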