Insert into table containing a Repeated Record in BigQuery - sql

I have a table in BigQuery with a very complex schema (up to 2300 columns).
Among these columns I have RECORD type fields, some of them in REPEATED mode.
The insert statement is generated by a processor in the code,
but when I test this insert statement in the BigQuery web UI I see an error.
After investigating the issue, I found that the array is not being inserted in the appropriate way.
INSERT INTO Table_X (RECORD_FIELD) VALUES (
...
STRUCT([STRUCT(X), STRUCT(Y)]) as property_z
...
Is this format correct for inserting REPEATED fields?
INSERT INTO TABLE_NAME (columns) VALUES (STRUCT([ STRUCT(...), STRUCT(...) ]), ...)

Repeated fields are arrays, so you want to insert them as arrays:
INSERT INTO TABLE_NAME (repeated_column)
VALUES (ARRAY[ STRUCT(...), STRUCT(...) ]);
Note that the array is a single column; you can include values for other columns in the INSERT as well.
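For example, here is a minimal sketch for a hypothetical table whose repeated RECORD column property_z has two fields x and y (the field names and values are assumptions, not taken from the question):
-- property_z is ARRAY<STRUCT<x INT64, y STRING>> in this sketch;
-- [ ... ] is an array literal, equivalent to the ARRAY[ ... ] form above.
INSERT INTO Table_X (property_z)
VALUES (
  [STRUCT(1 AS x, 'a' AS y),
   STRUCT(2 AS x, 'b' AS y)]
);
The extra STRUCT(...) wrapper in the question is what breaks: for a REPEATED field the value itself is the array of STRUCTs, not a STRUCT containing an array.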

Related

Inserting Values Into Table with Identity Column via Databricks

I've created a table in Databricks that is mapped to a table hosted in an Azure SQL DB. I'm trying to do a very simple insert statement on a small table, but an identity column is giving me issues. This table has the aforementioned identity column and three additional columns.
I first tried something similar to below:
%sql
INSERT INTO tableName (col2, col3, col4)
VALUES (1, 'Test Value', '2018-11-16')
That was giving me a syntax error, so I did some searching and learned that Hive SQL doesn't allow you to specify columns for an INSERT statement. So then I tried something like below as a test:
%sql
INSERT INTO tableName
VALUES (100, 1, 'Test Value', '2018-11-16')
That gives me an error message that I can't insert explicit values into an identity column, but that's what I expected to happen.
If I can't specify the columns for my INSERT statement, how do I avoid issues when I have an identity column? I just want to insert values for the non-identity columns, and I want the ID column to continue incrementing like normal. The above example is extremely watered-down. I will need to do much larger insertions based on SELECT statements eventually, so any solution involving toggling on IDENTITY_INSERT probably isn't feasible.
Below is how we can create a table with an identity column -
CREATE TABLE table_name
(column_name1 data_type GENERATED ALWAYS AS IDENTITY,
column_name2......)
Below are two ways we can insert data into a table with an identity column -
First way -
INSERT INTO T2 (CHARCOL2)
SELECT CHARCOL1 FROM T1;
Second way -
INSERT INTO T2 (CHARCOL2,IDENTCOL2) OVERRIDING USER VALUE
SELECT * FROM T1;
Links for reference-
Create table - https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-table-using.html
Insert into table - https://www.ibm.com/docs/en/db2-for-zos/11?topic=statement-rules-inserting-data-into-identity-column
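As a concrete sketch of the first approach in Databricks SQL (the table and column names are assumptions, and this assumes a runtime that supports identity columns and column lists in INSERT):
CREATE TABLE events (
  id BIGINT GENERATED ALWAYS AS IDENTITY,
  col2 INT,
  col3 STRING,
  col4 DATE
);

-- list only the non-identity columns; id keeps incrementing on its own
INSERT INTO events (col2, col3, col4)
VALUES (1, 'Test Value', DATE'2018-11-16');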

inserting data via SQL script

I have a CSV file which consists of one column. I want to insert (not import) all the elements in that column into a database table. I know that if I wanted to insert a few elements, I could use the statement below to insert them individually.
INSERT INTO table(column_name )
VALUES (element1);
But is there a way to insert all the elements at once?
You can just comma-separate the values, like below. I sometimes use Excel to do the formatting if you have a lot of values.
INSERT INTO table
(column_name)
VALUES
(element1), (element2)

insert data and avoid duplication by checking a specific column

I have a local db into which I'm trying to insert multiple rows of data, but I do not want duplicates. I do not have a second db that I'm inserting from; I have an SQL file. This is the structure of the db I'm inserting into:
(db) artists
(table) names -> ID | ArtistName | ArtistURL | Modified
I am trying to do this insertion:
INSERT names (ArtistName, Modified)
VALUES (name1, date),
(name2, date2),
...
(name40, date40)
The question is, how can I insert data and avoid duplication by checking a specific column to this list of data that I want inserted using SQL?
Duplicate what? Duplicate name? Duplicate row? I'll assume no dup ArtistName.
Have UNIQUE(ArtistName) (or PRIMARY KEY) on the table.
Use INSERT IGNORE instead of INSERT.
(No LEFT JOIN, etc)
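A minimal sketch of that approach in MySQL (the index name and the placeholder values are assumptions):
-- enforce uniqueness on ArtistName
ALTER TABLE names ADD UNIQUE KEY uq_artist (ArtistName);

-- rows whose ArtistName already exists are silently skipped
INSERT IGNORE INTO names (ArtistName, Modified)
VALUES ('name1', '2014-01-01'),
       ('name2', '2014-01-02');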
I ended up following the advice of @Hart CO a little bit by inserting all my values into a completely new table. Then I used this SQL statement:
SELECT ArtistName
FROM testing_table
WHERE NOT EXISTS
(SELECT ArtistName FROM names WHERE
names.ArtistName = testing_table.ArtistName)
This gave me all my artist names that were in my data and not in the name table.
I then exported to an sql file and adjusted the INSERT a little bit to insert into the names table with the corresponding data.
INSERT IGNORE INTO `names` (ArtistName) VALUES
*all my values from the exported data*
where (ArtistName) could be any of the columns returned from the export. For example,
(ArtistName, ArtistURL, Modified), as long as each row of values from the export has three values.
This is probably not the most efficient, but it worked for what I was trying to do.
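A single-statement variant of the same idea (a sketch, assuming the staging table testing_table has the same three columns as names):
INSERT INTO names (ArtistName, ArtistURL, Modified)
SELECT t.ArtistName, t.ArtistURL, t.Modified
FROM testing_table t
WHERE NOT EXISTS
  (SELECT 1 FROM names n WHERE n.ArtistName = t.ArtistName);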

Get value from insert statement sql

I have the following insert statement for about 1000 rows, and now I want to add one new column. The new column will have the value of Version + ID concatenated.
How can I get the values from the insert statement and add it to the new column?
Will I have to make a dynamic SQL?
INSERT INTO dbo.Table (Version,ID,Description) VALUES ('2002','1111','Desc')
INSERT INTO dbo.Table (Version,ID,Description) VALUES ('2002','1112','Desc')
Run the query to insert the data,
then modify the table by adding the new column,
and then run the following query:
UPDATE dbo.Table
SET NewColumn = Version + ID
I suggest you create a computed column with Version + ID. Depending on the usage, you can make this a persisted computed column, which will slow down writes but speed up reads.
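A minimal sketch of that suggestion in SQL Server syntax (NewColumn is an assumed name; brackets are needed because Table is a reserved word):
-- Version and ID are character columns in the question, so + concatenates them
ALTER TABLE dbo.[Table]
ADD NewColumn AS (Version + ID) PERSISTED;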
There are many ways to maintain such scripts; one is a regex find-and-replace over the INSERT statements:
Find what: (Description)\)([^\(]+)\('([^']+)',\s*'([^']+)',\s*(.+)\)
Replace with: $1, NewColumn\)$2\('$3', '$4', $5, '$3-$4'\)
This literally takes the values from each insert statement and stores them in the new column.

SQL How to insert null value

I want to insert data into one table from another table. For some of the columns I don't have data, so I want to set those columns to null, but I don't know how to do this.
This is the SQL:
INSERT INTO _21Appoint(
PCUCODE,PID,SEQ,
DATE_SERV,APDATE,
APTYPE,APDIAG,D_UPDATE,CID
) SELECT (
NULL,NULL,NULL,
treatment_date,appointment_date,
typeap_id,appointment_id,NULL,patient_id
) FROM cmu_treatment,cmu_appointment
WHERE cmu_treatment.treatment_id LIKE cmu_appointment.treatment_id;
Your insert is essentially correct. Just don't put the column list in parentheses:
INSERT INTO _21Appoint
(PCUCODE,PID,SEQ,DATE_SERV,APDATE,APTYPE,APDIAG,D_UPDATE,CID)
SELECT NULL,NULL,NULL,treatment_date,appointment_date,typeap_id,appointment_id,NULL,patient_id
FROM cmu_treatment,cmu_appointment
WHERE cmu_treatment.treatment_id LIKE cmu_appointment.treatment_id;
In Postgres (unlike other DBMSs), putting a column list in parentheses turns the result into a single "record" rather than individual columns. The SELECT therefore returns only one column instead of several, so it doesn't match the column list of the INSERT.
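To illustrate the difference (a sketch with hypothetical table and column names):
SELECT (col1, col2) FROM some_table;  -- one column of an anonymous record type
SELECT col1, col2 FROM some_table;    -- two separate columns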
Another option is to simply leave out the columns completely:
INSERT INTO _21Appoint
(DATE_SERV,APDATE,APTYPE,APDIAG,CID)
SELECT treatment_date,appointment_date,typeap_id,appointment_id,patient_id
FROM cmu_treatment,cmu_appointment
WHERE cmu_treatment.treatment_id LIKE cmu_appointment.treatment_id;