BigQuery Schema Update while copying data from other tables

I have table1, which has lots of nested columns, and table2, which has some additional columns that may also be nested. I'm using the Go client library.
Is there any way to update the schema while copying from one table to another?
Sample code:

dataset := client.Dataset("test")
copier := dataset.Table(table1).CopierFrom(dataset.Table(table2))
copier.WriteDisposition = bigquery.WriteAppend
copier.CreateDisposition = bigquery.CreateIfNeeded
job, err = copier.Run(ctx)
if err != nil {
    fmt.Println("error while run :", err)
}
status, err = job.Wait(ctx)
if err != nil {
    fmt.Println("error in wait :", err)
}
if err := status.Err(); err != nil {
    fmt.Println("error in status :", err)
}

Some background first:
I created two tables in the dataset test, as follows:

Table 1, schema: name (STRING), age (INTEGER)
    "Varun", 19
    "Raja", 27

Table 2, schema: pet_name (STRING), type (STRING)
    "jimmy", "dog"
    "ramesh", "cat"
Note that the two relations have different schemas.
Here I am copying the contents of table 2 into table 1. Setting bigquery.WriteAppend tells BigQuery to append the rows of table 2 to table 1.
test := client.Dataset("test")
copier := test.Table("1").CopierFrom(test.Table("2"))
copier.WriteDisposition = bigquery.WriteAppend
// Note: Run only starts the copy job; an error in the job itself
// would only surface via job.Wait and status.Err.
if _, err := copier.Run(ctx); err != nil {
    log.Fatalln(err)
}
query := client.Query("SELECT * FROM `test.1`;")
results, err := query.Read(ctx)
if err != nil {
    log.Fatalln(err)
}
for {
    row := make(map[string]bigquery.Value)
    err := results.Next(&row)
    if err == iterator.Done {
        return
    }
    if err != nil {
        log.Fatalln(err)
    }
    fmt.Println(row)
}
Nothing happens and the result is:
map[age:19 name:Varun]
map[name:Raja age:27]
Table 1, the destination is unchanged.
What if the source and destination had the same schema in the copy?
For example:

copier := test.Table("1").CopierFrom(test.Table("1"))

Then the copy succeeds! And table 1 has twice the rows it initially had.
map[name:Varun age:19]
map[age:27 name:Raja]
map[name:Varun age:19]
map[name:Raja age:27]
But what if we somehow wanted to combine tables even with different schemas?
Well, first you need a GCP billing account, as you are technically doing data manipulation (DML). You can get $300 of free credit.
Then the following will work:
query := client.Query("SELECT * FROM `test.2`;")
query.SchemaUpdateOptions = []string{"ALLOW_FIELD_ADDITION", "ALLOW_FIELD_RELAXATION"}
query.CreateDisposition = bigquery.CreateIfNeeded
query.WriteDisposition = bigquery.WriteAppend
query.QueryConfig.Dst = client.Dataset("test").Table("1")
results, err := query.Read(ctx)
And the result is
map[pet_name:<nil> type:<nil> name:Varun age:19]
map[name:Raja age:27 pet_name:<nil> type:<nil>]
map[pet_name:ramesh type:cat name:<nil> age:<nil>]
map[pet_name:jimmy type:dog name:<nil> age:<nil>]
EDIT
Instead of query.Read() you can use query.Run() if you just want to run the query and not fetch results back, as shown below:

if _, err := query.Run(ctx); err != nil {
    log.Fatalln(err)
}
Important things to note:
- We have set query.SchemaUpdateOptions to include ALLOW_FIELD_ADDITION, which allows the resulting table to have columns not originally present.
- We have set query.WriteDisposition to bigquery.WriteAppend so that data is appended.
- We have set query.QueryConfig.Dst to client.Dataset("test").Table("1"), which means the result of the query will be written to table 1.
- Columns that exist in only one of the two tables are NULL in the result, which surfaces as nil in Go.
This hack will give you the same results as combining two tables.
Hope this helps.
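For reference, here is a minimal end-to-end sketch of the query-based approach above, assembled into one runnable program. The project ID "my-project" is a placeholder, and both tables are assumed to already exist:

package main

import (
    "context"
    "fmt"
    "log"

    "cloud.google.com/go/bigquery"
    "google.golang.org/api/iterator"
)

func main() {
    ctx := context.Background()
    // "my-project" is a placeholder; use your own project ID.
    client, err := bigquery.NewClient(ctx, "my-project")
    if err != nil {
        log.Fatalln(err)
    }
    defer client.Close()

    // Append everything from test.2 into test.1, letting BigQuery
    // add and relax fields in the destination schema as needed.
    query := client.Query("SELECT * FROM `test.2`;")
    query.SchemaUpdateOptions = []string{"ALLOW_FIELD_ADDITION", "ALLOW_FIELD_RELAXATION"}
    query.CreateDisposition = bigquery.CreateIfNeeded
    query.WriteDisposition = bigquery.WriteAppend
    query.QueryConfig.Dst = client.Dataset("test").Table("1")

    job, err := query.Run(ctx)
    if err != nil {
        log.Fatalln(err)
    }
    status, err := job.Wait(ctx)
    if err != nil {
        log.Fatalln(err)
    }
    if err := status.Err(); err != nil {
        log.Fatalln(err)
    }

    // Read back the combined table.
    it, err := client.Query("SELECT * FROM `test.1`;").Read(ctx)
    if err != nil {
        log.Fatalln(err)
    }
    for {
        row := make(map[string]bigquery.Value)
        err := it.Next(&row)
        if err == iterator.Done {
            break
        }
        if err != nil {
            log.Fatalln(err)
        }
        fmt.Println(row)
    }
}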

Related

Submitting an SQL query with a slice parameter

I have a Snowflake query where I'm trying to update a field on all items where another field is in a list which is submitted to the query as a variable:
UPDATE my_table SET download_enabled = ? WHERE provider_id = ? AND symbol IN (?)
I've tried doing this query using the gosnowflake.Array function like this:

enable := true
provider := 1
// assets is a []string of symbols
query := "UPDATE my_table SET download_enabled = ? WHERE provider_id = ? AND symbol IN (?)"
if _, err := client.db.ExecContext(ctx, query, enable, provider,
    gosnowflake.Array(assets)); err != nil {
    fmt.Printf("Error: %v", err)
}
However, this code fails with the following error:
002099 (42601): SQL compilation error: Batch size of 1 for bind variable 1 not the same as previous size of 2.
So then, how can I submit a variable representing a list of values to an SQL query?
I found a potential workaround, which is to submit each item in the list as a separate parameter explicitly:

func Delimit(s string, sep string, count uint) string {
    return strings.Repeat(s+sep, int(count)-1) + s
}

func doQuery(enable bool, provider int, assets ...string) error {
    query := fmt.Sprintf("UPDATE my_table SET download_enabled = ? "+
        "WHERE provider_id = ? AND symbol IN (%s)", Delimit("?", ", ", uint(len(assets))))
    params := []interface{}{enable, provider}
    for _, asset := range assets {
        params = append(params, asset)
    }
    if _, err := client.db.ExecContext(ctx, query, params...); err != nil {
        return err
    }
    return nil
}
Needless to say, this is a less elegant solution than what I wanted, but it does work.
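For illustration, a hypothetical call like the one below (the symbols are made up) expands the placeholder list before the statement reaches Snowflake:

// Expands the query to:
// UPDATE my_table SET download_enabled = ? WHERE provider_id = ? AND symbol IN (?, ?, ?)
if err := doQuery(true, 1, "AAPL", "MSFT", "GOOG"); err != nil {
    log.Fatalln(err)
}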

How can i unlock the Database in Go

I'm a newbie in Go and not the best at SQL.
I have a simple table in my database with the name users. I store the SAM, first name and last name in the table. When I now try to change something in the database, I get the error database is locked. That's my code:
func createNewUser(w http.ResponseWriter, r *http.Request) {
    var user User
    err := decodeJSONBody(w, r, &user)
    if checkError(w, err) {
        return
    }
    rows, err := mainDB.Query("SELECT * FROM users WHERE SAM = ?", user.Sam)
    if checkError(w, err) {
        return
    }
    defer rows.Close()
    if rows.Next() {
        http.Error(w, "User already exists", http.StatusConflict)
        return
    }
    _, err = mainDB.Exec("INSERT INTO users (SAM, Vorname, Nachname) VALUES (?, ?, ?)", user.Sam, user.Vorname, user.Nachname)
    if checkError(w, err) {
        return
    }
    json.NewEncoder(w).Encode(user)
}
decodeJSONBody and checkError work and have nothing to do with the database.
And as far as I've learned, rows.Close should close the result set so that I can write something back in.
As per the comments, SQLite has some limitations around locking/concurrency, which means you need to take care when running multiple statements concurrently. Unfortunately I had not reviewed your code in detail when posting my comment, so, despite seemingly solving the issue, it was in error.
You had added a defer rows.Close(); this will free up the database connection used to run the query but, due to the defer, this will only happen when the surrounding function returns. Normally this is not a big issue because looping through a result set in its entirety automatically closes the rows. The documentation states:
If Next is called and returns false and there are no further result sets, the Rows are closed automatically and it will suffice to check the result of Err.
In your code you do return if rows.Next() is true:

if rows.Next() {
    http.Error(w, "User already exists", http.StatusConflict)
    return
}

This means that adding an extra rows.Close() should not be needed. However, as you say you "added rows.Close() multiple times, and now it works", I suspect that your full code may have been a bit more complicated than what is presented (and one of the added rows.Close() calls was needed).
So adding extra calls to rows.Close() should not be needed; it will not cause an issue (other than an unnecessary function call). However, you should check for errors:

rows, err := mainDB.Query("SELECT * FROM users WHERE SAM = ?", user.Sam)
if checkError(w, err) {
    return // rows is nil when Query returns an error, so there is nothing to close
}
defer rows.Close()
if rows.Next() {
    http.Error(w, "User already exists", http.StatusConflict)
    return
}
if err = rows.Err(); err != nil {
    return // it's worth checking for an error here
}
Note that the FAQ for go-sqlite3 includes information on dealing with "Error: database is locked" (and it's worth ensuring you follow the recommendations).
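For example, two of the commonly recommended mitigations from that FAQ look like this as a sketch (the DSN file name is a placeholder):

// Limit the pool to a single connection so statements are
// serialized rather than contending for SQLite's write lock.
mainDB.SetMaxOpenConns(1)

// Or add a busy timeout to the DSN so a locked database is
// retried for up to 5s instead of failing immediately:
// mainDB, err := sql.Open("sqlite3", "file:app.db?_busy_timeout=5000")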
Note2: Consider using EXISTS instead of running the query and then attempting to fetch a row - it is likely to be faster and allows you to use QueryRow which simplifies your code.
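A minimal sketch of that suggestion, reusing mainDB, user and checkError from the question's code:

var exists bool
// EXISTS returns a single boolean row, so QueryRow is enough.
err := mainDB.QueryRow("SELECT EXISTS(SELECT 1 FROM users WHERE SAM = ?)", user.Sam).Scan(&exists)
if checkError(w, err) {
    return
}
if exists {
    http.Error(w, "User already exists", http.StatusConflict)
    return
}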

Change more than one row in postgresql

I need to change rows in my DB using two arrays: the first stores the names of the rows I need to change, and the second stores the values. I've added the code to show what I want to do. Can I do it with one request to my DB?
func update_1() {
    key := []string{"Name1", "Name2", "Name4"}
    val := []string{"1", "2", "4"}
    for i := range key {
        _, err := db.Exec("UPDATE table SET val = $1 WHERE name = $2", val[i], key[i])
        if err != nil {
            errorLog.Println(err)
            return
        }
    }
}
You can pass the arrays into a Postgres query as parameters. Then it is a simple unnest() and update:

update t
set val = u.val
from unnest(:ar_names, :ar_vals) u(name, val)
where t.name = u.name;
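In Go this might look like the sketch below, assuming the github.com/lib/pq driver (pq.Array adapts a slice to a Postgres array parameter) and positional $1/$2 placeholders in place of the named ones above:

// key and val are the []string slices from the question.
_, err := db.Exec(`
    UPDATE t
    SET val = u.val
    FROM unnest($1::text[], $2::text[]) AS u(name, val)
    WHERE t.name = u.name`,
    pq.Array(key), pq.Array(val))
if err != nil {
    errorLog.Println(err)
}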

sql: expected 0 arguments, got 2

I'm struggling to properly utilize sqlx and the pq driver for Postgres to create a row in the database. Let's start simple:
I have a user, role and user_role table. I want to insert a role into the database and get the ID of the inserted row. This works flawlessly using the following sql:
const createRoleSQL = "INSERT INTO role (name) VALUES (:name) RETURNING id"
To make that work in Go, I prepare the statement at some point:
createStmt, err := db.PrepareNamed(createRoleSQL)
if err != nil {
    // ...
}
When creating, I run the query as part of a transaction tx. role is obviously a struct with the correct fields and db tags:
if err := tx.NamedStmt(createStmt).QueryRow(role).Scan(&role.ID); err != nil {
    // ...
}
This works perfectly fine.
Now I wanted to extend that and insert a new role and assign it to a user:
const createUserRoleSQL = `
DO $$
DECLARE role_id role.id%TYPE;
BEGIN
    INSERT INTO role (name) VALUES ($2) RETURNING id INTO role_id;
    INSERT INTO user_role (user_id, role_id) VALUES ($1, role_id);
END $$`

createStmt, err := db.Preparex(createUserRoleSQL)
if err != nil {
    // ...
}

if err := tx.Stmtx(createStmt).QueryRow(userID, role.Name).Scan(&role.ID); err != nil {
    // ...
}
Unfortunately this fails with sql: expected 0 arguments, got 2. Is it possible to achieve what I want with a single query?
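No answer is recorded here, but as a hedged sketch: a PostgreSQL DO block is an anonymous code block that accepts no parameters and returns no rows, which is consistent with the expected 0 arguments error. One common single-statement alternative, assuming PostgreSQL 9.1+ data-modifying CTEs, chains the two inserts and still returns the new id:

// Sketch only: the WITH clause performs both inserts in one statement.
const createUserRoleSQL = `
WITH new_role AS (
    INSERT INTO role (name) VALUES ($2) RETURNING id
), assignment AS (
    INSERT INTO user_role (user_id, role_id)
    SELECT $1, id FROM new_role
)
SELECT id FROM new_role`

// Prepared and scanned exactly like the single-insert case:
// tx.Stmtx(createStmt).QueryRow(userID, role.Name).Scan(&role.ID)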

How to get last inserted ID with GO-MSSQLDB driver?

I gathered that SQL Server does not return the last inserted id automatically, and that I need to fetch it manually with OUTPUT INSERTED.ID in the SQL insert statement.
How do I pick it up later in Go code?
The function in question is:
func (sta *state) mkLogEntry(from time.Time, to time.Time, man bool) (id int64) {
    qry := "INSERT INTO ROMEExportLog(FromDate,ToDate,ExecutedAt,ExecutedManually,ExportWasSuccessful,UpdatedDaysIrregular) " +
        "OUTPUT INSERTED.ID " +
        "VALUES(@FromDate,@ToDate,@ExecutedAt,@ExecutedManually,@ExportWasSuccessful,@UpdatedDaysIrregular)"
    res, err := sta.db.Exec(qry,
        sql.Named("FromDate", from),
        sql.Named("ToDate", to),
        sql.Named("ExecutedAt", time.Now()),
        sql.Named("ExecutedManually", man),
        sql.Named("ExportWasSuccessful", false),
        sql.Named("UpdatedDaysIrregular", false),
    )
    if err != nil {
        log.Fatal(err)
    }
    id, err = res.LastInsertId()
    if err != nil {
        log.Fatal(err)
    }
    return
}
The res.LastInsertId() call returns the error There is no generated identity value.
FOR SQL SERVER:
This may be gimmicky, but I found that using QueryRow will run multiple queries/commands and return just the LAST row:

var id int64
if err := db.QueryRow(`
    INSERT INTO [TABLE] ([COLUMN]) VALUES (?);
    SELECT SCOPE_IDENTITY()`,
    colValue,
).Scan(&id); err != nil {
    panic(err)
}
As long as SELECT SCOPE_IDENTITY() is the last row returned then this essentially does what I would have expected result.LastInsertId() to do.
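Applied to mkLogEntry from the question, a minimal sketch (reusing the question's qry and parameters) swaps Exec/LastInsertId for QueryRow/Scan, since OUTPUT INSERTED.ID comes back as a result set rather than through LastInsertId:

var id int64
err := sta.db.QueryRow(qry,
    sql.Named("FromDate", from),
    sql.Named("ToDate", to),
    sql.Named("ExecutedAt", time.Now()),
    sql.Named("ExecutedManually", man),
    sql.Named("ExportWasSuccessful", false),
    sql.Named("UpdatedDaysIrregular", false),
).Scan(&id) // scans the single row produced by OUTPUT INSERTED.ID
if err != nil {
    log.Fatal(err)
}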
The reason for this is that PostgreSQL does not return you the last inserted id; it is available only if you create a new row in a table that uses a sequence.
If you actually insert a row into a table with a sequence assigned, you have to use the RETURNING clause, something like this: INSERT INTO table (name) VALUES('val') RETURNING id.
I am not sure about your driver, but in pq you would do this in the following way:
lastInsertId := 0
err = db.QueryRow("INSERT INTO brands (name) VALUES($1) RETURNING id", name).Scan(&lastInsertId)