Bulk insert: copying a SQL table with Go

For context, I'm new to Go and I'm writing a program that can copy tables from Oracle to MySQL.
I use Go's database/sql package, so I assume it can be used for migrating any kind of database.
To simplify my question, I'm copying a table on the same MySQL database, from world.city to world.city_copy2.
With the following code, I end up with the last row's values repeated in every row of the target table :-(
Do I somehow need to read the values out inside the loop? What is the efficient way to do that?
package main

import (
    "database/sql"
    "fmt"
    "strings"

    _ "github.com/go-sql-driver/mysql"
)

const (
    user   = "user"
    pass   = "testPass"
    server = "localhost"
)

func main() {
    fmt.Print("test")
    conStr := fmt.Sprintf("%s:%s@tcp(%s)/world", user, pass, server)
    db, err := sql.Open("mysql", conStr)
    if err != nil {
        panic(err.Error())
    }
    defer db.Close()

    err = db.Ping()
    if err != nil {
        panic(err.Error())
    }

    rows, err := db.Query("SELECT * FROM city")
    if err != nil {
        panic(err.Error()) // proper error handling instead of panic in your app
    }
    defer rows.Close()

    columns, err := rows.Columns()
    if err != nil {
        panic(err.Error()) // proper error handling instead of panic in your app
    }

    // Make a slice for the values
    values := make([]sql.RawBytes, len(columns))

    // rows.Scan wants '[]interface{}' as an argument, so we must copy the
    // references into such a slice
    scanArgs := make([]interface{}, len(values))
    for i := range values {
        scanArgs[i] = &values[i]
    }

    // this string should be generated according to the number of columns
    placeHolders := "( ?, ?, ?, ?, ? )"

    // slice will contain all the values at the end
    bulkValues := []interface{}{}
    valueStrings := make([]string, 0)

    for rows.Next() {
        // get RawBytes from data
        err = rows.Scan(scanArgs...)
        if err != nil {
            panic(err.Error()) // proper error handling instead of panic in your app
        }
        valueStrings = append(valueStrings, placeHolders)
        bulkValues = append(bulkValues, scanArgs...)
    }

    stmStr := fmt.Sprintf("INSERT INTO city_copy2 VALUES %s", strings.Join(valueStrings, ","))
    _, err = db.Exec(stmStr, bulkValues...)
    if err != nil {
        panic(err.Error())
    }
}

I have checked the driver's docs, and the problem here is that bulkValues stores the pointers from scanArgs, not copies of each row's values. Every row is scanned into the same values slice, so after the loop every set of insert arguments points at the last row that was read.
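The effect is easy to reproduce without a database; here is a minimal sketch of the same mistake:

package main

import "fmt"

func main() {
    buf := make([]int, 1)
    args := []interface{}{&buf[0]} // one pointer, reused for every "row"

    collected := []interface{}{}
    for _, v := range []int{1, 2, 3} {
        buf[0] = v                             // "scan" the next row into the same buffer
        collected = append(collected, args...) // appends the pointer, not the value
    }

    for _, p := range collected {
        fmt.Println(*p.(*int)) // prints 3 three times: every entry aliases the same int
    }
}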
You need to use the values variable to copy each row's data out, like below:
func main() {
    fmt.Print("test")
    conStr := fmt.Sprintf("%s:%s@tcp(%s)/world", user, pass, server)
    db, err := sql.Open("mysql", conStr)
    if err != nil {
        panic(err.Error())
    }
    defer db.Close()

    err = db.Ping()
    if err != nil {
        panic(err.Error())
    }

    rows, err := db.Query("SELECT * FROM city")
    if err != nil {
        panic(err.Error()) // proper error handling instead of panic in your app
    }
    defer rows.Close()

    columns, err := rows.Columns()
    if err != nil {
        panic(err.Error()) // proper error handling instead of panic in your app
    }

    // Make a slice for the values
    values := make([]sql.RawBytes, len(columns))

    // rows.Scan wants '[]interface{}' as an argument, so we must copy the
    // references into such a slice
    scanArgs := make([]interface{}, len(values))
    for i := range values {
        scanArgs[i] = &values[i]
    }

    // this string should be generated according to the number of columns
    placeHolders := "( ?, ?, ?, ?, ? )"

    // slice will contain all the values at the end
    bulkValues := []interface{}{}
    valueStrings := make([]string, 0)

    // make a slice to hold a copy of each record's values
    record := make([]interface{}, len(columns))

    for rows.Next() {
        // get RawBytes from data
        err = rows.Scan(scanArgs...)
        if err != nil {
            panic(err.Error()) // proper error handling instead of panic in your app
        }
        valueStrings = append(valueStrings, placeHolders)
        for i, col := range values {
            // you need to be careful with the data types here;
            // converting to string copies this row's RawBytes —
            // check the docs for details
            record[i] = string(col)
        }
        bulkValues = append(bulkValues, record...)
    }

    stmStr := fmt.Sprintf("INSERT INTO city_copy2 VALUES %s", strings.Join(valueStrings, ","))
    _, err = db.Exec(stmStr, bulkValues...)
    if err != nil {
        panic(err.Error())
    }
}
You can also find a similar example in the package documentation here.
Note: there might be more efficient ways to copy a database from Oracle to MySQL, but this answer only gives a quick solution for the particular issue you are having.
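As a side note, the hardcoded placeHolders string in both snippets only fits a five-column table. A small sketch (a hypothetical helper, not part of the original answer) that derives the placeholder group from the column count returned by rows.Columns():

// buildPlaceholders returns a "(?, ?, ...)" group sized to the column count.
func buildPlaceholders(n int) string {
    marks := make([]string, n)
    for i := range marks {
        marks[i] = "?"
    }
    return "(" + strings.Join(marks, ", ") + ")"
}

// usage: placeHolders := buildPlaceholders(len(columns))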

Related

How can I achieve faster MariaDB inserts

I am dealing with a bit over 15 billion rows of data in various text files. I am trying to insert them into MariaDB using Go. Go is a fast language and is often used for big data, but I cannot get more than 10k-15k inserts a second; at this rate it's going to take over 15 days, and I need this data imported sooner than that. I have tried various batch sizes, but they all give about the same results.
The function I'm using to handle file data:
func handlePath(path string) {
    file, err := os.Open(path)
    if err != nil {
        fmt.Printf("error opening %v: %v", path, err)
        return
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    var temp_lines []string
    for scanner.Scan() {
        if len(temp_lines) == line_batch {
            insertRows(temp_lines)
            temp_lines = []string{}
        }
        temp_lines = append(temp_lines, scanner.Text())
    }
    insertRows(temp_lines)
    fmt.Printf("\nFormatted %v\n", path)

    if err := scanner.Err(); err != nil {
        fmt.Printf("\nScanner error %v\n", err)
        return
    }
}
The function I'm using for inserting:
func insertRows(rows []string) {
    var Args []string
    for _, row := range rows {
        line_split := strings.Split(row, "|")
        if len(line_split) != 6 {
            continue // skip malformed lines instead of dropping the whole batch
        }
        database_id := line_split[0]
        email := line_split[1]
        password := line_split[2]
        username := line_split[3]
        ip := line_split[4]
        phone := line_split[5]
        arg := fmt.Sprintf("('%v','%v','%v','%v','%v','%v')", database_id, email, password, username, ip, phone)
        Args = append(Args, arg)
    }
    sqlQuery := fmt.Sprintf("INSERT INTO new_table (database_id, email, password, username, ip, phone_number) VALUES %s", strings.Join(Args, ","))
    _, err := db.Exec(sqlQuery)
    if err != nil {
        //fmt.Printf("%v\n", err)
        return
    }
    total += line_batch
    writes++
}
Server specs: (screenshot omitted)
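One lever worth trying, sketched below against the question's table schema (not benchmarked on this workload): commit several multi-row batches per explicit transaction instead of letting autocommit pay a commit (and its fsync) for every statement, and run a few such workers concurrently. For raw bulk loading, MariaDB's LOAD DATA LOCAL INFILE is usually faster than any INSERT strategy.

// insertBatches executes several multi-row INSERT statements inside one
// transaction, so the commit cost is paid once per group of batches
// rather than once per statement. Assumes the same db handle, imports,
// and pre-formatted value tuples as the question's insertRows.
func insertBatches(db *sql.DB, batches [][]string) error {
    tx, err := db.Begin()
    if err != nil {
        return err
    }
    for _, args := range batches {
        query := fmt.Sprintf("INSERT INTO new_table (database_id, email, password, username, ip, phone_number) VALUES %s",
            strings.Join(args, ","))
        if _, err := tx.Exec(query); err != nil {
            tx.Rollback()
            return err
        }
    }
    return tx.Commit()
}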

SQL Next not advancing cursor

I have a function that I use to iterate over a result set from a query:
func readRows(rows *sql.Rows, translator func(*sql.Rows) error) error {
    defer rows.Close()

    // Iterate over each row in the rows and scan each; if an error occurs then return
    for shouldScan := rows.Next(); shouldScan; {
        if err := translator(rows); err != nil {
            return err
        }
    }

    // Check if the rows had an error; if they did then return them. Otherwise,
    // close the rows and return an error if the close function fails
    if err := rows.Err(); err != nil {
        return err
    }
    return nil
}
The translator function is primarily responsible for calling Scan on the *sql.Rows object. An example of this is:
readRows(rows, func(scanner *sql.Rows) error {
    var entry gopb.TestObject

    // Embed the variables into a list that we can use to pull information out of the rows
    scanned := []interface{}{...}
    if err := scanner.Scan(scanned...); err != nil {
        return err
    }

    entries = append(entries, &entry)
    return nil
})
I wrote a unit test for this code:
// Create the SQL mock and the RDS requester
db, mock, _ := sqlmock.New()
requester := Requester{conn: db}
defer db.Close()

// Create the rows we'll use for testing the query
rows := sqlmock.NewRows([]string{"id", "data"}).
    AddRow(0, "data")

// Verify the command order for the transaction
mock.ExpectBegin()
mock.ExpectQuery(regexp.QuoteMeta("SELECT `id`, `data`, FROM `data`")).WillReturnRows(rows)
mock.ExpectRollback()

// Attempt to get the data
data, err := requester.GetData(context.TODO())
However, it appears that the loop runs forever. I'm not sure if this is a sqlmock issue or an issue with my code. Any help would be appreciated.
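For what it's worth, the loop header is the likely culprit rather than sqlmock: rows.Next() appears only in the init statement, so it is evaluated exactly once, shouldScan never changes, and translator is invoked forever on the same row. A minimal corrected form:

// rows.Next must be evaluated on every iteration to advance the cursor;
// in the init-only form above it runs exactly once.
for rows.Next() {
    if err := translator(rows); err != nil {
        return err
    }
}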

How can I tell the PATCH method which fields I want to update

I'm working on a simple REST API and I'm having trouble with the PATCH method. I don't know how I can tell the method and the query which fields I want to update in the database (for example, which fields were passed in the JSON). Here is what I have so far.
func PatchServer(c echo.Context) error {
    patchedServer := new(structs.Server)
    requestID := c.Param("id")
    if err := c.Bind(patchedServer); err != nil {
        return err
    }

    sql := "UPDATE servers SET server_name = CASE WHEN ? IS NOT NULL THEN ? END WHERE id = ?"
    stmt, err := db.Get().Prepare(sql)
    if err != nil {
        panic(err)
    }
    defer stmt.Close() // avoid leaking the prepared statement

    _, err2 := stmt.Exec(patchedServer.Name, patchedServer.Name, requestID)
    if err2 != nil {
        panic(err2)
    }

    fmt.Println(patchedServer.ID, patchedServer.Name, patchedServer.Components)
    fmt.Println("Requested id: ", requestID)
    return c.JSON(http.StatusOK, "Patched!")
}
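One common approach is to bind the PATCH body into a struct of pointer fields, so an absent JSON key stays nil and the SET clause is built only from fields that were actually sent. A sketch (ServerPatch and buildUpdate are hypothetical names, not part of the question's structs package):

// ServerPatch uses pointer fields so absent JSON keys remain nil after binding.
type ServerPatch struct {
    Name *string `json:"server_name"`
    // add more optional fields here
}

// buildUpdate assembles an UPDATE from only the fields that were provided.
func buildUpdate(p ServerPatch, id string) (string, []interface{}) {
    sets := []string{}
    args := []interface{}{}
    if p.Name != nil {
        sets = append(sets, "server_name = ?")
        args = append(args, *p.Name)
    }
    query := "UPDATE servers SET " + strings.Join(sets, ", ") + " WHERE id = ?"
    args = append(args, id)
    return query, args
}

The caller should still reject a patch where no recognized field was set (len(sets) == 0), since the generated SQL would be invalid in that case.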

golang sql pointer values keep repeating themselves

Below is a Go function that is called with an input channel:
func getOptions(inChannel <-chan Param) <-chan ParamOptions {
    paramOptions := make(chan ParamOptions)
    go func() {
        defer close(paramOptions)
        var wg sync.WaitGroup

        conn, err := sql.Open("mssql", wellConnStr)
        if err != nil {
            log.Fatal("open connection failed:", err.Error())
        }
        defer conn.Close()

        getParamOptions := func(db *sql.DB, param *Param) {
            defer wg.Done()
            fmt.Println("querying options for ", param.Code, param.Name)
            rows, err := db.Query(`select *
                from dbo.ParamOptions where code=? and name=?`, &param.Code, &param.Name)
            fmt.Println("results for ", param.Name, param.Code)
            if err != nil {
                log.Fatal("query failed:", err.Error())
            }
            defer rows.Close()
            found := false
            ...
                paramOptions <- ParamOptions...
                break
            }
            if found == false {
                fmt.Println("did not find options for ", param.Code, param.Name)
            }
        }

        for paramInChannel := range inChannel {
            wg.Add(1)
            fmt.Println("retrieving inputs for ", paramInChannel.Code, paramInChannel.Name)
            go getParamOptions(conn, &paramInChannel)
        }
        wg.Wait()
    }()
    return paramOptions
}
If I remove the go keyword before the call to getParamOptions, it works without any problems. However, if I use go, the last Code and Name keep repeating inside getParamOptions: even though the options retrieved seem to belong to the correct Param, the Code and Name values are repeated.
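This looks like the classic loop-variable capture problem (a hedged read, since part of the code is elided): every iteration passes &paramInChannel, a pointer to the single loop variable, so goroutines that start late read whatever value was written to it last. Before Go 1.22, the usual fix is a per-iteration copy:

for paramInChannel := range inChannel {
    wg.Add(1)
    param := paramInChannel // per-iteration copy; its address is stable
    go getParamOptions(conn, &param)
}

From Go 1.22 onward each loop iteration gets its own variable, so the original form becomes safe.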

Go: communicate with a shell process

I want to execute a shell script from Go.
The shell script takes standard input and echoes the result.
I want to supply this input from Go and use the result.
What I am doing is:
cmd := exec.Command("python", "add.py")
in, _ := cmd.StdinPipe()
But how do I write to in, and how do I read the result back?
Here is some code writing to a process, and reading from it:
package main

import (
    "bufio"
    "fmt"
    "os/exec"
)

func main() {
    // What we want to calculate
    calcs := make([]string, 2)
    calcs[0] = "3*3"
    calcs[1] = "6+6"

    // To store the results
    results := make([]string, 2)

    cmd := exec.Command("/usr/bin/bc")

    in, err := cmd.StdinPipe()
    if err != nil {
        panic(err)
    }
    defer in.Close()

    out, err := cmd.StdoutPipe()
    if err != nil {
        panic(err)
    }
    defer out.Close()

    // We want to read line by line
    bufOut := bufio.NewReader(out)

    // Start the process
    if err = cmd.Start(); err != nil {
        panic(err)
    }

    // Write the operations to the process
    for _, calc := range calcs {
        _, err := in.Write([]byte(calc + "\n"))
        if err != nil {
            panic(err)
        }
    }

    // Read the results from the process
    for i := 0; i < len(results); i++ {
        result, _, err := bufOut.ReadLine()
        if err != nil {
            panic(err)
        }
        results[i] = string(result)
    }

    // See what was calculated
    for _, result := range results {
        fmt.Println(result)
    }
}
You might want to read/write from/to the process in different goroutines.
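For example, a sketch of that pattern using the same cmd, in, calcs, bufOut, and results as above: the writer moves into its own goroutine, so a full pipe buffer can never deadlock against a reader that hasn't drained the output yet.

// Feed input concurrently with reading the output.
go func() {
    defer in.Close() // EOF tells the process no more input is coming
    for _, calc := range calcs {
        if _, err := in.Write([]byte(calc + "\n")); err != nil {
            return
        }
    }
}()

// Read the results in the main goroutine.
for i := 0; i < len(results); i++ {
    line, _, err := bufOut.ReadLine()
    if err != nil {
        panic(err)
    }
    results[i] = string(line)
}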