How I can use one parse (Scan) method for *sql.Rows and *sql.Row in Go?
Parse (Scan) methods use one code for parse one row
...
row := r.stmOne.QueryRow(id)
rows, err := r.stmOther.Query(ids, params)
parseRow(row, &item)
for rows.Next(){
parseRows(rows, &item)
}
...
func parseRows(row *sql.Rows, item *typeItem) error {
err := row.Scan(....) /// same
}
func parseRow(row *sql.Row, item *typeItem) error {
err := row.Scan(....) /// same
}
type RowScanner interface {
Scan(dest ...interface{}) error
}
func scanRowIntoItem(row RowScanner, item *typeItem) error {
err := row.Scan(...)
}
row := r.stmOne.QueryRow(id)
rows, err := r.stmOther.Query(ids, params)
scanRowIntoItem(row, &item)
for rows.Next(){
scanRowIntoItem(rows, &item)
}
Related
I am dealing with a bit over 15 billion rows of data in various text files. I am trying to insert them into MariaDB using golang. Golang is a fast language and is often used for big data but I cannot get more than 10k-15k inserts a second, at this rate its gonna take over 15 days, I need this data imported sooner than that. I have tried various batch sizes but they all give about the same results.
function I'm using to handle file data:
func handlePath(path string) {
file, err := os.Open(path)
if err != nil {
fmt.Printf("error opening %v: %v", path, err)
return
}
defer file.Close()
scanner := bufio.NewScanner(file)
var temp_lines []string
for scanner.Scan() {
if len(temp_lines) == line_batch {
insertRows(temp_lines)
temp_lines = []string{}
}
temp_lines = append(temp_lines, scanner.Text())
}
insertRows(temp_lines)
fmt.Printf("\nFormatted %v\n", path)
if err := scanner.Err(); err != nil {
fmt.Printf("\nScanner error %v\n", err)
return
}
}
function I'm using for inserting:
func insertRows(rows []string) {
var Args []string
for _, row := range rows {
line_split := strings.Split(row, "|")
if len(line_split) != 6 {return}
database_id := line_split[0]
email := line_split[1]
password := line_split[2]
username := line_split[3]
ip := line_split[4]
phone := line_split[5]
arg := fmt.Sprintf("('%v','%v','%v','%v','%v','%v')",database_id,email,password,username,ip,phone)
Args = append(Args, arg)
}
sqlQuery := fmt.Sprintf("INSERT INTO new_table (database_id, email, password, username, ip, phone_number) VALUES %s", strings.Join(Args, ","))
_, err := db.Exec(sqlQuery)
if err != nil {
//fmt.Printf("%v\n", err)
return
}
total+=line_batch
writes++
}
Server specs:
server
I have a function that I used to iterate over a result set from a query:
func readRows(rows *sql.Rows, translator func(*sql.Rows) error) error {
defer rows.Close()
// Iterate over each row in the rows and scan each; if an error occurs then return
for shouldScan := rows.Next(); shouldScan; {
if err := translator(rows); err != nil {
return err
}
}
// Check if the rows had an error; if they did then return them. Otherwise,
// close the rows and return an error if the close function fails
if err := rows.Err(); err != nil {
return err
}
return nil
}
The translator function is primarily responsible for calling Scan on the *sql.Rows object. An example of this is:
readRows(rows, func(scanner *sql.Rows) error {
var entry gopb.TestObject
// Embed the variables into a list that we can use to pull information out of the rows
scanned := []interface{}{...}
if err := scanner.Scan(scanned...); err != nil {
return err
}
entries = append(entries, &entry)
return nil
})
I wrote a unit test for this code:
// Create the SQL mock and the RDS reqeuster
db, mock, _ := sqlmock.New()
requester := Requester{conn: db}
defer db.Close()
// Create the rows we'll use for testing the query
rows := sqlmock.NewRows([]string{"id", "data"}).
AddRow(0, "data")
// Verify the command order for the transaction
mock.ExpectBegin()
mock.ExpectQuery(regexp.QuoteMeta("SELECT `id`, `data`, FROM `data`")).WillReturnRows(rows)
mock.ExpectRollback()
// Attempt to get the data
data, err := requester.GetData(context.TODO())
However, it appears that Next is being called infinitely. I'm not sure if this is an sqlmock issue or an issue with my code. Any help would be appreciated.
In my Golang application I make SQL request to the database. Usually, in the SQL query, I specify the columns that I want to get from the table and create a structure based on it. You can see an example of the working code below.
QUESTION:
What should I do if I don't know the number and name of columns in the table? For example, I make the SQL request like SELECT * from filters; instead of SELECT FILTER_ID, FILTER_NAME FROM filters;. How do I create a structure in this case?
var GetFilters = func(responseWriter http.ResponseWriter, request *http.Request) {
rows, err := database.ClickHouse.Query("SELECT * FROM filters;"); if err != nil {
fmt.Println(err)
return
}
defer rows.Close()
columns, err := rows.Columns(); if err != nil {
fmt.Println(err)
return
}
filters := make([]interface{}, len(columns))
for i, _ := range columns {
filters[i] = new(sql.RawBytes)
}
for rows.Next() {
if err = rows.Scan(filters...); err != nil {
fmt.Println(err)
return
}
}
utils.Response(responseWriter, http.StatusOK, filters)
}
Well, finally I found the solution. As you can see from the code below first I make SQL request where I do not specify the name of the columns. Then I take information about columns by ColumnTypes() function. This function returns column information such as column type, length and nullable. Next I will learn the name and type of columns, fill interface with these data:
for i, column := range columns {
object[column.Name()] = reflect.New(column.ScanType()).Interface()
values[i] = object[column.Name()]
}
The full code which I use looks like this:
var GetFilters = func(responseWriter http.ResponseWriter, request *http.Request) {
rows, err := database.ClickHouse.Query("SELECT * FROM table_name;"); if err != nil {
fmt.Println(err)
return
}
defer rows.Close()
var objects []map[string]interface{}
for rows.Next() {
columns, err := rows.ColumnTypes(); if err != nil {
fmt.Println(err)
return
}
values := make([]interface{}, len(columns))
object := map[string]interface{}{}
for i, column := range columns {
object[column.Name()] = reflect.New(column.ScanType()).Interface()
values[i] = object[column.Name()]
}
if err = rows.Scan(values...); err != nil {
fmt.Println(err)
return
}
objects = append(objects, object)
}
utils.Response(responseWriter, http.StatusOK, objects)
}
Use the USER_TAB_COLUMNS table to get the list of columns in the executing table query store it an array or collection. later execute the query and Scan the columns that you already know from the previous Query.
I have searched for a solution on the net and in SO, but found nothing that apply to return values. It is a simple sql query with several rows that I want to return. Error handling not included:
func Fetch(query string) (string) {
type User struct{
id string
name string
}
rows, err := db.Query(query)
users := make([]*User, 0)
for rows.Next() {
user := new(User)
err := rows.Scan(&user.id, &user.name)
users = append(users, user)
}
return(users)
}
I get this error when compiling:
cannot use users (type []*User) as type string in return argument
How should I do to get a correct return value?
The expected input is
JD John Doe --OR-- {id:"JD",name:"John Doe"}
Add this to your code:
type userSlice []*User
func (us userSlice) String() string{
var s []string
for _, u := range us {
if u != nil {
s = append(s, fmt.Sprintf("%s %s", u.id, u.name))
}
}
return strings.Join(s, "\n")
}
type User struct{
id string
name string
}
In your Fetch function replace the last return statement like this:
func Fetch(query string) (string) {
// Note that we declare the User type outside the function.
rows, err := db.Query(query)
users := make([]*User, 0)
for rows.Next() {
user := new(User)
err := rows.Scan(&user.id, &user.name)
users = append(users, user)
}
return(userSlice(users).String()) // Replace this line in your code
}
If you have to return a String you can use the encoding/json package to serialize your user object, but you have to use fields that begin with capital letters for them to be exported. See the full example:
import (
"encoding/json"
)
func Fetch(query string) (string) {
type User struct{
Id string // <-- CHANGED THIS LINE
Name string // <-- CHANGED THIS LINE
}
rows, err := db.Query(query)
users := make([]*User, 0)
for rows.Next() {
user := new(User)
err := rows.Scan(&user.id, &user.name)
users = append(users, user)
}
return(ToJSON(users)) // <-- CHANGED THIS LINE
}
func ToJSON(obj interface{}) (string) {
res, err := json.Marshal(obj)
if err != nil {
panic("error with json serialization " + err.Error())
}
return string(res)
}
Below is a golang function which is being called with an input channel
func getOptions(inChannel <-chan Param) <-chan ParamOptions {
paramOptions := make(chan ParamOptions )
go func() {
defer close(paramOptions )
var wg sync.WaitGroup
conn, err := sql.Open("mssql", wellConnStr)
if err != nil {
log.Fatal("open connection failed:", err.Error())
}
defer conn.Close()
getParamOptions := func(db *sql.DB, param *Param) {
defer wg.Done()
fmt.Println("querying options for ", param.Code, param.Name)
rows, err := db.Query(`select *
from dbo.ParamOptions where code=? and name=?`, ¶m.Code, ¶m.Name)
fmt.Println("results for ", param.Name, param.Code)
if err != nil {
log.Fatal("query failed:", err.Error())
}
defer rows.Close()
found := false
...
paramOptions <- ParamOptions...
break
}
if found == false {
fmt.Println("did not find options for ", param.Code, param.Name)
}
}
for paramInChannel := range paramChannel {
wg.Add(1)
fmt.Println("retrieving inputs for ", paramInChannel.Code, paramInChannel.Name)
**go** getParamOptions(conn, &wellInChannel)
}
wg.Wait()
}()
return paramOptions
}
If i remove the go keyword before calling the function getParamOptions it works without any problems. However if I use go then the last code and name keeps repeating within the the getParamOptions function, even though the options retrieved seems to be of the correct Param, the Code and Name values are being repeated