Export BigQuery temporary table into multiple files based on value in column - google-bigquery

I have the following problem.
Based on a query like this:
SELECT lp_id, MOD(ABS(FARM_FINGERPRINT(lp_id)), 10) AS bucket FROM dataset.table
I run the query and save the result as CSV in Google Cloud Storage:
// defined in scope:
// ctx context.Context
// bucket string
// folderName string
// queryString string
query := bqClient.Query(queryString)
job, err := query.Run(ctx)
conf, err := job.Config()
table := conf.(*bigquery.QueryConfig).Dst
gcsURI := fmt.Sprintf("gs://%s/%s/*.%s", bucket, folderName, "csv")
gcsRef := bigquery.NewGCSReference(gcsURI)
gcsRef.FieldDelimiter = ","
extractor := table.ExtractTo(gcsRef)
// run job...
What I want is to split the result into multiple files based on the bucket each user falls into (for example, users from bucket n go into file {{ n_filename }}), all in one job, to avoid increasing data processing costs.
Is it possible?
Thanks for your help.
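To make the desired layout concrete, here is a minimal sketch that only builds one wildcard destination URI per hash bucket. The `bucketURIs` helper and the `bucket_<n>` sub-folder naming are hypothetical, and the sketch says nothing about whether a single extract job can fan out this way, which is the open question.

```go
package main

import "fmt"

// bucketURIs returns one wildcard GCS URI per hash bucket 0..n-1.
// The "bucket_<n>" sub-folder naming is a made-up convention for this sketch.
func bucketURIs(gcsBucket, folderName string, n int) []string {
	if n <= 0 {
		return nil
	}
	uris := make([]string, 0, n)
	for i := 0; i < n; i++ {
		uris = append(uris, fmt.Sprintf("gs://%s/%s/bucket_%d/*.csv", gcsBucket, folderName, i))
	}
	return uris
}
```

Each URI can be passed to bigquery.NewGCSReference in the same way as gcsURI in the snippet above.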

Related

How to Fetch specific number of characters from a string using gorm?

I am using SQLite. If I have a text post or an article that is, for example, a 400-character string, I want to extract only the first 100 characters from it to be used in the front end.
In line 4 I fetch the latest 5 posts from the db, but I want to limit the body of each post to its first 100 characters only.
func GetLatestPosts() *[]Post {
	db := database.Connect()
	var posts []Post
	db.Limit(5).Order("created_at desc").Find(&posts)
	// SELECT LEFT(body, 100) FROM posts
	// db.Raw("SELECT id, title, body, tags FROM posts").Scan(&posts)
	return &posts
}
How can I do that?
What you want is to use either Select or Raw to run SQLite's substr function on the post's body.
Like this:
err := db.Select("id, title, substr(body, 1, 100) AS body, tags").
	Limit(5).
	Order("created_at DESC").
	Find(&posts).
	Error
// error check
Or with Raw:
err := db.Raw("SELECT id, title, substr(body, 1, 100) AS body, tags FROM posts").
	Scan(&posts).
	Error
// error check
The key point is to alias the truncated column as body so that scanning into your model works as before.
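If the query cannot be changed, a comparable truncation can be done in Go after loading the rows, at the cost of still transferring the full body from the database. This is a different technique from the substr approach above, and `firstRunes` is a hypothetical helper name. It counts characters rather than bytes, as SQLite's substr does for text values.

```go
package main

// firstRunes returns at most n characters (runes) from the start of s.
// Slicing s[:n] would count bytes and could cut a multi-byte character in half.
func firstRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s { // i is the byte offset at which each rune starts
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s has n runes or fewer
}
```

It could be applied to each post's body after Find, for example before handing the posts to the front end.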

Why is QueryRow().Scan() returning an empty string when it is not empty in table?

I am trying to query a single row from a PostgreSQL database table.
func getPrefix(serverID int64, db *sql.DB) string {
	var prefix string
	err := db.QueryRow("SELECT prefix FROM servers WHERE serverid = 1234").Scan(&prefix)
	if err != nil {
		fmt.Println(err.Error())
	}
	spew.Dump(prefix)
	fmt.Println("Prefix is " + prefix)
	return prefix
}
Apparently the variable prefix is an empty string, but when I query the database directly, it's not empty:
You are now connected to database "mewbot" as user "postgres".
mewbot=# select * from servers;
serverid | prefix
----------+--------
1234 | ;
(1 row)
mewbot=#
My question is: why is it returning an empty string when it should be ;?
I've run all the checks I can think of and made sure I'm connected to the same database.
It turned out it was not working because SSL was disabled on my server, but I was trying to connect without specifying that in the connection string.
Changing postgres://postgres:7890@localhost:5432/mewbot to postgres://postgres:7890@localhost:5432/mewbot?sslmode=disable solved my issue.
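For reference, here is a minimal sketch of adding the parameter programmatically instead of editing the string by hand. It assumes a URL-style connection string like the one above; `withSSLMode` is a hypothetical helper and uses only the standard library.

```go
package main

import "net/url"

// withSSLMode returns dsn with its sslmode query parameter set to mode,
// keeping any other parameters that were already present.
func withSSLMode(dsn, mode string) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("sslmode", mode)
	u.RawQuery = q.Encode() // Encode sorts parameters by key
	return u.String(), nil
}
```

The result can then be passed to sql.Open in place of the original connection string.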

Control flow over query results in SQLX (lazy/eager)

I'm implementing a messages table with postgres (aws-rds) and I'm using golang as a backend to query the table.
CREATE TABLE:
CREATE TABLE IF NOT EXISTS msg.Messages(
	id SERIAL PRIMARY KEY,
	content BYTEA,
	timestamp DATE
);
Here is the INSERT query:
INSERT INTO msg.Messages (content,timestamp) VALUES ('blob', 'date')
RETURNING id;
Now I want to be able to fetch a specific message, like this:
specific SELECT query:
SELECT id, content,timestamp
FROM msg.Messages
WHERE id = $1
Now let's say a user was offline for a long time and needs to get a lot of messages from this table, say 10M messages. I don't want to return all the results at once because that might blow up the app's memory.
Each user saves the id of the last message they fetched, so the query will be:
SELECT id, content, timestamp
FROM msg.Messages
WHERE id > $1
Implementing paging for this query feels like reinventing the wheel; there must be an out-of-the-box solution for that.
I'm using sqlx, here is a rough example of my code:
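// Use ? here; Rebind rewrites it to the driver's bindvar ($1 for a Postgres
// driver, whose placeholders start at $1, not $0).
query := `
	SELECT id, content, timestamp
	FROM msg.Messages
	WHERE id > ?
`
args := []interface{}{5} // id of the last message the user has already fetched
query = ado.db.Rebind(query)
rows, err := ado.db.Queryx(query, args...)
if err != nil {
	return nil, err
}
defer rows.Close()

var res []Message
for rows.Next() {
	msg := Message{}
	if err := rows.StructScan(&msg); err != nil {
		return nil, err
	}
	res = append(res, msg)
}
// Next returns false both at the end of the results and on failure, so check Err.
if err := rows.Err(); err != nil {
	return nil, err
}
return res, nil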
How can I convert this code to use lazy loading, so that each call to rows.Next() fetches only the next item (instead of loading all items in advance)? And what about the garbage collector: will it release the memory on each iteration of rows.Next()?
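As far as I know, database/sql (and sqlx on top of it) already advances one row per rows.Next() call, although how much the driver buffers is driver-specific; what grows without bound in the code above is the res slice. One hedged way to bound memory is keyset paging on id, handling each page before fetching the next. The sketch below is not an out-of-the-box sqlx feature: `eachMessage`, `fetchPage`, and `pageQuery` are hypothetical names, and the database call is abstracted behind a function so the loop itself can run without a database.

```go
package main

import "errors"

// Message mirrors the columns used in the question (timestamp omitted for brevity).
type Message struct {
	ID      int64
	Content []byte
}

// pageQuery is the keyset query a real fetch function would run;
// ORDER BY id keeps pages contiguous, LIMIT bounds the rows per round trip.
const pageQuery = `
SELECT id, content
FROM msg.Messages
WHERE id > $1
ORDER BY id
LIMIT $2`

// fetchPage is whatever runs pageQuery and returns up to limit rows (assumed signature).
type fetchPage func(afterID int64, limit int) ([]Message, error)

// eachMessage calls handle for every message with id > lastID, one page at a time,
// and returns the id of the last message that was handled.
func eachMessage(fetch fetchPage, lastID int64, pageSize int, handle func(Message) error) (int64, error) {
	if pageSize <= 0 {
		return lastID, errors.New("pageSize must be positive")
	}
	for {
		page, err := fetch(lastID, pageSize)
		if err != nil {
			return lastID, err
		}
		for _, m := range page {
			if err := handle(m); err != nil {
				return lastID, err
			}
			lastID = m.ID
		}
		if len(page) < pageSize {
			return lastID, nil // short page: nothing left to fetch
		}
	}
}
```

Only one page is held at a time, and each Message becomes eligible for garbage collection once handle returns, provided nothing keeps a reference to it.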

"Operator does not exist: integer =?" when using Postgres

I have a simple SQL query called within the QueryRow method provided by go's database/sql package.
import (
	"github.com/codegangsta/martini"
	"github.com/martini-contrib/render"
	"net/http"
	"database/sql"
	"fmt"
	_ "github.com/lib/pq"
)

type User struct {
	Name string
}

func Show(db *sql.DB, params martini.Params) {
	id := params["id"]
	row := db.QueryRow(
		"SELECT name FROM users WHERE id=?", id)
	u := User{}
	err := row.Scan(&u.Name)
	fmt.Println(err)
}
However, I'm getting the error pq: operator does not exist: integer =? It looks like the code doesn't understand that the ? is just a placeholder. How can I fix this?
PostgreSQL works with numbered placeholders ($1, $2, ...) natively rather than the usual positional question marks. The documentation for the Go interface also uses numbered placeholders in its examples:
rows, err := db.Query("SELECT name FROM users WHERE age = $1", age)
It seems that the Go interface isn't translating the question marks to numbered placeholders the way many interfaces do, so the question mark goes all the way to the database, where =? is read as an unknown operator.
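To make that missing translation step concrete, here is a minimal sketch of what such a rewrite does. `rebindDollar` is a hypothetical helper and deliberately naive; libraries such as sqlx ship a Rebind helper for real use.

```go
package main

import (
	"strconv"
	"strings"
)

// rebindDollar replaces each ? with $1, $2, ... in order of appearance.
// It is deliberately naive: a ? inside a quoted SQL string or comment is rewritten too.
func rebindDollar(query string) string {
	var b strings.Builder
	n := 0
	for _, r := range query {
		if r == '?' {
			n++
			b.WriteByte('$')
			b.WriteString(strconv.Itoa(n))
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}
```

Writing the numbered placeholders directly in the query is simpler when only Postgres is targeted.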
You should be able to switch to numbered placeholders instead of question marks:
row := db.QueryRow(
"SELECT name FROM users WHERE id = $1", id)

Sql Select - Total Rows Returned

Using the database/sql package and drivers for Postgres and MySQL, I want to achieve the following. I want to be able to select one row and know whether there are zero rows, one row, or more than one row. The QueryRow function does not achieve that because, as far as I can ascertain, it returns one row without error regardless of whether there is more than one row. For my situation, more than one row may be an error, and I want to know about it. I want to create a general function to do this.
I looked at creating a function that uses the Query function, but I do not know how to return the first row if there is more than one row. I want to report the fact that there is more than one row, but I also want to return the first row. To determine that there is more than one row, I have to call Next, and that overwrites the first row. Obviously I can achieve this without creating a general function, but I want a function to do it because I need to do this in a number of places.
Could someone please explain how to achieve this, i.e. how to return the first row from a function when a successful Next has been done or Next returned nothing?
I'm using both database/sql and MySQLDriver to achieve this. You can download MySQLDriver at https://github.com/go-sql-driver/ .
I wrote the execQuery function myself to get one or more rows from the database. It's written for MySQL, but I think it can also be used with Postgres with a similar implementation.
Assume you have a DB table named test with columns named id, name, and age.
Code:
var db *sql.DB // it should be initialized by "sql.Open()"

func execQuery(SQL string, args ...interface{}) (rows *sql.Rows, is_succeed bool) {
	rows, err := db.Query(SQL, args...)
	var ret bool
	if err == nil && rows != nil { // if the DB query fails, rows will be nil and this returns false
		ret = true
	} else {
		ret = false
	}
	return rows, ret
}
Usage:
var name, age string
rows, is_succeed := execQuery("SELECT `name`, `age` FROM `test` WHERE `id` = ?", "123")
if !is_succeed {
	// error
	return
}
defer rows.Close()
for rows.Next() { // if there are zero result rows, this loop body won't execute
	if err := rows.Scan(&name, &age); err != nil {
		// handle the error
	}
}
If you want to know how many rows were returned, just add a counter in the for loop; this can also be done in SQL (for example with COUNT(*)).
sql.Rows behaves like a list: rows *sql.Rows starts positioned before the first returned row, and each call to rows.Next() advances to the next one. I think that's what you've asked for.
If you need the row count very often, a caching layer such as MemcacheDB or Redis, or a simple counter you maintain yourself, can help.
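On the specific worry that a second Next overwrites the first row: Scan copies the values into your destination variables, so they survive the next call to Next (the exception being sql.RawBytes destinations). Below is a minimal sketch of a general helper built on that. `firstRow` and `rowScanner` are hypothetical names, and `sliceRows` is an in-memory stand-in included only so the sketch can be exercised without a database.

```go
package main

import "errors"

// rowScanner is the subset of *sql.Rows that firstRow needs.
type rowScanner interface {
	Next() bool
	Scan(dest ...interface{}) error
	Err() error
}

// firstRow scans the first row into dest and reports whether any row was found
// and whether at least one further row exists.
func firstRow(rows rowScanner, dest ...interface{}) (found, more bool, err error) {
	if !rows.Next() {
		return false, false, rows.Err()
	}
	if err := rows.Scan(dest...); err != nil {
		return false, false, err
	}
	more = rows.Next() // dest already holds copies of the first row's values
	return true, more, rows.Err()
}

// sliceRows is an in-memory stand-in for *sql.Rows with one string column.
type sliceRows struct {
	values []string
	pos    int
}

func (s *sliceRows) Next() bool {
	if s.pos >= len(s.values) {
		return false
	}
	s.pos++
	return true
}

func (s *sliceRows) Scan(dest ...interface{}) error {
	if len(dest) != 1 {
		return errors.New("sliceRows: expected exactly one destination")
	}
	p, ok := dest[0].(*string)
	if !ok {
		return errors.New("sliceRows: destination must be *string")
	}
	*p = s.values[s.pos-1]
	return nil
}

func (s *sliceRows) Err() error { return nil }
```

With a real query, the value returned by db.Query can be passed as rows (after checking the error and deferring rows.Close()), followed by the destination pointers such as &name and &age.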