Golang stream write with gofpdf or flush the memory for big files

I have a PDF wrapper for gofpdf that writes tables. The code that writes the rows looks like this (the table headers are handled separately):
func WriteTableRows(pdf *gofpdf.Fpdf, fontSize float64, columnSize []float64, rows [][]string) *gofpdf.Fpdf {
pdf.SetFont("Times", "", fontSize)
_, pageh := pdf.GetPageSize()
marginCell := 2.
_, _, _, mbottom := pdf.GetMargins()
_, lineHt := pdf.GetFontSize()
for _, row := range rows {
curx, y := pdf.GetXY()
x := curx
height := splitLines(row, pdf, columnSize, lineHt, marginCell)
y = AddAnotherPage(pdf, height, pageh, mbottom, y)
for i, txt := range row {
width := columnSize[i]
pdf.Rect(x, y, width, height, "")
pdf.MultiCell(width, lineHt+marginCell, txt, "", "", false)
x += width
pdf.SetXY(x, y)
}
pdf.SetXY(curx, y+height)
}
return pdf
}
This code prepares a batch of rows to be written to the PDF file.
The problem is that everything stays in memory, and I have very large tables to write. How can I write out what has already been processed and free the memory without closing the file? After preparing one batch of rows I need to read the next set and prepare it in the same way. Alternatively, could I close the file and reopen it to append the next set of rows?

Related

Google FlatBuffers: how to put an array in a struct in an fbs file

This is an extension of the following question:
Size of Serialized data is not reducing using flatbuffer
As mentioned in the answer there, we should use a struct to reduce space. In my case, though, I need to define an IDL file for a polygon.
Each polygon will have five or more points, and I will have another data structure that holds an
array of polygons.
I have defined my fbs file as follows:
namespace MyFlat;
struct Vertices {
x : double;
y :double;
}
table Polygon {
polygons : [Vertices];
}
table Layer {
polygons : [Polygon];
}
root_type Layer;
As expected, the serialized data comes out quite large with this schema. Is there any way to optimize the padding in the tables to reduce the serialized buffer size?

There's no need to further optimize the structure of your data here, since >90% of the size of these buffers will typically be taken up by Vertices.
One thing to consider is using float for x and y, given that you're unlikely to need the extra resolution. That would almost halve the size of your buffer.
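To put rough numbers on the float suggestion, here is a back-of-the-envelope sketch for a hypothetical layer of 100 polygons with 5 vertices each. It counts only the raw x/y payload of the Vertices structs and deliberately ignores FlatBuffers' own overhead (vtables, offsets, vector length prefixes, alignment padding), so real buffers will be somewhat larger. The function name is made up for this illustration.

```go
package main

import "fmt"

// vertexPayloadBytes returns the size of the x/y data alone:
// polygons * verticesPerPolygon * 2 scalars * scalarSize bytes.
// It does NOT include any FlatBuffers table or vector overhead.
func vertexPayloadBytes(polygons, verticesPerPolygon, scalarSize int) int {
	return polygons * verticesPerPolygon * 2 * scalarSize
}

func main() {
	fmt.Println("double (8 bytes):", vertexPayloadBytes(100, 5, 8))
	fmt.Println("float  (4 bytes):", vertexPayloadBytes(100, 5, 4))
}
```

The payload halves exactly when switching from double to float; the whole buffer shrinks by a bit less than half because the per-polygon overhead stays the same.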

Thanks for your answer. But when I print the size of 100 polygons with 5 vertices each, it comes out at around 10.24 KB. Ideally the size should be around 8000 bytes (8 KB).
b := flatbuffers.NewBuilder(0)
var polyoffset []flatbuffers.UOffsetT
size := 100
StartedAtMarshal := time.Now()
for k := 0; k < size; k++ {
MyFlat.PolygonStartPolygonsVector(b, 5)
for i := 0; i < 5; i++ {
MyFlat.CreateVertices(b, 2.0, 2.4)
}
vec := b.EndVector(5)
MyFlat.PolygonStart(b)
MyFlat.PolygonAddPolygons(b, vec)
polyoffset = append(polyoffset, MyFlat.PolygonEnd(b))
}
MyFlat.LayerStartPolygonsVector(b, size)
for _, offset := range polyoffset {
b.PrependUOffsetT(offset)
}
vec := b.EndVector(size)
MyFlat.LayerStart(b)
MyFlat.LayerAddPolygons(b, vec)
finalOffset := MyFlat.LayerEnd(b)
b.Finish(finalOffset)
EndedAtMarshal := time.Now()
fmt.Println("Elapsed time for serialization", EndedAtMarshal.Sub(StartedAtMarshal).String())
mybyte := b.FinishedBytes()
fmt.Println(len(mybyte))
Is this the expected size, or is my implementation wrong?

wx.Grid and ScrolledWindows

I'm creating a wxWidgets app that displays a large amount of data via wx.Grid.
I'm actually using wxGo.
No matter what I do, the grid does not stay inside the scrolled window, and it even draws over the status bar.
All I'm trying to do is put the grid in a scrollable window/panel.
w := ControlWindow{}
w.Frame = wx.NewFrame(wx.NullWindow, -1, "FooBar", wx.DefaultPosition, wx.NewSizeT(600, 400))
w.statusbar = w.CreateStatusBar()
w.statusbar.SetStatusText("Welcome to FooBar")
w.SetBackgroundColour(wx.GetBLACK())
scroller := wx.NewScrolledWindow(w, wx.ID_ANY)
scroller.SetScrollbar(wx.VERTICAL, 1, 1, 1)
scroller.SetBackgroundColour(wx.GetGREEN())
w.menubar = wx.NewMenuBar()
menuFile := wx.NewMenu()
menuFile.Append(wx.ID_EXIT)
wx.Bind(w, wx.EVT_MENU, func(e wx.Event) {
w.Close(true)
}, wx.ID_EXIT)
w.menubar.Append(menuFile, "&File")
w.SetMenuBar(w.menubar)
vSizer := wx.NewBoxSizer(wx.VERTICAL)
/*add system choices */
filesFolder := strings.Join([]string{ThisFolder, "systems"}, Slash)
err := filepath.Walk(filesFolder, func(path string, info os.FileInfo, err error) error {
SystemFiles = append(SystemFiles, path)
return nil
})
if err != nil {
checkErr(err)
}
for _, file := range SystemFiles {
if !FileExists(file) {
continue
}
xmlFile, err := os.Open(file)
checkErr(err)
decoder := xml.NewDecoder(xmlFile)
...loop through to create SystemsAutoComplete = append(SystemsAutoComplete, attribute.Value) which is a []string
sort.Strings(SystemsAutoComplete)
systemsGrid := wx.NewGrid(w, wx.ID_ANY, wx.DefaultPosition, wx.DefaultSize)
systemsGrid.CreateGrid(0, 3)
for _, stemp := range SystemsAutoComplete {
systemsGrid.AppendRows(1)
renderer := wx.NewGridCellBoolRenderer()
systemsGrid.SetCellRenderer(systemsGrid.GetNumberRows()-1, 0, renderer)
editor := wx.NewGridCellBoolEditor()
systemsGrid.SetCellEditor(systemsGrid.GetNumberRows()-1, 0, editor)
systemsGrid.SetCellValue(systemsGrid.GetNumberRows()-1, 2, stemp)
systemsGrid.SetReadOnly(systemsGrid.GetNumberRows()-1, 2, true)
systemsGrid.AutoSizeColumns(true)
}
vSizer.Add(systemsGrid,1,wx.EXPAND,5)
scroller.SetSizer(vSizer)
scroller.FitInside()
return w
I've simplified the code as much as possible. I'm still getting the same results.
package main
import (
"github.com/dontpanic92/wxGo/wx"
)
type ControlWindow struct {
wx.Frame
statusbar wx.StatusBar
toolbar wx.ToolBar
menubar wx.MenuBar
auiManager wx.AuiManager
}
func main() {
wx1 := wx.NewApp()
w := ControlWindow{}
w.Frame = wx.NewFrame(wx.NullWindow, -1, "FooBar", wx.DefaultPosition, wx.NewSizeT(600, 400))
w.SetBackgroundColour(wx.GetBLACK())
scroller := wx.NewScrolledWindow(w, wx.ID_ANY)
scroller.SetScrollbar(wx.VERTICAL, 1, 1, 1)
scroller.SetBackgroundColour(wx.GetGREEN())
vSizer := wx.NewBoxSizer(wx.VERTICAL)
grid := wx.NewGrid(scroller, wx.ID_ANY, wx.DefaultPosition, wx.DefaultSize)
grid.CreateGrid(60, 1)
vSizer.Add(grid, 1, wx.EXPAND, 5)
scroller.SetSizer(vSizer)
scroller.SetAutoLayout(true)
scroller.Layout()
scroller.Fit()
scroller.SetScrollbar(0, 16, 50, 15)
w.Show()
wx1.MainLoop()
w.Destroy()
return
}

I can't really read Go, but if I understand it correctly you have some fundamental flaws there:
the parent-child relationship looks wrong: if you want to add systemsGrid to a sizer which will be set to scroller, then systemsGrid must have scroller as a parent. Does it?
do grid, scroller and systemsGrid all have the same parent? Because if they do, even after you'd change it for systemsGrid you'd still need to do something about the remaining two: remove the apparently useless grid OR put them both in a sizer which you'd need to set to their parent OR manually handle their positions (which is the least smart thing to do of all).

Use Gob to write logs to a file in an append style

Would it be possible to use Gob encoding for appending structs in series to the same file using append? It works for writing, but when reading with the decoder more than once I run into:
extra data in buffer
So I wonder if that's possible in the first place or whether I should use something like JSON to append JSON documents on a per line basis instead. Because the alternative would be to serialize a slice, but then again reading it as a whole would defeat the purpose of append.

The gob package wasn't designed to be used this way. A gob stream has to be written by a single gob.Encoder, and it also has to be read by a single gob.Decoder.
The reason for this is because the gob package not only serializes the values you pass to it, it also transmits data to describe their types:
A stream of gobs is self-describing. Each data item in the stream is preceded by a specification of its type, expressed in terms of a small set of predefined types.
This is state held by the encoder / decoder (which types have been transmitted, and how). A subsequent new encoder / decoder will not (cannot) analyze the "preceding" stream to reconstruct the same state and continue where a previous encoder / decoder left off.
Of course if you create a single gob.Encoder, you may use it to serialize as many values as you'd like to.
Also you can create a gob.Encoder and write to a file, and then later create a new gob.Encoder, and append to the same file, but you must use 2 gob.Decoders to read those values, exactly matching the encoding process.
As a demonstration, let's follow an example. This example will write to an in-memory buffer (bytes.Buffer). 2 subsequent encoders will write to it, then we will use 2 subsequent decoders to read the values. We'll write values of this struct:
type Point struct {
X, Y int
}
For short, compact code, I use this "error handler" function:
func he(err error) {
if err != nil {
panic(err)
}
}
And now the code:
const n, m = 3, 2
buf := &bytes.Buffer{}
e := gob.NewEncoder(buf)
for i := 0; i < n; i++ {
he(e.Encode(&Point{X: i, Y: i * 2}))
}
e = gob.NewEncoder(buf)
for i := 0; i < m; i++ {
he(e.Encode(&Point{X: i, Y: 10 + i}))
}
d := gob.NewDecoder(buf)
for i := 0; i < n; i++ {
var p *Point
he(d.Decode(&p))
fmt.Println(p)
}
d = gob.NewDecoder(buf)
for i := 0; i < m; i++ {
var p *Point
he(d.Decode(&p))
fmt.Println(p)
}
Output (try it on the Go Playground):
&{0 0}
&{1 2}
&{2 4}
&{0 10}
&{1 11}
Note that if we used only 1 decoder to read all the values (looping until i < n + m), we'd get the same error message you posted in your question when the iteration reaches n + 1, because the subsequent data is not a serialized Point, but the start of a new gob stream.
So if you want to stick with the gob package for doing what you want to do, you have to slightly modify, enhance your encoding / decoding process. You have to somehow mark the boundaries when a new encoder is used (so when decoding, you'll know you have to create a new decoder to read subsequent values).
You may use different techniques to achieve this:
You may write out a number, a count before you proceed to write values, and this number would tell how many values were written using the current encoder.
If you don't want to or can't tell how many values will be written with the current encoder, you may opt to write out a special end-of-encoder value when you don't write more values with the current encoder. When decoding, if you encounter this special end-of-encoder value, you'll know you have to create a new decoder to be able to read more values.
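The first technique (writing a count before the values) could look roughly like the following sketch. It is one possible layout, not the only one: each encoder first encodes the number of values as a gob int, and the reader creates a new decoder for every batch. The helper names writeBatch, readAll and roundTrip are invented for this example. Like the demonstration above, it uses a bytes.Buffer; when reading from a file, the decoders would likely need to share a single bufio.Reader, because gob.NewDecoder adds its own buffering to readers that do not implement io.ByteReader.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"io"
	"log"
)

type Point struct {
	X, Y int
}

// writeBatch starts a new gob stream: a fresh encoder writes the number
// of values first, then the values themselves.
func writeBatch(w io.Writer, pts []Point) error {
	e := gob.NewEncoder(w)
	if err := e.Encode(len(pts)); err != nil {
		return err
	}
	for _, p := range pts {
		if err := e.Encode(p); err != nil {
			return err
		}
	}
	return nil
}

// readAll creates one decoder per batch and uses the leading count to
// know where the batch (and that decoder's stream) ends.
func readAll(buf *bytes.Buffer) ([]Point, error) {
	var out []Point
	for buf.Len() > 0 {
		d := gob.NewDecoder(buf)
		var n int
		if err := d.Decode(&n); err != nil {
			return nil, err
		}
		for i := 0; i < n; i++ {
			var p Point
			if err := d.Decode(&p); err != nil {
				return nil, err
			}
			out = append(out, p)
		}
	}
	return out, nil
}

// roundTrip writes two batches with two encoders and reads them back.
func roundTrip() ([]Point, error) {
	buf := &bytes.Buffer{}
	if err := writeBatch(buf, []Point{{0, 0}, {1, 2}, {2, 4}}); err != nil {
		return nil, err
	}
	if err := writeBatch(buf, []Point{{0, 10}, {1, 11}}); err != nil {
		return nil, err
	}
	return readAll(buf)
}

func main() {
	pts, err := roundTrip()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(pts)
}
```

The count has to be known before the batch is written; when it is not, the end-of-encoder marker described above is the alternative.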
Some things to note here:
The gob package is most efficient, most compact if only a single encoder is used, because each time you create and use a new encoder, the type specifications will have to be re-transmitted, causing more overhead, and making the encoding / decoding process slower.
You can't seek in the data stream, you can only decode any value if you read the whole file from the beginning up until the value you want. Note that this somewhat applies even if you use other formats (such as JSON or XML).
If you want seeking functionality, you'd need to manage an index file separately, which would tell at which positions new encoders / decoders start, so you could seek to that position, create a new decoder, and start reading values from there.
Check a related question: Efficient Go serialization of struct to disk
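For comparison, the JSON-per-line alternative raised in the question can be sketched as follows. This is a minimal illustration under a few assumptions: the Record type, the helper names and the temporary file are all made up here. It relies on json.Encoder.Encode writing a newline after each value, so independent writers can append to the same file, and a single json.Decoder can read the values back one by one.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"
)

type Record struct {
	ID   int
	Body string
}

// appendRecord opens path in append mode and writes rec as one JSON line.
func appendRecord(path string, rec Record) error {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0600)
	if err != nil {
		return err
	}
	if err := json.NewEncoder(f).Encode(rec); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// readRecords decodes every JSON value in the file with a single decoder.
func readRecords(path string) ([]Record, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var out []Record
	dec := json.NewDecoder(f)
	for {
		var r Record
		err := dec.Decode(&r)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, r)
	}
}

// demo appends two records in separate open/close cycles and reads them back.
func demo() ([]Record, error) {
	dir, err := os.MkdirTemp("", "jsonl")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "log.jsonl")
	for _, r := range []Record{{1, "abc"}, {2, "def"}} {
		if err := appendRecord(path, r); err != nil {
			return nil, err
		}
	}
	return readRecords(path)
}

func main() {
	recs, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(recs)
}
```

The trade-off is size and speed: JSON repeats the field names in every record, whereas gob sends the type description once per encoder.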

In addition to the above, I suggest using an intermediate structure to exclude the gob header:
package main
import (
"bytes"
"encoding/gob"
"fmt"
"io"
"log"
)
type Point struct {
X, Y int
}
func main() {
buf := new(bytes.Buffer)
enc, _, err := NewEncoderWithoutHeader(buf, new(Point))
if err != nil {
log.Fatal(err)
}
enc.Encode(&Point{10, 10})
fmt.Println(buf.Bytes())
}
type HeaderSkiper struct {
src io.Reader
dst io.Writer
}
func (hs *HeaderSkiper) Read(p []byte) (int, error) {
return hs.src.Read(p)
}
func (hs *HeaderSkiper) Write(p []byte) (int, error) {
return hs.dst.Write(p)
}
func NewEncoderWithoutHeader(w io.Writer, sample interface{}) (*gob.Encoder, *bytes.Buffer, error) {
hs := new(HeaderSkiper)
hdr := new(bytes.Buffer)
hs.dst = hdr
enc := gob.NewEncoder(hs)
// Write sample with header info
if err := enc.Encode(sample); err != nil {
return nil, nil, err
}
// Change writer
hs.dst = w
return enc, hdr, nil
}
func NewDecoderWithoutHeader(r io.Reader, hdr *bytes.Buffer, dummy interface{}) (*gob.Decoder, error) {
hs := new(HeaderSkiper)
hs.src = hdr
dec := gob.NewDecoder(hs)
if err := dec.Decode(dummy); err != nil {
return nil, err
}
hs.src = r
return dec, nil
}

In addition to icza's great answer, you can use the following trick to append to a gob file that already contains data: when appending, write the first encode to a scratch buffer and discard it:
Create the file
Encode gobs as usual (the first encode writes the headers)
Close the file
Open the file for append
Using an intermediate writer, encode a dummy struct (which writes the headers)
Reset the writer
Encode gobs as usual (writes no headers)
Example:
package main
import (
"bytes"
"encoding/gob"
"fmt"
"io"
"io/ioutil"
"log"
"os"
)
type Record struct {
ID int
Body string
}
func main() {
r1 := Record{ID: 1, Body: "abc"}
r2 := Record{ID: 2, Body: "def"}
// encode r1
var buf1 bytes.Buffer
enc := gob.NewEncoder(&buf1)
err := enc.Encode(r1)
if err != nil {
log.Fatal(err)
}
// write to file
err = ioutil.WriteFile("/tmp/log.gob", buf1.Bytes(), 0600)
if err != nil {
log.Fatal(err)
}
// encode dummy (which write headers)
var buf2 bytes.Buffer
enc = gob.NewEncoder(&buf2)
err = enc.Encode(Record{})
if err != nil {
log.Fatal(err)
}
// remove dummy
buf2.Reset()
// encode r2
err = enc.Encode(r2)
if err != nil {
log.Fatal(err)
}
// open file
f, err := os.OpenFile("/tmp/log.gob", os.O_WRONLY|os.O_APPEND, 0600)
if err != nil {
log.Fatal(err)
}
// write r2
_, err = f.Write(buf2.Bytes())
if err != nil {
log.Fatal(err)
}
// decode file
data, err := ioutil.ReadFile("/tmp/log.gob")
if err != nil {
log.Fatal(err)
}
var r Record
dec := gob.NewDecoder(bytes.NewReader(data))
for {
err = dec.Decode(&r)
if err == io.EOF {
break
}
if err != nil {
log.Fatal(err)
}
fmt.Println(r)
}
}

How to avoid slice reference to the same memory block

I ran into a problem when querying a database and inserting the rows into a slice (of map[string]interface{} values).
Even though I use make to create a new map for each row, the entries in the slice always seem to point to the same memory block.
type DBResult []map[string]interface{}
func ResultRows(rows *sql.Rows, limit int) (DBResult, error) {
cols, err := rows.Columns()
if err != nil {
return nil, err
}
vals := make([]sql.RawBytes, len(cols))
scanArgs := make([]interface{}, len(vals))
for i := range vals {
scanArgs[i] = &vals[i]
}
if limit > QUERY_HARD_LIMIT {
limit = QUERY_HARD_LIMIT
}
res := make(DBResult, 0, limit)
for rows.Next() {
err = rows.Scan(scanArgs...)
m := make(map[string]interface{})
for i := range vals {
m[cols[i]] = vals[i]
}
/* Append m to res */
res = append(res, m)
/* The value of m has been changed */
fmt.Printf("lib: m:\n\n%s\n\n", m)
/* When printing res, always mapping to the same memory block */
fmt.Printf("lib: res:\n\n%s\n\n", res)
}
return res, err
}
The following is the output; you can see that all the entries in res are the same:
m = map[comment:first_comment id:0]
res = [map[id:0 comment:first_comment]]
m = map[id:1 comment:first_comment]
res = [map[id:1 comment:first_comment] map[id:1 comment:first_comment]]
m = map[id:2 comment:first_comment]
res = [map[id:2 comment:first_comment] map[id:2 comment:first_comment] map[id:2 comment:first_comment]]
I expected res = [map[id:0 comment:first_comment] map[id:1 comment:first_comment] map[id:2 comment:first_comment]]
Thanks for reading.

According to the documentation of Rows.Scan (https://golang.org/pkg/database/sql/#Rows.Scan):
Scan copies the columns in the current row into the values pointed at by dest.
If an argument has type *[]byte, Scan saves in that argument a copy of the corresponding data. The copy is owned by the caller and can be modified and held indefinitely. The copy can be avoided by using an argument of type *RawBytes instead; see the documentation for RawBytes for restrictions on its use.
If an argument has type *interface{}, Scan copies the value provided by the underlying driver without conversion. If the value is of type []byte, a copy is made and the caller owns the result.
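In your case, you pass RawBytes as the argument to Scan, which is probably the cause of the problem: per the RawBytes documentation, the memory it refers to is only valid until the next call to Next, Scan, or Close. Try a different argument type for Scan.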
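The effect can be reproduced without a database. The sketch below only simulates a driver that reuses one buffer for every row (which is the situation RawBytes exposes); the function keepRows and its parameters are hypothetical. Storing the slice itself keeps a reference to the shared buffer, while copying the bytes gives each row its own data.

```go
package main

import "fmt"

// keepRows simulates scanning rows into a buffer that is reused for every
// row. With copyEach == false it stores the aliasing slice (comparable to
// keeping a sql.RawBytes); with copyEach == true it stores a private copy.
func keepRows(rows []string, copyEach bool) []string {
	buf := make([]byte, 0, 16) // shared buffer, overwritten on each "scan"
	var kept [][]byte
	for _, row := range rows {
		buf = append(buf[:0], row...) // simulated scan into the shared buffer
		if copyEach {
			kept = append(kept, append([]byte(nil), buf...))
		} else {
			kept = append(kept, buf)
		}
	}
	out := make([]string, len(kept))
	for i, k := range kept {
		out[i] = string(k)
	}
	return out
}

func main() {
	rows := []string{"id0", "id1", "id2"}
	fmt.Println(keepRows(rows, false)) // every entry shows the last row
	fmt.Println(keepRows(rows, true))  // each entry keeps its own row
}
```

In the question's code, the equivalent of the copy would be converting each value before storing it, for example m[cols[i]] = string(vals[i]), or scanning into []byte or interface{} destinations instead of RawBytes.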

Go SQL scanned rows getting overwritten

I'm trying to read all the rows from a table on a SQL server and store them in string slices to use for later. The issue I'm running into is that the previously scanned rows are getting overwritten every time I scan a new row, even though I've converted all the mutable byte slices to immutable strings and saved the result slices to another slice. Here is the code I'm using:
rawResult := make([]interface{}, len(cols)) // holds anything that could be in a row
result := make([]string, len(cols)) // will hold all row elements as strings
var results [][]string // will hold all the result string slices
dest := make([]interface{}, len(cols)) // temporary, to pass into scan
for i, _ := range rawResult {
dest[i] = &rawResult[i] // fill dest with pointers to rawResult to pass into scan
}
for rows.Next() { // for each row
err = rows.Scan(dest...) // scan the row
if err != nil {
log.Fatal("Failed to scan row", err)
}
for i, raw := range rawResult { // for each scanned byte slice in a row
switch rawtype := raw.(type){ // determine type, convert to string
case int64:
result[i] = strconv.FormatInt(raw.(int64), 10)
case float64:
result[i] = strconv.FormatFloat(raw.(float64), 'f', -1, 64)
case bool:
result[i] = strconv.FormatBool(raw.(bool))
case []byte:
result[i] = string(raw.([]byte))
case string:
result[i] = raw.(string)
case time.Time:
result[i] = raw.(time.Time).String()
case nil:
result[i] = ""
default: // shouldn't actually be reachable since all types have been covered
log.Fatalf("Unexpected type %T", rawtype)
}
}
results = append(results, result) // append the result to our slice of results
}
I'm sure this has something to do with the way Go handles variables and memory, but I can't seem to fix it. Can somebody explain what I'm not understanding?

You should create a new slice for each data row. Note that a slice holds a pointer to an underlying array, so every slice you appended to results points to the same array. That's why you are seeing this behaviour.

When you create a slice with make(), it returns a value of the slice type (not a pointer to it). But it does not allocate new memory each time an element is reassigned. Hence
result := make([]string, 5)
has fixed memory holding 5 strings. When an element is reassigned, it occupies the same memory as before, overwriting the old value.
Hopefully the following example makes things clear:
http://play.golang.org/p/3w2NtEHRuu
Hence in your program you are changing the contents of the same memory and appending it again and again. To solve this, create your result slice inside the loop.

Move result := make([]string, len(cols)) into your for loop that loops over the available rows.
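The difference can be seen in a small self-contained sketch that needs no database. The function collect and its fresh flag are invented for the illustration; the rows are plain integers standing in for scanned values.

```go
package main

import (
	"fmt"
	"strconv"
)

// collect converts each row to strings and appends it to results.
// With fresh == false it reuses one slice for every row (the bug);
// with fresh == true it allocates a new slice per row (the fix).
func collect(rows [][]int64, fresh bool) [][]string {
	if len(rows) == 0 {
		return nil
	}
	cols := len(rows[0])
	result := make([]string, cols)
	var results [][]string
	for _, row := range rows {
		if fresh {
			result = make([]string, cols) // new backing array for this row
		}
		for i, v := range row {
			result[i] = strconv.FormatInt(v, 10)
		}
		results = append(results, result)
	}
	return results
}

func main() {
	rows := [][]int64{{1, 10}, {2, 20}, {3, 30}}
	fmt.Println(collect(rows, false)) // every entry shows the last row
	fmt.Println(collect(rows, true))  // each entry keeps its own row
}
```

Appending a slice copies only its header (pointer, length, capacity), not the elements, which is why the reused slice makes every entry in results look like the last row.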
