Allocate uninitialized slice - optimization

Is there some way to allocate an uninitialized slice in Go? A frequent pattern is to create a slice of a given size as a buffer, and then only use part of it to receive data. For example:
b := make([]byte, 0x20000) // b is zero initialized
n, err := conn.Read(b)
// do stuff with b[:n]. all of b is zeroed for no reason
This initialization can add up when lots of buffers are being allocated, since the spec requires the backing array to be zero-initialized on allocation.

You can get non zeroed byte buffers from bufs.Cache.Get (or see CCache for the concurrent safe version). From the docs:
NOTE: The buffer returned by Get is not guaranteed to be zeroed. That's okay for e.g. passing a buffer to io.Reader. If you need a zeroed buffer use Cget.

Technically you could, by allocating the memory outside the Go runtime and using unsafe.Pointer, but this is definitely the wrong thing to do.
A better solution is to reduce the number of allocations. Move buffers outside loops, or, if you need per goroutine buffers, allocate several of them in a pool and only allocate more when they're needed.
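type BufferPool struct {
    Capacity   int        // maximum number of idle buffers kept for reuse
    buffersize int        // length of every buffer handed out
    buffers    [][]byte   // idle buffers (a slice of buffers, not a single []byte)
    lock       sync.Mutex // guards buffers
}
// NewBufferPool returns a pool of buffersize-byte buffers that keeps at most cap idle buffers.
func NewBufferPool(buffersize int, cap int) *BufferPool {
    ret := new(BufferPool)
    ret.Capacity = cap
    ret.buffersize = buffersize
    return ret
}
// Alloc returns an idle buffer if one is available and allocates a new one otherwise.
// A reused buffer is not zeroed.
func (b *BufferPool) Alloc() []byte {
    b.lock.Lock()
    defer b.lock.Unlock()
    if len(b.buffers) == 0 {
        return make([]byte, b.buffersize)
    }
    ret := b.buffers[len(b.buffers)-1]
    b.buffers = b.buffers[:len(b.buffers)-1]
    return ret
}
// Free hands a buffer back to the pool. It is dropped if the pool is already full.
func (b *BufferPool) Free(buf []byte) {
    if len(buf) != b.buffersize {
        panic("illegal free")
    }
    b.lock.Lock()
    defer b.lock.Unlock()
    if len(b.buffers) < b.Capacity {
        b.buffers = append(b.buffers, buf)
    }
}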


How to slice bytes memory in solidity?

I'm trying to slice bytes such as
bytes memory bytesData = result[32:64];
and it's throwing:
TypeError: Index range access is only supported for dynamic calldata arrays.
It works fine with calldata.
What about memory?
According to the Solidity docs, slicing memory arrays is not supported for now. As you've said, it does work on calldata bytes. This answer on EthereumSE seems to agree.
According to this question on EthSE, you can work around this by copying the requested range into a new memory array, as the following slice function does.
pragma solidity >=0.8.0 <0.9.0;
library BytesLib {
function slice(
bytes memory _bytes,
uint256 _start,
uint256 _length
)
internal
pure
returns (bytes memory)
{
require(_length + 31 >= _length, "slice_overflow");
require(_bytes.length >= _start + _length, "slice_outOfBounds");
bytes memory tempBytes;
// Check length is 0. `iszero` returns 1 for `true` and 0 for `false`.
assembly {
switch iszero(_length)
case 0 {
// Get a location of some free memory and store it in tempBytes as
// Solidity does for memory variables.
tempBytes := mload(0x40)
// Calculate length mod 32 to handle slices that are not a multiple of 32 in size.
let lengthmod := and(_length, 31)
// tempBytes will have the following format in memory: <length><data>
// When copying data we will offset the start forward to avoid allocating additional memory
// Therefore part of the length area will be written, but this will be overwritten later anyway.
// In case no offset is required, the start is set to the data region (0x20 from the tempBytes)
// mc will be used to keep track where to copy the data to.
let mc := add(add(tempBytes, lengthmod), mul(0x20, iszero(lengthmod)))
let end := add(mc, _length)
for {
// Same logic as for mc is applied and additionally the start offset specified for the method is added
let cc := add(add(add(_bytes, lengthmod), mul(0x20, iszero(lengthmod))), _start)
} lt(mc, end) {
// increase `mc` and `cc` to read the next word from memory
mc := add(mc, 0x20)
cc := add(cc, 0x20)
} {
// Copy the data from source (cc location) to the slice data (mc location)
mstore(mc, mload(cc))
}
// Store the length of the slice. This will overwrite any partial data that
// was copied when having slices that are not a multiple of 32.
mstore(tempBytes, _length)
// update free-memory pointer
// allocating the array padded to 32 bytes like the compiler does now
// To set the used memory as a multiple of 32, add 31 to the actual memory usage (mc)
// and remove the modulo 32 (the `and` with `not(31)`)
mstore(0x40, and(add(mc, 31), not(31)))
}
// if we want a zero-length slice let's just return a zero-length array
default {
tempBytes := mload(0x40)
// zero out the 32 bytes slice we are about to return
// we need to do it because Solidity does not garbage collect
mstore(tempBytes, 0)
// update free-memory pointer
// tempBytes uses 32 bytes in memory (even when empty) for the length.
mstore(0x40, add(tempBytes, 0x20))
}
}
return tempBytes;
}
}
https://ethereum.stackexchange.com/questions/122029/how-does-bytes-utils-slice-function-work

Kotlin - Reading the least significant bit (Steganography)

I am creating a program that can read the least significant bit of an image file using Kotlin. I have a function that reads the bytes in a file, but I am unsure how to actually print the bytes in the function consumeArray.
My goal is to print the least significant bits of an image.
override fun run() {
val buff = ByteArray(1230)
File("src\\main\\kotlin\\day01_least_significant_bit_steganography\\eksempel_bakgrunnsbilde.png").inputStream().buffered().use { input ->
while(true) {
val sz = input.read(buff)
if (sz <= 0) break
///at that point we have a sz bytes in the buff to process
consumeArray(buff, 0, sz)
}
}
} // run
private fun consumeArray(buff: ByteArray, i: Int, sz: Int) {
println("??")
} // consumeArray
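In Kotlin 1.4+ you can get the lowest set bit of any byte with the .takeLowestOneBit() method. It returns a value in which only the lowest-order one-bit of the byte is set, or zero if the byte is zero. (If you need the value of bit 0 itself, which is what LSB steganography usually reads, that is buff[index].toInt() and 1.)
Because the result can be zero, you need to iterate over the byte array until a non-zero lowest bit is found (I believe this is what was meant by "least significant bit of byteArray"):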
var lowestBit: Byte = 0
for (index in sz - 1 downTo 0) {
val currentLowestBit = buff[index].takeLowestOneBit()
if (currentLowestBit != 0.toByte()) {
lowestBit = currentLowestBit
break
}
}
Note that this finds the lowest bit of your buffer, not of the whole image (if the image is bigger than the buffer).

rapidJson: crashed in release mode

I used rapidJson to read json data. I can build my application in both Debug and Release mode, but the application crashes in Release mode.
using namespace rapidjson;
...
char *buffer;
long fileSize;
size_t fileReadingResult;
//obtain file size
fseek(pFile, 0, SEEK_END);
fileSize = ftell(pFile);
if (fileSize <= 0) return false;
rewind(pFile);
//allocate memory to contain the whole file
buffer = (char *)malloc(sizeof(char)*fileSize);
if (buffer == NULL) return false;
//copy the file into the buffer
fileReadingResult = fread(buffer, 1, fileSize, pFile);
if (fileReadingResult != fileSize) return false;
buffer[fileSize] = 0;
Document document;
document.Parse(buffer);
When I run it in Release mode, I encounter an unhandled exception: a heap has been corrupted.
The application breaks at "res = _heap_alloc(size)" in the malloc.c file:
void * __cdecl _malloc_base (size_t size)
{
void *res = NULL;
// validate size
if (size <= _HEAP_MAXREQ) {
for (;;) {
// allocate memory block
res = _heap_alloc(size);
// if successful allocation, return pointer to memory
// if new handling turned off altogether, return NULL
if (res != NULL)
{
break;
}
if (_newmode == 0)
{
errno = ENOMEM;
break;
}
// call installed new handler
if (!_callnewh(size))
break;
// new handler was successful -- try to allocate again
}
It runs fine in Debug mode.
Maybe it is a memory leak involving your malloc call, since it runs fine once in Debug but crashes when you keep the application up longer.
Do you free your buffer after using it?
The reason is simple. You allocate a buffer of fileSize bytes, but after reading the file you write to the fileSize+1-th position with buffer[fileSize] = 0; which is one byte past the end of the allocation.
Fix: allocate one byte more.
buffer = (char *)malloc(fileSize + 1);
Debug builds pad memory allocations with additional bytes so it does not crash.

Use Gob to write logs to a file in an append style

Would it be possible to use Gob encoding for appending structs in series to the same file using append? It works for writing, but when reading with the decoder more than once I run into:
extra data in buffer
So I wonder if that's possible in the first place or whether I should use something like JSON to append JSON documents on a per line basis instead. Because the alternative would be to serialize a slice, but then again reading it as a whole would defeat the purpose of append.
The gob package wasn't designed to be used this way. A gob stream has to be written by a single gob.Encoder, and it also has to be read by a single gob.Decoder.
The reason for this is because the gob package not only serializes the values you pass to it, it also transmits data to describe their types:
A stream of gobs is self-describing. Each data item in the stream is preceded by a specification of its type, expressed in terms of a small set of predefined types.
This is state held by the encoder / decoder (which types have been transmitted, and how). A subsequent new encoder / decoder will not (and cannot) analyze the preceding stream to reconstruct the same state and continue where a previous encoder / decoder left off.
Of course if you create a single gob.Encoder, you may use it to serialize as many values as you'd like to.
Also you can create a gob.Encoder and write to a file, and then later create a new gob.Encoder, and append to the same file, but you must use 2 gob.Decoders to read those values, exactly matching the encoding process.
As a demonstration, let's follow an example. This example will write to an in-memory buffer (bytes.Buffer). 2 subsequent encoders will write to it, then we will use 2 subsequent decoders to read the values. We'll write values of this struct:
type Point struct {
X, Y int
}
For short, compact code, I use this "error handler" function:
func he(err error) {
if err != nil {
panic(err)
}
}
And now the code:
const n, m = 3, 2
buf := &bytes.Buffer{}
e := gob.NewEncoder(buf)
for i := 0; i < n; i++ {
he(e.Encode(&Point{X: i, Y: i * 2}))
}
e = gob.NewEncoder(buf)
for i := 0; i < m; i++ {
he(e.Encode(&Point{X: i, Y: 10 + i}))
}
d := gob.NewDecoder(buf)
for i := 0; i < n; i++ {
var p *Point
he(d.Decode(&p))
fmt.Println(p)
}
d = gob.NewDecoder(buf)
for i := 0; i < m; i++ {
var p *Point
he(d.Decode(&p))
fmt.Println(p)
}
Output (try it on the Go Playground):
&{0 0}
&{1 2}
&{2 4}
&{0 10}
&{1 11}
Note that if we used only one decoder to read all the values (looping until i < n + m), we'd get the same error message you posted in your question when the iteration reaches n + 1, because the subsequent data is not a serialized Point but the start of a new gob stream.
So if you want to stick with the gob package for what you want to do, you have to slightly modify and enhance your encoding / decoding process. You have to somehow mark the boundaries where a new encoder is used (so that when decoding, you'll know you have to create a new decoder to read subsequent values).
You may use different techniques to achieve this:
You may write out a number, a count before you proceed to write values, and this number would tell how many values were written using the current encoder.
If you don't want to or can't tell how many values will be written with the current encoder, you may opt to write out a special end-of-encoder value when you don't write more values with the current encoder. When decoding, if you encounter this special end-of-encoder value, you'll know you have to create a new decoder to be able to read more values.
Some things to note here:
The gob package is most efficient, most compact if only a single encoder is used, because each time you create and use a new encoder, the type specifications will have to be re-transmitted, causing more overhead, and making the encoding / decoding process slower.
You can't seek in the data stream, you can only decode any value if you read the whole file from the beginning up until the value you want. Note that this somewhat applies even if you use other formats (such as JSON or XML).
If you want seeking functionality, you'd need to manage an index file separately, which would tell at which positions new encoders / decoders start, so you could seek to that position, create a new decoder, and start reading values from there.
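As a rough illustration of the index idea, the sketch below records the byte offset at which each encoder starts and later decodes only the requested batch. Everything here is an assumption made for the example: the gobLog type and its methods are hypothetical, the data lives in a bytes.Buffer instead of a file, and the index is a slice in memory rather than a separate index file.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"io"
)

type Point struct {
	X, Y int
}

// gobLog is a hypothetical in-memory stand-in for a data file plus its index.
type gobLog struct {
	data    bytes.Buffer
	offsets []int // byte offset at which each encoder's stream starts
}

// appendBatch starts a new gob stream and remembers where it begins.
func (l *gobLog) appendBatch(pts []Point) error {
	l.offsets = append(l.offsets, l.data.Len())
	enc := gob.NewEncoder(&l.data)
	for _, p := range pts {
		if err := enc.Encode(p); err != nil {
			return err
		}
	}
	return nil
}

// readBatch decodes only batch k, without touching the earlier batches.
func (l *gobLog) readBatch(k int) ([]Point, error) {
	if k < 0 || k >= len(l.offsets) {
		return nil, fmt.Errorf("batch %d out of range", k)
	}
	raw := l.data.Bytes()
	end := len(raw)
	if k+1 < len(l.offsets) {
		end = l.offsets[k+1]
	}
	// A new decoder is created at the recorded start of the stream.
	dec := gob.NewDecoder(bytes.NewReader(raw[l.offsets[k]:end]))
	var out []Point
	for {
		var p Point
		err := dec.Decode(&p)
		if err == io.EOF {
			return out, nil // clean end of this batch
		}
		if err != nil {
			return nil, err
		}
		out = append(out, p)
	}
}

func main() {
	var l gobLog
	if err := l.appendBatch([]Point{{0, 0}, {1, 2}, {2, 4}}); err != nil {
		panic(err)
	}
	if err := l.appendBatch([]Point{{0, 10}, {1, 11}}); err != nil {
		panic(err)
	}
	pts, err := l.readBatch(1)
	if err != nil {
		panic(err)
	}
	fmt.Println(pts)
}
```

With a real file the same idea would presumably use Seek to jump to the stored offset before creating the decoder. Seeking is still only possible to the start of an encoder's stream, not to an arbitrary value inside it.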
Check a related question: Efficient Go serialization of struct to disk
In addition to the above, I suggest using an intermediate structure to exclude the gob header:
package main
import (
"bytes"
"encoding/gob"
"fmt"
"io"
"log"
)
type Point struct {
X, Y int
}
func main() {
buf := new(bytes.Buffer)
enc, _, err := NewEncoderWithoutHeader(buf, new(Point))
if err != nil {
log.Fatal(err)
}
enc.Encode(&Point{10, 10})
fmt.Println(buf.Bytes())
}
type HeaderSkiper struct {
src io.Reader
dst io.Writer
}
func (hs *HeaderSkiper) Read(p []byte) (int, error) {
return hs.src.Read(p)
}
func (hs *HeaderSkiper) Write(p []byte) (int, error) {
return hs.dst.Write(p)
}
func NewEncoderWithoutHeader(w io.Writer, sample interface{}) (*gob.Encoder, *bytes.Buffer, error) {
hs := new(HeaderSkiper)
hdr := new(bytes.Buffer)
hs.dst = hdr
enc := gob.NewEncoder(hs)
// Write sample with header info
if err := enc.Encode(sample); err != nil {
return nil, nil, err
}
// Change writer
hs.dst = w
return enc, hdr, nil
}
func NewDecoderWithoutHeader(r io.Reader, hdr *bytes.Buffer, dummy interface{}) (*gob.Decoder, error) {
hs := new(HeaderSkiper)
hs.src = hdr
dec := gob.NewDecoder(hs)
if err := dec.Decode(dummy); err != nil {
return nil, err
}
hs.src = r
return dec, nil
}
In addition to icza's great answer, you can use the following trick to append to a gob file that already contains data: when appending, first encode a dummy value into an intermediate writer and discard it, so that the type headers are not written to the file again:
Create the file
Encode gob as usual (the first encode writes headers)
Close file
Open file for append
Using an intermediate writer, encode a dummy struct (which writes headers)
Reset the writer
Encode gob as usual (writes no headers)
Example:
package main
import (
"bytes"
"encoding/gob"
"fmt"
"io"
"io/ioutil"
"log"
"os"
)
type Record struct {
ID int
Body string
}
func main() {
r1 := Record{ID: 1, Body: "abc"}
r2 := Record{ID: 2, Body: "def"}
// encode r1
var buf1 bytes.Buffer
enc := gob.NewEncoder(&buf1)
err := enc.Encode(r1)
if err != nil {
log.Fatal(err)
}
// write to file
err = ioutil.WriteFile("/tmp/log.gob", buf1.Bytes(), 0600)
if err != nil {
log.Fatal(err)
}
// encode dummy (which write headers)
var buf2 bytes.Buffer
enc = gob.NewEncoder(&buf2)
err = enc.Encode(Record{})
if err != nil {
log.Fatal(err)
}
// remove dummy
buf2.Reset()
// encode r2
err = enc.Encode(r2)
if err != nil {
log.Fatal(err)
}
// open file
f, err := os.OpenFile("/tmp/log.gob", os.O_WRONLY|os.O_APPEND, 0600)
if err != nil {
log.Fatal(err)
}
// write r2
_, err = f.Write(buf2.Bytes())
if err != nil {
log.Fatal(err)
}
// decode file
data, err := ioutil.ReadFile("/tmp/log.gob")
if err != nil {
log.Fatal(err)
}
var r Record
dec := gob.NewDecoder(bytes.NewReader(data))
for {
err = dec.Decode(&r)
if err == io.EOF {
break
}
if err != nil {
log.Fatal(err)
}
fmt.Println(r)
}
}

Go SQL scanned rows getting overwritten

I'm trying to read all the rows from a table on a SQL server and store them in string slices to use for later. The issue I'm running into is that the previously scanned rows are getting overwritten every time I scan a new row, even though I've converted all the mutable byte slices to immutable strings and saved the result slices to another slice. Here is the code I'm using:
rawResult := make([]interface{}, len(cols)) // holds anything that could be in a row
result := make([]string, len(cols)) // will hold all row elements as strings
var results [][]string // will hold all the result string slices
dest := make([]interface{}, len(cols)) // temporary, to pass into scan
for i := range rawResult {
dest[i] = &rawResult[i] // fill dest with pointers to rawResult to pass into scan
}
for rows.Next() { // for each row
err = rows.Scan(dest...) // scan the row
if err != nil {
log.Fatal("Failed to scan row", err)
}
for i, raw := range rawResult { // for each scanned byte slice in a row
switch rawtype := raw.(type){ // determine type, convert to string
case int64:
result[i] = strconv.FormatInt(raw.(int64), 10)
case float64:
result[i] = strconv.FormatFloat(raw.(float64), 'f', -1, 64)
case bool:
result[i] = strconv.FormatBool(raw.(bool))
case []byte:
result[i] = string(raw.([]byte))
case string:
result[i] = raw.(string)
case time.Time:
result[i] = raw.(time.Time).String()
case nil:
result[i] = ""
default: // shouldn't actually be reachable since all types have been covered
log.Fatalf("Unexpected type %T", rawtype)
}
}
results = append(results, result) // append the result to our slice of results
}
I'm sure this has something to do with the way Go handles variables and memory, but I can't seem to fix it. Can somebody explain what I'm not understanding?
You should create a new slice for each data row. Note that a slice holds a pointer to an underlying array, so every slice you added to results points to the same underlying array. That's why you are seeing this behaviour.
When you create a slice with make(), it returns a slice value (not a pointer to one), but that value refers to a single backing array, and no new memory is allocated each time an element is reassigned. Hence
result := make([]string, 5)
will have fixed memory holding 5 strings. When an element is reassigned, it occupies the same memory as before, overwriting the old value.
Hopefully the following example makes things clear.
http://play.golang.org/p/3w2NtEHRuu
Hence in your program you are changing the content of the same memory and appending it again and again. To solve this problem you should create your result slice inside the loop.
Move result := make([]string, len(cols)) into your for loop that loops over the available rows.
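To make the difference concrete, here is a small self-contained sketch that does not use database/sql at all. The function names collectShared and collectFresh are made up for illustration, and plain string slices stand in for the scanned rows.

```go
package main

import "fmt"

// collectShared reuses one row slice for every iteration, like the code in
// the question. Every element of the result shares the same backing array.
func collectShared(rows [][]string) [][]string {
	if len(rows) == 0 {
		return nil
	}
	row := make([]string, len(rows[0]))
	var out [][]string
	for _, r := range rows {
		copy(row, r)           // overwrites the values of the previous row
		out = append(out, row) // appends another reference to the same array
	}
	return out
}

// collectFresh allocates a new row slice inside the loop, so each appended
// row keeps its own values.
func collectFresh(rows [][]string) [][]string {
	var out [][]string
	for _, r := range rows {
		row := make([]string, len(r))
		copy(row, r)
		out = append(out, row)
	}
	return out
}

func main() {
	rows := [][]string{{"1", "a"}, {"2", "b"}}
	fmt.Println(collectShared(rows)) // [[2 b] [2 b]]
	fmt.Println(collectFresh(rows))  // [[1 a] [2 b]]
}
```

Converting the scanned values to strings does not help here, because it is the outer result slice, not the individual strings, that gets reused.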