Go application making SQL Query using GROUP_CONCAT on FLOATS returns []uint8 instead of actual []float64 - sql

Have a problem using group_concat in a query made by my go application.
Any idea why a group_concat of FLOATS would look like a []uint8 on the Go side?
Cant seem to properly convert the suckers either.
It's definitely floats, I can see it in the raw query results, but when I do the same query in go and try to .Scan the result, Go complains that it's a []uint8 not a []float64 (which it actually is) Attempts to convert to floats gives me the wrong values (and way too many of them).
For example, at the database, I query and get 2 floats for the column in question, looks like this:
"5650.50, 5455.00"
On the go side however, go sees a []uint8 instead of []float64. Why does this happen? How does one workaround this to get the actual results?
My problem is that I have to use this SQL with the group_concat, due to the nature of the database I am working with, this is the best way to get the information, and more importantly the query itself works great, returns the data the function needs, but now I cant read it out because of type issues. No stranger to those, but Go isn't cooperating with me today.
I'd be more than pleased to learn WHY go is doing it this way, and delighted to learn of a way to deal with it.
Example:
SELECT ID, getDistance(33.1543,-110.4353, Loc.Lat, Loc.Lng) as distance,
GROUP_CONCAT(values) FROM stuff INNER JOIN device on device.ID = stuff.ID WHERE (someConditionsETC) GROUP BY ID ORDER BY ID
The actual result, when interfacing with the actual database (not within my application), is
"5650.00, 5850.50"
It's clearly 2 floats.
The same result produces a slice of uint8 when queried from Go and trying to .Scan the result in. If I range through and print those values, I get way more than 2, and they are uint8 (bytes) that look like this:
53,55,56,48,46,48,48
Not sure how Go expects me to handle this.
Solution.... stupid simple and not terribly obvious:
The solution: 
crazyBytes := []uint8("5760.00,5750.50")
aString := string(crazyBytes)
strSlice := strings.Split(aString,",") // string representation of our array (of floats)
var floatz []float64
for _, x := range strSlice {
fmt.Printf("At last, Float: %s \r\n",x)
f,err := strconv.ParseFloat(x,64)
if err != nil { fmt.Printf("Error: %s",err) }
floatz = append(floatz, f)
fmt.Printf("as float: %s \r\n", strconv.FormatFloat(f,'f',-1,64))
}
Yea sure, it's obvious NOW.

GROUP_CONCAT returns a string. So in Go you get a byte array of characters, not a float. The result you posted 53,55,56,48,46,48,48 translates into a string "5780.00" which does look like one of your values. So you need to either fix your SQL to return floats or use strings and strconv modules in Go to parse and convert your string into floats. I think the former approach is better, but it is up to you.

Related

Generating Random String of Numbers and Letters Using Go's "testing/quick" Package

I've been breaking my head over this for a few days now and can't seem to be able to figure it out. Perhaps it's glaringly obvious, but I don't seem to be able to spot it. I've read up on all the basics of unicode, UTF-8, UTF-16, normalisation, etc, but to no avail. Hopefully somebody's able to help me out here...
I'm using Go's Value function from the testing/quick package to generate random values for the fields in my data structs, in order to implement the Generator interface for the structs in question. Specifically, given a Metadata struct, I've defined the implementation as follows:
func (m *Metadata) Generate(r *rand.Rand, size int) (value reflect.Value) {
value = reflect.ValueOf(m).Elem()
for i := 0; i < value.NumField(); i++ {
if t, ok := quick.Value(value.Field(i).Type(), r); ok {
value.Field(i).Set(t)
}
}
return
}
Now, in doing so, I'll end up with both the receiver and the return value being set with random generated values of the appropriate type (strings, ints, etc. in the receiver and reflect.Value in the returned reflect.Value).
Now, the implementation for the Value function states that it will return something of type []rune converted to type string. As far as I know, this should allow me to then use the functions in the runes, unicode and norm packages to define a filter which filters out everything which is not part of 'Latin', 'Letter' or 'Number'. I defined the following filter which uses a transform to filter out letters which are not in those character rangetables (as defined in the unicode package):
func runefilter(in reflect.Value) (out reflect.Value) {
out = in // Make sure you return something
if in.Kind() == reflect.String {
instr := in.String()
t := transform.Chain(norm.NFD, runes.Remove(runes.NotIn(rangetable.Merge(unicode.Letter, unicode.Latin, unicode.Number))), norm.NFC)
outstr, _, _ := transform.String(t, instr)
out = reflect.ValueOf(outstr)
}
return
}
Now, I think I've tried just about anything, but I keep ending up with a series of strings which are far from the Latin range, e.g.:
𥗉똿穊
𢷽嚶
秓䝏小𪖹䮋
𪿝ท솲
𡉪䂾
ʋ𥅮ᦸ
堮𡹯憨𥗼𧵕ꥆ
𢝌𐑮𧍛併怃𥊇
鯮
𣏲𝐒
⓿ꐠ槹𬠂黟
𢼭踁퓺𪇖
俇𣄃𔘧
𢝶
𝖸쩈𤫐𢬿詢𬄙
𫱘𨆟𑊙
欓
So, can anybody explain what I'm overlooking here and how I could instead define a transformer which removes/replaces non-letter/number/latin characters so that I can use the Value function as intended (but with a smaller subset of 'random' characters)?
Thanks!
Confusingly the Generate interface needs a function using the type not a the pointer to the type. You want your type signature to look like
func (m Metadata) Generate(r *rand.Rand, size int) (value reflect.Value)
You can play with this here. Note: the most important thing to do in that playground is to switch the type of the generate function from m Metadata to m *Metadata and see that Hi Mom! never prints.
In addition, I think you would be better served using your own type and writing a generate method for that type using a list of all of the characters you want to use. For example:
type LatinString string
const latin = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz01233456789"
and then use the generator
func (l LatinString) Generate(rand *rand.Rand, size int) reflect.Value {
var buffer bytes.Buffer
for i := 0; i < size; i++ {
buffer.WriteString(string(latin[rand.Intn(len(latin))]))
}
s := LatinString(buffer.String())
return reflect.ValueOf(s)
}
playground
Edit: also this library is pretty cool, thanks for showing it to me
The answer to my own question is, it seems, a combination of the answers provided in the comments by #nj_ and #jimb and the answer provided by #benjaminkadish.
In short, the answer boils down to:
"Not such a great idea as you thought it was", or "Bit of an ill-posed question"
"You were using the union of 'Letter', 'Latin' and 'Number' (Letter || Number || Latin), instead of the intersection of 'Latin' with the union of 'Letter' and 'Number' ((Letter || Number) && Latin))
Now for the longer version...
The idea behind me using the testing/quick package is that I wanted random data for (fuzzy) testing of my code. In the past, I've always written the code for doing things like that myself, again and again. This meant a lot of the same code across different projects. Now, I could of course written my own package for it, but it turns out that, even better than that, there's actually a standard package which does just about exactly what I want.
Now, it turns out the package does exactly what I want very well. The codepoints in the strings which it generates are actually random and not just restricted to what we're accustomed to using in everyday life. Now, this is of course exactly the thing which you want in doing fuzzy testing in order to test the code with values outside the usual assumptions.
In practice, that means I'm running into two problems:
There's some limits on what I would consider reasonable input for a string. Meaning that, in testing the processing of a Name field or a URL field, I can reasonably assume there's not going to be a value like 'James Mc⌢' (let alone 'James Mc🙁') or 'www.🕸site.com', but just 'James McFrown' and 'www.website.com'. Hence, I can't expect a reasonable system to be able to support it. Of course, things shouldn't completely break down, but it also can't be expected to handle the former examples without any problems.
When I filter the generated string on values which one might consider reasonable, the chance of ending up with a valid string is very small. The set of possible characters in the set used by the testing/quick is just so large (0x10FFFF) and the set of reasonable characters so small, you end up with empty strings most of the time.
So, what do we need to take away from this?
So, whilst I hoped to use the standard testing/quick package to replace my often repeated code to generate random data for fuzzy testing, it does this so well that it provides data outside the range of what I would consider reasonable for the code to be able to handle. It seems that the choice, in the end, is to:
Either be able to actually handle all fuzzy options, meaning that if somebody's name is 'Arnold 💰💰' ('Arnold Moneybags'), it shouldn't go arse over end. Or...
Use custom/derived types with their own Generator. This means you're going to have to use the derived type instead of the basic type throughout the code. (Comparable to defining a string as wchar_t instead of char in C++ and working with those by default.). Or...
Don't use testing/quick for fuzzy testing, because as soon as you run into a generated string value, you can (and should) get a very random string.
As always, further comments are of course welcome, as it's quite possible I overlooked something.

wxWidgets - wxGrid - reading/writing non string cell values

I have a wxGrid to edit an array of numerical data.
I was wondering what's the best way to get non-string data in and out of the cells without going through the string to numeric conversion all the time.
I've used SetCellEditor() to control the data entry.
currently I use this:
// numeric value into cell
str.clear();
str << val1;
m_grid4->SetCellValue(row, col, str);
..
// read value from back into variable
val = atoi(m_grid4->GetCellValue(row, col));
Apart from the fact that atoi() is a bit ugly and a template function with a stringstream would be better, is there a way do get non-string values a bit better in and out of cells?
I was looking at the editors and renderers but can't figure it out.
If you worry about efficiency, you almost certainly should use a custom table class deriving from wxGridTableBase instead of using the default trivial wxGridStringTable implementation which stores everything as strings. Then, and much less importantly, if it makes sense in your case, you can use wxGridCellNumberRenderer which will call your table GetValueAsLong() method instead of GetValue() (which returns a string).
Both of those are demonstrated in wxGrid sample, notably look at BugsGridTable there.
Good luck!

Golang server: send JSON with SQL query result that has variable number of columns

I'm creating a little implementation of RESTful API with Go server.
I'm extracting query parameters from URL (I know it's not safe, I'll try to fix this later on, but if you have any recommendations even on this topic, they would be helpful).
I have table name, desired columns and some conditions saved in 3 sring variables.
I'm using this query:
rows, _ := db.Query(fmt.Sprintf("SELECT %s FROM %s WHERE %s", columns, table, conditions))
I want to send the query result back to my frontend, as JSON. I have variable number of unknown columns, so I can't do it "standard" way.
One solution I can think of is to build a JSON string "manually" from from query result and rows.Columns().
But I'd like to do this in a more sofisticated way using something like variadic interface and stuff like that. The problem is, even after trying a lot, I still dont understand how it works.
I tried using following code
cols, err := rows.Columns() // Get the column names; remember to check err
vals := make([]sql.RawBytes, len(cols)) // Allocate enough values
ints := make([]interface{}, len(cols)) // Make a slice of []interface{}
for i := range ints {
vals[i] = &ints[i] // Copy references into the slice
}
for rows.Next() {
err := rows.Scan(vals...)
// Now you can check each element of vals for nil-ness,
// and you can use type introspection and type assertions
// to fetch the column into a typed variable.
}
from this tutorial but it doesn't work, I'm getting errors like
cannot use &ints[i] (type *interface {}) as type sql.RawBytes in assignment
And even if it'd work, I dont understand it.
Does anyone have a good solution for this? Some explanation would be great aswell.
Thanks a lot.
The first problem is here:
for i := range ints {
vals[i] = &ints[i] // Copy references into the slice
}
This is you setting values that are meant to be RawBytes as pointers to interfaces.
Before I explain what that's meant to be doing I'll see if I can explain what the general idea is here.
So normally when getting the response from SQL in Go you'd have a slice with each column and type (id int, name string, ...) so you can then read each SQL record into this slice and each column will be mapped to the value of the same type.
For cases like yours where you will have more variety in the response from SQL and need Go to handle it, you'd do this:
for i := range ints {
ints[i] = &vals[i] // Copy references into the slice
}
What this is saying is each of your interface values holds a pointer to the vals array that will hold the response from SQL. (In my examples I use [][]byte instead of RawBytes so value in vals would be a slice of byte values from SQL.)
You'd then do:
err := rows.Scan(ints...)
Since interface can evaluate to any type, when the ints array becomes populated it will accept any value then update the position in vals array based on the pointer with the value from SQL as a RawBytes type.
HTH

Regex match SQL values string with multiple rows and same number of columns

I tried to match the sql values string (0),(5),(12),... or (0,11),(122,33),(4,51),... or (0,121,12),(31,4,5),(26,227,38),... and so on with the regular expression
\(\s*\d+\s*(\s*,\s*\d+\s*)*\)(\s*,\s*\(\s*\d+\s*(\s*,\s*\d+\s*)*\))*
and it works. But...
How can I ensure that the regex does not match a values string like (0,12),(1,2,3),(56,7) with different number of columns?
Thanks in advance...
As i mentioned in comment to the question, the best way to check if input string is valid: contains the same count of numbers between brackets, is to use client side programm, but not clear SQL.
Implementation:
List<string> s = new List<string>(){
"(0),(5),(12)", "(0,11),(122,33),(4,51)",
"(0,121,12),(31,4,5),(26,227,38)","(0,12),(1,2,3),(56,7)"};
var qry = s.Select(a=>new
{
orig = a,
newst = a.Split(new string[]{"),(", "(", ")"},
StringSplitOptions.RemoveEmptyEntries)
})
.Select(a=>new
{
orig = a.orig,
isValid = (a.newst
.Sum(b=>b.Split(new char[]{','},
StringSplitOptions.RemoveEmptyEntries).Count()) %
a.newst.Count()) ==0
});
Result:
orig isValid
(0),(5),(12) True
(0,11),(122,33),(4,51) True
(0,121,12),(31,4,5),(26,227,38) True
(0,12),(1,2,3),(56,7) False
Note: The second Select statement gets the modulo of sum of comma instances and the count of items in string array returned by Split function. If the result isn't equal to zero, it means that input string is invalid.
I strongly believe there's a simplest way to achieve that, but - at this moment - i don't know how ;)
:(
Unless you add some more constraints, I don't think you can solve this problem only with regular expressions.
It isn't able to solve all of your string problems, just as it cannot be used to check that the opening and closing of brackets (like "((())()(()(())))") is invalid. That's a more complicated issue.
That's what I learnt in class :P If someone knows a way then that'd be sweet!
I'm sorry, I spent a bit of time looking into how we could turn this string into an array and do more work to it with SQL but built in functionality is lacking and the solution would end up being very hacky.
I'd recommend trying to handle this situation differently as large scale string computation isn't the best way to go if your database is to gradually fill up.
A combination of client and serverside validation can be used to help prevent bad data (like the ones with more numbers) from getting into the database.
If you need to keep those numbers then you could rework your schema to include some metadata which you can use in your queries, like how many numbers there are and whether it all matches nicely. This information can be computed inexpensively from your server and provided to the database.
Good luck!

OpenRefine remove duplicates from list with jython

I have a column with values that are duplicated e.g.
VMS5796,VMS5650,VMS5650,CSL,VMA5216,CSL,VMA5113
I'm applying a transform using jython that removes the duplicates (On error is set to keep original), here's the code:
return list(set(value.split(",")))
Which works in the preview, but isn't getting applied to the column. What am I doing wrong?
The Map function is very powerful and an underused function in Python / Jython. It probably is unclear what this code does internally, but it is extremely fast in processing millions of bits of values from a list or array in your columns cells' values that need to be 'mapped' as a string type and then applying a join with a separator char such as a comma ', '
deduped_list = list(set(value.split(",")))
return ', '.join(map(str, deduped_list))
There are probably other, even slightly faster variations than this, but this should get you going in the right direction.
Interestingly, you can also get the 'printable representation' repr(object) which is acceptable to an EVAL like OpenRefine's and can be useful for seeing the representation of your values as well..., which I just found out about, researching this answer in more depth for you.
deduped_list = list(set(value.split(",")))
return ', '.join(map(repr, deduped_list))
Preview implicitly formats things for display. Your expression returns an array (which can't be stored in a cell), so if you'd like to get it string form, tack a .join(',') on the end.