According to the documentation, pg_attribute.attgenerated is typed as char and has a value of "a zero byte" if the column is not generated, and there is at least one other possible value, with potentially more in the future.
I want to query for all non-generated columns. Since I would prefer to not be tripped up by additions in future versions, the query predicate needs to be WHERE attgenerated = ZERO BYTE rather than an inequality, but I have no idea how to represent that value correctly in SQL.
What's the correct way to write this? In most programming languages you'd say '\0', and you can use escape sequences by prepending an e to the string literal, but if I say e'\0' it errors out with "invalid byte sequence for encoding "UTF8": 0x00". So I'm not quite sure what the right way to do this is.
It's simply an empty string:
WHERE attgenerated = ''
I tried to match the sql values string (0),(5),(12),... or (0,11),(122,33),(4,51),... or (0,121,12),(31,4,5),(26,227,38),... and so on with the regular expression
\(\s*\d+\s*(\s*,\s*\d+\s*)*\)(\s*,\s*\(\s*\d+\s*(\s*,\s*\d+\s*)*\))*
and it works. But...
How can I ensure that the regex does not match a values string like (0,12),(1,2,3),(56,7) with different number of columns?
Thanks in advance...
As i mentioned in comment to the question, the best way to check if input string is valid: contains the same count of numbers between brackets, is to use client side programm, but not clear SQL.
Implementation:
List<string> s = new List<string>(){
"(0),(5),(12)", "(0,11),(122,33),(4,51)",
"(0,121,12),(31,4,5),(26,227,38)","(0,12),(1,2,3),(56,7)"};
var qry = s.Select(a=>new
{
orig = a,
newst = a.Split(new string[]{"),(", "(", ")"},
StringSplitOptions.RemoveEmptyEntries)
})
.Select(a=>new
{
orig = a.orig,
isValid = (a.newst
.Sum(b=>b.Split(new char[]{','},
StringSplitOptions.RemoveEmptyEntries).Count()) %
a.newst.Count()) ==0
});
Result:
orig isValid
(0),(5),(12) True
(0,11),(122,33),(4,51) True
(0,121,12),(31,4,5),(26,227,38) True
(0,12),(1,2,3),(56,7) False
Note: The second Select statement gets the modulo of sum of comma instances and the count of items in string array returned by Split function. If the result isn't equal to zero, it means that input string is invalid.
I strongly believe there's a simplest way to achieve that, but - at this moment - i don't know how ;)
:(
Unless you add some more constraints, I don't think you can solve this problem only with regular expressions.
It isn't able to solve all of your string problems, just as it cannot be used to check that the opening and closing of brackets (like "((())()(()(())))") is invalid. That's a more complicated issue.
That's what I learnt in class :P If someone knows a way then that'd be sweet!
I'm sorry, I spent a bit of time looking into how we could turn this string into an array and do more work to it with SQL but built in functionality is lacking and the solution would end up being very hacky.
I'd recommend trying to handle this situation differently as large scale string computation isn't the best way to go if your database is to gradually fill up.
A combination of client and serverside validation can be used to help prevent bad data (like the ones with more numbers) from getting into the database.
If you need to keep those numbers then you could rework your schema to include some metadata which you can use in your queries, like how many numbers there are and whether it all matches nicely. This information can be computed inexpensively from your server and provided to the database.
Good luck!
I'm trying to write binary text files from a data frame in Julia using something along the lines of:
for x in RICT["$i"]["Sick"]
write(f9, convert(Int16, x ))
and everything works nicely except for when it comes to NA values. Missing values are treated as NA it seems, and I know that there are different ways of handling such values using the data frames package. Does anyone have any experience with these NAtypes? Should I convert the NAtypes to a more conventional type and then write them in? As always any help is much appreciated.
If you are writing a 16-byte integer value, there's no canonical representation of "blank", so you'd have to pick a special 16-byte integer value that represents NA. A common choice for this kind of thing is the smallest representable value – in this case typemin(Int16) == -32768. You can generalize this to other signed integer types.
I was looking for a way to enumerate String types in (vb).NET, but .NET enums only accept numeric type values.
The first alternative I came across was to create a dictionary of my enum values and the string I want to return. This worked, but was hard to maintain because if you changed the enum you would have to remember to also change the dictionary.
The second alternative was to set field attributes on each enum member, and retrieve it using reflection. Surely enough this worked aswell and also solved the maintenance problem, but it uses reflection and I've always read that using reflection should be a last resort thing.
So I started thinking and I came up with this: every ASCII character can be represented as a hexadecimal value, and you can assign hexadecimal values to enum members.
You could get rid of the attributes, assign the hexadecimal values to the enum members. Then, when you need the text value, convert the value to a byte array and use System.Text.Encodings.ASCII.GetString(enumMemberBytes) to get the string value.
Now speaking out of experience, anything I come up with is usually either flawed or just plain wrong. What do you guys think about this approach? Is there any reason not to do it like that?
Thanks.
EDIT
As pointed out by David W, enum member values are limited in length, depending on the underlying type (integer by default). So yes, I believe my method works but you are limited to characters in the ASCII table, with a maximum length of 4 or 8 characters using integers or longs respectively.
The easiest way I have found to dynamically parse a String representation of an Enumeration into the actual Enumeration type was to do the following:
Private EnumObject
[Undefined]
ValueA
ValueB
End Enum
dim enumVal as EnumObject = DirectCast([Enum].Parse(GetType(EnumObject), "ValueA"), EnumObject)
This removes the need to maintain a dictionary and allows you to just handle strings instead of converting to an Int or a Long. This does use reflection, but I have not come across any issues as long as you catch and handle any exceptions with the String Parse.
having a bit of trouble finding a solution to this.
I want to take a large ordered text file of words and create - in the same order - a text file of fixed length numeric values.
For example:
Input File Output File
AAA -> 00000001
AAH -> 00002718
AAZ -> 71827651
Initially it seemed a hash function would do the trick. However they are one way. Also perhaps they are a bit "heavyweight" for this. After all, I don't need any cryptography. Plus, it's a reference file. It will never change.
Any compression is a bonus not essential. That said, I don't want the file to get any bigger than it already is. Which is why I don't just want to write out the words as text but with fixed lengths.
So, bottom line; input is a NSString of variable length, output is an integer of fixed length. And, I must be able to take the integer and figure out the string.
Any help much appreciated!
Thanks!
xj
Well, this would be a bit of a brute force method, but here's my guess.
Start by making a custom function to convert one letter of text to an integer less than 100. (I'm not sure if such a function already exists, if so then great!) You might need to just go to stuff like "if ([input isEqual: #"a"]){ return 1;}
Then, run that function on each letter of text, and get the final integer by combining the previous results.
For example:
int myVal1 = [intConverter firstLetter];
int myVal2 = [intConverter secondLetter];
int myVal3 = [intConverter thirdLetter];
int finalValue =100^3 + 100^2*myVal1 + 100*myVal2 + myVal3;
Then, finalValue would be of the form 1(myVal1)(myVal2)(myVal3), which is what I think you're looking for.
To get back the original string, simply use the mod (%) and division functions to get the individual values back, then run the intConverter function backwards. (This would probably mean writing a new function that basically runs those if statements in reverse, but oh well.)
I hope this helps.