Find records where the length of an array equals a given value - Rails 4 - SQL

In my Room model, I have an attribute named available_days, which is being stored as an array.
For example:
Room.first.available_days
=> ["wed", "thurs", "fri"]
What is the best way to find all Rooms where the size of the array is equal to 3?
I've tried something like
Room.where('LENGTH(available_days) = ?', 3)
with no success.
Update: the data type for available_days is a string, but in order to store an array, I am serializing the attribute from my model:
app/models/room.rb
serialize :available_days

I can't think of a purely SQL way of doing it for SQLite, since available_days is stored as a string. But here's one way of doing it without loading all records at once:
rooms = []
# in_batches/each_record only exists from Rails 5 on, so use find_each,
# which batches under the hood and works on Rails 4 as well.
# Array() guards against rooms whose available_days was never set (nil).
Room.find_each(batch_size: 10) do |r|
  rooms << r if Array(r.available_days).length == 3
end
p rooms

If you're using Postgres you can parse the serialized string into an array type, then query on the length of the array. I expect other databases have similar approaches. How to do this depends on how the text is being serialized, but the default for Rails 4 is YAML, so I expect your data is encoded like this:
---
- first
- second
The following SQL removes the leading ---\n- as well as the final newline, then splits the remaining string on - into an array. It's not strictly necessary to clean up the extra characters just to find the length, but if you want to do other operations you may find it useful to have a cleaned-up array (no leading characters or trailing newline). This will only work for simple YAML arrays of simple strings.
Room.where("ARRAY_LENGTH(STRING_TO_ARRAY(RTRIM(REPLACE(available_days,'---\n- ',''),'\n'), '\n- '), 1) = ?", 3)
As you can see, this approach is rather complex. If possible you may want to add a new structured column (array or jsonb) and migrate the serialized string into a typed column, which makes this kind of query easier and more performant. Rails supports jsonb serialization for Postgres.
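If you go that route, a minimal migration sketch could look like the following. It assumes a Postgres string-array column named available_days_array; the table, column and class names are illustrative, not taken from the question.
class AddAvailableDaysArrayToRooms < ActiveRecord::Migration
  def up
    # Native Postgres array column, defaulting to an empty array.
    add_column :rooms, :available_days_array, :string, array: true, default: []
    Room.reset_column_information
    Room.find_each do |room|
      # available_days is the existing YAML-serialized attribute.
      room.update_column(:available_days_array, Array(room.available_days))
    end
  end

  def down
    remove_column :rooms, :available_days_array
  end
end
With a real array column the original query becomes straightforward:
Room.where('ARRAY_LENGTH(available_days_array, 1) = ?', 3)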

Related

Splunk: Entry looks like an array but can't be accessed as one

I've got a portion of a log entry which looks like an array, but I can only access it with the {} notation.
For example, I think the path is line.ul-log-data.meta.data[0].foo, but the only way I can access the value is line.ul-log-data.meta.data{}.foo.
I've been experimenting with various multivalue field evaluations but coming up short. For example, when I do an mvcount("line.ul-log-data.meta.data"), it returns 1.
What do I have to do to use the array notation [0] and get that count to return 2?
Splunk uses curly brackets to access JSON arrays because square brackets have a very different, historical function.
Have you tried mvcount("line.ul-log-data.meta.data{}")?

Convert String to array and validate size on Vertica

I need to execute a SQL query which converts a String column to an array and then validates the size of that array.
I was able to do it easily with PostgreSQL, e.g.:
select
  cardinality(string_to_array('a$b','$')),
  cardinality(string_to_array('a$b$','$')),
  cardinality(string_to_array('a$b$$$$$','$'))
But for some reason converting a String to an array on Vertica is not that simple. I saw these links:
https://www.vertica.com/blog/vertica-quick-tip-dynamically-split-string/
https://forum.vertica.com/discussion/239031/how-to-create-an-array-in-vertica
and many more, but none of them helped.
I also tried using:
select REGEXP_COUNT('a$b$$$$$','$')
But I get an incorrect value: 1.
How can I convert a String to an array on Vertica and get its length?
$ has a special meaning in a regular expression. It represents the end of the string.
Try escaping it:
select REGEXP_COUNT('a$b$$$$$', '[$]')
You could create a UDx scalar function (UDSF) in Java, C++, R or Python. The input would be a string and the output would be an integer. https://www.vertica.com/docs/9.2.x/HTML/Content/Authoring/ExtendingVertica/UDx/ScalarFunctions/ScalarFunctions.htm
This will allow you to use language-specific array logic on the strings passed in. For example, in Python you could include this logic:
# Split on the delimiter, drop empty segments, then count what's left.
input_list = input.split("$")
filtered_input_list = list(filter(None, input_list))
list_count = len(filtered_input_list)
These examples are a good starting point for writing UDx's for Vertica. https://github.com/vertica/UDx-Examples
I wasn't able to convert to an array, but I am able to get the length of the values.
What I do is convert to rows and use count. It's not the best performance-wise, but this way I'm also able to do manipulation like filtering each value between delimiters, and I don't need to use [] for characters like $.
select (select count(1)
        from (select StringTokenizerDelim('a$b$c','$') over ()) t)
Returns 3

Search for words in an arbitrary string. Which method is faster: SQL query or binary search?

So if I have the following string:
orig_string = 'adklsdntheasnienwordsnsaldkngarelskndlinasldknhere'
and I iterate through it like so:
orig_string.length.times do |index1|
  orig_string[index1..orig_string.length].length.times do |index2|
    puts orig_string[index2..orig_string.length]
    unless orig_string[index1..index2].length == 0 then puts orig_string[index1..index2] end
  end
end
to get every possible substring with order preserved. I am trying to pull as many English words from this string as possible by referencing a dictionary of ~5,000 words. Eventually I plan to iterate over many strings, so performance is key, which is why I am deferring to my peers.
Would it be quicker to load the dictionary into memory and binary search through it, or load it into an sqlite3 db and run a query for each permutation?
Also, is there a better way to get all substrings of the original string with order preserved?
Thanks!!
Find all substrings inside a string:
I think the following implementation for breaking the string into words is clearer, more Ruby-like and a little bit faster:
orig_string = 'adklsdntheasnienwordsnsaldkngarelskndlinasldknhere'
orig_string_len = orig_string.length
orig_string_len.downto(1) do |len|
  (orig_string_len - len).downto(0) do |index|
    puts orig_string.slice(index, len)
  end
end
Search for valid words:
I guess binary search is faster than SQL queries, since the data is already in memory and a lookup is just a function call.
SQL will parse the query and do many other calculations before returning the value.
There are other aspects to consider, though: sqlite3 is a C implementation, so it may be faster than a Ruby binary search for a large set.
If this algorithm will be heavily used, I suggest you benchmark both approaches.
Ruby has a pretty easy library for this, http://rubydoc.info/stdlib/benchmark/Benchmark, which comes with the Ruby standard library.
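As a rough illustration, a minimal benchmark sketch could look like the one below. It assumes the sqlite3 gem is available and uses a tiny stand-in word list in place of the real ~5,000-word dictionary; the table name and labels are made up for the example.
require 'benchmark'
require 'sqlite3'

# Stand-in dictionary; in practice this would be the sorted ~5,000-word list.
words = %w[are here the words].sort

# The same words in an in-memory SQLite table.
db = SQLite3::Database.new(':memory:')
db.execute('CREATE TABLE words (word TEXT PRIMARY KEY)')
words.each { |w| db.execute('INSERT INTO words (word) VALUES (?)', [w]) }

# Candidate substrings to look up, repeated so the timing is measurable.
candidates = %w[the words are here xyzzy] * 1_000

Benchmark.bm(10) do |x|
  x.report('bsearch:') do
    candidates.each { |c| words.bsearch { |w| c <=> w } }
  end
  x.report('sqlite:') do
    candidates.each { |c| db.get_first_value('SELECT 1 FROM words WHERE word = ?', [c]) }
  end
end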

Convert an alphanumeric string to integer format

I need to store an alphanumeric string in an integer column on one of my models.
I have tried:
@result.each do |i|
  hex_id = []
  i["id"].split(//).each { |c| hex_id.push(c.hex) }
  hex_id = hex_id.join
  ...
  Model.create(:origin_id => hex_id)
  ...
end
When I run this in the console using puts hex_id in place of the create line, it returns the correct values; however, the above code results in origin_id being set to "2147483647" for every instance. An example string input is "t6gnk3pp86gg4sboh5oin5vr40", so that doesn't make any sense to me.
Can anyone tell me what is going wrong here or suggest a better way to store a string like the aforementioned example as a unique integer?
Thanks.
Answering by request from the OP.
The hex_id.join operation does concatenate the hex values into one long digit string, but that number is far larger than a 4-byte signed integer column can hold, so the database most likely clamps it to the maximum positive value for that type, 2147483647. That is why every record ends up with the same origin_id.
On the other hand, the desired result 060003008600401100500050040 is far too large to be stored in an ordinary integer column anyway. A better approach would be to keep it as a string, or to use a different algorithm for producing a number from the original string. Perhaps aggregating the hex values with an arithmetic operation, or hashing the string down to a value that fits the column, will do better than join.
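For illustration, here is a minimal sketch of the hashing idea, truncating a digest so the result fits an 8-byte (bigint) column; the method name and the bigint assumption are mine, not from the question, and a truncated digest is not guaranteed collision-free, so the original string remains the only truly unique key.
require 'digest'

# Derive a compact integer from the alphanumeric id by hashing it and
# keeping the first 15 hex digits (60 bits), which fits in a signed bigint.
def origin_id_for(string_id)
  Digest::SHA256.hexdigest(string_id)[0, 15].to_i(16)
end

origin_id_for("t6gnk3pp86gg4sboh5oin5vr40")  # => a value well below 2**63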

ROR: Zero-padding removed upon database query

Is there any way to make .find not remove the zero-padding on the values it pulls back from the database?
i.e. I have zip codes in a database and some of them are shorter than 5 characters. I am zero-padding them back to 5 characters in the database, so I end up with "00210", for example. However, this value just becomes "210" in my array.
I know I can use "%05d" % value to zero-pad it when it's going back into the views, but I'd rather not have to zero-pad it on the way out like that.
A Fixnum (the Ruby type you're dealing with) only cares about the value: my_var = 00000001 in Ruby sets my_var to 1, and outputting it as a string results in "1". If you want the leading zeros to survive, you'll have to rely on string functionality, either by formatting the number on output or by storing the zip code as a string in the first place.
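A tiny sketch of those two options, with assumed model and column names:
# Option 1: keep the integer column and pad only when displaying.
zip = 210
format('%05d', zip)   # => "00210"

# Option 2: store zip codes in a string column so the padding survives
# round-trips through the database (migration sketch, names assumed):
#   add_column :addresses, :zip_code, :string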