OpenRefine: split with regex gives strange result

I applied the GREL expression "value.split(/a/)" to some cells:
abcdef -> [ "", "bcdef" ]
bcdefa -> [ "bcdef" ]
badef -> [ "b", "def" ]
I can't understand why the first cell gives me a "" element in the resulting array. Is it a bug?
Thanks!

I don't know Java well enough to comment on the source code for this function, but according to one of the developers of OpenRefine this behavior is normal (edit: more details in Owen's comment, below). This is why there are other functions to split a string.
value.smartSplit(/a/), for example, gives a more consistent result when the separator is at the beginning or at the end of the string:
row  value   value.smartSplit(/a/)
1.   abcdef  [ "", "bcdef" ]
2.   bcdefa  [ "bcdef", "" ]
3.   badef   [ "b", "def" ]
For these values, this is the same result as using partition() with the omitFragment argument set to true:
row  value   value.partition(/a/, true)
1.   abcdef  [ "", "bcdef" ]
2.   bcdefa  [ "bcdef", "" ]
3.   badef   [ "b", "def" ]
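The asymmetry in the question is consistent with how Java's regex split is documented to behave: a leading empty string is kept, but trailing empty strings are discarded. Assuming GREL's split() relies on that Java behavior (an assumption, not confirmed in this thread), the Python sketch below reproduces both sets of results. java_like_split and keep_all_split are hypothetical helpers written only for illustration.

```python
import re

def java_like_split(text, pattern):
    """Emulate Java's String.split(regex): drop trailing empty strings."""
    parts = re.split(pattern, text)
    if len(parts) == 1:
        # No match at all: Java returns the original string unchanged.
        return parts
    while parts and parts[-1] == "":
        parts.pop()
    return parts

def keep_all_split(text, pattern):
    """Keep every piece, including empty strings at either end."""
    return re.split(pattern, text)

for value in ["abcdef", "bcdefa", "badef"]:
    print(value, java_like_split(value, "a"), keep_all_split(value, "a"))
```

The first helper gives the results seen with split(), the second the results shown above for smartSplit().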


Karate contains only shortcut fails in v1.2.0

According to the Karate documentation for the contains shortcuts, the following match arguments should work and do the same thing:
#test_contains_only
Scenario: Test contains only shortcut
* def expected = [ "a", "b", "c", "d" ]
* def to_match = { "properties": { "additional_information": "Some put-updated info", "types": [ "a", "b", "c", "d" ] } }
* print to_match.properties.relation_types
# This will pass
* match to_match.properties contains deep
"""
{
"additional_information": "Some put-updated info",
"types": [ "a", "b", "c", "d" ]
}
"""
# This will pass
* match to_match.properties.types == '#(^^expected)'
# This will fail
* match to_match.properties contains deep
"""
{
"additional_information": "Some put-updated info",
"types": '#(^^expected)'
}
"""
When executing the above, I get this error:
* match to_match.properties contains deep
"""
{
"additional_information": "Some put-updated info",
"types": '#(^^expected)'
}
"""
match failed: CONTAINS_DEEP
$ | actual does not contain expected | all key-values did not match, expected has un-matched keys - [types] (MAP:MAP)
{"additional_information":"Some put-updated info","types":["a","b","c","d"]}
{"additional_information":"Some put-updated info","types":"#(^^expected)"}
$.types | actual does not contain expected | actual array does not contain expected item - #(^^expected) (LIST:STRING)
["a","b","c","d"]
'#(^^expected)'
$.types[3] | data types don't match (STRING:LIST)
'd'
["a","b","c","d"]
$.types[2] | data types don't match (STRING:LIST)
'c'
["a","b","c","d"]
$.types[1] | data types don't match (STRING:LIST)
'b'
["a","b","c","d"]
$.types[0] | data types don't match (STRING:LIST)
'a'
["a","b","c","d"]
In my opinion, using '#(^^expected)' doesn't work as expected. Or is it happening because I'm using match to_match.properties contains deep together with the ^^ (contains only shortcut)?
I'm using Karate v1.2.0. The same scenario doesn't fail in v1.0.1.
Check whether this is fixed in 1.2.1.RC2: https://github.com/karatelabs/karate/issues/2007
Otherwise, please file an issue. Ideally a PR ;)

Search string and return matching substring MongoDB

I am working on a Go project that uses MongoDB as its database. I have a collection with the following records:
[
{
"_id": 1,
"vals": [
"110",
"2211"
]
},
{
"_id": 2,
"vals": [
"Abcd",
"102"
]
}
]
I want to perform a search such that if I pass "11001", the first record is returned. I have not found a way to do this. I have tried the following query:
db.getCollection('ColName').find({"vals":{"$regex": "^11001", "$options": "i"}})
The values saved in the database are shorter than the string I pass to the search. If I pass "110" or "11", the query returns the record, but my requirement is different: I have the full string and need it to match stored values of 2, 3, or 4 characters.
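This comes down to how the regex is written.
db.getCollection('ColName').find({"vals":{"$regex": "^110(01)?", "$options": "i"}})
will work for you.
"?" in a regex means "match zero or one occurrence" of the preceding group, so "(01)?" makes the trailing "01" optional.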
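If the actual requirement is that a stored value should match whenever it is a prefix of the search string, one possible alternative is to enumerate the prefixes of the input and query with $in. The Python sketch below only builds the filter document; it does not connect to a database. prefix_filter, min_len and max_len are hypothetical names, with the default lengths taken from the "2, 3, or 4 characters" mentioned in the question.

```python
def prefix_filter(search, field="vals", min_len=2, max_len=4):
    """Build a MongoDB filter that matches documents whose array field
    contains any prefix of `search` between min_len and max_len characters."""
    upper = min(max_len, len(search))
    prefixes = [search[:n] for n in range(min_len, upper + 1)]
    return {field: {"$in": prefixes}}

# For "11001" the candidate prefixes are "11", "110" and "1100".
print(prefix_filter("11001"))
```

The resulting dictionary can be passed as the filter of a find call. Note that $in compares values exactly, so unlike the regex with the "i" option it is case-sensitive.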

Karate API framework - Validate randomly displayed items in response

I am using the Karate API framework for API automation and came across the following scenario. When I make a POST call, it returns a JSON response in which a few of the items have tags, whereas the others show tags as blank. To get all the tags, I use this line in the feature file:
* def getTags = get response.items[*].resource.tags
It gives me the following response:
[
[
],
[
],
[
{
"tags" : "Entertainment"
}
],
[
],
[
{
"tags" : "Family"
}
],
As you can see, out of 5 or 6 tags only 2 have a value, so I want to check whether any tags value is present. What would the assertion logic be, considering that these tags can all come back empty and sometimes come with a string value? In the above case, the values are "Family" and "Entertainment".
Thanks in advance !
* match each response.items[*].resource.tags == "##string"
This will validate that each tags value either doesn't exist or is a string.
I think you can use a second variable to strip out the empties, or maybe your original JsonPath should use the .. operator. You can experiment:
* def allowed = ['Music', 'Entertainment', 'Documentaries', 'Family']
* def response =
"""
[
[
],
[
],
[
{
"tags":"Entertainment"
}
],
[
],
[
{
"tags":"Family"
}
]
]
"""
* def temp = get response..tags
* print temp
* match each temp == "#? allowed.contains(_)"
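For readers less familiar with JsonPath, the logic of the last three steps can be sketched in plain Python: gather every tags value at any depth (which is what the .. operator does in this example), then check each one against the allowed list. This is only an illustration of the idea, not Karate code, and collect_tags is a hypothetical helper.

```python
def collect_tags(node):
    """Recursively gather every value stored under a "tags" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "tags":
                found.append(value)
            found.extend(collect_tags(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_tags(item))
    return found

allowed = ["Music", "Entertainment", "Documentaries", "Family"]
response = [[], [], [{"tags": "Entertainment"}], [], [{"tags": "Family"}]]

temp = collect_tags(response)  # the empty lists contribute nothing
print(temp)
assert all(tag in allowed for tag in temp)
```

In this Python sketch, a response in which every list is empty passes trivially, because there is nothing to check.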

How to properly read SQL Server syntax?

In the MSDN Library or on the TechNet website, Microsoft tends to use a pseudo-syntax to explain how to use T-SQL statements with all their available options. Here is a sample taken from the TechNet page on UPDATE STATISTICS:
UPDATE STATISTICS table_or_indexed_view_name
[
{
{ index_or_statistics__name }
| ( { index_or_statistics_name } [ ,...n ] )
}
]
[ WITH
[
FULLSCAN
| SAMPLE number { PERCENT | ROWS }
| RESAMPLE
| <update_stats_stream_option> [ ,...n ]
]
[ [ , ] [ ALL | COLUMNS | INDEX ]
[ [ , ] NORECOMPUTE ]
] ;
<update_stats_stream_option> ::=
[ STATS_STREAM = stats_stream ]
[ ROWCOUNT = numeric_constant ]
[ PAGECOUNT = numeric_contant ]
How do I read such a description properly, quickly figure out what is required and what is optional, and write my query cleanly?
You should refer to the Transact-SQL Syntax Conventions article.
The first table in that article explains pretty much everything.
In your example we can see the following:
UPDATE STATISTICS table_or_indexed_view_name
UPDATE STATISTICS is the keyword used
table_or_indexed_view_name is the name of the table or the view to update statistics for
[
{
{ index_or_statistics__name }
| ( { index_or_statistics_name } [ ,...n ] )
}
]
This part is optional ([]). If supplied, you must give either a single statistics name, { index_or_statistics__name }, or (|) a parenthesized list of statistics names separated by commas, ( { index_or_statistics_name } [ ,...n ] ).
[ WITH
[
FULLSCAN
| SAMPLE number { PERCENT | ROWS }
| RESAMPLE
| <update_stats_stream_option> [ ,...n ]
]
[ [ , ] [ ALL | COLUMNS | INDEX ]
[ [ , ] NORECOMPUTE ]
] ;
This is optional too ([]). If used, you must begin with WITH, and you may then choose one of 4 options:
1. FULLSCAN
2. SAMPLE number { PERCENT | ROWS }, where you have to define the number and must choose either PERCENT or ROWS
3. RESAMPLE
4. <update_stats_stream_option> [ ,...n ], which is a list separated by commas
Then you can choose one of ALL, COLUMNS or INDEX, preceded by a comma if you have used one of the options above.
Lastly, you can add NORECOMPUTE, again with a comma before it if you have used any other option before it.
<update_stats_stream_option> ::=
[ STATS_STREAM = stats_stream ]
[ ROWCOUNT = numeric_constant ]
[ PAGECOUNT = numeric_contant ]
This is the list of predefined options you may use wherever <update_stats_stream_option> appears above (option 4).
Anything between square brackets [...] is optional.
Anything separated by the pipe symbol | is a choice: one or the other.
In your above example, you could read it as
UPDATE STATISTICS table_or_indexed_view_name
[ optionally specify an index as well]
[ optionally specify options using WITH
If you use WITH then you can follow it with one of the following keywords
FULLSCAN
OR SAMPLE number { PERCENT | ROWS }
OR RESAMPLE
].. and so on

Rebol switch and type?

Why do I have to cast typeof to a string for switch to work?
This doesn't work:
typeof: type? get 'optional
switch typeof [
word! [
print "word"
]
string! [
print "string"
]
]
This works:
typeof: type? get 'optional
switch to-string typeof [
"word" [
print "word"
]
"string" [
print "string"
]
]
switch type?/word :optional [
word! [ print "word" ]
string! [ print "string" ]
]
OR
switch type? :optional reduce [
word! [ print "word" ]
string! [ print "string" ]
]
The reason is that REBOL doesn't reduce ("evaluate") the cases in the switch statement. Without the /word refinement, the type? function returns a datatype!, but the switch statement tries to match this against a word!, and it fails.
I realize this might be confusing, so your best bet is either to convert the type to a string (as you did), or use one of the two idioms I've suggested. I prefer the first one, using type?/word.
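As a loose analogy only (Python is not REBOL, and every name below is made up for illustration), the same mismatch can be sketched with a dictionary whose keys are plain names rather than the type objects those names would evaluate to:

```python
def dispatch(cases, key):
    """Return the branch stored under key, or "no match" if there is none."""
    return cases.get(key, "no match")

# Case labels left unevaluated: they are just names, like word! values.
unevaluated = {"str": "string", "int": "integer"}

# Case labels evaluated first, like passing the block through reduce.
reduced = {str: "string", int: "integer"}

datatype = type("hello")  # a type object, comparable to a datatype! value

print(dispatch(unevaluated, datatype))           # type object vs. name
print(dispatch(unevaluated, datatype.__name__))  # name vs. name, like type?/word
print(dispatch(reduced, datatype))               # type object vs. type object
```

Only the first lookup fails, because it compares a type object against a name. The other two compare like with like, which is what type?/word and reduce achieve in the two idioms above.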