REST API - Escaping characters

Let's assume I have a notes field with newline characters in it.
Which solution is correct, and what is the difference between them?
1
{
"notes" : "test test test \n line2"
}
2
{
"notes" : "test test test \\n line2"
}
Thank you

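In option 1:
{
"notes" : "test test test \n line2"
}
the newline character has been escaped using \n, which is the JSON escape sequence for a line break. In option 2, \\n is an escaped backslash followed by the letter n, so the decoded string contains a literal backslash and an n rather than a line break.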
On your UI, you probably want to show
test test test
line2
I am assuming that you are retrieving the JSON from a data store (perhaps MySQL). If so, please see this answer about escaping newline chars in MySQL and MySQL string literals.
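To make the difference between the two payloads concrete, here is a minimal Python sketch. Python's standard json module is used only for illustration (the question does not say what language the API or client is written in), and the variable names are made up for this example. It decodes both payloads and shows what the consumer ends up with.

```python
import json

# Payload 1: \n is the JSON escape sequence for a newline character.
payload_1 = r'{"notes": "test test test \n line2"}'
# Payload 2: \\ is an escaped backslash, so the value holds a literal
# backslash followed by the letter n, not a line break.
payload_2 = r'{"notes": "test test test \\n line2"}'

notes_1 = json.loads(payload_1)["notes"]
notes_2 = json.loads(payload_2)["notes"]

print(repr(notes_1))  # the decoded value contains a real newline
print(repr(notes_2))  # the decoded value contains backslash + n
```

If the UI should render two lines, payload 1 is the form to send; payload 2 only makes sense when the literal characters \n are the intended content.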

awk set variable to line after if statement match

I have the following text...
BIOS Information
Manufacturer : Dell Inc.
Version : 2.5.2
Release Date : 01/28/2015
Firmware Information
Name : iDRAC7
Version : 2.21.21 (Build 12)
Firmware Information
Name : Lifecycle Controller 2
Version : 2.21.21.21
... which is piped into the following awk statement...
awk '{ if ($1" "$2 == "BIOS Information") var=$1} END { print var }'
This will output 'BIOS' in this case.
I want to look for 'BIOS Information' and then set the third field, two lines down, so in this case 'var' would equal '2.5.2'. Is there a way to do this with awk?
EDIT:
I tried the following:
awk ' BEGIN {
FS="[ \t]*:[ \t]*";
}
NF==1 {
sectname=$0;
}
NF==2 && $1 == "Version" && sectname="BIOS Information" {
bios_version=$2;
}
END {
print bios_version;
}'
Which gives me '2.21.21.21' with the above text. Can this be modified to give me the first 'Version' following 'BIOS Information'?
The following script may be overkill, but it is robust when there are multiple section names and/or the order of the fields changes.
BEGIN {
FS="[ \t]*:[ \t]*";
}
NF==1 {
sectname=$0;
}
NF==2 && $1 == "Version" && sectname=="BIOS Information" {
bios_version=$2;
}
END {
print bios_version;
}
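First, we set the input field separator so that the words of a key or value are not split into separate fields. Next, we check whether the current line is a section name or a key-value pair. If it is a section name, we store it in sectname. If it is a key-value pair, the current section name is "BIOS Information", and the key is "Version", we set bios_version. Note that the section test must use ==. With a single =, sectname is assigned "BIOS Information" on every Version line, the condition is always true, and the last Version in the input wins.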
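For readers who want to step through the same logic outside awk, here is a rough Python sketch of the section-tracking idea. The function name section_value and the SAMPLE constant are made up for illustration. Unlike the awk script, which keeps the last matching value, this sketch returns the first match.

```python
import re

SAMPLE = """\
BIOS Information
Manufacturer : Dell Inc.
Version : 2.5.2
Release Date : 01/28/2015
Firmware Information
Name : iDRAC7
Version : 2.21.21 (Build 12)
Firmware Information
Name : Lifecycle Controller 2
Version : 2.21.21.21
"""

def section_value(text, section, key):
    """Return the value of `key` in the first matching `section`, or None."""
    current = None
    for line in text.splitlines():
        # Same separator as the awk FS: a colon with optional blanks around it.
        fields = re.split(r"[ \t]*:[ \t]*", line.strip())
        if len(fields) == 1:
            # A line without a colon is a section name.
            current = fields[0]
        elif len(fields) == 2 and current == section and fields[0] == key:
            return fields[1]
    return None

print(section_value(SAMPLE, "BIOS Information", "Version"))
```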
To answer the question as asked:
awk -v RS= '
/^BIOS Information\n/ {
for(i=1;i<=NF;++i) { if ($i=="Version") { var=$(i+2); exit } }
}
END { print var }
' file
-v RS= puts awk in paragraph mode, so that each run of non-empty lines becomes a single record.
/^BIOS Information\n/ then only matches a record (paragraph) whose first line equals "BIOS Information".
Each paragraph is internally still split into fields by any run of whitespace (awk's default behavior), so the for loop loops over all fields until it finds literal Version, assigns the 2nd field after it to a variable (because : is parsed as a separate field) and exits, at which point the variable value is printed in the END block.
Note: A more robust and complete way to extract the version number can be found in the update below (the field-looping approach here could yield false positives and also only ever reports the first (whitespace-separated) token of the version field).
Update, based on requirements that emerged later:
To act on each paragraph's version number and create individual variables:
awk -v RS= '
# Helper function that returns the value of the specified field.
function getFieldValue(name) {
# Split the record into everything before and after "...\n<name> : "
# and the following \n; the 2nd element of the array thus created
# then contains the desired value.
split($0, tokens, "^.*\n" name "[[:blank:]]+:[[:blank:]]+|\n")
return tokens[2]
}
/^BIOS Information\n/ {
biosVer=getFieldValue("Version")
print "BIOS version = " biosVer
}
/^Firmware Information\n/ {
firmVer=getFieldValue("Version")
print "Firmware version (" getFieldValue("Name") ") = " firmVer
}
' file
With the sample input, this yields:
BIOS version = 2.5.2
Firmware version (iDRAC7) = 2.21.21 (Build 12)
Firmware version (Lifecycle Controller 2) = 2.21.21.21
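The paragraph-mode idea can also be sketched in Python. This is only an illustration, not a replacement for the awk solution: parse_paragraphs and SAMPLE are hypothetical names, and the sample assumes the sections are separated by blank lines, which is what paragraph mode relies on.

```python
import re

# Assumption: sections are separated by blank lines, as paragraph mode requires.
SAMPLE = """\
BIOS Information
Manufacturer : Dell Inc.
Version : 2.5.2
Release Date : 01/28/2015

Firmware Information
Name : iDRAC7
Version : 2.21.21 (Build 12)

Firmware Information
Name : Lifecycle Controller 2
Version : 2.21.21.21
"""

def parse_paragraphs(text):
    """Return (title, fields) pairs, one per blank-line-separated paragraph."""
    records = []
    for block in re.split(r"\n\s*\n", text.strip()):
        if not block.strip():
            continue  # nothing to parse (e.g. empty input)
        lines = block.splitlines()
        fields = {}
        for line in lines[1:]:
            # Split at the first colon only, so values may contain colons.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        records.append((lines[0].strip(), fields))
    return records

for title, fields in parse_paragraphs(SAMPLE):
    print(title, "->", fields.get("Version"))
```

Keeping the whole value after the colon avoids the truncation that the field-looping approach has with values such as "2.21.21 (Build 12)".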
Given:
$ echo "$txt"
BIOS Information
Manufacturer : Dell Inc.
Version : 2.5.2
Release Date : 01/28/2015
Firmware Information
Name : iDRAC7
Version : 2.21.21 (Build 12)
Firmware Information
Name : Lifecycle Controller 2
Version : 2.21.21.21
You can do:
$ echo "$txt" | awk '/^BIOS Information/{f=1; printf("%s", $0)} /^Version/ && f{f=0; printf(":%s\n", $3)}'
BIOS Information:2.5.2

Perl6 grammars: match full line

I've just started exploring perl6 grammars. How can I make up a token "line" that matches everything between the beginning of a line and its end? I've tried the following without success:
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
grammar sample {
token TOP {
<line>
}
token line {
^^.*$$
}
}
my $match = sample.parse($txt);
say $match<line>[0];
I can see two problems in your grammar. The first one is the token line: ^^ and $$ anchor the start and end of a line, however . also matches newlines, so there can be newlines in between. To illustrate, let's use a simple regex first, without a grammar:
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
if $txt ~~ m/^^.*$$/ {
say "match";
say $/;
}
Running that, the output is:
match
「row 1
row 2
row 3」
You can see that the regex matches more than what is desired. However, that is not the main problem: because of ratcheting, matching with a token does not work at all:
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
my regex r {^^.*$$};
if $txt ~~ &r {
say "match regex";
say $/;
} else {
say "does not match regex";
}
my token t {^^.*$$};
if $txt ~~ &t {
say "match token";
say $/;
} else {
say "does not match token";
}
Running that, the output is:
match regex
「row 1
row 2
row 3」
does not match token
I am not really sure why, but tokens and the $$ anchor do not seem to work well together. What you want instead is to match everything except a newline, which is \N*.
The following grammar mostly solves your issue:
grammar sample {
token TOP {<line>}
token line {\N+}
}
However, it only matches the first occurrence, since you search for only one line. What you probably want is to match a line followed by optional vertical whitespace (in your case there is a newline at the end of the string, but I guess you would like to capture the last line even if there is no trailing newline), repeated several times:
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
grammar sample {
token TOP {[<line>\v?]*}
token line {\N+}
}
my $match = sample.parse($txt);
for $match<line> -> $l {
say $l;
}
The output of that script is:
「row 1」
「row 2」
「row 3」
Also, two really useful modules can help you use and debug grammars: Grammar::Tracer and Grammar::Debugger. Just include them at the beginning of the script. Tracer shows a colorful tree of the matching done by your grammar. Debugger lets you watch it match step by step in real time.
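For readers more familiar with Python, the same idea (match a run of non-newline characters instead of anchoring a dot that can cross lines) can be sketched with the re module. This is only an analogy: Python regexes backtrack by default and have no direct equivalent of token ratcheting, and [^\n]+ stands in for \N+ here.

```python
import re

txt = "row 1\nrow 2\nrow 3\n"

# [^\n]+ plays the role of \N+: one or more characters that are not a newline.
lines = re.findall(r"[^\n]+", txt)
print(lines)

# By contrast, a dot that is allowed to match newlines swallows the whole text.
greedy = re.search(r"^.*$", txt, flags=re.DOTALL | re.MULTILINE)
print(repr(greedy.group(0)))
```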
Your original approach can be made to work via
grammar sample {
token TOP { <line>+ %% \n }
token line { ^^ .*? $$ }
}
Personally, I would not try to anchor the line and would use \N instead, as already suggested.
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
grammar sample {
token TOP {
<line>+
}
token line {
\N+ \n
}
}
my $match = sample.parse($txt);
say $match<line>[0];
Or if you can be specific about the line:
grammar sample {
token TOP {
<line>+
}
rule line {
\w+ \d
}
}
my $txt = q:to/EOS/;
row 1
row 2
row 3
EOS
grammar sample {
token TOP { <line> }
token line { .* }
}
for $txt.lines -> $line {
## A single line of text....
say $line;
## Parse line of text to find match obj...
my $match = sample.parse($line);
say $match<line>;
}

Break down JSON string in simple perl or simple unix?

OK, so I have this:
{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}
and at the moment i'm using this shell command to decode it to get the string i need,
echo $x | grep -Po '"utterance":.*?[^\\]"' | sed -e s/://g -e s/utterance//g -e 's/"//g'
but this only works when you have a grep compiled with perl and plus the script i use to get that JSON string is written in perl, so is there any way i can do this same decoding in a simple perl script or a simpler unix command, or better yet, c or objective-c?
the script i'm using to get the json is here, http://pastebin.com/jBGzJbMk and if you want a file to use then download http://trevorrudolph.com/a.flac
How about:
perl -MJSON -nE 'say decode_json($_)->{hypotheses}[0]{utterance}'
in script form:
use JSON;
while (<>) {
print decode_json($_)->{hypotheses}[0]{utterance}, "\n"
}
Well, I'm not sure if I can deduce what you are after correctly, but this is a way to decode that JSON string in perl.
Of course, you'll need to know the data structure in order to get the data you need. The line that prints the "utterance" string is commented out in the code below.
use strict;
use warnings;
use Data::Dumper;
use JSON;
my $json = decode_json
q#{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}#;
#print $json->{'hypotheses'}[0]{'utterance'};
print Dumper $json;
Output:
$VAR1 = {
'status' => 0,
'hypotheses' => [
{
'utterance' => 'hello how are you',
'confidence' => '0.96311796'
}
],
'id' => '7aceb216d02ecdca7ceffadcadea8950-1'
};
Quick hack:
use feature 'say';
while (<>) {
say for /"utterance":"?(.*?)(?<!\\)"/;
}
Or as a one-liner:
perl -lnwe 'print for /"utterance":"(.+?)(?<!\\)"/g' inputfile.txt
The one-liner is troublesome if you happen to be using Windows, since " is interpreted by the shell.
Quick hack#2:
This will hopefully go through any hash structure and find keys.
my $json = decode_json $str;
say find_key($json, 'utterance');
sub find_key {
my ($ref, $find) = @_;
if (ref $ref) {
if (ref $ref eq 'HASH' and defined $ref->{$find}) {
return $ref->{$find};
} else {
for (ref $ref eq 'HASH' ? values %$ref : ref $ref eq 'ARRAY' ? @$ref : ()) {
my $found = find_key($_, $find);
if (defined $found) {
return $found;
}
}
}
}
return;
}
Based on the naming, it's possible to have multiple hypotheses. This prints the utterance of each hypothesis:
echo '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}' | \
perl -MJSON::XS -n000E'
say $_->{utterance}
for @{ JSON::XS->new->decode($_)->{hypotheses} }'
Or as a script:
use feature qw( say );
use JSON::XS;
my $json = '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}';
say $_->{utterance}
for @{ JSON::XS->new->decode($json)->{hypotheses} };
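The same "decode, then walk the structure" approach looks like this in Python, shown here only as a cross-check for readers who do not have the Perl JSON modules installed. The function name utterances is made up for this sketch, and the input is the sample string from the question.

```python
import json

raw = '{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}'

def utterances(json_text):
    """Return the utterance of every hypothesis in the decoded document."""
    data = json.loads(json_text)
    # A missing "hypotheses" key is treated as an empty list (an assumption).
    return [h["utterance"] for h in data.get("hypotheses", [])]

print(utterances(raw))
```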
If you don't want to use any modules from CPAN and try a regex instead there are multiple variants you can try:
# JSON is on a single line:
$json = '{"other":"stuff","hypo":[{"utterance":"hi, this is \"bob\"","moo":0}]}';
# RegEx with negative look behind:
# Match everything up to a double quote without a Backslash in front of it
print "$1\n" if ($json =~ m/"utterance":"(.*?)(?<!\\)"/);
This regex works if there is only one utterance. It doesn't matter what else is in the string around it, since it only searches for the double quoted string following the utterance key.
For a more robust version you could add whitespace where necessary/possible and make the . in the RegEx match newlines: m/"utterance"\s*:\s*"(.*?)(?<!\\)"/s
If there are multiple utterance/confidence objects, mixed case in the key names, or unusual formatting in the JSON string, try this:
# weird JSON:
$json = <<'EOJSON';
{
"status":0,
"id":"an ID",
"hypotheses":[
{
"UtTeraNcE":"hello my name is \"Bob\".",
"confidence":0.0
},
{
'utterance' : 'how are you?',
"confidence":0.1
},
{
"utterance"
: "
thought
so!
",
"confidence" : 0.9
}
]
}
EOJSON
# RegEx with alternatives:
print "$1\n" while ( $json =~ m/["']utterance["']\s*:\s*["'](([^\\"']|\\.)*)["']/gis);
The main part of this RegEx is ["'](([^\\"']|\\.)*)["']. Here it is in detail as an extended regex:
/
["'] # opening quotes
( # start capturing parentheses for $1
( # start of grouping alternatives
[^\\"'] # anything that's not a backslash or a quote
| # or
\\. # a backslash followed by anything
) # end of grouping
* # in any quantity
) # end capturing parentheses
["'] # closing quotes
/xgs
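The pattern can be tried out quickly in Python as well. This is a sketch under the same assumptions as the Perl version (the input is treated as text and never actually parsed as JSON). UTTERANCE_RE and find_utterances are names invented for this example, and the inner group is written as non-capturing so that findall returns only the string contents.

```python
import re

# Quoted key, colon, then a quoted value made of "anything that is not a
# backslash or quote" or "a backslash followed by any character".
UTTERANCE_RE = re.compile(
    r"""["']utterance["']\s*:\s*["']((?:[^\\"']|\\.)*)["']""",
    re.IGNORECASE | re.DOTALL,
)

def find_utterances(text):
    """Return the raw (still escaped) contents of every utterance string."""
    return UTTERANCE_RE.findall(text)

SAMPLE = r'{"other":"stuff","hypo":[{"utterance":"hi, this is \"bob\"","moo":0}]}'
print(find_utterances(SAMPLE))
```

Note that the captured text still contains the backslash escapes; a real JSON parser would be needed to unescape them reliably.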
If you have many data sets and speed is a concern, you can add the o modifier to the regex and use character classes instead of the i modifier. You can suppress the capturing of the alternatives into $2 with non-capturing parentheses, (?:pattern). That gives this final result:
m/["'][uU][tT][tT][eE][rR][aA][nN][cC][eE]["']\s*:\s*["']((?:[^\\"']|\\.)*)["']/gos
Yes, sometimes perl looks like a big explosion in a bracket factory ;-)
Just stumbled upon another nice method of doing this: I finally found how to access the Mac OS X JavaScript engine from the command line. Here's the script:
alias jsc='/System/Library/Frameworks/JavaScriptCore.framework/Versions/A/Resources/jsc'
x='{"status":0,"id":"7aceb216d02ecdca7ceffadcadea8950-1","hypotheses":[{"utterance":"hello how are you","confidence":0.96311796}]}'
jsc -e "print(${x}['hypotheses'][0]['utterance'])"
Yes, I came up with another answer. I'm studying Python, and it can read lists and dictionaries written in the same format as this JSON, so I just made this one-liner for when your variable is x:
python -c "print(${x}['hypotheses'][0]['utterance'])"
figured it out for unix but would love to see your perl and c, objective-c answers...
echo $X | sed -e 's/.*utterance//' -e 's/confidence.*//' -e s/://g -e 's/"//g' -e 's/,//g'
:D
shorter copy of the same sed:
echo $X | sed -e 's/.*utterance//;s/confidence.*//;s/://g;s/"//g;s/,//g'

Lex : line with one character but spaces

I have sentences like :
" a"
"a "
" a "
I would like to catch all these examples (with lex), but I don't know how to specify the beginning of the line.
I'm not totally sure what exactly you're looking for, but the regex symbol to specify matching the beginning of a line in a lex definition is the caret:
^
If I understand correctly, you're trying to pull the "a" out as the token, but you don't want to grab any of the whitespace? If this is the case, then you just need something like the following:
[\n\t\r ]+ {
// do nothing
}
"a" {
assignYYText( yylval );
return aToken;
}
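As a quick way to experiment with the pattern outside lex, here is a small Python sketch that matches whole lines consisting of a single "a" with optional spaces or tabs around it. It is only an approximation of what a lex rule would do (the regex dialects differ in details), and LINE_RE and the sample text are made up for illustration.

```python
import re

# ^ and $ anchor at line boundaries because of re.MULTILINE.
LINE_RE = re.compile(r"^[ \t]*a[ \t]*$", re.MULTILINE)

text = " a\na \n a \nab\n"
matches = LINE_RE.findall(text)
print(matches)
```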

Multi-line strings in objective-c localized strings file

I have a template for an email that I've put in a localized strings file, and I'm loading the string with the NSLocalizedString macro.
I'd rather not make each line its own string with a unique key. In Objective-C, I can create a human-readable multiline string like so:
NSString *email = @"Hello %@,\n"
"\n"
"Check out %@.\n"
"\n"
"Sincerely,\n"
"\n"
"%@";
I tried to put that in a .strings file with:
"email" = "Hello %@,\n"
"\n"
"Check out %@.\n"
"\n"
"Sincerely,\n"
"\n"
"%@";
But I get the following error at build time:
CFPropertyListCreateFromXMLData(): Old-style plist parser: missing semicolon in dictionary.
email-template.strings: Unexpected character " at line 1
Command /Developer/Library/Xcode/Plug-ins/CoreBuildTasks.xcplugin/Contents/Resources/copystrings failed with exit code 1
I can concatenate it all together like this:
"email" = "Hello %@,\n\nCheck out %@.\n\nSincerely,\n\n%@";
But that will be a mess to maintain, particularly as the email gets longer.
Is there a way to do this in a localized strings file? I've already tried adding backslashes at the end of each line, to no avail.
Just use the new lines directly.
"email" = "Hello %@,

Check out %@.

Sincerely,

%@";