I have a list of comma-separated string values that is 600 characters long, and I need to pass it to JSON in such a way that 10 characters at a time are loaded and sent to the JSON object, and then the result is read back out.
Thanks in advance.
I have, in a database, records that are serialized PHP strings in which I must obfuscate any emails. The simplest record is like {s:20:"pika.chu#pokemon.com"}. It is basically saying: this is a string of length 20 which is pika.chu#pokemon.com. This field can be kilobytes long with lots of emails (or none), and sometimes it is empty.
I wish I could use a SQL regular expression function to obfuscate the user part of the email while preserving the length of the string in order not to break the PHP serialization. The example email above shall be turned into {s:20:"xxxxxxxx#pokemon.com"} where the number of x matches the length of pika.chu.
Any thoughts?
Here is a more complete example of what can be found as serialized PHP:
a:4:{s:7:"locales";a:3:{i:0;s:5:"fr_FR";i:1;s:5:"de_DE";i:2;s:5:"en_US";}s:9:"publisher";s:18:"john#something.com";s:7:"authors";a:2:{i:0;s:21:"william#something.com";i:1;s:19:"debbie#software.org";}s:12:"published_at";O:8:"DateTime":3:{s:4:"date";s:26:"2022-01-26 13:05:26.531289";s:13:"timezone_type";i:3;s:8:"timezone";s:3:"UTC";}}
I tried to do it using native functions, but it did not work because functions like REGEXP_REPLACE don't let you manipulate the match, for example to get its length.
Instead, I've created a UDF to do that:
CREATE TEMP FUNCTION hideEmail(str STRING)
RETURNS STRING
LANGUAGE js AS """
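// Mask the user part before each '#' with the same number of '*'.
// The hyphen sits last in the character class so it is a literal;
// written as "+-:" it would form a range that also matches ',' and '/'.
return str
.replace(/[a-zA-Z0-9._+-]+#/g, function(txt){return '*'.repeat(txt.length-1)+"#";});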
""";
select hideEmail('a:4:{s:7:"locales";a:3:{i:0;s:5:"fr_FR";i:1;s:5:"de_DE";i:2;s:5:"en_US";}s:9:"publisher";s:18:"john#something.com";s:7:"authors";a:2:{i:0;s:21:"william#something.com";i:1;s:19:"debbie#software.org";}s:12:"published_at";O:8:"DateTime":3:{s:4:"date";s:26:"2022-01-26 13:05:26.531289";s:13:"timezone_type";i:3;s:8:"timezone";s:3:"UTC";}}')
Result:
a:4:{s:7:"locales";a:3:{i:0;s:5:"fr_FR";i:1;s:5:"de_DE";i:2;s:5:"en_US";}s:9:"publisher";s:18:"****#something.com";s:7:"authors";a:2:{i:0;s:21:"*******#something.com";i:1;s:19:"******#software.org";}s:12:"published_at";O:8:"DateTime":3:{s:4:"date";s:26:"2022-01-26 13:05:26.531289";s:13:"timezone_type";i:3;s:8:"timezone";s:3:"UTC";}}
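For comparison, the same length-preserving masking can be sketched outside the database. This is a minimal illustration in Java (11 or later), not part of the original answer; the class name `EmailMaskSketch` and the method `hideEmail` are hypothetical, and it assumes, as in the question, that `#` separates the user part from the domain and that the user part is plain ASCII.

```java
import java.util.regex.Pattern;

final class EmailMaskSketch {
    // Assumed shape of a user part: allowed characters directly before a '#'.
    private static final Pattern USER_PART =
            Pattern.compile("[A-Za-z0-9._+-]+(?=#)");

    // Replaces each user part with the same number of '*' characters, so the
    // s:<length> prefixes of the PHP serialization stay valid.
    static String hideEmail(String serialized) {
        return USER_PART.matcher(serialized)
                .replaceAll(m -> "*".repeat(m.group().length()));
    }

    public static void main(String[] args) {
        // prints {s:20:"********#pokemon.com"}
        System.out.println(hideEmail("{s:20:\"pika.chu#pokemon.com\"}"));
    }
}
```

Because every replaced character is a single-byte ASCII character swapped for another one, the byte lengths recorded in the serialized string do not change.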
I have the following code that extracts tab-separated strings into a string array:
static public List<String> getContents(File aFile, String separator){
// all strings, split based on separator
List<String> contentList = new ArrayList<String>();
StringTokenizer tokenizer = new StringTokenizer(Util.getContents(aFile), separator);
while (tokenizer.hasMoreTokens()){
contentList.add(tokenizer.nextToken());
}
return contentList;
}
The separator in this case is therefore a "\t".
As long as two strings are separated by one tab, everything is great. However, my dataset sometimes has two strings separated by two tabs. This means that one parameter is missing and an empty string should be added to the list. However, the method ignores that and just returns a list with one string fewer.
In my particular case, I always want an array of 5 strings back. That means a text containing only 4 tabs and no other text should return an array of 5 empty strings (needed for a parsing job that depends on it). Unfortunately, I have no control over the content, and I am working with millions of files that are generated outside my control.
Is there a better way to do this with StringTokenizer? Or do I have to implement something on my own?
Here are some examples:
String ok = "a\tb\tc\td\te";
String nok = "a\tb\tc\t\te";
Ralf
Found this: How to split a string in Java
and that I can do it with
"myString".split("\t", -1);
to obtain the empty strings when multiple separators cluster in one place.
Thanks anyway!
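As a rough sketch of how that could replace the StringTokenizer loop from the question: the class name `SplitSketch` and the method `splitKeepingEmpties` are made up for illustration, and the file-reading part (`Util.getContents`) is left out. `String.split` treats its first argument as a regular expression, so the separator is quoted here in case it contains special characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class SplitSketch {
    // Splits on the literal separator; the limit of -1 keeps empty strings,
    // including trailing ones, so consecutive tabs yield empty entries.
    static List<String> splitKeepingEmpties(String content, String separator) {
        String[] parts = content.split(Pattern.quote(separator), -1);
        return new ArrayList<>(Arrays.asList(parts));
    }

    public static void main(String[] args) {
        // 5 entries, the fourth one empty
        System.out.println(splitKeepingEmpties("a\tb\tc\t\te", "\t").size());
        // 4 tabs and nothing else: 5 empty strings
        System.out.println(splitKeepingEmpties("\t\t\t\t", "\t").size());
    }
}
```

Without the -1 limit, trailing empty strings are dropped, so the 4-tab input would produce an empty array instead of 5 entries.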
I'm trying to work around not being able to read multidimensional arrays with JavaScriptSerializer.
I think there's a workaround if I can do what's in this answer https://stackoverflow.com/a/9547490/1382306
Basically, if I can store json arrays in each field[] and loop through field, it should be no problem.
How do I loop through field if it's in a query string of this format:
?field[]=["a","b","c"]&field[]=["d","e","f"]
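Try
Request.QueryString.GetValues("field[]")[0]
... to return:
["a","b","c"] {in quotes}
and
Request.QueryString.GetValues("field[]")[1]
... to return:
["d","e","f"]
GetValues returns each value for the key separately; the plain indexer Request.QueryString["field[]"] returns them joined into a single string, so indexing it with [0] would only give you a character.
You will have to strip off the square brackets and then use Split() on the commas.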
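The strip-and-split step itself is language-independent. Here it is sketched in Java purely for illustration (the thread is about ASP.NET); `SimpleArraySketch` and `parseSimpleArray` are hypothetical names. It assumes the items are simple quoted strings with no commas or escaped quotes inside them; anything richer needs a real JSON parser.

```java
import java.util.ArrayList;
import java.util.List;

final class SimpleArraySketch {
    // Turns a value such as ["a","b","c"] into the items a, b, c.
    // Assumption: no item contains a comma or an escaped quote.
    static List<String> parseSimpleArray(String value) {
        String inner = value.trim();
        if (inner.startsWith("[") && inner.endsWith("]")) {
            inner = inner.substring(1, inner.length() - 1); // strip brackets
        }
        List<String> items = new ArrayList<>();
        if (inner.trim().isEmpty()) {
            return items; // "[]" yields no items
        }
        for (String part : inner.split(",")) {
            String item = part.trim();
            if (item.length() >= 2 && item.startsWith("\"") && item.endsWith("\"")) {
                item = item.substring(1, item.length() - 1); // strip quotes
            }
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(parseSimpleArray("[\"a\",\"b\",\"c\"]")); // prints [a, b, c]
    }
}
```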
I am working in hive / SQL. I have a column in my table with strings which represent an array of json objects. I need to convert the strings to arrays of JSON strings.
For example, I have this,
"[{a:1, b:1},{a:2, b:2}]"
And I want to get this:
["{a:1, b:1}","{a:2, b:2}"]
I tried casting the string as an array, but that didn't work. Any ideas on how to do this in a smart way, short of splitting by "},{"?
Never mind, I ended up just splitting the string on "}" and then adding the "}" back to each piece. It worked well!
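A variant of that idea avoids having to add the "}" back: split only on commas that sit between a closing and an opening brace. The sketch below shows the splitting logic in Java rather than in Hive; `JsonObjectSplitSketch` and `splitObjects` are invented names, and it assumes flat objects, that is, no nested objects or arrays and no "},{" inside a string value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class JsonObjectSplitSketch {
    // Splits "[{a:1, b:1},{a:2, b:2}]" into "{a:1, b:1}" and "{a:2, b:2}".
    // Assumption: objects are flat, so a comma between '}' and '{' always
    // separates two top-level objects.
    static List<String> splitObjects(String arrayText) {
        String inner = arrayText.trim();
        if (inner.startsWith("[") && inner.endsWith("]")) {
            inner = inner.substring(1, inner.length() - 1).trim();
        }
        if (inner.isEmpty()) {
            return new ArrayList<>();
        }
        // Lookbehind and lookahead keep the braces attached to each piece.
        return Arrays.asList(inner.split("(?<=\\})\\s*,\\s*(?=\\{)"));
    }

    public static void main(String[] args) {
        // prints [{a:1, b:1}, {a:2, b:2}]
        System.out.println(splitObjects("[{a:1, b:1},{a:2, b:2}]"));
    }
}
```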
I am reading JSON from the database and then parsing the string using one of the JSON parsers available for Java, but I am getting a JSONException. Even the online parser at http://json.parser.online.fr/ reports the strings as errors. Is there a way to get rid of these errors, or in other words, how can I handle such special symbols? The value of match is a regular expression.
Here is a subpart of the sample string I am trying to parse as a JSON object.
{"RULE":[{"replace":{"value":"","type":"text"},"match":{"value":"<a [^>]*><img src="[^"]*WindowsLiveWriter/IconsfordifferentSocialBookmarkingSites[^>]*>\s*</a>","type":"text"}},{"replace":{"value":"","type":"text"},"match":{"value":"<a [^>]*><img src="[^"]*WindowsLiveWriter/IconsfordifferentSocialBookmarkingSites[^>]*>\s*</a>","type":"text"}}]}
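Use this JSON instead, with the double quotes inside the string values escaped as \" and the backslash escaped as \\: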
{"RULE":[{"replace":{"value":"","type":"text"},"match":{"value":"<a [^>]*><img src=\"[^\"]*WindowsLiveWriter/IconsfordifferentSocialBookmarkingSites[^>]*>\\s*</a>","type":"text"}},{"replace":{"value":"","type":"text"},"match":{"value":"<a [^>]*><img src=\"[^\"]*WindowsLiveWriter/IconsfordifferentSocialBookmarkingSites[^>]*>\\s*</a>","type":"text"}}]}
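If the regular expressions are inserted into the JSON text by hand, the escaping has to happen before the value is embedded. A minimal sketch of such an escaping step in plain Java follows; `JsonEscapeSketch` and `escapeJson` are hypothetical names, and in practice building the document with a JSON library and letting it do the escaping is likely the safer route.

```java
final class JsonEscapeSketch {
    // Escapes a raw value so it can be placed between double quotes in JSON.
    // Handles the quote, the backslash and control characters.
    static String escapeJson(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters as \\uXXXX escapes.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String regex = "<img src=\"[^\"]*>\\s*</a>"; // raw regex text
        System.out.println("{\"value\":\"" + escapeJson(regex) + "\"}");
    }
}
```

The escaped value then has the same shape as the corrected JSON above: each inner quote becomes \" and \s becomes \\s.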