iPhone NSArray from Dictionary of Dictionary values - objective-c

I have a Dictionary of Dictionaries which is being returned to me in JSON format
{
"neverstart": {
"color": 0,
"count": 0,
"uid": 32387,
"id": 73129,
"name": "neverstart"
},
"dev": {
"color": 0,
"count": 1,
"uid": 32387,
"id": 72778,
"name": "dev"
},
"iphone": {
"color": 0,
"count": 1,
"uid": 32387,
"id": 72777,
"name": "iphone"
}
}
I also have an NSArray containing the id's required for an item. e.g. [72777, 73129]
What I need to be able to do is get a dictionary of id => name for the items in the array. I know this is possible by iterating through the array, and then looping through all the values in the Dictionary and checking values, but it seems like there should be a less long-winded way of doing this.
Excuse my ignorance as I am just trying to find my way around the iPhone SDK and learning Objective C and Cocoa.

First off, since you're using JSON, I'm hoping you've already found BSJSONAdditions and/or json-framework, both of them open-source projects for parsing JSON into native Cocoa structures for you. This blog post gives an idea of how to use the latter to get an NSDictionary from a JSON string.
The problem then becomes one of finding the matching values in the dictionary. I'm not aware of a single method that does what you're looking for — the Cocoa frameworks are quite powerful, but are designed to be very generic and flexible. However, it shouldn't be too hard to put together in not too many lines... (Since you're programming on iPhone, I'll use fast enumeration to make the code cleaner.)
NSDictionary* jsonDictionary = ...
NSArray* requiredIDs = ...
NSMutableDictionary* matches = [NSMutableDictionary dictionary];
// Fast enumeration over a dictionary yields its keys.
for (id key in jsonDictionary) {
    NSDictionary* innerDictionary = [jsonDictionary objectForKey:key];
    id itemID = [innerDictionary objectForKey:@"id"];
    // containsObject: compares with isEqual:, so NSNumber ids match by value.
    if ([requiredIDs containsObject:itemID]) {
        [matches setObject:[innerDictionary objectForKey:@"name"] forKey:itemID];
    }
}
This code may have typos, but the concepts should be sound. Also note that the call to [NSMutableDictionary dictionary] will return an autoreleased object.
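For readers who want the logic without the Cocoa syntax, here is a minimal sketch of the same lookup in Python (not Objective-C). It assumes the JSON from the question has already been parsed, trims it to the two fields that matter, and uses illustrative variable names that do not come from the original post.

```python
import json

# The JSON from the question, trimmed to the fields used here.
raw = '''
{
  "neverstart": {"id": 73129, "name": "neverstart"},
  "dev":        {"id": 72778, "name": "dev"},
  "iphone":     {"id": 72777, "name": "iphone"}
}
'''

tags = json.loads(raw)
required_ids = {72777, 73129}  # a set keeps each membership test cheap

# Walk the inner dictionaries once and keep only the wanted ids.
matches = {
    inner["id"]: inner["name"]
    for inner in tags.values()
    if inner["id"] in required_ids
}

print(matches)
```

As in the Objective-C version, this is a single pass over the outer dictionary rather than a nested loop per required id.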

Have you tried this NSDictionary method:
+ (id)dictionaryWithObjects:(NSArray *)objects forKeys:(NSArray *)keys

Drew's got the answer... I found that the GNUstep reference documentation for NSDictionary was helpful in a bare-bones way the other day when I had a similar question.
http://www.gnustep.org/resources/documentation/Developer/Base/Reference/NSDictionary.html

Related

How to validate a JSON object against a JSON schema based on object's type described by a field?

I have a number of objects (messages) that I need to validate against a JSON schema (draft-04). Each object is guaranteed to have a "type" field, which describes its type, but every type has a completely different set of other fields, so each type of object needs a unique schema.
I see several possibilities, none of which are particularly appealing, but I hope I'm missing something.
Possibility 1: Use oneOf for each message type. I guess this would work, but the problem is very long validation errors when something goes wrong: validators tend to report every schema that failed, which includes ALL elements in the "oneOf" array.
{
"oneOf":
[
{
"type": "object",
"properties":
{
"t":
{
"type": "string",
"enum":
[
"message_type_1"
]
}
}
},
{
"type": "object",
"properties":
{
"t":
{
"type": "string",
"enum":
[
"message_type_2"
]
},
"some_other_property":
{
"type": "integer"
}
},
"required":
[
"some_other_property"
]
}
]
}
Possibility 2: Nested "if", "then", "else" triads. I haven't tried it, but I guess that maybe errors would be better in this case. However, it's very cumbersome to write, as nested if's pile up.
Possibility 3: A separate schema for every possible value of "t". This is the simplest solution, however I dislike it, because it precludes me from using common elements in schemas (via references).
So, are these my only options, or can I do better?
Since "type" is a JSON Schema keyword, I'll follow your lead and use "t" as the type-discrimination field, for clarity.
There's no particular keyword to accomplish or indicate this (however, see https://github.com/json-schema-org/json-schema-spec/issues/31 for discussion). This is because, for the purposes of validation, everything you need to do is already possible. Errors are secondary to validation in JSON Schema. All we're trying to do is limit how many errors we see, since it's obvious there's a point where errors are no longer productive.
Normally when you're validating a message, you know its type first, then you read the rest of the message. For example in HTTP, if you're reading a line that starts with Date: and the next character isn't a number or letter, you can emit an error right away (e.g. "Unexpected tilde, expected a month name").
However in JSON, this isn't true, since properties are unordered, and you might not encounter the "t" until the very end, if at all. "if/then" can help with this.
But first, begin by factoring out the most important constraints, and moving them to the top.
First, use "type": "object" and "required":["t"] in your top level schema, since that's true in all cases.
Second, use "properties" and "enum" to enumerate all valid values of "t". This way, if "t" really is entered wrong, the error comes out of your top-level schema instead of a subschema.
If all of these constraints pass, but the document is still invalid, then it's easier to conclude the problem must be with the other contents of the message, and not the "t" property itself.
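Now in each sub-schema, use "const" to match the subschema to the type name. (Note that "const" requires draft-06 or later; on draft-04, a single-value "enum" does the same job.)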
We get a schema like this:
{
"type": "object",
"required": ["t"],
"properties": { "t": { "enum": ["message_type_1", "message_type_2"] } },
"oneOf": [
{
"type": "object",
"properties": {
"t": { "const": "message_type_1" }
}
},
{
"type": "object",
"properties": {
"t": { "const": "message_type_2" },
"some_other_property": {
"type": "integer"
}
},
"required": [ "some_other_property" ]
}
]
}
Now, split out each type into a different schema file. Make it human-accessible by naming the file after the "t". This way, an application can read a stream of objects and pick the schema to validate each object against.
{
"type": "object",
"required": ["t"],
"properties": { "t": { "enum": ["message_type_1", "message_type_2"] } },
"oneOf": [
{"$ref": "message_type_1.json"},
{"$ref": "message_type_2.json"}
]
}
Theoretically, a validator now has enough information to produce much cleaner errors (though I'm not aware of any validators that can do this).
So, if this doesn't produce clean enough error reporting for you, you have two options:
First, you can implement part of the validation process yourself. As described above, use a streaming JSON parser like Oboe.js to read each object in a stream, parse the object and read the "t" property, then apply the appropriate schema.
Or second, if you really want to do this purely in JSON Schema, use "if/then" statements inside "allOf":
{
"type": "object",
"required": ["t"],
"properties": { "t": { "enum": ["message_type_1", "message_type_2"] } },
"allOf": [
{"if":{"properties":{"t":{"const":"message_type_1"}}}, "then":{"$ref": "message_type_1.json"}},
{"if":{"properties":{"t":{"const":"message_type_2"}}}, "then":{"$ref": "message_type_2.json"}}
]
}
This should produce errors to the effect of:
t not one of "message_type_1" or "message_type_2"
or
(because t="message_type_2") some_other_property not an integer
and not both.
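The first option above (reading "t" in application code and then applying the matching schema) could look roughly like the following Python sketch. The two check functions are hypothetical stand-ins for real schema validation against message_type_1.json and message_type_2.json; in practice each would hand the message to whatever validator you use. Only the dispatch on "t" is the point here.

```python
# Hypothetical per-type checks standing in for real JSON Schema validation.
# Each returns a list of error strings; an empty list means "valid".
def check_message_type_1(msg):
    return []

def check_message_type_2(msg):
    if "some_other_property" not in msg:
        return ["some_other_property is required"]
    value = msg["some_other_property"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        return ["some_other_property not an integer"]
    return []

CHECKS = {
    "message_type_1": check_message_type_1,
    "message_type_2": check_message_type_2,
}

def validate(msg):
    """Look at "t" first, then run only the check for that type."""
    if not isinstance(msg, dict) or "t" not in msg:
        return ['"t" is required']
    t = msg["t"]
    if not isinstance(t, str) or t not in CHECKS:
        return ["t not one of " + ", ".join(sorted(CHECKS))]
    return CHECKS[t](msg)

print(validate({"t": "message_type_2", "some_other_property": "oops"}))
```

Because only one check runs per message, the caller sees either an error about "t" itself or errors from the one relevant schema, never both.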

Can a JSON schema validator be killed with this schema?

I tried this out with some JSON schema validators and some of them fail; the open question is how much memory a validator has to use before it chokes and is killed.
It turns out that we can implement finite state machines in JSON schema. To do so, the FSM nodes are object schemas and the FSM edges are a set of JSON Pointers wrapped in an anyOf. The whole thing is rather simple to do, but being able to do this has some consequences: what if we create an FSM that requires 2^N time or memory (depth first search or breadth first search, respectively) given a JSON schema with N definitions and some input to validate?
So let's create a JSON Schema with N definitions to implement a non-deterministic finite state machine (NFA) over an alphabet of two symbols a and b. All we need to do is to encode the regex
(a{N}|a(a|b+){0,N-1}b)*x, where x denotes the end. In the worst case, the NFA for this regex takes 2^N time to match text or 2^N memory (e.g. when converted to a deterministic finite state machine). Now notice that the word abbx can be represented by a JSON pointer a/b/b/x which in JSON is equivalent to {"a":{"b":{"b":{"x":true}}}}.
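To make the word-to-document mapping concrete, here is a small Python sketch. The helper name word_to_document is made up for illustration; it simply nests one object per symbol and maps the last symbol to true.

```python
import json

def word_to_document(word):
    """Nest one object per symbol; the last symbol maps to true."""
    doc = True
    for symbol in reversed(word):
        doc = {symbol: doc}
    return doc

print(json.dumps(word_to_document("abbx")))
```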
To encode this NFA as a schema, we first add a definition for state "0":
{
"$schema": "http://json-schema.org/draft-04/schema#",
"$ref": "#/definitions/0",
"definitions": {
"0": {
"type": "object",
"properties": {
"a": { "$ref": "#/definitions/1" },
"x": { "type": "boolean" }
},
"additionalProperties": false
},
Then we add N-1 further definitions, one for each state <DEF>, where <DEF> ranges over "1", "2", "3", ... "N-1":
"<DEF>": {
"type": "object",
"properties": {
"a": { "$ref": "#/definitions/<DEF>+1" },
"b": {
"anyOf": [
{ "$ref": "#/definitions/0" },
{ "$ref": "#/definitions/<DEF>" }
]
}
},
"additionalProperties": false
},
where "<DEF>+1" wraps back to "0" when <DEF> is equal to N-1.
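Writing the N definitions by hand gets tedious, so here is a sketch of a generator in Python that follows the construction described above. The function name build_schema is an assumption for illustration, and the sketch assumes N is at least 2 so that state "1" exists.

```python
import json

def build_schema(n):
    """Build the N-state schema described above (assumes n >= 2)."""
    definitions = {
        "0": {
            "type": "object",
            "properties": {
                "a": {"$ref": "#/definitions/1"},
                "x": {"type": "boolean"},
            },
            "additionalProperties": False,
        }
    }
    for state in range(1, n):
        nxt = (state + 1) % n  # wraps back to 0 after state N-1
        definitions[str(state)] = {
            "type": "object",
            "properties": {
                "a": {"$ref": f"#/definitions/{nxt}"},
                "b": {
                    "anyOf": [
                        {"$ref": "#/definitions/0"},
                        {"$ref": f"#/definitions/{state}"},
                    ]
                },
            },
            "additionalProperties": False,
        }
    return {
        "$schema": "http://json-schema.org/draft-04/schema#",
        "$ref": "#/definitions/0",
        "definitions": definitions,
    }

print(json.dumps(build_schema(3), indent=2))
```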
This "NFA" on a two-letter alphabet has N states, only one initial and one
final state. The equivalent minimal DFA has 2^N (2 to the power N) states.
This means that in the worst case, a validator that uses this schema must either take 2^N time or use 2^N memory "cells" to validate the input.
I don't see where this logic can go wrong, unless validators take shortcuts to approximate the validity checking.
I found this here.
I think in principle you are right. I am not 100% sure about the schema construction you've described, but theoretically it should be possible to construct a schema which requires 2^N time or space, exactly for the reasons you describe.
Practically most schema processors will probably just try to recursively validate anyOf. So, that would be exponential time.
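To see what "recursively validate anyOf" means in practice, here is a deliberately naive Python sketch. It is not any real validator: valid() is a toy that understands only the keywords this schema uses, and it counts its own invocations. The input is the word "ab" repeated m times followed by "ax", chosen (as an assumption of this sketch) because every anyOf branch survives until the very end and then fails, so nothing short-circuits.

```python
def build_definitions(n):
    """Definitions for the N-state schema from the question (assumes n >= 2)."""
    defs = {"0": {"type": "object",
                  "properties": {"a": {"$ref": "#/definitions/1"},
                                 "x": {"type": "boolean"}},
                  "additionalProperties": False}}
    for k in range(1, n):
        defs[str(k)] = {"type": "object",
                        "properties": {
                            "a": {"$ref": f"#/definitions/{(k + 1) % n}"},
                            "b": {"anyOf": [{"$ref": "#/definitions/0"},
                                            {"$ref": f"#/definitions/{k}"}]}},
                        "additionalProperties": False}
    return defs

def to_doc(word):
    """Turn a word such as "abx" into {"a": {"b": {"x": true}}}."""
    doc = True
    for symbol in reversed(word):
        doc = {symbol: doc}
    return doc

calls = 0  # number of valid() invocations since the last reset

def valid(schema, value, defs):
    """Toy validator: handles $ref, anyOf, type, properties, additionalProperties."""
    global calls
    calls += 1
    if "$ref" in schema:
        return valid(defs[schema["$ref"].rsplit("/", 1)[1]], value, defs)
    if "anyOf" in schema:
        # Try each alternative in order, with no caching of earlier results.
        return any(valid(sub, value, defs) for sub in schema["anyOf"])
    if schema.get("type") == "boolean":
        return isinstance(value, bool)
    if schema.get("type") == "object":
        if not isinstance(value, dict):
            return False
        props = schema.get("properties", {})
        extra_ok = schema.get("additionalProperties", True) is not False
        for key, sub in value.items():
            if key in props:
                if not valid(props[key], sub, defs):
                    return False
            elif not extra_ok:
                return False
    return True

def count_calls(m, n=32):
    """Validate the word ("ab" * m + "ax") and report (result, call count)."""
    global calls
    calls = 0
    ok = valid({"$ref": "#/definitions/0"}, to_doc("ab" * m + "ax"),
               build_definitions(n))
    return ok, calls

for m in (4, 8, 12):
    print(m, count_calls(m))
```

Each extra "ab" pair roughly doubles the call count, since both anyOf alternatives are explored in full before the document is rejected. A validator that caches results per (definition, document node) pair would not show this growth on this input, which is one of the shortcuts a real implementation might take.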

Swift 3 NSDictionary to Dictionary conversion causes NSInvalidArgumentException

I have just converted my project from Swift 2.2 to 3.0, and I'm getting a new exception thrown in my tests. I have some Objective C code in one of my tests which reads in some JSON from a file:
+ (NSDictionary *)getJSONDictionaryFromFile:(NSString *)filename {
/* some code which checks the parameter and gets a string of JSON from a file.
* I've checked in the debugger, and jsonString is properly populated. */
NSDictionary *jsonDict = [NSJSONSerialization JSONObjectWithData:[jsonString dataUsingEncoding:NSUTF8StringEncoding] options:0 error:nil];
return jsonDict;
}
I'm calling this from some Swift code:
let expectedResponseJSON = BZTestCase.getJSONDictionary(fromFile: responseFileName)
This works just fine most of the time, but I have one JSON file which causes the error:
failed: caught "NSInvalidArgumentException", "-[__NSSingleObjectArrayI enumerateKeysAndObjectsUsingBlock:]: unrecognized selector sent to instance 0x608000201fa0"
The strange thing about this is that the error is generated after the getJSONDictionaryFromFile method returns and before expectedResponseJSON in the Swift code is populated. To me, this seems to say that it's the conversion from an NSDictionary to Dictionary which is the problem. The offending JSON file is this one:
[
{
"status": "403",
"title": "Authentication Failed",
"userData": {},
"ipRangeError": {
"libraryName": "Name goes here",
"libraryId": 657,
"requestIp": "127.0.0.1"
}
}
]
If I remove the outermost enclosing [], this error goes away. I can't be the only person using an array as the top-level entity of a JSON file in Swift 3. Am I doing something wrong? What can I do to get around this error?
As is referenced in the comments, the problem is that getJSONDictionaryFromFile returns an NSDictionary * and my JSON input is an array. The only mystery is why this used to work in Swift 2.2! I ended up changing expectedResponseJSON to be an Any?, and rewrote my Objective C code in Swift:
class func getStringFrom(file fileName: String, fileExtension: String) -> String {
let filepath = Bundle(for: BZTestCase.self).path(forResource: fileName, ofType: fileExtension)
return try! NSString(contentsOfFile: filepath!, usedEncoding: nil) as String
}
class func getJSONFrom(file fileName: String) -> Any? {
let json = try! JSONSerialization.jsonObject(with: (getStringFrom(file: fileName, fileExtension: ".json").data(using: .utf8))!, options:.allowFragments)
return json
}
As a note to anyone who might cut and paste this code, I used try! and filepath! instead of try? and if let... because this code is used exclusively in tests, so I want it to crash as soon as possible if my inputs are not what I expect them to be.
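The underlying mistake, treating a top-level JSON array as if it were a dictionary, is not specific to Swift. As a language-neutral illustration, here is a small Python sketch (the helper name load_json_object is hypothetical) that fails loudly at parse time instead of later, when the value is first used as a dictionary.

```python
import json

def load_json_object(text):
    """Parse JSON and insist that the top-level value is an object."""
    value = json.loads(text)
    if not isinstance(value, dict):
        raise TypeError(
            f"expected a JSON object at top level, got {type(value).__name__}"
        )
    return value

print(load_json_object('{"status": "403"}'))

try:
    load_json_object('[{"status": "403"}]')
except TypeError as error:
    print(error)
```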

NSJSONSerialization generating NSCFString* and NSTaggedPointerString*

Executing NSJSONSerialization on the following json sometimes gives me NSCFString* and sometimes NSTaggedPointerString* on string values. Does anyone know why this is the case and what NSJSONSerialization uses to determine which type it returns?
jsonData = [NSJSONSerialization JSONObjectWithData:data
options:kNilOptions
error:&parseError];
{
"UserPermissionsService": {
"ServiceHeader": {},
"UserApplicationPermissions": {
"ApplicationPermissions": {
"ApplicationID": "TEST",
"Permission": [
{
"Locations": [
"00000"
],
"PermissionID": "LOGIN"
},
{
"Locations": [
"00000"
],
"PermissionID": "SALES_REPORT_VIEW"
}
]
}
}
}
}
"LOGIN" comes back as an NSTaggedPointerString*. "SALES_REPORT_VIEW" comes back as an NSCFString*. This is having an impact downstream where I'm using and casting the values.
UPDATE
Here's what I learned...
"NSTaggedPointerString results when the whole value can be kept in the pointer itself without allocating any data."
There's a detailed explanation here...
https://www.mikeash.com/pyblog/friday-qa-2015-07-31-tagged-pointer-strings.html
Since NSTaggedPointerString is a subclass of NSString, showing quotes or not showing quotes should never be an issue for me as the data is used.
Thanks to everyone who commented. I'm comfortable I understand what NSJSONSerialization is doing.
Much of Foundation is implemented as class clusters. TL;DR: you interact with it as an NSString, but Foundation will change the backing implementation to optimize for certain performance or space characteristics based on the actual contents.
If you are curious, one of the Foundation team dumped a list of the class clusters as of iOS 11 here
I FIXED IT BY USING "MUTABLECOPY"
I had the same issue. Apparently, as some kind of "performance" mechanism, Apple uses NSTaggedPointerString for "well known" strings such as "California", but this might be an issue since, for some weird reason, NSJSONSerialization doesn't add the quotes around strings of this NSTaggedPointerString type. The workaround is simple:
NSString *taggedString = @"California";
[data setObject:[taggedString mutableCopy] forKey:@"state"];
Works like a charm.

Elasticsearch / Lucene Misspelled Whitespace

How can I make Elasticsearch correct queries in which the keywords should be separated by whitespace but were instead typed run together? E.g.
"thisisaquery" -> "this is a query"
my current settings are:
"settings": {
"index": {
"analysis": {
"analyzer": {
"autocomplete": {
"tokenizer": "whitespace",
"filter": [
"lowercase", "engram"
]
}
},
"filter": {
"engram": {
"type": "edgeNGram",
"min_gram": 3,
"max_gram": 10
}
}
}
}
}
There isn't an out-of-the-box tokenizer/token filter to explicitly handle what you're asking for. The closest would be the compound word token filter, which requires manually providing a dictionary file; in your case that may mean the full English dictionary to work correctly. Even with that, it would likely have issues with words that are stems of other words, abbreviations, etc. without a lot of additional logic. It may be good enough though, depending on your exact requirements.
This Ruby project claims to do this. You might try it if you're using Ruby, or just look at their code and copy their analyzer settings for it :)
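To show why a dictionary is needed at all, here is a minimal Python sketch of dictionary-based word splitting. This is not how Elasticsearch implements its compound word filter; it is a plain dynamic-programming illustration, and the tiny vocabulary is invented for the example. It assumes the vocabulary is non-empty.

```python
def segment(text, vocabulary):
    """Split text into vocabulary words, or return None if that is impossible."""
    # best[i] holds one segmentation of text[:i], or None if none exists.
    best = [None] * (len(text) + 1)
    best[0] = []
    max_len = max(map(len, vocabulary))  # assumes a non-empty vocabulary
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if best[start] is not None and piece in vocabulary:
                best[end] = best[start] + [piece]
                break
    return best[-1]

vocabulary = {"this", "is", "a", "query", "his"}
print(segment("thisisaquery", vocabulary))
```

Any word missing from the vocabulary makes the whole split fail, which is the practical reason the answer above mentions needing something close to a full dictionary.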
https://github.com/ankane/searchkick