How do I compare two simple Dictionaries using BeEquivalentTo from FluentAssertions? - fluent-assertions

I've created the following very simple test which is a simplification of a scenario I've come across in the wild.
var actual = new JObject
{
    {"prop1", "1"},
    {"prop2", "2"},
};
var expected = new JObject
{
    {"prop1", "3"},
    {"prop2", "2"},
};
actual.Should().BeEquivalentTo(expected);
I have two JObjects which differ at a single key and I would like a test to fail if they differ and tell me where the difference is.
This assertion passes, but I wouldn't expect it to because of the difference I've already mentioned. I've spent quite a bit of time messing around with customizing the equivalency check, with no luck. Hopefully someone can point out what I've overlooked.

You can also assert the equality of the entire dictionary, where the equality of the keys and values will be validated using their Equals implementation.
Reference: FluentAssertions - Dictionaries
Like this:
actual.Should().Equal(expected);
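As a minimal sketch with plain dictionaries (the same idea applies to the JObjects above, since a JObject exposes its contents as key/value pairs; the variable names here are illustrative):
// using System.Collections.Generic;
// using FluentAssertions;
var actual = new Dictionary<string, string>
{
    {"prop1", "1"},
    {"prop2", "2"},
};
var expected = new Dictionary<string, string>
{
    {"prop1", "3"},
    {"prop2", "2"},
};
// Equal compares keys and values using their Equals implementation,
// so this assertion fails and reports the mismatch at "prop1".
actual.Should().Equal(expected);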

Related

Large list literals in Kotlin stalling/crashing compiler

I'm using val globalList = listOf("a1" to "b1", "a2" to "b2") to create a large list of Pairs of strings.
All is fine until you try to put more than 1000 Pairs into a List. The compiler either takes more than 5 minutes or just crashes (both in IntelliJ and Android Studio).
Same happens if you use simple lists of Strings instead of Pairs.
Is there a better way / best practice to include large lists in your source code without resorting to a database?
You can replace a listOf(...) expression with a list created using a constructor or a factory function and adding the items to it:
val globalList: List<Pair<String, String>> = mutableListOf<Pair<String, String>>().apply {
    add("a1" to "b1")
    add("a2" to "b2")
    // ...
}
This is definitely a simpler construct for the compiler to analyze.
If you need something quick and dirty instead of data files, one workaround is to use a large string, then split and map it into a list. Here's an example mapping into a list of Ints.
val onCommaWhitespace = "[\\s,]+".toRegex() // in this example split on commas w/ whitespace
val bigListOfNumbers: List<Int> = """
    0, 1, 2, 3, 4,
    :
    :
    :
    8187, 8188, 8189, 8190, 8191
    """.trimIndent()
    .split(onCommaWhitespace)
    .map { it.toInt() }
Of course for splitting into a list of Strings, you'd have to choose an appropriate delimiter and regex that don't interfere with the actual data set.
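For instance, a rough sketch of the same trick for Strings, assuming a semicolon delimiter that never occurs in the data itself:
val onSemicolonWhitespace = "[\\s;]+".toRegex() // split on semicolons and any surrounding whitespace
val bigListOfWords: List<String> = """
    alpha; beta; gamma;
    delta; epsilon
    """.trimIndent()
    .split(onSemicolonWhitespace)
    .filter { it.isNotEmpty() } // drop any empty piece left by a trailing delimiter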
There's no good way to do what you want; for something that size, reading the values from a data file (or calculating them, if that were possible) is a far better solution all round — more maintainable, much faster to compile and run, easier to read and edit, less likely to cause trouble with build tools and frameworks…
If you let the compiler finish, its output will tell you the problem.  (‘Always read the error messages’ should be one of the cardinal rules of development!)
I tried hotkey's version using apply(), and it eventually gave this error:
…
Caused by: org.jetbrains.org.objectweb.asm.MethodTooLargeException: Method too large: TestKt.main ()V
…
There's the problem: MethodTooLargeException.  The JVM allows only 65535 bytes of bytecode within a single method; see this answer.  That's the limit you're coming up against here: once you have too many entries, its code would exceed that limit, and so it can't be compiled.
If you were a real masochist, you could probably work around this to an extent by splitting the initialisation across many methods, keeping each one's code just under the limit.  But please don't!  For the sake of your colleagues, for the sake of your compiler, and for the sake of your own mental health…
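Purely for illustration, that discouraged workaround might look something like the sketch below (the helper names and chunk sizes are made up; each helper just has to stay well under the per-method bytecode limit):
private fun chunk1() = listOf("a1" to "b1", "a2" to "b2" /* ... a few hundred more ... */)
private fun chunk2() = listOf("a501" to "b501", "a502" to "b502" /* ... */)

val globalList: List<Pair<String, String>> = chunk1() + chunk2() // + chunk3() + ...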

What is the benefit of defining datatypes for literals in an RDF graph?

I am using rdflib in Python to build my first rdf graph. However, I do not understand the explicit purpose of defining Literal datatypes. I have scraped over the documentation and did my due diligence with google and the stackoverflow search, but I cannot seem to find an actual explanation for this. Why not just leave everything as a plain old Literal?
From what I have experimented with, is this so that you can search for explicit terms in your SPARQL query with BIND? Does this also help with FILTERing, e.g. FILTER (?var1 > ?var2), where var1 and var2 should represent integers/floats/etc.? Does it help with querying speed? Or am I just way off altogether?
Specifically, why add the following triple to mygraph
mygraph.add((amazingrdf, ns['hasValue'], Literal('42.0', datatype=XSD.float)))
instead of just this?
mygraph.add((amazingrdf, ns['hasValue'], Literal("42.0")))
I suspect that there must be some purpose I am overlooking. I appreciate your help and explanations - I want to learn this right the first time! Thanks!
Comparing two xsd:integer values in SPARQL:
ASK { FILTER (9 < 15) }
Result: true. Now with xsd:string:
ASK { FILTER ("9" < "15") }
Result: false, because when sorting strings, 9 comes after 1.
Some equality checks with xsd:decimal:
ASK { FILTER (+1.000 = 01.0) }
Result is true, it’s the same number. Now with xsd:string:
ASK { FILTER ("+1.000" = "01.0") }
False, because they are clearly different strings.
Doing some maths with xsd:integer:
SELECT (1+1 AS ?result) {}
It returns 2 (as an xsd:integer). Now for strings:
SELECT ("1"+"1" AS ?result) {}
It returns "11" as an xsd:string, because adding strings is interpreted as string concatenation (at least in Jena where I tried this; in other SPARQL engines, adding two strings might be an error, returning nothing).
As you can see, using the right datatype is important to communicate your intent to code that works with the data. The SPARQL examples make this very clear, but when working directly with an RDF API, the same kind of issues crop up around object identity, ordering, and so on.
As shown in the examples above, SPARQL offers convenient syntax for xsd:string, xsd:integer and xsd:decimal (and, not shown, for xsd:boolean and for language-tagged strings). That elevates those datatypes above the rest.
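Since the question is about rdflib, here is a minimal sketch of the same effect on the API side (the exact printed representations may vary between rdflib versions):
from rdflib import Literal
from rdflib.namespace import XSD

typed = Literal("42.0", datatype=XSD.float)
plain = Literal("42.0")

print(typed.toPython())  # a Python float, so numeric comparison and arithmetic behave as expected
print(plain.toPython())  # just a string-like value
print(typed == plain)    # False: a typed literal and an untyped literal are different RDF terms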

JSON issue with SQL lines return

I have an issue when I try to parse my JSON. I build my JSON "by hand" in PHP, like this:
$outp = '{"records":[' . $outp . ']}';
I do it this way so I can take fields from my database and show them on the page. The thing is, in my database I have a field "description" where people can give a description of something. So some people put line returns in it, for example:
Interphone
Equipe:
Canape-lit
Autre:
Local
And when I try to parse my JSON, there is an error because of these line returns: "SyntaxError: Unexpected token".
Here's an example of my JSON :
{"records":[{"Parking":"Aucun","Description":"Interphone
Equipé :
Canapé-lit
","Chauffage":"Fioul"}]}
Can someone help me, please?
You've really dug yourself into a very bad hole here.
The problem
The problem you're running into is that newline characters (line feed and carriage return) are not valid inside a JSON string. They must be escaped as \n and \r. You can see the full JSON standard here.
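For example, the record from the question would only be valid JSON with those characters escaped:
{"records":[{"Parking":"Aucun","Description":"Interphone\nEquipé :\nCanapé-lit\n","Chauffage":"Fioul"}]}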
You need to do two things.
Fix your code
In spite of the fact that the JSON standard is comparatively simple, you should not create your JSON by hand. You already know why. You have to handle several edge cases and the like. Your users could enter anything on the page, and you need to make sure that it gets properly encoded no matter what.
You need to use a JSON serialization tool. json_encode is built in as of PHP 5.2. If you can't use this for any reason, find an existing, widely used (and therefore heavily tested) third-party library with a JSON serializer.
If you're asking, "Why can't I create my own serializer?", you could, in theory. Realistically, there is no point. Yours won't be better than existing ones. It will be much more likely to have bugs and to perform worse than something a lot of people have used in production. It will also take much longer to create and test than using an existing one.
If you need this data in code after you pull it back out of the database, then you need a JSON deserializer. json_decode should also be fine, but again, if you can't use it, look for a widely used third party library.
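As a rough sketch of what the fix looks like (assuming $rows is the array of associative arrays you fetched from the database; the variable names are illustrative):
$outp = json_encode(array('records' => $rows)); // escapes newlines, quotes and accents for you
// ... and when you need the data back in PHP:
$decoded = json_decode($outp, true); // true => associative arrays instead of objects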
Fix your data
If you haven't hit production yet, you have really dodged a bullet here, and you can skip this whole section. If you have gone to production and you have data from users, you've got a major problem.
Even after you fix your code, you still have bad data in your production database that won't parse correctly. You have to do something to make this data usable. Unfortunately, it is impossible to automatically recover the original data for every possible case. This is because users might have entered the characters/substrings you added to the data to turn it into "JSON"; for example, they might have entered a comma separated list of quoted words: "dog","cat","pig", and "cow". That is an intractable problem, since you know for a fact you didn't properly serialize all your incoming input. There's no way to tell the difference between text your code generated and text the user entered. You're going to have to settle for a best effort and try to throw errors when you can't figure it out in code, and it might mess up a user's data in some special cases. You might have to fix some things manually.
Start by discussing this with your manager, team lead, whoever you answer to. Assuming that you can't lose the data, this is the most sound process to follow for creating a fix for your data:
1. Create a database dump of your production data.
2. Import that dump into a development database.
3. Develop and test your method of repairing this data against the development database from the last step.
4. Ensure you have a recovery plan for deployments gone wrong. Test this plan in your testing environment.
Once you've gone through your typical release process, it's time to release the fixed code and the data update together.
1. Take the website offline.
2. Back up the database.
3. Update the website with the new code.
4. Implement your data fix.
5. Verify that it worked.
6. Bring the site online.
If your data fix doesn't work (possibly because you didn't think of an edge case or something), then you have a nice back up you can restore and you can cancel the release. Then go back to step 1.
As for how you can fix the data, I don't recommend queries here. I recommend a little script tool. It would have to load the data from the database, pull the string apart, try to identify all the pieces, build up an object from those pieces, and finally serialize them to JSON correctly, and put them back into the database.
Here's an example function of how you might go about pulling the string apart:
const ELEMENT_SEPARATOR = '","';
const PAIR_SEPARATOR = '":"';

function recover_object_from_malformed_json($malformed_json, $known_keys) {
    $tempData = substr($malformed_json, 14); // Removes {"records":[{" prefix
    $tempData = substr($tempData, 0, -4);    // Removes "}]} suffix
    $tempData = explode(ELEMENT_SEPARATOR, $tempData); // Split into what we think are pairs
    $data = array();
    $lastKey = NULL;
    foreach ($tempData as $i) {
        $explodedI = explode(PAIR_SEPARATOR, $i, 2); // Split what we think is a key/value into key and value
        if (in_array($explodedI[0], $known_keys)) { // Check if it's actually a key
            // It's a key
            $lastKey = $explodedI[0];
            if (array_key_exists($lastKey, $data)) {
                throw new RuntimeException('Duplicate key: ' . $lastKey);
            }
            // Assign the value to the key
            $data[$lastKey] = $explodedI[1];
        }
        else {
            // This isn't a key/value pair, as near as we can tell,
            // so it must actually be part of the last value,
            // and the user actually entered the delimiter as part of the value.
            if (is_null($lastKey)) {
                // This one is REALLY messed up
                throw new RuntimeException('Does not begin with a known key');
            }
            $data[$lastKey] .= ELEMENT_SEPARATOR;
            $data[$lastKey] .= $i;
        }
    }
    return $data;
}
Note that I'm assuming that your "list" is a single element. This gets much harder and much messier if you have more than one. You'll also need to know ahead of time what keys you expect to have. The bottom line is that you have to undo whatever your code did to create the "JSON", and you have to do everything you can to try to not mess up a user's data.
You would use it something like this:
$knownKeys = ["Parking", "Description", "Chauffage"];
// Fetch your rows and loop over them
foreach ($dbRows as $row) {
    try {
        $dataFromDb = $row->myData; // or however you would pull out this string
        $recoveredData = recover_object_from_malformed_json($dataFromDb, $knownKeys);
        // Save it back to the DB
        $row->myData = json_encode($recoveredData);
        // Make sure to commit here.
    }
    catch (Exception $e) {
        // Log the row's ID, the content that couldn't be fixed, and the exception
        // Make sure to roll back here
    }
}
(Forgive me if the database stuff looks really wonky. I don't do PHP, so I have no idea how that code should look. Hopefully, you can at least get the concept.)
Why I don't recommend trying to parse your data as JSON to recover it.
The bottom line is that your data in the database is not JSON. If you try to parse it as such, all the other edge cases you didn't handle properly will get screwed up in the process. You'll see bad things like:
\\ becomes \
\j becomes j
\t becomes a tab character
In the end, it will just mess up your data even more.
Conclusion
This is a huge mess, and you should never try to convert something into a standard format without using a properly built, well tested serializer. Fixing the data is going to be hard, and it's going to take time. I also seriously doubt you have a lot of background in text processing techniques, and lacking that knowledge is going to make this harder. You can get some good info on text processing by studying how compilers are made. Good luck.

How can I get Chai to show actual and expected values using toString()

I recently switched from should.js to chai.js, as I discovered the former was causing snags in browser-based testing. The change didn't require any changes to my test suite, as the syntax is supported, but I see that the output of failing tests no longer shows me the actual and expected values in a useful way:
AssertionError: expected [ Array(9) ] to deeply equal [ Array(9) ]
I can get it to spit out a representation of these values by adding this line:
chai.config.truncateThreshold = 0;
However, this results in every value being exhaustively output, including functions and prototype properties. Also pretty useless.
So is there some way I am missing to have chai behave like should.js, where actual/expected values are shown using their toString() method?
One way to get Chai (v1.10.0) to show actual and expected values using .toString() is to patch its utils.objDisplay at runtime.
The basic gist is:
chai.use(function (_chai, utils) {
    utils.objDisplay = function (obj) { return obj + ''; };
});
However, it's slightly more complicated to do in practice because of the way Chai instantiates duplicate (i.e. !==) module references internally; in this case it also becomes necessary to patch utils.getMessage.
Here is a fiddle to demonstrate the overall patch along with a trivial example of custom formatting for Array objects: https://jsfiddle.net/humbletim/oc1tnqpy/
As per https://www.chaijs.com/guide/styles/#configuration, Chai now lets you disable truncation of the actual and expected values in assertion errors by setting the truncation threshold to zero.
chai.config.truncateThreshold = 0; // disable truncating

Specify wildcard in JSON?

I have searched the web, but I cannot find an answer (or a duplicate question for that matter).
I am POSTing JSON via Jersey REST services.
I usually POST something like this, where each value is specified:
{
    "w":"val1",
    "x":"val2",
    "y":"val3",
    "z":{
        "zz":{
            "za":"val9",
            "zb":"val8",
            "zc":"val7"
        }
    }
}
I would like to POST something like this, where the asterisk is a wildcard.
{
    "w":"val1",
    "x":"val2",
    "y":"val3",
    "z":{
        "zz":{
            "za":"val9",
            "zb":"*",
            "zc":"val7"
        }
    }
}
The JSON values will ultimately be passed as parameters to a Sybase stored procedure, but in this case I do not know any valid values for "zb".
For example, "zb" may be a primary key ID, but I do not know any of the 10-digit IDs. So rather than repeatedly trying random combinations of 10 digits until I get a result back, I would like to specify that ANY existing primary key in the table would suffice.
Is this possible? If so, how?
For the sake of having a complete Q&A repository on this website, I am posting what I have come up with, although it is an unsatisfactory solution.
My workaround is a combination of (at the time of this writing) the two comments left to the question.
That being said, I did not specify a "wildcard" character. I POSTed empty strings for data I did not have, and, of course, received a broader data set in the response. For example, my JSON looked something like this:
{
    "w":"val1",
    "x":"",
    "y":"val3",
    "z":{
        "zz":{
            "za":"",
            "zb":"",
            "zc":"val7"
        }
    }
}
Ultimately, I could not do exactly what I wanted to do as described in the question. That is either because 1) it is currently impossible to do in JSON, or 2) it is possible, but I still do not know how; it's still unknown to me which.