Understanding XSD error in a third party document - lxml

Using Python and lxml to validate against a third-party XSD, I get the following error as soon as I parse the FB2 (FictionBook) schema:
lxml.etree.XMLSchemaParseError: attribute use (unknown),
attribute 'ref': The QName value
'{http://www.w3.org/XML/1998/namespace}lang'
does not resolve to a(n) attribute declaration., line 52
The part of the schema responsible for this error looks like the following:
<xs:attribute ref="xml:lang"/>
FB2 is a well-known format, so I can't imagine the schema itself has an error. Is my parser misconfigured instead?
Here is my code following a working example from the lxml manual.
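from io import StringIO
from lxml import etree

# file_to_string is a helper (not shown) that reads the file into a string
f = StringIO(file_to_string("data/spec/FictionBook.xsd"))
xmlschema_doc = etree.parse(f)
xmlschema = etree.XMLSchema(xmlschema_doc)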
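The message says the schema processor could not find a declaration for xml:lang. In XSD, an attribute referenced from another namespace has to be made available through an xs:import of that namespace. One thing worth checking is whether the schema imports the XML namespace and whether that import can actually be located. This is an assumption, since the schema itself is not reproduced here. Parsing from a StringIO also leaves the document without a base URL, so a relative schemaLocation may fail to resolve. The sketch below uses only the standard library; the function name and the two tiny schemas are hypothetical and not taken from FictionBook.xsd.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def imports_xml_namespace(xsd_text):
    """Return True if the schema has a top-level xs:import for the xml: namespace."""
    root = ET.fromstring(xsd_text)
    return any(
        imp.get("namespace") == XML_NS
        for imp in root.findall(f"{{{XS}}}import")
    )


# Hypothetical minimal schemas for illustration only
without_import = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="t"><xs:attribute ref="xml:lang"/></xs:complexType>
</xs:schema>"""

with_import = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>
  <xs:complexType name="t"><xs:attribute ref="xml:lang"/></xs:complexType>
</xs:schema>"""

print(imports_xml_namespace(without_import))  # False
print(imports_xml_namespace(with_import))     # True
```

If the import is present but uses a relative schemaLocation, passing the file path directly to etree.parse instead of a StringIO may be enough for lxml to find it.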

Related

Type name 'serializeObject' does not exist in the type 'JsonConvert'

I am trying to convert some data to JSON to send to the CloudFlare API. I am attempting to use Newtonsoft.Json.JsonConvert.SerializeObject() to accomplish this, but I am getting the Intellisense error that the type name 'serializeObject' does not exist in the type 'JsonConvert'. I have the NuGet package Newtonsoft.Json installed, but it does not recognize the SerializeObject() method. I am not sure what I am missing.
It's because you're calling the method incorrectly. SerializeObject is a static method, so remove the new operator and change the line
var json = new JsonConvert.SerializeObject(updateDnsRecord);
to
var json = JsonConvert.SerializeObject(updateDnsRecord);

Processing Event Hub Capture AVRO files with Azure Data Lake Analytics

I'm attempting to extract data from AVRO files produced by Event Hub Capture. In most cases this works flawlessly, but certain files are causing me problems. This is the U-SQL job I am running:
USE DATABASE Metrics;
USE SCHEMA dbo;
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
REFERENCE ASSEMBLY [Avro];
REFERENCE ASSEMBLY [log4net];
USING Microsoft.Analytics.Samples.Formats.ApacheAvro;
USING Microsoft.Analytics.Samples.Formats.Json;
USING System.Text;
//DECLARE @input string = "adl://mydatalakestore.azuredatalakestore.net/event-hub-capture/v3/{date:yyyy}/{date:MM}/{date:dd}/{date:HH}/{filename}";
DECLARE @input string = "adl://mydatalakestore.azuredatalakestore.net/event-hub-capture/v3/2018/01/16/19/rcpt-metrics-us-es-eh-metrics-v3-us-0-35-36.avro";
@eventHubArchiveRecords =
EXTRACT Body byte[],
date DateTime,
filename System.String
FROM @input
USING new AvroExtractor(@"
{
""type"":""record"",
""name"":""EventData"",
""namespace"":""Microsoft.ServiceBus.Messaging"",
""fields"":[
{""name"":""SequenceNumber"",""type"":""long""},
{""name"":""Offset"",""type"":""string""},
{""name"":""EnqueuedTimeUtc"",""type"":""string""},
{""name"":""SystemProperties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}},
{""name"":""Properties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}},
{""name"":""Body"",""type"":[""null"",""bytes""]}
]
}
");
@json =
SELECT Encoding.UTF8.GetString(Body) AS json
FROM @eventHubArchiveRecords;
OUTPUT @json
TO "/outputs/Avro/testjson.csv"
USING Outputters.Csv(outputHeader : true, quoting : true);
I get the following error:
Unhandled exception from user code: "The given key was not present in the dictionary."
An unhandled exception from user code has been reported when invoking the method 'Extract' on the user type 'Microsoft.Analytics.Samples.Formats.ApacheAvro.AvroExtractor'
Am I correct in assuming the problem is within the AVRO file produced by Event Hub Capture, or is there something wrong with my code?
The "key not present" error refers to the fields in your EXTRACT statement: it's not finding the date and filename fields. I removed those two fields and your script ran correctly in my ADLA instance.
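A likely reading of this answer is that the date and filename columns are virtual columns. In U-SQL file sets they are filled from the {date:...} and {filename} placeholders in the input path, and the script in the question has that pattern commented out in favor of a fixed path. The rough Python model below shows which requested columns then have no source. It is only a sketch with hypothetical function names and does not call any Azure or U-SQL API.

```python
import re


def virtual_columns(path_pattern):
    """Names of {name} or {name:format} placeholders in a file-set path pattern."""
    return set(re.findall(r"\{(\w+)(?::[^}]*)?\}", path_pattern))


def unresolved_columns(requested, schema_fields, path_pattern):
    """Requested columns found neither in the Avro schema nor in the path pattern."""
    return set(requested) - set(schema_fields) - virtual_columns(path_pattern)


base = "adl://mydatalakestore.azuredatalakestore.net/event-hub-capture/v3/"
pattern_path = base + "{date:yyyy}/{date:MM}/{date:dd}/{date:HH}/{filename}"
fixed_path = base + "2018/01/16/19/rcpt-metrics-us-es-eh-metrics-v3-us-0-35-36.avro"

requested = ["Body", "date", "filename"]
schema_fields = ["SequenceNumber", "Offset", "EnqueuedTimeUtc",
                 "SystemProperties", "Properties", "Body"]

print(sorted(unresolved_columns(requested, schema_fields, fixed_path)))
# ['date', 'filename']
print(sorted(unresolved_columns(requested, schema_fields, pattern_path)))
# []
```

With the fixed path, date and filename can only be looked up in the Avro record, where they do not exist. That would be consistent with the "key was not present" message.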
The current implementation of the sample extractor supports only primitive types, not the complex types of the Avro specification.
You have to build and use your own extractor based on Apache Avro instead of the sample extractor provided by Microsoft.
We went down the same path.

JSON Schema for FHIR false positives

I am new to JSON Schema and am trying to validate JSON against the HL7-FHIR schemas. Data that I think should be invalid (and that the official Java-based validator rejects) shows up as valid.
For example, {"dog": "food"} should be invalid, because when I run the validator, I get:
> java -jar org.hl7.fhir.validator.jar bad.json -defn definitions.json.zip
.. load FHIR from definitions.json.zip
.. connect to tx server @ http://tx.fhir.org/r3
(vnull-null)
.. validate
*FAILURE* validating bad.json: error:1 warn:0 info:0
Fatal @ $ (line 1, col2) : Unable to find resourceType property
But if I paste the fhir.schema.json file from here into a JSON Schema validator like the one here, and evaluate {"dog": "food"}, it's valid.
It's valid even if I supply a resourceType, which I thought might cause the restrictions to kick in. It's also valid if I copy an example I expect to be valid—say, this Practitioner example—and change some of the types (set name to be a string rather than an array, for example).
I'm not sure if I'm running into a problem with the HL7-FHIR JSON Schema in particular or with JSON Schemas in general. I believe my question is different than this one because it appears that we're up to release 3.0, and so the schema I'm using is updated.
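One general JSON Schema behavior may explain this. This is an assumption, since the schema file itself is not reproduced in the question. A schema whose root object carries only annotation keywords and a definitions block places no constraint on the instance, so every document validates unless the root refers to one of the definitions through $ref, oneOf, or similar. The sketch below lists which root-level keywords could constrain anything. The helper name and the two toy schemas are hypothetical.

```python
# Root-level keywords that describe or organize a schema but never reject an instance
NON_VALIDATING = {
    "$schema", "$id", "id", "title", "description", "$comment",
    "definitions", "$defs", "default", "examples",
}


def root_constraints(schema):
    """Return the root-level keywords that can actually constrain an instance."""
    return sorted(set(schema) - NON_VALIDATING)


# Toy schema with definitions only: nothing at the root applies them
definitions_only = {
    "$schema": "http://json-schema.org/draft-04/schema#",
    "definitions": {
        "Practitioner": {
            "type": "object",
            "required": ["resourceType"],
        }
    },
}

# Same schema, but the root now points at one of the definitions
with_root_ref = dict(definitions_only, **{"$ref": "#/definitions/Practitioner"})

print(root_constraints(definitions_only))  # []
print(root_constraints(with_root_ref))     # ['$ref']
```

If the first case matches the FHIR schema file, validating against a specific definition (for example by adding a root-level $ref) would be the thing to try.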

common xsd schema imported into another schema not being unmarshalled

re http://blog.bdoughan.com/2011/12/reusing-generated-jaxb-classes.html
I am trying to switch from Castor to JAXB.
I am importing a commontypes.xsd schema into another schema and then using JAXB to generate the Java classes, but when I unmarshal a sample XML file the imported types are null unless I explicitly set all the namespaces in the sample XML.
This is a real pain because I want calling apps to be able to send me plain XML not one littered with a tonne of namespaces and prefixes etc.
Any suggestions as to how to avoid having to do this?
I generated .episode files in Maven following the above article and the XJC episode approach, but it doesn't help and I'm still getting nulls when I unmarshal.
Can anyone help?
thanks
I got it working!
The problem was that the package-info.java file generated by xjc from my .xsd file had elementFormDefault set to QUALIFIED:
@javax.xml.bind.annotation.XmlSchema(
namespace = "http://www.example.com/commontypes",
elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED
)
package com.example.commontypes;
When I changed this to be unqualified and recompiled the java code, the unmarshall then worked.
The root cause fix was in my .xsd file, where I set elementFormDefault="unqualified"
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.com/commontypes"
xmlns="http://www.example.com/commontypes"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
This resulted in the following generated package-info.java file:
@javax.xml.bind.annotation.XmlSchema(
namespace = "http://www.example.com/commontypes"
)
package com.example.commontypes;
and again, the unmarshall then worked!
Thanks to Blaise for all the work he puts in; it was a comment on one of his blog posts that let me figure it out!
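To illustrate what the setting changes in the instance document: with elementFormDefault="qualified", locally declared child elements are expected in the target namespace, while with "unqualified" only the global (root) element is namespaced and the children are plain. The Python sketch below shows how a parser sees the two forms. The address and street element names are hypothetical; only the namespace comes from the schema above.

```python
import xml.etree.ElementTree as ET

NS = "http://www.example.com/commontypes"

# Form expected with elementFormDefault="qualified": children carry the namespace
qualified_doc = (
    f'<c:address xmlns:c="{NS}"><c:street>Main St</c:street></c:address>'
)

# Form expected with elementFormDefault="unqualified": children are in no namespace
unqualified_doc = (
    f'<c:address xmlns:c="{NS}"><street>Main St</street></c:address>'
)


def child_tags(xml_text):
    """Return the expanded names of the root element's direct children."""
    return [child.tag for child in ET.fromstring(xml_text)]


print(child_tags(qualified_doc))    # ['{http://www.example.com/commontypes}street']
print(child_tags(unqualified_doc))  # ['street']
```

A binding generated for one form will not match child elements written in the other, which fits the null fields described above.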

.NET, XmlSerializer InvalidOperationException, due to XmlSchema definition?

I've uploaded a ZIP file containing both the XML file I'm trying to read and the corresponding XSD files to http://www.bonnland.de/FIBEX.zip
I'm trying to deserialize the following XML (fragment) using XmlSerializer. While doing so I get this error (sorry that it is in German; a rough translation follows):
System.InvalidOperationException==>Fehler im XML-Dokument (90,7).
System.InvalidOperationException==>Der angegebene Typ wurde nicht erkannt: Name='CONTROLLER-TYPE', Namespace='http://www.asam.net/xml/fbx/can', bei .
This translates as something like:
System.InvalidOperationException==>There is an error in XML document (90,7).
System.InvalidOperationException==>The specified type was not recognized: Name='CONTROLLER-TYPE', Namespace='http://www.asam.net/xml/fbx/can', at .
Here's the source document:
<fx:ECU ID="ecuSpeedControl">
<ho:SHORT-NAME>SpeedControl</ho:SHORT-NAME>
<ho:DESC>ECU controlling drive speed</ho:DESC>
<fx:CONTROLLERS>
<fx:CONTROLLER xsi:type="can:CONTROLLER-TYPE" ID="ctrlSpeedControl">
<ho:SHORT-NAME>ctrlSpeedControl</ho:SHORT-NAME>
<ho:DESC>CAN controller of ECU</ho:DESC>
<fx:CHIP-NAME>SJA1000</fx:CHIP-NAME>
<can:TIME-SEG0>11</can:TIME-SEG0>
<can:TIME-SEG1>4</can:TIME-SEG1>
<can:SYNC-JUMP-WIDTH>2</can:SYNC-JUMP-WIDTH>
<can:NUMBER-OF-SAMPLES>1</can:NUMBER-OF-SAMPLES>
</fx:CONTROLLER>
</fx:CONTROLLERS>
</fx:ECU>
The root element is:
<fx:FIBEX xmlns:fx="http://www.asam.net/xml/fbx" xmlns:ho="http://www.asam.net/xml"
xmlns:can="http://www.asam.net/xml/fbx/can" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="fibex4can.xsd" VERSION="3.1.0">
The class definition for this fragment is:
public ref class FIBEXECU : AbstractFIBEXNode, IGenericContainable
{
public:
ref class ControllersContainer : FIBEXGenericContainer<FIBEXController^>{
public:
[XmlElement("CONTROLLER")]
property array<FIBEXController^>^ ControllerObjs {
array<FIBEXController^>^ get() { return Children;}
void set(array<FIBEXController^>^ value) { Children = value;}
}
};
[XmlAttribute("ID")]
virtual property String^ ID;
[XmlElement("SHORT-NAME", Namespace="http://www.asam.net/xml")]
property String^ ShortName;
[XmlElement("CONTROLLERS")]
property ControllersContainer^ Controllers;
};
I hope that (yet again) someone can help me, as I didn't find a solution on Google or here.
The error you get seems to indicate that a certain type is not available. Looking through your XSD, quite a few types are undefined, but that's likely because you haven't included the imported and included XSD files, so I cannot reliably check the validity of your documents.
The XML itself contains errors. For instance, the xsi:schemaLocation is not correct: it must contain pairs of namespace and location. Instead of this:
xsi:schemaLocation="fibex4can.xsd"
it should be this (assuming the file is indeed in the same directory as the XML):
xsi:schemaLocation="http://www.asam.net/xml/fbx/can fibex4can.xsd"
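A minimal sketch of the pairing rule, assuming only that the attribute value is a whitespace-separated list of alternating namespace and location tokens. The function name is hypothetical.

```python
def parse_schema_location(value):
    """Split an xsi:schemaLocation value into (namespace, location) pairs.

    Raises ValueError when the tokens cannot be paired up.
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError(
            f"xsi:schemaLocation needs namespace/location pairs, got {len(tokens)} token(s)"
        )
    return list(zip(tokens[0::2], tokens[1::2]))


print(parse_schema_location("http://www.asam.net/xml/fbx/can fibex4can.xsd"))
# [('http://www.asam.net/xml/fbx/can', 'fibex4can.xsd')]

try:
    parse_schema_location("fibex4can.xsd")
except ValueError as exc:
    print("rejected:", exc)
```

The single-token value from the question is rejected because it gives a location without the namespace it belongs to.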
My guess is that the apparent errors in your document are the reason it cannot be parsed. When dealing with XML, you must be very strict (as with any programming language). If you tell the processor to validate the document, then the schemas must be available, they must themselves be valid, any related schemas must be locatable, and finally the XML document must be valid against these schemas. Conforming processors (like the ones in .NET) must obey these and other rules for XML, and they must throw an error and stop parsing the document when the XML is not well-formed or not valid.