No data output when extracting part of filename in U-SQL - azure-data-lake

When I do an extract from multiple files and include part of the filename in the fields list and in the FROM clause (e.g. FROM "/input/filename-{filedate:*}.nc"), the resulting output file only contains a header row. If I remove "filedate" from the fields list and the FROM clause, I get the correct output.
I noticed in the job graph that when including "filedate", an "Empty Input" and an "Extract Cross" step is added before the "PodAggregate" step, and in the "Extract Cross" no data is written. What is this step?
Also, if I run the original extract including "filedate" locally, I get the correct output, so it's only in ADLA this error occurs.
I use a custom extractor and I don't know if this has anything to do with it. I haven't tested with a built-in extractor.

We released the new "fast file set" option by default. Unfortunately, it introduced a regression for some plans. Until we fix it, please add the following statement to your script:
SET ##InternalDebug = "FileSetV2:off";
Our apologies for any inconvenience this may have caused.

Related

No output is generated when using reference data in Azure Stream Analytics

I have written a simple query and join it with a json reference data. I can see correct results when testing the query in "Test results" tab. However, no output is generated when starting the job.
I have confirmed that the output blob is created when no join with reference data is used in the query.
Any help is appreciated. The sample reference json follows:
[
{
"DeviceId":"DEV-021",
"Brand":"brand01",
"Model":"model01"
}
]
Use flat json structure instead of array. It should give you the output
Check the path you specified in the reference data, maybe it is not correct or you did not specify the file name. Does it contain something like {date}/{time}/filename.json?
If you forget to specify the file name, it does not work as well.
And if you are testing the job, usually you specify the file manually and that is why your query works.

How do you differentiate between QVD source files and target files when reading a QVW's XML MetaData?

I am currently trying to find an alternative to the Governance Dashboard that Rob Wunderlich (Qlik founder) created, since I am currently encountering errors when using it.
How do you differentiate between a data source (QVD, aka source) that is used by a QVW or a data file (QVD, aka target) that is generated by that QVW?
QVW:
LOAD
Lower(Discriminator) AS DataFile.Filepath
FROM C:\Sample_Transform_file.qvw (xmlSimple, Table is[DocumentSummary/LineageInfo])
Below is an example of what I found when parsing through the XML Metadata
(discriminator subtag within the lineageinfo tag) for one specific Transform QVW.
Sample Table Output
Are targets just identified by this?
STORE - [qvdName.qvd](qvd)
From what I have found, That appears to be the case, to a degree.
All of our QVW files that output a QVD utilize DIRECTORY statements rather than either hard-coded file location paths or variablized paths. Hence why all of the Targets are getting displayed as "STORE - qvdname.qvd", instead of displaying the filepath. In a sense, that is a flaw on QlikView's part, regarding its Governance Dashboard (or at the very least, they don't seem to recommend variablizing those paths as a standard in order to avoid breaking the lineage).

Conditionally, Converting of JSON to XML using MuleSoft

I have a simple conversion of JSON to XML using MuleSoft. In "Transform Message" component, I provided JSON Schema as Input and XML Schema as Output. When I run the app, the conversion happens if the file matches with both schema but it generates an empty XML file if it doesn't match.
I want below conditions:
1) If the file matches with schema, the converted output file should be sent to converted folder and the original file should move to Success folder.
2) If the file doesn't match with schema, the original file should move to the Failure folder instead of conversion.
Hope, I explained it comprehensively as I am new to MuleSoft. Here is a sample diagram which may simplify my requirement. Provide me with a new one if I badly designed the process.
First thing you need to create a flowVar that will hold your original payload.
When your doing your evaluation, if its XML then use a simple XPath expression like //elementName[not(node())]
Lastly, on your success use scatter-gather for multi-threading write. Pull your original payload from flowVar and write to Success and Write your regular payload to your Converted folder

Openrefine not working as expected

I'm very new to OpenRefine, so please bear with me if i have made a simple mistake.
I'm parsing a HTML website to gather some date.
Everything went fine with fetching the individual pages, but now the parsing of the HTML fails.
I'm creating a new column, based on the one holding all the page's HTML. I'm trying to get to the data in a specific DIV[20].
In the"create column based on this column" window it gives me a preview when using value.parseHtml().select("DIV")[20] , which results in exactly what i need... executing it gives me nothing but blank cells.
it even tells me that it is "filling 0 rows with grel:value.parseHtml().select("DIV")[20]"
Any clue what i'm doing wrong here?
You just need to finalize with .toString() to output the JSON.org object AS a string.
This is explained on our wiki here: https://github.com/OpenRefine/OpenRefine/wiki/StrippingHTML#extract-html-attributes-text-links-with-integrated-grel-commands
I also updated the select() function with that example: https://github.com/OpenRefine/OpenRefine/wiki/GREL-Other-Functions#selectelement-e-string-s

Custom Grid Query in Rally with Multiple OR Clauses

I'm trying to write a query in Rally that will show me all the defects for several projects, but every time I save the query I get the message "Could not parse: Error parsing expression -- expected ")" but saw "OR" instead."
Here is the actual query:
((((Project.Name = "Project A") OR (Project.Name = "Project B")) OR (Project.Name = "Project C")) OR (Project.Name = "Project D"))
I checked Rally's so-called Help and it seems to me that everything is set up correctly, but maybe I'm missing something?
Your query syntax and parentheses groupings look fine. I tested your exact string above in a Custom Grid and it parses fine - no "Could not parse..." error. Maybe compare the exact query you are using against your sample above? Complex AND's and OR's can definitely be frustrating. If you miss a parenthesis or spaces around operators, the query engine will complain.
fyi, I just found that doing a page reload in the browser forces the changed query expression to be evaluated, whereas simply saving the modified query does not reliably re-evaluate the changed query.
The symptom I observed was that query results continued to complain about the previous query string, even though I had replaced parts of the query with different named fields, etc. This made me suspect browser caching, and when flushing cache did not help, then I did browser page reload, which worked perfectly.
So if your browser page was reloaded between when you were having the problem and later when it started working, then this might explain why.
From About link: Rally Build: master-9274 , Browser Type: firefox/19.0, rv:19.0