How to quickly get code from a Kaggle notebook without registering?

I can already see three ways, but none is quick (at least not compared to, say, accessing a raw file on GitHub):
Fork/download (requires registration)
Follow instructions here (i.e. download, open up in jupyter/ipython notebook)
Copy the code blocks manually, one by one (bad for long notebooks)
Is there an easier way? (Ideally, I was hoping to add raw to the URL somewhere, just like on GitHub.)

In case it's useful to others, put the notebook url here to extract raw code
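For the "download and open in Jupyter" route, the code can also be pulled out of the .ipynb file without opening it, because a notebook is plain JSON. The following is only a minimal sketch, assuming the notebook has already been downloaded and follows the usual nbformat 4 layout (a top-level "cells" list whose entries carry "cell_type" and "source"); the function name extract_code is made up for illustration.

```python
import json


def extract_code(notebook_json: str) -> str:
    """Return the source of every code cell, separated by blank lines."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # nbformat stores the source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        if source.strip():
            chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"
```

Reading the downloaded file as UTF-8 text and passing its content to extract_code gives one script-like string that can be saved as a .py file.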


How to parse an OFX (Version 1.0.2) file in Power BI?

I just read "How to parse a OFX (Version 1.0.2) file in PHP?".
I am not a developer. What easy tool can I use to run this code with no coding skills or inclination? A web browser is pretty hard to use for this if you are not a developer.
I need this to use the file in Power BI, which accepts M code, JSON sources or XML, but not SGML-based OFX or PHP.
Thanks in advance
Welcome Didier to StackOverflow!
I'm going to try to give you a clue about how I'd approach the problem here. But keep in mind that your question really lacks the details we need to help you, so I'd ask you to update it with example data that you want to integrate into PowerBI. Also, I'm not too familiar with PowerBI or PHP, and won't go into making the PHP code you linked run for you.
Rather, I'd suggest converting your OFX file into XML, and then using PowerBI's XML import on that converted file.
From your linked question, I get that your OFX file is in SGML format. There's a program specifically designed to convert SGML into XML (which is just a restricted form of SGML) called osx. I've detailed how to install it on Linux and Mac OS in another question related to SGML-to-XML down-converting; if you're on Windows, you may have luck by just downloading a really ancient (32bit) version of it from ftp://ftp.jclark.com/pub/sp/win32/sp1_3_4.zip. Alternatively, you can use my sgmljs.net software as explained in Converting HTML to XML though that tutorial is really about the much more complex task of converting HTML to XML/XHTML and will probably confuse you.
Anyway, if you manage to install osx, running it on your OFX file (which I assume to have the name yourfile.ofx just for illustration) is just a matter of invoking (on the Windows or Linux/Mac OS command line):
osx yourfile.ofx > yourfile.xml
to result in yourfile.xml which you can attempt to load with PowerBI.
Chances are your OFX file has additional text at the beginning (lines like XYZ:0001 that come before <ofx>). In that case, you can just remove those lines using a text editor before invoking osx on it. Maybe you also need a .dtd file or additional instructions at the top of the OFX file informing SGML about the grammar of your file; it's really difficult to say without seeing actual test data.
Before bothering with SGML and all that, however, I suggest removing those first few lines in your OFX file (everything until the first < character) and checking whether PowerBI can already recognize your changed input file as XML (which, from other OFX example files, has a good chance of succeeding). Be sure to work on a copy of your original file rather than overwriting it. Then come back and update your question with your results and example data.
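If a text editor feels error-prone, the "remove everything until the first < character" step could also be scripted. This is just a rough sketch in Python under that single assumption; the function names are hypothetical, and it writes to a separate output file so the original stays untouched.

```python
def strip_ofx_header(text: str) -> str:
    """Drop everything before the first '<' character."""
    start = text.find("<")
    if start == -1:
        raise ValueError("no '<' found; this does not look like an OFX file")
    return text[start:]


def write_clean_copy(source_path: str, target_path: str) -> None:
    """Write a header-free copy of source_path to target_path."""
    # errors="replace" keeps the script going if the file is not valid UTF-8;
    # the real encoding of a given OFX file is not known here.
    with open(source_path, encoding="utf-8", errors="replace") as handle:
        cleaned = strip_ofx_header(handle.read())
    with open(target_path, "w", encoding="utf-8") as handle:
        handle.write(cleaned)
```

For example, write_clean_copy("yourfile.ofx", "yourfile-clean.ofx") would leave yourfile.ofx as it was and produce a second file to try in PowerBI or to feed to osx.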

QnA Maker: How to enable context-only for subquestions?

I have a document (PDF or DOCX, the only formats accepted for multi-turn) that contains a lot of subheadings, which translate to follow-up prompts. This all works fine! But I would like to enable context-only for all of my prompts, because the answers are not relevant out of context.
Can I denote this in the document itself? There are way too many prompts to check the box manually for each one.
I could write a script that changes contextOnly to true in the exported TSV, but this seems like a silly workaround.
There does not seem to be any way to indicate whether a question is context-only through the document extraction process, so you will need to automate this with a script. If you don't want to modify the TSV directly, you can use the QnA REST API. You can also access this API through the Bot Framework CLI but I don't know if that makes anything more convenient for you.
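Such a script might look roughly like the sketch below. It does not call the REST API; it only transforms a list of QnA entries that has already been downloaded. The field names (id, context, isContextOnly, prompts, qnaId) are an assumption about how the downloaded knowledge base is shaped and should be checked against a real export before relying on this.

```python
import copy


def mark_prompt_targets_context_only(qna_documents: list) -> list:
    """Return a copy in which every QnA used as a follow-up prompt is context-only."""
    # Collect the ids of all QnA entries that some other entry links to as a prompt.
    target_ids = set()
    for doc in qna_documents:
        context = doc.get("context") or {}
        for prompt in context.get("prompts") or []:
            target_ids.add(prompt["qnaId"])

    updated = []
    for doc in qna_documents:
        doc = copy.deepcopy(doc)  # leave the caller's data untouched
        if doc.get("id") in target_ids:
            context = doc.get("context") or {}
            context["isContextOnly"] = True
            doc["context"] = context
        updated.append(doc)
    return updated
```

Top-level questions that are never referenced as a prompt keep their original setting, so they remain reachable without context.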

Read Sentinel-2 L1C view angles from rasterio

I am trying to read the view angles from a Sentinel-2 image (L1C SAFE compact format) in order to run an atmospheric correction algorithm. I can get those values by parsing the file MTD_TL.xml, but I am not able to get them through rasterio.
I have tried to access the data using the xml:SENTINEL2 and xml:VRT metadata domains, but I can only access the values from the file MTD_MSIL1C.xml (the main metadata file).
The whole point of using rasterio is being able to use GDAL's virtual file system, as the images will be read from S3 buckets. Any alternative for easily reading MTD_TL.xml through the virtual file system would also be valid (and really appreciated).
Thank you!!
Answering my own question.
I could not find how to get the values I require, but according to https://gdal.org/user/virtual_file_systems.html the function VSIFOpenL may be used for opening the file. After that, manual parsing will do the trick :)
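A rough sketch of that approach, using GDAL's Python bindings for the read and the standard library for the parsing. The element and attribute names (Mean_Viewing_Incidence_Angle, bandId, ZENITH_ANGLE, AZIMUTH_ANGLE) are assumptions about the MTD_TL.xml layout and should be checked against a real tile; the per-detector angle grids are not handled here.

```python
import xml.etree.ElementTree as ET


def read_vsi_bytes(path: str) -> bytes:
    """Read a whole file through GDAL's virtual file system (e.g. a /vsis3/ path)."""
    from osgeo import gdal  # imported lazily so the parser below works without GDAL

    handle = gdal.VSIFOpenL(path, "rb")
    if handle is None:
        raise FileNotFoundError(path)
    try:
        gdal.VSIFSeekL(handle, 0, 2)  # whence=2: seek relative to the end
        size = gdal.VSIFTellL(handle)
        gdal.VSIFSeekL(handle, 0, 0)  # back to the start before reading
        return bytes(gdal.VSIFReadL(1, size, handle))
    finally:
        gdal.VSIFCloseL(handle)


def mean_view_angles(xml_bytes: bytes) -> dict:
    """Map band id -> (zenith, azimuth) of the mean viewing incidence angle."""
    root = ET.fromstring(xml_bytes)
    angles = {}
    for node in root.iter("Mean_Viewing_Incidence_Angle"):
        band = int(node.attrib["bandId"])
        zenith = float(node.findtext("ZENITH_ANGLE"))
        azimuth = float(node.findtext("AZIMUTH_ANGLE"))
        angles[band] = (zenith, azimuth)
    return angles
```

The two functions are kept separate so that the parsing can be tested on a local copy of MTD_TL.xml before any S3 access is involved.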
P.S. I must read the documentation more slowly.

Upload image diff using CTest and CDash

For running automated tests in a C++ application, I would like the application to dump an image and compare it against a baseline image. I saw several examples of this on various CDash dashboards, e.g. this one (link might not be valid for long).
https://open.cdash.org/testDetails.php?test=660365465&build=5407474
My google-fu has failed me on this one, what is the correct way to get this functionality?
The easiest way to attach ordinary files to test results is by listing those files in either the ATTACHED_FILES or ATTACHED_FILES_ON_FAIL test properties. This is not the mechanism being used here though.
According to this mailing list post, you can output special contents like that shown below to the test's stdout and it results in the named files being uploaded. The sample CDash results page you linked to follows a similar pattern as the example from the mailing list, which I've reproduced here for reference (I've made one small correction to change DifferenceImage to DifferenceImage2):
<DartMeasurement name="BaselineImage" type="text/string">Standard</DartMeasurement>
<DartMeasurementFile name="TestImage" type="image/png">C:/Users/.../Testing/Temporary/BoxWidget.png</DartMeasurementFile>
<DartMeasurementFile name="DifferenceImage2" type="image/png">C:/Users/.../Testing/Temporary/BoxWidget.diff.png</DartMeasurementFile>
<DartMeasurementFile name="ValidImage" type="image/png">C:/Users/.../VTKData/Baseline/Widgets/BoxWidget.png</DartMeasurementFile>
I've checked through the CTest source code and it scans the test output looking for <DartMeasurement> and <DartMeasurementFile> tags here and here. These are uploaded as discrete measurement items to CDash, which also looks for these particular names and presents them specially as in the example CDash links in the question.
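To make the mechanism concrete, here is a small sketch of a helper that prints those lines. It is written in Python purely for illustration (a C++ test would write the same text to stdout); the function names and the example paths in the usage note are made up, while the measurement names are the ones from the example above.

```python
def measurement(name: str, value: str, kind: str = "text/string") -> str:
    """Format an inline <DartMeasurement> line."""
    return f'<DartMeasurement name="{name}" type="{kind}">{value}</DartMeasurement>'


def measurement_file(name: str, path: str, mime: str = "image/png") -> str:
    """Format a <DartMeasurementFile> line; the path is written verbatim."""
    return f'<DartMeasurementFile name="{name}" type="{mime}">{path}</DartMeasurementFile>'


def image_diff_report(test_image: str, diff_image: str, baseline_image: str) -> list:
    """Build the four lines used in the example above, in the same order."""
    return [
        measurement("BaselineImage", "Standard"),
        measurement_file("TestImage", test_image),
        measurement_file("DifferenceImage2", diff_image),
        measurement_file("ValidImage", baseline_image),
    ]
```

A failing comparison would then do something like print("\n".join(image_diff_report("out/actual.png", "out/actual.diff.png", "baseline/expected.png"))) so that the tags end up in the test output that CTest scans.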

Replace words/phrases in existing PDF or docx with other words

I am trying to make a dynamic PDF generator as a .NET Core API. I want to take an existing PDF or .docx file and edit it so that the current name (John Doe) is replaced with a placeholder such as #NAME_PLACEHOLDER.
I then want to transform #NAME_PLACEHOLDER -> John Doe (or whatever is in the KeyValuePair or Dictionary<string, string>).
I am running this in a Docker environment, so I can easily execute commands, and I am willing to do that as well.
So far I have tried a few things:
1) pdf2htmlEX
Executes as pdf2htmlEX file.pdf
Does the job pretty well
Can be converted back to PDF using Google Chrome headless or similar
Problem: Only the characters used in the PDF can be used to replace. So if I only use A, B, C as characters, it will make D into Times New Roman (or default font)
2) LibreOffice ODT to PDF
This was pretty nice, because I could simply unzip the .odt file, open content.xml, search and replace, then save it as an .odt file again
Could be converted into PDF rather easily using soffice --convert-to pdf
LibreOffice is quite nice
Problem 1: Microsoft Word -> Save as ODT tends to break the formatting, so we have to use LibreOffice to go and change it back again
Problem 2: We don't want to move away from Microsoft's Office suite
3) HTML to PDF using Chrome Headless
What you see is what you get
By far the best option, if we're all developers aaand have unlimited time
Problem 1: Only our developers can make changes, since our marketing department do not know HTML
Problem 2: Our existing PDFs would have to be rewritten in HTML
As you can see, I have tried a bunch of things. None of them, except Chrome Headless, has lived up to my expectations. What I really like about #3 is what you see is what you get. I can make the whole thing in HTML, press CTRL+P and see what it looks like as a finished PDF, basically.
I am looking for a better solution, though. It can be paid. It can be free. All I need is to change out words/phrases with other words dynamically, which apparently seems like a tough thing to do.
Thanks for clearly specifying what you've already found. It helps a lot in providing a succinct answer.
The conversion is always tricky - I'm sure you know Word has trouble displaying/editing some Word documents itself.
I have experience regarding point #2 "LibreOffice ODT to PDF" and can suggest a few things to test:
Don't use Microsoft to do the docx->odt conversion. It's not good as you know. Use LibreOffice itself to do this step. The rest of your process remains the same.
For some documents, LibreOffice does doc->odt much better. So, you can instead work with DOC format and get a better result without any other changes.
You won't be able to remove the devs from the process, but you can certainly reduce their role allowing your business/marketing teams to have more direct input simply by:
get the starting point document to the devs to run through the conversion process. The devs can "clean up" the document to make it convert nicely.
make this version of the document the "official" starting point. The business or technical teams can load it, adjust it, and put it back into the process.
if possible, expose a test-platform to the business teams so they can download, adjust, upload and render to PDF. This cycle means they will be able to achieve more and if they're good, do impressive stuff without any dev input.
the above steps simply mean: don't expect perfect conversion of arbitrarily complex documents. Starting from a working baseline (even a complex one) is great.
Some of that might show you that your #2 is actually going to get the best overall results.
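For reference, the search-and-replace step of #2 (unzip the .odt, edit content.xml, zip it again) can be sketched as below. It is in Python only for illustration, and the same idea carries over to .NET. The function name is hypothetical, and the sketch assumes each placeholder appears as one unbroken string in content.xml, which is not guaranteed if the editor split it across formatting spans.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_odt_placeholders(odt_bytes: bytes, values: dict) -> bytes:
    """Return a new .odt in which each placeholder in content.xml is replaced."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as src, zipfile.ZipFile(out, "w") as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "content.xml":
                text = data.decode("utf-8")
                for placeholder, value in values.items():
                    # Escape &, < and > so the replacement cannot break the XML.
                    text = text.replace(placeholder, escape(value))
                data = text.encode("utf-8")
            # Reusing the original ZipInfo keeps the entry order and the
            # per-entry compression setting of the source archive.
            dst.writestr(item, data)
    return out.getvalue()
```

The resulting bytes can be written to a new .odt file and converted with soffice --convert-to pdf as before.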
I hope that helps.