Hiding text but not matplotlib plots in IPython notebooks - matplotlib

Sometimes I have a long notebook (essentially a lab-notebook) with lots of text, headings, plots etc. What I'd like is to be able to filter out all the text and just show the plots, so that I can quickly get an overview of what's in the notebook or find that one plot I want but I can't remember exactly where I put it. There's enough text in the notebooks that it takes a long while to scroll through it all. I'm aware that it's possible with an extension to hide input cells, which helps somewhat, but often there's a lot of text in the outputs too. The matplotlib plots are typically 'inline', so that they are just embedded pngs. Thus it should be sufficient to just hide text while preserving images.
I've looked through the extension index but haven't found anything appropriate. I'm guessing I could achieve something like this using an nbconvert template, or some javascript, but perhaps someone has a good way already.

Depending on the type of text output, you could hide the output_text class via your custom.js. To this end, add the following lines to your custom.js:
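define([
    'base/js/namespace',
    'base/js/events'
],
function(IPython, events) {
    events.on("app_initialized.NotebookApp",
        function () {
            // Add a View menu entry whose link calls text_toggle() when clicked
            $("#view_menu").append("<li id=\"toggle_toolbar\" title=\"Show/Hide text output\"><a href=\"#\" onclick=\"text_toggle(); return false;\">Toggle Text Output</a></li>");
        }
    );
}
);
// Global state and handler, so that the menu link above can reach them
var text_show = true;
function text_toggle() {
    if (text_show) {
        $('div.output_text').hide();
    } else {
        $('div.output_text').show();
    }
    text_show = !text_show;
}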
This adds a menu entry in the view menu to toggle the visibility of the output_text class.
Of course, if you have some text in markdown cells, these are not hidden. If required, it should be straightforward to adapt the above code to hide the input cells (class input), the markdown cells (class text_cell), etc.

Related

Tabulator - formatting print and PDF output

I am a relatively new user of Tabulator so please forgive me if I am asking anything that, perhaps, should be obvious.
I have a Tabulator report that I am able to print and create as a PDF, but the report's formatting (as shown on the screen) is not used in either output.
For printing I have used printAsHtml and printStyled=true, but this doesn't produce a printout that matches what is on the screen. My formatted number fields (with comma separators) show correctly, but the number columns, which should be right-aligned, all appear left-aligned.
I am also using Tree View where the tree rows are coloured differently to the main table, but when I print the report with a tree open it colours the whole table with the tree colours and not just the tree.
For the PDF none of the Tabulator formatting is being used. I've looked for anything similar to the printStyled option, but I can't see anything. I've also looked at the autoTable option, but I am struggling to find what to use.
I want to format the print and PDF outputs so that they look as close to the screen representation as possible.
Is there anywhere I could look that would provide examples of how to achieve the above? The Tabulator documentation is very good, but the provided examples don't appear to explain what I am trying to do.
Perhaps there are CSS classes that I am missing or even misusing? I have tried including .tabulator-print-table in my CSS, but I am probably not using it correctly. I also couldn't find anything equivalent for producing PDFs. Some examples would help immensely.
Thank you in advance for any advice or assistance.
Formatting is deliberately not included in these. Below I will outline why:
Downloaders
Downloaded files do not contain formatted data, only the raw data. This is because many of the formatters create visual elements (progress bar, star formatter, etc.) that cannot be replicated sensibly in downloaded files.
If you want to change the format of data in the download, you will need to use an accessor; the accessorDownload option is the one you want to use in this case. Accessors transform the data as it leaves the table.
For instance we could create an accessor that prepended "Mr " to the front of every name in a column:
var mrAccessor = function(value, data, type, params, column, row){
    return "Mr " + value;
};
Assign it to a column's definition:
{title:"Name", field:"name", accessorDownload:mrAccessor}
Printing
Printing also does not include the formatters. This is because when you print a Tabulator table, the whole table is actually rebuilt as a standard HTML table, which allows the printer to work out how to lay out everything across multiple pages with column headers etc. The downside of this is that it is only loosely styled like a Tabulator table, so formatted contents generated inside Tabulator cells will likely break when added to a normal td element.
For this reason there is also an accessorPrint option that works in the same way as the download accessor, but for printing.
If you want to use the same accessor for both occasions, you can assign the function once to the accessor option and it will be applied in both instances.
Check out the Accessor Documentation for full details.

Looking for software or API that will give me co-ordinates of text in a pdf

Simple question I hope - I have a pdf and want to detect the co-ordinates of specific word(s) or placeholder text. I then intend to use itextsharp to stamp a replacement bit of text on top at the co-ordinates found.
Can anyone recommend anything please?
Thanks
As answered in the comments, one could use iText to perform such a task. Maybe there are some better solutions, however, I doubt it. The cause of the mentioned issue, i.e. "[itextsharp] sometimes give co-ords of the start of the sentence the search text is in", is that sometimes glyphs are so close, that their boxes overlap, hence I don't see how it could be handled as you want.
So you can do the following:
Extend the LocationTextExtractionStrategy class and override eventOccurred, for example, as follows:
@Override
public void eventOccurred(IEventData data, EventType type) {
    if (type.equals(EventType.RENDER_TEXT)) {
        TextRenderInfo renderInfo = (TextRenderInfo) data;
        // Obtain all the necessary information from renderInfo, for example
        LineSegment segment = renderInfo.getBaseline();
        // ...
    }
}
Pass an instance of such an extended class to PdfTextExtractor.getTextFromPage as follows:
PdfTextExtractor.getTextFromPage(pdfDocument.getPage(1), new ExtendedLocationTextExtractionStrategy());
Whenever text is encountered during extraction, the event will be triggered.
There are some difficulties in such a solution, of course, because the text you want to find and write over could be present in the PDF not as "Text", but as "T", "ex", "t", or even "t", "x", "e", "T". However, since you use iText, you may want to harness the advantages of one of its products, pdfSweep. This product aims to completely remove unnecessary content from the PDF, with such content being specified either as locations (which are what you want to obtain, so that is not an option) or as regexes.
This is how to create such a regex strategy (to find all "Dolor" and "dolor" instances in the document and completely remove them from all the streams, so that they are neither visible in a PDF viewer nor found in the underlying PDF objects):
RegexBasedCleanupStrategy strategy = new RegexBasedCleanupStrategy("(D|d)olor").setRedactionColor(ColorConstants.GREEN);
This is how to use it:
PdfAutoSweep autoSweep = new PdfAutoSweep(strategy);
autoSweep.cleanUp(pdf); // a PdfDocument instance
And this is how to write some text on the location, at which the unnecessary text was present:
for (IPdfTextLocation location : strategy.getResultantLocations()) {
    Rectangle rect = location.getRectangle();
    // do something, for example, write some text
}
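The difficulty mentioned above, that the searched word may be stored as separate fragments such as "T", "ex", "t", can be worked around by joining the fragments and mapping a match back to the fragments it covers. What follows is only a minimal sketch in plain Java, not iText API: FragmentSearch, Fragment and findBounds are hypothetical names, the x coordinates are assumed to have been collected from each fragment's baseline, and the fragments are assumed to be already sorted in reading order on a single line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of iText: locates a word spread over several text fragments. */
class FragmentSearch {

    /** One rendered text fragment with the horizontal extent of its baseline (assumed input). */
    static class Fragment {
        final String text;
        final float startX;
        final float endX;

        Fragment(String text, float startX, float endX) {
            this.text = text;
            this.startX = startX;
            this.endX = endX;
        }
    }

    /**
     * Returns {startX, endX} spanned by the fragments that cover the first
     * occurrence of needle, or null if it does not occur.
     * The result is only as precise as the fragments: if a fragment holds more
     * text than the match, its whole extent is included.
     */
    static float[] findBounds(List<Fragment> fragments, String needle) {
        StringBuilder joined = new StringBuilder();
        List<Integer> offsets = new ArrayList<>(); // offset of each fragment in the joined text
        for (Fragment f : fragments) {
            offsets.add(joined.length());
            joined.append(f.text);
        }
        int from = joined.indexOf(needle);
        if (needle.isEmpty() || from < 0) {
            return null;
        }
        int to = from + needle.length(); // exclusive end of the match

        float startX = Float.NaN;
        float endX = Float.NaN;
        for (int i = 0; i < fragments.size(); i++) {
            Fragment f = fragments.get(i);
            int fragStart = offsets.get(i);
            int fragEnd = fragStart + f.text.length();
            if (fragEnd > from && fragStart < to) { // fragment overlaps the match
                if (Float.isNaN(startX)) {
                    startX = f.startX;
                }
                endX = f.endX;
            }
        }
        return new float[] {startX, endX};
    }
}
```

In an eventOccurred override, each RENDER_TEXT event would contribute one Fragment; the returned pair then gives the horizontal span over which a replacement could be stamped.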

Set histogram ticks/label using Syntax

Let me preface this by saying that I am a programmer by trade, but not very familiar with SPSS.
I am helping a friend set up some histogram plots using SPSS Syntax language. Using the Chart Builder, we have arrived at the code below:
GGRAPH
/GRAPHDATASET NAME="graphdataset" VARIABLES=OurVariable MISSING=LISTWISE REPORTMISSING=NO
/GRAPHSPEC SOURCE=INLINE
TEMPLATE=[
"C:\some\path\greenHistogram.sgt"].
BEGIN GPL
SOURCE: s=userSource(id("graphdataset"))
DATA: OurVariable=col(source(s), name("OurVariable"))
GUIDE: axis(dim(1), label("OurVariable"))
GUIDE: axis(dim(2), label("Frequency"))
GUIDE: text.title(label("Bla bla",
"bla"))
ELEMENT: interval(position(summary.count(bin.rect(OurVariable, binStart(0.5)))),
shape.interior(shape.square))
END GPL.
As you can see, she would like to make the histogram columns green. We could not achieve that using the Chart Builder, but we could easily make a template via the Chart Editor window and apply that. This seems like a very sensible approach, as she has many charts she wants green.
She would also like to customize the y-axis labels (number of decimal places, tick "major increment" etc.). This can also be achieved using the Chart Editor and saving a template. However, this is a much more individualized edit, and making a custom template for each and every plot seems cumbersome. Is it possible to adjust these things directly in the Syntax-script which generates the plots?
In many other places there is a nice Paste button which generates the necessary code, but I could not find one in the Chart Editor.

Export MFC CView into vectorial format

With my MFC application, I am able to print my CDocument on screen using the CView class.
Basically, I use the CDC class to write text and draw polygons on screen to provide a view representation of my document.
Now let's say I would like to use that output view in Microsoft Word.
From a user's point of view, and without any more developer work, I can:
copy-paste my drawing into Word: this produces a raster BMP image which I am able to paste in Word
print my drawing and use a PDF exporter: this produces a vectorial PDF file which is light and zoomable, but not easy to reuse in Word.
These two effortless solutions are great because I can keep the exact layout of my view, but each has a drawback (raster output or an awkward format).
Another way to solve my problem would be to write SVG or VML, but I would not get the same layout and this would require a lot of work.
Is there a library to do the same kind of PDF export / print mechanism into a standard format ?
What would you suggest ? Thanks a lot.
To draw your view into an Enhanced Metafile, first read the documentation at MSDN:
http://msdn.microsoft.com/en-us/library/427wezx1%28v=VS.80%29.aspx
Here is an example, how this works:
CMetaFileDC MFDC;
CRect rect(0,0,width,height);
MFDC.CreateEnhanced(NULL,NULL,rect,NULL);
MFDC.SetBkMode(TRANSPARENT);
MFDC.SetMapMode(MM_HIMETRIC);
CDC tempDC;
tempDC.CreateCompatibleDC(&MFDC);
MFDC.SetAttribDC(tempDC.m_hDC);
// now you draw into the DC like it was your original view
HENHMETAFILE hEnhMetaFile = MFDC.CloseEnhanced();
HENHMETAFILE hEMF = NULL;
hEMF = CopyEnhMetaFile(hEnhMetaFile,"C:\\Temp\\Test.emf");
DeleteEnhMetaFile(hEMF);
DeleteEnhMetaFile(hEnhMetaFile);

Apache POI: Partial Cell fonts

If I crack open MS Excel (I assume) or LibreOffice Calc (tested), I can type text into a cell and change the font of parts of that text, for example, in a single cell:
This text is bold and this text is italicized
Again, let me reiterate, that this string could exist in the shown format in one cell.
Can this level of customization be achieved with Apache POI? Searching only seems to show how to apply a font to an entire cell.
Thanks
===UPDATE===
As suggested below, I ended up going with the HSSFRichTextString (as I'm working with HSSF). However, after applying fonts (I tried bold and underline), my text would remain unchanged. This is what I attempted. To put things in context, I am working on something sports-related, in which it is common to display a match up in the form "awayteam"#"hometeam", and depending on certain external conditions, I would like to make one or the other bold. My code looks something like this:
String away = "foo";
String home = "bar";
String bolden = "foo";
HSSFRichTextString val = new HSSFRichTextString(away+"#"+home);
if(bolden.equals(home)) {
val.applyFont(val.getString().indexOf("#") + 1, val.length(), Font.U_SINGLE);
} else if(bolden.equals(away)) {
val.applyFont(0, val.getString().indexOf("#"), Font.U_SINGLE);
}
gameHeaderRow.createCell(g + 1).setCellValue(val);
As you can see, this is a snippet of code from a more complicated function than is displayed, but the brunt of this is actual code. As you can see, I'm doing val.applyFont to part of a string, and then setting a cell value with the string. So I'm not entirely sure what I did wrong there. Any advice is appreciated.
Thanks
KFJ
POI does support it, the class you're looking for is RichTextString. If your cell is a text one, you can get a RichTextString for it, then apply fonts to different parts of it to make different parts of the text look different.
You would be out of luck if working with SXSSFWorkbook, as it does not support such formatting.
Check it here.
http://apache-poi.1045710.n5.nabble.com/RichTextString-isn-t-working-for-SXSSFWorkbook-td5711695.html
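The problem is in this line:
val.applyFont(0, val.getString().indexOf("#"), Font.U_SINGLE);
You should not pass Font.U_SINGLE to applyFont: a bare number in that position is interpreted as a font index, not as an underline style. Instead, create a font from the workbook with createFont() (HSSFFont has no public constructor), call setUnderline(Font.U_SINGLE) on it, and pass the font itself to applyFont.
Example, where workbook stands for your HSSFWorkbook instance:
HSSFFont f1 = workbook.createFont();
f1.setUnderline(Font.U_SINGLE);
val.applyFont(0, val.getString().indexOf("#"), f1);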
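The index arithmetic for the two sides of the match-up can be separated from the font handling. The following is a minimal sketch in plain Java with no POI dependency; MatchupRange and rangeOf are hypothetical names introduced only for illustration. It returns a start index and an exclusive end index, which is the convention applyFont uses for its first two arguments.

```java
/** Hypothetical helper: finds which part of an "away#home" label should receive the font. */
class MatchupRange {

    /**
     * Returns {start, end} (end exclusive) of team inside label, or null when
     * the label has no separator or team matches neither side.
     */
    static int[] rangeOf(String label, char separator, String team) {
        int sep = label.indexOf(separator);
        if (sep < 0) {
            return null; // not a match-up label
        }
        if (label.substring(0, sep).equals(team)) {
            return new int[] {0, sep}; // away side: everything before the separator
        }
        if (label.substring(sep + 1).equals(team)) {
            return new int[] {sep + 1, label.length()}; // home side: everything after it
        }
        return null;
    }
}
```

With the variables from the question, the result of MatchupRange.rangeOf(val.getString(), '#', bolden), when not null, would supply the two index arguments of val.applyFont, followed by the font object.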