I'm using kable to produce tables for an Rmarkdown PDF document. I want to highlight certain cells based on their value, and every other row has a stripe. The image below shows this, as well as the issue I'm having (taken from my Adobe PDF reader). As you can see, the highlight color in the middle row (for Chinstrap penguins) is not quite even with the other two. This is a small issue but is enough to make the table look a bit janky, so I'm hoping someone can help me find a fix, or at least explain the cause.
What's odd is that this issue seems to depend on which PDF reader I view the document in. The initial image came from Adobe, but when I open the file in SumatraPDF (the default viewer that my RStudio knits to) the highlighting is consistent for all rows (see next image). This behavior doesn't seem to depend on zoom level in the PDF viewer, but could be related to another setting I'm not aware of.
The issue itself seems to be caused by the light blue striping rather than the highlight itself, because when I remove the striping all three highlights are even in both PDF viewers. This behavior can also be demonstrated without the extra highlighting by instead adding cell borders to the table using column_spec(c(2:3), width = "2.1cm", border_right = T) as shown in the final image. In two of three rows, the 2nd column's right border is covered up by the striping (this also happens with the built-in striping feature in kableExtra). One interesting thing to note here is that this does seem to depend on PDF viewer zoom level. The image below came from Adobe at my default view (75.5), but when I change the zoom to between 100 and 200 all cell borders are correct, and then disappear again after that. My code is below, thanks for your help!
Reproducible Example
---
title: "Kable Highlighting Issue"
output: pdf_document
header-includes:
- \usepackage{booktabs}
- \usepackage{longtable}
- \usepackage{array}
- \usepackage{multirow}
- \usepackage{wrapfig}
- \usepackage{float}
- \usepackage{colortbl}
- \usepackage{pdflscape}
- \usepackage{tabu}
- \usepackage{threeparttable}
- \usepackage{threeparttablex}
- \usepackage[normalem]{ulem}
- \usepackage{makecell}
- \usepackage{titling}
- \usepackage{graphicx}
---
knitr::opts_chunk$set(echo = TRUE)
library(palmerpenguins)
library(dplyr)
library(stringr)
library(knitr)
library(kableExtra)
#create table data
table_data <- penguins %>%
  group_by(species) %>%
  summarise(bill_length_mm= round(mean(bill_length_mm, na.rm = TRUE), digits=2),
            bill_depth_mm= round(mean(bill_depth_mm, na.rm = TRUE), digits=2)) %>%
  mutate(across(everything(), as.character)) %>%
  rename(length= bill_length_mm, depth= bill_depth_mm) #_ is an escaped character in Latex
#add general table striping
#the default Kable striping feature will overwrite the highlighting, so this is an alternative way to add that
#perhaps there's something simpler?
for (row in c(1,3)){
  for (col in c(1,3)){
    table_data[row,col] <- cell_spec(table_data[row,col], "latex", background = "#ddf1f7")
  }
}
#add table highlighting if bill length > 40mm
for (p in table_data$species){
  row <- which(table_data[1]==p)
  if (table_data[row, 2] > 40){
    table_data[row, 2] <- cell_spec(table_data[row, 2], "latex", background = "#00cc66")
  } else {
    table_data[row, 2] <- cell_spec(table_data[row, 2], "latex", background = "#ff704d")
  }
}
#create table
penguin_table <- kable(table_data, format = "latex", booktabs = TRUE, linesep="", escape = FALSE, align=c("l","c","c")) %>%
  kable_styling(latex_options = c("scale_down", "hold_position"), position = "center", font_size =12) %>%
  row_spec(0, bold = TRUE, hline_after = TRUE) %>%
  column_spec(1, width = "5cm") %>%
  column_spec(c(2:3), width = "2.1cm")
penguin_table
Session Info
R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] kableExtra_1.3.4 knitr_1.33 stringr_1.4.0
[4] dplyr_1.0.4 palmerpenguins_0.1.0
loaded via a namespace (and not attached):
[1] pillar_1.4.7 compiler_4.0.3 tools_4.0.3 digest_0.6.27
[5] evaluate_0.14 lifecycle_1.0.0 tibble_3.0.4 viridisLite_0.3.0
[9] pkgconfig_2.0.3 rlang_0.4.10 DBI_1.1.0 cli_2.4.0
[13] rstudioapi_0.13 xfun_0.23 httr_1.4.2 xml2_1.3.2
[17] generics_0.1.0 vctrs_0.3.5 systemfonts_1.0.1 webshot_0.5.2
[21] tidyselect_1.1.0 svglite_2.0.0 glue_1.4.2 R6_2.5.0
[25] rmarkdown_2.8 purrr_0.3.4 magrittr_2.0.1 scales_1.1.1
[29] ellipsis_0.3.1 htmltools_0.5.0 assertthat_0.2.1 rvest_0.3.6
[33] colorspace_2.0-0 stringi_1.5.3 munsell_0.5.0 crayon_1.3.4
Rendering a pdf_document relies on generating intermediate LaTeX code, which is then processed by your LaTeX engine to build the document. For any issue with your PDF document, I recommend looking at the underlying LaTeX code (which you can keep by setting keep_tex: true as an output option).
In your case, table code is generated, with the following lines:
\midrule
\cellcolor[HTML]{ddf1f7}{Adelie} & \cellcolor[HTML]{ff704d}{38.79} & \cellcolor[HTML]{ddf1f7}{18.35}\\
Chinstrap & \cellcolor[HTML]{00cc66}{48.83} & 18.42\\
\cellcolor[HTML]{ddf1f7}{Gentoo} & \cellcolor[HTML]{00cc66}{47.5} & \cellcolor[HTML]{ddf1f7}{14.98}\\
\bottomrule
What causes the issue is that each \cellcolor leads to a colored box being drawn in the final PDF. Adobe Reader has some trouble rendering those boxes cleanly at various zoom levels, and you'll find that neighboring boxes compete a bit over which pixel gets which color. The 'white' row has no background color set, so the red/green box has no other box to compete with and gets a bit more space when rendered.
For further background on the latex issues causing this, you might want to look here
Since you're already manually adding cell_spec to your rows, why not explicitly set the background color of your even rows to white?
for (row in 1:nrow(table_data)){
  color = ifelse(row %%2 ==1, "#ddf1f7", "#ffffff")
  for (col in c(1,3)){
    table_data[row,col] <- cell_spec(table_data[row,col], "latex", background = color)
  }
}
This yields a nice, crisp table, where at least all space-competing issues are resolved the same way in every row.
I use Selenium to automate some GUI tests using the firefox web driver.
It occurred to me that it would also make sense to create screenshots during the test runs and use them in the manual.
The screens of my application are relatively static: no timestamps are visible, and so on. So my expectation is that if I take a screenshot of, let's say, the start page, and later navigate to the start page again, the two screenshots should be identical. Likewise, if I run the test twice, the screenshot of the start page should be identical between both runs.
I save the screenshots as PNG, I even process the screenshots (save without date) before saving them, so that the files should be really identical.
Nevertheless, if I compare the pictures with each other (e.g. subtract them from each other), there are minor differences between them (not visible to the naked eye), some faint lines at the border of tables, or around fonts.
My question:
1) Why are there differences at all?
2) What could be the easiest way to ensure that the screenshots are identical? (what kind of post processing could I do)
PS: I also tried to change the renderer from skia to windows to cairo, but although the differences are slightly different, it still doesn't solve the problem.
How do you save the images?
In Ruby I was using something like this:
def take_screenshot(image_id)
  scenario = Zpg::HooksHelper::FeatureHelper.scenario_name
  Dir.mkdir('output', 0o777) unless File.directory?('output')
  Dir.mkdir('output/screenshots', 0o777) unless File.directory?('output/screenshots')
  screenshot = "./output/screenshots/#{image_id}#{scenario.tr(' ', '_').gsub(/[^0-9A-Za-z_]/, '')}.png"
  if page.driver.browser.respond_to?(:save_screenshot)
    page.driver.browser.save_screenshot(screenshot)
  else
    save_screenshot(screenshot)
  end
  FileUtils.chmod(0o777, screenshot)
end
And I was checking the image diff
# return[hash]
def image_diff(imageid_1 = nil, _imageid_2 = nil)
  scenario = Zpg::HooksHelper::FeatureHelper.scenario_name
  if imageid_1.class != Integer
    image_1 = "./features/support/data/images/#{Zpg.brand}/#{imageid_1}.png"
    image_2 = "./output/screenshots/#{_imageid_2}#{scenario.tr(' ', '_').gsub(/[^0-9A-Za-z_]/, '')}.png"
  else
    image_1 = "./output/screenshots/#{imageid_1}#{scenario.tr(' ', '_').gsub(/[^0-9A-Za-z_]/, '')}.png"
    image_2 = "./output/screenshots/#{_imageid_2}#{scenario.tr(' ', '_').gsub(/[^0-9A-Za-z_]/, '')}.png"
  end
  images = [
    ChunkyPNG::Image.from_file(image_1),
    ChunkyPNG::Image.from_file(image_2)
  ]
  diff = []
  images.first.height.times do |y|
    images.first.row(y).each_with_index do |pixel, x|
      diff << [x, y] unless pixel == images.last[x, y]
    end
  end
  puts "pixels (total): #{images.first.pixels.length}"
  puts "pixels changed: #{diff.length}"
  puts "pixels changed (%): #{(diff.length.to_f / images.first.pixels.length) * 100}%"
  # init empty hash
  diff_hash = {}
  # return pixels changed number
  diff_hash[:pixels_changed] = diff.length
  # return pixels changed percentage
  diff_hash[:pixels_changed_percentage] = (diff.length.to_f / images.first.pixels.length) * 100
  # return diff hash
  diff_hash
end
You can get different results if the browser DOM is not 100% loaded when the screenshot is taken. Rather than requiring identical files, I would define a threshold for the difference and expect each image to fall within it.
There is a very good project for .NET at https://www.codeproject.com/Articles/374386/Simple-image-comparison-in-NET which you can port to any language you want.
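The threshold idea can be sketched roughly as follows. This is only an illustration under stated assumptions, written in Python rather than the Ruby used above: images are modelled as plain nested lists of RGB tuples (a real test would first load the PNG files with an imaging library), and the names `changed_fraction`, `images_match`, `channel_tolerance` and `max_changed_fraction`, as well as the default values, are hypothetical choices and not taken from any of the projects mentioned.

```python
def changed_fraction(img_a, img_b, channel_tolerance=0):
    """Return the fraction of pixels that differ by more than the tolerance.

    Each image is a list of rows; each row is a list of (r, g, b) tuples.
    A pixel counts as changed if any channel differs by more than
    channel_tolerance.
    """
    same_shape = len(img_a) == len(img_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(img_a, img_b)
    )
    if not same_shape:
        raise ValueError("images must have the same dimensions")

    total = 0
    changed = 0
    for row_a, row_b in zip(img_a, img_b):
        for px_a, px_b in zip(row_a, row_b):
            total += 1
            if any(abs(ca - cb) > channel_tolerance for ca, cb in zip(px_a, px_b)):
                changed += 1
    return changed / total if total else 0.0


def images_match(img_a, img_b, channel_tolerance=8, max_changed_fraction=0.01):
    """Treat two screenshots as equal if few enough pixels changed noticeably."""
    return changed_fraction(img_a, img_b, channel_tolerance) <= max_changed_fraction


if __name__ == "__main__":
    # Tiny 4x4 all-white "screenshots" standing in for real images.
    baseline = [[(255, 255, 255)] * 4 for _ in range(4)]
    faint = [row[:] for row in baseline]
    faint[0][0] = (250, 255, 255)  # faint anti-aliasing style difference
    print(images_match(baseline, faint))
```

The per-channel tolerance absorbs faint rendering differences such as those described in the question, while the fraction limit still flags screenshots in which a visible region has actually changed.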
I'm trying to create a flexdashboard with results from the R arules apriori function, to display association rules for specific items selected from the pull-down menu in the markdown dashboard. When I create the function outside the markdown environment, I can successfully feed a new product item to the apriori function, and the graph changes as expected when I change the item. When I replace the function variable with the reactive function name, I get an error message saying there are no rules: "Error: x contains 0 rules!"
Using the provided Groceries data, I want the reactive input to feed the selected value, either "whole milk" or "sugar", to apriori, output the rules for the specified item, and create a graph. I'm new to dashboards, so I don't know whether I need to use a function other than reactive.
Below is the markdown code in which I'm having trouble feeding a new variable into the apriori function.
---
title: "Grocery_Test"
output:
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source_code: embed
runtime: shiny
---
```{r global, include=FALSE}
library(plotly)
library(flexdashboard)
library(htmlwidgets)
library(htmltools)
library(knitr)
library(arules)
library(arulesViz)
library(igraph)
library(datasets)
data(Groceries)
```
Inputs {.sidebar}
-----------------------------------------------------------------------
```{r}
selectInput("Product","Product", c("whole milk","sugar"))
```
###Associated Product
```{r}
product <- reactive({input$product})
renderPlot({
#product<-"whole milk"
rules <- apriori (data=Groceries
,parameter=list (supp=0.001,conf = 0.15,maxlen=3)
,appearance = list(default="rhs",lhs=Product())
,control = list (verbose=F))
rules_conf <- sort (rules, by=c("confidence"), decreasing=TRUE) # 'high-confidence' rules.
redundant <- which (colSums (is.subset (rules_conf, rules_conf)) > 1) # get redundant rules in vector
rules_conf <- rules_conf[-redundant] # remove redundant rules
plot(rules_conf, method="graph",measure = "confidence", shading = "lift"
,control = list(engine="htmlwidget"))
})
```
Question: Why has the cluster dendrogram of text-mined data gone fuzzy/messy (see the link to the diagram below)?
Synopsis: I first harvested the original data, approximately 5500 e-scanned articles, from a Mongo database and saved it to disk as a JSON object (code not shown here; harvested using the CRAN mongolite package for R). What is shown here is the standard text processing (using the CRAN tm package) to clean out “the”, “and”, “ing”, “;”, “:”, etc. That led to the ensuing hierarchical clustering, which looks fuzzy/messy because some of the words in the JSON object were very long combinations of letters rather than real words that can be identified separately.
Calling two of the libraries
library("tm")
library ("SnowballC")
Creating a path to the data and a corpus of text
cname <- file.path("C:", "texts")
docs <- Corpus(DirSource(cname))
Processing the text
docs <- tm_map(docs, removePunctuation)
docs <- tm_map(docs, removeNumbers)
docs <- tm_map(docs, removeWords, stopwords("english"))
docs <- tm_map(docs, stripWhitespace)
docs <- tm_map(docs, stemDocument)
tdm <- TermDocumentMatrix(docs)
Thirdly the clustering via dendrogram
d<-dist(tdm,method = "euclidean")
hc<-hclust(d, method="ward.D2")
library("rafalib")
myplclust(hc, labels=hc$labels)
Link to the image:
cluster/dendrogram/text mining
The answer is to cut the less frequent words as well, going beyond the standard stopwords that I had already removed (see how I cut them in the code below):
mystopwords <- findFreqTerms(tdm, 1, 20)
mystpwrds <- paste(mystopwords, collapse = "|")
tdm <- tdm[tdm$dimnames$Terms[!grepl(mystpwrds,tdm$dimnames$Terms)],]
The whole picture and code is published here:
http://rpubs.com/antonyama/180574
I want to create half a pie chart to use in an existing Builder script and I'm not sure how to go about it. (It's actually to implement a speedometer.)
Does anyone have any suggestions?
I realise I'll need to dip into some python code.
This old posting in Google Groups https://groups.google.com/forum/#!topic/psychopy-users/JcnS7ZtuVlM gives some sample code utilising pylab but they couldn't get it to work when it was dropped into Builder.
Thanks,
I've worked out how to do this now so here's my method for the record.
# create 1st sector - use the **ori** parameter to control what bearing it
# starts on (0 is vertical)
rad1 = visual.RadialStim( win=win, name='rad1', color=[1,-1,-1],
    angularCycles = 0, radialCycles = 0, radialPhase = 0.5, colorSpace = 'rgb',
    ori= -90.0, pos=(0.5, -0.3), size=(0.3,0.3), visibleWedge=(0.0, 135.0) )
rad1.draw()
#now draw another sector next to it by using **ori** again to line them up
rad2 = visual.RadialStim( win=win, name='rad2', color=[-1,1,-1],
    angularCycles = 0, radialCycles = 0, radialPhase = 0.5, colorSpace = 'rgb',
    ori= 45.0, pos=(0.5, -0.3), size=(0.3,0.3), visibleWedge=(0.0, 45.0) )
rad2.draw()
This works fine in Builder if you drop it into a code component, given that Builder uses a window called win.
Note that this uses the default units for size (-1 to 1), so it will look different on different screen sizes. You may want to change to cm (and calibrate your monitor) if you want it to be round.
Thanks to Jon P for his suggestion (via Google Groups) to use RadialStim.
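For a speedometer whose reading changes, the two ori and visibleWedge values above can be computed instead of hard-coded. The following is a minimal sketch, assuming the pattern of the two sectors above holds in general (the second sector's ori is the first sector's ori plus the width of the first wedge). The function name `speedometer_sectors` and its parameters are hypothetical, and I have not checked how RadialStim renders a zero-width wedge at the extremes.

```python
def speedometer_sectors(fraction, start_bearing=-90.0, sweep=180.0):
    """Split a dial into a filled and an unfilled sector.

    fraction      -- dial reading between 0.0 (empty) and 1.0 (full)
    start_bearing -- ori of the first sector in degrees (0 is vertical)
    sweep         -- total angle covered by the dial (180 for a half pie)

    Returns two (ori, visibleWedge) pairs, one per sector.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    filled = fraction * sweep
    first = (start_bearing, (0.0, filled))
    # The second sector starts where the first one ends.
    second = (start_bearing + filled, (0.0, sweep - filled))
    return first, second


if __name__ == "__main__":
    # A reading of 0.75 reproduces the values used for rad1 and rad2 above.
    print(speedometer_sectors(0.75))
```

Each returned pair can then be passed as the ori and visibleWedge arguments when creating the two sectors.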
I have a figure that I would like to resize and afterwards print as a PDF.
Using something like
set(hFig, 'PaperUnits', 'centimeters')
set(hFig, 'PaperSize', [x_B x_H]);
works as long as I do not resize the figure too drastically. If I reduce the height, at some point the xlabel moves out of the figure. I have searched a lot but only found a solution that manually resizes the underlying axes object:
scalefactor = 0.96;
movefactor = 0.82;
hAx = get(gcf,'CurrentAxes');
g = get(hAx,'Position');
% 1=left, 2=bottom, 3=width, 4=height
g(2) = g(2) + (1-movefactor)/2*g(4);
g(4) = scalefactor*g(4);
set(hAx,'Position',g);
I do not like this approach since I have to manually adjust the two factors.
Before printing I set the 'interpreter' to 'latex' of all text-objects (if that is of concern).
Printing is achieved using
print(hFig, '-dpdf', '-loose', 'test.pdf');
I hoped to loosen the bounding box by using '-loose'. Any help is highly appreciated!
edit:
It seems that the interpreter (none, tex, latex) really does play a role in this. I was inspired by this post (http://stackoverflow.com/questions/5150802/how-to-save-plot-into-pdf-without-large-margin-around) and came up with this solution:
tightInset = get(gca, 'TightInset');
position(1) = tightInset(1);
position(3) = 1 - tightInset(1) - tightInset(3);
if strcmpi(x_Interpreter,'latex')
    position(2) = tightInset(2)+ 1*tightInset(4);
    position(4) = 1 - tightInset(2) - 2*tightInset(4);
else
    position(2) = tightInset(2)+ 0*tightInset(4);
    position(4) = 1 - tightInset(2) - 1*tightInset(4);
end
set(gca, 'Position', position);
This may not solve your problem completely (it may just help clean up your code), but I found the fig code in the file exchange to be helpful: it lets you easily set the exact size of figures without bordering white space.