Scrapy : Parse text between breakpoints - scrapy

I've come across an HTML like this:
<span itemprop="description">
Colour: Blue
<br>
Fabric: Cotton Silk
<br>
Type Of Work: Printed
<br><br>
Product colour may slightly vary due to photographic lighting sources or your monitor settings.
</span>
I want to parse the text between the breakpoints and get them separately. The desired result is something like:
["Colour: Blue", "Fabric: Cotton Silk", "Product colour may slightly vary due to photographic lighting sources or your monitor settings."]
I've tried
response.xpath('//*[@itemprop="description"]/text()').extract()
but this gives me the whole text together in a single string.
How can I get the pieces separately, split around the <br> tag?

I tried your code and it looks like it is working. I made some adjustments to clean the data extracted through the re() method:
>>> sel.xpath('//span[@itemprop="description"]/text()').re(r"\s*(.+)\s*")
[u'Colour: Blue', u'Fabric: Cotton Silk', u'Type Of Work: Printed', u'Product colour may slightly vary due to photographic lighting sources or your monitor settings.']
Is that what you need?
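The cleanup step can be tried without Scrapy. The sketch below is an illustration only: the raw_nodes list is an assumed stand-in for what /text() returns for the span in the question (one fragment per piece of text between <br> tags, with surrounding newlines), and clean_nodes is a hypothetical helper that roughly mirrors what .re() does by applying the pattern to each node and flattening the captured groups.

```python
import re

# Assumed raw text nodes, roughly what /text() yields for the <span> in the
# question: each fragment keeps the newlines that surround it in the markup.
raw_nodes = [
    "\nColour: Blue\n",
    "\nFabric: Cotton Silk\n",
    "\nType Of Work: Printed\n",
    "\n",  # hypothetical whitespace-only node, to show that it is dropped
    "\nProduct colour may slightly vary due to photographic lighting sources or your monitor settings.\n",
]

PATTERN = re.compile(r"\s*(.+)\s*")


def clean_nodes(nodes):
    """Apply the pattern to every node and flatten the captured groups."""
    cleaned = []
    for node in nodes:
        cleaned.extend(PATTERN.findall(node))
    return cleaned


parts = clean_nodes(raw_nodes)
for part in parts:
    print(part)
```

Because "." does not match a newline, each captured group stops at the end of its line, and a node that contains only whitespace produces no match at all.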


Change marker shape in Splunk line chart

Though it's weirdly not in the UI, I can enable markers on a line chart with:
<option name="charting.chart.showMarkers">true</option>
However, if there are multiple series, it's nice to have different marker shapes (square, triangle, etc.) on each series. Is this possible to do on a Splunk line chart?
Marker display is binary: markers are either shown or they are not. What can be changed is the dash style of each line.
For example, if you have two series, antena3 and digi24, you can apply the following to make one series short-dotted and the other short-dashed:
<option name="charting.fieldDashStyles">{"antena3":"shortDot", "digi24":"shortDash"}</option>
Complete chart config reference: https://docs.splunk.com/Documentation/Splunk/latest/Viz/ChartConfigurationReference#General_chart_properties

Are Bootstrap's breakpoints for stacking divs vertically on different viewport sizes? [duplicate]

What is the difference among col-lg-* , col-md-* and col-sm-* in Twitter Bootstrap?
Updated 2020...
Bootstrap 5
In Bootstrap 5 (alpha) there is a new -xxl- size:
col-* - 0 (xs)
col-sm-* - 576px
col-md-* - 768px
col-lg-* - 992px
col-xl-* - 1200px
col-xxl-* - 1400px
Bootstrap 5 Grid Demo
Bootstrap 4
In Bootstrap 4 there is a new -xl- size, see this demo. Also the -xs- infix has been removed, so smallest columns are simply col-1, col-2.. col-12, etc..
col-* - 0 (xs)
col-sm-* - 576px
col-md-* - 768px
col-lg-* - 992px
col-xl-* - 1200px
Bootstrap 4 Grid Demo
Additionally, Bootstrap 4 includes new auto-layout columns. These also have responsive breakpoints (col, col-sm, col-md, etc..), but don't have defined % widths. Therefore, the auto-layout columns fill equal width across the row.
Bootstrap 3
The Bootstrap 3 grid comes in 4 tiers (or "breakpoints")...
Extra small (for smartphones .col-xs-*)
Small (for tablets .col-sm-*)
Medium (for laptops .col-md-*)
Large (for laptops/desktops .col-lg-*).
These grid sizes enable you to control grid behavior on different widths. The different tiers are controlled by CSS media queries.
So in Bootstrap's 12-column grid...
col-sm-3 is 3 of 12 columns wide (25%) on a typical small device width (≥ 768 pixels)
col-md-3 is 3 of 12 columns wide (25%) on a typical medium device width (≥ 992 pixels)
The smaller tier (xs, sm or md) also defines the size for larger screen widths. So, for the same size column on all tiers, just set the width for the smallest viewport...
<div class="col-lg-3 col-md-3 col-sm-3">..</div> is the same as,
<div class="col-sm-3">..</div>
Larger tiers are implied, because col-sm-3 means 3 units on sm-and-up unless specifically overridden by a larger tier that uses a different size.
xs(default) > overridden by sm > overridden by md > overridden by lg
Combine the classes to change column widths on different grid sizes. This creates a responsive layout.
<div class="col-md-3 col-sm-6">..</div>
The sm, md and lg grids will all "stack" vertically on screens/viewports less than 768 pixels. This is where the xs grid fits in. Columns that use the col-xs-* classes will not stack vertically, and continue to scale down on the smallest screens.
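The override rule can be sketched in a few lines. This is only a model of the behaviour described above, not Bootstrap's actual implementation (which is done with CSS media queries); resolve_span and BREAKPOINTS are hypothetical names, and the pixel values are the Bootstrap 3 tier widths quoted in this thread.

```python
# Bootstrap 3 tier minimum widths in px, as quoted in the answers here.
BREAKPOINTS = {"xs": 0, "sm": 768, "md": 992, "lg": 1200}


def resolve_span(class_attr, viewport_width):
    """Return how many of the 12 grid columns an element spans at this width.

    Mobile first: the largest tier whose minimum width is reached wins. When
    no tier applies, the element falls back to full width (12), i.e. it stacks.
    """
    best_tier_width = -1
    span = 12
    for cls in class_attr.split():
        parts = cls.split("-")
        # Only the col-<tier>-<n> form is modelled; anything else is ignored.
        if len(parts) != 3 or parts[0] != "col" or parts[1] not in BREAKPOINTS:
            continue
        if not parts[2].isdigit():
            continue
        min_width = BREAKPOINTS[parts[1]]
        if min_width <= viewport_width and min_width > best_tier_width:
            best_tier_width = min_width
            span = int(parts[2])
    return span


for width in (500, 800, 1000, 1300):
    print(width, resolve_span("col-md-3 col-sm-6", width))
```

For class="col-md-3 col-sm-6" this gives 12 (stacked) at 500px, 6 at 800px, and 3 at both 1000px and 1300px, since no lg class overrides the md value.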
Resize your browser using this demo and you'll see the grid scaling effects.
This article explains more about how the Bootstrap grid works.
The bootstrap docs do explain it, but it still took me a while to get it. It makes more sense when I explain it to myself in one of two ways:
If you think of the columns starting out horizontally, then you can choose when you want them to stack.
For example, if you start with columns:
A B C
You decide when should they stack to be like this:
A
B
C
If you choose col-lg, then the columns will stack when the width is < 1200px.
If you choose col-md, then the columns will stack when the width is < 992px.
If you choose col-sm, then the columns will stack when the width is < 768px.
If you choose col-xs, then the columns will never stack.
On the other hand, if you think of the columns starting out stacked, then you can choose at what point they become horizontal:
If you choose col-sm, then the columns will become horizontal when the width is >= 768px.
If you choose col-md, then the columns will become horizontal when the width is >= 992px.
If you choose col-lg, then the columns will become horizontal when the width is >= 1200px.
From Twitter Bootstrap documentation:
small grid (≥ 768px) = .col-sm-*,
medium grid (≥ 992px) = .col-md-*,
large grid (≥ 1200px) = .col-lg-*.
Let's un-complicate Bootstrap!
Notice how the col-sm occupies the 100% width (in other terms breaks into new line) below 576px but col doesn't. You can notice the current width at the top center in gif.
Here comes the code:
<div class="container">
<div class="row">
<div class="col">col</div>
<div class="col">col</div>
<div class="col">col</div>
</div>
<div class="row">
<div class="col-sm">col-sm</div>
<div class="col-sm">col-sm</div>
<div class="col-sm">col-sm</div>
</div>
</div>
Bootstrap by default aligns all the columns(col) in a single row with equal width. In this case three col will occupy 100%/3 width each, whatever the screen size. You can notice that in gif.
Now what if we want to render only one column per line i.e give 100% width to each column but for smaller screens only? Now comes the col-xx classes!
I used col-sm because I wanted to break the columns into separate lines below 576px. These 4 col-xx classes are provided by Bootstrap for different display devices like mobiles, tablets, laptops, large monitors etc.
So, col-sm would break below 576px, col-md would break below 768px, col-lg would break below 992px and col-xl would break below 1200px.
Note that there's no col-xs class in bootstrap 4.
That pretty much sums it up. You can go back to work.
But there's a bit more to it. Now come col-* and col-xx-* for customizing width.
Remember in the above example I mentioned that col or col-xx takes the equal width in a row. So if we want to give more width to a specific col we can do this.
Bootstrap row is divided into 12 parts, so in above example there were 3 col so each one takes 12/3 = 4 part. You can consider these parts as a way to measure width.
We could also write that in format col-* i.e. col-4 like this :
<div class="row">
<div class="col-4">col</div>
<div class="col-4">col</div>
<div class="col-4">col</div>
</div>
And it would've made no difference because by default bootstrap gives equal width to col (4 + 4 + 4 = 12).
But, what if we want to give 7 parts to 1st col, 3 parts to 2nd col and rest 2 parts (12-7-3 = 2) to 3rd col (7+3+2 so total is 12), we can simply do this:
<div class="row">
<div class="col-7">col-7</div>
<div class="col-3">col-3</div>
<div class="col-2">col-2</div>
</div>
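As a quick check of the arithmetic, the share of the row taken by each column is its number of parts divided by 12. The helper name span_to_percent below is made up for this sketch.

```python
GRID_COLUMNS = 12  # Bootstrap's grid divides each row into 12 parts


def span_to_percent(span):
    """Width of a column spanning `span` parts, as a percentage of the row."""
    return round(span / GRID_COLUMNS * 100, 2)


widths = [span_to_percent(s) for s in (7, 3, 2)]
print(widths)
```

So col-7, col-3 and col-2 come out at roughly 58.33%, 25% and 16.67% of the row.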
and you can customize the width of col-xx-* classes also.
<div class="row">
<div class="col-sm-7">col-sm-7</div>
<div class="col-sm-3">col-sm-3</div>
<div class="col-sm-2">col-sm-2</div>
</div>
How does it look in the action?
What if sum of col is more than 12? Then the col will shift/adjust to below line. Yes, there can be any number of columns for a row!
<div class="row">
<div class="col-12">col-12</div>
<div class="col-9">col-9</div>
<div class="col-6">col-6</div>
<div class="col-6">col-6</div>
</div>
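The wrapping in that example can be modelled as a simple greedy fill. This is a sketch of the observable behaviour only (in Bootstrap 4 the real mechanism is flexbox wrapping), and wrap_columns is a hypothetical helper.

```python
def wrap_columns(spans, grid_columns=12):
    """Greedily place column spans into visual lines of at most 12 units."""
    lines = []
    current, used = [], 0
    for span in spans:
        # A column that does not fit in the remaining space moves to a new line.
        if current and used + span > grid_columns:
            lines.append(current)
            current, used = [], 0
        current.append(span)
        used += span
    if current:
        lines.append(current)
    return lines


print(wrap_columns([12, 9, 6, 6]))
```

For the row above, col-12 fills the first line, col-9 sits alone on the second (the next col-6 would not fit beside it), and the two col-6 columns share the third.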
What if we want 3 columns in a row for large screens but split these columns into 2 rows for small screens?
<div class="row">
<div class="col-12 col-sm">col-12 col-sm TOP</div>
<div class="col col-sm">col col-sm</div>
<div class="col col-sm">col col-sm</div>
</div>
You can play around here: https://jsfiddle.net/JerryGoyal/6vqno0Lm/
I think the confusing aspect of this is that Bootstrap 3 is a mobile-first responsive system, and that part of the Bootstrap documentation fails to explain how this affects the col-xx-n hierarchy.
This makes you wonder what happens on smaller devices if you choose a value for larger devices and makes you wonder if there is a need to specify multiple values. (You don't)
I would attempt to clarify this by stating that...
Lower grain types (xs, sm) attempt to retain the layout appearance on smaller screens, while larger types (md, lg) display correctly only on larger screens and wrap columns on smaller devices.
The values quoted in previous examples refer to the threshold at which Bootstrap degrades the appearance to fit the available screen real estate.
What this means in practice is that if you make the columns col-xs-n then they will retain correct appearance even on very small screens, until the window drops to a size that is so restrictive that the page cannot be displayed correctly.
This should mean that devices that have a width of 768px or less should show your table as you designed it rather than in degraded (single or wrapped column form).
Obviously this still depends on the content of the columns and that's the whole point. If the page attempts to display multiple columns of large data side by side on a small screen, then the columns will naturally wrap in a horrible way if you did not account for it. Therefore, depending on the data within the columns, you can decide the point at which the layout is sacrificed to display the content adequately.
e.g. If your page contains three col-sm-n columns, Bootstrap would wrap the columns into rows when the page width drops below 768px.
This means that the data is still visible but will require vertical scrolling to view it. If you do not want your layout to degrade, choose xs (as long as your data can be adequately displayed on a lower resolution device in three columns)
If the horizontal position of the data is important then you should try to choose lower granularity values to retain the visual nature. If the position is less important but the page must be visible on all devices then a higher value should be used.
If you choose col-lg-n then the columns will display correctly until the screen width drops below the lg threshold of 1200px.
TL;DR
.col-X-Y means on screen size X and up, stretch this element to fill Y columns.
Bootstrap provides a grid of 12 columns per .row, so Y=3 means width=25%.
xs, sm, md, lg are the sizes for smartphone, tablet, laptop, desktop respectively.
The point of specifying different widths on different screen sizes is to let you make things larger on smaller screens.
Example
<div class="col-lg-6 col-xs-12">
Meaning: 50% width on Desktops, 100% width on Mobile, Tablet, and Laptop.
Device Sizes and class prefix:
Extra small devices Phones (<768px) - .col-xs-
Small devices Tablets (≥768px) - .col-sm-
Medium devices Desktops (≥992px) - .col-md-
Large devices Desktops (≥1200px) - .col-lg-
Grid options:
Reference: Grid System
.col-xs-$   Extra Small      Phones           Less than 768px
.col-sm-$   Small Devices    Tablets          768px and Up
.col-md-$   Medium Devices   Desktops         992px and Up
.col-lg-$   Large Devices    Large Desktops   1200px and Up
One particular case: before experimenting with the Bootstrap grid system, make sure the browser zoom is set to 100%. For example, if the screen resolution is 1600px x 900px and the browser zoom is 175%, then the Bootstrap-styled elements will be stacked.
HTML
<div class="container-fluid">
<div class="row">
<div class="col-lg-4">class="col-lg-4"</div>
<div class="col-lg-4">class="col-lg-4"</div>
</div>
</div>
Chrome zoom 100%
Browser 100 percent - elements placed horizontally
Chrome zoom 175%
Browser 175 percent - stacked elements
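The reason is that breakpoints are evaluated in CSS pixels, and zooming in shrinks the viewport as measured in CSS pixels. A rough sketch of the arithmetic, assuming a maximized window, a device pixel ratio of 1, and the 1200px lg breakpoint quoted elsewhere in this thread (the function names are made up for illustration):

```python
LG_MIN_WIDTH = 1200  # "lg" breakpoint in CSS px, as quoted in this thread


def effective_width(screen_width_px, zoom):
    """Approximate viewport width in CSS px for a full-width window."""
    return screen_width_px / zoom


def col_lg_stacks(screen_width_px, zoom):
    """True when the viewport is too narrow for col-lg-* to sit side by side."""
    return effective_width(screen_width_px, zoom) < LG_MIN_WIDTH


print(round(effective_width(1600, 1.75)))
print(col_lg_stacks(1600, 1.0), col_lg_stacks(1600, 1.75))
```

At 175% zoom the 1600px screen behaves like a viewport of about 914 CSS px, which is below the lg threshold, so the two col-lg-4 elements stack.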
Well, it tells Bootstrap how many of the 12 grid columns an element should span, depending on the screen size:
col-xs-2
makes the element span 2 of the 12 columns on extra small (xs) screens; in the same way, sm stands for small screens, md for medium and lg for large.
But according to Bootstrap's mobile-first rule, if you specify
col-xs-2 col-md-4
then the element spans 2 columns for screen sizes from xs up to and including sm, and 4 columns from md up to and including lg.
For a better understanding of screen sizes, try them in the various screen modes of Chrome's developer tools (Ctrl+Shift+I) with different pixel widths or devices.

Need to keep <br> in text block tags while using import.io

Looking to do something relatively straightforward, I'm scraping text which so far I have had no problem grabbing, but I need to keep the <br> tags because white space analysis is an important part of the dataset.
Is there a way to keep the <br> tags so I can turn them into \n\r later on?
Example:
<p>
<span>Some text.</br></span>
<a>Some more text.<br></a>
<span>Some more more text.<br></span>
</p>
I need : Some text.<br>Some more text.<br>Some more more text.<br>
Right now I get: Some text. Some more text. Some more more text.
Advice?
The only way is to get the HTML of your selection: change the column type from Text to HTML. There is no way to get only the text plus the <br> tags.
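Once the column is exported as HTML, the line breaks can be recovered in post-processing. The following is a minimal sketch, assuming the exported fragment looks like the example in the question; html_to_text_keep_breaks is a hypothetical helper, and regular expressions like these are only adequate for simple, well-behaved markup.

```python
import re

# Matches <br>, <br/>, <br /> and the non-standard </br> from the question.
BR_TAG = re.compile(r"<\s*/?\s*br\s*/?\s*>", re.IGNORECASE)
ANY_TAG = re.compile(r"<[^>]+>")


def html_to_text_keep_breaks(fragment, newline="\n"):
    """Turn <br> variants into newlines, then drop every other tag."""
    with_breaks = BR_TAG.sub(newline, fragment)
    return ANY_TAG.sub("", with_breaks)


sample = (
    "<span>Some text.</br></span>"
    "<a>Some more text.<br></a>"
    "<span>Some more more text.<br></span>"
)
text = html_to_text_keep_breaks(sample)
print(repr(text))
```

Pass a different newline argument if another line ending is needed downstream.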

How to use special characters on pentaho Dashboard

I'm trying to use special characters on my dashboard using a HTML structure.
It only works if I use HTML Entities such as "& atilde;" (without space) for ã.
But is it the only way to do it? Is there anywhere I can set UTF-8, for example?
I tried to put a META tag setting UTF-8, but it didn't work.
Here's what I'm doing:
Input:
Output:
I need to type: "Alocação de Funcionários"
Notice that I also set a custom noDataMessage_text under Advanced Properties > Extension points of my first Bar Chart and, since that message also has special characters in it, using HTML entities would certainly not be a good idea.
UPDATE:
I had the same problem when browsing my Cubes in the OLAP Selector Wizard.
I think this will solve your problem. You can use markup like this:
<h1 style="font-weight: bold;"> Alocação de Funcionários</h1>
or
<h1 style="color:#297385;"> AlocaÇão de Funcionários</h1>
I got this output in my dashboard.
I suspect your font family is causing the issue. Please copy the exact h1 tag line above, paste it into your dashboard, and let's see.
Thank you.
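As a stopgap while the encoding question is open, and since the question reports that HTML entities do render correctly, the entity form of a label can be generated rather than typed by hand. This sketch uses only Python's built-in codec error handler; to_numeric_entities is a hypothetical helper name, and it produces numeric references (such as &#227;) rather than named ones (such as the atilde entity). It does not address where the dashboard's character set is configured.

```python
def to_numeric_entities(text):
    """Replace every non-ASCII character with a numeric HTML reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


encoded = to_numeric_entities("Alocação de Funcionários")
print(encoded)
```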

dijit.InlineEditBox with highlighted html

I have some dijit.InlineEditBox widgets and now I need to add some search highlighting over them, so I return the results with a span with class="highlight" over the matched words. The resulting code looks like this :
<div id="title_514141" data-dojo-type="dijit.InlineEditBox"
data-dojo-props="editor:\'dijit.form.TextBox\', onFocus:titles.save_old_value,
onChange:titles.save_inline, renderAsHtml:true">Twenty Thousand Leagues <span
class="highlight">Under</span> the Sea</div>
This looks as expected, however, when I start editing the title the added span shows up. How can I make the editor remove the span added so only the text remains ?
In this particular case the titles of the books have no html in them, so some kind of full tag stripping should work, but it would be nice to find a solution (in case of short description field with a dijit.Editor widget perhaps) where the existing html is left in place and only the highlighting span is removed.
Also, if you can suggest a better way to do this (inline editing and word highlighting) please let me know.
Thank you !
How will this affect your displayed content in the editor? It rather depends on the contents you allow into the field - you will need a rich-text editor (huge footprint) to handle html correctly.
This RegExp strips all tags; the g flag makes it replace every match rather than only the first one:
this.value = this.displayNode.innerHTML.replace(/<[^>]*>/g, '');
Here's a running example of the below code: fiddle
<div id="title_514141" data-dojo-type="dijit.InlineEditBox"
data-dojo-props="editor:\'dijit.form.TextBox\', onFocus:titles.save_old_value,
onChange:titles.save_inline, renderAsHtml:true">Twenty Thousand Leagues <span
class="highlight">Under</span> the Sea
<script type="dojo/method" event="onFocus">
// strip every tag (g flag) so that only the text reaches the editor
this.value = this.displayNode.innerHTML.
replace(/<[^>]*>/g, '');
this.inherited(arguments);
</script>
</div>
The renderAsHtml attribute only trims 'off one layer', so embedded HTML will still be html afaik. With the above you should be able to 1) override the onFocus handling, 2) set the editable value yourself and 3) call 'old' onFocus method.
Alternatively, seeing as you have already set 'titles.save_*' in props, use dojo/connect instead of dojo/method, but you need to get there first, so to speak.