Scrapy - Extract data from table

I am trying to fetch data from a table into separate fields of a CSV file.
The table in the website looks like this:
And (part of) the source of the webpage looks like this:
<div id="right">
<div id="rightwrap">
<h1>Krimpen aan den IJssel</h1>
<div class="tools">
print
terug
</div>
<h2 class="lijst">Krimpen aan den IJssel</h2>
<div class="dotted"> </div>
<div class="zoekres">
<h4>Aantal kindplaatsen</h4>
<div class="paratable">
<table cellpadding="0" cellspacing="0">
<tr>
<th> </th>
<th>2006</th>
<th>2008</th>
<th>2009</th>
<th>2010</th>
<th>2011</th>
</tr>
<tr>
<th>KDV</th>
<td>144</td>
<td>144</td>
<td>174</td>
<td>243</td>
<td>-</td>
</tr>
<tr>
<th>BSO</th>
<td>135</td>
<td>265</td>
<td>315</td>
<td>365</td>
<td>-</td>
</tr>
<tr>
<th>Totaal</th>
<td>279</td>
<td>409</td>
<td>489</td>
<td>608</td>
<td>-</td>
</tr>
</table>
</div>
</div>
</div>
</div>
<div class="brtotal"> </div>
</div>
I managed to retrieve the name of the place "Krimpen aan den IJssel" using this code:
def parse(self, response):
    item = OrderedDict()
    for col in self.cols:
        item[col] = 'None'
    item['Gemeente'] = response.css('h2.lijst::text').get('')
    yield item
But I am unable to retrieve the values displayed in the table on this website. The standard approach for tables, using:
response.xpath('//*[@class="table paratable"]')
doesn't seem to work, or I am not experienced enough to set the parameters right.
Can anyone provide me with some lines of code that will bring the
values from this table into the following columns of my CSV-file
KDV_2006 KDV_2008 KDV_2009 KDV_2010 KDV_2011 BSO_2006 BSO_2008
BSO_2009 BSO_2010 BSO_2011

One possible way:
result = {}
years = response.xpath('//div[@class="paratable"]/table/tr[1]/th[position() > 1]/text()').getall()
for row in response.xpath('//div[@class="paratable"]/table/tr[position() > 1][position() < last()]'):
    field_name = row.xpath('./th/text()').get()
    values = row.xpath('./td/text()').getall()
    for year, value in zip(years, values):
        result[f'{field_name}_{year}'] = value
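To check the pairing logic without running a crawl, here is a minimal sketch that uses the standard library's xml.etree.ElementTree in place of Scrapy selectors. That swap is an assumption made only so the snippet is self-contained; the function name table_to_fields and the inlined markup are illustrative. List slicing stands in for the position() predicates.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the table from the question (well-formed, so ElementTree can parse it).
MARKUP = """
<div class="paratable">
<table cellpadding="0" cellspacing="0">
<tr><th> </th><th>2006</th><th>2008</th><th>2009</th><th>2010</th><th>2011</th></tr>
<tr><th>KDV</th><td>144</td><td>144</td><td>174</td><td>243</td><td>-</td></tr>
<tr><th>BSO</th><td>135</td><td>265</td><td>315</td><td>365</td><td>-</td></tr>
<tr><th>Totaal</th><td>279</td><td>409</td><td>489</td><td>608</td><td>-</td></tr>
</table>
</div>
"""


def table_to_fields(markup):
    """Return a dict such as {'KDV_2006': '144', ...} for every row except the last."""
    root = ET.fromstring(markup.strip())
    rows = root.findall(".//table/tr")
    # First row holds the years; its first cell is an empty corner cell.
    years = [th.text for th in rows[0].findall("th")[1:]]
    result = {}
    # Skip the header row and the trailing "Totaal" row.
    for row in rows[1:-1]:
        field_name = row.find("th").text
        values = [td.text for td in row.findall("td")]
        for year, value in zip(years, values):
            result[f"{field_name}_{year}"] = value
    return result


fields = table_to_fields(MARKUP)
```

In a spider, the same dictionary could be merged into the yielded item so that each key becomes a CSV column.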


listen for server sent event (SSE) with HTMX and append to a table

I made a simple Go backend that renders an HTML table (from a SQLite database).
In the same backend I have an /updates endpoint that sends SSE events when a new row is added to the database.
I want to use htmx to listen for events and then add a row to the table.
What is the right pattern to do this?
I've read https://htmx.org/extensions/server-sent-events/
The example there triggers a GET when an event arrives:
<div hx-ext="sse" sse-connect="/updates">
<div hx-get="/table" hx-trigger="sse:rowadded">
...
</div>
</div>
This way I request the entire table on every update.
How could I add only a single row to the existing rendered table?
You can do this with client side templates.
See https://htmx.org/extensions/client-side-templates/
Example:
<table>
<thead>
<tr>
<th>
Manufacturer
</th>
<th>
Model
</th>
<th>
Power (KW)
</th>
</tr>
</thead>
<tbody id="idTableBody">
<tr id="ModelId_250">
<td>Husqvarna</td>
<td>701 Supermoto</td>
<td>55</td>
</tr>
</tbody>
</table>
In this example mustache is used as templating engine.
<div hx-ext="client-side-templates">
<div hx-ext="sse" sse-connect="/sse-motorbikes">
<div sse-swap="new_bike"
hx-swap="afterbegin"
hx-target="#idTableBody"
mustache-template="idTemplateInsertModel">
</div>
</div>
</div>
Note: EventName is new_bike
Here is the template:
<template id="idTemplateInsertModel">
<tr id="modelId_{{modelId}}">
<td>
{{manufacturer}}
</td>
<td>
{{model}}
</td>
<td>
{{power}}
</td>
</tr>
</template>
With this sse event
type: "new_bike"
data:'{"modelId":208,"manufacturer":"Honda","model":"CRF 1100 L Africa Twin","power":75}'
this row will be inserted into the table body
<tr id="modelId_208">
<td>Honda</td>
<td>CRF 1100 L Africa Twin</td>
<td>75</td>
</tr>
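For reference, an SSE message travels over the wire as an `event:` line, a `data:` line, and a terminating blank line. Below is a minimal sketch of how a backend might serialize the event above. The helper name format_sse is hypothetical, and Python is used only for illustration (the backend in the question is Go).

```python
import json


def format_sse(event, payload):
    """Serialize one Server-Sent Events message whose data field is JSON."""
    data = json.dumps(payload)
    # Each field sits on its own line; the empty line marks the end of the message.
    return f"event: {event}\ndata: {data}\n\n"


message = format_sse(
    "new_bike",
    {"modelId": 208, "manufacturer": "Honda", "model": "CRF 1100 L Africa Twin", "power": 75},
)
```

The JSON keys must match the placeholder names in the mustache template for the row to render with values.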
Update
If you want to use a GET request to fetch the new row from the server, you can pass the data sent with the SSE event (e.g. an id) to the request.
Check out the answer to this question.
https://stacko...how-to-get-url-for-htmx-get-from-an-sse-event-which-itself-triggers-the-hx-get-c

xpath with following-sibling

I want to access the table with class 'table table-hover' that sits under the element with class 'box-title' containing the text 'OODR Items for next 20 Days'. Can anyone please help me with the XPath for this? I tried following-sibling but had no luck. Thanks in advance.
<?xml version="1.0" encoding="UTF-8"?>
<div id="topTenSellers" class="box box-solid box-primary frontpageWidget">
<div class="box-header">
<i class="fa fa-group" />
<h3 class="box-title">OODR Items for next 20 Days</h3>
<div class="box-tools pull-right">
<button class="btn btn-primary btn-sm" data-widget="collapse">
<i class="fa fa-minus" />
</button>
</div>
</div>
<!-- /.box-header -->
<div class="box-body no-padding">
<table class="table table-hover">
<tbody>
<tr>
<th style="width: 10px">#</th>
<th>Booking</th>
<th>Item Start Date</th>
<th>Site</th>
<th>Supplier</th>
</tr>
<tr>
<td>
<strong>1</strong>
</td>
<td>
(642143)
</td>
<td>21/10/2017 00:00:00</td>
<td>Ski</td>
<td>OODR - Out of Date Range</td>
</tr>
</tbody>
</table>
</div>
<!-- /.box-body -->
</div>
You can try following instead of following-sibling, because the h3 and table nodes are not siblings:
//div[@id="topTenSellers"]//h3[@class="box-title" and .="OODR Items for next 20 Days"]/following::table[@class="table table-hover"]
If there is only one element with that class name, then either of the expressions below will work. One thing to note, though: Selenium can struggle with spaces in class names depending on the version and browser. If you find that is the case, use multiple contains() conditions to handle the spaces.
//table[contains(@class,'table table-hover')]
//table[@class = 'table table-hover']
If you need the table that belongs to the 'OODR Items for next 20 Days' heading:
//h3[contains(text(),'OODR Items for next 20 Days')]/parent::div/following-sibling::div/table[@class ='table table-hover']
This path uses your anchor point, "OODR Items..", then goes to the parent, then to the sibling, then to the item with the specified class name. Good luck!
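The parent, then sibling, then child walk can be reproduced step by step to see why following-sibling fails when applied directly to the h3. This is a minimal sketch with the standard library's xml.etree.ElementTree on a trimmed copy of the markup. ElementTree has no sibling axes, so the parent map and the manual sibling lookup are stand-ins for parent::div and following-sibling::div, not Selenium's API.

```python
import xml.etree.ElementTree as ET

MARKUP = """
<div id="topTenSellers">
  <div class="box-header">
    <h3 class="box-title">OODR Items for next 20 Days</h3>
  </div>
  <div class="box-body no-padding">
    <table class="table table-hover"><tbody><tr><td>Ski</td></tr></tbody></table>
  </div>
</div>
"""

root = ET.fromstring(MARKUP.strip())
# ElementTree keeps no parent pointers, so build a child -> parent map first.
parent_of = {child: parent for parent in root.iter() for child in parent}

h3 = root.find(".//h3[@class='box-title']")
header = parent_of[h3]                          # parent::div
siblings = list(parent_of[header])              # all children of the widget div
later = siblings[siblings.index(header) + 1:]   # following-sibling::*
# Take the table from the first later sibling that has one as a direct child.
table = next(s.find("table") for s in later if s.find("table") is not None)

# The h3 and the table have different parents, so they are not siblings.
h3_and_table_are_siblings = parent_of[h3] is parent_of[table]
```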

Table cell validation in vuejs and laravel 5.4

I’m very new to Vue and am trying to loop through dynamically created tables built from unique arrays. The table creation is complete, with dynamic table ids based on a value from the array. I’m trying to validate that either cell[0] in each row contains a specific string, or that the last cell[?], which has a select dropdown, has been selected and matches that string.
I’ve done something similar before in jQuery, like this:
$("#" + t_node + " :selected").each(function (i, sel) { /* ... code ... */ });
and like this:
$("table#" + t_node + " > tbody > tr").each(function (row, tr) { /* ... code ... */ });
I don’t know how to replicate this with Vue.
I have an onclick event in which I want to loop through the table and, for any row that already has p2vg01 created, sum its size together with the size of any row whose select option is p2vg01. In the table below I’d want to find that SDB was selected at 107GB. Not shown, but it could be that SDB was already p2vg01; if I then selected SDC as p2vg01 as well, I’d sum 32GB + 107GB.
<div v-for="storageResult in storageValidationResults" >
<h3 class="panel-title">{{ storageResult.node_name }}</h3>
<table :ref="storageResult.node_name" v-bind:id="storageResult.node_name" class="table table-bordered table-striped table-hover" >
<thead>
<th v-for="(value, key, index) in storageResult.table_head">
{{ value }}
</th>
<th>Select</th>
</thead>
<tbody>
<tr v-for="(value, key, index) in storageResult.disk_data">
<td v-for="v in value">
{{ v }}
</td>
<td v-if="checkAvailable(value)">
<select>
<option value="">--Select VG--</option>
<option value="p2vg00">p2vg00</option>
</select>
</td>
<td v-else=""></td>
</tr>
</tbody>
</table>
</div>

vue.js when sorting grid only values is updated, not HTML

I am new to Vue.js and could really use some help on this one.
I am trying to put a class (success) on my table rows to give them a background color depending on the value of a property (status) in each of the objects in an array (data), which is working as intended using v-bind:class.
The problem arises when I sort the table rows by clicking on the table headers. When this is done there is a mismatch between the colored rows and their content, as if only the values of the rows are updated and not the rows themselves.
Try it here: https://jsfiddle.net/Bayzter/cyv1o78s/
Does anyone know how to solve this, so that the colored rows again match up with the correct objects?
<script type="text/x-template" id="grid-template">
<table>
<thead>
<tr>
<th v-for="key in columns"
@click="sortBy(key)"
:class="{active: sortKey == key}">
{{key | capitalize}}
<span class="arrow"
:class="sortOrders[key] > 0 ? 'asc' : 'dsc'">
</span>
</th>
</tr>
</thead>
<tbody>
<tr v-for="
entry in data
| filterBy filterKey
| orderBy sortKey sortOrders[sortKey]" v-bind:class="{ 'success' : data[$index].status == 0}">
<td v-for="key in columns">
{{entry[key]}}
</td>
</tr>
</tbody>
</table>
</script>
<!-- demo root element -->
<div id="demo">
<form id="search">
Search <input name="query" v-model="searchQuery">
</form>
<demo-grid
:data="gridData"
:columns="gridColumns"
:filter-key="searchQuery">
</demo-grid>
</div>
Where you have
v-bind:class="{ 'success' : data[$index].status == 0}"
You want
v-bind:class="{ 'success' : entry.status == 0}"
data[$index] does not refer to the item in the current row: $index is the row's position in the sorted list, while data is still in its original order. entry is the current item.
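The mismatch is easy to reproduce outside Vue. This is a minimal sketch in Python with made-up rows (the names and status values are illustrative, not taken from the fiddle), comparing a lookup by display position in the original array with a lookup on the current entry.

```python
# Original, unsorted array (stands in for the component's `data` prop).
rows = [
    {"name": "alpha", "status": 0},
    {"name": "bravo", "status": 1},
    {"name": "charlie", "status": 1},
]

# What the table displays after clicking a header: same items, new order.
displayed = sorted(rows, key=lambda r: r["name"], reverse=True)

# Like data[$index].status == 0: indexes the ORIGINAL array by display position.
by_index = [rows[i]["status"] == 0 for i in range(len(displayed))]

# Like entry.status == 0: reads the item that is actually rendered in that row.
by_entry = [entry["status"] == 0 for entry in displayed]
```

Here by_index highlights the first displayed row (charlie) even though its status is 1, while by_entry highlights the last row (alpha), which is the one whose status is 0.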

How to set header value in kendo grid row template

I am using the jQuery Kendo grid in my project, where I used a row template to show three columns in one row. Below is the code:
<table id="grid" style="width:100%">
<thead style="display:none">
<tr>
<th>
Details
</th>
</tr>
</thead>
<tbody>
<tr>
<td colspan="3"></td>
</tr>
<tr>
<td>
</td>
</tr>
</tbody>
</table>
<script id="rowTemplate" type="text/x-kendo-tmpl">
<div>
<span class="name" style="font-size:medium">#: FirstValue #</span>
<span class="name" style="font-size:medium">#: SecondValue #</span>
</div>
<tr>
<td style="width:30%">
#: GetName #
<span class="name" style="font-size:14px; color:green">#: Designation #</span>
<span class="name" style="font-family:Arial; font-size:small">#: Company #</span>
</td>
</tr>
</script>
In the above code I am just passing my model data and it works fine. But when I added a div that holds the firstName and LastName values, it is repeated along with every row of data, and I want to show it separately. How do I show it separately so that it does not repeat with the grid?
There is one problem in your HTML template.
Please replace '#' with 'Javascript:void(0)'.
Error:- #: GetName #
Fix:- #: GetName #
Hope that works for you.
http://jsfiddle.net/parthiv89/t0w3ht6m/1/
if you like then don't forget to like.
I found a solution myself. First, I changed the code in my schema like this:
schema: {
    parse: function (data) {
        var items = [];
        for (var i = 0; i < data.data.length; i++) {
            if (data.data[i].CorrectValue != null && data.data[i].SearchValue != null) {
                $("#spnSR")[i].innerHTML = "<b>" + "Get results for this text: " + "</b>" + data.data[i].CorrectValue;
                $("#spnSV")[i].innerHTML = "<b>" + "Searched for this text: " + "</b>" + data.data[i].SearchValue;
            }
        }
        var product = {
            data: data.data,
            total: data.total
        };
        items.push(product);
        return (items[0].data);
    },
}
Then in the HTML I used two spans to show the values that are set in the for loop,
and it works fine for me.
Thanks, everyone.