I'm using Scrapy's CSV export, but sometimes the content I'm scraping contains quotes and commas, which I don't want.
How can I replace those characters with nothing ('') before outputting to CSV?
Here's my CSV containing the unwanted characters in the strTitle column:
strTitle,strLink,strPrice,strPicture
"TOYWATCH 'Metallic Stones' Bracelet Watch, 35mm",http://shop.nordstrom.com/s/toywatch-metallic-stones-bracelet-watch-35mm/3662824?origin=category,0,http://g.nordstromimage.com/imagegallery/store/product/Medium/11/_8412991.jpg
Here's my code, which errors on the replace line:
def parse(self, response):
    hxs = Selector(response)
    titles = hxs.xpath("//div[@class='fashion-item']")
    items = []
    for titles in titles[:1]:
        item = watch2Item()
        item ["strTitle"] = titles.xpath(".//a[@class='title']/text()").extract()
        item ["strTitle"] = item ["strTitle"].replace("'", '').replace(",",'')
        item ["strLink"] = urlparse.urljoin(response.url, titles.xpath("div[2]/a[1]/@href").extract()[0])
        item ["strPrice"] = "0"
        item ["strPicture"] = titles.xpath(".//img/@data-original").extract()
        items.append(item)
    return items
EDIT
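Try adding this line before the replace. extract() returns a list of strings, and a list has no replace method, so the pieces have to be joined into a single string first.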
item["strTitle"] = ''.join(item["strTitle"])
strTitle = "TOYWATCH 'Metallic Stones' Bracelet Watch, 35mm"
strTitle = strTitle.replace("'", '').replace(",",'')
strTitle == "TOYWATCH Metallic Stones Bracelet Watch 35mm"
In the end the solution was:
item["strTitle"] = [titles.xpath(".//a[@class='title']/text()").extract()[0].replace("'", '').replace(",",'')]
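The cleaning step can also be pulled into a small helper so it is easy to test outside the spider. This is only a sketch, assuming Python 3; clean_title and its unwanted parameter are hypothetical names, not part of Scrapy, and the input stands in for the list that extract() returns.

```python
def clean_title(extracted, unwanted="',"):
    """Join extracted text fragments and drop every unwanted character."""
    # extract() yields a list of strings, so join it into one string first
    title = "".join(extracted)
    # A deletion table removes all listed characters in a single pass
    return title.translate(str.maketrans("", "", unwanted))


cleaned = clean_title(["TOYWATCH 'Metallic Stones' Bracelet Watch, 35mm"])
print(cleaned)
```

Inside the spider this would be used as item["strTitle"] = clean_title(titles.xpath(...).extract()), which stores a plain string rather than a one-element list.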
import fitz  # PyMuPDF

def read_pdf(path: str) -> str:
    doc = fitz.open(path)
    txt = ""
    for page in doc:
        txt += page.get_text("text")
    return txt
I get the error fitz.fitz.FileDataError: cannot open document. Could someone help? Thank you in anticipation.
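One thing worth ruling out, as an assumption about the cause rather than a confirmed diagnosis, is that the path points at a file that is missing, empty, or not actually a PDF. The sketch below uses only the standard library; looks_like_pdf is a hypothetical helper name, and it only checks for the "%PDF-" header near the start of the file.

```python
import os


def looks_like_pdf(path: str) -> bool:
    """Return True if path is a non-empty file with a PDF header near its start."""
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return False
    with open(path, "rb") as fh:
        head = fh.read(1024)  # the header is expected within the first bytes
    return b"%PDF-" in head
```

Calling looks_like_pdf(path) before read_pdf(path) would at least separate a bad input file from a problem in the reading code itself.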
Private Async Function cmdList() As Task
    Dim m = Context.Message
    Dim u = Context.User
    Dim g = Context.Guild
    Dim c = Context.Client
    Dim words As String = ""
    Dim embed As New EmbedBuilder With {
        .Title = $"Wallpaper keyword list",
        .ImageUrl = "https://i.imgur.com/vc241Ku.jpeg",
        .Description = "The full list of keywords in our random wallpaper list",
        .Color = New Color(masterClass.randomEmbedColor),
        .ThumbnailUrl = g.IconUrl,
        .Timestamp = Context.Message.Timestamp,
        .Footer = New EmbedFooterBuilder With {
            .Text = "Keyword Data",
            .IconUrl = g.IconUrl
        }
    }
    For Each keyword As String In wall.keywords
        words = words + keyword + " **|** "
    Next
    embed.AddField("Full list", words)
    Await m.Channel.SendMessageAsync("", False, embed.Build())
End Function
This is my command to get every word from an array and put it in a field. What I want to know is how to make it so that once the field is full, it automatically adds a new one and continues with the list. This might be a little far-fetched, but I just don't know how to go about it. Sorry if I can't understand some of the answers; I'm still a little new to coding with Discord.Net and, well, VB in general.
This is a modification of your Hastebin code:
Dim row As Integer = 0
Dim words As String = String.Empty
For Each keyword As String In wall.keywords
    'If appending the keyword to the list of words would exceed 256 characters,
    'don't append; instead add the existing words to a field.
    If words.Length + keyword.Length + 7 > 256 Then
        row += 1
        embed.AddField($"List #{row}", words) 'Add words to field
        'Reset words
        words = String.Empty
    End If
    words = words + keyword + " **|** "
Next
'The add condition within the For loop is only entered when we are
'about to exceed the field length. Any string under the max
'length would exit the loop without being added, so add it here.
embed.AddField($"List #{row + 1}", words)
Await m.Channel.SendMessageAsync("", False, embed.Build())
While it does not change any of the logic, you could consider using a StringBuilder.
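The chunking idea in the loop above is independent of Discord.Net, so here it is as a language-neutral sketch in Python. The 256-character limit and the " **|** " separator are taken from the answer; chunk_keywords is a hypothetical name used only for illustration.

```python
def chunk_keywords(keywords, limit=256, sep=" **|** "):
    """Group keywords into strings that each stay within `limit` characters."""
    chunks = []
    current = ""
    for keyword in keywords:
        piece = keyword + sep
        # Flush the current chunk before adding this piece would overflow it
        if current and len(current) + len(piece) > limit:
            chunks.append(current)
            current = ""
        current += piece
    if current:  # the last, partially filled chunk
        chunks.append(current)
    return chunks


for number, chunk in enumerate(chunk_keywords(["forest", "ocean", "city"]), start=1):
    print(f"List #{number}: {chunk}")
```

Each returned string would become one field. Two details differ from the loop above: an empty keyword list yields no chunks at all, so no empty field is added, and a single keyword longer than the limit still ends up in an oversized chunk of its own.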
Is there a way to remove all whitespace characters except for tabs and linebreaks?
If I were to use .replaceAll("\s+", "") or .replaceAll(" ", "") I would also delete every tab or linebreak.
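For Python (with pandas), you can use this function:
def replaceWhiteSpace(text):
    # text is a pandas Series of strings
    res = text.str.split()      # split each entry on any whitespace
    return res.str.join(' ')    # rejoin the pieces with single spaces
testDataset = replaceWhiteSpace(text)
Pass a column of your DataFrame as text, and the result will be stored in testDataset. Note that this collapses every run of whitespace, tabs and line breaks included, into a single space.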
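For a plain string, a regular expression can target whitespace while leaving tabs and line breaks alone. The following is a minimal sketch using Python's re module; the function name is hypothetical, and carriage returns are treated as line breaks here by assumption.

```python
import re

# Any whitespace character except tab, line feed and carriage return:
# the negated class excludes non-whitespace (\S) plus the three kept characters.
_SPACES = re.compile(r"[^\S\t\n\r]+")


def strip_spaces_keep_tabs_newlines(text: str) -> str:
    """Remove whitespace but keep tabs and line breaks."""
    return _SPACES.sub("", text)


print(repr(strip_spaces_keep_tabs_newlines("a b\tc\n d  e")))
```

The character class is written for Python here; other regex engines may need different escaping for the same pattern.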
The code I'm using...
For Each file As String In My.Computer.FileSystem.GetFiles(directory)
    Dim fi As FileInfo = New FileInfo(file)
    If isNotMusic(fi.Extension.ToString) = True Then Continue For 'Checks file extension for non-music files; if test is true for-loop continues with next file
    trackCounter += 1 'Adds 1 to trackCounter
    Dim song As New musicInfo
    Dim tagFile As TagLib.File = TagLib.File.Create(fi.FullName)
    infoArtist = tagFile.Tag.Performers(0)
    With song
        .track = tagFile.Tag.Track
        .title = tagFile.Tag.Title
        .artist = tagFile.Tag.Performers(0)
        .album = tagFile.Tag.Album
        .extension = fi.Extension.ToString
    End With
    songs.Add(song)
Next
When I use this code on a folder filled with AC/DC songs, tagFile.Tag.Performers(0) returns "AC".
I looked up this problem elsewhere and from what I could see, only other tagging solutions such as MpTagThat and MP1 have addressed this problem and made a patch.
I'm aware that the Performers tag is an array and the other half "DC" is likely stored in tagFile.Tag.Performers(1). However, I will eventually be separating each artist with a ";" in my code and if I left everything as is, AC/DC would be returned as "AC;DC".
I am creating a For Each loop to take the words from a string and place each one into its own text box. The program allows for up to 9 variables. What I am trying to do is:
For Each word In Words
    i = i + 1
    Variable & i = word
    txtCritical1.Text = Variable & i
Any ideas on a way to make this work?
Have a look through the MSDN article on For Each. It includes a sample using strings.
https://msdn.microsoft.com/en-us/library/5ebk1751.aspx
So I went with a simple If statement, which does the trick. Each text box is filled in.
Dim details As String = EditEvent.EditDbTable.Rows(0).Item(13).ToString()
Dim words As String() = details.Split(New Char() {"«"c})
Dim word As String
For Each word In words
    i = i + 1
    v = word
    If i = 1 Then
        txtCritical1.Text = v
    ElseIf i = 2 Then
        txtCritical2.Text = v
    ElseIf ....
    ElseIf i = 9 Then
        txtCritical9.Text = v
    Else
        ....
    End If
Next
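The If chain works, but the underlying idea of pairing the n-th word with the n-th text box can also be expressed as a lookup by position. Here is a rough sketch of that idea in Python; assign_words is a hypothetical helper, and the field names simply mirror txtCritical1 through txtCritical9 as strings rather than real controls.

```python
def assign_words(details, field_names, sep="«"):
    """Map each word in `details` to the field name at the same position."""
    words = details.split(sep)
    # zip stops at the shorter sequence, so words beyond the last field are ignored
    return dict(zip(field_names, words))


fields = [f"txtCritical{n}" for n in range(1, 10)]  # txtCritical1 .. txtCritical9
result = assign_words("alpha«beta«gamma", fields)
for name, value in result.items():
    print(name, "=", value)
```

In a WinForms program the dictionary keys would be replaced by a lookup of the actual controls by name, which keeps the loop body to a single assignment however many text boxes there are.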