How can I remove URL links from cells in openrefine? - openrefine

How can I remove all URLs from text in OpenRefine? Is there a transform expression for that? My data contains many different URL links within the text, and I want to remove them.
For example, my cells contain text like this:
"put returns between paragraphs for linebreak add 2 spaces at end italic or bold indent code by 4 spaces backtick escapes like _so_ quote by placing > at start of line to http://foo.com/"
I want to delete only the URL links in the cells. After removal, the text should be:
"put returns between paragraphs for linebreak add 2 spaces at end italic or bold indent code by 4 spaces backtick escapes like _so_ quote by placing > at start of line to"

This transformation should do the trick:
value.replace(/(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?/, '')
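Two details of the expression above are worth knowing before applying it to a whole column: the scheme group is optional, so bare tokens such as "foo.bar" also match, and the final group `(\/.*)?` is greedy, so any text that follows a URL path on the same line is removed too. The Python sketch below shows a more conservative variant of the same idea outside OpenRefine. It assumes that only links with an explicit http:// or https:// scheme should be removed; the names `URL_PATTERN` and `strip_urls` are illustrative, not part of OpenRefine.

```python
import re

# Assumption: a URL starts with http:// or https:// and runs until the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def strip_urls(text: str) -> str:
    """Remove http(s) URLs and tidy the whitespace left behind."""
    without_urls = URL_PATTERN.sub("", text)
    # Collapse runs of spaces created by the removal and trim both ends.
    return re.sub(r" {2,}", " ", without_urls).strip()


if __name__ == "__main__":
    print(strip_urls("quote by placing > at start of line to http://foo.com/"))
```

Because the pattern stops at whitespace, text that follows a URL in the middle of a cell is kept.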

Related

2 spaces to 1 space after punctuation and super/subscript

I'm trying to write a Word macro that reduces two spaces to one after punctuation followed by superscript formatting, as in this example, where 23-29 is a superscript that will link to references at the end of the document.
dultricies.23-29 Purus
I would like the macro to identify the two spaces after the superscript and make it 1 space.
Thanks,
Chris
I tried creating a macro to find 2 spaces and make them 1 space, and that worked. But when I tried to create a macro using wildcard characters or special formatting (superscript), I expected Word to locate the instance and make it one space, but it did not.
You don't even need a macro for this. All you need is a simple wildcard Find/Replace, where:
Find = ([ ^s]){2,}
Replace = \1
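For text handled outside Word, the same Find/Replace can be sketched in Python. This assumes that `^s` in the Word pattern stands for a nonbreaking space (U+00A0); the names `SPACE_RUN` and `collapse_spaces` are illustrative only.

```python
import re

# Assumption: a "space" is an ordinary space or a nonbreaking space (U+00A0),
# mirroring the [ ^s] character class in the Word pattern.
SPACE_RUN = re.compile(r"([ \u00a0]){2,}")


def collapse_spaces(text: str) -> str:
    """Replace every run of two or more spaces with a single space."""
    # \1 refers to the last space character captured in the run.
    return SPACE_RUN.sub(r"\1", text)


if __name__ == "__main__":
    print(collapse_spaces("dultricies.23-29  Purus"))
```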

How to remove the extra space from all text boxes in php code

I want to add a function to remove all extra spaces from the text written in the text boxes in my php code. How to do it?
If you don't need to keep the spaces between words and want one continuous string, remove every space:
$string = 'forexample string with space and more';
$no_space = str_replace(" ","",$string);
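Note that the snippet above removes every space, including the ones between words. If the goal is only to drop the extra spaces, the difference between the two behaviours can be sketched as follows (in Python rather than PHP, with hypothetical function names):

```python
def remove_all_spaces(text: str) -> str:
    """Same effect as the str_replace call above: every space is dropped."""
    return text.replace(" ", "")


def collapse_extra_spaces(text: str) -> str:
    """Keep one space between words; trim leading and trailing whitespace."""
    # split() with no argument splits on any whitespace run and discards empty pieces.
    return " ".join(text.split())


if __name__ == "__main__":
    sample = "  for example   string with    space  "
    print(remove_all_spaces(sample))
    print(collapse_extra_spaces(sample))
```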

How to replace all tab characters in a file by sequences of white-spaces in intellij?

Given a file in my project, I want to be able to replace all of the tab characters in the file with white spaces. Is there any way to do this in intellij?
Go to Edit | Convert Indents, and then choose To Spaces or To Tabs respectively. It's in the documentation: Changing indentation
Replace only tabs used for indentation
Ctrl + Shift + A
type "To Spaces" > Enter
Replace all tabs
Ctrl + R
check Regex
Enter \t in the search field and the desired number of spaces in the replacement field
Replace all
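The "Replace all tabs" steps above amount to a plain substitution, which can also be sketched outside the IDE. The function name and the default width of 4 spaces are assumptions, not IntelliJ settings.

```python
def tabs_to_spaces(text: str, width: int = 4) -> str:
    """Replace every tab character with a fixed number of spaces."""
    # Assumption: each tab becomes exactly `width` spaces, regardless of column position.
    return text.replace("\t", " " * width)


if __name__ == "__main__":
    print(repr(tabs_to_spaces("\tif x:\n\t\treturn 1")))
```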

How do I check if a string has a paragraph character?

I need to check whether a string from a Word document contains a paragraph character. I want to extract only the text, without the paragraph character. Is there a constant for the paragraph character? I tried checking for vbNewLine and vbCrLf, but these didn't work.
Have a look at the Wikipedia article on newlines.
In total there are 3 characters that indicate a new line (in some context), and sometimes they are used in combination.
I think it does not matter which ones Word uses and which ones it doesn't; you want them all gone.
So, I'd say run through all characters and remove all LF, CR and RS instances, or replace them with spaces (whilst avoiding double spaces).
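The approach described above can be sketched in Python as follows. It treats LF (U+000A), CR (U+000D) and RS (U+001E) as the three line-break characters and replaces them with single spaces; the names `LINE_BREAKS` and `strip_line_breaks` are illustrative.

```python
import re

# LF, CR and RS: the three characters named in the answer.
LINE_BREAKS = re.compile(r"[\n\r\x1e]+")


def strip_line_breaks(text: str) -> str:
    """Replace line-break characters with spaces without leaving double spaces."""
    spaced = LINE_BREAKS.sub(" ", text)
    # Collapse any double spaces introduced by the replacement, then trim.
    return re.sub(r" {2,}", " ", spaced).strip()


if __name__ == "__main__":
    print(strip_line_breaks("first\r\nsecond \rthird\x1e"))
```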

iPhone SDK, apostrophe showing up as question mark

Single and double quotation marks (apostrophes, to be more specific) are displayed as question marks in my text view.
The problem comes up when I copy and paste something from a webpage and save it.
It does not happen when I type the sentence myself.
How can I replace such an apostrophe with a regular single quote?
When you copy from a webpage you are not copying a plain old apostrophe. You are copying a fancy one that looks very similar but is not. Since the text view only displays plain text it cannot understand your fancy apostrophe.
When you copy from a webpage you will have to manually delete and retype the apostrophes.
You probably have to do a string replace on the Unicode characters. The following may be the characters that you want to replace:
Char  Unicode  HTML
“     8220     &#8220;
‘     8216     &#8216;
”     8221     &#8221;
’     8217     &#8217;
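The string replace can be sketched as a single translation table keyed by the decimal code points listed above. This is shown in Python for brevity rather than with the iPhone SDK, and the names `SMART_QUOTES` and `straighten_quotes` are hypothetical.

```python
# Decimal code points of the curly quotes mapped to their plain ASCII equivalents.
SMART_QUOTES = {
    8220: '"',  # left double quotation mark
    8221: '"',  # right double quotation mark
    8216: "'",  # left single quotation mark
    8217: "'",  # right single quotation mark (typographic apostrophe)
}


def straighten_quotes(text: str) -> str:
    """Replace curly quotes with plain single and double quotes."""
    return text.translate(SMART_QUOTES)


if __name__ == "__main__":
    print(straighten_quotes("\u201cit\u2019s\u201d"))
```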