2 spaces to 1 space after punctuation and super/subscript - vba

I'm trying to write a Word macro that turns 2 spaces into 1 space after punctuation followed by formatted (superscript) text, as in the example below, where the 23-29 will be links to references at the end of the document.
dultricies.23-29 Purus
I would like the macro to find the two spaces after the superscript and replace them with 1 space.
Thanks,
Chris
I tried creating a macro that finds 2 spaces and replaces them with 1 space, and that worked. But when I tried to create a macro using wildcard characters or special formatting (superscript), I expected Word to locate the instance and reduce it to one space, and it did not.

You don't even need a macro for this. All you need is a simple wildcard Find/Replace, where:
Find = ([ ^s]){2,}
Replace = \1
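If a macro is still wanted, the same Find/Replace can be driven from VBA. The following is a minimal sketch, assuming the whole document body should be processed. The macro name CollapseRepeatedSpaces is made up for illustration, and on systems whose list separator is a semicolon the quantifier may need to be written {2;} instead of {2,}.

```vba
' Hypothetical macro name; runs the wildcard Find/Replace from the answer
' over the main body of the active document.
Sub CollapseRepeatedSpaces()
    With ActiveDocument.Content.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = "([ ^s]){2,}"        ' two or more spaces or non-breaking spaces
        .Replacement.Text = "\1"     ' replacement taken from the answer
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        .Format = False              ' ignore formatting such as superscript
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Because .Format is False, the search does not depend on whether the preceding text is superscript, which is probably why no formatting condition is needed here.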

Related

How can I remove URL links from cells in openrefine?

How can I remove all URLs from the text via OpenRefine? Is there any transform code for that? My data contains many URL links, all different from each other, inside the text, and I want to remove them.
For example, my cells contain text like this:
"put returns between paragraphs for linebreak add 2 spaces at end italic or bold indent code by 4 spaces backtick escapes like _so_ quote by placing > at start of line to http://foo.com/"
I want to delete only the URL links in the cells. After removal, the text should be:
"put returns between paragraphs for linebreak add 2 spaces at end italic or bold indent code by 4 spaces backtick escapes like _so_ quote by placing > at start of line to"
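This transformation should do the trick:
value.replace(/(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?/, '')
Note that the trailing (\/.*)? is greedy: when a URL contains a slash after the host name, everything from that slash to the end of the line is removed along with it.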

Does regex not work in Excel search?

I am trying to search for trailing whitespace in text cells in Excel. Knowing that Excel search accepts regex, I expected to leverage the full feature set, but was surprised to find that some features do not seem to work.
For example, I have some cells with strings like ELUFA\s\s\s\s\s or NATION CONFEC.\s with trailing whitespace. (Note: in my Excel sheet there is no \s, just invisible whitespace after ELUFA. I had to write \s here because otherwise Stack Overflow would strip the whitespace and the string would appear to be just ELUFA.)
I entered the expression [A-Z.]{1}\s+$ in the Excel search function, expecting it to return these cells, but it does not, and just tells me that nothing is found.
However, what I find really funny is that Excel search is somehow able to interpret a regex like this: A *. Using this expression, Excel search finds only the ELUFA\s\s\s\s\s cells, and no cells that do not match this regex.
Is there some kind of limitation on which subset of full regex Excel search accepts? How do we get Excel search to accept the full regex feature set as described here?
Thank you.
The Excel SEARCH() function does not support full regex. It actually only supports two wildcards, ? and *. From the documentation:
You can use the wildcard characters — the question mark (?) and asterisk (*) — in the find_text argument. A question mark matches any single character; an asterisk matches any sequence of characters. If you want to find an actual question mark or asterisk, type a tilde (~) before the character.
If you want to match spaces, you have to enter them as literals. Note that finding any number of trailing spaces could be as simple as searching for ELUFA followed by one literal space (written ELUFA\s above), because the search matches a substring, so it finds cells with one or more spaces after ELUFA.
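For the full regex feature set, one option outside the built-in search is a small VBA helper. This is only a sketch, assuming Excel on Windows where the VBScript.RegExp COM class can be created through late binding. The names MatchesRegex and ListCellsWithTrailingWhitespace are made up for illustration, and the pattern is the one from the question.

```vba
' Returns True when inputText matches the regular expression regexPattern.
' Assumption: Windows, where CreateObject("VBScript.RegExp") is available.
Function MatchesRegex(ByVal inputText As String, ByVal regexPattern As String) As Boolean
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = regexPattern
    re.IgnoreCase = False
    MatchesRegex = re.Test(inputText)
End Function

' Prints the address of every text cell on the active sheet that ends in
' an uppercase letter or a dot followed by trailing whitespace.
Sub ListCellsWithTrailingWhitespace()
    Dim c As Range
    For Each c In ActiveSheet.UsedRange.Cells
        If VarType(c.Value) = vbString Then
            If MatchesRegex(c.Value, "[A-Z.]\s+$") Then Debug.Print c.Address
        End If
    Next c
End Sub
```

The addresses appear in the Immediate window of the VBA editor. Non-text cells (numbers, errors, blanks) are skipped by the VarType check.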

Remove repeated adjacent words in a word document

A Word document may contain repeated adjacent words. Is there VBA macro code that keeps a single occurrence and deletes the repeat?
E.g.,
He is is doing well.
should change to
He is doing well.
Help would be much appreciated.
I can't try it because office.live.com doesn't seem to support wildcards, but you can try this:
In Find and Replace > Replace > check Use wildcards and in the Find what: enter "(<*>) <\1>" and click Find Next to see if that matches the two words. If it does, enter "\1" in Replace with: and click Replace All to see if everything works as expected. If it does, you can Record Macro of those steps and check the generated code.
The above expression should also find repeating numbers like a123 a123. If you don't want that, you can try this expression in the Find what:
(<[A-Za-z]{1,}>) \1[!A-Za-z]
from http://www.louiseharnbyproofreader.com/blog-the-proofreaders-parlour/proofreading-in-word-one-of-my-favourite-findreplace-strings
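The idea can also be sketched on a plain string, independent of Word's wildcard engine. The function below is an illustration rather than the answer's method: it assumes words are separated by single spaces, compares neighbours case-insensitively, and does not look through punctuation (so "well well." is not treated as a repeat). The name RemoveRepeatedWords is hypothetical.

```vba
' Drops a word when it is identical (ignoring case) to the word right before it.
' Assumption: words are separated by single spaces.
Function RemoveRepeatedWords(ByVal inputText As String) As String
    Dim words() As String
    Dim result As String
    Dim i As Long

    words = Split(inputText, " ")
    For i = LBound(words) To UBound(words)
        If i = LBound(words) Then
            result = words(i)
        ElseIf StrComp(words(i), words(i - 1), vbTextCompare) <> 0 Then
            result = result & " " & words(i)
        End If
    Next i
    RemoveRepeatedWords = result
End Function
```

Calling RemoveRepeatedWords("He is is doing well.") returns "He is doing well.". Applying it to document text would discard character formatting, so the Find/Replace route is likely the better fit inside Word itself.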

How do I check if a string has a paragraph character?

I need to check if a string from a Word document contains a paragraph character. I want to extract only the text, without the paragraph character. Is there a constant for the paragraph character? I tried checking for vbNewLine and vbCrLf, but these didn't work.
Have a look at the Wikipedia article on newlines.
In total there are 3 characters indicating a new line (in some context), and sometimes they are used in combination.
I think it does not matter which ones Word uses and which ones it doesn't; you want them all gone.
So, I'd say run through all characters and remove all LF, CR and RS instances, or replace them with spaces (whilst avoiding double spaces).
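A minimal sketch of that suggestion follows. It relies on Word reporting a paragraph mark in a range's text as a single carriage return (vbCr, Chr(13)), which would explain why checks for vbNewLine and vbCrLf found nothing. RS (Chr(30)) is deliberately left out, since Word text can use that code for other purposes; that omission is my assumption, not part of the answer. Both function names are made up for illustration.

```vba
' True when the string contains a carriage return, the character Word
' uses for a paragraph mark.
Function ContainsParagraphMark(ByVal inputText As String) As Boolean
    ContainsParagraphMark = (InStr(inputText, vbCr) > 0)
End Function

' Replaces CR and LF with spaces, then collapses runs of spaces and trims.
Function StripParagraphMarks(ByVal inputText As String) As String
    Dim cleaned As String
    cleaned = Replace(inputText, vbCrLf, " ")
    cleaned = Replace(cleaned, vbCr, " ")
    cleaned = Replace(cleaned, vbLf, " ")
    Do While InStr(cleaned, "  ") > 0
        cleaned = Replace(cleaned, "  ", " ")
    Loop
    StripParagraphMarks = Trim$(cleaned)
End Function
```

For example, StripParagraphMarks("First line" & vbCr & "Second line" & vbCr) returns "First line Second line".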

Replace all non latin characters with their latin a-z counterparts and word count in VBA

I am trying to write a script in VBA that will:
1. Replace all É and other similar characters with their Latin counterparts.
2. Remove all non-alphanumeric characters.
3. Remove duplicate spacing.
4. Then word count the string.
I have worked out that I can split the string on " " and count the elements to get the word count, but I am struggling with the rest of it. Help much appreciated.
Word has a word count built in for sentences, paragraphs and the entire document:
ActiveDocument.Words.Count
As for the replacement, it is probably easiest to record a macro to see how this works. You will have to replace the accented characters one by one, or use RegEx to replace all A-type characters (Å, Ä, Â, Á, À) with A, and so on.
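The four steps from the question can be sketched on a plain string as follows. This is an illustration under stated assumptions: the lookup table covers only a handful of common Western accented letters, the VBA editor's code page must be able to hold those literals, and the module uses the default Option Compare Binary. The names NormalizeText and CountWords are hypothetical.

```vba
' Maps accented letters to plain ones, drops everything that is not an
' ASCII letter, digit or space, then collapses repeated spaces.
Function NormalizeText(ByVal inputText As String) As String
    ' Both constants must have the same length; position i maps to position i.
    Const ACCENTED As String = "ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïñòóôõöùúûüýÿ"
    Const PLAIN As String = "AAAAAACEEEEIIIINOOOOOUUUUYaaaaaaceeeeiiiinooooouuuuyy"
    Dim i As Long, pos As Long
    Dim ch As String, cleaned As String

    For i = 1 To Len(inputText)
        ch = Mid$(inputText, i, 1)
        pos = InStr(1, ACCENTED, ch, vbBinaryCompare)
        If pos > 0 Then ch = Mid$(PLAIN, pos, 1)
        ' Keep ASCII letters, digits and spaces; drop everything else.
        If ch Like "[A-Za-z0-9 ]" Then cleaned = cleaned & ch
    Next i

    Do While InStr(cleaned, "  ") > 0
        cleaned = Replace(cleaned, "  ", " ")
    Loop
    NormalizeText = Trim$(cleaned)
End Function

' Counts the space-separated words left after normalization.
Function CountWords(ByVal inputText As String) As Long
    Dim cleaned As String
    cleaned = NormalizeText(inputText)
    If Len(cleaned) = 0 Then
        CountWords = 0
    Else
        CountWords = UBound(Split(cleaned, " ")) + 1
    End If
End Function
```

With the input "Él  café,  déjà vu!", NormalizeText returns "El cafe deja vu" and CountWords returns 4. Punctuation is removed rather than replaced by a space, so "a,b" becomes the single word "ab"; swap the drop for a space if that is not wanted.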