How do I make Google Sheets sort my words correctly - spreadsheet

I'm working on SEO and I've gathered a huge list of keywords (around 2000).
What I want to do is this:
Make sheets look at the 2000 keywords and get every word with <whatever I fill in>
Example:
Pears are awesome
Apples are awesome
Pineapples are awesome and great
Apples are awesome and great
Pears are awesome and great
I want all the results showing for "pears"
So sheets spits out:
Pears are awesome and great
Pears are awesome
Bonus points if it moves them so I don't have to manually remove the keywords from the list.
It's been a long time since I've coded. I tried, but to no avail.
After hours of messing around with search, find, and other commands that didn't work, I thought it was time to ask for help.
I also feel like this is a stupid question for some reason, but I couldn't find the answer for the life of me.

As I understand your problem, a simple SORT function should resolve it.
=SORT({Data};{Column to sort by};{is ascending})
Where {Data} is the range holding your raw data (one or several columns). {Column to sort by} is what it says, but make sure to write it as a range, for example {A:A}, and not just {A}. {is ascending} should be either {true} or {false}, for ascending or descending sorting.
Your SORT function can also contain functions, like this:
=SORT(A:C; ISBLANK(A:A); true; VALUE(A:A); true; B:B; false; C:C; true)
(ISBLANK keeps blank cells from being sorted to the top, and VALUE makes Sheets sort column A by numerical value rather than alphabetically, so 2 comes before 10 even though, as text, "10" sorts before "2".)
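To see why the ISBLANK and VALUE keys matter, here is a minimal sketch of the same ordering idea outside Sheets, in Python. The sample cells and the helper name `blanks_last_numeric` are made up for illustration; this is not how Sheets implements SORT.

```python
cells = ["10", "", "2", "1"]

# Plain text comparison: the blank sorts first and "10" sorts before "2".
print(sorted(cells))  # ['', '1', '10', '2']


def blanks_last_numeric(cell: str):
    """Sort key: non-blank cells first (like the ISBLANK key), then by number (like VALUE)."""
    return (cell == "", int(cell) if cell else 0)


print(sorted(cells, key=blanks_last_numeric))  # ['1', '2', '10', '']
```

The tuple key plays the role of the two leading sort columns in the formula: the first element pushes blanks to the bottom, the second compares numbers instead of text.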
I hope this helped :)

Related

Splitting information from a cell

My sheet contains three types of cells:
5off
50off
550off
What they should read is:
$5.00 Off
$0.50 Off
$5.50 Off
I've been fighting with Text-to-Columns and =concat for a while and am trying to get this to work as easily as possible. Any ideas?
I'm wondering what the conversion rule is. For example, the first figure is
5off => "$5.00 Off" > split off the number, add decimals, capitalize the O in off, concatenate
However, for the second one the rules are a little different:
50off => "$0.50 Off" > split off the number, turn the number into a decimal, concatenate
Based on that limited information, I suggest breaking your problem down into simpler steps:
See image below; the top is the result, the bottom is the formula used. There might be a simpler way, though.
Hope this helps.
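The same breakdown (split off the number, decide where the decimal point goes, re-attach "Off") can be sketched in Python. The rule used here is only an assumption inferred from the three examples in the question: a single digit means whole dollars, two or more digits mean an amount in cents. The function name `format_discount` is made up for illustration.

```python
import re


def format_discount(raw: str) -> str:
    """Turn strings like '550off' into '$5.50 Off'.

    Assumed rule (inferred from the three examples only): one digit is
    whole dollars; two or more digits are an amount in cents.
    """
    match = re.fullmatch(r"(\d+)off", raw.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unexpected format: {raw!r}")
    digits = match.group(1)
    # Normalize everything to cents so the formatting step is uniform.
    cents = int(digits) * 100 if len(digits) == 1 else int(digits)
    return f"${cents // 100}.{cents % 100:02d} Off"


for cell in ["5off", "50off", "550off"]:
    print(cell, "->", format_discount(cell))
```

If the real data follows a different rule for one-digit values, only the line that computes `cents` needs to change.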

Why does this search for [help/dont-ask] return irrelevant results in DSE?

Why does this ridiculously simple query on data.stackexchange.com return results that don't have [help/dont-ask] in the comment text? I feel like I'm missing something mind-numbingly obvious here.
select top 10 Id, PostId, Text
from comments
where text like '%[help/dont-ask]%'
Results I currently get:
Id PostId Text
-- ------- -----------------------------------
1 35314 not sure why this is getting downvoted -- it is correct! Double check it in your compiler if you don't believe him!
2 35314 Yeah, I didn't believe it until I created a console app - but good lord! Why would they give you the rope to hang yourself! I hated that about VB.NET - the OrElse and AndAlso keywords!
4 35195 I don't see an accepted answer now, I wonder how that got unaccepted. Incidentally, I would have marked an accepted answer based on the answers available at the time. Also, accepted doesn't mean Best :)
9 47239 Jonathan: Wow! Thank you for all of that, you did an amazing amount of work!
10 45651 It will help if you give some details of which database you are using as techniques vary.
12 47428 One of the things that make a url user-friendly is 'discover-ability', meaning you can take a guess at url's simply from the address bar. http://i.love.pets.com/search/cats+dogs could easily lead to http://i.love.pets.com/search/pug+puppies etc
14 47481 I agree, both CodeRush and RefactorPro are visually impressive (most of which can be turned off BTW), but for navigating and refactoring Resharper is much better in my opinion of using both products.
15 47373 Just wanted to mention that this is an excellent solution if you consider the problem to be linear (i.e. treating `A1B2` as a single number). I still think the problem is multi-dimensional, but I guess we'll just have to wait for the author to clarify :)
16 47497 Indeed, the only way to do this is get the server to generate your CSS file which can be done in many ways depending on which language you are using. HttpHandlers are common in C#. You could use jQuery or the likes to add styling to every element with the class 'ourColur' and parametrise your JS
18 47513 This advice goes against the spirit of CSS, which is separation of content and presentation. You way requires changing HTML for presentation sake, and stating in content which elements have same color.
...none of which contains the magic link (or even the text dont-ask).
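Because in a LIKE pattern [] delimits a set of characters to match, so your pattern matches any comment that contains at least one of the characters listed between the brackets.
You need to escape the opening bracket, for example by writing it as [[].
Or just use CHARINDEX, as the search is non-sargable anyway.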
WHERE CHARINDEX('[help/dont-ask]', text) > 0
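If the search text comes from code rather than being typed by hand, the escaping can be done programmatically. Below is a minimal Python sketch that wraps each T-SQL LIKE wildcard in brackets so it is matched literally. The helper name `escape_tsql_like` is hypothetical, and the sketch only builds the pattern string; it does not run the query.

```python
def escape_tsql_like(literal: str) -> str:
    """Escape T-SQL LIKE wildcards so `literal` is matched character for character.

    Wrapping '[', '%' or '_' in square brackets makes LIKE treat it as a
    literal. A lone ']' needs no escaping.
    """
    escaped = []
    for ch in literal:
        if ch in "[%_":
            escaped.append(f"[{ch}]")
        else:
            escaped.append(ch)
    return "".join(escaped)


pattern = "%" + escape_tsql_like("[help/dont-ask]") + "%"
print(pattern)  # %[[]help/dont-ask]%
```

The resulting pattern would replace the one in the original WHERE clause: where text like '%[[]help/dont-ask]%'.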

Get proper Numbering field for figures in Microsoft Word 2010

I have a .docx document with a multilevel list for my chapters, and the TOC is:
1. Chapter One
2. Chapter Two
2.1. Chapter Two, Sub-chapter One
...
5. Chapter Five
5.1. Chapter Five, Sub-chapter One
5.1.1. Chapter Five, Sub-chapter One, Sub-sub-chapter One
etc.
I inserted a figure in sub-chapter 5.1.1 and used "Insert Caption..." to put some text below the image:
Figure 5.1.1.1 Some image caption
What I would like is a caption format that includes only the chapter number, without the sub-chapter numbering, like this:
Figure 5.1 Some image caption
where 5 is my chapter number and .1 is the sequence number of the figure within this chapter.
Now my field code looks like this:
Figure { STYLEREF 1 \s }.{ SEQ Figure \* ARABIC \s 1 }
How can that be done?
In Word 2007, all you have to do is be careful to start the sub-chapters with a different heading style. Then use the chapter heading style for numbering your figures. Word then ignores your sub-chapters when it numbers your figures, tables, and equations. Oh, and you have to have the entire document set up as a multi-level list, but it sounds like you've already done that.
p.s. Although this is a great question, and I learned something answering it, it doesn't really belong on StackOverflow since it isn't a programming question. Someone will suggest a better StackExchange site for it. (Super User?) Don't get mad! It is very easy to open accounts on other StackExchange sites. Your logon credentials are the same (unless you make them different). Your reputation doesn't carry over to different sites, but your rep here is only 23 at the moment, so no big deal there. ;-) If you move it to Super User, leave a comment here to that effect, and I'll come answer it over there. Then you can select my answer, and I won't be a 6 there anymore. LOL! And maybe you can answer my one question over there.
OK, I have found a temporary solution to my problem.
Inside every chapter (denoted by the Heading1 style) I have changed the caption code from this:
Figure { STYLEREF 1 \s }.{ SEQ Figure \* ARABIC \s 1 }
to this:
Figure 5.{ SEQ Figure \* ARABIC \s 1 }
where 5 is the number of my top-level chapter. Then, when I insert a new figure, I copy and paste the above code under it, and all figure numbers update to the correct sequence number.
This way, I have no problems when running "Update field" on the whole document, where the Table of Figures is affected.

How exact phrase search is performed by a Search Engine?

I am using Lucene to search a data set, and I need to know how the "" search (I mean exact phrase search) mechanism is implemented.
I want it to return all "little cat" hits when the user enters "littlecat". I know that I will have to modify the indexing code, but first I need to know how the "" search works.
I want it to return all "little cat" hits when the user enters "littlecat"
This might sound easy, but it is very tough to implement. To a human being, little and cat are two different words, but a computer cannot tell little and cat apart within littlecat unless you have a dictionary and your code checks those two words against it. Going the other way is easy: a search for "little cat" can also be made to match "littlecat". I also believe this goes beyond the concept of an exact phrase search. An exact phrase search will only return littlecat if you search for "littlecat", and vice versa. Even Google seemingly (and as expected) doesn't return "little cat" for a littlecat search.
One way to implement this is dynamic programming, using a dictionary/corpus to compare your individual words against (and also the leftover text after you have parsed part of the string into words).
Think of it as writing a custom spell-checker or the like. There is also the scenario where more than one split is possible, e.g. "walkingmydoginrain": here you could break off the first word as "walk" or as "walking". This is the beauty of DP: since you know (from your corpus) that you can't form legitimate words from "ingmydoginrain" (i.e. the rest of the string), you have discovered that in this context you should pick "walking" as the segmented word and NOT "walk".
Also, think of a failure to find a match as adding to a COST function that you define, so that you get optimal results. That way, if the text (not separated by white space) can be broken into legitimate words at all, it will be, though there may be MORE than one possible word sequence in that line (and hence, possibly, more than one intent on the part of the person searching).
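As a rough illustration of the dynamic-programming idea, here is a minimal Python sketch. It assumes a small in-memory word set and uses the simplest possible cost, the number of words in the split (fewer is better). Both the tiny dictionary and that cost are assumptions made for the example, not part of Lucene or of any particular library.

```python
from typing import List, Optional, Set


def segment(text: str, dictionary: Set[str]) -> Optional[List[str]]:
    """Split `text` into the fewest dictionary words; return None if impossible."""
    n = len(text)
    max_len = max((len(word) for word in dictionary), default=0)
    # best[i] is the cheapest split of text[:i], or None if text[:i] cannot be split.
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            prefix = best[start]
            word = text[start:end]
            if prefix is None or word not in dictionary:
                continue
            candidate = prefix + [word]
            current = best[end]
            if current is None or len(candidate) < len(current):
                best[end] = candidate
    return best[n]


words = {"walk", "walking", "my", "dog", "in", "rain", "a"}
print(segment("walkingmydoginrain", words))
# ['walking', 'my', 'dog', 'in', 'rain']
```

The split starting with "walk" dies out because no dictionary word continues from the stray "g", which is the situation described above. A real cost function would more likely weight words by corpus frequency.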
You should be able to find pretty good base implementations on the web for your use case (read also: How does Google implement "Did you mean").
For now, see also -
How to split text without spaces into list of words?

Add spaces between words in spaceless string

I'm on OS X, and in objective-c I'm trying to convert
for example,
"Bobateagreenapple"
into
"Bob ate a green apple"
Is there any way to do this efficiently? Would something involving a spell checker work?
EDIT: Just some extra information:
I'm attempting to build something that takes some misformatted text (for example, text copy-pasted from old PDFs that ends up without spaces, especially from internet archives like JSTOR). Since the misformatted text is probably going to be long... well, I'm just trying to figure out whether this is feasible before I actually attempt to write a system, only to find out it takes 2 hours to fix a paragraph of text.
One possibility, which I will describe in a non-OS-specific manner, is to search through all the possible words that make up the collection of letters.
Basically you chop off the first letter of your letter collection and add it to the current word you are forming. If it makes a word (eg dictionary lookup) then add it to the current sentence. If you manage to use up all the letters in your collection and form words out of all of them, then you have a full sentence. But, you don't have to stop here. Instead, you keep running, and eventually you will produce all possible sentences.
Pseudo-code would look something like this:
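FindWords(vector<Sentence> sentences, Sentence s, Word w, Letters l)
{
    // All letters consumed and no partial word pending: s is a complete sentence.
    if (l.empty() and w.empty())
    {
        add s to sentences;
        return;
    }
    // Letters ran out in the middle of a word: dead end.
    if (l.empty())
        return;
    move first letter from l onto the end of w;
    if w in dictionary
    {
        add w to s;
        FindWords(sentences, s, empty word, l)
        remove w from s
    }
    // Also try extending w into a longer word.
    FindWords(sentences, s, w, l)
    put last letter from w back onto l
}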
There are, of course, a number of optimizations you could perform to make it faster. For instance, you can check whether the current word is a prefix of any word in the dictionary and abandon the branch if it is not. But this is the basic approach that will give you all possible sentences.
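For reference, the search described above can be written as a short runnable Python function. This is a sketch under assumptions: the dictionary is a small hand-made set, and the input is lowercased so that "Bob" matches; neither comes from the original answer.

```python
from typing import List, Set


def all_sentences(letters: str, dictionary: Set[str]) -> List[str]:
    """Return every way to split `letters` into dictionary words."""
    results: List[str] = []

    def extend(start: int, words: List[str]) -> None:
        if start == len(letters):
            # Every letter has been used: record the finished sentence.
            results.append(" ".join(words))
            return
        # Try each possible next word beginning at `start`.
        for end in range(start + 1, len(letters) + 1):
            word = letters[start:end]
            if word in dictionary:
                words.append(word)
                extend(end, words)
                words.pop()  # backtrack and try a longer word

    extend(0, [])
    return results


vocab = {"bob", "ate", "a", "tea", "green", "apple"}
for sentence in all_sentences("Bobateagreenapple".lower(), vocab):
    print(sentence)
# bob a tea green apple
# bob ate a green apple
```

Slicing the string by index replaces the letter-by-letter bookkeeping of the pseudo-code, but the recursion visits the same candidate words.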
Solving this problem is much harder than anything you'll find in a framework. Notice that even in your example, there are other "solutions": "Bob a tea green apple," for one.
A very naive (and not very functional) approach might be to use a spell-checker to try to isolate one "real word" at a time in the string; of course, in this example, that would only work because "Bob" happens to be an English word.
This is not to say that there is no way to accomplish what you want, but the way you phrase this question indicates to me that it might be a lot more complicated than what you're expecting. Maybe someone can give you an acceptable solution, but I bet they'll need to know a lot more about what exactly you're trying to do.
Edit: in response to your edit, it would probably take less effort to run some kind of OCR tool on the PDF and correct its output than it would to correct what this system might give you, let alone to program it.
I implemented a solution; the code is available on CodeProject:
http://www.codeproject.com/Tips/704003/How-to-add-spaces-between-spaceless-strings
My idea was to prioritize results that use up most of the characters (preferably all of them), then favor the ones with the longest words, because words 2, 3, or 4 characters long can often come up by chance from leftover characters. Most of the time this provides the correct solution.
To find all possible permutations I used recursion. The code is quite fast even with big dictionaries (tested with 50 000 words).
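The prioritization described above might look roughly like the following Python sketch. It is not the CodeProject code: the function name `rank_candidates` and the exact sort key (characters covered first, then fewer words, which for equal coverage means longer words on average) are assumptions chosen to illustrate the idea.

```python
from typing import List, Tuple


def rank_candidates(candidates: List[List[str]]) -> List[List[str]]:
    """Order candidate splits: most characters covered first, then fewest words."""

    def sort_key(words: List[str]) -> Tuple[int, int]:
        covered = sum(len(word) for word in words)
        # Negative coverage so that higher coverage sorts first.
        return (-covered, len(words))

    return sorted(candidates, key=sort_key)


candidates = [
    ["walk", "in", "my", "dog"],  # leaves the letter "g" of "walkingmydog" unused
    ["walking", "my", "dog"],     # uses every letter
]
print(rank_candidates(candidates)[0])  # ['walking', 'my', 'dog']
```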