Indexed_search: index only the news detail pages (TYPO3 9.x)

I'm setting up a TYPO3 v9.5 website with the indexed_search extension.
When I search for a word using the search box on the frontend, it shows all results: home page, category pages, and news pages.
Is there a way to index/search only the news item detail pages?

There are multiple ways to achieve this.
In my opinion the simplest way (without setting up crawler configurations) would be to limit indexing to the news detail page only.
See https://docs.typo3.org/c/typo3/cms-indexed-search/master/en-us/Configuration/General/Index.html
On your root page you would set page.config.index_enable = 0 in the TypoScript setup, and on your news detail page page.config.index_enable = 1. Then clear and rebuild the index.
Another possibility for smaller sites is to filter the shown results in your Fluid template. I would not really suggest that but it works, too.

Related

Is it possible/wise to NOT link any pages from index? (SEO, Search Engines)

I have a humble question :)
I plan to set up a rather unusual web project with about a thousand pages, where there won't be classical navigation (only for the about and contact pages) and the pages won't link to one another.
The flow is: index > opens random page > opens random page > opens random page, all via a small PHP action.
I know from basic SEO that you should then generate a static directory such as a sitemap that links to all pages, so that Google finds every page from the index downwards.
BUT I don't want users to see it. It kills the fun of using the site when you can see all content pages at a glance. This project is all about exploring random things.
Is this somehow possible? To have a dead-end index page and a thousand dead-end HTML pages that are only connected via a PHP script?
Thanks in advance.
From a technical standpoint, there are no issues with what you are planning. From an SEO and Google indexing standpoint, make sure none of the pages you want discovered and indexed by Google are orphans, i.e. pages that no other page links to.
These "hidden" pages need not be linked from the home page or a sitemap (one-to-many); instead, you can try the breadcrumb method, where a page leads only to the next page, which leads to the next page (one-to-one), and so on.
For example:
Parent 1 > child 1 > child 1a > child 1b .......
Parent 2 > child 2 > child 2a > child 2b .......
Parent 3 > child 3 > child 3a > child 3b .......
Here, the home page and your sitemap will have links ONLY to Parent 1, Parent 2 & Parent 3
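As a rough illustration of the one-to-one idea, here is a minimal Python sketch that gives every page exactly one outgoing and one incoming link, so no page ends up orphaned. It simplifies the parent/child chains above into a single closed loop in random order; the function name build_link_chain and the page names are made up for the example.

```python
import random


def build_link_chain(pages, seed=None):
    """Map each page to the single page it should link to.

    The pages are shuffled and then joined into one closed loop, so every
    page has exactly one inbound link and a crawler that enters anywhere
    can reach all of them.
    """
    order = list(pages)
    random.Random(seed).shuffle(order)
    # Each page links to its successor; the last page links back to the first.
    return {page: order[(i + 1) % len(order)] for i, page in enumerate(order)}


if __name__ == "__main__":
    # Hypothetical page names, for illustration only.
    chain = build_link_chain([f"page-{n}.html" for n in range(1, 6)], seed=42)
    for source, target in chain.items():
        print(f"{source} -> {target}")
```

The home page then only needs a link to any one page in the loop.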
UPDATE:
Also, not having an HTML sitemap for your users will not affect Google indexing as long as an XML sitemap is in place for Google to access.
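A minimal sketch of generating such an XML sitemap with Python's standard library, assuming you already have the list of page URLs. The example.com URLs and the function name build_sitemap are placeholders, not part of the original setup.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs, for illustration only.
    pages = [f"https://example.com/page-{n}.html" for n in range(1, 4)]
    print(build_sitemap(pages))
```

The resulting file is meant for crawlers only; it never has to be linked from a page that users see.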
Hope this helps.
There's no technical problem in having one page generate links to other pages on each request, but I feel like there are some issues with the general idea here.
Firstly, why do you want your "sub pages" to be indexed by Google? By definition, this defeats the "random page" idea. A Google search over your site (for instance using Google's "site:" feature) will list all your pages, since they're indexed. This means it is easy to navigate between the "secret pages" (even if only cached versions of them).
Secondly, unless you prevent Google from crawling your pages (via a robots.txt file, for instance), Googlebot will reach at least a subset of them, because each visit to the index page generates a link to a sub page that it can follow.
To conclude, you can create an index which sends users over to random pages, but it probably makes little sense to have the web site indexed by a search engine if you'd like the sub pages to be "secret".

Creating a Table of Contents in a SharePoint Wiki

Hey, I'm not sure this is a question for this site, but I am making wiki pages in SharePoint for my job and I notice they all just accumulate with no table of contents. I'm wondering whether there is a way to make a table of contents page that lets the user see all of the wiki pages and click through to each one.
Thanks in advance!
Do you mean that you want to see all the pages?
Simply go to the All Pages view, like this: http://sitename/LibraryName/Forms/AllPages.aspx

Canonical tag for content split across multiple pages

We have pages which have been split into multiple pages because they are too in-depth. The current structure is:
Page (www.domain.com/page)
We have split this up like so...
Page + Subtitle (www.new-domain.com/page-subtitle-1)
Page + Subtitle (www.new-domain.com/page-subtitle-2)
Page + Subtitle (www.new-domain.com/page-subtitle-3)
I need to know the correct way of adding multiple canonical tags to the original page. Is it search-engine friendly to add, say, 3 or 4 canonical tags linking to 3 or 4 separate pages?
Well, this is what you should do -
Keep the complete page even if you are dividing into component pages.
Use rel="next" and rel="prev" links to indicate the relationship between component URLs. This markup provides a strong hint to Google that you would like it to treat these pages as a logical sequence, thus consolidating their linking properties and usually sending searchers to the first page.
In each of the component pages, add a rel="canonical" link to the original (all-content) page. This tells Google about the all-content page.
This strategy is recommended by Google.
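The steps above can be sketched as a small Python helper that builds the head link tags for one component page. This is only an illustration under assumptions: the function name pagination_links is invented, and the https:// URLs are placeholders modelled on the ones in the question.

```python
from html import escape


def pagination_links(component_urls, index, view_all_url):
    """Return the <link> tags for the component page at position `index`.

    Every component page points its canonical at the complete page and
    declares its neighbours in the sequence with rel="prev" / rel="next".
    """
    tags = [f'<link rel="canonical" href="{escape(view_all_url)}">']
    if index > 0:
        prev_url = component_urls[index - 1]
        tags.append(f'<link rel="prev" href="{escape(prev_url)}">')
    if index < len(component_urls) - 1:
        next_url = component_urls[index + 1]
        tags.append(f'<link rel="next" href="{escape(next_url)}">')
    return tags


if __name__ == "__main__":
    # Placeholder URLs, for illustration only.
    parts = [
        "https://www.new-domain.com/page-subtitle-1",
        "https://www.new-domain.com/page-subtitle-2",
        "https://www.new-domain.com/page-subtitle-3",
    ]
    for tag in pagination_links(parts, 1, "https://www.domain.com/page"):
        print(tag)
```

Note that each page carries exactly one canonical tag; the first page gets no rel="prev" and the last page no rel="next".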
Canonical tags exist to consolidate link signals for duplicate or similar content. That said, you are not supposed to have multiple canonical tags on one page. You have two options.
If your old page is going away, pick one primary page (among the split pages) and do a 301 redirect to it, so the SEO value is carried over to that new primary URL.
If it's going to stay, you can create internal links to the new pages. But make sure the content is different, so that the pages do not count as duplicates.
Hope this helps.

How do I setup a robots.txt which allows all pages EXCEPT the main page?

If I have a site called http://example.com, and under it I have articles, such as:
http://example.com/articles/norwegian-statoil-ceo-resigns
Basically, I don't want the text from the frontpage to show on Google results, so that when you search for "statoil ceo", you ONLY get the article itself, but not the frontpage which contains this text but is not of the article itself.
If you did that, Google could still display your home page with a note under the link saying it couldn't crawl the page. This is because robots.txt doesn't stop a page from being indexed; it only blocks crawling. You could noindex the home page, though personally I wouldn't recommend it.

Sitefinity: How to Exclude a Template from Search

Is there any way to exclude the main Sitefinity template, which I use for all the pages, from search?
Right now, if I search, the results include pages that only match words in the template menu, even though those words do not belong to the page itself.
I need to search the pages while excluding the template content.
Thanks in advance.
The problem here is that I have added a menu inside a content block widget in a template.
This template is used throughout the site, and when I search for a keyword using the search feature, all the pages of the website are listed in the results because the keyword is also found in the menu. So I need a solution that keeps the menu content out of the search results.
This is a very high priority. Please help me find a solution as soon as possible.
Ivan Pelovski recently published a blog post on how you can hide content from the search engine by using custom layout controls. Not specifically what you are asking, but maybe it can help.
Here: http://www.sitefinity.com/blogs/ivanpelovski/posts/12-02-06/hiding_page_content_from_the_search_engine_in_sitefinity_using_layout_widgets.aspx
Try adding a robots meta tag like this at the top of the template:
<meta name="robots" content="noindex" />
In more recent versions of Sitefinity you can also uncheck a box at each page level that will prevent the page from being indexed. This setting is stored in the crawlable column of the sf_page_data table, in case you want to write a SQL script to update several pages at once.
The exclusion of templates from search is mentioned in more detail here:
http://www.sitefinity.com/devnet/forums/sitefinity-4-x/general-discussions/exclude-page-from-search-index.aspx
Note that this will probably also prevent other search engines (such as Google) from indexing that page.