
Basics of SEO: Bots and Indexing

This is a post about the basics of SEO. My aim here is simply to give an overview of the elements strictly involved in the indexing process of a website: to walk through the basic concepts so that they are understood, and to identify the most relevant aspects we must control to help our pages get indexed.

Once the URLs of our website have been indexed, you should know that you can improve the positioning of those pages so that Google shows them higher among the search results (SERP) it offers to users and, thus, increase the site's traffic... but that is another story.

Improving positioning once Google 'ranks us' is a more complex process that involves working on-page and off-page SEO in all their dimensions, with appropriate approaches and strategies. But that belongs to a later phase of SEO work that we will discuss in other posts.

Google bots (spiders)

We have to start by talking about the so-called 'Google spiders', robots that help the search engine crawl, classify, and index the pages of the web on Google. Facilitating and optimizing this process is in our hands.

Why do it? Because if you are not indexed, you will not appear among the search results (SERP) that users get. And if you have a website, you want it to be found.

A page that has not been indexed can only be reached in three ways: by typing its URL directly into the browser (direct access); because you have shared its link on social networks (always recommended, since it also helps indexing); or because you have embedded the link in a PDF file and someone ends up clicking on it.

In any case, none of the accesses mentioned in the previous paragraph would be organic; that is, none would come from clicking on a non-sponsored search result (SERP), so we would no longer be talking about SEO.

Crawl budget: the crawling time assigned to bots for each website

This concept is worth knowing because it means that if your website loads slowly, bots will have less time to crawl it. Web Performance Optimization (WPO) therefore becomes another facet of SEO to work on, both for this reason and because there is a direct relationship between web traffic and web indexing. In case you didn't know: statistically, you lose users for every extra second your site takes to load.

Even if bots have time to crawl your entire website, that does not necessarily mean that Google will index everything. Remember that there are websites with hundreds or thousands of pages. But if your website has few URLs, it is very likely that they will all be indexed.

The 'robots.txt' file

This file, which sits in the web root of your server, is used to tell bots which files and directories they may or may not crawl. It is a way of trying to prevent the crawling (and indexing) of pages that do not interest us, for example, URLs with parameters that are generated when users run a search within our website. Or we may want to keep certain directories from being crawled.

We may simply want to optimize robots.txt so that Google's bots focus on what interests us and do not waste time crawling plugin files or irrelevant WordPress files.

And it is always advisable to add the sitemap address to this file.
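As an illustration, a minimal robots.txt for a WordPress site might look something like this. It is only a sketch: the blocked paths and the sitemap URL are placeholders that you would adapt to your own site.

```
# Apply these rules to all bots
User-agent: *

# Keep bots out of the WordPress admin area (hypothetical example)
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

# Block parameterized URLs generated by the internal search
Disallow: /?s=

# Tell bots where the sitemap lives (placeholder URL)
Sitemap: https://www.example.com/sitemap.xml
```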

'index / noindex' and 'follow / nofollow'

Google's spiders can read whether a page is set as 'index' or 'noindex', that is, whether or not we want that page to be indexed.
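These directives usually live in a robots meta tag in the page's <head>. A minimal sketch (most CMSs and SEO plugins generate this tag for you):

```html
<head>
  <!-- Illustrative example: ask search engines not to index this page,
       but still follow the links it contains -->
  <meta name="robots" content="noindex, follow">
</head>
```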

The bots also know whether they should follow the links they find on that page, that is, whether those links are 'follow' or 'nofollow', although the correct use of these attributes would be the subject of a more advanced post. For now, just note that good internal linking between the pages that are already indexed and those that are not will help the latter get indexed.
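On individual links, the same idea is expressed with the rel attribute. Again, just a sketch with placeholder URLs:

```html
<!-- A normal link: bots may follow it to the target page -->
<a href="https://www.example.com/new-page/">Followed link</a>

<!-- A nofollow link: bots are asked not to follow it -->
<a href="https://www.example.com/other-page/" rel="nofollow">Nofollow link</a>
```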

Similarly, if there is a page you want Google to index but it is not linked from any other URL on your website, and you have even removed it from the sitemap (by default, a page set to 'index' appears in the sitemap), it will be harder for the bots to find it. Harder still if it does not have a single incoming link (backlink) from another website.

Google Search Console

I will dedicate another post to explaining the configuration and uses of Search Console, an essential tool for indexing and SEO that will complement this basic SEO post.

For those who have already linked their properties and have access to this tool, I will point out in advance that, when it comes to indexing URLs, it offers three basic functions:

(I) Sitemaps. 'Site maps' are files that list the pages that make up our website, which makes it easier for Google's bots to crawl it.

We should try to keep a sitemap that contains only the pages of our website that we want indexed. And when it is ready, submit it to Google from the 'Crawl / Sitemaps' options in Search Console.
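For context, a minimal sitemap is just an XML file listing URLs. A sketch with placeholder URLs and dates:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page we want crawled (placeholder values) -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2018-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/basic-seo-bots-and-indexing/</loc>
    <lastmod>2018-01-20</lastmod>
  </url>
</urlset>
```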

(II) Fetch as Google. Within the 'Crawl' options of Search Console we find this tool, which will help us speed up the indexing of the URL we analyze:

[Screenshot: the Fetch as Google tool in Search Console]

(III) The webmaster 'Submit URL' tool:

[Screenshot: Google's 'Submit URL' form]

Perhaps at this point you think this is not as basic an SEO post as it might have seemed but, in such a technical field, the first steps usually cost a little more. So I encourage you to dive in without fear. And you can always tell us about your experiences or ask us questions.