🌟 Making sure that your website is indexed using a sitemap 🚀: SEO series 😎

A sitemap is like a roadmap 🗺️ for search engines, guiding them through the different pages on your website and helping search engines like Google index those pages properly. In this post, we'll dive into what sitemaps are, why they're crucial for your site, and how to create one, taking Hubhandle.com as an example. Let's embark on this SEO journey together. 🎓


How do search engines find your page? 🤔

Think of it this way: 🧠 If I give you a website URL like hubhandle.com, how would you know which pages are available on the site? You would probably click on the links in the header and the navigation buttons, right? Those lead you to other pages, which might contain more links, and you keep clicking until you run out of new ones. This is exactly how search engine crawlers work. They know your website's domain, and their bots crawl your pages. 🤖 They look for links and follow them until they hit a dead end with no links, or land on a page they have already seen. They do this for every link on every page and build a map of your site. 🕵️‍♂️🔗
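
To make the idea concrete, here is a minimal sketch of such a link-following crawler in Python, using only the standard library. The starting URL and the page limit are arbitrary choices for illustration; real search engine crawlers are far more sophisticated (they respect robots.txt, rate limits, and so on).

    
    # A toy crawler: follow links breadth-first and remember pages already seen.
    from collections import deque
    from html.parser import HTMLParser
    from urllib.parse import urljoin, urlparse
    from urllib.request import urlopen
    
    class LinkExtractor(HTMLParser):
        """Collects the href value of every <a> tag on a page."""
        def __init__(self):
            super().__init__()
            self.links = []
    
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)
    
    def crawl(start_url, max_pages=20):
        seen = set()                    # pages we have already visited
        queue = deque([start_url])      # pages waiting to be visited
        domain = urlparse(start_url).netloc
    
        while queue and len(seen) < max_pages:
            url = queue.popleft()
            if url in seen:
                continue                # we've been here before, skip it
            seen.add(url)
            try:
                html = urlopen(url, timeout=10).read().decode("utf-8", "ignore")
            except OSError:
                continue                # unreachable page, skip it
            parser = LinkExtractor()
            parser.feed(html)
            for link in parser.links:
                absolute = urljoin(url, link).split("#")[0]   # drop fragments
                if urlparse(absolute).netloc == domain:       # stay on the same site
                    queue.append(absolute)
        return seen
    
    if __name__ == "__main__":
        for page in crawl("https://hubhandle.com"):
            print(page)
    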

What if your website's pages aren't linked properly, but you still want those "hidden pages" to appear in search results? This is quite common on large and complex sites, where not every page is linked. But even a simple site like hubhandle.com can have this "hidden pages" problem. When we launched hubhandle.com, we only had a landing page, and we designed it that way: there was a landing page and no links to other pages. Recently, we launched our blogging platform, hubhandle.com/blog. 🚀 It lives under a different path (/blog), and there was no way to reach it from the landing page. We had to manually submit the blog URLs in Google Search Console 🛠️ and ask it to index those pages.


Introducing sitemaps 🗺️

Sitemaps act as a guidebook 📖 for search engines, especially for those elusive "hidden pages". Even if you think that the pages on your site are properly linked and there are no "hidden pages", you should still create a sitemap for your site. Creating a sitemap won't hurt: it provides additional information to the search engines and is pretty easy to create. 💯

A sitemap is a file that lists all the endpoints on your website that search engines need to crawl. It also provides additional information, like how often the content of a page changes and when it was last updated.


Understanding the sitemap format 📝

Sitemaps can be written in multiple formats. In this post, I will be talking about XML sitemaps, and that is most probably the only format you will need, too. An XML file has an inverted tree-like structure. The branches of a tree come out of a stem, and more branches come out of those branches. Similarly, the "stem" of the XML is the origin, which contains "branches" that describe additional details.

    
    <stem>
        <branch1>
            <property1>Value 1</property1>
            <property2>Value 2</property2>
        </branch1>
    
        <branch2>
            <property1>Value 1</property1>
            <property2>Value 2</property2>
        </branch2>
    </stem>
    

This is an example of an XML file. As you can see, the stem contains two branches, each of which contains two properties with a value for each property. The things inside the angle brackets  < ... >  are called tags. For example,  <stem>  is a tag named "stem". Every opening tag must have a matching closing tag like  </stem> .



Creating a sitemap for Hubhandle.com 🖥️

Let's put theory into practice and craft a sitemap for Hubhandle.com. Before creating a sitemap, you should know all the endpoints of your site. At the time of writing, hubhandle.com has only a handful of endpoints: the landing page and the pages under /blog.

Let's look at the sitemap entry for the first endpoint, the landing page:

    
    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
            <loc>https://hubhandle.com</loc>
            <lastmod>2024-02-12</lastmod>
            <changefreq>weekly</changefreq>
            <priority>1.0</priority>
        </url>
    </urlset>
    

The first line is metadata that declares the XML version and the encoding used in the sitemap. The endpoints of your site must be contained inside the  <urlset>...</urlset>  tag. Then you have to put the sitemap data for each endpoint inside a  <url>...</url>  tag. The  loc  tag contains the location of the endpoint.  lastmod  records when the endpoint was last modified.  changefreq  describes how often the page changes; its values can be "always", "hourly", "daily", "weekly", "monthly", "yearly", or "never".  priority  is a value between 0.0 and 1.0 that indicates how important this endpoint is compared to the other endpoints in the sitemap; a value of 1.0 means it has the highest importance. Have a look at hubhandle.com/sitemap.xml for reference, and refer to sitemaps.org for more details.
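
If your site has more than a couple of endpoints, writing this XML by hand gets tedious. Below is a small sketch that generates the same kind of sitemap with Python's built-in xml.etree.ElementTree module; the list of endpoints here is only an example and would normally come from your own routes or CMS.

    
    # Generate sitemap.xml from a list of endpoints using the standard library.
    import xml.etree.ElementTree as ET
    from datetime import date
    
    # Example endpoints; replace with the real URLs and values for your site.
    ENDPOINTS = [
        {"loc": "https://hubhandle.com", "changefreq": "weekly", "priority": "1.0"},
        {"loc": "https://hubhandle.com/blog", "changefreq": "daily", "priority": "0.8"},
    ]
    
    NAMESPACE = "http://www.sitemaps.org/schemas/sitemap/0.9"
    
    def build_sitemap(endpoints):
        urlset = ET.Element("urlset", xmlns=NAMESPACE)
        for entry in endpoints:
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = entry["loc"]
            ET.SubElement(url, "lastmod").text = date.today().isoformat()
            ET.SubElement(url, "changefreq").text = entry["changefreq"]
            ET.SubElement(url, "priority").text = entry["priority"]
        return ET.ElementTree(urlset)
    
    if __name__ == "__main__":
        tree = build_sitemap(ENDPOINTS)
        ET.indent(tree)  # pretty-print; available in Python 3.9+
        tree.write("sitemap.xml", encoding="UTF-8", xml_declaration=True)
    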



Validating the sitemap using xmlschema-validate ✅

Now that you have written the sitemap for your site, it's time to validate it. There are different tools for validating XML; visit http://www.w3.org/XML/Schema#Tools or http://www.xml.com/pub/a/2000/12/13/schematools.html for a list of tools.

I am using xmlschema-validate on Linux. Download the XML schema from http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd. This schema tells the validation tool what a valid sitemap looks like. Now, install the tool and run the validation.

    
    $ sudo apt install python3-xmlschema
    $ xmlschema-validate -v --schema sitemap.xsd sitemap.xml 
    sitemap.xml is valid
    

Hooray!! 🎉 Our sitemap is valid! ✅
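
If you prefer to validate from a script rather than the command line, the same xmlschema package also exposes a Python API. This is a minimal sketch, assuming sitemap.xsd and sitemap.xml are in the current directory.

    
    # Validate sitemap.xml against the schema using the xmlschema Python package
    # (installable with `pip install xmlschema` if you didn't use apt above).
    import xmlschema
    
    schema = xmlschema.XMLSchema("sitemap.xsd")   # the schema downloaded from sitemaps.org
    
    if schema.is_valid("sitemap.xml"):
        print("sitemap.xml is valid")
    else:
        # validate() raises an error that explains exactly what is wrong
        schema.validate("sitemap.xml")
    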


Updating robots.txt 🤖

Now that you have created a sitemap for your site, you need to tell search engines where the sitemap file is located. In most cases, sitemap files live at domain.tld/sitemap.xml. robots.txt is another file that search engines use to decide which pages to crawl. We will write a separate blog post on robots.txt; follow Hubhandle on LinkedIn, X, or Instagram, or subscribe to the blog by filling out the form below to get notified.

You can add a "Sitemap" entry to the robots.txt. For instance, we have added this entry to our site's robots.txt.

    
    Sitemap: https://hubhandle.com/sitemap.xml
    
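
Once the entry is live, you can quickly confirm that crawlers will actually see it. The snippet below is a small sketch that fetches a site's robots.txt and prints any Sitemap lines it declares; swap in your own domain.

    
    # Fetch robots.txt and print any Sitemap entries it declares.
    from urllib.request import urlopen
    
    ROBOTS_URL = "https://hubhandle.com/robots.txt"   # replace with your own domain
    
    with urlopen(ROBOTS_URL, timeout=10) as response:
        for line in response.read().decode("utf-8", "ignore").splitlines():
            if line.lower().startswith("sitemap:"):
                print(line.strip())
    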


🎉 By creating a sitemap and updating your robots.txt, you've paved the way for search engine crawlers to explore every nook and cranny of your website. Sit back, relax, and watch as search engines discover and index your pages. Happy indexing! 🚀📈

🌟 Love what you're seeing? Stay in the loop by following us! 🚀

