Yoast and SiteMap Index XML
Leaving SEO (Search Engine Optimization) up to one piece of software is not wise. Especially if it is Yoast. Now don't get me wrong, Yoast does a fairly good job at most things. My biggest complaint with it is how much it abstracts what is really going on and substitutes its own terminology.
Here's an example of Yoast making it absolutely unclear what it is doing, or looked at from another perspective, how to make it do something. Objective: Turn off or disable the Site Map for all Posts. Here's how: First click on the SEO Tab (not a Tab named Yoast, that would be too obvious), then select Search Appearance, then select the Content Types Tab, then under Posts, for "Settings for single Post URLs", select No. Hmmm, there doesn't seem to be any literal connection between the objective and how to achieve it. This is an example of "Appling" something up. That means they've dumbed it down to the level of someone who probably shouldn't be doing something like this in the first place. But that's OK, just let a magic piece of software take care of everything you don't know about. That's a solution for success, right?
But what about the helpful information they provide? If one selects the Question Mark next to this item it states, "Not showing Posts in the search results technically means those will have a noindex robots meta and will be excluded from XML sitemaps." It also provides a link for additional information. The help from the Question Mark might lead one to believe that Yoast will make an entry in the robots.txt (look it up) file. Nope, only a backhanded suggestion. And the "additional information" link? It babbles about this and that, but never explicitly states what changing the setting to "No" really does. This sentence from the help page should be expanded, "We’ve taken away a lot of the confusion around indexing content and XML sitemaps by simplifying things.", and have this added, "...plus we've simplified the explanation and also taken away any other useful information, which should only confuse intelligent people." Now, to be fair, if one continues to dig into their "expert" section, they finally get around to giving a better explanation (https://yoast.com/what-is-an-xml-sitemap-and-why-should-you-have-one/).
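For what it's worth, a "noindex robots meta" is a tag in the HTML of the page itself, not a line in robots.txt. Here is a minimal Python sketch for checking whether a page carries one. The sample markup is hypothetical and only similar in shape to what a plugin might emit; I am not claiming this is the exact tag Yoast writes.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # The content attribute is a comma-separated list of directives.
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def has_noindex(html):
    """Return True if the markup contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page head, used only for illustration.
SAMPLE = '<html><head><meta name="robots" content="noindex, follow" /></head></html>'
print(has_noindex(SAMPLE))
```

To check a real page, view its source and paste the markup in place of SAMPLE.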
So my suggestion would be to have a great big On / Off Button on the dashboard of their interface that says: "Advanced Interface: On / Off" which changes everything to smarten it up from its dumbed down display.
Taking Control of Yoast
The first thing to know is that Yoast "creates" an XML sitemap index named "sitemap_index.xml" in the root directory of a WordPress website it is installed on. "Create" is in quotes, because it doesn't create a file, but makes it so WordPress responds when a file by that name is requested (sort of like most other WordPress requests). In addition to the sitemap_index.xml file it generates / "creates" several XYZ-sitemap.xml files for pages, posts, images, etc. (depending on the website and configuration of Yoast).
That leaves the door open for a bit of manipulation that allows full control of Yoast.
How?
Whoever controls the Sitemap Index controls everything. I.e., a new Sitemap Index can be manually created to point to all of the individual Yoast generated Sitemaps plus custom Sitemaps or ones generated by other SEO software.
And that's fairly easy to do. Originally my thought was to use a simple Apache Rewrite directive in the site's .htaccess file (I couldn't make it work inside of an Apache config file), as shown below, to direct search engines to a different Sitemap Index:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^sitemap_index\.xml$ /SitemapIndex.xml [R=302,L]
</IfModule>
The above RewriteRule basically says: "Anyone that is looking for sitemap_index.xml should be redirected to SitemapIndex.xml, the browser should be given a 302 (temporary redirect) status code, and that's it (the L at the end stops further rule processing)." But it didn't work. Why? Because the sitemap_index.xml file doesn't exist. The WordPress software responds with the appropriate information when the file is requested, but that's after Apache can Rewrite something. If you go this route, don't forget to make the SitemapIndex.xml file and then copy the raw Yoast XML markup code into the file as a starting point.
Instead, just create the physical file sitemap_index.xml in the root directory of the WordPress website. My preferred solution was to use a hybrid of the two with a note inside of the sitemap_index.xml explaining what I had configured.
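A hand-rolled Sitemap Index can also be generated rather than typed. The following Python is only a sketch: the example.com URLs and the note text are placeholders, and custom-sitemap.xml is a hypothetical extra sitemap. The element names and namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls, note=None):
    """Return a sitemap index document (as a string) listing the given URLs."""
    root = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    if note:
        # An XML comment explaining the setup; it must not contain "--".
        root.append(ET.Comment(f" {note} "))
    for url in sitemap_urls:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs: two plugin-generated sitemaps plus one hypothetical custom one.
urls = [
    "https://example.com/post-sitemap.xml",
    "https://example.com/page-sitemap.xml",
    "https://example.com/custom-sitemap.xml",
]
print(build_sitemap_index(urls, note="Manually maintained index"))
```

Saving the returned string as sitemap_index.xml in the site's root directory would give the physical file described above. The output is a single line of XML, so reformat it by hand if readability matters.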
Sitemap Index XML File
As noted above, a good starting point is to copy the Yoast sitemap_index.xml code. More information is available from sitemaps.org (AKA Google), here: https://www.google.com/sitemaps/protocol.html
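Before editing a copied index, it helps to see which sitemaps it already lists. This Python sketch pulls the loc entries out of a sitemap index. The sample XML is a trimmed, hypothetical stand-in for what a real index looks like, with example.com as a placeholder domain.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_child_sitemaps(index_xml):
    """Return the URL of every <sitemap><loc> entry in a sitemap index."""
    root = ET.fromstring(index_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", NS)
        if loc.text
    ]


# Hypothetical, trimmed sitemap index used only for illustration.
SAMPLE_INDEX = """
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/post-sitemap.xml</loc>
    <lastmod>2020-01-01T00:00:00+00:00</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/page-sitemap.xml</loc>
  </sitemap>
</sitemapindex>
"""

for url in list_child_sitemaps(SAMPLE_INDEX.strip()):
    print(url)
```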
Take a look at some of the pros to see what they do:
- https://www.cnn.com/sitemaps/cnn/index.xml
- https://www.microsoft.com/learning/sitemap.xml (not a good example in terms of readability)
- https://www.google.com/sitemap.xml (they're the boss)
Where can you find a site's sitemap or sitemap index? Here: WhatEverURL/robots.txt
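Sites that advertise their sitemaps do it with "Sitemap:" lines in robots.txt. Here is a small Python sketch that picks those lines out of a robots.txt body. The sample text is made up, and the function only looks at text you hand it; fetching the file is left to you.

```python
def sitemaps_from_robots(robots_txt):
    """Return the URLs named on 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments, then split "Field: value" at the first colon.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


# Made-up robots.txt content with a placeholder domain.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml
sitemap: https://example.com/custom-sitemap.xml  # field name is case-insensitive
"""

print(sitemaps_from_robots(SAMPLE_ROBOTS))
```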