Google XML Sitemap Feed Documentation

Welcome to the Google XML Sitemap Feed Documentation - by Chemo

This contribution was coded to meet the protocol specification delineated by Google for the Sitemaps experimental service.

Current features:
  • Super easy install - no file edits! Just upload the files, set the permissions, and optionally create a CRON job to handle automatic maintenance.
  • Generation of sitemap files for products and categories (separate files)
  • Generation of a sitemap index file
  • Built-in support for large catalogs (more than 50,000 products, categories, or both)
  • Optional file compression per Google specification (GZ format)
  • Thorough documentation for developers! Don't be shy...improve the code if you can.
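
To make the feature list concrete, here is a minimal Python sketch of the general idea behind it: split a long URL list into files of at most 50,000 entries, render each file, list the files in an index, and optionally gzip the result. It is not the code shipped in sitemap.class.php. The function names, the example.com URLs, and the sitemaps.org namespace string are assumptions chosen for illustration.

```python
import gzip
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50000  # per-file URL limit this sketch assumes
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"  # assumed namespace


def split_into_files(urls, size=MAX_URLS_PER_FILE):
    """Split a URL list into consecutive groups of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def urlset_xml(urls):
    """Render the body of one sitemap file for the given page URLs."""
    entries = "".join(f"<url><loc>{escape(u)}</loc></url>" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            f'<urlset xmlns="{SITEMAP_NS}">{entries}</urlset>')


def index_xml(sitemap_urls):
    """Render a sitemap index that points at the individual sitemap files."""
    entries = "".join(
        f"<sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls
    )
    return ('<?xml version="1.0" encoding="UTF-8"?>'
            f'<sitemapindex xmlns="{SITEMAP_NS}">{entries}</sitemapindex>')


def gzip_bytes(xml_text):
    """Return the gzip-compressed form of an XML string (GZ format)."""
    return gzip.compress(xml_text.encode("utf-8"))
```

A catalog of 120,001 product URLs would come out of split_into_files as three groups, each of which gets its own file and its own entry in the index.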

Install Directions

The contribution package should have the following structure:

  • install.html (redirects to this documentation page)
  • sitemapproducts.xml (dummy file)
  • sitemapcategories.xml (dummy file)
  • sitemapmanufacturers.xml (dummy file)
  • sitemapspecials.xml (dummy file)
  • sitemapindex.xml (dummy file)
  • gss.xml
  • googlesitemap (directory)
    • index.php
    • sitemap.class.php
    • ...various documentation

Upload the googlesitemap directory to your catalog directory. If your store is installed in the domain root (domain.com/) then it should be accessible via browser at domain.com/googlesitemap/. As another example, if your store is installed in the "catalog" directory it will look like domain.com/catalog/googlesitemap/. Do not change the name of the directory!

Upload the dummy files (the XML files) to your catalog directory. So, when you are done you should be able to call the files in your browser like this:

If the store is in the document root:
  • domain.com/sitemapproducts.xml
  • domain.com/sitemapcategories.xml
  • domain.com/sitemapindex.xml
Else if the store is in a directory:
  • domain.com/directory/sitemapproducts.xml
  • domain.com/directory/sitemapcategories.xml
  • domain.com/directory/sitemapindex.xml
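
The two cases above differ only in the optional directory segment. A small Python sketch of that rule follows; the function name sitemap_urls is hypothetical and the file names are the three listed above.

```python
def sitemap_urls(domain, directory=""):
    """Build the browser addresses of the dummy XML files.

    `directory` is empty when the store lives in the document root.
    """
    names = ["sitemapproducts.xml", "sitemapcategories.xml", "sitemapindex.xml"]
    base = domain.strip("/")
    sub = directory.strip("/")
    prefix = f"{base}/{sub}" if sub else base
    return [f"{prefix}/{name}" for name in names]
```

For example, sitemap_urls("domain.com", "catalog") yields addresses that start with domain.com/catalog/.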

Once you have the dummy XML files uploaded you will need to make them writable by the web server. The easiest way to do this is to start your favorite FTP client, right click the files, and change the permissions on each one. The correct settings will vary based on your server setup, but generally speaking a setting of 777 (read, write, and execute for owner, group, and others) will work.
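
If you have shell access rather than only FTP, the same permission change can be scripted. The Python sketch below is one possible way to do it, not part of the contribution. The function name make_writable is hypothetical, and the writability check reflects the user running the script, which may differ from the web server user.

```python
import os
import stat


def make_writable(paths, mode=0o777):
    """Apply `mode` to each file and report whether it is now writable.

    Note: os.access checks the current user, which is only an
    approximation of what the web server user can do.
    """
    results = {}
    for path in paths:
        os.chmod(path, mode)  # same effect as setting 777 in an FTP client
        results[path] = os.access(path, os.W_OK)
    return results


def current_mode(path):
    """Return the permission bits of a file as an octal string, e.g. '777'."""
    return format(stat.S_IMODE(os.stat(path).st_mode), "o")
```

Calling make_writable with the paths of the dummy XML files returns a dictionary mapping each path to True or False.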

At this point you have the googlesitemap directory and dummy files uploaded with proper permissions. Congratulations! You are done with the installation! Wasn't that easy? Now it's time to test the code...

To test the script simply call the /googlesitemap/index.php file in your browser. This is a special script that was designed to be called either via browser or CRON job and will generate the proper sitemap files for you. Before we set up the CRON job to handle automatic maintenance we have to test it with the browser.

http://www.yourdomain.com/{directory?}/ <= cut-n-paste for your convenience :-)

Once you call the script in your browser the text may run together and appear pretty sloppy. Keep in mind that the output is formatted for CRON jobs, not for the browser. The CONTENT is what matters. If there are no errors (such as wrong permissions), one of the last lines should say "If you have not already submitted the sitemap index to Google click the link below". If you see this text, CONGRATULATIONS...your contribution is installed and you are ready to move on to the CRON setup (optional...but if you're this far you might as well give it a go). However, if you encountered errors please skip down to the quick help section below.

If the above browser test was successful you should now be able to view the XML files for quality assurance.

http://www.yourdomain.com/{directory?}/
http://www.yourdomain.com/{directory?}/
http://www.yourdomain.com/{directory?}/
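
Beyond eyeballing the files in a browser, a quick scripted check can confirm that a generated file is well-formed XML and show which addresses it lists. The Python sketch below assumes the generated files use <loc> elements, as sitemap files generally do. The function name list_locations is hypothetical.

```python
import xml.etree.ElementTree as ET


def list_locations(xml_bytes):
    """Return the text of every <loc> element in a sitemap or sitemap index.

    Raises xml.etree.ElementTree.ParseError if the content is not
    well-formed XML. The XML namespace, if any, is ignored.
    """
    root = ET.fromstring(xml_bytes)
    locations = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop "{namespace}" prefix
        if local_name == "loc" and element.text:
            locations.append(element.text.strip())
    return locations
```

Reading a downloaded sitemapproducts.xml in binary mode and passing its contents to list_locations gives a list you can count or spot-check.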

Once you have verified the data quality it's time to set up the CRON job so the system is maintained automatically. I'll be using cPanel for this. If you use a different control panel, please ask your host for specific directions.

The first step is to log in to your cPanel and find the CRON menu.

The next step is to select the "advanced" menu. The reason is that everything is already set up by default and all you have to do is enter the command + path to index.php.

Notice that the default CRON time setting is to execute every day at midnight. For most this is perfectly acceptable and makes it easier to create this task. The only thing left to do is enter the command.

CRON Command: <= change the path!

Change the path to the correct location! On Linux-based servers it will ALWAYS start with a forward slash.
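
For reference, "every day at midnight" corresponds to the schedule fields 0 0 * * * in standard crontab syntax. The Python sketch below composes such a line and rejects paths that do not start with a forward slash. It is only an illustration: the interpreter name php and any path you pass in are assumptions, not the command supplied with this contribution, and the function name crontab_line is hypothetical.

```python
import posixpath


def crontab_line(script_path, interpreter="php"):
    """Compose a crontab entry that runs a script daily at midnight.

    `interpreter` is an assumed command name; hosts often require a
    full path to the PHP binary instead.
    """
    if not posixpath.isabs(script_path):
        raise ValueError("path must be absolute (start with a forward slash)")
    # minute=0 hour=0 day-of-month=* month=* day-of-week=*
    return f"0 0 * * * {interpreter} {script_path}"
```

Passing a relative path such as googlesitemap/index.php raises ValueError, which mirrors the rule that the path must start with a forward slash.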

A nice feature of CRON is that it will automatically email you the results of the task. I recommend receiving at least the first one, as it has the submission link as part of the output! So, enter a valid email address in the field and wait for the email!

After you get the first email and click the submit link you are all done! The system will update itself every day at midnight and Google will use the sitemap files to crawl your site.

Enjoy!


Support

Support thread

Donations

This contribution was coded for the benefit of the community. To help support my coding efforts consider a donation...


Documentation generated on Sat, 4 Jun 2005 23:45:56 -0400 by phpDocumentor 1.3.0RC3