J!Extensions Store™
TOPIC: Large sitemaps management
#104
John Dagelmore
Admin
Posts: 3716
Large sitemaps management Karma: 79  
This thread points out the limitations of generating sitemaps with tens of thousands of links.

Many users experience issues when generating very large sitemaps, perhaps because they expect this to be done in a snap.

A lot of trouble occurs when, for example, 50,000 records need to be processed for a single sitemap generation, mostly because of server settings.
The main issues that prevent a complete sitemap generation in these cases are:
- Server timeouts
- Memory limit

This should help you understand that generating a full sitemap for, say, 100,000 links is not viable, even with a good component like JSitemap, if server settings and resources are not adequate.
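
To illustrate why memory matters, here is a minimal Python sketch of splitting a large list of links into several smaller sitemap files, so that only one batch is held in memory at a time. It is not JSitemap's code; the function names `chunk` and `render_sitemap` and the example URLs are made up for this illustration. The 50,000 figure is the per-file URL limit of the sitemaps.org protocol.

```python
from xml.sax.saxutils import escape

MAX_LINKS_PER_FILE = 50000  # per-file URL limit of the sitemaps.org protocol


def chunk(urls, size):
    """Yield successive lists of at most `size` URLs from any iterable."""
    batch = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def render_sitemap(urls):
    """Render one sitemap file (as a string) for a single batch of URLs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        lines.append("  <url><loc>%s</loc></url>" % escape(url))
    lines.append("</urlset>")
    return "\n".join(lines)


# Hypothetical site with 120,000 links, produced lazily by a generator.
all_urls = ("https://example.com/item/%d" % i for i in range(120000))
files = [render_sitemap(batch) for batch in chunk(all_urls, MAX_LINKS_PER_FILE)]
```

With these assumed numbers the result is three files, the last one holding the remaining 20,000 links. Splitting helps with memory, but each batch still has to finish within the server timeout.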

For this reason, in all these cases we recommend contacting us to ask for a free trial and an on-site demo.

The same can occur when generating image sitemaps. For an image sitemap, a crawler needs to scan all the links on your site. This is a quick operation for 100 links, but for 1000 links it can require several minutes. On a server with a timeout set to 60 seconds, it won't be possible to generate a full image sitemap.
Again, contact us to test on site whether the right setup can be found for your server.
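
As a rough back-of-the-envelope check, the relation between crawl time and server timeout can be sketched in Python. The figure of 0.25 seconds per page is only an assumption chosen so that 1000 links take a few minutes; it is not a measurement, and real values depend on the site and the server.

```python
def total_crawl_seconds(page_count, seconds_per_page):
    """Total time needed to crawl `page_count` pages one after another."""
    return page_count * seconds_per_page


def pages_within_timeout(timeout_seconds, seconds_per_page):
    """How many pages one request can crawl before the server timeout hits."""
    return int(timeout_seconds // seconds_per_page)


SECONDS_PER_PAGE = 0.25  # assumed average, for illustration only

needed = total_crawl_seconds(1000, SECONDS_PER_PAGE)    # 250.0 seconds
reachable = pages_within_timeout(60, SECONDS_PER_PAGE)  # 240 pages
```

Under this assumption a 60 second timeout stops the crawler after about 240 of the 1000 pages, which is why the full image sitemap cannot be completed in a single run.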
 
#229
Duncan Shiell
Fresh Boarder
Posts: 3
Re:Large sitemaps management Karma: 2  
I'm struggling a bit with generating an image sitemap for my website.

I've no problem when I restrict the image sitemap to active files only. The problems occur when I ask the system to look at the archived files as well.

The greatest number of files that I have managed to process is just under 500 out of nearly 800 on site. I get the best result with a split map and the request limit set to 500, which takes about 15 minutes. Trying 750 runs for about 20 minutes and then times out. Changing the max number of links per file has no effect.

Is there any way of excluding specific Categories from the image sitemap? I don't need to track a few of my categories.

My next step will be to set some files temporarily to Trashed status as I suspect JSitemap doesn't look at trashed files.
 
#230
John Dagelmore
Admin
Posts: 3716
Re:Large sitemaps management Karma: 79  
Hi Duncan,

Explaining how the image sitemap works will probably help you understand better.
It acts as a crawler that scans the pages on your site, finding images embedded in the HTML code. It's the very same approach that a Google bot uses. For this reason the process may require several minutes if you have a lot of links, and if the server timeout is low you will get a script timeout.

The image sitemap has filters that let you specify which images are excluded or included. Category exclusion is another type of filter, available in the 'content' data source, and it excludes categories from all types of sitemaps.

Keep in mind that whether files are trashed or not should not affect the crawler, because it grabs the images on your pages, and if the path has no exclusions the image will be added to the sitemap.
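
The path exclusion logic can be pictured with this short Python sketch. It only illustrates the idea; the function `keep_image`, the wildcard pattern syntax and the example paths are assumptions, not the filter format used in the component settings.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def keep_image(image_url, excluded_patterns):
    """Return True when the image path matches none of the exclusion patterns."""
    path = urlparse(image_url).path
    return not any(fnmatchcase(path, pattern) for pattern in excluded_patterns)


# Hypothetical exclusion list: skip everything under /images/archive/.
excluded = ["/images/archive/*"]
candidates = [
    "https://example.com/images/archive/old.jpg",
    "https://example.com/images/news/new.jpg",
]
kept = [url for url in candidates if keep_image(url, excluded)]
```

Here only the image under `/images/news/` is added to the sitemap. The status of the article that contains the image plays no role, only the path of the image found on the page.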

Finally, we have started developing a new system that generates large image sitemaps while avoiding server timeouts, thanks to an incremental, step-by-step process. This feature will be available in the next version, 2.3.
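
The general idea of an incremental process can be sketched in Python as follows. This is only an illustration of the technique under simple assumptions and says nothing about how version 2.3 will implement it; `run_step`, the `state` dictionary and the batch size are hypothetical. Each call handles one small batch and records where it stopped, so every request stays short and the next one resumes from the saved offset.

```python
def run_step(pending, state, batch_size, process):
    """Process the next batch of `pending` items and update `state` in place.

    `state` remembers the offset reached so far, so the work can be spread
    over many short requests instead of one long one.
    """
    start = state.get("offset", 0)
    batch = pending[start:start + batch_size]
    results = state.setdefault("results", [])
    for item in batch:
        results.append(process(item))
    state["offset"] = start + len(batch)
    state["done"] = state["offset"] >= len(pending)
    return state


# Hypothetical run: 7 pages, 3 per step, with a stand-in for the real crawl.
pages = ["page%d" % i for i in range(7)]
state = {}
steps = 0
while not state.get("done"):
    run_step(pages, state, 3, str.upper)
    steps += 1
```

In this example the 7 pages are completed in 3 steps. In a real setup the state would have to be stored between requests, for example in the session or in a file.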

Thanks and best regards

John
 
#233
Duncan Shiell
Fresh Boarder
Posts: 3
Re:Large sitemaps management Karma: 2  
John

Thanks

I guess I should have read the manual a bit more before I posted.

After reading your reply, I have now studied the default Content source and can see that there are user parameters in there that I can tweak to get what I need.

I have now generated an Image Sitemap that gives me all that I require.

Duncan
 
#234
John Dagelmore
Admin
Posts: 3716
Re:Large sitemaps management Karma: 79  
That's great

So sometimes I'm useful
 