sitemapgen4j's Issues

URL using non ASCII characters [utf-8 support]

When I use a custom URL containing Arabic or Chinese characters, they are written as '?'.
I'm using version 1.0.1 on Windows 8.

Here is the fix:

Line 245 [SitemapGenerator.class]
            if (gzip) {
                FileOutputStream fileStream = new FileOutputStream(outFile);
                GZIPOutputStream gzipStream = new GZIPOutputStream(fileStream);
                out = new OutputStreamWriter(gzipStream, Charset.forName("UTF-8").newEncoder());
            } else {
                out = new OutputStreamWriter(
                        new FileOutputStream(outFile),
                        Charset.forName("UTF-8").newEncoder());
            }

Original issue reported on code.google.com by [email protected] on 17 Jul 2013 at 3:11
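
The fix above comes down to never relying on the platform default charset. A minimal, JDK-only sketch of the same idea follows; the class and method names (Utf8WriterSketch, openUtf8Writer, roundTrip) are hypothetical and not part of sitemapgen4j, and an in-memory buffer stands in for the output file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of sitemapgen4j.
class Utf8WriterSketch {

    // Wraps the target stream in a writer that always encodes as UTF-8,
    // optionally gzip-compressing the encoded bytes.
    static Writer openUtf8Writer(OutputStream target, boolean gzip) throws IOException {
        OutputStream stream = gzip ? new GZIPOutputStream(target) : target;
        return new OutputStreamWriter(stream, StandardCharsets.UTF_8);
    }

    // Writes the text through openUtf8Writer and reads it back again,
    // so the round trip can be compared with the input.
    static String roundTrip(String text, boolean gzip) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (Writer out = openUtf8Writer(buffer, gzip)) {
                out.write(text);
            }
            InputStream in = new ByteArrayInputStream(buffer.toByteArray());
            if (gzip) {
                in = new GZIPInputStream(in);
            }
            StringBuilder sb = new StringBuilder();
            try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
                int c;
                while ((c = reader.read()) != -1) {
                    sb.append((char) c);
                }
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Arabic and Chinese path segments, written as escapes to keep the source ASCII.
        String url = "http://www.example.com/\u0645\u0631\u062D\u0628\u0627/\u4F60\u597D";
        System.out.println(url.equals(roundTrip(url, true)));
    }
}
```

With the default charset on some Windows setups, the same characters would be replaced by '?' at encoding time, which matches the symptom reported above.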

wsg.writeSitemapsWithIndex()

This may not be a bug/issue/problem. 

I sought to generate a site map-with-index per the example in the documentation

Example code from documentation:
WebSitemapGenerator wsg = new WebSitemapGenerator("http://www.example.com", 
myDir);
for (int i = 0; i < 60000; i++) 
wsg.addUrl("http://www.example.com/doc"+i+".html");
wsg.write();
wsg.writeSitemapsWithIndex(); // generate the sitemap_index.xml


What steps will reproduce the problem?
1. Instantiate WebSitemapGenerator
2. add urls
3. call wsg.write()
4. call wsg.writeSitemapsWithIndex()

Expected Result
===================
Code generates sitemap.xml and site map index xml file.


Actual Results
=====================

Exception while exeecuting task 'Generate Site Map' (999) 
java.lang.RuntimeException: No URLs added, sitemap index would be empty; you 
must add some URLs with addUrls


Issue: 
-I don't see a place where the api requires/allows you to 'add a url' between 
the write() and the wsg.writeSitemapsWithIndex();


-I have a very small data set: one url. This does not seem relevant to the 
problem.

Version
===========
Version 1.0.1




Appendix A. Code
=====================

         WebSitemapGenerator wsg = new WebSitemapGenerator(BASE_URL, myDir);

         for (Post post : posts)
         {
            ChangeFreq changeFreq = ChangeFreq.HOURLY;
            WebSitemapUrl url = new WebSitemapUrl.Options(BASE_URL + ELEMENT_URL + post.getID())
                    .lastMod(post.getThreadActivityDate())
                    .priority(1.0)
                    .changeFreq(changeFreq).build();

            wsg.addUrl(url);

         }
         wsg.write();
         wsg.writeSitemapsWithIndex();

Original issue reported on code.google.com by [email protected] on 13 Jul 2010 at 12:46

AbstractSitemapUrlRenderer should take care of escaping entities & and < in URLs

What steps will reproduce the problem?
1. add a URL containing a & in path, e.g. http://www.domain.com/user/me&you/
2. generate the sitemap
3. ampersand is not correctly encoded for XML

What is the expected output? What do you see instead?
ampersand should be encoded for XML:
http://www.domain.com/user/me&amp;you/

What version of the product are you using? On what operating system?
1.0.1 on win xp


Please provide any additional information below.
Both & and < are valid characters of a URL, but not in XML
see URL RFC (3.3 / 3.4): http://www.ietf.org/rfc/rfc2396.txt
and XML Spec (2.4): http://www.w3.org/TR/REC-xml/


Original issue reported on code.google.com by [email protected] on 9 May 2009 at 4:47
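
The escaping asked for above can be sketched in a few lines of plain Java. This is only an illustration: XmlEscapeSketch and escapeXml are hypothetical names, not the library's renderer, and the sketch assumes the input URL has not already been entity-escaped (otherwise "&amp;" would be escaped a second time).

```java
// Hypothetical helper for illustration; not part of sitemapgen4j.
class XmlEscapeSketch {

    // Replaces the five characters that XML reserves with their predefined entities.
    static String escapeXml(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String url = "http://www.domain.com/user/me&you/";
        System.out.println("<loc>" + escapeXml(url) + "</loc>");
    }
}
```

Escaping at render time, rather than asking callers to pre-escape, keeps the URL objects themselves in their plain form.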

Missing xmlns:xsi and xsi:schemaLocation namespaces

Please change in:
com.redfin.sitemapgenerator.SitemapGenerator 

in method:
private void writeSiteMap(OutputStreamWriter out) throws IOException 

out.write("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\" ");

TO:

out.write("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\" " +
                    "      xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"" +
                    "      xsi:schemaLocation=\"http://www.sitemaps.org/schemas/sitemap/0.9" +
                    "            http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd\"");

Original issue reported on code.google.com by [email protected] on 15 Mar 2010 at 4:27

Missing IMAGE:IMAGE tag

What steps will reproduce the problem?
there is no image:image tag

What is the expected output? What do you see instead?
<url>   
  <loc>...</loc>
  <image:image>
    <image:loc>...</image:loc>
    <image:caption>...</image:caption>
    <image:title>...</image:title>
  </image:image>
  <image:image>
    ...
  </image:image>
  ...

  <lastmod>...</lastmod>
  <priority>...</priority>
</url>

What version of the product are you using? On what operating system?
1.0.1, Linux

Please provide any additional information below.
I would like to add image URLs to a URL entry, as described here:
http://support.google.com/webmasters/bin/answer.py?hl=en&answer=178636

Original issue reported on code.google.com by [email protected] on 2 Jul 2012 at 9:35

UrlUtils.checkUrl as option

1. Generate a sitemap with URLs that differ from the main domain

Because we use subdomains and domain aliases, our URLs do not always share the 
main domain.

Example:

www.mydomain.com

URL added:

www.aliasspecific.com/.....

Would it be possible to add a flag that makes generation skip the checkUrl 
step?

Thanks,

Jul



Original issue reported on code.google.com by [email protected] on 21 May 2012 at 9:38
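
One way the requested flag could look is sketched below. Everything here is an assumption for illustration: UrlCheckSketch, isAccepted, and the skipBaseUrlCheck parameter are hypothetical names and do not exist in sitemapgen4j 1.0.1. Note that the sitemaps.org protocol generally expects the URLs in a sitemap to live on the same host as the sitemap, so a flag like this would probably be best left off by default.

```java
// Hypothetical sketch of an opt-out for the base-URL check; not the library's code.
class UrlCheckSketch {

    // Returns true if the URL may be added: either the check is skipped,
    // or the URL starts with the configured base URL.
    static boolean isAccepted(String url, String baseUrl, boolean skipBaseUrlCheck) {
        return skipBaseUrlCheck || url.startsWith(baseUrl);
    }

    // Throws for rejected URLs, mirroring the kind of error the reporter describes.
    static void checkUrl(String url, String baseUrl, boolean skipBaseUrlCheck) {
        if (!isAccepted(url, baseUrl, skipBaseUrlCheck)) {
            throw new IllegalArgumentException(
                    "Url " + url + " doesn't start with base URL " + baseUrl);
        }
    }

    public static void main(String[] args) {
        String base = "http://www.mydomain.com";
        String alias = "http://www.aliasspecific.com/page.html";
        System.out.println(isAccepted(alias, base, false));
        System.out.println(isAccepted(alias, base, true));
    }
}
```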

Make This App Engine Friendly

What steps will reproduce the problem?
1. Try using this on App Engine. 

What is the expected output? What do you see instead?

Expected: it works. Actual: it doesn't, due to the restrictions on java.io and file system access.

What version of the product are you using? On what operating system?

1.0, Google App Engine

Please provide any additional information below.

I think it should be easy enough to add a method that instead of trying to 
write to a file gives you the output as a blob or some kind of input stream so 
you can then write the output to the datastore or even memory to be sent to the 
crawler upon request. Thanks!

Original issue reported on code.google.com by [email protected] on 14 Dec 2010 at 7:55

Usage of WebSitemapUrl.class cannot be compiled with Java 1.7.0_25

What steps will reproduce the problem?
1. Reference WebSitemapUrl.class from a class that is not in package 
com.redfin.sitemapgenerator. Our use case is a test with Mockito using code 
similar to that below:

     WebSitemapGenerator mock = Mockito.mock(WebSitemapGenerator.class);
     ...
     Mockito.verify(mock, Mockito.times(1)).addUrl(Mockito.any(WebSitemapUrl.class));

2. Compile with Java 1.7.0_25
3. Get error: ISitemapUrl is not public in com.redfin.sitemapgenerator; cannot 
be accessed from outside package

What is the expected output? What do you see instead?
It should compile, but instead it fails because ISitemapUrl is not declared 
public

What version of the product are you using? On what operating system?
1.0.1

Please provide any additional information below.


Original issue reported on code.google.com by [email protected] on 16 Jul 2013 at 5:24

Ability to change output to other outputstreams, strings, etc.

Hi,

Very nice, clean project.  However, I have a feature request:
 - The ability to send the output somewhere other than a file.

I'd like to be able to write to various implementations of outputstreams / 
writers as opposed to be limited to file only.  Specifically in mind, I could 
write to ServletOutputStreams, StringWriters, etc.

Thanks

Original issue reported on code.google.com by [email protected] on 27 Mar 2010 at 3:11
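
A rough sketch of what writer-based output could look like, using only the JDK. The names SitemapToWriterSketch, writeSitemap, and toXmlString are hypothetical; this is not how sitemapgen4j is implemented, and it writes only the <loc> element for each URL.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not part of sitemapgen4j.
class SitemapToWriterSketch {

    // Writes a minimal sitemap to any Writer (StringWriter, a servlet
    // response writer, an OutputStreamWriter around any stream, ...).
    static void writeSitemap(List<String> urls, Writer out) throws IOException {
        out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        out.write("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : urls) {
            // Escape '&' first so the entities produced for '<' are not re-escaped.
            String escaped = url.replace("&", "&amp;").replace("<", "&lt;");
            out.write("  <url><loc>" + escaped + "</loc></url>\n");
        }
        out.write("</urlset>\n");
        out.flush();
    }

    // Convenience wrapper that renders the sitemap into a String.
    static String toXmlString(List<String> urls) {
        StringWriter buffer = new StringWriter();
        try {
            writeSitemap(urls, buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(toXmlString(Arrays.asList(
                "http://www.example.com/doc1.html",
                "http://www.example.com/search?a=1&b=2")));
    }
}
```

Taking a Writer as the parameter leaves the choice of destination (file, memory, HTTP response) to the caller, which also covers environments without file system access.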

Support for mixed video and image sitemaps?

Hi,

Do you plan on supporting mixed Google Image and Video sitemaps as specified in 
https://support.google.com/webmasters/answer/183668?hl=en

I was hoping I could extend GoogleVideoSitemapGenerator and 
GoogleVideoSitemapUrl myself to do this, but unfortunately 
AbstractSitemapGeneratorOptions isn't visible.

Cheers,

Mark

Original issue reported on code.google.com by [email protected] on 1 Aug 2014 at 10:39

Allow WebSitemapGenerator to use Commons-VFS

Is it possible for the WebSitemapGenerator builder to have a method  where you 
specify a Commons VFS manager to store the sitemap on different filesystems 
(local, remote, RAM, FTP, Samba server, Amazon S3, etc.)

Original issue reported on code.google.com by [email protected] on 29 Oct 2014 at 12:12

Google-News-Sitemap xmltags missing

It's currently not possible to create a complete
Google News sitemap with this tool.
Important tags such as "title" and "publication" are missing.

Original issue reported on code.google.com by [email protected] on 27 Jan 2010 at 2:51

sitemap index limit error

I found that the program limits SitemapIndexGenerator.maxUrls to no more than 
1000 (MAX_SITEMAPS_PER_INDEX), but as far as I know, Google does not impose this limit.


Original issue reported on code.google.com by [email protected] on 20 Sep 2012 at 4:58

GZip and AutoValidate builder options are not compatible

What steps will reproduce the problem?
1. Use the builder
2. Set GZIP=true
3. Set autoValidate=true

What is the expected output? What do you see instead?

Expected the output to validate despite being gzipped.
Actually, it fails trying to read GZIP as XML.

What version of the product are you using? On what operating system?

1.0.1 under Ubuntu.

Please provide any additional information below.


Original issue reported on code.google.com by [email protected] on 1 Mar 2012 at 1:43
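
The failure described above suggests the validator is handed the compressed bytes directly. A JDK-only sketch of decompressing before parsing is shown below. The names GzipXmlCheckSketch, gzip, and isWellFormed are hypothetical, and the check here is only for well-formedness, not validation against the sitemap XSD that the library's autoValidate option is meant to perform.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical sketch; not part of sitemapgen4j.
class GzipXmlCheckSketch {

    // Compresses text as UTF-8 bytes, standing in for a gzipped sitemap file.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Returns true if the data parses as well-formed XML. When 'gzipped' is
    // true, the bytes are decompressed before they reach the parser.
    static boolean isWellFormed(byte[] data, boolean gzipped) {
        try (InputStream raw = new ByteArrayInputStream(data);
             InputStream in = gzipped ? new GZIPInputStream(raw) : raw) {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newDocumentBuilder().parse(in);
            return true;
        } catch (IOException | SAXException | ParserConfigurationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = gzip("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">"
                + "<url><loc>http://www.example.com/</loc></url></urlset>");
        System.out.println(isWellFormed(data, true));   // decompressed first
        System.out.println(isWellFormed(data, false));  // raw gzip bytes fed to the parser
    }
}
```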

Cannot add URL which does not start with base URL

What steps will reproduce the problem?

wsg.addUrl("http://www.facebook.com/[some FB user]");

What is the expected output? What do you see instead?

Stack trace:
        java.lang.RuntimeException: Url http://www.facebook.com/[some FB user] doesn't start with base URL http://www.[some host].com
    at com.redfin.sitemapgenerator.UrlUtils.checkUrl(UrlUtils.java:10)
    at com.redfin.sitemapgenerator.SitemapGenerator.addUrl(SitemapGenerator.java:59)
    at com.redfin.sitemapgenerator.SitemapGenerator.addUrl(SitemapGenerator.java:119)


What version of the product are you using? On what operating system?

sitemapgen4j-1.0.1.jar

OS: Windows 7


Please provide any additional information below.

I've never heard of a restriction where you are not allowed to include external 
links into your sitemap. In fact having external links is one of the criteria 
for Google's rating.

Original issue reported on code.google.com by [email protected] on 1 Aug 2012 at 8:16

jdom integration

Sample on how to generate code that can be and should be subject to 
namespace validation:

//important imports
import org.jdom.Attribute;
import org.jdom.Namespace;
import org.jdom.output.Format;
import org.jdom.output.XMLOutputter;
import org.jdom.Document;
import org.jdom.Element;

//code sample
Document document = new Document();

Namespace xmlns = 
Namespace.getNamespace("http://www.sitemaps.org/schemas/sitemap/0.9");
Namespace xsi = Namespace.getNamespace("xsi", 
"http://www.w3.org/2001/XMLSchema-instance");

Element rootElement = new Element("urlset", xmlns);
rootElement.addNamespaceDeclaration(xsi);
Attribute schemaLocation = new Attribute("schemaLocation", 
"http://www.sitemaps.org/schemas/sitemap/0.9" +
" http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd", xsi);
rootElement.setAttribute(schemaLocation);

document.setRootElement(rootElement);

//begin url harvesting
Element url = new Element("url", xmlns);

Element loc = new Element("loc", xmlns);
Element lastmod = new Element("lastmod", xmlns);
Element changefreq = new Element("changefreq", xmlns);
Element priority = new Element("priority", xmlns);

loc.addContent("");
lastmod.addContent("");
changefreq.addContent("");
priority.addContent("");

url.addContent(loc);
url.addContent(lastmod);
url.addContent(changefreq);
url.addContent(priority);

rootElement.addContent(url);
//end loop

createSitemapFile(document);

Original issue reported on code.google.com by [email protected] on 16 Mar 2010 at 2:17

GoogleVideoSitemapUrl.Options misses the field <video:expiration_date>

http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=80472
describes an optional field <video:expiration_date> for a date after which 
the video will no longer be available.

GoogleVideoSitemapUrl.Options should have a method 
expirationDate(java.util.Date expirationDate) to set this field, and 
GoogleVideoSitemapGenerator should write this field to the video sitemap.


What version of the product are you using?
sitemapgen4j-1.0.1

Original issue reported on code.google.com by [email protected] on 1 Sep 2009 at 10:56
