jurl's People

Contributors

anthonynsimon, codacy-badger

jurl's Issues

Dropping trailing question mark

Possibly related to #2: I have some URLs that do not round-trip, e.g. when the URL ends with a trailing ?

scala> import com.anthonynsimon.url.URL
scala> URL.parse("http://example.com/?")
res0: com.anthonynsimon.url.URL = http://example.com/
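
For comparison, here is a minimal sketch of how a parser can keep the trailing ? by distinguishing an absent query from an empty one. It uses the JDK's java.net.URI, not jurl, and the class and method names (TrailingQuestionMark, rebuild) are made up for illustration. java.net.URI reports a null query when there is no ? and an empty string when the ? is present with nothing after it, so a serializer that checks for null rather than for emptiness can reproduce the input.

```java
import java.net.URI;

public class TrailingQuestionMark {
    // Rebuilds an absolute hierarchical URI from its raw components.
    // Assumption: the input has a scheme and an authority (e.g. http://host/...).
    static String rebuild(String input) {
        URI u = URI.create(input);
        StringBuilder sb = new StringBuilder();
        sb.append(u.getScheme()).append("://").append(u.getRawAuthority());
        sb.append(u.getRawPath());
        // null means "no query at all"; "" means "a ? with nothing after it".
        if (u.getRawQuery() != null) {
            sb.append('?').append(u.getRawQuery());
        }
        if (u.getRawFragment() != null) {
            sb.append('#').append(u.getRawFragment());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"http://example.com/", "http://example.com/?"}) {
            System.out.println(s + " -> " + rebuild(s));
        }
    }
}
```

Whether jurl should model the query this way is a design decision for the maintainer; the sketch only shows that the information needed for a round trip is the null versus empty distinction.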

Build src and doc jar

I'm not sure if this is because of jitpack.io, but I can't download the sources jar for jurl.
This makes debugging and understanding the library more difficult.

Maybe only an option is missing in the Gradle build file.

Thanks for the cool library!

Slow parsing

Hi,

How did you get "4+ million URLs per second"?
I tried parsing 10,000,000 URLs and it took 75,416 ms:

final int count = 10000000;
final List<String> urls = new ArrayList<>(count);
for (int i = 0; i < count; i++) {
  urls.add("http://user@domain" + i + ".com:12345/a/great/path/?with=query&unicode_parameter=😊&nothing#cool");
}

long start = System.currentTimeMillis();
for (String url : urls) {
  com.anthonynsimon.url.URL.parse(url);
}
System.out.println("time: " + (System.currentTimeMillis() - start) + " ms");
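
For scale, 10,000,000 URLs in 75,416 ms is roughly 132,600 URLs per second. A single cold pass like the one above also includes JIT warm-up, so it may understate steady-state throughput; whether that accounts for the gap to the advertised figure is not established here. The following is a rough sketch of a measurement with warm-up rounds, not a substitute for a proper harness such as JMH. ParseThroughput, timeOnce and urlsPerSecond are hypothetical names, and java.net.URI.create stands in for the parser so the example runs without jurl on the classpath.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class ParseThroughput {
    // Parses every input once and returns the elapsed time in nanoseconds.
    static long timeOnce(List<String> urls, Function<String, ?> parser) {
        long sink = 0;
        long start = System.nanoTime();
        for (String url : urls) {
            // Use the result so the JIT cannot discard the parse call.
            sink += parser.apply(url).hashCode();
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) System.out.print(""); // keeps sink observably live
        return elapsed;
    }

    // Runs warm-up rounds, then returns URLs per second for the fastest measured round.
    static double urlsPerSecond(List<String> urls, Function<String, ?> parser,
                                int warmups, int rounds) {
        for (int i = 0; i < warmups; i++) timeOnce(urls, parser);
        long best = Long.MAX_VALUE;
        for (int i = 0; i < rounds; i++) best = Math.min(best, timeOnce(urls, parser));
        return urls.size() / (Math.max(best, 1) / 1e9);
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            urls.add("http://user@domain" + i + ".com:12345/a/great/path/?with=query&nothing#cool");
        }
        // Stand-in parser; to measure jurl, pass a lambda that calls
        // com.anthonynsimon.url.URL.parse and handles its exceptions.
        System.out.printf("%.0f URLs/s%n", urlsPerSecond(urls, URI::create, 3, 5));
    }
}
```

Reporting the fastest round reduces noise from garbage collection pauses, at the cost of being optimistic; averaging the rounds is an equally reasonable choice.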

Library doesn't parse port

Your parser is the best solution I've found, but it does not parse the port. Is there any reason why you didn't implement port parsing?

.toString() bugs

The string serialization seems incorrect. Maybe not all URL segments are properly URL-encoded?

Also, UTF-8 is assumed in URL-encoded parts, but my corpus from the web contains some Latin-1 encoded examples. I'm not sure what the standard says, but Chrome recognizes them.

import com.anthonynsimon.url.URL;

public class JurlTest {
  public static void main(String[] args) {
    test("http://abc.net/1160x%3E/quality/");
    test("http://db-engines.com/en/system/PostgreSQL%3BRocksDB");
    test("http://xzy.org/test/hei%DFfl"); // latin1
    test("http://www.net/decom/category/AA/A_%26_BBB/AAA_%26_BBB/"); // !!!
    test("https://en.wikipedia.org/wiki/Eat_one%27s_own_dog_food");
  }

  private static void test(String url) {
    try {
      Thread.sleep(10);
    } catch (InterruptedException e) {
    }
    try {
      URL parse = URL.parse(url);

      if(!parse.toString().equals(url)) {
        System.out.print("NOT EQUAL: ");
        System.out.println(url);
        System.out.println(parse.toString());
        System.out.println();
      }

    } catch (Exception e) {
      System.out.print("KAPUTT: ");
      System.out.println(url);
      e.printStackTrace();
      System.out.println();
    }
  }
}

This results in:

NOT EQUAL: http://abc.net/1160x%3E/quality/
http://abc.net/1160x>/quality/

NOT EQUAL: http://db-engines.com/en/system/PostgreSQL%3BRocksDB
http://db-engines.com/en/system/PostgreSQL;RocksDB

java.lang.StringIndexOutOfBoundsException: String index out of range: 15
	at java.lang.String.substring(String.java:1963)
	at com.anthonynsimon.url.PercentEncoder.decode(PercentEncoder.java:181)
	at com.anthonynsimon.url.DefaultURLParser.parse(DefaultURLParser.java:85)
	at com.anthonynsimon.url.URL.parse(URL.java:73)
	at JurlTest.test(JurlTest.java:20)
	at JurlTest.main(JurlTest.java:9)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
KAPUTT: http://xzy.org/test/hei%DFfl

NOT EQUAL: http://www.net/decom/category/AA/A_%26_BBB/AAA_%26_BBB/
http://www.net/decom/category/AA/A_&_BBB/AAA_&_BBB/

NOT EQUAL: https://en.wikipedia.org/wiki/Eat_one%27s_own_dog_food
https://en.wikipedia.org/wiki/Eat_one's_own_dog_food
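
The %DF case can be handled without an exception if escapes are first turned into raw bytes and only then interpreted as text. Below is a minimal sketch of that idea, assuming a strict UTF-8 attempt with a Latin-1 fallback; it is not jurl's PercentEncoder, and LenientPercentDecoder, toBytes and decode are hypothetical names. The fallback is a heuristic: a string that mixes literal non-ASCII characters with Latin-1 escapes would be garbled by it.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class LenientPercentDecoder {
    // Converts %XX escapes to raw bytes; every other character is written as UTF-8.
    // A '%' that is not followed by two hex digits is kept as a literal '%'.
    static byte[] toBytes(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            if (s.charAt(i) == '%' && i + 2 < s.length()
                    && Character.digit(s.charAt(i + 1), 16) >= 0
                    && Character.digit(s.charAt(i + 2), 16) >= 0) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 3;
            } else {
                int cp = s.codePointAt(i);
                byte[] raw = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                out.write(raw, 0, raw.length);
                i += Character.charCount(cp);
            }
        }
        return out.toByteArray();
    }

    // Tries strict UTF-8 first; on malformed input falls back to ISO-8859-1 (Latin-1).
    static String decode(String s) {
        byte[] bytes = toBytes(s);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("/test/hei%DFfl"));      // Latin-1 escape
        System.out.println(decode("/wiki/Eat_one%27s"));   // plain ASCII escape
    }
}
```

Decoding alone does not fix the NOT EQUAL cases: once %26 or %3B has been decoded to & or ; the original spelling is gone. For toString() to reproduce the input, a parser would presumably need to keep the raw encoded form alongside the decoded one, or re-encode using the same rules the input followed.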
