
spidy's Introduction

Hi 👋 - Glad to see you here!

I'm a Golang Software Engineer, passionate about developing robust programs and creating elegant, scalable APIs.

As an open-source enthusiast, I believe that writing clean and well-documented code is the foundation for maintainability.

Feel free to reach out to me at [email protected]. I look forward to hearing from you!

Languages and Tools:

Go Svelte JavaScript Shell Script Linux Docker Git

spidy's People

Contributors

twiny


spidy's Issues

Document how -u can accept multiple urls

Is your feature request related to a problem? Please describe.
I am not sure whether it's -u url1.com url2.com, -u url1.com,url2.com, or -u url1.com -u url2.com.

I'm sorry, I'm not familiar enough with Go yet to figure it out. But if you explain, I'll submit a pull request to update the README.
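
For context on why the three forms can behave differently, here is a minimal sketch of how a Go CLI built on the standard flag package could accept several URLs. It is not spidy's actual code, and it does not say which form spidy accepts: the urlList type and the parseURLs helper are hypothetical names introduced only for this illustration.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// urlList is a hypothetical flag.Value (not taken from spidy) that collects
// URLs from repeated flags and from comma-separated values.
type urlList []string

// String renders the collected URLs; it is required by flag.Value.
func (u *urlList) String() string {
	if u == nil {
		return ""
	}
	return strings.Join(*u, ",")
}

// Set is called once per occurrence of the flag on the command line.
func (u *urlList) Set(value string) error {
	for _, part := range strings.Split(value, ",") {
		if part = strings.TrimSpace(part); part != "" {
			*u = append(*u, part)
		}
	}
	return nil
}

// parseURLs uses a private FlagSet so it can run without touching os.Args.
func parseURLs(args []string) ([]string, error) {
	var urls urlList
	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	fs.Var(&urls, "u", "URL to crawl; repeat the flag or separate values with commas")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return urls, nil
}

func main() {
	urls, err := parseURLs([]string{"-u", "url1.com,url2.com", "-u", "url3.com"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(urls)
}
```

With this sketch, -u url1.com,url2.com and -u url1.com -u url2.com both work. The space-separated form -u url1.com url2.com would not, because the standard flag package stops parsing flags at the first non-flag argument and treats url2.com as a positional argument.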

Returns .tv domains as available even though they are not

Describe the bug
The crawler marks domains such as justin.tv (the old twitch.tv) as available in the results .csv file even though they are not.

To Reproduce

  1. Install
  2. Run on any website
  3. Wait for the results

Expected behavior
.tv domains are correctly categorized.


Desktop (please complete the following information):
Tested on Windows 10, Linux, and alpinejs.

panic: send on closed channel

Encountered this error with go version go1.15.6 linux/amd64 on Ubuntu 20.04.1 LTS and the following command line:

curl -v -silent -l https://twitter.com --stderr - | awk '/^content-security-policy:/' | grep -Eo "[a-zA-Z0-9./?=_-]*" | sed -e '/\./!d' -e '/[^A-Za-z0-9._-]/d' -e 's/^\.//' | sort -u | httpx -o spidy/input.txt && spidy -config spidy/setting.yaml

Welcome, Spidy is running.
false schema.org
false googletagmanager.com
false giphy.com
false facebook.net
false google-analytics.com
false vine.co
false twitter.com
false t.co
false doubleclick.net
false vineapp.com
false nytimes.com
false twimg.com
false pscp.tv
false google.com
false apple.com
false bit.ly
false happs.tv
false n.tw
false jquery.com
false wordpress.com
false pocoo.org
false apache.org
false jquery.org
false gruntjs.com
false emberjs.com
false github.com
false mattt.me
false bower.io
false stylus-lang.com
false ogp.me
false ethicspointvp.com
false jamsadr.com
false dataprotection.ie
false privacyshield.gov
panic: send on closed channel

goroutine 172 [running]:
github.com/superiss/spidy/crawler.(*Spider).extract(0xc000276600, 0xc000586000, 0x3b7d, 0x3e00)
        /root/go/pkg/mod/github.com/superiss/[email protected]/crawler/crawler.go:268 +0x105
github.com/superiss/spidy/crawler.(*Spider).Run.func4(0xc000276600, 0xc00006cea0)
        /root/go/pkg/mod/github.com/superiss/[email protected]/crawler/crawler.go:352 +0x71
created by github.com/superiss/spidy/crawler.(*Spider).Run
        /root/go/pkg/mod/github.com/superiss/[email protected]/crawler/crawler.go:350 +0x21b

setting.yaml:

Engine:
  worker: 100
  parallel: 10
  depth: 10
  urls: ./spidy/input.txt
  proxies: []
  #tlds: [com, net, io, co, ly, me, us, at, st, so]
  tlds: []
  random_delay: 5s
  timeout: 30s

httpx can be found here: https://github.com/projectdiscovery/httpx
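
The Go runtime raises "send on closed channel" when a goroutine sends on a channel that has already been closed. The trace above places the send in (*Spider).extract, running in a goroutine started from Run. The crawler's source is not shown here, so the following is only a general sketch of one common way to avoid this class of panic: close the channel exactly once, and only after every sender has finished. The extractAll function and its input data are hypothetical and are not part of spidy.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// extractAll (a hypothetical helper) starts one goroutine per page and
// collects every domain they send. The results channel is closed exactly
// once, and only after all senders have returned, so no goroutine can send
// on a closed channel.
func extractAll(pages [][]string) []string {
	results := make(chan string)
	var wg sync.WaitGroup

	for _, page := range pages {
		wg.Add(1)
		go func(domains []string) {
			defer wg.Done()
			for _, d := range domains {
				results <- d
			}
		}(page)
	}

	// Closer goroutine: waits for every sender, then closes the channel.
	go func() {
		wg.Wait()
		close(results)
	}()

	// The receiver drains the channel until it is closed.
	var out []string
	for d := range results {
		out = append(out, d)
	}
	sort.Strings(out)
	return out
}

func main() {
	pages := [][]string{
		{"schema.org", "t.co"},
		{"github.com"},
	}
	fmt.Println(extractAll(pages))
}
```

The design choice is that the receiving side never closes the channel. Ownership of close stays with the code that can prove all senders are done, here through the sync.WaitGroup.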
