Comments (20)
Hey @uBadRequest,
The "could not marshal json" errors might mean you are running out of memory. I can add HTTP request retries when fetching the Common Crawl index.
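A minimal sketch of what such retries could look like, assuming a simple doubling backoff. The names retry and errTemporary are hypothetical and not taken from gau; the failing function in main stands in for a real HTTP request.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTemporary is a placeholder error used only for this illustration.
var errTemporary = errors.New("temporary failure")

// retry calls fn up to attempts times and doubles the wait after each failure.
// It returns nil on the first success, or the last error wrapped with context.
func retry(attempts int, wait time.Duration, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(wait)
			wait *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// Stand-in for an HTTP request that fails twice before succeeding.
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTemporary
		}
		return nil
	})
	fmt.Println(calls, err) // prints "3 <nil>"
}
```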
from gau.
Can you share the command you were using to run this in parallel? It might be caused by a memory leak on my part; I'll investigate.
from gau.
I think I may have fixed the issue. Can you re-install this new version and test?
from gau.
Same for me, even with your last commit.
"could not unmarshal json from commoncrawl: invalid character '<' looking for beginning of value"
from gau.
That is an expected error when CommonCrawl rate-limits you. The fix was meant to help with the panic
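As an illustration of where this message comes from: Go's encoding/json reports exactly this error when the body starts with '<', which is what happens when a server answers with an HTML page instead of JSON. The helper name decodeIndex and the HTML body below are assumptions made for this sketch, not gau's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeIndex is a hypothetical helper that tries to parse a response body
// as a JSON array of objects.
func decodeIndex(body []byte) error {
	var entries []map[string]interface{}
	return json.Unmarshal(body, &entries)
}

func main() {
	// An HTML error page of the kind a rate-limiting server might return
	// in place of JSON (assumed body).
	err := decodeIndex([]byte("<html><body>Slow down</body></html>"))
	fmt.Println(err) // prints "invalid character '<' looking for beginning of value"
}
```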
from gau.
@lc
Can I disable CommonCrawl? Can I limit pages?
from gau.
@iTestAndroid, I thought about implementing this recently. I can add a flag that allows you to specify which ones you want to use
from gau.
Sure. I tried to remove CC because I only want OTX, but I think I broke some other parts of the code. A switch would help. Alternatively, when you hit the CC limit, it could continue instead of breaking.
from gau.
Hey @iTestAndroid,
If you only want URLs from OTX you can alternatively use this tool: http://github.com/lc/otxurls
I'll add limiting the fetchers to my todo list.
from gau.
Great, thanks.
Also I commented out CommonCrawl and I still got this:
"error in parsing JSON from alienvault: invalid character '<' looking for beginning of value"
and
"could not decoding response from wayback machine: net/http: request canceled (Client.Timeout exceeded while reading body)"
from gau.
@lc Am I missing something?
I tried otxurls and I still got this:
2020/04/23 18:00:54 Could not decode json: invalid character '<' looking for beginning of value
from gau.
Yeah, that happens when AlienVault's OTX does not respond with JSON. You might just be getting IP-blocked or rate-limited. Maybe try from another IP.
from gau.
@lc
But when I open the URL manually in my browser, I can clearly see the JSON data:
https://otx.alienvault.com/api/v1/indicators/hostname/google.com/url_list?limit=50&page=1
from gau.
Many concurrent requests might get blocked, though I'm not 100% sure. Either way, the error handling is working as intended.
from gau.
@lc
Can we slow it down so it works? One thread at a time or something?
from gau.
@iTestAndroid, it currently runs on one thread, but I could add an option to set the delay between the fetchers
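A rough sketch of what a delay between fetchers could look like, assuming the fetchers run one after another. runWithDelay and the printed provider names are hypothetical and only illustrate the idea.

```go
package main

import (
	"fmt"
	"time"
)

// runWithDelay runs each fetcher in order and pauses for delay between
// consecutive fetchers (no pause before the first or after the last).
func runWithDelay(fetchers []func(), delay time.Duration) {
	for i, fetch := range fetchers {
		if i > 0 {
			time.Sleep(delay)
		}
		fetch()
	}
}

func main() {
	// Placeholder fetchers; real ones would query each provider.
	fetchers := []func(){
		func() { fmt.Println("fetching from wayback") },
		func() { fmt.Println("fetching from otx") },
		func() { fmt.Println("fetching from commoncrawl") },
	}
	runWithDelay(fetchers, 500*time.Millisecond)
}
```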
from gau.
@lc
Sure, but I'm not 100% sure that is the problem with the feed. Also, if you read 200 pages and page 201 fails or returns invalid JSON, don't abort. Maybe handle the error and then return the list of all URLs the code was able to capture so far?
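The behaviour suggested here could be sketched roughly as follows: stop at the first bad page, but hand back everything collected up to that point together with the error. The names fetchPage, collect and errBadJSON are hypothetical, and the fake fetcher in main only simulates a failing page.

```go
package main

import (
	"errors"
	"fmt"
)

// fetchPage is a hypothetical function type returning the URLs on one page.
type fetchPage func(page int) ([]string, error)

// errBadJSON is a placeholder error used only for this illustration.
var errBadJSON = errors.New("invalid JSON")

// collect walks pages 1..maxPages. On the first failure it returns the URLs
// gathered so far along with an error naming the page that failed.
func collect(fetch fetchPage, maxPages int) ([]string, error) {
	var urls []string
	for page := 1; page <= maxPages; page++ {
		batch, err := fetch(page)
		if err != nil {
			return urls, fmt.Errorf("page %d: %w", page, err)
		}
		urls = append(urls, batch...)
	}
	return urls, nil
}

func main() {
	// Simulated provider: page 3 returns a broken response.
	fake := func(page int) ([]string, error) {
		if page == 3 {
			return nil, errBadJSON
		}
		return []string{fmt.Sprintf("https://example.com/p%d", page)}, nil
	}
	urls, err := collect(fake, 5)
	fmt.Println(len(urls), err) // prints "2 page 3: invalid JSON"
}
```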
from gau.
Hey @iTestAndroid @uBadRequest,
I just merged a great pull-request that should fix these issues.
Closing this now
from gau.
@iTestAndroid, I just added the -providers flag as well, so you can specify which providers you want to fetch URLs from.
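For readers curious how a flag like this might be handled, here is a minimal sketch using Go's standard flag package. It is not gau's actual implementation: parseProviders, the known map, the comma-separated format and the provider names are all assumptions for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// known lists provider names used for illustration only.
var known = map[string]bool{"wayback": true, "otx": true, "commoncrawl": true}

// parseProviders splits a comma-separated list, normalises each name,
// skips empty entries, and rejects names that are not in known.
func parseProviders(s string) ([]string, error) {
	var out []string
	for _, name := range strings.Split(s, ",") {
		name = strings.ToLower(strings.TrimSpace(name))
		if name == "" {
			continue
		}
		if !known[name] {
			return nil, fmt.Errorf("unknown provider %q", name)
		}
		out = append(out, name)
	}
	return out, nil
}

func main() {
	list := flag.String("providers", "wayback,otx,commoncrawl", "comma-separated providers to query")
	flag.Parse()

	providers, err := parseProviders(*list)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(providers)
}
```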
from gau.
All working well now. Thank you
from gau.