

jancurn

followers: 98, following: 10, repos: 18, gists: 8

Name: Jan Čurn

Type: User

Company: @apify

Bio: Founder and CEO of Apify - the web scraping and automation platform. PhD in AI. Y Combinator Fellow.

Twitter: jancurn

Location: Prague, Czech Republic

Blog: apify.com/jancurn

Jan Čurn's Projects

act-pdf-to-html

Converts PDF to HTML using the pdf2htmlex tool

act-probe-page-resources

Apify act to load web pages and analyze HTTP resources they request

actor-amazon-crawler

Amazon crawler. This configuration extracts items for the keywords you specify in the input and automatically extracts all pages for each keyword. You can specify multiple keywords in the input for a single run.

actor-analyze-domains

An Apify actor that crawls web pages from a list of provided domains and analyzes them. For example, it checks whether pages have an HTTPS version and saves their HTML content, screenshot, HTTP response headers, SSL certificate information, text body, outgoing links, emails, phone numbers, social handles and more.

actor-find-broken-links

Source code of an Apify actor that finds and reports broken links on a website. Unlike other SEO analysis tools, it also reports broken URL #fragments.

actor-metadata-extractor

An Apify actor that crawls a list of web pages and extracts various metadata from them.

actor-residential-proxy-probe

Probes Apify residential proxies and maintains a pool of proxies from specific ZIP codes or DMAs

actor-selenium-custom-firefox

Apify actor with a custom build of Firefox, instrumented using Selenium.

awesome-puppeteer

A curated list of awesome puppeteer resources.

awesome-web-scraping

List of libraries, tools and APIs for web scraping and data processing.

bson-ext

The C++ bson parser for the node.js mongodb driver.

iron-router

A client and server side router designed specifically for Meteor.

llama_index

LlamaIndex (GPT Index) is a project that provides a central interface to connect your LLMs with external data.

nekolikotazek

puppeteer

Headless Chrome Node API

stayinghomeclub

A list of all the companies working from home and events changed because of COVID-19

www

The mitmproxy website, https://mitmproxy.org/.

yclist

List and description of ycombinator companies
