generate_valid_invalid_urls.py
is a Python script for generating valid URLs by assembling URL components such as scheme, authority, path, query, and fragment, in accordance with RFC 3986 standards. It also includes a method for generating invalid URLs.
- Functions are designed to generate different types of characters needed for URLs, including percent-encoded characters, general delimiters, sub-delimiters, reserved, and unreserved characters.
- Random URL schemes are generated, which are crucial for defining the protocol to be used.
- Constructs the authority part, including user information, host (IPv4, IPv6, future IP versions), and port, essential for locating the resource.
- Generates paths in various forms, pivotal for specifying the resource location on the server.
- Supports the creation of query and fragment components, used for providing additional data and navigating within a page.
- Percent-encoding: Essential for encoding characters that may not be safely sent in URLs.
- Scheme: Defines the protocol used for the URL, such as HTTP or HTTPS.
- Authority: Specifies the domain name or IP address and port number, indicating the server.
- Path: Identifies the specific resource within the server.
- Query: Provides additional information to the server for resource retrieval.
- Fragment: Allows direct access to a subsection of the resource.
- Installation: No external dependencies required.
- Execution:
python generate_valid_invalid_urls.py
- Output: Prints 100 valid URLs to the console.
- Adjust the
N
variable to change the number of URLs generated. - Modify maximum lengths for different components in their respective functions.
Contributions are welcome for improvements or new features.
Open-sourced under the MIT License.