zalando-incubator / transformer Goto Github PK

View Code? Open in Web Editor NEW

99.0 12.0 13.0 4.19 MB

A tool to transform/convert web browser sessions (HAR files) into Locust load testing scenarios (locustfile).

Home Page: https://transformer.readthedocs.io/

License: MIT License

Makefile 0.86% Python 99.14%

testing locust load-testing

transformer's Introduction

Transformer

A command-line tool and Python library to convert web browser sessions (HAR files) into Locust load test scenarios ("locustfiles").

Use it to replay HAR files (storing recordings of interactions with your website) in load tests with Locust.

Installation

Install from PyPI:

pip install har-transformer

Install Locust to run your locustfiles:

pip install locust

Usage

Example HAR files are included in the examples/ directory, try them out.

Command-line

transformer my_har_files_directory/ >locustfile.py

Library

import transformer

with open("locustfile.py", "w") as f:
    transformer.dump(f, ["my_har_files_directory/"])

Documentation

Take a look at our documentation for more details, including how to generate HAR files, customize your scenarios, use or write plugins, etc.

Authors

Serhii Cherniavskyi — @scherniavsky
Thibaut Le Page — @thilp
Brian Maher — @bmaher
Oliwia Zaremba — @tortila

See also the list of contributors to this project.

License

This project is licensed under the MIT license — see the LICENSE.md file for details.

transformer's People

Contributors

Stargazers

Watchers

Forkers

jsabak jredrejo tevanraj ed00m pivotsecurity edwindj auready divinp svenskaspel laleye thilp garutilorenzo rharish101

transformer's Issues

Replace our homegrown plugin system with pluggy

Is your feature request related to a problem? Please describe.

As a maintainer of Transformer, the less code I have to maintain, the better. This is particularly true of code that is not immediately relevant to Transformer's main goal, like the "plumbing" part of its plugin system.

Take for example:

transformer.plugins.contracts,
transformer.plugins.resolve,
their unit tests.

Taken together, they make "just" around 400 lines of code, but that's 10% of all Transformer, and that's not counting the corresponding user documentation (no good tools to count that quickly).

Describe the solution you'd like

I would like this "plumbing" to be taken care of by a third-party library like pluggy (thank you @marcinzaremba for the recommendation), which has battle-tested its design at a much larger scale than Transformer. I don't want to solve the same problems in an unrelated project.

It doesn't have to be pluggy, but at least it shouldn't be done in Transformer (and it should be done well enough that not doing it in Transformer simplifies our life, not the opposite).

Describe alternatives you've considered

…

Additional context

We should decide before many users start writing their own Transformer plugins, so that we can have a simple deprecation process for the current plugin system. I don't think we want to be maintaining two plugin systems in parallel.

In the documentation, links to compare releases are broken

https://transformer.readthedocs.io/en/latest/Changelog.html contains "Diff" link for each release to compare the changes between current and previous release. The links look like this one: v1.1.2...v1.1.3 and redirect to GitHub. Because we don't track the releases (or any other meaningful tags) in GitHub, these links show a message "There isn’t anything to compare".

Transformer version
1.1.3 and older

To Reproduce

Open https://transformer.readthedocs.io/en/latest/Changelog.html
Click on any of the "Diff" links

Expected behaviour
Either make the links valid (e.g. by publishing a release to GitHub at the same time we publish it to PyPI) or get rid of the broken links altogether.

Screenshots

Configuration file

In order to:

be able to run Transformer as a script,
allow users to have a one-time setup of Transformer (in either script of library use case),

we should make Transformer read its configuration from a file.

It was discussed that the configuration file could be either declarative (e.g. YAML, TOML, JSON, ...) or an actual Python file.

ℹ️ Issue imported from TIP/docs#391.

Har files containing PATCH http requests make transformer crash

Describe the bug
har files containing patch requests fail

Transformer version
1.1.3

To Reproduce
Steps and input files to reproduce the behavior:

Create a har file with some web site using a REST api and update some field. REST uses patch http requests to do it. Attached prueba3.zip as an example.
Run Transformer with the same command line explained in the examples
See error

  File ".../python3.6/site-packages/transformer/request.py", line 150, in from_har_entry
    method=HttpMethod[request["method"]],
  File "/usr/lib/python3.6/enum.py", line 329, in __getitem__
    return cls._member_map_[name]
KeyError: 'PATCH'

Expected behavior
A python available for locust should be produced

Desktop (please complete the following information):

OS: Ubuntu
Version:18.04

Additional context
Add any other context about the problem here.

CLI usage

Expected Behavior

I want to use the tool directly from the command line to convert HAR files to Locust scenarios.
Using it as a Python module would also be fine (python3 -m transformer ...).

Actual Behavior

I have to write a small Python script to use the transformer library.

Expose all HAR "entry" fields in each corresponding Task2 instance

Add plugin to filter entries in HAR files

Save HAR from chrome will save all requests to HAR.
Sometimes locustfile is not necessary include static files requests(js, css) and cached requests, add plugin to allow user filter entries.

(Re)implement the count_reporter plugin

As part of TIP/docs#330 “Introduce generic multi-level nested scenarios in Transformer”, we are breaking our count_reporter plugin, which assumes a fixed, 2-level structure of nested scenarios (ScenarioGroup then Scenario). The next step is to fix count_reporter so that it works well with the new structure, where ScenarioGroup and Scenario are merged and can be arbitrarily nested.

ℹ️ Imported from TIP/docs#387.

Plugin hook for final text

In some cases it is useful to access the literal text that is about to be output by transformer, rather than accessing the AST.

This could of course be done in post-processing step, but integrating it all into transformer is nice for useability.

Some use cases:

Apply code formatting (e.g. Black)
~~Add any code that is separate from the actual requests (e.g. event handler registration this could be done using regular hooks, but it is complex)~~ (solved this case using OpaqueBlock)

I can probably implement this myself, as just another plugin hook, called with the complete text just before output. If it sounds good to you @thilp

Invalid locustfile created when transforming a HAR without an associated weight file

Describe the bug
Since the recommended way to use Transformer on the command-line is to redirect stdout to a file e.g. transformer example.har > locustfile.py, if during the execution of Transformer any log statements are printed then they will appear in the locustfile and cause a syntax error upon execution. An example of a scenario where this happens is if you transform a HAR file without providing a weight file.

Transformer version
1.2.6

To Reproduce

Prepare a HAR file but no .weight file
Run Transformer with the arguments some_file.har > locustfile.py
Open the locustfile.py
Notice how the first line will be an [INFO] log statement from Transformer warning about a missing a .weight file

Expected behaviour
I should be able to produce a valid locustfile without having to specify a weight file.

Honestly it's a little tricky to know what the best behaviour here should be since the INFO warning was valid and I, the user, explicitly redirected the stdout to the file.

Desktop (please complete the following information):

OS: MacOS
Version: 10.14.6

Add functional tests

With more and more moving bits, the project has reached a state in which we can no longer rely on the unit tests only. We should set up and gradually add functional tests, that will help us gain more confidence in changing existing features as well as adding new ones.

Impossible to use plugins following cli documentation

Describe the bug
According to https://github.com/zalando-incubator/Transformer/blob/master/docs/Using-plugins.rst, together with https://transformer.readthedocs.io/en/latest/Writing-plugins.html#name-resolution , when using the cli,
transformer -p mod.sub har/ >loc.py
should work if mod/sub.py and mod/__init__.py files exist.
However, it triggers this error
ERROR Failed loading plugins: No module named 'mod'

Transformer version
1.2.3

To Reproduce
Steps and input files to reproduce the behavior:

Anywhere in your system, creates a dummy.py plugin file
Run Transformer with the -p python_path argument
See error

Expected behavior
Transformer should be able to find the plugin.
This is an easy fix for the problem:

--- a/transformer/plugins/resolve.py
+++ b/transformer/plugins/resolve.py
@@ -1,6 +1,8 @@
 import importlib
 import inspect
 import logging
+import os
+import sys
 from types import ModuleType
 from typing import Iterator
 
@@ -27,6 +29,7 @@ def resolve(name: str) -> Iterator[Plugin]:
     :raise InvalidContractError: from load_load_plugins_from_module.
     :raise NoPluginError: from load_load_plugins_from_module.
     """
+    sys.path.append(os.getcwd())
     module = importlib.import_module(name)
 
     yield from load_plugins_from_module(module)

but I am not sure if the intention of the project is providing this feature or change the docs to force to use plugins only after having been installed using pip, thus available in the sys.path.
In case the above solution is acceptable, I can fill a new PR, or feel free to use it.

Desktop (please complete the following information):

OS: Ubuntu
Version: 18.04.02

Could not convert HAR using library

Describe the bug

I have placed HAR in examples folder of transformer folder.
Created py file with below:
`import transformer.transform as entry

with open("locustfile.py", "w") as f:
entry.dump(f, "[./examples/1.har]")`
and ran transformer library, below is the exception:
➜ transformer git:(master) ✗ python3.9 NaveenTest.py
Traceback (most recent call last):
File "/Users/nmandepudi/transformer/NaveenTest.py", line 4, in
transformer.dump(f, ["./examples/1.har"])
File "/Users/nmandepudi/transformer/transformer/transform.py", line 91, in dump
file.writelines(
File "/Users/nmandepudi/transformer/transformer/transform.py", line 135, in intersperse
yield next(it)
File "/Users/nmandepudi/transformer/transformer/transform.py", line 106, in _dump_as_lines
scenarios = [
File "/Users/nmandepudi/transformer/transformer/transform.py", line 107, in
Scenario.from_path(
File "/Users/nmandepudi/transformer/transformer/scenario.py", line 155, in from_path
if path.is_dir():
AttributeError: 'str' object has no attribute 'is_dir'

Transformer version
Output of the command transformer --version.
1.3.0
To Reproduce
Steps and input files to reproduce the behavior:

Create the file "..."
create py file with below:
import transformer.transform as entry

with open("locustfile.py", "w") as f:
entry.dump(f, "[./examples/1.har]")

Run Transformer with the arguments "..."
python3.9 <>
See error

Expected behavior
A clear and concise description of what you expected to happen.
HAR should be converted py file.

Screenshots
output on running the above py file:

➜ transformer git:(master) ✗ python3.9 NaveenTest.py
Traceback (most recent call last):
File "/Users/nmandepudi/transformer/NaveenTest.py", line 4, in
transformer.dump(f, ["./examples/1.har"])
File "/Users/nmandepudi/transformer/transformer/transform.py", line 91, in dump
file.writelines(
File "/Users/nmandepudi/transformer/transformer/transform.py", line 135, in intersperse
yield next(it)
File "/Users/nmandepudi/transformer/transformer/transform.py", line 106, in _dump_as_lines
scenarios = [
File "/Users/nmandepudi/transformer/transformer/transform.py", line 107, in
Scenario.from_path(
File "/Users/nmandepudi/transformer/transformer/scenario.py", line 155, in from_path
if path.is_dir():
AttributeError: 'str' object has no attribute 'is_dir'

If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

OS: [e.g. Fedora, MacOS] MacOS
Version: [e.g. 22]

Additional context
Add any other context about the problem here.

Enable making editable installations

Already fixed in #84 (adding this ticket to satisfy merge requirements)

Design better plugin contracts for the Syntax Tree Framework

Currently, in Transformer, a plugin is a function that expects and returns a sequence of Task objects:

Plugin = Callable[[Sequence[Task]], Sequence[Task]]

Here is how it goes inside Transformer:

Task.from_requests: Task objects are created from the contents of HAR files.
Plugins application: Plugins are applied to these tasks, returning possibly changed Task objects (they can be identical to a pre-plugin Task object, or their request may have changed, or pre/post-processing code may have been added). These final tasks are stored as leaves of the Scenario objects tree.
Transformation into OpaqueBlocks via Task2: When serializing a Scenario tree via locust_taskset or locustfile, Task objects are converted into Task2 objects that encapsulate all "code strings" into OpaqueBlock objects.

So I see only three ways to use the syntax tree directly in plugins:

Change Task so that it can store OpaqueBlock nodes.
- But we don't really want to update and continue maintaining Task, we want it to go away.
Change the "plugin contract" and allow plugins to return Task2 objects (which can contain any syntax node).
- But Task2 doesn't currently support global code blocks (like Task does in a hacky way with global_code_blocks), and I don't think reproducing that Task feature into Task2 would be a good idea.
Add more "plugin contracts", for example:
- TaskPlugin = Callable[[Task2], Task2] for "stateless" plugins such as the header sanitizer;
- ScenarioPlugin = Callable[[Scenario], Scenario] for plugins that add new tasks inside existing tasks of a scenario, for all scenarios;
- PythonTreePlugin = Callable[[Program], Program], the most flexible, for plugins that need to change anything, and in particular globally-scoped setup/teardown code.

I'm in favor of approach 3 because the others rely on keeping intermediate solutions that we have an opportunity to clean up.

ℹ️ Issue imported from TIP/docs#398.

To-dos

CONTRIBUTING.md updated
CONTRIBUTORS.md updated with names of external contributors
CODEOWNERS updated with usernames of who review which PRs
MAINTAINERS updated with team member contact info
CODE_OF_CONDUCT.md reviewed
SECURITY.md reviewed
Pull request template reviewed
Issue template reviewed
Readme updated

Getting error while running the HAR file using CLI

Describe the bug

I am getting below error so code is not able to convert into Locust python file
WARNING while searching for HAR files, skipping C:\HAR_File\demoblaze.com.har: 'charmap' codec can't decode byte 0x8d in position 219193: character maps to
2021-03-08 15:40:33,826 ERROR Please help us fix this error by reporting it! https://github.com/zalando-incubator/Transformer/issues

Transformer version
1.3.0

To Reproduce
Steps and input files to reproduce the behavior:

https://www.demoblaze.com/
Use Google Chrome to capture network traffic
Export the har file
I used following command transformer C:\HAR_File/ >locustfile.py on windows command prompt

Expected behavior
It should create locustfile.py successfully without any errors

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

OS: Windows Command

Additional context
Add any other context about the problem here.

Provide helper methods for Transformer plugins

Follow-up after TIP/docs#327 “Extract helper methods from existing Transformer plugins”.

We could go with the following types of helper methods:

injecting (header, body parameter, query parameter): adding new value to existing ones, either to all requests or only to requests matching some filter;
removing (header, body parameter, ...): removing value either from all requests or only the ones matching some filter;
modifying (headers, body, url, ...): removing old value and using new one, either to all requests or only to requests matching some filter;
removing the request based on filter (plugin based blacklist?).

ℹ️ Imported from TIP/docs#336.

Release on PyPI

This Python package should be released to the official Python registry PyPI.

Merge Task into Task2

TIP/transformer#10 “Arbitrarily nested scenarios” solved TIP/docs#330 “Introduce generic multi-level nested scenarios in Transformer” by introducing the transformer.python module, which represents Python code as a syntax tree (i.e. a tree of objects, each representing a syntactic element like "class" or "assignment").

transformer.python.OpaqueBlock does not represent a regular Python element but encapsulates a string and "pretends" it is a normal syntax tree. It helps by not forcing the introduction of the syntax tree to instantly propagate everywhere in Transformer: without OpaqueBlock, Task and all plugins would have had to be updated in TIP/transformer#10.

However, eventually this usage of OpaqueBlock should go away: plugins should be updated to use the more powerful & testable framework of transformer.python too. Once plugins are updated, Task can be merged into the Task2 proxy. At that point, the only remaining usage of OpaqueBlock should be for cases not yet covered by the rest of transformer.python (like the raise statement today).

~~Define new types of plugins that interact with Task2 and/or the syntax tree instead of Task and its strings~~ → #10
Convert all existing plugins into "new type" plugins.
Merge Task into Task2, keeping the name "Task".
Update our plugin documentation.

ℹ️ Issue imported from TIP/docs#395

Exported HAR works in 0.14.6 but not 1.0.0-1.0.2, cannot import name 'TaskSequence'

My exported HAR file has this error on locust 1.0.2 but works fine in 0.14.6:

Traceback (most recent call last):
  File "/home/tim/dev/qbrio-load/venv/bin/locust", line 11, in <module>
    sys.exit(main())
  File "/home/tim/dev/qbrio-load/venv/lib/python3.6/site-packages/locust/main.py", line 113, in main
    docstring, user_classes = load_locustfile(locustfile)
  File "/home/tim/dev/qbrio-load/venv/lib/python3.6/site-packages/locust/main.py", line 77, in load_locustfile
    imported = __import_locustfile__(locustfile, path)
  File "/home/tim/dev/qbrio-load/venv/lib/python3.6/site-packages/locust/main.py", line 53, in __import_locustfile__
    return  source.load_module()
  File "<frozen importlib._bootstrap_external>", line 399, in _check_name_wrapper
  File "<frozen importlib._bootstrap_external>", line 823, in load_module
  File "<frozen importlib._bootstrap_external>", line 682, in load_module
  File "<frozen importlib._bootstrap>", line 265, in _load_module_shim
  File "<frozen importlib._bootstrap>", line 684, in _load
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/tim/dev/qbrio-load/locustfile.py", line 5, in <module>
    from locust import TaskSequence
ImportError: cannot import name 'TaskSequence'```

add attribute `name` to class `request`

Default task name is same as url in output locustfile.

class xxxxxxxxxxxx(TaskSet):
    @task(1)
    class xxxxxxxxxxxxxxxx(TaskSet):
        @seq_task(1)
        def GET_https_www_google_com_705430910___3145776_8752771539154967138(self):
            response = self.client.get(url='https://www.google.com/', name='https://www.google.com/', headers={':method': 'GET', ':authority': 'www.google.com', ':scheme': 'https', ':path': '/'}, timeout=30, allow_redirects=False)

add attribute name to class request, allow user change the task name in OnTask plugin?
if request.name was set to index, the output changed into

class xxxxxxxxxxxx(TaskSet):
    @task(1)
    class xxxxxxxxxxxxxxxx(TaskSet):
        @seq_task(1)
        def GET_https_www_google_com_705430910___3145776_8752771539154967138(self):
            response = self.client.get(url='https://www.google.com/', name='index', headers={':method': 'GET', ':authority': 'www.google.com', ':scheme': 'https', ':path': '/'}, timeout=30, allow_redirects=False)

Got an Error when transforming a HAR file generated for an Angular Application

Transformed a har file which described this error

2020-10-27 20:39:17,586 ERROR Please help us fix this error by reporting it! https://github.com/zalando-incubator/Transformer/issues
Traceback (most recent call last):
File "c:\progapps\python37\lib\site-packages\transformer\scenario.py", line 316, in from_har_file
har = json.load(file)
File "c:\progapps\python37\lib\json_init_.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "c:\progapps\python37\lib\json_init_.py", line 348, in loads
return _default_decoder.decode(s)
File "c:\progapps\python37\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "c:\progapps\python37\lib\json\decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 195542 column 24 (char 14100343)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "c:\progapps\python37\lib\site-packages\transformer\cli.py", line 90, in script_entrypoint
dump(file=sys.stdout, scenario_paths=config.input_paths, plugins=config.plugins)
File "c:\progapps\python37\lib\site-packages\transformer\transform.py", line 92, in dump
intersperse("\n", _dump_as_lines(scenario_paths, plugins, with_default_plugins))
File "c:\progapps\python37\lib\site-packages\transformer\transform.py", line 135, in intersperse
yield next(it)
File "c:\progapps\python37\lib\site-packages\transformer\transform.py", line 113, in _dump_as_lines
for path in scenario_paths
File "c:\progapps\python37\lib\site-packages\transformer\transform.py", line 113, in
for path in scenario_paths
File "c:\progapps\python37\lib\site-packages\transformer\scenario.py", line 169, in from_path
blacklist=blacklist,
File "c:\progapps\python37\lib\site-packages\transformer\scenario.py", line 335, in from_har_file
raise SkippableScenarioError(path, err)
transformer.scenario.SkippableScenarioError: (WindowsPath('Locust-QA-SurgeonC.har'), JSONDecodeError('Unterminated string starting at: line 195542 column 24 (char 14100343)'))

transformer --version
1.3.0

To Reproduce
Steps and input files to reproduce the behavior:

Generated a Har file from Chrome for a custom Angular Application - hosted on cloud
transfomer XX.har >XX.har on windows
See Error:
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 195542 column 24 (char 14100343)

During handling of the above exception, another exception occurred:

Expected behavior
Generate a valid py file for Locust to run this.
I was able to transform a HAR to .py file for another recording on the same Environment.

Screenshots
If applicable, add screenshots to help explain your problem.

Windows 10 64 Bit
Chrome : Version 86.0.4240.111 (Official Build) (64-bit)

TransformerError.txt

OnTask plugin not work

OnTask plugin not work
Run transformer with sanitize_headers plugin, but output is include header start with :

To Reproduce

save google.HAR with

{
  "log": {
    "version": "1.2",
    "creator": {
      "name": "WebInspector",
      "version": "537.36"
    },
    "pages": [
      {
        "startedDateTime": "2019-02-24T04:54:52.380Z",
        "id": "page_1",
        "title": "https://www.google.com/",
        "pageTimings": {
          "onContentLoad": 2006.0650000013993,
          "onLoad": 3977.9940000007628
        }
      }
    ],
    "entries": [
      {
        "startedDateTime": "2019-02-24T04:54:52.379Z",
        "time": 1986.7290000001958,
        "request": {
          "method": "GET",
          "url": "https://www.google.com/",
          "httpVersion": "http/2.0",
          "headers": [
            {
              "name": ":method",
              "value": "GET"
            },
            {
              "name": ":authority",
              "value": "www.google.com"
            },
            {
              "name": ":scheme",
              "value": "https"
            },
            {
              "name": ":path",
              "value": "/"
            }
          ],
          "queryString": [],
          "headersSize": -1,
          "bodySize": 0
        },
        "response": {
          "status": 200,
          "statusText": "",
          "httpVersion": "http/2.0",
          "headers": [
          ],
          "cookies": [
          ],
          "content": {
            "size": 240861,
            "mimeType": "text/html"
          },
          "redirectURL": "",
          "headersSize": -1,
          "bodySize": -1,
          "_transferSize": 70718
        },
        "cache": {},
        "_priority": "VeryHigh",
        "_resourceType": "document",
        "connection": "44899",
        "pageref": "page_1"
      }
    ]
  }
}

run transformer -p transformer.plugins.sanitize_headers google.HAR
output is

# File automatically generated by Transformer:
# https://github.bus.zalan.do/TIP/transformer
import re
from locust import HttpLocust
from locust import TaskSequence
from locust import TaskSet
from locust import seq_task
from locust import task
class xxxxxxxxxxxxxxxxxxxxxxxxx(TaskSet):
    @task(1)
    class xxxxxxxxxxxxx(TaskSet):
        @seq_task(1)
        def GET_https_www_google_com_705430910___3145776_4802625401794803060(self):
            response = self.client.get(url='https://www.google.com/', name='https://www.google.com/', headers={':method': 'GET', ':authority': 'www.google.com', ':scheme': 'https', ':path': '/'}, timeout=30, allow_redirects=False)
class xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx(HttpLocust):
    task_set = xxxxxxxxxxxxxxxxxxxxxxxxx
    weight = 1
    min_wait = 0
    max_wait = 10

hraders in request is include :method :authority :scheme and :scheme, plugin seem not worked

Store HTTP headers in a case-insensitive dict

Implementation of the plugin to inject the User-Agent header raised our awareness that the only way to deterministically inject any header is to keep all headers in a case-insensitive dictionary. The requests library already does it.

Considering the following example:

>>> import requests
>>> r = requests.get("https://google.com", headers={'user-agent': 'lowercase', 'User-Agent': 'Uppercase'})
>>> r.request.headers
{'User-Agent': 'Uppercase', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
>>> r = requests.get("https://google.com", headers={'User-Agent': 'Uppercase', 'user-agent': 'lowercase'})
>>> r.request.headers
{'user-agent': 'lowercase', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}

it appears that the requests library does something to only keep the last header.

ℹ️ Imported from TIP/docs#353.

Move tests to a separate directory

Currently the tests are in the same directory as the source, and they are shipped along with the whole package. Moving them to a separate directory is suggested as standard.

Keep releases & documentation synchronized

Problem

When @lispyclouds tried Transformer, the stable release was 1.0.1 (which is what he got from PyPI).

However, some work had already been merged to master: in particular, the addition of the dump method and the README update that made use of dump in the main example.

So the first thing our user tried is dump, which did not exist in his release (from PyPI, on 1.0.1) but was documented (in master, on 1.0.2.dev*). He experienced a crash and, had he not the chance (?) to sit close to Transformer's maintainers (who told him to use pip install --pre to get 1.0.2.dev), he would probably have concluded that this project was just broken.

How should we release new versions and maintain our documentation so that this doesn't happen?

Ideas

Every commit on master is a stable PyPI release. Prerelease development happens on a dev branch, from which (resp. into which) every feature branch starts (resp. is merged).
- Downside: This make us no longer stick to GitHub Flow, which may be confusing. Contributors will likely often try to merge their work into master out of habit.
- Downside: This solves our problem with respect to documentation only if we track the documentation with the code (as in our example, with the readme). However, we recently started using GitHub's wiki feature because it is more comfortable. Unless we add constraints like branches on the wiki too, our wiki documentation will always reflect the most recent, prerelease state.
Same as above but master stays the "dev" branch: A new, protected branch release is created instead.
- Downside: Same as above with respect to documentation in the wiki.
When documentation is updated, the next stable version is mentioned. Example: "Changed in v1.0.2: You can use the dump method etc.".
- Downside: This looks verbose and ad-hoc. We should come up with a standardized format to make writing and reading such annotations easier.
Every commit on master is a stable PyPI release. And that's it: no alternative "prerelease" branch.
- Downside: All PRs must go through the release process, in particular the full changelog update (links, etc.), which can be painful when multiple PRs are worked on concurrently.
- Downside: You can only write the documentation (in the wiki) after the PR is merged, otherwise we're not solving this issue. So the documentation is not reviewed and we can even forget to add it, since it happens after the fact. This can be mitigated:
  - Update the documentation before the PR is merged but with annotations like "prerelease: …". The documentation must be updated again after the PR is merged to remove these annotations, and the same kind of problems apply (can be forgotten, etc.). ~~No easy or automatic way to check that the current annotations are still relevant.~~ We can include the PR number in the annotation for easier tracking.
  - Opening an issue about updating the documentation is a mandatory part of the PR process. Downside: more paperwork. But at least it's tracked properly, and the issue can contain the planned changes so that they can be reviewed before the PR merge.

Unable to generate the locust file in Windows system

Unable to generate the locust file in windows system throwing error No scenarios inside directory
2020-07-18 23:19:52,750 WARNING while searching for HAR files, skipping D:\Tunr\tunr.har: 'charmap' codec can't decode byte 0x81 in position 1741333: character maps to
2020-07-18 23:19:52,751 ERROR Please help us fix this error by reporting it! https://github.com/zalando-incubator/Transformer/issues
Traceback (most recent call last):
File "C:\Users\rishi.sharma08\AppData\Roaming\Python\Python38\site-packages\transformer\cli.py", line 90, in script_entrypoint
dump(file=sys.stdout, scenario_paths=config.input_paths, plugins=config.plugins)
File "C:\Users\rishi.sharma08\AppData\Roaming\Python\Python38\site-packages\transformer\transform.py", line 91, in dump
file.writelines(
File "C:\Users\rishi.sharma08\AppData\Roaming\Python\Python38\site-packages\transformer\transform.py", line 135, in intersperse
yield next(it)
File "C:\Users\rishi.sharma08\AppData\Roaming\Python\Python38\site-packages\transformer\transform.py", line 106, in _dump_as_lines
scenarios = [
File "C:\Users\rishi.sharma08\AppData\Roaming\Python\Python38\site-packages\transformer\transform.py", line 107, in
Scenario.from_path(
File "C:\Users\rishi.sharma08\AppData\Roaming\Python\Python38\site-packages\transformer\scenario.py", line 156, in from_path
return cls.from_dir(
File "C:\Users\rishi.sharma08\AppData\Roaming\Python\Python38\site-packages\transformer\scenario.py", line 252, in from_dir
raise SkippableScenarioError(path, "no scenarios inside the directory")
transformer.scenario.SkippableScenarioError: (WindowsPath('D:/Tunr'), 'no scenarios inside the directory')

error when transform from har to py using command line

Getting below error when doing the transform

.......
File "C:\Users\mxu\workspace\nftenv\nftenv\Lib\site-packages\transformer\request.py", line 46, in
@DataClass
^^^^^^^^^
File "C:\Users\mxu\AppData\Local\Programs\Python\Python311\Lib\dataclasses.py", line 1220, in dataclass
return wrap(cls)
^^^^^^^^^
File "C:\Users\mxu\AppData\Local\Programs\Python\Python311\Lib\dataclasses.py", line 1210, in wrap
return _process_class(cls, init, repr, eq, order, unsafe_hash,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\mxu\AppData\Local\Programs\Python\Python311\Lib\dataclasses.py", line 958, in _process_class
cls_fields.append(_get_field(cls, name, type, kw_only))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\mxu\AppData\Local\Programs\Python\Python311\Lib\dataclasses.py", line 815, in _get_field
raise ValueError(f'mutable default {type(f.default)} for field '
ValueError: mutable default <class 'mappingproxy'> for field headers is not allowed: use default_factory

command line: transformer C:\Users\mxu\workspace\nftenv\har\ >locustfile.py

windows 10 64bit.

Refactor transformer.blacklist.on_blacklist

Originally reported by @thilp.

on_blacklist finds, opens, and parses the .urlignore file for every URL it has to check. Having instead two functions like make_blacklist(path) -> Set[re.Pattern] and on_blacklist(blacklist: Iterable[re.Pattern], url) -> bool would reduce redundant operations, improve performance, and make testing simpler.

It also blacklists URLs that contain any sequence from .urlignore, regardless of its position, so https://www.zalando.de/carhartt-wip-pixel-t-shirt-print-c1422o047-a11.html would be excluded by the .urlignore we use for our tests because it contains pixel. We should probably limit that by using regexes instead of strings.
Rename parameter to better describe its purpose.

Make .urlignore ignore based on whole URL instead of just hostname

The urlignore/denylist functionality doesnt actually ignore based on url, it only checks hostname/netloc. This is a huge limitation (and doesnt match the documentation).

I have a PR that changes this, but since there are actually tests that verify this behaviour I think it is intentional, so I wanted to check first.

Transformer crashes on our example HAR files

To Reproduce

Run:

transformer examples/www.google.com.har

or:

transformer examples/en.zalando.de.har

Expected behavior

Transformer should certainly not crash, and it should produce a valid locustfile.

Initially reported offline by @bmaher.

Har file fails to transform to py file

Describe the bug
I have a few .har files that won't covert to .py files using transformer. Reason unknown. I can see the attempt to convert it but after-the-fact the .py file size is 0 bytes.

Transformer version
Output of the command transformer --version.
1.2.6

To Reproduce
Steps and input files to reproduce the behavior:

Create the file "..."
transformer C:\Users\mjohns33\PycharmProjects\Master\ppmi_v2.har >newppmo.py
Run Transformer with the arguments "..."
No arguments.
See error
No errors, it appears to have ran but nothing is generated on the .py file.

Expected behavior
The .har file to be converted to .py file

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):
-Win10

Additional context
I'd like to try to play back the .har file using locust but I'm unsure of the syntax in the locustfile.py.

ppmi_v2.zip
other context about the problem here.

Drop support for Locust 0.x, and make use of SequentialTaskSet optional, etc

In #80 I said:

For a new major version I would love to drop Locust 0.x support and maybe put everything in one task (insted of @task:s in a SequentialTaskSet - task sets are really an advanced feature, and a lot of people get confused when using them :)
And maybe optionally add catch_response=True and with-blocks.

@thilp said:

I must admit I’ve not kept myself up to date with Locust best practices, so I take your word for it. If I remember correctly, we are using SequentialTaskSet to attribute different weights to scenarios. How would you achieve the same flexibility with a single task, have Transformer hardcode the load-balancing code directly inside the task? Or do you simply not use weights?

Perhaps we can also continue this discussion in a dedicated issue, since this will likely outlive this PR 🙂

Yes, lets do that :)

I would suggest discarding the support for weighting (as it is much more user friendly to build upon a generated script where every request as a single line) or leave the current SequentialTaskSet-approach as an option (preferably not the default).

I think we should also rename a some things, as recorded requests is the basic building block, not tasks (like Contract.OnTask to Contract.OnRequest)

Another useability improvement I would like to suggest is that a plugin should be able to return None, and that should simply ignore the request (instead of causing an exception).

Update header comment info (incl. URL) in generated code

At the top of the generated locustfile code it outputs, Transformer adds a short comment:

# File automatically generated by Transformer:
# https://github.bus.zalan.do/TIP/transformer

This needs to be updated:

the repository's URL should be https://github.com/zalando-incubator/transformer;
Transformer's version should be included in that message.

To Reproduce
Steps to reproduce the behavior:

Create fake.har, the minimal HAR file acceptable by Transformer: echo '{"log":{"entries":[]}}' >fake.har.
Run Transformer on fake.har and inspect the first two lines of output, disregarding log output from standard error: transformer fake.har 2>/dev/null | head -n2.
Observe.

Expected behavior

# File automatically generated by Transformer v1.0.1:
# https://github.com/zalando-incubator/transformer

Desktop (please complete the following information):

Versions: 1.0.0, 1.0.1

Invalid Sequential Taskset

Thanks for this tool - quite nifty and useful. Just letting you know that it appears your generated Locustfile is attempting to be a sequential taskset (i.e. it's using @seq_task) however it's inheriting from the standard (non sequential) Taskset:

For example:

# File automatically generated by Transformer v1.2.4:
# https://github.com/zalando-incubator/Transformer
import re
from locust import HttpLocust
from locust import TaskSequence
from locust import TaskSet
from locust import seq_task
from locust import task
class C__Users_KellyBrownsberger_Downloads_har_dev_autobooks_co_har_3805485026(TaskSet):
    @seq_task(1)
    def POST_https_dev_api_autobooks_co_1337395116__authentication_mockVerify_2314996318_3125845862907443369(self):

    @seq_task(2)
    def POST_https_dev_apim_autobooks_co_1481902105__smb_user_loginbytoken_1595869348_4597368517360881465(self):

    @seq_task(3)
    def POST_https_dev_ql_autobooks_co_1215498063___3145776_5068277312115112438(self):

    @seq_task(4)
    def POST_https_dev_ql_autobooks_co_1215498063___3145776_7963396691033079014(self):

    @seq_task(5)
    def POST_https_dev_ql_autobooks_co_1215498063___3145776_503109427325622475(self):

    @seq_task(6)
    def POST_https_dev_ql_autobooks_co_1215498063___3145776_1588174576440816270(self):

    @seq_task(7)
    def POST_https_dev_ql_autobooks_co_1215498063___3145776_727774531110247713(self):

    @seq_task(8)
    def POST_https_dev_ql_autobooks_co_1215498063___3145776_2094824709827003355(self):

    @seq_task(9)
    def POST_https_dev_ql_autobooks_co_1215498063___3145776_925146618268756551(self):

class LocustForC__Users_KellyBrownsberger_Downloads_har_dev_autobooks_co_har_3805485026(HttpLocust):
    task_set = C__Users_KellyBrownsberger_Downloads_har_dev_autobooks_co_har_3805485026
    weight = 1
    min_wait = 0
    max_wait = 10

Shouldn't it be the following?

class C__Users_KellyBrownsberger_Downloads_har_dev_autobooks_co_har_3805485026(TaskSequence):
    @seq_task(1)
    def POST_https_dev_api_autobooks_co_1337395116__authentication_mockVerify_2314996318_3125845862907443369(self):

In my case, these did NOT run sequentially until I changed the base class

zalando-incubator / transformer Goto Github PK

transformer's Introduction

Transformer

Installation

Usage

Command-line

Library

Documentation

Authors

License

transformer's People

Contributors

Stargazers

Watchers

Forkers

transformer's Issues

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Expected Behavior

Actual Behavior

Problem

Ideas

Recommend Projects

Recommend Topics

Recommend Org