
cloudconvert-python's Issues

Method for extracting files via the API

Hello, I would like some help. I am trying to download files from S3 to convert them, but the files are compressed. Is there any method I can use via the API to decompress these files?
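I am not sure whether the API offers an extraction step, so one possible workaround is to unpack the archive yourself before sending the files for conversion. A minimal sketch, assuming the compressed files are ZIP archives already fetched from S3 as bytes; `extract_members` is a hypothetical helper name, not part of the SDK:

```python
import io
import zipfile


def extract_members(archive_bytes):
    """Return {member_name: content_bytes} for every file in a ZIP archive."""
    extracted = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, keep only real files
            extracted[info.filename] = archive.read(info)
    return extracted


# Build a tiny archive in memory to show the round trip.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docs/report.txt", "hello")

files = extract_members(buffer.getvalue())
```

Each extracted file could then be uploaded for conversion individually.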

How to convert from pdf to docx in local storage

Hi @josiasmontag, can you tell me what job I need to create if I want to convert test.pdf into test.docx?
I don't understand what I should do first, import or convert directly, because the input PDF is not available at any URL. It is in my local storage, and I want to store the converted file in my local storage as well.

Please help me with this, as it's very important for me.
Thanks
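For a local file, a v2 job would presumably chain an upload, a conversion, and an export. A minimal sketch of such a payload, assuming the `import/upload`, `convert`, and `export/url` operations that appear in other issues here; the task names and the helper `build_local_conversion_job` are made up for illustration:

```python
def build_local_conversion_job(input_format="pdf", output_format="docx"):
    """Build a job payload: upload a local file, convert it, export a URL."""
    return {
        "tasks": {
            # Step 1: ask for an upload slot for the local file.
            "upload-my-file": {"operation": "import/upload"},
            # Step 2: convert whatever the upload task receives.
            "convert-my-file": {
                "operation": "convert",
                "input_format": input_format,
                "output_format": output_format,
                "input": ["upload-my-file"],
            },
            # Step 3: expose the converted file under a download URL.
            "export-my-file": {
                "operation": "export/url",
                "input": ["convert-my-file"],
            },
        }
    }


payload = build_local_conversion_job()
```

The payload would then go to `cloudconvert.Job.create(payload=payload)`, the local file to `cloudconvert.Task.upload(file_name=..., task=...)` for the upload task, and once the job has finished, `cloudconvert.download(filename=..., url=...)` should save the result back to local storage.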

Error: Saving to S3 failed

I'm using cloudconvert==1.0.0 and trying to send the converted file to S3 with the following code (according to the docs):

process.start({
    'input': 'upload',
    'file': open('music_test.mp3', 'rb'),
    "outputformat": "m4a",
    "output": {
        "s3": {
            "accesskeyid": "my-access-key",
            "secretaccesskey": "secret-key",
            "bucket": "bucket.name"
        }
    }
})

But I always get this error message, even after setting the bucket permissions to let everyone list, upload, and delete files:

cloudconvert.exceptions.BadRequest: Saving to S3 failed: The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256. (Code: InvalidRequest)

Looks like I have to do some authentication step before this, but I didn't see anything in the docs.
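The error text suggests the bucket lives in a region that only accepts Signature Version 4 requests. One thing that might be worth trying is naming the region explicitly in the S3 output. This is a sketch under the assumption that the v1 S3 output accepts a `region` key (please check the API docs before relying on it); the region value and the helper `build_s3_output` are placeholders:

```python
def build_s3_output(access_key_id, secret_access_key, bucket, region):
    """Build the 'output' part of the conversion parameters for S3."""
    return {
        "s3": {
            "accesskeyid": access_key_id,
            "secretaccesskey": secret_access_key,
            "bucket": bucket,
            # Assumption: an explicit region makes the upload target the
            # bucket's own endpoint instead of the default one.
            "region": region,
        }
    }


output = build_s3_output("my-access-key", "secret-key", "bucket.name", "eu-central-1")
```

The resulting dict would replace the `"output"` value passed to `process.start(...)`.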

spaces in filename

Hello, I'm using cloudconvert-python in my script to convert doc files to pdf. It works great, but I have a problem uploading files with a space in the filename.

file = '/Users/xxx/Desktop/name 1.doc'
uploaded = cloudconvert.Task.upload(file_name=file, task=import_task)

On the web tasks page I get the error ERROR INPUT_TASK_FAILED Input task has failed...

Filenames without spaces work without problems. I tried several approaches (escaping the filename, URL encoding) but can't find how to fix it. Any idea? It's probably not a bug but my stupidity :)
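I don't know the root cause, but since names without spaces work, a possible stopgap is to upload a temporary copy whose name has no spaces. A minimal sketch using only the standard library; `copy_without_spaces` is a hypothetical helper:

```python
import os
import shutil
import tempfile


def copy_without_spaces(path):
    """Copy `path` into a temp directory, replacing spaces in its name."""
    safe_name = os.path.basename(path).replace(" ", "_")
    target = os.path.join(tempfile.mkdtemp(), safe_name)
    shutil.copyfile(path, target)
    return target


# Demonstrate with a throwaway file whose name contains a space.
source = os.path.join(tempfile.mkdtemp(), "name 1.doc")
with open(source, "wb") as handle:
    handle.write(b"demo")

safe_path = copy_without_spaces(source)
```

`safe_path` could then be passed as `file_name` to `cloudconvert.Task.upload(...)` in place of the original path.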

Don't try to download if compression file isn't going to remain available

I submitted a support ticket but wanted to raise the issue here, as well. We are making a request for a PDF to be compressed using the following code (simplified for brevity):

# Initiate the compression via Cloud Convert
process = cc_api.createProcess({
    "mode": "compress",
})

process.start({
    "mode": "compress",
    "input": "download",
    "file": self.pdf_file.url,
})

process.wait()

path = ...code to generate path to file...
process.download(path)

Sometimes, the process.wait() will receive a 200 response from the API with what appears to be a finished compression and all of the required attributes to generate a download. However, there's a message property in the JSON response that reads: We could not compress this file. It seems this file is already compressed.

When process.download(path) is fired from self.api.rawCall("GET", ...) in process.py, the download request gets a 404 response with the message: File already deleted. Use the 'save' parameter, otherwise files will be deleted after the first download.

In my mind, it doesn't make sense for the compression API to return a 200 if the download is not going to work. For now, I've modified our code to look for the message above but I feel like the original API response should be more explicit if the file is not going to be available.
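The workaround described above (checking the message before downloading) could look roughly like this. It is only a sketch: it assumes the process data is a dict with an optional "message" key and an "output" dict holding a "url", and the matched phrase comes from the message quoted in this issue; `has_downloadable_output` is a hypothetical name:

```python
ALREADY_COMPRESSED_HINT = "already compressed"


def has_downloadable_output(process_data):
    """Return True only if the finished process looks safe to download."""
    message = (process_data.get("message") or "").lower()
    if ALREADY_COMPRESSED_HINT in message:
        return False  # the API kept the original; no output file will remain
    output = process_data.get("output") or {}
    return bool(output.get("url"))


finished = {"step": "finished", "output": {"url": "https://example.invalid/file.pdf"}}
skipped = {
    "step": "finished",
    "output": {"url": "https://example.invalid/file.pdf"},
    "message": "We could not compress this file. It seems this file is already compressed.",
}
```

In the code above, `process.download(path)` would then only run when `has_downloadable_output(process.data)` is true, and the original PDF would be kept otherwise.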

Migration guide

Hello

Do you have a migration guide from v1 to v2?

My cases are:

process = api.convert({
    "inputformat": FILE_FORMAT,
    "outputformat": "pdf",
    "input": "download",
    "file": FILE_PATH
})

try:
    process.wait()
except cloudconvert.exceptions.ConversionFailed as err:
    logger.error('Error during conversion: %s -> %s', FILE_NAME, str(err))
    raise ConversionError(str(err))

res = process.api.rawCall("GET", process['output']['url'], stream=True)

and

process = api.createProcess({
    "inputformat": "pdf",
    "outputformat": "pdf"
})
...
process.start({
    "mode": "combine",
    "input": "download",
    "files": [PATH_1, PATH_2]
})

... SAME ERROR HANDLING...

res = process.api.rawCall("GET", process['output']['url'], stream=True)

and

process = api.createProcess({
    "inputformat": "pdf",
    "outputformat": "pdf"
})

process.start({
    "mode": "info",
    "input": "download",
    "file": FILE_URL
})

try:
    process.wait()

    if process['info']['Encrypted'].startswith('yes'):
...

Or how should these be changed to fit v2?
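Not an official migration guide, but a rough sketch of how the first two cases might map onto v2 job payloads. The `import/url`, `convert`, and `export/url` shapes follow the payload posted in another issue here; the `merge` operation as a stand-in for `"mode": "combine"` is my assumption and should be checked against the v2 docs. The helper and task names are hypothetical, and the `info` case is left out:

```python
def build_url_conversion_job(file_url, input_format, output_format="pdf"):
    """v1 api.convert(input=download) expressed as a three-task v2 job."""
    return {
        "tasks": {
            "import-1": {"operation": "import/url", "url": file_url},
            "task-1": {
                "operation": "convert",
                "input_format": input_format,
                "output_format": output_format,
                "input": ["import-1"],
            },
            "export-1": {"operation": "export/url", "input": ["task-1"]},
        }
    }


def build_merge_job(file_urls):
    """v1 mode=combine, assuming v2 offers a 'merge' operation for PDFs."""
    tasks = {}
    import_names = []
    for index, url in enumerate(file_urls, start=1):
        name = "import-%d" % index
        tasks[name] = {"operation": "import/url", "url": url}
        import_names.append(name)
    tasks["merge-1"] = {
        "operation": "merge",  # assumption, see the note above
        "output_format": "pdf",
        "input": import_names,
    }
    tasks["export-1"] = {"operation": "export/url", "input": ["merge-1"]}
    return {"tasks": tasks}


convert_payload = build_url_conversion_job("https://example.invalid/a.docx", "docx")
merge_payload = build_merge_job(["https://example.invalid/1.pdf", "https://example.invalid/2.pdf"])
```

Either payload would be passed to `cloudconvert.Job.create(payload=...)`, and the export task's file URL takes the place of `process['output']['url']`.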

Unauthenticated code when trying to configure API v2

Currently I am trying to add the API token as follows:

import cloudconvert
cloudconvert.configure(api_key="XXXXXX", sandbox=False)

job = cloudconvert.Job.create(payload={
    'tasks': {
        'upload-my-file': {
            'operation': 'import/upload'
        }
    }
})

But I get this result:

{'message': 'Unauthenticated.', 'code': 'UNAUTHENTICATED'}

I created the API Token, as follows:

![image](https://user-images.githubusercontent.com/16579856/146591463-4a5355b2-ca78-4ab2-be02-d16077412f82.png)

Any idea how to fix this?

API Environment Var name is wrong

Several places in the documentation mention that CLOUDCONVERT_API_KEY is the name of the environment variable to set in order to use the default_client(). However, environment_vars.py sets the name of the variable to be just "API_KEY", which is directly used in the default_client(). My guess is that environment_vars.py is supposed to just have the default values if the variables aren't present, but somehow was mistaken to have the names of those variables.
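What the documentation describes would presumably look something like the lookup below: read `CLOUDCONVERT_API_KEY` from the environment and fall back to a default only if it is missing. This is a sketch of the intended behaviour as I understand it, not the library's actual code; `read_api_key` and `DEFAULT_API_KEY` are hypothetical names:

```python
import os

DEFAULT_API_KEY = ""  # hypothetical fallback when the variable is unset


def read_api_key(environ=None):
    """Look up the API key under the documented variable name."""
    environ = os.environ if environ is None else environ
    return environ.get("CLOUDCONVERT_API_KEY", DEFAULT_API_KEY)


key_from_env = read_api_key({"CLOUDCONVERT_API_KEY": "token-123"})
key_missing = read_api_key({"API_KEY": "token-123"})  # wrong name is ignored
```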

Avoid using global state

Please consider using standard OOP patterns as it's common for such libraries instead of relying on global state.
Nothing wrong with providing some convenience utils that rely on a global client, but developers who want to nicely encapsulate things should not be required to deal with global state.

A nice API would be something like this:

cconvert = CloudConvertClient(api_key=..., sandbox=...)
cconvert.foo()

I think #13 and #24 are both about issues which are implicitly caused by the current design.
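To make the suggestion a bit more concrete, here is a minimal sketch of a client object that owns its own configuration. Everything in it is hypothetical (the class does not exist in the library); the only outside fact it relies on is that API v2 requests carry the key as a Bearer token:

```python
class CloudConvertClient:
    """Holds credentials per instance instead of in module-level state."""

    def __init__(self, api_key, sandbox=False):
        self.api_key = api_key
        self.sandbox = sandbox

    def headers(self):
        # Each instance builds request headers from its own key.
        return {"Authorization": "Bearer " + self.api_key}


# Two independently configured clients can coexist in one process.
live = CloudConvertClient(api_key="live-token")
test = CloudConvertClient(api_key="sandbox-token", sandbox=True)
```

With this shape, a wrapper library could hold one client per tenant or environment without touching anything global.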

For a commercial product (albeit at very decent pricing, so nothing wrong with that!) I also find the lack of activity on this repo (and issues) a bit worrying TBH. Can we assume that your infrastructure is better maintained than the client libraries? I'd hope so but you never know... ;)

IndexError

Recently we experienced the following error:

Traceback (most recent call last):
  File "/app/src/app/services/third_party/cloudconvert_converter_service.py", line 38, in perform
    cloudconvert.Task.wait(id=export_task_id)
  File "/usr/local/lib/python3.7/site-packages/cloudconvert/resource.py", line 188, in wait
    res = api_client.get(url)
  File "/usr/local/lib/python3.7/site-packages/cloudconvert/cloudconvertrestclient.py", line 162, in get
    return self.request(util.join_url(self.endpoint, action), 'GET', headers=headers or {})
  File "/usr/local/lib/python3.7/site-packages/cloudconvert/cloudconvertrestclient.py", line 71, in request
    return self.http_call(url, method, json=body, headers=http_headers)
  File "/usr/local/lib/python3.7/site-packages/cloudconvert/cloudconvertrestclient.py", line 102, in http_call
    method, url, proxies=self.proxies, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 677, in urlopen
    chunked=chunked,
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.7/site-packages/sentry_sdk/integrations/stdlib.py", line 54, in getresponse
    rv = real_getresponse(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/http/client.py", line 1321, in getresponse
    response.begin()
  File "/usr/local/lib/python3.7/http/client.py", line 296, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.7/http/client.py", line 257, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/lib/python3.7/socket.py", line 594, in readinto
    if e.args[0] in _blocking_errnos:
IndexError: tuple index out of range

A retry helped and we can't reproduce it, but it seems to originate inside the cloudconvert HTTP calls. Can you please suggest what could have happened here?
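Since a retry got past the error, a small wrapper that retries transient failures may be a reasonable stopgap until the cause is known. A minimal sketch; `call_with_retries` is a hypothetical helper, and which exception types are safe to retry is an assumption you would need to decide for your own service:

```python
import time


def call_with_retries(func, attempts=3, delay=0.0, retry_on=(Exception,)):
    """Call func(); on a listed exception, wait and try again."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, surface the last error
            time.sleep(delay * attempt)  # simple linear backoff


# Simulate a call that fails twice before succeeding.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IndexError("tuple index out of range")
    return "finished"


result = call_with_retries(flaky, attempts=3, retry_on=(IndexError,))
```

In the service from the traceback, the wrapped call would be something like `call_with_retries(lambda: cloudconvert.Task.wait(id=export_task_id), delay=1.0)`.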

How to set document parameters

I'm passing raw HTML markup and want to get a document in docx format. I have already got this working. Question: is it possible to set document parameters? I need to set page margins and automatic numbering.

Intermittent "ConnectionError: ('Connection aborted.', BadStatusLine("''",))" on .refresh()

This started in late August / early September 2018.

This happens maybe 20% of the time.

I recently added from __future__ import unicode_literals in the code.

See also: psf/requests#2364

Python 2.7.15

pip show cloudconvert
Name: cloudconvert
Version: 1.0.0
pip show requests
Name: requests
Version: 2.11.1
[2018-09-04 17:49:16,857] INFO in handin: Convertible handin found for https://work-test.s3-eu-west-1.amazonaws.com/987872540624879.docx that isn't converted yet. Initiating conversion.
Captured exception in MockSentry: (('HTTP request failed error', ConnectionError(ProtocolError('Connection aborted.', BadStatusLine("''",)),)))
Traceback (most recent call last):
  File "/path/application/handin/views/handin.py", line 49, in handin_convert
    convert_handin_file(handin)
  File "/path/application/doc_conversion/convert.py", line 208, in convert_handin_file
    process = poll_cloudconvert.delay(handin, process)
  File "/path/.venv/local/lib/python2.7/site-packages/rq/decorators.py", line 64, in delay
    meta=self.meta, description=self.description)
  File "/path/.venv/local/lib/python2.7/site-packages/rq/queue.py", line 252, in enqueue_call
    job = self.enqueue_job(job, at_front=at_front)
  File "/path/.venv/local/lib/python2.7/site-packages/rq/queue.py", line 328, in enqueue_job
    job = self.run_job(job)
  File "/path/.venv/local/lib/python2.7/site-packages/rq/queue.py", line 257, in run_job
    job.perform()
  File "/path/.venv/local/lib/python2.7/site-packages/rq/job.py", line 573, in perform
    self._result = self._execute()
  File "/path/.venv/local/lib/python2.7/site-packages/rq/job.py", line 579, in _execute
    return self.func(*self.args, **self.kwargs)
  File "/path/application/doc_conversion/jobs.py", line 95, in poll_cloudconvert
    process.refresh()
  File "/path/.venv/local/lib/python2.7/site-packages/cloudconvert/process.py", line 38, in refresh
    self.data = self.api.get(self.url, parameters)
  File "/path/.venv/local/lib/python2.7/site-packages/cloudconvert/api.py", line 58, in get
    return self.rawCall('GET', path, None, is_authenticated)
  File "/path/.venv/local/lib/python2.7/site-packages/cloudconvert/api.py", line 145, in rawCall
    raise HTTPError("HTTP request failed error", error)
HTTPError: ('HTTP request failed error', ConnectionError(ProtocolError('Connection aborted.', BadStatusLine("''",)),))

127.0.0.1 - - [04/Sep/2018 17:49:29] "GET /api/handin/convert/5b8f0bc99f6cf00e67dd9001 HTTP/1.1" 500 -

KeyError: 'tasks'

def convert_oga_to_wav(audio_url, name):
    cloudconvert.configure(api_key=API_KEY, sandbox=False)
    job = cloudconvert.Job.create(payload={
        "tasks": {
            "import-1": {
                "operation": "import/url",
                "url": audio_url,
                "filename": name
            },
            "task-1": {
                "operation": "convert",
                "input_format": "oga",
                "output_format": "wav",
                "engine": "ffmpeg",
                "input": [
                    "import-1"
                ],
                "audio_codec": "pcm_s16le",
                "audio_bitrate": 128,
                "engine_version": "4.1.4",
                "filename": "output.wav"
            },
            "export-1": {
                "operation": "export/url",
                "input": [
                    "task-1"
                ],
                "inline": True,
                "archive_multiple_files": False
            }
        }
    })
    job = cloudconvert.Task.wait(id=job['id'])

    for task in job["tasks"]:
        if task.get("name") == "export-it" and task.get("status") == "finished":
            export_task = task

    file = export_task.get("result").get("files")[0]
    return cloudconvert.download(filename=file['filename'], url=file['url'])

  File "----------------", line 107, in convert_oga_to_wav
    for task in job["tasks"]:
KeyError: 'tasks'

The code does not run to completion; it fails with a dictionary KeyError.
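Two things in the snippet look like possible causes, though I can't confirm either without running it: it waits with `cloudconvert.Task.wait` on a job id, so the returned dict may not be job-shaped and may have no "tasks" key, and it searches for a task named "export-it" while the payload names it "export-1". A defensive lookup could make both mistakes easier to spot. This is a sketch with a hypothetical helper and made-up sample data:

```python
def find_finished_task(job, name):
    """Return the finished task called `name`, with clear errors otherwise."""
    tasks = job.get("tasks")
    if tasks is None:
        raise ValueError("response has no 'tasks' key; is this a job or a task?")
    for task in tasks:
        if task.get("name") == name and task.get("status") == "finished":
            return task
    raise LookupError("no finished task named %r" % name)


# Made-up job data in the shape the loop above expects.
sample_job = {
    "id": "job-1",
    "tasks": [
        {"name": "import-1", "status": "finished"},
        {
            "name": "export-1",
            "status": "finished",
            "result": {"files": [{"filename": "output.wav",
                                  "url": "https://example.invalid/output.wav"}]},
        },
    ],
}

export_task = find_finished_task(sample_job, "export-1")
```

It may also be worth trying `cloudconvert.Job.wait(id=job['id'])` in place of the `Task.wait` call, since the id passed in comes from `Job.create`.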

Python Wrapper - Api Module

Hi,

When I run the code snippet generated by the API console to convert pdf to docx, I get the error below.

AttributeError: 'module' object has no attribute 'Api'.

I installed it using
pip install cloudconvert

regards,

rajith

Allow download to variable instead of file

Can the API allow downloading the content into a variable instead of to file? Currently, I have to download the document and then read it back in.

Perhaps something like this:

content = process.download(inline=True) OR content = process.downloadAsRaw()
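Until something like that exists, the result URL can be fetched straight into memory without going through the library's download method. A minimal sketch using only the standard library; `download_to_bytes` is a hypothetical helper, and it assumes you already have the output URL (for example `process['output']['url']`). The demo uses a `data:` URL so it runs without network access:

```python
import urllib.request


def download_to_bytes(url, timeout=30):
    """Fetch `url` and return the whole response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


# "aGVsbG8=" is base64 for "hello"; a real call would pass the output URL.
content = download_to_bytes("data:text/plain;base64,aGVsbG8=")
```

If the output URL needs the API's own authentication, the plain fetch above would not be enough and the request would have to carry the same credentials the library uses.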

api_client parameter ignored and overwritten with default_client

api_client = default_client()

All of these methods (List, Find, Delete, Wait, Show, Create) have a similar api_client = default_client() assignment which overrides any provided api_client parameters.

As this library is currently written it seems impossible to call any of these methods with a non-default client.

Also: I do not want to use the global __client__ variable as I am trying to wrap the cloudconvert library into a more specific use-case library.

Python SDK Job.create() doesn't propagate the request status code

Using the Python SDK and following the instructions, Job.create() returns a dict. It would be super useful if it could also carry the status code of the request. The docs say: Jobs and tasks that completed with an error, do have status set to error
It would be nice to have this in the SDK as well. Or maybe it is already there; in that case, awesome, and you should really include it in the README.
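In the meantime, the returned dict can be inspected by hand, going by the sentence quoted from the docs. A minimal sketch, assuming each task dict carries "name", "status", "code", and "message" keys (the last two are my assumption, based on the error fields shown in other issues); `failed_tasks` is a hypothetical helper and the sample data is made up:

```python
def failed_tasks(job):
    """List name, code, and message of every task whose status is 'error'."""
    return [
        {
            "name": task.get("name"),
            "code": task.get("code"),
            "message": task.get("message"),
        }
        for task in job.get("tasks") or []
        if task.get("status") == "error"
    ]


sample_job = {
    "status": "error",
    "tasks": [
        {"name": "import-1", "status": "error",
         "code": "INPUT_TASK_FAILED", "message": "Input task has failed"},
        {"name": "export-1", "status": "waiting"},
    ],
}

errors = failed_tasks(sample_job)
```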

Thanks,

Marco

README no longer up-to-date (KeyError)

Hi,

So when I try to use the sample code in README:

file = res.get("result").get("files")[0]
res = cloudconvert.download(filename=file['filename'], url=file['url'])

I got this error: KeyError: 'url'.

I tried printing the file object and got: {'filename': 'demo-flask.zip', 'size': 679}, which doesn't include a url key.
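One possible explanation, which I have not verified, is that the task being inspected was not an export task, so its file entries carry no url. A defensive sketch that only returns entries that can actually be downloaded; `downloadable_files` is a hypothetical helper and the second sample dict is made up:

```python
def downloadable_files(task):
    """Return only the result files that include a 'url' key."""
    result = task.get("result") or {}
    return [entry for entry in result.get("files") or [] if "url" in entry]


# Shape reported in this issue: no url on the file entry.
without_url = {"result": {"files": [{"filename": "demo-flask.zip", "size": 679}]}}
# Made-up shape of an entry that does carry a url.
with_url = {"result": {"files": [{"filename": "demo-flask.zip",
                                  "url": "https://example.invalid/demo-flask.zip"}]}}
```

An empty list from `downloadable_files(res)` would then be a hint to look at a different task in the job before calling `cloudconvert.download(...)`.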
